Posts

Showing posts from May, 2026

Databricks Q&A

On May 5th, I had the pleasure of presenting “Data Cleansing using Databricks” (https://www.meetup.com/ohio-north-database-training/events/314363881/). During the meeting, many good questions were raised. Listed below are the answers to these questions:

What are jobs in Databricks? Jobs are workloads that can be scheduled, managed, and automated without manual intervention. Workloads can be notebooks, SQL queries, or pipelines that run on a cluster.

How do jobs compare to other automation tools?

What is a “Delta Live Table”? Delta Live Tables (DLT) is a Databricks feature that makes it much easier to build and run data pipelines, either batch or streaming. Simply write the transformations in SQL or Python, and DLT takes care of setting up the infrastructure, tracking dependencies, handling errors, and enforcing data quality rules through its built-in expectations. ...
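To make the idea of an expectation more concrete, here is a minimal plain-Python sketch of the "drop rows that violate a rule" behavior. It is not the DLT API (in a real pipeline the rule would be declared on the table definition, for example with the `dlt.expect_or_drop` decorator), and the function name `apply_expectations`, the rule names, and the sample rows are all hypothetical.

```python
from typing import Callable, Iterable

Row = dict
Expectation = tuple[str, Callable[[Row], bool]]  # (rule name, predicate)


def apply_expectations(rows: Iterable[Row], expectations: list[Expectation]):
    """Keep rows that pass every rule and count violations per rule name."""
    kept = []
    violations = {name: 0 for name, _ in expectations}
    for row in rows:
        passed = True
        for name, predicate in expectations:
            if not predicate(row):
                violations[name] += 1
                passed = False
        if passed:
            kept.append(row)
    return kept, violations


# Hypothetical data quality rules and sample input.
rules = [
    ("valid_id", lambda r: r.get("id") is not None),
    ("positive_amount", lambda r: r.get("amount", 0) > 0),
]
raw = [
    {"id": 1, "amount": 10.0},
    {"id": None, "amount": 5.0},
    {"id": 3, "amount": -2.0},
]

clean, stats = apply_expectations(raw, rules)
print(clean)
print(stats)
```

The per-rule violation counts mirror the kind of data quality metrics a pipeline can report, while the filtered list stands in for the cleaned table.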

6 Ways to Prevent Sensitive Data Leaks in AI/ML Applications

1. Utilize Dynamic Data Masking
This is a built-in feature in SQL Server 2016 and later. Learn more here.

2. Use Proper Prompt Engineering
Provide enough detail to ensure the LLM stays on track and adheres to the instructions. Include specifics like the length of the response, what to include and not include in the response, etc.

3. Utilize Content Safety
This is a feature available in Microsoft Foundry for many models.

4. Use Identity features in Azure OpenAI
Azure OpenAI can prevent data leaks by replacing insecure static keys with dynamic, role-based authentication. In addition, by leveraging Microsoft Entra ID and Managed Identities, organizations can enforce strict "zero-trust" access controls that ensure only authorized users or applications can interact with sensitive AI resources. Learn more here.

5. Replace sensitive data columns with foreign key ...
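As a rough illustration of the masking idea in item 1, the sketch below applies similar masks in application code before a record is placed into an LLM prompt. It is only loosely modeled on SQL Server's `partial()` and `email()` masking functions and simplifies the short-value case. The helper names, the `customer` record, and the prompt wording are hypothetical.

```python
def mask_partial(value: str, prefix: int, padding: str, suffix: int) -> str:
    """Expose the first `prefix` and last `suffix` characters; hide the middle."""
    if prefix + suffix >= len(value):
        # Simplification: a value too short for the mask is hidden entirely.
        return padding
    tail = value[-suffix:] if suffix else ""
    return value[:prefix] + padding + tail


def mask_email(value: str) -> str:
    """Expose only the first character, in the shape aXXX@XXXX.com."""
    return (value[:1] or "X") + "XXX@XXXX.com"


def build_prompt(customer: dict) -> str:
    """Build a prompt that contains only masked versions of sensitive fields."""
    safe = {
        "name": customer["name"],
        "email": mask_email(customer["email"]),
        "phone": mask_partial(customer["phone"], 0, "XXX-XXX-", 4),
    }
    return (
        "Summarize the support history for this customer in under 100 words. "
        "Do not guess any masked values.\n"
        f"Customer: {safe}"
    )


# Hypothetical record for illustration only.
customer = {"name": "Alice", "email": "alice@example.org", "phone": "555-123-4567"}
prompt = build_prompt(customer)
print(prompt)
```

Masking before the prompt is assembled means the model never receives the raw values, which complements database-side masking rather than replacing it.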

5 Advantages of Granite 4.1 LLMs

Granite 4.1 is IBM’s new family of dense decoder-only LLMs (3B, 8B, 30B) trained on ~15 trillion tokens with a five-phase pre-training pipeline, followed by 4.1M curated SFT (Supervised Fine-Tuning). The family of models is released under Apache 2.0.

Granite 4.1 models consistently match or outperform larger competitors, with lower hardware requirements:
- 30B model outperforms Google’s Gemma-4-31B-it
- 8B model beats Gemma-4-26B-A4B-it
- Dense architecture ensures predictable latency and stable token usage

Enterprise-Grade Predictable Inference
Granite 4.1 is designed for real-world business workloads where speed, cost, and determinism matter.
- Strong instruction-following and tool-calling without long chains of thought.
- Dense models avoid the variability of MoE (Mixture-of-Experts) routing.
- FP8 quantization options reduce memory footprint while preserving accuracy.

High-...
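The memory effect of FP8 quantization mentioned above can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope calculation that assumes a 16-bit (2 bytes per parameter) baseline and nominal parameter counts of 3B, 8B, and 30B. It covers weights only and ignores the KV cache, activations, and runtime overhead, so real deployments will need more memory than these figures.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}

# Nominal parameter counts taken from the model names (an approximation).
MODEL_PARAMS = {"3B": 3e9, "8B": 8e9, "30B": 30e9}


def weight_memory_gb(num_params: float, precision: str) -> float:
    """Approximate weight storage in gigabytes (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


for name, params in MODEL_PARAMS.items():
    fp16 = weight_memory_gb(params, "fp16")
    fp8 = weight_memory_gb(params, "fp8")
    print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{fp8:.0f} GB at FP8")
```

Under these assumptions, FP8 halves the weight storage relative to a 16-bit baseline.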