Posts

5 Advantages of Granite 4.1 LLMs

Granite 4.1 is IBM’s new family of dense decoder-only LLMs (3B, 8B, 30B) trained on ~15 trillion tokens with a five-phase pre-training pipeline, followed by supervised fine-tuning (SFT) on 4.1M curated samples. The family of models is released under Apache 2.0.

Granite 4.1 models consistently match or outperform larger competitors, with lower hardware requirements:
- The 30B model outperforms Google’s Gemma-4-31B-it
- The 8B model beats Gemma-4-26B-A4B-it
- Dense architecture ensures predictable latency and stable token usage

Enterprise-Grade Predictable Inference
Granite 4.1 is designed for real-world business workloads where speed, cost, and determinism matter.
- Strong instruction-following and tool-calling without long chains of thought
- Dense models avoid the variability of MoE (Mixture-of-Experts) routing
- FP8 quantization options reduce memory footprint while preserving accuracy

High ‑...

AIProjectClient vs. Azure OpenAI Client

Overview
AIProjectClient is the unified Azure AI Foundry project SDK client. It covers full Azure AI Foundry project management, agents, datasets, indexes, evaluations, and OpenAI client generation. The Azure OpenAI client is for direct model inference (chat, embeddings, images) using OpenAI-compatible endpoints.

Capabilities
AIProjectClient provides control in the following areas:
1. Full lifecycle Agents: Create, update, delete, and run Azure AI Agents. The Azure OpenAI client cannot do this.
2. Datasets & File Management: Upload documents, create datasets, and use them in agents or evaluations.
3. Search Indexes: Create and manage RAG indexes inside your project.
4. Evaluations: Rules, taxonomies, evaluators, and insights for model quality.
5. Unified OpenAI Client: AIProjectClient can generate a fully configured OpenAI client (`get_openai_client()`), so there is no need to manage separate credentials....

MS Foundry Developer Migration Checklist

Microsoft Foundry recently implemented changes across a variety of areas. The Developer Migration Checklist below is intended to help you navigate them.

1. Migrate to the Unified SDK 2.0
[] Replace all uses of the old azure-ai-agents package with azure-ai-projects 2.0.
[] Update code to use AIProjectClient for model inference, agents, evaluations, memory, and tracing.
[] Remove legacy preview flags and update any custom tool or MCP integrations.
[] Validate that all tool schemas and agent configurations work under the new client.

2. Move Agents to the Foundry Agent Service (GA)
[] Migrate existing agent deployments to the new Agent Service runtime.
[] Update agent code to use the OpenAI Responses-compatible interface.
[] Reconfigure private networking, Entra RBAC, and tracing endpoints.
[] Test agent behavior in the updated Agent Playground and tracing...

What is Query Boosting, Weighting, and Thresholding?

Query Boosting means increasing the importance of certain terms or fields in a search query so they influence the ranking more strongly. Sometimes not all parts of a query are equally important. For example:
- In a product search, matching the title might matter more than matching the description.
- In a document search, matching a keyword might matter more than matching the body text.

For instance, if you search for:
title:"machine learning"^3 description:"machine learning"
The "^3" means “boost the title match 3× more than the description match.”

Weighting is the general idea of assigning different levels of importance to features, fields, or signals during ranking or scoring. Boosting is a type of weighting, but weighting can apply to:
- Query terms
- Document fields
- Machine-learning features
- User behavior signals (clicks, recency, popul...
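
To make the "^3" boost concrete, here is a minimal sketch of field-level boosting as a weighted sum. The `boosted_score` helper and the per-field match scores are hypothetical and only illustrate the idea; real search engines compute the underlying match scores in their own way.

```python
from typing import Dict


def boosted_score(field_scores: Dict[str, float], boosts: Dict[str, float]) -> float:
    """Sum per-field match scores, multiplying each by its boost (default 1.0)."""
    return sum(score * boosts.get(field, 1.0) for field, score in field_scores.items())


# Hypothetical per-field match scores for two documents (assumed, not from a real engine).
doc_a = {"title": 1.0, "description": 0.0}  # matches only in the title
doc_b = {"title": 0.0, "description": 1.0}  # matches only in the description

# Mirrors title:"machine learning"^3 with an unboosted description clause.
boosts = {"title": 3.0}

score_a = boosted_score(doc_a, boosts)
score_b = boosted_score(doc_b, boosts)
print(score_a, score_b)
```

With these assumed inputs, the title-only match ranks above the description-only match purely because of the boost.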

All About AI Foundry Agents

In a recent episode of "Made for Dev" with Stephen Simon, I get into the world of AI Agents in Microsoft Foundry. I explain why agents are the "new microservices" and demonstrate how to build, configure, and orchestrate them using natural language.

What you’ll learn in this video:
• What is an AI Agent? Understand the shift from LLMs that just answer questions to agents that execute business tasks [01:30].
• The "Physician" Analogy: Why an LLM is like a doctor who knows the theory but needs "tools" to actually take your blood pressure [02:05].
• Step-by-Step Demo: Watch Sam build a weather-forecasting agent from scratch in Azure AI Foundry, including tool integration via Logic Apps [04:54].
• Evaluation & Quality: How to interpret "Task Adherence" and "Intent Resolution" scores to ensure your agent is performing accurately [10:34].
• Advanced Orchestra...

Copilot Studio vs. Microsoft Foundry

One of the questions many tech leaders ask is “Which tool should I use for AI applications, Copilot Studio or Microsoft Foundry?”

Copilot Studio is a low-code, business-friendly tool for less technical users. It’s fast to deploy and accessible at https://copilotstudio.microsoft.com

Microsoft Foundry (formerly known as Azure AI Foundry) is a pro-code, developer-centric tool for more technical users. It offers full control and scalability, and it is Azure native, but it is accessed from a different URL (https://ai.azure.com).

Summary

Handling Missing Data - Part 1

One of the big topics I have been speaking about is “Data Cleansing for Machine Learning”. A major part of the data cleansing process is filling in missing values. These values could be missing for a variety of reasons (manually entered data, improper data collection, incorrect exception handling, etc.). Missing data can be summed up into 3 categories:
- MAR: Missing at Random, for patterns tied to known variables (e.g., age, education) where someone did not want to answer a question.
- MCAR: Missing Completely at Random, for random glitches or accidental loss (e.g., sensor failure).
- MNAR: Missing Not at Random, where data is missing because of the value of the missing data itself.

Several techniques are available to fill in missing data. In my video, I discuss 3 of those techniques.
Multiple Imputation by Chained Equations (MICE) handles numeric or...
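
The difference between the three categories is easier to see on simulated data. Below is a minimal sketch, assuming a made-up survey with `age` and `income` columns; the `mask_income` helper, the thresholds, and the probabilities are all hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Dict, List, Optional

Row = Dict[str, Optional[int]]


def mask_income(rows: List[Row], is_missing: Callable[[Row], bool]) -> List[Row]:
    """Return a copy of rows with income set to None wherever is_missing(row) is true."""
    return [{**r, "income": None} if is_missing(r) else dict(r) for r in rows]


rng = random.Random(0)  # seeded so the simulation is repeatable
rows: List[Row] = [
    {"age": rng.randint(18, 80), "income": rng.randint(20, 200)} for _ in range(1000)
]

# MCAR: missingness is unrelated to any value in the row.
mcar_rows = mask_income(rows, lambda r: rng.random() < 0.2)

# MAR: missingness depends on an observed variable (age), not on income itself.
mar_rows = mask_income(rows, lambda r: r["age"] > 60 and rng.random() < 0.5)

# MNAR: missingness depends on the value that goes missing (high incomes are withheld).
mnar_rows = mask_income(rows, lambda r: r["income"] > 150)

for name, data in [("MCAR", mcar_rows), ("MAR", mar_rows), ("MNAR", mnar_rows)]:
    print(name, sum(r["income"] is None for r in data), "missing incomes")
```

In the MNAR case the surviving incomes are all 150 or below, so any statistic computed on them is biased, which is why the category matters when choosing a technique.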

Mar '26 Regional Tech Events

User Groups
Mar 3: Ohio North Database Training
Mar 11: Azure Cleveland
Mar 19: GLUG .NET
Mar 25: Cleveland C# User Group

Conferences
Feb – Jun: Agent Camp
Mar 16: Memphis Agent Camp

Databricks Q&A

Yesterday I delivered a presentation on "Data Cleansing using Databricks". Listed below are the questions that came up as well as the answers related to Databricks.

Does state transfer from one cell to another in the Databricks notebook?
Yes, state does transfer from one cell to another, as long as you're running in the same notebook session and on the same cluster.

Can Databricks Jobs (aka "pipelines") be accessed through a notebook?
Yes, you can fully interact with Jobs from a notebook using the Databricks REST API or the Databricks CLI. Some of these interactions from a notebook include:
- Trigger a job run
- Check job run status
- Cancel a run
- Retrieve job metadata

However, you can't directly "open" the Jobs UI from a notebook, and you can't modify job definitions without using the API.

Listed below is an example for accessing a Job from a notebook:
import requests
import json

token = dbutils.secre...

Layers of RAG Architecture Patterns

Retrieval-Augmented Generation (RAG) has become one of the most important design patterns in modern AI because it gives language models direct access to external knowledge. Instead of relying solely on what a model has memorized during training, RAG systems retrieve relevant information from documents, databases, or other sources and feed it into the model at generation time. This idea dramatically improves accuracy, reduces hallucinations, and allows AI systems to stay current without constant retraining.

RAG has evolved into a rich ecosystem of architectural layers, addressing different challenges:
- Core Retrieval layer focuses on improving how information is found, from basic vector search to more advanced techniques like query expansion and hierarchical retrieval.
- Structure-Aware layer organizes and interprets data based on relationships, formats, or time, enabling retrieval from graphs, tables, or multimodal sources.
- Reasoning-Enhanced layer st...
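
As a rough illustration of the retrieve-then-generate idea, the sketch below ranks a few documents by cosine similarity and places the best match into a prompt. It is only a toy: the bag-of-words `embed` function stands in for a real embedding model, and `retrieve`, `build_prompt`, and the sample documents are hypothetical names and data, not part of any particular RAG framework.

```python
import math
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, docs: List[str], k: int = 1) -> List[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: List[str]) -> str:
    """Feed the retrieved context to the model alongside the question."""
    context = "\n".join(retrieve(query, docs))
    return f"Context:\n{context}\n\nQuestion: {query}"


# Made-up knowledge base for illustration only.
docs = [
    "refunds are processed within five business days",
    "the office is closed on public holidays",
    "support tickets are answered by email",
]
query = "how many days until refunds are processed"
top = retrieve(query, docs)
print(build_prompt(query, docs))
```

The resulting prompt would then be sent to the language model, which is the "feed it into the model at generation time" step described above.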

Linking one App.config in multiple projects

I recently had a .NET solution where I wanted to use the same App.config in 2 different projects. I wanted to ensure that changes made in the App.config of the original project would automatically be reflected in the other project(s). Listed below are the steps I used to facilitate that process.
1. Right-click your project in Solution Explorer
2. Select "Add" -> "Existing Item..."
3. Navigate to the file that you want to add to the solution
4. [Important] Instead of hitting Enter or clicking the Add button, click the down-arrow icon at the right edge of the Add button and select "Add As Link".