Thursday, January 15, 2026

Layers of RAG Architecture Patterns

Retrieval-Augmented Generation (RAG) has become one of the most important design patterns in modern AI because it gives language models direct access to external knowledge.

Instead of relying solely on what a model has memorized during training, RAG systems retrieve relevant information from documents, databases, or other sources and feed it into the model at generation time.

This approach dramatically improves accuracy, reduces hallucinations, and allows AI systems to stay current without constant retraining.

 

RAG has evolved into a rich ecosystem of architectural layers, each addressing different challenges:

  1. The Core Retrieval layer focuses on improving how information is found, from basic vector search to more advanced techniques like query expansion and hierarchical retrieval.
  2. The Structure-Aware layer organizes and interprets data based on relationships, formats, or time, enabling retrieval from graphs, tables, or multimodal sources.
  3. The Reasoning-Enhanced layer strengthens the model's ability to think with retrieved information through multi-step search, agentic planning, self-reflection, and high-accuracy fusion.
  4. The System-Level Orchestration layer coordinates multiple retrieval and reasoning strategies, integrating tools, memory, personalization, and routing to build adaptive, production-ready AI systems.

 

| Layer | Pattern | Pattern Description | Strengths | Weaknesses | Best Use |
|---|---|---|---|---|---|
| Core Retrieval | Basic RAG | Single-pass retrieval: embed query, retrieve top-k chunks, feed to LLM. | Simple, fast, easy to implement. | Weak with vague or ambiguous queries. | Baseline RAG, small datasets. |
| Core Retrieval | Query Expansion RAG | Expands the user query into multiple variants to improve recall. | Handles vague or short queries well. | Can retrieve irrelevant results. | Search interfaces, consumer chatbots. |
| Core Retrieval | Multi-Vector RAG | Stores multiple embeddings per document (sentence-level or attribute-level). | High precision for dense or multi-topic documents. | Higher storage and compute cost. | Technical manuals, scientific papers. |
| Core Retrieval | Hybrid Search RAG | Combines vector search, keyword search, and metadata filters. | High recall and precision. | More complex retrieval logic. | Enterprise search, compliance. |
| Core Retrieval | Cluster-Based RAG | Clusters documents and retrieves from the most relevant cluster. | Faster retrieval; scalable. | Cluster quality matters. | Large-scale corpora. |
| Core Retrieval | Hierarchical RAG | Two-stage retrieval: coarse (document) then fine (paragraph or sentence). | Reduces noise, scales to long documents. | More complex pipeline. | Legal texts, long PDFs, structured corpora. |
| Structure-Aware Retrieval | Graph-Based RAG | Converts data into a knowledge graph and retrieves via relationships. | Strong relational reasoning. | Requires graph construction and maintenance. | Enterprise knowledge bases. |
| Structure-Aware Retrieval | Chunk-Graph RAG | Builds a graph of chunk-to-chunk relationships for better navigation. | Strong for long or interconnected texts. | Requires preprocessing. | Books, manuals, long reports. |
| Structure-Aware Retrieval | Structured RAG | Retrieves structured data (tables, SQL, JSON) alongside text. | Accurate factual grounding. | Requires schema alignment. | Finance, logistics, analytics. |
| Structure-Aware Retrieval | Temporal RAG | Retrieval is time-aware (recentness, versioning, time decay). | Great for evolving data. | Requires timestamped corpora. | News, markets, real-time systems. |
| Structure-Aware Retrieval | Multimodal RAG | Retrieves images, audio, or video embeddings alongside text. | Richer context; cross-modal reasoning. | Requires multimodal indexing. | Vision-language agents. |
| Reasoning-Enhanced Retrieval | Multi-Hop RAG | Performs sequential retrieval steps to answer multi-step questions. | Excellent for reasoning across documents. | Slower and more complex. | Research, academic QA. |
| Reasoning-Enhanced Retrieval | Agentic RAG | LLM plans retrieval steps and iteratively refines queries. | Strong for multi-step reasoning. | Expensive and harder to control. | Research workflows, complex tasks. |
| Reasoning-Enhanced Retrieval | Self-Reflective / Feedback-Loop RAG | LLM critiques its answer and triggers additional retrieval rounds. | Reduces hallucinations; improves reliability. | Higher latency. | High-stakes or regulated domains. |
| Reasoning-Enhanced Retrieval | Speculative RAG | LLM predicts what information it needs before retrieval. | Faster; reduces unnecessary retrieval. | Can mispredict needs. | Low-latency assistants. |
| Reasoning-Enhanced Retrieval | Fusion-in-Decoder RAG (FiD) | Encodes each retrieved chunk separately and fuses them during decoding. | Very high accuracy; handles many chunks. | Heavy compute cost. | High-quality QA systems. |
| Reasoning-Enhanced Retrieval | Retrieval-Graded RAG | Ranks retrieved chunks using a secondary scoring model or LLM. | Higher-quality context. | Extra inference cost. | Precision-critical tasks. |
| System-Level Orchestration | Routing / Mixture-of-Experts RAG | Router selects the best retriever or workflow for each query. | Domain-aware and flexible. | Requires router training. | Multi-domain assistants. |
| System-Level Orchestration | Tool-Augmented RAG | LLM decides when to call external tools (SQL, APIs) alongside retrieval. | Strong for structured data. | Requires tool orchestration. | Analytics, BI, enterprise workflows. |
| System-Level Orchestration | Memory-Augmented RAG | Stores long-term memory for retrieval (episodic or semantic). | Personalization and continuity. | Requires memory management. | Personal assistants, tutoring systems. |
| System-Level Orchestration | Personalized RAG | Retrieval tuned to user profile or history. | Highly relevant results. | Requires user modeling. | Personalized assistants, education. |
| System-Level Orchestration | Contextual RAG | Uses conversation history or metadata to refine retrieval. | Strong for multi-turn chat. | Can drift if context is noisy. | Customer support, assistants. |
| System-Level Orchestration | Generative Index RAG | LLM generates synthetic summaries or embeddings to improve retrieval. | Better recall; compact indexes. | Risk of synthetic errors. | Large corpora with redundancy. |
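To make the Core Retrieval layer concrete, here is a minimal sketch of Basic RAG's single-pass flow: embed the query, score every chunk, keep the top-k, and feed them to the model as context. This is a toy under stated assumptions: the bag-of-words `embed` function and the three-document `DOCS` corpus are stand-ins for a real embedding model and vector store.

```python
from collections import Counter
import math

# Toy corpus standing in for a real document store.
DOCS = [
    "RAG retrieves relevant chunks and feeds them to the model",
    "Vector search embeds the query and compares it to stored chunks",
    "Clustering groups documents so retrieval can target one cluster",
]

def embed(text):
    """Bag-of-words 'embedding' -- a stand-in for a learned embedding model."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, k=2):
    """Single-pass retrieval: embed the query, score each chunk, take top-k."""
    q = embed(query)
    return sorted(DOCS, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def build_prompt(query, k=2):
    """Feed the retrieved chunks to the LLM as context (the generation step)."""
    context = "\n".join(retrieve(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}"

prompt = build_prompt("how does vector search work")
```

In a production system, the only pieces that change are the embedding model and the index; the single-pass shape (embed, retrieve, prompt) stays the same.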

 

 
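Hybrid Search RAG, listed in the table above, has to merge the ranked lists produced by its keyword and vector retrievers. One common merging technique is Reciprocal Rank Fusion (RRF); a minimal sketch, assuming each retriever has already returned an ordered list of document IDs (the `doc_*` names are made up):

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: each document scores the sum of 1/(k + rank)
    over every ranked list it appears in; higher total ranks first."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical result lists from two retrievers over the same corpus.
vector_hits = ["doc_a", "doc_c", "doc_b"]
keyword_hits = ["doc_b", "doc_a", "doc_d"]

fused = rrf_fuse([vector_hits, keyword_hits])
```

Documents that rank well in both lists (here `doc_a` and `doc_b`) float to the top, which is why RRF is a popular default for hybrid search.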

Tuesday, January 6, 2026

Linking one App.config in multiple projects

I recently had a .NET solution where I wanted to use the same App.config in two different projects. I wanted to ensure that changes made to the App.config in the original project would automatically be reflected in the other project(s). Listed below are the steps I used to accomplish that.

  1. Right-click your project in Solution Explorer.
  2. Select "Add" -> "Existing Item...".
  3. Navigate to the file that you want to add to the solution.
  4. [Important] Instead of pressing Enter or clicking the Add button, click the down-arrow icon at the right edge of the Add button and select "Add As Link".
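For reference, "Add As Link" records the file in the consuming project's .csproj with a Link element instead of copying it. The result looks roughly like the fragment below (the `..\OriginalProject\` path is a made-up example; yours will be the relative path to the original project):

```xml
<!-- In the consuming project's .csproj; the Include path below is hypothetical. -->
<ItemGroup>
  <None Include="..\OriginalProject\App.config">
    <Link>App.config</Link>
  </None>
</ItemGroup>
```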

 

 

Monday, January 5, 2026

Uploading documents to AI Foundry Agents

Q: Can we upload documents such as PDFs as input and configure the agent to retrieve the expected content?

 

A: The short answer is yes, but with some configuration. Working in the Agent Playground, you have the option to "Add" Knowledge. This knowledge can come from a variety of data sources, as seen below. As stated there, "Currently only a single instance per each type of data source is supported." In my scenario, I had a single text file set up when I configured my agent, before it was published.

 

Another option is to use Azure AI Search to index multiple documents from a data store; with this approach, documents can be uploaded after the agent is published. As of today, there are six other options available for accessing documents.

 

 

Wednesday, December 31, 2025

AI Agents vs. Agentic AI

AI agents and agentic AI are related but not the same. AI agents are task-oriented systems built around LLMs, while agentic AI refers to a broader paradigm in which AI systems exhibit autonomy, goal-directed behavior, and self-improving capabilities.

 

AI Agents vs. Agentic AI

AI Agents: perform tasks but do not necessarily set their own goals.

•             Modular systems built around LLMs or LIMs.

•             Designed for narrow, task-specific automation.

•             Operate through tool integration, prompt engineering, and workflow orchestration.

 

Examples:

·                  A customer service chatbot

·                  A research assistant that retrieves and summarizes documents

·                  A coding agent that fixes bugs when prompted

 

Agentic AI: behaves like an agent (goal-driven, adaptive, and self-improving)

•             A broader paradigm where AI systems exhibit autonomy, self-direction, and persistent goal pursuit.

•             Goes beyond task execution to include:

·                  Planning

·                  Reflection and self-correction

·                  Long-horizon reasoning

·                  Adaptive behavior

•             Often involves multi-step, self-initiated workflows.

 

 

Side-by-Side Comparison

| Feature | AI Agents | Agentic AI |
|---|---|---|
| Scope | Narrow tasks | Broad, multi-step goals |
| Autonomy | Low to moderate | High |
| Goal Setting | User-defined | AI may refine or generate goals |
| Reasoning Depth | Shallow to moderate | Deep, reflective, iterative |
| Architecture | Modular workflows | Self-directed cognitive loops |
| Examples | Chatbots, RPA-like tools | Autonomous research systems, self-improving agents |

 

Why the Distinction Matters

The research argues that the two concepts diverge in design philosophy and capabilities:

•             AI agents are an engineering pattern—a way to wrap LLMs in tools and workflows.

•             Agentic AI is a behavioral paradigm—systems that act with increasing independence.

 

This matters for:

•             Safety (agentic systems require stronger oversight)

•             Applications (agentic AI can handle long term, complex tasks)

•             Regulation (autonomy introduces new risks and responsibilities)

 

 

Examples to Make It Concrete

AI Agent Example

You ask: "Summarize these 10 PDFs."

The agent:

  1. Retrieves files
  2. Summarizes them
  3. Returns results

It does not decide to read more papers or refine the topic unless instructed.

 

Agentic AI Example

You ask: "Research the best battery technology for drones."

An agentic system might:

  1. Break the problem into sub-goals
  2. Search literature
  3. Evaluate trade-offs
  4. Generate experiments
  5. Identify missing data
  6. Suggest next steps

It acts like a researcher, not just a tool.
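The control-flow difference between the two examples can be sketched in a few lines. In this toy loop, `plan`, `execute`, and `needs_more_work` are hypothetical stand-ins for LLM calls, hard-coded so the flow is runnable; the point is that an agentic system decomposes the goal itself and decides whether to queue more work, rather than executing a fixed instruction:

```python
def plan(goal):
    # Stand-in for an LLM planning call: decompose the goal into sub-goals.
    return [f"search literature on {goal}", f"evaluate trade-offs for {goal}"]

def execute(step):
    # Stand-in for carrying out one sub-goal (search, tool call, etc.).
    return f"result of: {step}"

def needs_more_work(results):
    # An agentic system would let the model reflect here; this sketch
    # simply stops after the first pass.
    return False

def agentic_run(goal):
    results = []
    steps = plan(goal)          # self-directed decomposition into sub-goals
    while steps:
        results.append(execute(steps.pop(0)))
        if not steps and needs_more_work(results):
            steps = plan(goal)  # reflection can queue follow-up work
    return results

out = agentic_run("drone battery technology")
```

A plain AI agent would be the same code without `plan` and `needs_more_work`: it would just execute the steps it was handed.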

 

Wednesday, October 29, 2025

Deployment Types in AI Foundry

Deploying a model in Azure AI Foundry can be done in nine different ways. Depending on the type of deployment chosen, it may impact one or more factors, such as cost, latency, efficiency for processing large datasets, and compliance. Listed below is a description of each deployment type, along with its advantages and disadvantages. For more details, please visit https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/concepts/deployment-types

| Deployment Type | Description | Advantage | Disadvantage |
|---|---|---|---|
| Global Standard | Shared global infrastructure for general-purpose model inference. | Cost-effective and easy to scale. | Performance may vary under high demand. |
| Global Provisioned | Dedicated global infrastructure for consistent performance. | Reliable throughput and latency. | Higher cost due to dedicated resources. |
| Global Batch | Asynchronous global batch processing for large-scale inference jobs. | Efficient for processing large datasets. | Not suitable for real-time applications. |
| Data Zone Standard | Shared infrastructure within a specific data zone for compliance needs. | Meets data residency requirements affordably. | Limited performance consistency. |
| Data Zone Provisioned | Dedicated infrastructure in a data zone for high-performance workloads. | Combines compliance with consistent performance. | More expensive than shared options. |
| Data Zone Batch | Batch processing within a data zone for regulated data workflows. | Ideal for compliant, large-scale processing. | Slower response times; not real-time. |
| Standard | Default shared deployment for general use across Azure AI Foundry. | Simple setup and broad compatibility. | May lack advanced performance or compliance features. |
| Regional Provisioned | Dedicated infrastructure in a specific region for localized performance. | Optimized for regional latency and control. | Higher cost and limited to regional availability. |
| Developer (Fine-tuned) | Lightweight deployment for testing and iterating fine-tuned models. | Fast iteration and low cost for development. | Not suitable for production-scale workloads. |

Friday, October 10, 2025

Converting .NET Application from Oracle to SQL Server

The SQL Server equivalent of Oracle.ManagedDataAccess.Client is either System.Data.SqlClient or Microsoft.Data.SqlClient.

System.Data.SqlClient is the older built-in provider.

Microsoft.Data.SqlClient is the newer, actively maintained version with better support for .NET Core and .NET 5+.

 

| Feature | Oracle.ManagedDataAccess.Client | System.Data.SqlClient / Microsoft.Data.SqlClient |
|---|---|---|
| Database | Oracle | SQL Server |
| Namespace | Oracle.ManagedDataAccess.Client | System.Data.SqlClient or Microsoft.Data.SqlClient |
| Connection class | OracleConnection | SqlConnection |
| Command class | OracleCommand | SqlCommand |
| Data reader class | OracleDataReader | SqlDataReader |
| NuGet package | Oracle.ManagedDataAccess | System.Data.SqlClient (legacy) or Microsoft.Data.SqlClient (modern) |

 


How can I remove GitHub bindings from a Visual Studio 2022 Solution

To remove Git from a solution in Visual Studio 2022, effectively unbinding it from source control, follow these steps:
  1. Ensure the solution is NOT open in the Visual Studio IDE.
  2. Navigate to the root directory of your solution using File Explorer.
  3. If you cannot see the .git folder, you need to enable the display of hidden files and folders in File Explorer. In Windows, open File Explorer, go to the "View" tab, and check "Hidden items."
  4. Delete the .git folder within your solution's root directory. This folder contains all the Git repository information, including history, branches, and tags for the solution and all projects within it.
  5. Visual Studio will now recognize that the Git repository is no longer present and will stop managing the solution with Git source control.