Posts

DeepSeek R1 in a Nutshell

DeepSeek R1 is an advanced AI model developed by the Chinese startup DeepSeek AI. It has gained significant attention for the following reasons:
- Open source: available for use by anyone.
- Comparable performance: it is comparable to OpenAI's GPT-4 and ChatGPT o1 models on various benchmarks.
- Low training cost: DeepSeek R1 was reportedly trained on 2,788 GPUs at a cost of around $6 million, significantly less than the estimated $100 million cost to train OpenAI's GPT-4.
- Reasoning: it excels in reasoning tasks and has been trained using large-scale reinforcement learning without supervised fine-tuning.
- Availability: it is offered on platforms like Azure AI Foundry and GitHub, making it accessible to developers and researchers.

DeepSeek R1's open-source nature and cost-effective training have made it a notable player in the AI community, challenging the notion that larger models and more data always lead to better performance.

To get started, DeepSeek R1 is now available via a serverless endpoint through the model catalog ...

What is Automated Intelligence?

Automated Intelligence refers to the use of technology to automate repetitive, rule-based tasks that typically require minimal human intervention. This includes everything from data entry to workflow management and beyond. The goal of Automated Intelligence is to streamline processes, increase efficiency, and reduce the potential for human error.

Artificial Intelligence encompasses a broader scope, including machine learning, natural language processing, and more. AI is designed to simulate human intelligence and can perform complex tasks like understanding language, recognizing patterns, and making decisions based on data. AI systems can learn and adapt over time, improving their performance with more data and experience.

Automated Intelligence (AI) and Artificial Intelligence (AI) share the same abbreviation, but their applications and implications can differ significantly.
- Scope: Automated Intelligence focuses on automating specific tasks, w...

"Data Science with .NET and Polyglot Notebooks" By Matt Eland

In the fall of 2024, I had the opportunity to work with Matt Eland and be one of the editors for his book "Data Science with .NET and Polyglot Notebooks: Programmer's guide to data science using ML.NET, OpenAI, and Semantic Kernel". Matt is a very intelligent and knowledgeable data science developer, and it definitely shows in his work. He walks the reader through step-by-step directions to demonstrate key concepts in data science, machine learning, and polyglot notebooks. This was one of the rare books that I could hardly put down. I urge you to pick up a copy and upgrade your data science skills.

Recap of "How To Tune A Multi-Terabyte Database For Optimum Performance"

On October 29, 2024, at the GroupBy conference, I moderated Jeff Taylor's session "How To Tune A Multi-Terabyte Database For Optimum Performance". The video is available at https://www.youtube.com/watch?v=9j51bD0DPZE

Listed below are some takeaways and Q&As from his session:

Ideal latency: 20 ms for IO, 10 ms for TempDB

CrystalDiskMark is a simple disk benchmark tool: https://crystalmark.info/en/software/crystaldiskmark/

What is the overhead of running these diagnostics (i.e. diskspd and CrystalDiskMark)?
No adverse effects during mid-day testing, but don't run it during a busy time. It's best to test it during both busy and non-busy times

Multipath: multiple network cards between host, switch and SAN appliance

For tempdb storage, what's preferable? Shared space on a disk pool with a lot of drives or dedicated pool with just 2 drives (RAID 1)?
all drives of the same type (N...
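As a rough illustration of the latency targets quoted in the recap, the sketch below compares averaged readings against 20 ms for IO and 10 ms for TempDB. Treating those figures as upper bounds is an assumption, and the names `LATENCY_TARGETS_MS`, `check_latency`, and `summarize`, as well as the sample readings, are hypothetical and not taken from the session.

```python
# Latency targets quoted in the session recap, in milliseconds.
# Treating them as upper bounds is an assumption of this sketch.
LATENCY_TARGETS_MS = {"io": 20.0, "tempdb": 10.0}


def check_latency(kind: str, measured_ms: float) -> bool:
    """Return True when a measured latency is at or below the target for `kind`."""
    return measured_ms <= LATENCY_TARGETS_MS[kind]


def summarize(samples: dict) -> dict:
    """Average each list of samples and compare the average with its target."""
    report = {}
    for kind, values in samples.items():
        average = sum(values) / len(values)
        report[kind] = {
            "average_ms": average,
            "within_target": check_latency(kind, average),
        }
    return report


if __name__ == "__main__":
    # Hypothetical readings, e.g. copied by hand from a benchmark run.
    readings = {"io": [12.0, 18.0, 15.0], "tempdb": [9.0, 14.0, 13.0]}
    for kind, result in summarize(readings).items():
        print(kind, result)
```

Averaging is the simplest possible summary. A real check would more likely look at percentiles and at readings taken during both busy and quiet periods, as the session suggests.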

Temporal Tables FAQ

I had the pleasure of presenting Temporal Tables to the Capital Area .NET User Group on December 10, 2024. Some interesting questions arose from that meeting, so I thought it would be good to share them on my blog for reference.

Is the historical table logged to the transaction log in the same way a conventional table is? Will we "see" the inserts in the transaction log the same way we see them for a normal table?
No

Are Temporal Tables available in Azure?
Yes, in Azure SQL Database and Azure SQL Managed Instance

Will Temporal Tables work with graph tables?
No, node and edge tables can't be created as system-versioned temporal tables
Ref: https://learn.microsoft.com/en-us/sql/relational-databases/graphs/sql-graph-architecture?view=sql-server-ver16

What triggers the purge of SQL Server Temporal Tables?
A background task is created to perform aged data cleanup for all temporal tables with finite retention period...

Questions on Copilot Data Privacy

Q: I'm concerned about what Microsoft and/or the US government can do with the data for a custom copilot. I've looked at the Microsoft Copilot documentation, but I didn't find anything that clearly states what Microsoft can and cannot do with data used in custom copilots. Do you have any resources that you can share?

A: Microsoft posted info about this topic specifically at https://learn.microsoft.com/en-us/legal/cognitive-services/openai/data-privacy?context=%2Fazure%2Fcognitive-services%2Fopenai%2Fcontext%2Fcontext#see-also

In a nutshell, your prompts (inputs) and completions (outputs), your embeddings, and your training data:
- are NOT available to other customers.
- are NOT available to OpenAI.
- are NOT used to improve OpenAI models.
- are NOT used to improve any Microsoft or 3rd party products or services.
- are NOT used for automatically improving Azure OpenAI models for your use in your resource (The models are stateless, unless you explicitly ...

"Build Your Own Copilot" Resource Links

Interested in building your own Copilot using Azure AI Studio? Listed below are some useful links:
- Azure AI Studio Architecture
- Quickstart: Create a project and use the chat playground in Azure AI Studio
- Tutorial: Deploy an Enterprise Chat web app
- AI Studio FAQ