Add Foundry local samples #170

Merged · 6 commits · Jun 6, 2025
2 changes: 1 addition & 1 deletion 03-CoreGenerativeAITechniques/05-ImageGenerationOpenAI.md
@@ -155,4 +155,4 @@ Image generation opens up new creative possibilities for your applications, allo

You've learned how to generate images with Azure OpenAI in your .NET applications. In the next lesson, you'll learn how to run AI models locally on your machine.

- 👉 [Running models locally with AI Toolkit and Docker](./06-AIToolkitAndDockerModels.md)
+ 👉 [Running models locally with AI Toolkit, Docker, and Foundry Local](./06-LocalModelRunners.md)
214 changes: 214 additions & 0 deletions 03-CoreGenerativeAITechniques/06-LocalModelRunners.md
@@ -0,0 +1,214 @@
# Running AI Models Locally: AI Toolkit, Docker, and Foundry Local

In this lesson, you'll learn how to run AI models locally using three popular approaches:

- **[AI Toolkit for Windows](https://learn.microsoft.com/windows/ai/toolkit/)** – A suite of tools for Windows that enables running AI models locally
- **[Docker Model Runner](https://docs.docker.com/model-runner/)** – A containerized approach for running AI models with Docker
- **[Foundry Local](https://learn.microsoft.com/azure/ai-foundry/foundry-local/)** – A cross-platform, open-source solution for running Microsoft AI models locally

Running models locally provides several benefits:

- Data privacy – Your data never leaves your machine
- Cost efficiency – No usage charges for API calls
- Offline availability – Use AI even without internet connectivity
- Customization – Fine-tune models for specific use cases

## AI Toolkit for Windows

The AI Toolkit for Windows is a collection of tools and technologies that help you build and run AI applications locally on Windows PCs. It leverages the Windows platform capabilities to optimize AI workloads.

### Key Features

- **DirectML** – Hardware-accelerated machine learning primitives
- **Windows Runtime (WinRT)** – Runtime environment for AI models
- **ONNX Runtime** – Cross-platform inference accelerator
- **Local model downloads** – Access to optimized models for Windows

### Getting Started

1. [Install the AI Toolkit](https://learn.microsoft.com/windows/ai/toolkit/install)
2. Download a supported model
3. Use the APIs through .NET or other supported languages

> 📝 **Note**: AI Toolkit for Windows requires Windows 10/11 and compatible hardware for optimal performance.

## Docker Model Runner

Docker Model Runner is a tool for running AI models in containers, making it easy to deploy and run inference workloads consistently across different environments.

### Key Features - Docker Model Runner

- **Containerized models** – Package models with their dependencies
- **Cross-platform** – Run on Windows, macOS, and Linux
- **Built-in API** – RESTful API for model interaction
- **Resource management** – Control CPU and memory usage

### Getting Started - Docker Model Runner

1. [Install Docker Desktop](https://www.docker.com/products/docker-desktop/)
2. Pull a model image
3. Run the model container
4. Interact with the model through the API

```bash
# Enable host TCP access to Docker Model Runner (Docker Desktop)
docker desktop enable model-runner --tcp 12434

# Pull and run a DeepSeek model
docker model pull ai/deepseek-r1-distill-llama
docker model run ai/deepseek-r1-distill-llama "Hello!"
```

## Foundry Local

[Foundry Local](https://learn.microsoft.com/azure/ai-foundry/foundry-local/) is an open-source, cross-platform solution for running Microsoft AI models on your own hardware. It supports Windows, Linux, and macOS, and is designed for privacy, performance, and flexibility.

- **Official documentation:** https://learn.microsoft.com/azure/ai-foundry/foundry-local/
- **GitHub repository:** https://github.com/microsoft/Foundry-Local/tree/main

### Key Features - Foundry Local

- **Cross-platform** – Windows, Linux, and macOS
- **Microsoft models** – Run models from Azure AI Foundry locally
- **REST API** – Interact with models using a local API endpoint
- **No cloud dependency** – All inference runs on your machine

### Getting Started - Foundry Local

1. [Read the official Foundry Local documentation](https://learn.microsoft.com/azure/ai-foundry/foundry-local/)
2. Download and install Foundry Local for your OS
3. Start the Foundry Local server and download a model
4. Use the REST API to interact with the model
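The REST step above can be sketched with a plain `curl` call. This is a minimal sketch rather than the repo's sample code: the base URL and model name are taken from this repository's Foundry Local samples, and your local port and model may differ (check the endpoint Foundry Local reports when the service starts).

```bash
# Base URL and model name as used in this repo's samples; adjust to
# whatever your local Foundry Local service reports.
BASE_URL="http://localhost:5273/v1"
MODEL="Phi-3.5-mini-instruct-cuda-gpu"

# Build an OpenAI-style chat completion request body
BODY=$(cat <<EOF
{
  "model": "$MODEL",
  "messages": [
    { "role": "user", "content": "Say hello in one short sentence." }
  ]
}
EOF
)

# Send the request; prints the JSON response, or a hint if the server is down
curl -s "$BASE_URL/chat/completions" \
  -H "Content-Type: application/json" \
  -d "$BODY" || echo "Could not reach Foundry Local at $BASE_URL"
```

Because the endpoint is OpenAI-compatible, the same request shape works from any OpenAI client library, which is what the .NET samples in this repository rely on.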

## Sample Code: Using AI Toolkit for Windows with .NET

The AI Toolkit for Windows provides a way to run AI models locally on Windows machines. We have two examples that demonstrate how to interact with AI Toolkit models using .NET:

### 1. Semantic Kernel with AI Toolkit

The [AIToolkit-01-SK-Chat](./src/AIToolkit-01-SK-Chat/) project shows how to use Semantic Kernel to chat with a model running via AI Toolkit for Windows.

```csharp
// Example code demonstrating AI Toolkit with Semantic Kernel integration
var builder = Kernel.CreateBuilder();
// Configure to use a locally installed model through AI Toolkit
builder.AddOpenAIChatCompletion(
    modelId: modelId,
    apiKey: apiKey,
    endpoint: new Uri(endpoint));
var kernel = builder.Build();
// Create a chat history and use it with the kernel
```

### 2. Microsoft Extensions for AI with AI Toolkit

The [AIToolkit-02-MEAI-Chat](./src/AIToolkit-02-MEAI-Chat/) project demonstrates how to use Microsoft Extensions for AI to interact with AI Toolkit models.

```csharp
// Example code demonstrating AI Toolkit with MEAI
OpenAIClientOptions options = new OpenAIClientOptions();
options.Endpoint = new Uri(endpoint);
ApiKeyCredential credential = new ApiKeyCredential(apiKey);
// Create a chat client using local model through AI Toolkit
ChatClient client = new OpenAIClient(credential, options).GetChatClient(modelId);
```

## Sample Code: Using Docker Models with .NET

In this repository, we have two examples that demonstrate how to interact with Docker-based models using .NET:

### 1. Semantic Kernel with Docker Models

The [DockerModels-01-SK-Chat](./src/DockerModels-01-SK-Chat/) project shows how to use Semantic Kernel to chat with a model running in Docker.

```csharp
var model = "ai/deepseek-r1-distill-llama";
var baseUrl = "http://localhost:12434/engines/llama.cpp/v1";
var apiKey = "unused";

// Create a chat completion service
var builder = Kernel.CreateBuilder();
builder.AddOpenAIChatCompletion(modelId: model, apiKey: apiKey, endpoint: new Uri(baseUrl));
var kernel = builder.Build();

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddSystemMessage("You are a useful chatbot. Always reply in a funny way with short answers.");

// ... continue with chat functionality
```

### 2. Microsoft Extensions for AI with Docker Models

The [DockerModels-02-MEAI-Chat](./src/DockerModels-02-MEAI-Chat/) project demonstrates how to use Microsoft Extensions for AI to interact with Docker-based models.

```csharp
var model = "ai/deepseek-r1-distill-llama";
var baseUrl = "http://localhost:12434/engines/llama.cpp/v1";
var apiKey = "unused";

OpenAIClientOptions options = new OpenAIClientOptions();
options.Endpoint = new Uri(baseUrl);
ApiKeyCredential credential = new ApiKeyCredential(apiKey);

ChatClient client = new OpenAIClient(credential, options).GetChatClient(model);

// Build and send a prompt
StringBuilder prompt = new StringBuilder();
prompt.AppendLine("You will analyze the sentiment of the following product reviews...");
// ... add more text to the prompt

var response = await client.CompleteChatAsync(prompt.ToString());
Console.WriteLine(response.Value.Content[0].Text);
```

## Sample Code: Using Foundry Local with .NET

This repository includes two demos for Foundry Local:

### 1. Semantic Kernel with Foundry Local

The [AIFoundryLocal-01-SK-Chat](./src/AIFoundryLocal-01-SK-Chat/Program.cs) project shows how to use Semantic Kernel to chat with a model running via Foundry Local.

### 2. Microsoft Extensions for AI with Foundry Local

The [AIFoundryLocal-01-MEAI-Chat](./src/AIFoundryLocal-01-MEAI-Chat/Program.cs) project demonstrates how to use Microsoft Extensions for AI to interact with Foundry Local models.

## Running the Samples

To run the samples in this repository:

1. Install Docker Desktop, AI Toolkit for Windows, or Foundry Local as needed
2. Pull or download the required model
3. Start the local model server (Docker, AI Toolkit, or Foundry Local)
4. Navigate to one of the sample project directories
5. Run the project with `dotnet run`

## Comparing Local Model Runners

| Feature | AI Toolkit for Windows | Docker Model Runner | Foundry Local |
|---------|------------------------|---------------------|--------------|
| Platform | Windows only | Cross-platform | Cross-platform |
| Integration | Native Windows APIs | REST API | REST API |
| Deployment | Local installation | Container-based | Local installation |
| Hardware Acceleration | DirectML, DirectX | CPU, GPU | CPU, GPU |
| Models | Optimized for Windows | Any containerized model | Microsoft Foundry models |

## Additional Resources

- [AI Toolkit for Windows Documentation](https://learn.microsoft.com/windows/ai/toolkit/)
- [Docker Model Runner Documentation](https://docs.docker.com/model-runner/)
- [Foundry Local Documentation](https://learn.microsoft.com/azure/ai-foundry/foundry-local/)
- [Foundry Local GitHub Repository](https://github.com/microsoft/Foundry-Local/tree/main)
- [Semantic Kernel Documentation](https://learn.microsoft.com/semantic-kernel/overview/)
- [Microsoft Extensions for AI Documentation](https://learn.microsoft.com/dotnet/ai/)

## Summary

Running AI models locally with AI Toolkit for Windows, Docker Model Runner, or Foundry Local offers flexibility, privacy, and cost benefits. The samples in this repository demonstrate how to integrate these local models with your .NET applications using Semantic Kernel and Microsoft Extensions for AI.

## Next Steps

You've learned how to run AI models locally using AI Toolkit for Windows, Docker Model Runner, and Foundry Local. Next, you'll explore how to create AI agents that can perform tasks autonomously.

👉 [Check out AI Agents](./04-agents.md)
2 changes: 1 addition & 1 deletion 03-CoreGenerativeAITechniques/readme.md
@@ -22,7 +22,7 @@ For this lesson, we will subdivide the content into the following sections:
- [Vision and audio AI applications](./03-vision-audio.md)
- [Image Generation with Azure OpenAI](./05-ImageGenerationOpenAI.md)
- [Agents](04-agents.md)
- [Running models locally with AI Toolkit and Docker](./06-AIToolkitAndDockerModels.md)
- [Running models locally with AI Toolkit, Docker, and Foundry Local](./06-LocalModelRunners.md)

Starting with Language Model completions and Chat applications and function implementations with language models in .NET.

@@ -0,0 +1,17 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net9.0</TargetFramework>
<RootNamespace>AIFoundryLocal_01_MEAI_Chat</RootNamespace>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>
<ItemGroup>
<PackageReference Include="Microsoft.Extensions.AI" Version="9.5.0" />
<PackageReference Include="Microsoft.Extensions.AI.OpenAI" Version="9.5.0-preview.1.25265.7" />
<PackageReference Include="Microsoft.Extensions.Configuration" Version="9.0.5" />
<PackageReference Include="Microsoft.Extensions.Configuration.Json" Version="9.0.5" />
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="9.0.5" />
</ItemGroup>
</Project>
@@ -0,0 +1,28 @@
using OpenAI;
using OpenAI.Chat;
using System.ClientModel;
using System.Text;

var model = "Phi-3.5-mini-instruct-cuda-gpu";
var baseUrl = "http://localhost:5273/v1";
var apiKey = "unused";

OpenAIClientOptions options = new OpenAIClientOptions();
options.Endpoint = new Uri(baseUrl);
ApiKeyCredential credential = new ApiKeyCredential(apiKey);

ChatClient client = new OpenAIClient(credential, options).GetChatClient(model);

// here we're building the prompt
StringBuilder prompt = new StringBuilder();
prompt.AppendLine("You will analyze the sentiment of the following product reviews. Each line is its own review. Output the sentiment of each review in a bulleted list and then provide a general sentiment of all reviews. ");
prompt.AppendLine("I bought this product and it's amazing. I love it!");
prompt.AppendLine("This product is terrible. I hate it.");
prompt.AppendLine("I'm not sure about this product. It's okay.");
prompt.AppendLine("I found this product based on the other reviews. It worked for a bit, and then it didn't.");

// send the prompt to the model and wait for the text completion
var response = await client.CompleteChatAsync(prompt.ToString());

// display the response
Console.WriteLine(response.Value.Content[0].Text);
@@ -0,0 +1,16 @@
<Project Sdk="Microsoft.NET.Sdk">

<PropertyGroup>
<OutputType>Exe</OutputType>
<TargetFramework>net9.0</TargetFramework>
<RootNamespace>AIFoundryLocal_01_SK_Chat</RootNamespace>
<ImplicitUsings>enable</ImplicitUsings>
<Nullable>enable</Nullable>
</PropertyGroup>

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="9.0.5" />
<PackageReference Include="Microsoft.SemanticKernel" Version="1.55.0" />
</ItemGroup>

</Project>
@@ -0,0 +1,46 @@
#pragma warning disable SKEXP0001, SKEXP0003, SKEXP0010, SKEXP0011, SKEXP0050, SKEXP0052
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
using System.Text;

var model = "Phi-3.5-mini-instruct-cuda-gpu";
var baseUrl = "http://localhost:5273/v1";
var apiKey = "unused";

// Create a chat completion service
var kernel = Kernel.CreateBuilder()
.AddOpenAIChatCompletion(modelId: model, apiKey: apiKey, endpoint: new Uri(baseUrl))
.Build();

var chat = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
history.AddSystemMessage("You are a useful chatbot. Always reply in a funny way with short answers.");

var settings = new OpenAIPromptExecutionSettings
{
MaxTokens = 50000,
Temperature = 1
};

while (true)
{
Console.Write("Q: ");
var userQuestion = Console.ReadLine();
if (string.IsNullOrWhiteSpace(userQuestion))
{
break;
}
history.AddUserMessage(userQuestion);

var responseBuilder = new StringBuilder();
Console.Write("AI: ");
await foreach (var message in chat.GetStreamingChatMessageContentsAsync(history, settings, kernel))
{
responseBuilder.Append(message.Content);
Console.Write(message.Content);
}
Console.WriteLine();

history.AddAssistantMessage(responseBuilder.ToString());
}
@@ -10,7 +10,7 @@

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="9.0.5" />
- <PackageReference Include="Microsoft.SemanticKernel" Version="1.54.0" />
+ <PackageReference Include="Microsoft.SemanticKernel" Version="1.55.0" />
</ItemGroup>

</Project>
@@ -11,7 +11,7 @@

<ItemGroup>
<PackageReference Include="Microsoft.Extensions.Configuration.UserSecrets" Version="9.0.2" />
- <PackageReference Include="Microsoft.SemanticKernel" Version="1.54.0" />
+ <PackageReference Include="Microsoft.SemanticKernel" Version="1.55.0" />
<PackageReference Include="Microsoft.Extensions.Configuration" Version="9.0.5" />
</ItemGroup>

@@ -9,7 +9,7 @@
</PropertyGroup>

<ItemGroup>
- <PackageReference Include="Microsoft.SemanticKernel" Version="1.54.0" />
+ <PackageReference Include="Microsoft.SemanticKernel" Version="1.55.0" />
<PackageReference Include="Microsoft.Extensions.Configuration" Version="9.0.5" />
<PackageReference Include="Microsoft.SemanticKernel.Connectors.Ollama" Version="1.54.0-alpha" />
</ItemGroup>