Building a Streaming AI CLI Tool with Microsoft Agents Framework (C#)
Introduction
The Microsoft Agents Framework provides a powerful abstraction layer for building AI-powered applications in .NET. In this post, we’ll explore how to create a command-line interface (CLI) tool that leverages this framework to stream AI responses in real-time, providing users with immediate feedback as the AI generates content.
Why Streaming Matters
When working with large language models (LLMs), a complete response can take many seconds to generate. Streaming displays tokens as they're produced rather than waiting for the full response, so the application feels responsive from the very first token.
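The consuming side of this idea is just C#'s `await foreach` over an async stream. As a minimal self-contained sketch (the `SimulateTokensAsync` iterator below is a hypothetical stand-in for a model's token stream, not part of any library):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a model's token stream: yields words with a small
// delay, exposing the same shape (IAsyncEnumerable<string>) a streaming client would.
static async IAsyncEnumerable<string> SimulateTokensAsync(string text)
{
    foreach (var word in text.Split(' '))
    {
        await Task.Delay(50); // simulate per-token generation latency
        yield return word + " ";
    }
}

// Consume the stream: each token is rendered as soon as it's available,
// instead of blocking until the whole response exists.
await foreach (var token in SimulateTokensAsync("Streaming makes CLIs feel responsive"))
{
    Console.Write(token);
}
Console.WriteLine();
```

The same loop shape is what we'll use against the real agent later in the post.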
Prerequisites
To follow along with this tutorial, you’ll need:
- .NET 9.0 SDK or later
- Ollama installed and running locally
- Basic knowledge of C# and async programming
Setting Up the Project
First, create a new console application and add the required NuGet packages:
dotnet new console -n StreamingAICLI
cd StreamingAICLI
dotnet add package Microsoft.Agents.AI --version 1.0.0-preview.251028.1
dotnet add package OllamaSharp --version 5.4.8
dotnet add package Spectre.Console --version 0.49.1
The packages we’re using:
- Microsoft.Agents.AI: The core framework for building AI agents
- OllamaSharp: Client library for interacting with Ollama
- Spectre.Console: Rich console UI library for enhanced terminal output
Core Implementation
Connecting to Ollama
Start by establishing a connection to your Ollama server and retrieving available models:
using Microsoft.Extensions.AI;
using Microsoft.Agents.AI;
using OllamaSharp;
using Spectre.Console;
using System.Text;
// Configure the Ollama server connection
var serverUrl = "http://localhost:11434";
var ollama = new OllamaApiClient(serverUrl);
// Retrieve available models
var models = await ollama.ListLocalModelsAsync();
var modelNames = models.Select(m => m.Name).ToList();
if (!modelNames.Any())
{
    AnsiConsole.MarkupLine("[red]No local models found. Please install a model first.[/]");
    return;
}
Creating the Chat Agent
The ChatClientAgent class from the Microsoft Agents Framework wraps an IChatClient implementation (here, the Ollama client) and adds agent behavior, including streaming:
// Select a model
var selectedModel = "llama3.2:latest";
// Create the chat client
var chatClient = new OllamaApiClient(serverUrl, selectedModel);
// Configure the agent with instructions
var agent = new ChatClientAgent(
    chatClient,
    new ChatClientAgentOptions
    {
        Name = "AI Assistant",
        Instructions = "You are a helpful assistant that provides clear and concise answers."
    });
Streaming Responses
The key to streaming is the RunStreamingAsync method, which returns an IAsyncEnumerable of response updates that you consume with await foreach; writing each update to the console prints the text it carries:
var prompt = "Explain what streaming responses are and why they're useful.";
// Stream the response as it is generated; writing each update prints its text
await foreach (var update in agent.RunStreamingAsync(prompt))
{
    Console.Write(update);
}
Console.WriteLine();
This simple loop writes each token to the console as it arrives, creating a typewriter effect that shows the response being generated in real-time.
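Long generations are also worth making cancellable. Because the stream is an IAsyncEnumerable, the standard WithCancellation pattern applies regardless of provider; this self-contained sketch wires Ctrl+C to a CancellationTokenSource, with a hypothetical TokensAsync iterator standing in for the agent's stream (the auto-cancel timeout is only there so the demo terminates on its own):

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical endless token stream standing in for agent.RunStreamingAsync(...)
static async IAsyncEnumerable<string> TokensAsync(
    [EnumeratorCancellation] CancellationToken ct = default)
{
    var i = 0;
    while (true)
    {
        await Task.Delay(50, ct); // throws when cancelled, ending the stream
        yield return $"token{i++} ";
    }
}

using var cts = new CancellationTokenSource();
// Ctrl+C cancels the stream instead of killing the process outright.
Console.CancelKeyPress += (_, e) => { e.Cancel = true; cts.Cancel(); };
cts.CancelAfter(TimeSpan.FromSeconds(2)); // demo safeguard so the sketch terminates

try
{
    await foreach (var token in TokensAsync().WithCancellation(cts.Token))
    {
        Console.Write(token);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine("\n[cancelled]");
}
```

The [EnumeratorCancellation] attribute is what lets WithCancellation flow the token into the iterator's own awaits.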
Enhanced User Experience with Spectre.Console
For a more polished experience, we can use Spectre.Console to add visual feedback:
var responseBuilder = new StringBuilder();
await AnsiConsole.Status()
    .Spinner(Spinner.Known.Dots)
    .StartAsync("[green]Generating response...[/]", async ctx =>
    {
        await foreach (var update in agent.RunStreamingAsync(prompt))
        {
            Console.Write(update);
            responseBuilder.Append(update.ToString());
        }
    });
Console.WriteLine();
This displays a spinner while the response is being generated, streaming the text as it arrives and capturing it in a StringBuilder for later use.
Building a Conversation Loop
A complete CLI tool should support multiple queries in a single session:
while (true)
{
    var userPrompt = AnsiConsole.Ask<string>("[green]Your question:[/]");
    if (userPrompt.Equals("exit", StringComparison.OrdinalIgnoreCase))
    {
        break;
    }

    var response = new StringBuilder();
    await AnsiConsole.Status()
        .StartAsync("[green]Thinking...[/]", async ctx =>
        {
            await foreach (var update in agent.RunStreamingAsync(userPrompt))
            {
                Console.Write(update);
                response.Append(update.ToString());
            }
        });
    Console.WriteLine("\n");
}
Advanced Features
Saving Conversation History
You can capture the streamed responses and save them for later reference:
var timestamp = DateTime.Now.ToString("yyyy-MM-dd_HH-mm-ss");
var fileName = $"conversation_{timestamp}.md";
var markdown = $"""
# Conversation - {DateTime.Now:yyyy-MM-dd HH:mm:ss}
## Model
{selectedModel}
## Prompt
{userPrompt}
## Response
{response}
---
*Generated on {DateTime.Now:yyyy-MM-dd HH:mm:ss}*
""";
await File.WriteAllTextAsync(fileName, markdown);
Model Switching
Allow users to switch between different AI models during a session:
if (userPrompt.Equals("switch model", StringComparison.OrdinalIgnoreCase))
{
    selectedModel = AnsiConsole.Prompt(
        new SelectionPrompt<string>()
            .Title("[green]Select a model:[/]")
            .AddChoices(modelNames));

    // Recreate the agent with the new model
    chatClient = new OllamaApiClient(serverUrl, selectedModel);
    agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
    {
        Name = "AI Assistant",
        Instructions = "You are a helpful assistant."
    });
    continue;
}
Custom Instructions
Allow users to configure the agent’s behavior:
var instructions = AnsiConsole.Ask<string>(
    "[green]Enter instructions for the AI:[/]",
    "You are a helpful assistant that provides clear answers.");

var agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "AI Assistant",
    Instructions = instructions
});
Complete Example
Here’s a minimal but complete working example:
using Microsoft.Extensions.AI;
using Microsoft.Agents.AI;
using OllamaSharp;
using Spectre.Console;
AnsiConsole.MarkupLine("[bold cyan]AI Chat Assistant[/]");
var serverUrl = "http://localhost:11434";
var ollama = new OllamaApiClient(serverUrl);
var models = await ollama.ListLocalModelsAsync();
if (!models.Any())
{
    AnsiConsole.MarkupLine("[red]No models found![/]");
    return;
}
var model = models.First().Name;
var chatClient = new OllamaApiClient(serverUrl, model);
var agent = new ChatClientAgent(chatClient, new ChatClientAgentOptions
{
    Name = "Assistant",
    Instructions = "You are a helpful AI assistant."
});
AnsiConsole.MarkupLine($"[cyan]Using model:[/] {model}\n");
while (true)
{
    var prompt = AnsiConsole.Ask<string>("[green]You:[/]");
    if (prompt.Equals("exit", StringComparison.OrdinalIgnoreCase))
        break;

    Console.Write("\n[AI]: ");
    await foreach (var update in agent.RunStreamingAsync(prompt))
    {
        Console.Write(update);
    }
    Console.WriteLine("\n");
}
AnsiConsole.MarkupLine("[yellow]Goodbye![/]");
Key Takeaways
- Microsoft Agents Framework provides a clean abstraction for working with AI models in .NET
- Streaming responses significantly improve user experience by providing immediate feedback
- IAsyncEnumerable is the key pattern for consuming streamed content in C#
- Spectre.Console enhances CLI applications with rich, interactive UI elements
- The framework is model-agnostic, working with Ollama, OpenAI, Azure OpenAI, and other providers
Performance Considerations
Streaming reduces perceived latency, not total generation time: the model takes just as long to produce the full response, but users see progress immediately instead of staring at a blank prompt. This matters most for:
- Long-form content generation
- Complex reasoning tasks
- Multi-step processing
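One way to quantify this is to measure time to first token separately from total time. The sketch below wraps any IAsyncEnumerable&lt;string&gt; with a Stopwatch; the FakeStream iterator is a hypothetical stand-in with artificial delays (against the real agent, you would project its updates to text first):

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Measure time-to-first-token vs. total time for any async token stream.
static async Task<(TimeSpan FirstToken, TimeSpan Total)> MeasureAsync(
    IAsyncEnumerable<string> stream)
{
    var sw = Stopwatch.StartNew();
    TimeSpan? first = null;
    await foreach (var token in stream)
    {
        first ??= sw.Elapsed; // record latency of the very first token
        Console.Write(token);
    }
    return (first ?? sw.Elapsed, sw.Elapsed);
}

// Hypothetical stand-in stream with an artificial per-token delay.
static async IAsyncEnumerable<string> FakeStream()
{
    for (var i = 0; i < 5; i++)
    {
        await Task.Delay(100);
        yield return $"token{i} ";
    }
}

var (firstToken, total) = await MeasureAsync(FakeStream());
Console.WriteLine($"\nFirst token after {firstToken.TotalMilliseconds:F0} ms; total {total.TotalMilliseconds:F0} ms");
```

With streaming, the first-token time is what the user perceives; a non-streaming call would make them wait the full total before seeing anything at all.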
Conclusion
The Microsoft Agents Framework makes it straightforward to build sophisticated AI-powered CLI tools with streaming capabilities. By leveraging async enumerables and the framework’s abstractions, you can create responsive, professional command-line applications that provide real-time feedback to users.
The combination of the Agents Framework with libraries like Spectre.Console enables you to build CLI tools that rival graphical applications in terms of user experience while maintaining the efficiency and composability that command-line tools are known for.
Further Reading
- Microsoft Agents Framework Documentation
- Ollama Documentation
- Spectre.Console Documentation
- Async Streams in C#
Source Code
A complete working example demonstrating these concepts is available with proper error handling, configuration options, and additional features for production use.