An Introduction to Microsoft's Foundry Local AI Toolkit

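Microsoft recently released Foundry Local, a new AI toolkit that makes it easy for developers to use LLMs. Its main benefit is that it automatically detects the hardware and picks the matching model variant. For example, if you specify the phi-4 model, Foundry Local checks whether a GPU or NPU is available and then downloads the corresponding variant, so you do not end up with a model that cannot run. There are a few pitfalls along the way, though, so here are my notes after trying it out.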

First, initialize the required objects:

using Microsoft.AI.Foundry.Local;
using Betalgo.Ranul.OpenAI.ObjectModels.RequestModels;
using Microsoft.Extensions.Logging;


CancellationToken ct = CancellationToken.None;
var modelAlias = "phi-4";
var config = new Configuration
{
    AppName = "demo-Founddry",
    LogLevel = Microsoft.AI.Foundry.Local.LogLevel.Information,
};
using var loggerFactory = LoggerFactory.Create(builder =>
{
    builder.SetMinimumLevel(Microsoft.Extensions.Logging.LogLevel.Information);
});
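// AILib is not defined in this snippet; substitute any type from your project as the logger category.
var logger = loggerFactory.CreateLogger<AILib>();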
await FoundryLocalManager.CreateAsync(config, logger);
if (FoundryLocalManager.IsInitialized)
{
    Console.WriteLine("Foundry Local Manager initialized successfully.");
}
else
{
    Console.WriteLine("Foundry Local Manager failed to initialize.");
    return;
}
var manager = FoundryLocalManager.Instance;
ICatalog _catalog = await manager.GetCatalogAsync(ct);

Next, list the models available in the catalog:

var models = await _catalog.ListModelsAsync();
foreach (var m in models)
{
    Console.WriteLine($"{m.Alias}, {m.Id}");
}

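The same model is sometimes listed in three variants: GPU, NPU, and CPU. If the machine has neither a discrete GPU nor an NPU, choose a model that offers a CPU variant.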
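
As a rough illustration of that choice, the sketch below scans a list of model IDs (the m.Id values printed by the loop above) and returns the first one that looks like a CPU variant. VariantPicker and PickCpuVariant are hypothetical names, not part of the Foundry Local SDK, and the helper assumes that variant IDs embed the device type in the same way as the Phi-4-cuda-gpu-1 ID mentioned later in this post. The "-cpu" marker is an assumption; check the IDs your own catalog prints.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the Foundry Local SDK: it only inspects
// model ID strings such as the ones printed by the listing loop.
public static class VariantPicker
{
    // Returns the first ID containing "-cpu" (case-insensitive), or null when
    // none of the IDs looks like a CPU variant.
    // ASSUMPTION: variant IDs embed the device type, as in "Phi-4-cuda-gpu-1".
    public static string? PickCpuVariant(IEnumerable<string> modelIds)
    {
        return modelIds.FirstOrDefault(
            id => id.Contains("-cpu", StringComparison.OrdinalIgnoreCase));
    }
}
```

You could call it as VariantPicker.PickCpuVariant(models.Select(m => m.Id)) and treat a null result as "no CPU variant listed". Matching on the ID string is a deliberately crude stand-in, since it avoids relying on any SDK member not shown in this post.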

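Next, look up the model so it can be loaded. If it has not been downloaded before, it is first downloaded to the cache folder: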

Model? model = await _catalog.GetModelAsync(modelAlias);
if (model == null)
{
    Console.WriteLine("Model not found. Check that the model alias is correct and that the model has been preloaded.");
    return;
}
Console.WriteLine($"Model info: {model.Alias} ({model.Id})");

await model.DownloadAsync(progress =>
{
    Console.Write($"\rDownloading model: {progress:F2}%");
    if (progress >= 100f)
    {
        Console.WriteLine();
    }
});

Then comes the actual run, where we send a prompt:

await model!.LoadAsync();
var chatClient = await model.GetChatClientAsync();
List<ChatMessage> messages = new()
{
    // Prompt (in Chinese): "Which CPU models has IBM released? Please answer in Traditional Chinese."
    new ChatMessage { Role = "user", Content = "IBM有出過哪些型號的CPU，請用繁體中文回答" }
};
var streamingResponse = chatClient.CompleteChatStreamingAsync(messages, ct);
await foreach (var chunk in streamingResponse)
{
    Console.Write(chunk.Choices[0].Message.Content);
    Console.Out.Flush();
}
Console.WriteLine();

// unload the model
await model.UnloadAsync();

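The code itself is simple, but there is one thing to watch out for: the AppName set earlier matters a lot. With the code above setting it to demo-Founddry, the model is downloaded to C:\Users\<user name>\.demo-Founddry\cache\models\Microsoft\Phi-4-cuda-gpu-1, where the Phi-4-cuda-gpu-1 part differs depending on the device. If you run Foundry Local directly from the CLI instead of through code, the path is C:\Users\<user name>\.foundry\cache\models\Microsoft\Phi-4-cuda-gpu-1. Going by these two paths, models downloaded through the CLI and models downloaded by your own app end up in separate cache folders.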
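
To make that path rule easier to check on your own machine, here is a minimal sketch that rebuilds the expected cache folder from an AppName. It is based only on the two paths observed above, not on any documented SDK behavior. CachePathSketch and ExpectedModelCacheDir are hypothetical names, and the layout (user profile, then a dot-prefixed AppName, then cache\models) is an assumption inferred from those paths.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Foundry Local SDK.
public static class CachePathSketch
{
    // Builds <userProfileDir>/.<appName>/cache/models.
    // ASSUMPTION: this mirrors the folder layout observed in this post.
    public static string ExpectedModelCacheDir(string appName, string userProfileDir)
    {
        return Path.Combine(userProfileDir, "." + appName, "cache", "models");
    }

    // Convenience overload that uses the current user's profile folder.
    public static string ExpectedModelCacheDir(string appName)
    {
        string home = Environment.GetFolderPath(Environment.SpecialFolder.UserProfile);
        return ExpectedModelCacheDir(appName, home);
    }
}
```

Calling CachePathSketch.ExpectedModelCacheDir("demo-Founddry") and CachePathSketch.ExpectedModelCacheDir("foundry") gives the two folders to compare, which is a quick way to see whether a model was downloaded twice under different names.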

References

Get started with Foundry Local