Description
Background and motivation
Currently, when using OpenAIResponseClient.AsIChatClient, there are various implicitly supported mappings from the abstractions API to the Responses API, so for the most part things just work (e.g. AIFunction and HostedWebSearchTool). One very important area left out is the ability to perform semantic search as part of the ResponseCreationOptions, which looks like the following in the OpenAI API:
var options = new ResponseCreationOptions
{
    Tools =
    {
        ResponseTool.CreateFileSearchTool([storeId1], maxResultCount: 10, new FileSearchToolRankingOptions
        {
            ScoreThreshold = 0.7,
        }, filters: binaryDataFilters1),
        ResponseTool.CreateFileSearchTool([storeId2], maxResultCount: 20, new FileSearchToolRankingOptions
        {
            ScoreThreshold = 0.9,
        }, filters: binaryDataFilters2),
    },
};
The tool supports searching only one vector store ID per CreateFileSearchTool call (despite the API implying an array of values), but the accompanying settings (maxResultCount, FileSearchToolRankingOptions, and the optional BinaryData-based filters for applying file attribute filters; see docs) are necessary for the abstraction to be usable.
Given the existing way of loosely mapping settings via options.AdditionalProperties, perhaps the introduction of a new marker AITool is warranted to cover this important aspect of using OpenAI.
API Proposal
namespace Microsoft.Extensions.AI;
public class SemanticSearchTool(string id) : AITool
{
    public string Id => id;
}
The tool Id can be used to provide more than one set of settings for multiple semantic search tools (which is important given the 10k files limitation per vector store in OpenAI).
The mapping code in OpenAIResponseChatClient.cs could perform the following lookup:
foreach (AITool tool in tools)
{
    switch (tool)
    {
        // other cases
        case SemanticSearchTool sst:
        {
            // Settings are keyed by the tool Id so several search tools can coexist.
            int? maxResultCount = null;
            FileSearchToolRankingOptions? rankingOptions = null;
            BinaryData? filters = null;
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.maxResultCount", out object? objMaxResultCount))
            {
                maxResultCount = objMaxResultCount as int?;
            }
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.rankingOptions", out object? objRankingOptions))
            {
                rankingOptions = objRankingOptions as FileSearchToolRankingOptions;
            }
            // No trailing space in the key, otherwise the lookup never matches.
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.filters", out object? objFilters))
            {
                filters = objFilters as BinaryData;
            }
            result.Tools.Add(ResponseTool.CreateFileSearchTool([sst.Id], maxResultCount, rankingOptions, filters));
            break;
        }
    }
}
An extension method on ChatOptions could provide the intuitive API to set this up:
public static class SemanticSearchExtensions
{
    public static ChatOptions AddSemanticSearch(this ChatOptions options, string vectorStoreId, int? maxResultCount = null, FileSearchToolRankingOptions? rankingOptions = null, BinaryData? filters = null)
    {
        // set the same properties read above
        // ChatOptions.Tools is null by default, so create the list on first use.
        options.Tools ??= [];
        options.Tools.Add(new SemanticSearchTool(vectorStoreId));
        return options;
    }
}
(could be UseSemanticSearch instead?)
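The "{Id}.{setting}" key convention used by the lookup and the extension method above can be sketched in isolation. This is only an illustration under stated assumptions: SemanticSearchKeys, For, Set and TryGet are hypothetical names that do not exist in Microsoft.Extensions.AI or the OpenAI SDK, and a plain dictionary stands in for AdditionalProperties.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper (not part of any library): shows how settings for
// several vector stores can share one property bag without colliding.
public static class SemanticSearchKeys
{
    // Builds the per-store key, e.g. "vs_asdf1234.maxResultCount".
    public static string For(string vectorStoreId, string setting) => $"{vectorStoreId}.{setting}";

    // Writer side: what AddSemanticSearch might do for each non-null setting.
    public static void Set(IDictionary<string, object?> properties, string vectorStoreId, string setting, object? value)
    {
        if (value is not null)
        {
            properties[For(vectorStoreId, setting)] = value;
        }
    }

    // Reader side: what the mapping code might do before creating the tool.
    public static bool TryGet<T>(IReadOnlyDictionary<string, object?> properties, string vectorStoreId, string setting, out T? value)
    {
        if (properties.TryGetValue(For(vectorStoreId, setting), out object? raw) && raw is T typed)
        {
            value = typed;
            return true;
        }

        value = default;
        return false;
    }
}
```

Because every key is prefixed with the vector store ID, calling Set for two different stores with the same setting name keeps both values, and TryGet for one store never sees the other store's value.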
API Usage
var options = new ChatOptions()
    .AddSemanticSearch("vs_asdf1234", 25, new FileSearchToolRankingOptions { ScoreThreshold = 0.8 });
var response = await chatClient.GetResponseAsync(new ChatMessage(ChatRole.User, "Do the RAG thing"), options);
Alternative Designs
Add an additional overload to AsIChatClient() so that arbitrary ResponseTool instances can be passed in and then used internally when creating the response options:
public static class OpenAIClientExtensions
{
    public static IChatClient AsIChatClient(this OpenAIResponseClient responseClient, params ResponseTool[] tools) =>
        new OpenAIResponseChatClient(responseClient, tools);
}
- UseSemanticSearch extension method: typically invoked once, but here we can pass different IDs and invoke it multiple times, so not great?
- Adding extensibility hooks to the internal ChatClient created by the AsIChatClient instead? Seems way more complicated and harder to discover.
Risks
OpenAI may introduce support for multiple vector stores (currently their API returns an error if you pass more than one, even though the payload is a string array). In that case, a new overload could be introduced with params string[] ids.
Perhaps this is too OpenAI-specific, but it's possible that other IChatClient decorators could use the same tool configuration mechanism to provide their own RAG-based augmentation?