[API Proposal]: SemanticSearchTool to enable mapping to OpenAI's Responses API #6378

Open
@kzu

Background and motivation

Currently, when using OpenAIResponseClient.AsIChatClient, there are various implicitly supported mappings from the abstractions API to the Responses API, so for the most part things just work (e.g. AIFunction and HostedWebSearchTool). One very important area left out is the ability to perform semantic search as part of the ResponseCreationOptions, which looks like the following in the OpenAI API:

var options = new ResponseCreationOptions
{
    Tools =
    {
        ResponseTool.CreateFileSearchTool([storeId1], maxResultCount: 10, new FileSearchToolRankingOptions
        {
            ScoreThreshold = 0.7,
        }, filters: binaryDataFilters1),
        ResponseTool.CreateFileSearchTool([storeId2], maxResultCount: 20, new FileSearchToolRankingOptions
        {
            ScoreThreshold = 0.9,
        }, filters: binaryDataFilters2),
    },
};

The tool supports searching only one vector store ID per CreateFileSearchTool call (despite the API accepting an array of values), but the accompanying settings (maxResultCount, FileSearchToolRankingOptions, and the optional BinaryData-based filters for file attribute filtering; see docs) are necessary for the abstraction to be usable.

Given the existing way of loosely mapping settings via the options.AdditionalProperties, perhaps the introduction of a new marker AITool is warranted to cover this important aspect in using OpenAI.

API Proposal

namespace Microsoft.Extensions.AI;

public class SemanticSearchTool(string id) : AITool
{
   public string Id => id;
}

The tool Id can be used to provide more than one set of settings for multiple semantic search tools (which is important given the 10k files limitation per vector store in OpenAI).

The mapping code in OpenAIResponseChatClient.cs could perform the following lookup:

foreach (AITool tool in tools)
{
    switch (tool)
    {
        // other cases
        case SemanticSearchTool sst:
            int? maxResultCount = null;
            FileSearchToolRankingOptions? rankingOptions = null;
            BinaryData? filters = null;
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.maxResultCount", out object? objMaxResultCount))
            {
                maxResultCount = objMaxResultCount as int?;
            }
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.rankingOptions", out object? objRankingOptions))
            {
                rankingOptions = objRankingOptions as FileSearchToolRankingOptions;
            }
            if (tool.AdditionalProperties.TryGetValue($"{sst.Id}.filters", out object? objFilters))
            {
                filters = objFilters as BinaryData;
            }

            result.Tools.Add(ResponseTool.CreateFileSearchTool([sst.Id], maxResultCount, rankingOptions, filters));
            break;
    }
}

An extension method on ChatOptions could provide the intuitive API to set this up:

public static class SemanticSearchExtensions
{
    public static ChatOptions AddSemanticSearch(this ChatOptions options, string vectorStoreId, int? maxResultCount = null, FileSearchToolRankingOptions? rankingOptions = null, BinaryData? filters = null)
    {
        // set the same properties read above
        // Tools is null on a freshly created ChatOptions, so initialize it before adding
        options.Tools ??= [];
        options.Tools.Add(new SemanticSearchTool(vectorStoreId));
        return options;
    }
}

(could be UseSemanticSearch instead?)
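To make the "{id}.setting" key convention concrete, here is a minimal, self-contained sketch of the write and read sides for maxResultCount. It uses a plain Dictionary<string, object?> as a stand-in for the property bag and does not reference Microsoft.Extensions.AI or the OpenAI package. The helper names (Key, WriteMaxResultCount, ReadMaxResultCount) and the vector store IDs are hypothetical and only illustrate the idea.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: builds "<vectorStoreId>.<setting>" keys, matching the lookup code above.
string Key(string id, string setting) => $"{id}.{setting}";

// Writer side: roughly what AddSemanticSearch might store (assumption, not the actual implementation).
void WriteMaxResultCount(IDictionary<string, object?> bag, string id, int? maxResultCount)
{
    if (maxResultCount is not null)
    {
        bag[Key(id, "maxResultCount")] = maxResultCount.Value;
    }
}

// Reader side: mirrors the TryGetValue + "as int?" pattern from the mapping code.
int? ReadMaxResultCount(IReadOnlyDictionary<string, object?> bag, string id) =>
    bag.TryGetValue(Key(id, "maxResultCount"), out object? value) ? value as int? : null;

// Stand-in for the property bag that would carry the settings.
var properties = new Dictionary<string, object?>();

// Two vector stores, each with its own setting, kept apart by the ID prefix.
WriteMaxResultCount(properties, "vs_docs", 10);
WriteMaxResultCount(properties, "vs_faq", 20);

int? docsMax = ReadMaxResultCount(properties, "vs_docs");
int? missing = ReadMaxResultCount(properties, "vs_other");

Console.WriteLine(docsMax);          // prints 10
Console.WriteLine(missing is null);  // prints True
```

Prefixing each key with the vector store ID is what lets several semantic search tools coexist in one property bag. A store with no entry simply falls back to null, which the mapping code can pass through as "use the service default".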

API Usage

var options = new ChatOptions()
    .AddSemanticSearch("vs_asdf1234", 25, new FileSearchToolRankingOptions { ScoreThreshold = 0.8 });

var response = await chatClient.GetResponseAsync(new ChatMessage(ChatRole.User, "Do the RAG thing"), options);

Alternative Designs

Add an additional overload to the AsIChatClient() so that arbitrary ResponseTools can be passed in and then used internally when creating the response options:

public static class OpenAIClientExtensions
{
    public static IChatClient AsIChatClient(this OpenAIResponseClient responseClient, params ResponseTool[] tools) =>
        new OpenAIResponseChatClient(responseClient, tools);
}
  • Naming the extension method UseSemanticSearch: Use* methods are typically invoked once, but this one can be invoked multiple times with different IDs, so the name is not a great fit?
  • Adding extensibility hooks to the internal ChatClient created by AsIChatClient instead? Seems way more complicated and harder to discover.

Risks

OpenAI might introduce support for multiple vector stores (currently their API returns an error if you pass more than one, even though the payload is a string array). In that case, a new overload could be introduced with params string[] ids?

Perhaps this is too OpenAI-specific, but it's possible that other IChatClient decorators could use the same tool configuration mechanism to provide their own RAG-based augmentation?
