Skip to content

Conversation

@VarSuren
Copy link
Contributor

@VarSuren VarSuren commented Aug 8, 2025

Description

Allows to use Anthropic Native API with GCP Vertex.

Registered endpoint /v1/messages

Tested:

Streaming/non streaming

url -X POST http://localhost:1062/v1/messages \ -H "Content-Type: application/json" \ -H "x-ai-eg-selected-backend: gcp-anthropic-backend" \ -d '{ "model": "claude-3-7-sonnet@20250219", "max_tokens": 1024, "messages": [ { "role": "user", "content": [{"type": "text", "text": "Hey Claude! Are you running on GCP?"}] } ], "temperature": 0.1, "stream": false }'

Special notes for reviewers (if applicable)

Anthropic vs GCPAnthropic:

  1. Anthropic API version is not equal to GCPAnthropic version i.e.
    Anthropic version passed in header of native API cannot be passed through to GCP. Configuration for that field is expected from yaml.

  2. Model field that is passed in native API cannot be passed in the body to GCP.

  3. I had to re-define root messages struct because go's implementation has no field stream ( in sdk those are two different methods )

  4. I'm using chatCompletionMetrics in message factory as it's doing what we need and also use generic LLMTokenUsage but naming is misleading maybe re-name it ?

Open question

Official GCP SDK limit they way you can pass request i.e.

content: "my content" won't work with current set up in requires content : [{"text: "my content"}]

As a workaround I suggest using plain Map we will loose validation and maybe translator will got a bit messier but it will allow to really do pass through

Using dictionary as a struct as Claude code sending GO struct non compatible requests

TODO:

verify integration with Claude code

Claude code send to url + ?beta=true so I have to register both, didn't find a way to register dynamic URL.

Initial request from testing is fine regular gets dropped with connection issue but that's some internal configuration I'm not aware of, request itself is valid and can be used if you hit GCP directly.

Extproc config I used:

  - name: gcp-anthropic-backend
    schema:
      name: GCPAnthropic
      version: "vertex-2023-10-16"

@VarSuren VarSuren requested a review from a team as a code owner August 8, 2025 19:56
@VarSuren VarSuren changed the title Native api support gcp feat: Add native Anthropic API support for GCP integration Aug 8, 2025
var isGzip bool
switch c.responseEncoding {
case "gzip":
br, err = gzip.NewReader(bytes.NewReader(body.Body))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is applicable to openAI python client as it uses httpx python object, plain curl doesn't care about those things we need to verify with Anthropic Python SDK but I assume it would need same removal.

Bottomline: needs to be verified with SDK's but curl works fine

@VarSuren VarSuren changed the title feat: Add native Anthropic API support for GCP integration feat: add native Anthropic API support for GCP integration Aug 8, 2025
}
server.Register("/v1/chat/completions", extproc.ChatCompletionProcessorFactory(chatCompletionMetrics))
server.Register("/v1/embeddings", extproc.EmbeddingsProcessorFactory(embeddingsMetrics))
server.Register("/v1/messages", extproc.MessagesProcessorFactory(chatCompletionMetrics))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe anthropic/v1/message for clean separation of vendor-specific APIs?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah makes sense, prolly make it configurable from yaml ? @yuzisun your take ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is exactly what we are working on it in #1020, so leaving it to us for that separation concern

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR landed. you can check the change there and add anthropic prefix config accordingly to do the separation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you for more merge conflicts :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased and fixed

@yuzisun
Copy link
Contributor

yuzisun commented Aug 8, 2025

@VarSuren can you sign off the commits ?

if headerMutation == nil {
headerMutation = &extprocv3.HeaderMutation{}
}
// TODO: this is a hotfix, we should update this to recompress since its in the header
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@VarSuren we should not need this logic here to remove content encoding as we already inserted the header mutation filter in the extension server, can you check the implementation in the chat completion processor? https://github.com/envoyproxy/ai-gateway/pull/818/files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure lemme check

@VarSuren VarSuren force-pushed the native-api-support-gcp branch 3 times, most recently from c78ef54 to 2bf5f3b Compare August 11, 2025 14:44
server.Register("/v1/chat/completions", extproc.ChatCompletionProcessorFactory(chatCompletionMetrics))
server.Register("/v1/embeddings", extproc.EmbeddingsProcessorFactory(embeddingsMetrics))
server.Register("/v1/messages", extproc.MessagesProcessorFactory(chatCompletionMetrics))
server.Register("/v1/messages?beta=true", extproc.MessagesProcessorFactory(chatCompletionMetrics))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to register this separately ? use case should still be able pass query parameters

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah 100% this should be handled in one processor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Usually yes, but I didn't find a way to do dynamic query

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might need to fix processorForPath to trim query on processor search

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I initially did that, but that would have affected all endpoint, don't know whether other endpoint might use them in future

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but also think about it, you might want to have different processor for beta if say it has additional fields functionality

Copy link
Contributor

@yuzisun yuzisun Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since for Anthropic we are mostly pass through and let cloud handle the validation, we should be fine with a single processor.


// MessagesRequest represents a request to the Anthropic Messages API.
// Uses a dictionary approach to handle any JSON structure flexibly.
type MessagesRequest map[string]interface{}
Copy link
Contributor

@yuzisun yuzisun Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the reason to change to the dict type , is it for efficiency?

Copy link
Contributor Author

@VarSuren VarSuren Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GOSDK has different signatures not compatible with Claude code requests i.e. content: "hello I'm Claude" , go only expects content: [{"type": "text", "content": "Hey I'm Claude"}]

@VarSuren VarSuren force-pushed the native-api-support-gcp branch from a399e08 to 4a25262 Compare August 11, 2025 19:36
yuzisun
yuzisun previously approved these changes Aug 11, 2025
func (a *anthropicToGCPAnthropicTranslator) RequestBody(_ []byte, body *anthropicschema.MessagesRequest, _ bool) (
headerMutation *extprocv3.HeaderMutation, bodyMutation *extprocv3.BodyMutation, err error,
) {
log.Println("hit GCP translator")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the debug log line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addressed, btw what's the approach for this project, we don't use logs at all ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still want logging if needed but this one obviously is not :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arguable but okay

Comment on lines 87 to 89
if status, ok := headers[":status"]; ok {
log.Printf("response status: %s", status)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the debug log line

func (a *anthropicToGCPAnthropicTranslator) ResponseBody(_ map[string]string, body io.Reader, endOfStream bool) (
headerMutation *extprocv3.HeaderMutation, bodyMutation *extprocv3.BodyMutation, tokenUsage LLMTokenUsage, err error,
) {
log.Printf("hit translator - processing response body (endOfStream: %v)", endOfStream)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

Mutation: &extprocv3.BodyMutation_Body{Body: bodyBytes},
}

log.Println("response translation completed")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

@VarSuren VarSuren force-pushed the native-api-support-gcp branch from bcd7537 to a370b8b Compare August 11, 2025 23:51
@yuzisun yuzisun self-requested a review August 12, 2025 03:51
@yuzisun yuzisun dismissed their stale review August 12, 2025 03:54

clicked the approve by mistake

@codecov-commenter
Copy link

codecov-commenter commented Aug 12, 2025

Codecov Report

❌ Patch coverage is 77.81155% with 73 lines in your changes missing coverage. Please review.
✅ Project coverage is 78.16%. Comparing base (fb1f875) to head (0066554).

Files with missing lines Patch % Lines
internal/extproc/messages_processor.go 71.42% 49 Missing and 15 partials ⚠️
...ernal/extproc/translator/anthropic_gcpanthropic.go 89.41% 7 Missing and 2 partials ⚠️

❌ Your project status has failed because the head coverage (78.16%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1022      +/-   ##
==========================================
- Coverage   78.33%   78.16%   -0.18%     
==========================================
  Files          81       84       +3     
  Lines        9349     9678     +329     
==========================================
+ Hits         7324     7565     +241     
- Misses       1681     1755      +74     
- Partials      344      358      +14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@mathetake
Copy link
Member

mathetake commented Aug 12, 2025

We need extproc e2e test cases here to cover the code:

for _, tc := range []struct {

Suren Vartanian added 9 commits August 12, 2025 15:10
Signed-off-by: Suren Vartanian <[email protected]>
- Add Anthropic schema to CRD validations
- Update API documentation
- Enable /v1/messages endpoint with GCP Anthropic backend

Signed-off-by: Suren Vartanian <[email protected]>
- Remove deleted aws_invokemodel translator references from chatcompletion_processor
- Delete AWS Bedrock examples (anthropic-aws-bedrock.yaml, AWS_BEDROCK_SETUP.md)
- Update examples/anthropic/README.md to remove all AWS Bedrock references
- Fix linting issues: convert if-else to switch statements, add periods to comments
- Add new files: messages_processor.go, gcpanthropic_to_native.go translator

Signed-off-by: Suren Vartanian <[email protected]>
- Create internal/apischema/anthropic package using official Anthropic SDK types
- Update messages_processor.go to follow same pattern as chatcompletion_processor
- Replace manual JSON parsing with parseAnthropicMessagesBody() function
- Use structured approach: parse to SDK types, access fields directly
- Remove schema validation to align with chatcompletion processor
- Fix imports to use internal/metrics instead of filterapi/x
- Add proper originalRequestBody storage and stream detection

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Suren Vartanian added 13 commits August 12, 2025 15:10
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
- Restore clean version from upstream to resolve syntax errors
- Remove duplicate function definition and missing brace issues

Signed-off-by: Suren Vartanian <[email protected]>
…oint

- Register dual paths in main.go for /v1/messages and /v1/messages?beta=true
- Simplify messages_processor.go by removing query parameter handling
- Convert MessagesRequest from struct to map[string]interface{} in anthropic schema
- Fix translator tests to use map syntax instead of struct fields
- Clean up debug logs in anthropic_gcpanthropic.go, keep only essential logging
- Update test assertions for stream field inclusion in maps
- Update CRD manifests

This enables Claude Code compatibility by accepting the beta parameter
while maintaining clean endpoint-specific routing architecture.

Signed-off-by: Suren Vartanian <[email protected]>
- Remove emoji icons from all log statements
- Remove response body preview logging for security/privacy
- Keep only essential logging: translator hits, status, size, errors

This ensures production-appropriate logging without exposing sensitive data.

Signed-off-by: Suren Vartanian <[email protected]>
…rshal

- Work directly with map[string]interface{} instead of marshal->unmarshal roundtrip
- Copy map directly since MessagesRequest is already map[string]interface{}
- Eliminates 1 json.Marshal() and 1 json.Unmarshal() call per request
- Reduces memory allocations and improves performance
- Maintains same functionality with all tests passing

Signed-off-by: Suren Vartanian <[email protected]>
- Add anthropicPrefix flag with default value '/v1'
- Use path.Join for consistent prefix handling like OpenAI endpoints
- Register both /messages and /messages?beta=true with configurable prefix
- Enables custom Anthropic endpoint prefixes (e.g., /api/anthropic/v1/messages)
- Maintains backward compatibility with default /v1 prefix
- Follows same pattern as existing openAIPrefix implementation

Example usage:
  --anthropicPrefix='/api/anthropic/v1'  # Custom prefix
  (default uses '/v1' for backward compatibility)

Signed-off-by: Suren Vartanian <[email protected]>
- Change anthropicPrefix default from '/v1' to '/anthropic/v1' for clear separation
- Remove all debug logging statements from GCP Anthropic translator
- Remove unused log import and clean up formatting
- Fix unused parameter linting issue (headers -> _)

This addresses PR feedback by:
1. Providing clear endpoint separation (/v1/* for OpenAI, /anthropic/v1/* for Anthropic)
2. Removing debug noise for production-ready code
3. Maintaining same functionality with cleaner implementation

All tests passing and precommit checks clean.

Signed-off-by: Suren Vartanian <[email protected]>
- Strip query parameters in server.go for processor lookup instead of registering duplicate endpoints
- Remove duplicate /messages?beta=true registration from main.go
- Keep single /messages endpoint registration with query stripping logic
- Both /messages and /messages?beta=true now route to same processor cleanly

Signed-off-by: Suren Vartanian <[email protected]>
- Add MessagesProcessor for /anthropic/v1/messages endpoint
- Add comprehensive unit tests for messages processor and anthropic schema
- Add e2e test support with proper GCP Vertex AI configuration
- Fixed gcpAnthropicAISchema to include vertex-2023-10-16 version
- Improved test coverage to meet project thresholds (70% file, 81% package)

Signed-off-by: Suren Vartanian <[email protected]>
@VarSuren VarSuren force-pushed the native-api-support-gcp branch from 48cac5f to 4fe9ffe Compare August 12, 2025 19:13
- Add test case for /anthropic/v1/messages endpoint (non-streaming)
- Add test case for /anthropic/v1/messages endpoint (streaming)
- Fixed streaming response format to match SSE standard with proper whitespace
- Tests validate request transformation from Anthropic to GCP Vertex AI format

Signed-off-by: Suren Vartanian <[email protected]>
@yuzisun
Copy link
Contributor

yuzisun commented Aug 12, 2025

/lgtm, @mathetake any more comments for this ?

@mathetake
Copy link
Member

mathetake commented Aug 12, 2025

q: this doesn't support a real Anthropic backend yet right? I think that sounds like a must-have if we want to promote this as a "feature" in v0.3. How long do you think it would take to get it working in a follow up PR? (Also this needs a documentation as well 100%)

@yuzisun
Copy link
Contributor

yuzisun commented Aug 12, 2025

q: this doesn't support a real Anthropic backend yet right? I think that sounds like a must-have if we want to promote this as a "feature" in v0.3. How long do you think it would take to get it working in a follow up PR? (Also this needs a documentation as well 100%)

This is a real Anthropic backend just it is hosted on gcp, I think for 0.3 we are targeting to fully support gcp Gemini via chat completion and Anthropic models via either chat completion or native Anthropic message API. We can target for 0.4 to support first party Anthropic and AWS Anthropic via Anthropic message API.

@mathetake
Copy link
Member

mathetake commented Aug 13, 2025

sounds good. could you create the umbrella issue that tracks them like "Anthropic Input/SDK Support" that documents what works now and what doesn't as well as the plans for v0.4? Then this PR is good to go

@mathetake
Copy link
Member

mathetake commented Aug 13, 2025

oh i think we can reuse #847 -- can you post the updates assuming this PR lands? I haven't gotten around to reviewing all but will land it either i push the changes or as-is within today on west coast

@yuzisun yuzisun mentioned this pull request Aug 13, 2025
3 tasks
@yuzisun
Copy link
Contributor

yuzisun commented Aug 13, 2025

oh i think we can reuse #847 -- can you post the updates assuming this PR lands? I haven't gotten around to reviewing all but will land it either i push the changes or as-is within today on west coast

Thanks! updated the issue

@mathetake mathetake merged commit 0f56580 into envoyproxy:main Aug 13, 2025
26 checks passed
missBerg pushed a commit to missBerg/ai-gateway that referenced this pull request Dec 20, 2025
…y#1022)

**Description**

Allows to use Anthropic Native API with GCP Vertex backend.

---------

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Erica Hughberg <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants