feat: add native Anthropic API support for GCP integration #1022
Conversation
var isGzip bool
switch c.responseEncoding {
case "gzip":
	br, err = gzip.NewReader(bytes.NewReader(body.Body))
This is applicable to the OpenAI Python client, as it uses an httpx object; plain curl doesn't care about those things. We need to verify with the Anthropic Python SDK, but I assume it would need the same removal.
Bottom line: needs to be verified with the SDKs, but curl works fine.
cmd/extproc/mainlib/main.go
Outdated
}
server.Register("/v1/chat/completions", extproc.ChatCompletionProcessorFactory(chatCompletionMetrics))
server.Register("/v1/embeddings", extproc.EmbeddingsProcessorFactory(embeddingsMetrics))
server.Register("/v1/messages", extproc.MessagesProcessorFactory(chatCompletionMetrics))
maybe anthropic/v1/message for clean separation of vendor-specific APIs?
Yeah, makes sense. Probably make it configurable from YAML? @yuzisun, your take?
That is exactly what we are working on in #1020, so leave that separation concern to us.
The PR landed. You can check the change there and add the Anthropic prefix config accordingly to do the separation.
thank you for more merge conflicts :)
rebased and fixed
@VarSuren can you sign off the commits?
if headerMutation == nil {
	headerMutation = &extprocv3.HeaderMutation{}
}
// TODO: this is a hotfix, we should update this to recompress since it's in the header
@VarSuren we should not need this logic here to remove the content-encoding header, as we already insert the header mutation filter in the extension server. Can you check the implementation in the chat completion processor? https://github.com/envoyproxy/ai-gateway/pull/818/files
sure lemme check
Force-pushed c78ef54 to 2bf5f3b
cmd/extproc/mainlib/main.go
Outdated
server.Register("/v1/chat/completions", extproc.ChatCompletionProcessorFactory(chatCompletionMetrics))
server.Register("/v1/embeddings", extproc.EmbeddingsProcessorFactory(embeddingsMetrics))
server.Register("/v1/messages", extproc.MessagesProcessorFactory(chatCompletionMetrics))
server.Register("/v1/messages?beta=true", extproc.MessagesProcessorFactory(chatCompletionMetrics))
Do we need to register this separately? The use case should still be able to pass query parameters.
yeah 100% this should be handled in one processor.
Usually yes, but I didn't find a way to do a dynamic query.
I think we might need to fix processorForPath to trim the query string during the processor search.
I initially did that, but it would have affected all endpoints, and I don't know whether other endpoints might use query parameters in the future.
But also think about it: you might want a different processor for beta if, say, it adds extra-field functionality.
Since for Anthropic we are mostly passing through and letting the cloud handle validation, we should be fine with a single processor.
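A minimal sketch of the query-trimming lookup the thread converges on (and which a later commit implements in server.go); the registry and names here are illustrative, not the actual ai-gateway code:

```go
package main

import (
	"fmt"
	"strings"
)

// processors maps a registered path to a processor name. In the real
// server this would hold processor factories.
var processors = map[string]string{
	"/v1/messages": "messages-processor",
}

// processorForPath resolves a request path to its processor, stripping
// any query string so /v1/messages?beta=true and /v1/messages hit the
// same registration.
func processorForPath(path string) (string, bool) {
	if i := strings.IndexByte(path, '?'); i >= 0 {
		path = path[:i]
	}
	p, ok := processors[path]
	return p, ok
}

func main() {
	for _, p := range []string{"/v1/messages", "/v1/messages?beta=true"} {
		name, ok := processorForPath(p)
		fmt.Println(p, "->", name, ok)
	}
}
```

This avoids registering a second `?beta=true` entry while leaving other endpoints free to interpret query parameters however they like, since only the lookup key is trimmed.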
// MessagesRequest represents a request to the Anthropic Messages API.
// Uses a dictionary approach to handle any JSON structure flexibly.
type MessagesRequest map[string]interface{}
What's the reason for changing to the dict type? Is it for efficiency?
The Go SDK has different signatures that are not compatible with Claude Code requests, i.e. content: "hello I'm Claude", while Go only expects content: [{"type": "text", "text": "Hey I'm Claude"}].
Force-pushed a399e08 to 4a25262
func (a *anthropicToGCPAnthropicTranslator) RequestBody(_ []byte, body *anthropicschema.MessagesRequest, _ bool) (
	headerMutation *extprocv3.HeaderMutation, bodyMutation *extprocv3.BodyMutation, err error,
) {
	log.Println("hit GCP translator")
remove the debug log line
Addressed. By the way, what's the approach for this project: do we not use logs at all?
We still want logging if needed but this one obviously is not :)
arguable but okay
if status, ok := headers[":status"]; ok {
	log.Printf("response status: %s", status)
}
remove the debug log line
func (a *anthropicToGCPAnthropicTranslator) ResponseBody(_ map[string]string, body io.Reader, endOfStream bool) (
	headerMutation *extprocv3.HeaderMutation, bodyMutation *extprocv3.BodyMutation, tokenUsage LLMTokenUsage, err error,
) {
	log.Printf("hit translator - processing response body (endOfStream: %v)", endOfStream)
same here
Mutation: &extprocv3.BodyMutation_Body{Body: bodyBytes},
}

log.Println("response translation completed")
same here
Force-pushed bcd7537 to a370b8b
Codecov Report

❌ Patch coverage check failed: the head coverage (78.16%) is below the target coverage (86.00%). You can increase the head coverage or adjust the target coverage.

@@ Coverage Diff @@
## main #1022 +/- ##
==========================================
- Coverage 78.33% 78.16% -0.18%
==========================================
Files 81 84 +3
Lines 9349 9678 +329
==========================================
+ Hits 7324 7565 +241
- Misses 1681 1755 +74
- Partials 344 358 +14
We need extproc e2e test cases here to cover the code:
Signed-off-by: Suren Vartanian <[email protected]>
- Add Anthropic schema to CRD validations
- Update API documentation
- Enable /v1/messages endpoint with GCP Anthropic backend

Signed-off-by: Suren Vartanian <[email protected]>
- Remove deleted aws_invokemodel translator references from chatcompletion_processor
- Delete AWS Bedrock examples (anthropic-aws-bedrock.yaml, AWS_BEDROCK_SETUP.md)
- Update examples/anthropic/README.md to remove all AWS Bedrock references
- Fix linting issues: convert if-else to switch statements, add periods to comments
- Add new files: messages_processor.go, gcpanthropic_to_native.go translator

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
- Create internal/apischema/anthropic package using official Anthropic SDK types
- Update messages_processor.go to follow the same pattern as chatcompletion_processor
- Replace manual JSON parsing with parseAnthropicMessagesBody() function
- Use structured approach: parse to SDK types, access fields directly
- Remove schema validation to align with chatcompletion processor
- Fix imports to use internal/metrics instead of filterapi/x
- Add proper originalRequestBody storage and stream detection

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
- Restore clean version from upstream to resolve syntax errors
- Remove duplicate function definition and missing brace issues

Signed-off-by: Suren Vartanian <[email protected]>
…oint
- Register dual paths in main.go for /v1/messages and /v1/messages?beta=true
- Simplify messages_processor.go by removing query parameter handling
- Convert MessagesRequest from struct to map[string]interface{} in anthropic schema
- Fix translator tests to use map syntax instead of struct fields
- Clean up debug logs in anthropic_gcpanthropic.go, keep only essential logging
- Update test assertions for stream field inclusion in maps
- Update CRD manifests
This enables Claude Code compatibility by accepting the beta parameter
while maintaining clean endpoint-specific routing architecture.
Signed-off-by: Suren Vartanian <[email protected]>
- Remove emoji icons from all log statements
- Remove response body preview logging for security/privacy
- Keep only essential logging: translator hits, status, size, errors

This ensures production-appropriate logging without exposing sensitive data.

Signed-off-by: Suren Vartanian <[email protected]>
…rshal
- Work directly with map[string]interface{} instead of marshal->unmarshal roundtrip
- Copy map directly since MessagesRequest is already map[string]interface{}
- Eliminates 1 json.Marshal() and 1 json.Unmarshal() call per request
- Reduces memory allocations and improves performance
- Maintains same functionality with all tests passing
Signed-off-by: Suren Vartanian <[email protected]>
- Add anthropicPrefix flag with default value '/v1'
- Use path.Join for consistent prefix handling like OpenAI endpoints
- Register both /messages and /messages?beta=true with configurable prefix
- Enables custom Anthropic endpoint prefixes (e.g., /api/anthropic/v1/messages)
- Maintains backward compatibility with default /v1 prefix
- Follows same pattern as existing openAIPrefix implementation

Example usage: --anthropicPrefix='/api/anthropic/v1' # Custom prefix (default uses '/v1' for backward compatibility)

Signed-off-by: Suren Vartanian <[email protected]>
- Change anthropicPrefix default from '/v1' to '/anthropic/v1' for clear separation
- Remove all debug logging statements from GCP Anthropic translator
- Remove unused log import and clean up formatting
- Fix unused parameter linting issue (headers -> _)

This addresses PR feedback by:
1. Providing clear endpoint separation (/v1/* for OpenAI, /anthropic/v1/* for Anthropic)
2. Removing debug noise for production-ready code
3. Maintaining same functionality with cleaner implementation

All tests passing and precommit checks clean.

Signed-off-by: Suren Vartanian <[email protected]>
- Strip query parameters in server.go for processor lookup instead of registering duplicate endpoints
- Remove duplicate /messages?beta=true registration from main.go
- Keep single /messages endpoint registration with query stripping logic
- Both /messages and /messages?beta=true now route to the same processor cleanly

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Suren Vartanian <[email protected]>
- Add MessagesProcessor for /anthropic/v1/messages endpoint
- Add comprehensive unit tests for messages processor and anthropic schema
- Add e2e test support with proper GCP Vertex AI configuration
- Fixed gcpAnthropicAISchema to include vertex-2023-10-16 version
- Improved test coverage to meet project thresholds (70% file, 81% package)

Signed-off-by: Suren Vartanian <[email protected]>
Force-pushed 48cac5f to 4fe9ffe
- Add test case for /anthropic/v1/messages endpoint (non-streaming)
- Add test case for /anthropic/v1/messages endpoint (streaming)
- Fixed streaming response format to match SSE standard with proper whitespace
- Tests validate request transformation from Anthropic to GCP Vertex AI format

Signed-off-by: Suren Vartanian <[email protected]>
/lgtm. @mathetake, any more comments for this?
q: this doesn't support a real Anthropic backend yet, right? I think that sounds like a must-have if we want to promote this as a "feature" in v0.3. How long do you think it would take to get it working in a follow-up PR? (Also, this needs documentation as well, 100%.)
This is a real Anthropic backend; it's just hosted on GCP. I think for 0.3 we are targeting full support for GCP Gemini via chat completion, and Anthropic models via either chat completion or the native Anthropic Messages API. We can target 0.4 to support first-party Anthropic and AWS Anthropic via the Anthropic Messages API.
Sounds good. Could you create the umbrella issue that tracks them, something like "Anthropic Input/SDK Support", documenting what works now and what doesn't, as well as the plans for v0.4? Then this PR is good to go.
Oh, I think we can reuse #847. Can you post the updates assuming this PR lands? I haven't gotten around to reviewing it all, but I will land it, either with my pushed changes or as-is, within today on the west coast.
Thanks! Updated the issue.
…y#1022)

**Description**
Allows use of the native Anthropic API with the GCP Vertex backend.

---------

Signed-off-by: Suren Vartanian <[email protected]>
Signed-off-by: Erica Hughberg <[email protected]>
Description
Allows using the native Anthropic API with GCP Vertex.
Registered endpoint /v1/messages
Tested:
Streaming/non streaming
curl -X POST http://localhost:1062/v1/messages \
  -H "Content-Type: application/json" \
  -H "x-ai-eg-selected-backend: gcp-anthropic-backend" \
  -d '{
    "model": "claude-3-7-sonnet@20250219",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": [{"type": "text", "text": "Hey Claude! Are you running on GCP?"}] }
    ],
    "temperature": 0.1,
    "stream": false
  }'

Special notes for reviewers (if applicable)
Anthropic vs GCPAnthropic:
The Anthropic API version is not equal to the GCP Anthropic version, i.e. the Anthropic version passed in the header of the native API cannot be passed through to GCP; configuration for that field is expected from YAML.
The model field that is passed in the native API cannot be passed in the body to GCP.
I had to re-define the root messages struct because Go's implementation has no stream field (in the SDK those are two different methods).
I'm using chatCompletionMetrics in the messages factory, as it does what we need, and I also use the generic LLMTokenUsage; the naming is misleading, though, so maybe rename it?

Open question
The official GCP SDK limits the way you can pass the request, i.e. content: "my content" won't work with the current setup; it requires content: [{"type": "text", "text": "my content"}]. As a workaround I suggest using a plain map: we lose validation and the translator may get a bit messier, but it allows a real pass-through. Hence the dictionary-as-struct approach, since Claude Code sends requests that are not compatible with the Go structs.
TODO:
Verify integration with Claude Code. Claude Code sends requests to the URL + ?beta=true, so I had to register both; I didn't find a way to register a dynamic URL.
The initial request from testing is fine; a regular one gets dropped with a connection issue, but that's some internal configuration I'm not aware of. The request itself is valid and works if you hit GCP directly.
Extproc config I used: