Skip to content

Conversation

@enyst
Copy link
Collaborator

@enyst enyst commented Feb 20, 2025

Simplify prompt caching for new Anthropic API by marking only the last user or tool message as cacheable. Updated unit tests accordingly.

Reference: Anthropic simplified their implementation a little, and we can too.


To run this PR locally, use the following command:

docker run -it --rm   -p 3000:3000   -v /var/run/docker.sock:/var/run/docker.sock   --add-host host.docker.internal:host-gateway   -e SANDBOX_RUNTIME_CONTAINER_IMAGE=docker.all-hands.dev/all-hands-ai/runtime:efc368e-nikolaik   --name openhands-app-efc368e   docker.all-hands.dev/all-hands-ai/openhands:efc368e

@enyst enyst mentioned this pull request Feb 20, 2025
1 task
@enyst enyst marked this pull request as ready for review February 20, 2025 22:01
@enyst
Copy link
Collaborator Author

enyst commented Feb 20, 2025

I can confirm testing locally, that the reported cache hits/writes look as expected:

... step 21
Input tokens: 12572 | Output tokens: 103
Input tokens (cache hit): 12331
Input tokens (cache write): 238

... step 22
Input tokens: 12699 | Output tokens: 53
Input tokens (cache hit): 12569
Input tokens (cache write): 127

@enyst enyst merged commit 22c5ad8 into main Feb 20, 2025
14 checks passed
@enyst enyst deleted the fix-prompt-caching branch February 20, 2025 22:38
adityasoni9998 pushed a commit to adityasoni9998/OpenHands that referenced this pull request Mar 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants