Conversation

@varungup90
Collaborator

No description provided.

@varungup90 varungup90 changed the title Add prefix cache aware routing WIP: Add prefix cache aware routing Feb 7, 2025
@varungup90 varungup90 changed the title WIP: Add prefix cache aware routing Add prefix cache aware routing Feb 7, 2025
end = len(unMatchedTokens)
}
chunk := unMatchedTokens[i:end]
prefixHash := xxhash.Sum64(IntArrayToByteArray(chunk))
Collaborator

Here the hash only considers the current block? Just to confirm: this is not 100% the same as vLLM's solution, right? In vLLM, the 1st block's hash is part of the 2nd block's hash input.

Collaborator Author

Yes, the hash covers only the current block; there is no linked-list-style chaining like vLLM has. For our use case we do not need that chaining behavior.
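To make the difference concrete, here is a minimal sketch (not the PR's actual code) contrasting the two schemes. It uses the stdlib FNV hash instead of xxhash so it is self-contained, and the function names and token-encoding details are illustrative assumptions:

```go
// Sketch: independent per-block hashing (as in this PR) vs. vLLM-style
// chained hashing where the previous block's hash feeds the next block's input.
// Uses hash/fnv instead of xxhash purely for self-containment.
package main

import (
	"encoding/binary"
	"fmt"
	"hash/fnv"
)

func hashBytes(b []byte) uint64 {
	h := fnv.New64a()
	h.Write(b)
	return h.Sum64()
}

func tokensToBytes(tokens []int) []byte {
	buf := make([]byte, 8*len(tokens))
	for i, t := range tokens {
		binary.LittleEndian.PutUint64(buf[i*8:], uint64(t))
	}
	return buf
}

// independentBlockHashes hashes each fixed-size block on its own;
// a block's hash does not depend on any earlier block.
func independentBlockHashes(tokens []int, blockSize int) []uint64 {
	var hashes []uint64
	for i := 0; i < len(tokens); i += blockSize {
		end := i + blockSize
		if end > len(tokens) {
			end = len(tokens)
		}
		hashes = append(hashes, hashBytes(tokensToBytes(tokens[i:end])))
	}
	return hashes
}

// chainedBlockHashes mixes the previous block's hash into the next block's
// input, so each hash identifies the entire prefix up to that block.
func chainedBlockHashes(tokens []int, blockSize int) []uint64 {
	var hashes []uint64
	var prev uint64
	for i := 0; i < len(tokens); i += blockSize {
		end := i + blockSize
		if end > len(tokens) {
			end = len(tokens)
		}
		input := make([]byte, 8)
		binary.LittleEndian.PutUint64(input, prev)
		input = append(input, tokensToBytes(tokens[i:end])...)
		prev = hashBytes(input)
		hashes = append(hashes, prev)
	}
	return hashes
}

func main() {
	a := []int{1, 2, 3, 4, 5, 6}
	b := []int{9, 9, 3, 4, 5, 6} // same 2nd block, different 1st block
	ia, ib := independentBlockHashes(a, 3), independentBlockHashes(b, 3)
	ca, cb := chainedBlockHashes(a, 3), chainedBlockHashes(b, 3)
	// Independent: the 2nd blocks hash equal even though the 1st blocks differ.
	fmt.Println(ia[1] == ib[1]) // true
	// Chained: the 2nd hashes differ because the 1st block's hash is mixed in.
	fmt.Println(ca[1] == cb[1]) // false
}
```

The trade-off: independent hashing lets any matching block contribute to a hit count, while chained hashing makes each hash a fingerprint of the whole prefix.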

Collaborator

In that case, we need to do up to O(n) hash calculations, where n = number of blocks?

Collaborator Author

@varungup90 varungup90 Feb 9, 2025

Yes. If you mean that the linked-list behavior could avoid the O(n) calculations, it would not: the total computation stays the same.

Collaborator

It's a little different: with the linked-list approach you can use binary search and similar optimizations, whereas this way we can only do an O(n) scan.
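The optimization being alluded to can be sketched as follows (hypothetical helper names, not from the PR): with chained hashes, a match at block k implies all earlier blocks also match, so "prefix of length k matches" is monotonic in k and the longest match can be found by binary search; with independent hashes, a match at block k says nothing about earlier blocks, so a linear scan is needed.

```go
// Sketch: longest-matching-prefix lookup over two hash sequences.
// Assumes `local`/`remote` are per-block hash slices; names are illustrative.
package main

import "fmt"

// longestMatchChained binary-searches for the longest matching prefix.
// Valid only for chained hashes, where equality at block k implies
// blocks 0..k all match (the match predicate is monotonic).
func longestMatchChained(local, remote []uint64) int {
	lo, hi := 0, min(len(local), len(remote)) // candidate match lengths
	for lo < hi {
		mid := (lo + hi + 1) / 2
		if local[mid-1] == remote[mid-1] {
			lo = mid // prefix of length mid matches; search higher
		} else {
			hi = mid - 1 // mismatch; search lower
		}
	}
	return lo
}

// longestMatchIndependent scans block by block, O(n) comparisons,
// because an independent hash match at block k implies nothing
// about earlier blocks.
func longestMatchIndependent(local, remote []uint64) int {
	n := 0
	for n < len(local) && n < len(remote) && local[n] == remote[n] {
		n++
	}
	return n
}

func main() {
	a := []uint64{10, 20, 30, 40}
	b := []uint64{10, 20, 99, 98}
	fmt.Println(longestMatchChained(a, b))     // 2
	fmt.Println(longestMatchIndependent(a, b)) // 2
}
```

Both return the same answer here; the difference is O(log n) vs. O(n) comparisons. As noted above, when the router must evaluate every block anyway (e.g. to compute an overall hit rate), the binary-search shortcut buys nothing.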

Collaborator Author

Let me look into this, but for our use case we need to evaluate all blocks to ensure a 50%+ hit rate.

Collaborator

@gaocegege gaocegege left a comment

Some nits.

Collaborator

@Jeffwan Jeffwan left a comment

Let's move a little bit faster and track those TODOs in separate issues

@varungup90 varungup90 merged commit f40c973 into main Feb 10, 2025
9 of 10 checks passed
@varungup90 varungup90 deleted the add-prefix-cache branch February 10, 2025 22:09
gangmuk pushed a commit to gangmuk/aibrix-gangmuk that referenced this pull request Jun 21, 2025
* Add prefix cache aware routing

* end to end stiching

* fix lint errors

* nit

* add integ test for prefix caching

* add constants

* address review comments

* add prefix cache eviction

* add unit test for prefix cache eviction
Yaegaki1Erika pushed a commit to Yaegaki1Erika/aibrix that referenced this pull request Jul 23, 2025
4 participants