[Bug] LoRA buffer eviction does not correctly handle adapters with different target weights #7426

Open
@lifuhuang

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. The bug has not been fixed in the latest version.
  • 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
  • 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/sgl-project/sglang/discussions/new/choose Otherwise, it will be closed.
  • 5. Please use English, otherwise it will be closed.

Describe the bug

In the load_lora_weight_to_buffer function, we zero out A_buffer when uid == None (code reference) to prevent leftover weights of the previously evicted LoRA adapters from interfering with subsequent computations.

However, I suspect we should do the same when uid != None, because different adapters can target different modules (e.g., some adapters do not target k_proj). Our code might not handle this case correctly. For example, suppose we have two adapters: lora1 targets k_proj and lora2 does not. If lora2 reuses the memory buffer left behind by lora1 after its eviction, the k_proj weights of lora1 would remain in the buffer and could contaminate the computation for lora2. I discussed this with @Fridge003 and @Qiaolin-Yu offline and they share this suspicion.
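To make the suspected failure mode concrete, here is a minimal toy sketch in plain Python. It is not SGLang code: the dict stands in for one GPU buffer slot, each module holds a single scalar "weight", and the names load_adapter, apply_slot, and MODULES are hypothetical, chosen only for illustration.

```python
# Toy model only: a dict stands in for one GPU buffer slot, and all names
# here (load_adapter, apply_slot, MODULES) are hypothetical, not SGLang APIs.
MODULES = ["q_proj", "k_proj", "v_proj"]


def make_slot():
    # One buffer slot holds one scalar "weight" per target module.
    return {m: 0.0 for m in MODULES}


def load_adapter(slot, adapter_weights, zero_first):
    # Copy the adapter's weights into the slot. When zero_first is False,
    # modules the adapter does not target keep whatever was there before.
    if zero_first:
        for m in MODULES:
            slot[m] = 0.0
    for m, w in adapter_weights.items():
        slot[m] = w


def apply_slot(slot, x):
    # Stand-in for the LoRA delta: sum of weight * input over all modules.
    return sum(slot[m] * x for m in MODULES)


lora1 = {"q_proj": 1.0, "k_proj": 2.0, "v_proj": 3.0}
lora2 = {"q_proj": 10.0, "v_proj": 30.0}  # does not target k_proj

# Case 1: lora2 reuses lora1's slot without zeroing.
slot = make_slot()
load_adapter(slot, lora1, zero_first=False)
load_adapter(slot, lora2, zero_first=False)
stale = apply_slot(slot, 1.0)

# Case 2: same sequence, but the slot is zeroed before lora2 is loaded.
slot = make_slot()
load_adapter(slot, lora1, zero_first=False)
load_adapter(slot, lora2, zero_first=True)
clean = apply_slot(slot, 1.0)

print(stale, clean)
```

In this toy model the two results differ by exactly lora1's leftover k_proj contribution, which is the kind of contamination described above. Whether the real buffers behave the same way still needs to be verified against the actual code.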

As this is a rare corner case, I have not had a chance to construct a test to verify it. I am creating this issue to track the potential bug. We need to:

  1. verify: construct a test case to reproduce the issue, e.g., by setting max-loras-per-batch = 1 and serving 2 adapters with different target weights.
  2. fix: always zero out the buffer during GPU buffer eviction.
  3. benchmark: measure the performance overhead introduced by the zero-out operation.
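The shape of the verification step could look roughly like the following toy harness. It is only a sketch under stated assumptions: ToyLoRAPool, reference_delta, and run are hypothetical names, the "evict slot 0" policy is a simplification, and scalars stand in for real weight tensors. The idea is to compare each adapter's output from a capacity-1 pool against a reference computed from that adapter alone.

```python
# Hypothetical toy pool for illustration; not SGLang's memory pool.
MODULES = ("q_proj", "k_proj", "v_proj")


class ToyLoRAPool:
    def __init__(self, max_loras_per_batch, zero_on_load):
        # One dict of per-module scalar weights per buffer slot.
        self.slots = [dict.fromkeys(MODULES, 0.0) for _ in range(max_loras_per_batch)]
        self.owner = [None] * max_loras_per_batch  # uid currently held by each slot
        self.zero_on_load = zero_on_load

    def load(self, uid, weights):
        if uid in self.owner:
            return self.owner.index(uid)
        # Prefer a free slot; otherwise evict slot 0 (simplest possible policy).
        idx = self.owner.index(None) if None in self.owner else 0
        if self.zero_on_load:
            for m in MODULES:
                self.slots[idx][m] = 0.0
        self.slots[idx].update(weights)
        self.owner[idx] = uid
        return idx

    def delta(self, idx, x):
        # Stand-in for the LoRA contribution computed from the slot contents.
        return sum(w * x for w in self.slots[idx].values())


def reference_delta(weights, x):
    # Expected contribution when only the adapter's own weights are used.
    return sum(w * x for w in weights.values())


ADAPTERS = {
    "lora1": {"q_proj": 1.0, "k_proj": 2.0, "v_proj": 3.0},
    "lora2": {"q_proj": 10.0, "v_proj": 30.0},  # no k_proj
}


def run(zero_on_load):
    # Capacity 1 forces lora2 to evict lora1 and reuse its slot.
    pool = ToyLoRAPool(max_loras_per_batch=1, zero_on_load=zero_on_load)
    results = {}
    for uid in ("lora1", "lora2"):
        idx = pool.load(uid, ADAPTERS[uid])
        results[uid] = pool.delta(idx, 1.0)
    return results


for flag in (False, True):
    out = run(flag)
    ok = all(out[u] == reference_delta(ADAPTERS[u], 1.0) for u in ADAPTERS)
    print(f"zero_on_load={flag}: matches reference = {ok}")
```

A real test would follow the same pattern with actual adapters and compare against outputs obtained when each adapter is served on its own.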

Reproduction

See first comment.

Environment

Bug is environment agnostic
