Skip to content

Proposal: Use Bucketed Symbol Index for Fast "Find All References" #19951

@teraoka-k

Description

@teraoka-k

Motivation

Currently, rust-analyzer’s "Find All References" can become slow on large projects due to scanning many files and symbols repeatedly. I propose a bucketed symbol-to-files index to dramatically speed up reference searches.

Algorithm Overview

Initialization (Run once on analyzer startup)

  1. Traverse all files (N files) and collect a unique set of symbols.
    • Complexity: O(N) assuming bounded symbols per file.
  2. Traverse all files again to create a map from symbol to the files that contain it (M symbols).
    • Data structure: HashMap<SymbolID, HashSet>
    • Complexity: O(M)

Since these two steps run sequentially, total cost is O(max(N, M)) — linear time, which is efficient for large codebases.

Finding References (find_references_by_bucket)

When a user requests all references for symbol S:

  1. Lookup the list of files from the bucket index in O(1) time.
  2. Search for references to S only in these files, rather than the whole project.
  3. Return the found references.

This search takes O(c) or at worst O(cN), where c is the cost to search a single file. This results in a very fast lookup for most cases, and remains performant even in worst-case scenarios.

Handling New or Modified Files

To maintain correctness incrementally:

  1. On new or edited files, track their paths in a Vec (or similar structure).
  2. When running find_references_by_bucket, include these new/modified files in the search along with the precomputed bucket files for the symbol.

This approach avoids the need for full reindexing on every file change and keeps the system responsive.

Benefits

  • Significant speedup for "Find All References" due to drastically reduced search space (O(1) in most cases and O(N) in the worst case).
  • Linear-time indexing on startup is practical for large projects.
  • Incremental updates keep the index accurate without full reprocessing.
  • Simple data structures and algorithms make it feasible to implement and maintain.

Next Steps

I would like feedback on this approach and am happy to work on a prototype implementation if the maintainers find this direction promising.

Activity

teraoka-k

teraoka-k commented on Jun 9, 2025

@teraoka-k
Author

Related issue: #17491

flodiebold

flodiebold commented on Jun 9, 2025

@flodiebold
Member

What makes finding references slow is AFAIK not finding the initial candidates, but verifying that they actually refer to the requested definition, which requires name resolution and type inference. So I'm not convinced this proposal would have much of an effect. And such an index would require a lot of memory, which we use a lot of already.

flodiebold

flodiebold commented on Jun 9, 2025

@flodiebold
Member

Or to put it differently, rg new in the rust-analyzer repo takes about 20ms. That's how fast finding candidates can be without an index.

teraoka-k

teraoka-k commented on Jun 10, 2025

@teraoka-k
Author

Got it. Thank you for the quick and well-reasoned response. I’ll go ahead and close this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    C-featureCategory: feature request

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @flodiebold@teraoka-k

        Issue actions

          Proposal: Use Bucketed Symbol Index for Fast "Find All References" · Issue #19951 · rust-lang/rust-analyzer