Skip to content

optimization opportunity: typeindex #1652

Open
@adonovan

Description

@adonovan

Hi Dominik. I'm looking at gopls' DidChange latency now that it runs many staticcheck analyzers by default. The benchmark spends about 16% of CPU in pattern matching, and three analyzers in particular stand out: s1038, qf1004, sa4030. I notice that all three involve an Or pattern with several cases, and I suspect this contributes to the slower running time.

A trick we started using recently in gopls is the typeindex helper analyzer, which builds a reverse index of types.Info.Uses and Selections, allowing other analyzers to do two things quickly:

  1. skip files that don't reference a specific symbol, even if they do import the symbol's package; and
  2. go directly to the references of those symbols, avoiding a traversal.

It occurs to me that all three of these analyzers could take advantage of the index, even if just for item 1, since all of the analyzers depend on symbols that are not super-common and thus a quick check to skip the entire file or package would be a big win.

Of course, typeindex is internal, but you could easily fork it, and eventually we could publish it (after any refinements).

Longer term, it would be neat if your pattern matcher could automatically compute the fast path condition. For example, a pattern such as (AND X (OR Y Z)) could be dismissed if the index for the package doesn't contain X, or doesn't contain at least one of Y and Z.

If you want to get really sophisticated you could apply the same trick at the node level. For example, ahead of time you could do (something like) build a bit set of Cursor.Index values for all references to symbol Y. Then when you check the OR node against a candidate Node n, you can test whether the Cursor.Index range represented by n's subtree intersects with the bitset. If not, you can skip that subtree. (Currently the Cursor.Index value is the index of the "push" event; you can't currently ask for the pop event directly but we could add that; you'd have to compute it from the next event in the preorder sequence.) I bet there are a variety of strategies we could use to preindex the tree to make such matching more efficient; this is just the first this came to mind.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions