feat: Cache parameters & return type during call_unsafe_wdf_function_binding macro expansion #295
Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.
Low priority question: what does the benchmarking look like with
set in Cargo.toml to optimize macros used during build? This saved me significant time compiling a complex driver that makes many calls to call_unsafe_wdf_function_binding. I expect your change easily outperforms it, but I'm curious whether enabling this in Cargo.toml still provides a meaningful speedup on top of your change.
Oh that's interesting. I didn't know proc-macros always compile with no optimizations. If having
Something else of note from the cargo docs:
Because of this, it may be worth using a more targeted approach:
    [profile.dev.package.<PKG NAME>]
    opt-level = 3
We can have the above snippet for all packages that:
Thanks for digging into this and making this change! I've definitely been meaning to make this change for a long while now, but never had the time to :) Glad you could implement it and that it's showing significant benefit.
I left various comments, mostly on testing and structuring for testing, plus some small style things and some potential perf bumps.
Co-authored-by: Melvin Wang <[email protected]> Signed-off-by: Leon Durrenberger <[email protected]>
Copilot reviewed 16 out of 16 changed files in this pull request and generated no comments.
Comments suppressed due to low confidence (1)
crates/wdk-macros/tests/unit-tests-input/generated-types-robust.rs:13
- [nitpick] There's an extra space before the period in the comment. Consider removing the space after `TableIndex` for consistency.
// Build script should ignore any entry that doesn't end in `TableIndex` .
We can certainly add it to our templates. We just need to investigate and establish which build dependencies should receive this treatment. FYI @krishnakumar4a4 and @svasista-ms. No action needed right now, but something to consider for later.
Overview
Currently, each invocation of the call_unsafe_wdf_function_binding macro loads the entire types.rs file into memory and parses it for relevant information. This is very inefficient because types.rs is generated to be over 200,000 lines long, and we are searching the entire file for just a few lines of information.

This PR instead introduces a caching mechanism that stores the relevant information from types.rs in storage, so that we don't have to re-parse the entire file every time we need to look up a type or return value. This is done in generate_derived_ast_fragments by generating a temporary scratch directory, dumping the relevant information into a file in that directory, and then loading that file into memory when we need to look up the parameters & return value. This generated file is only 32 KB, as opposed to types.rs, which is 5 MB.

Results
Using the crox tool, we can evaluate the difference in performance of this change. The previous version of this implementation took around 3 seconds per invocation.

After this change, the first invocation of the macro takes longer, a bit under 5 seconds, because it also has to build the cache.

But after each file is cached, each subsequent invocation of the macro takes around 3 milliseconds.

The time saved per build scales with the number of WDF function invocations, meaning this greatly improves the developer experience.
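
For readers who want a feel for the approach described in the Overview, here is a minimal sketch of the "parse once, then read a small cache file" pattern. It is not the code from this PR: the names `CachedSignature`, `extract_signatures`, `write_cache`, `read_cache` and `load_signatures` are hypothetical, the tab-separated cache format is an assumption, and the string matching stands in for real AST parsing. It also assumes each function pointer alias sits on a single line.

```rust
use std::collections::HashMap;
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical cache entry: a parameter list and return type kept as plain text.
#[derive(Debug, Clone, PartialEq)]
pub struct CachedSignature {
    pub parameters: String,
    pub return_type: String,
}

/// Slow path: scan the whole generated source for one-line aliases shaped like
/// `pub type PFN_NAME = Option<unsafe extern "C" fn(PARAMS) -> RET>;`.
/// The shape is an assumption made for this sketch.
pub fn extract_signatures(types_source: &str) -> HashMap<String, CachedSignature> {
    let mut signatures = HashMap::new();
    for line in types_source.lines() {
        let Some(rest) = line.trim().strip_prefix("pub type PFN_") else { continue };
        let Some((name, definition)) = rest.split_once(" = ") else { continue };
        let Some(start) = definition.find("fn(") else { continue };
        let Some(end) = definition.rfind(')') else { continue };
        if end < start + 3 {
            continue;
        }
        let parameters = definition[start + 3..end].trim().to_string();
        // A missing `-> RET` is treated as the unit return type.
        let return_type = definition[end + 1..]
            .trim()
            .strip_prefix("->")
            .map(|ret| ret.trim_end_matches(';').trim_end_matches('>').trim().to_string())
            .unwrap_or_else(|| "()".to_string());
        signatures.insert(name.to_string(), CachedSignature { parameters, return_type });
    }
    signatures
}

/// Write one `name<TAB>parameters<TAB>return_type` line per signature.
pub fn write_cache(
    cache_path: &Path,
    signatures: &HashMap<String, CachedSignature>,
) -> io::Result<()> {
    let mut contents = String::new();
    for (name, sig) in signatures {
        contents.push_str(&format!("{name}\t{}\t{}\n", sig.parameters, sig.return_type));
    }
    fs::write(cache_path, contents)
}

/// Fast path: read the small cache file back into a map.
pub fn read_cache(cache_path: &Path) -> io::Result<HashMap<String, CachedSignature>> {
    let contents = fs::read_to_string(cache_path)?;
    let mut signatures = HashMap::new();
    for line in contents.lines() {
        let mut fields = line.splitn(3, '\t');
        if let (Some(name), Some(parameters), Some(return_type)) =
            (fields.next(), fields.next(), fields.next())
        {
            let entry = CachedSignature {
                parameters: parameters.to_string(),
                return_type: return_type.to_string(),
            };
            signatures.insert(name.to_string(), entry);
        }
    }
    Ok(signatures)
}

/// Use the cache when it exists; otherwise parse the large file once and create it.
pub fn load_signatures(
    types_path: &Path,
    cache_path: &Path,
) -> io::Result<HashMap<String, CachedSignature>> {
    if cache_path.exists() {
        return read_cache(cache_path);
    }
    let signatures = extract_signatures(&fs::read_to_string(types_path)?);
    write_cache(cache_path, &signatures)?;
    Ok(signatures)
}
```

The first call to `load_signatures` pays for the full scan plus the cache write, and later calls only read the small file, which matches the shape of the timings above. The sketch deliberately leaves out cache invalidation and any coordination between concurrent invocations, both of which a real implementation would have to consider.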