Description
Hi!
Since the library cares a lot about performance (according to the README), I decided to try Profile-Guided Optimization (PGO) on it, as I have already done for many other applications (see my GitHub repo). I hope the results will be interesting to someone.
Test environment
- Fedora 39
- Linux kernel 6.6.13
- AMD Ryzen 9 5900X
- 48 GiB RAM
- SSD Samsung 980 Pro 2 TiB
- Compiler: rustc 1.75
- minitrace-rust version: the latest at the time of writing, from the master branch at commit ee49263f56ebbe038ca70377166d319fdb170601
- Disabled Turbo boost (for more stable results across benchmark runs)
Benchmark
The built-in benchmarks are run in three configurations:
- Release (baseline): cargo bench --all-features --workspace
- PGO instrumentation phase: cargo pgo bench -- --all-features --workspace
- PGO optimization phase: cargo pgo optimize bench -- --all-features --workspace
All PGO steps are done with the cargo-pgo tool.
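For anyone who wants to reproduce the runs, here is a minimal sketch of the sequence as a shell script. It only wraps the three commands quoted above. The `run` helper and the `DRY_RUN` switch are my own additions for illustration (they are not part of cargo or cargo-pgo), and by default the script only prints the commands. It assumes cargo-pgo is already installed (`cargo install cargo-pgo`) together with the `llvm-tools-preview` rustup component.

```shell
#!/bin/sh
# Sketch: run the baseline, PGO-instrumented and PGO-optimized benchmarks in order.
# Assumption: executed from the root of the minitrace-rust checkout.
set -eu

# Hypothetical switch: 1 (default) only prints the commands, 0 executes them.
DRY_RUN="${DRY_RUN:-1}"

# Print a command and, unless in dry-run mode, execute it.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

pgo_bench_steps() {
    # 1. Release baseline
    run cargo bench --all-features --workspace
    # 2. Instrumented build; running the benchmarks collects the profiles
    run cargo pgo bench -- --all-features --workspace
    # 3. Rebuild using the collected profiles and benchmark again
    run cargo pgo optimize bench -- --all-features --workspace
}

pgo_bench_steps
```

Running it with `DRY_RUN=0` executes the commands instead of printing them. The order matters: the optimization phase relies on the profiles gathered during the instrumentation phase.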
Results
I got the following results:
- Release: https://gist.github.com/zamazan4ik/741d89a5e553433a50b7e74415b3748a
- PGO optimized compared to Release: https://gist.github.com/zamazan4ik/9a64b42e8618a3bc66c586ceae796d7e
- (just for reference) PGO instrumented compared to Release: https://gist.github.com/zamazan4ik/7ca47ae3aa8decf0a7cb1e1768996c7d
At least in the benchmarks provided by the project, I see measurable performance improvements in many cases.
Possible further steps
I can suggest the following things to consider:
- Perform more PGO benchmarks in other scenarios. If they also show improvements, add a note to the documentation about the possible performance gains for the tracing library with PGO (I guess somewhere in the README file will be enough).
I will be happy to answer all your questions about PGO.