Description
Hi!
Recently I checked Profile-Guided Optimization (PGO) improvements on multiple projects. The results are here. E.g. PGO results for LLVM-related tooling are here. According to the tests, PGO usually helps with compilers and compiler-like workloads (like static analysis or code formatters), e.g. Clang gets +20% compilation speed with PGO. Given this, I think trying to optimize Slint tools like the compiler, code formatter and LSP would be a good idea. I already did some PGO benchmarks on `slint-fmt` and want to share my results here.
Test environment
- Fedora 39
- Linux kernel 6.5.11
- AMD Ryzen 9 5900X
- 48 GiB RAM
- SSD Samsung 980 Pro 2 TiB
- Compiler: rustc 1.73
- Slint version: the latest commit on the `master` branch at the time of testing, df7657dc2d1fb37f17ff0f7285af68bf6704fd22
Benchmark
For benchmark purposes, I ran `slint-fmt` on all `.slint` files in the `examples` directory with `slint-fmt <file names>`. The PGO training phase was done on the same files. The Release build is done with `cargo build --release --bin slint-fmt`. The PGO build is done with cargo-pgo (`cargo pgo build -- --bin slint-fmt` + run on the training workload + `cargo pgo optimize build -- --bin slint-fmt`).
All benchmarks were run on the same machine with the same background "noise". I repeated the benchmark multiple times and the results were reproducible.
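The build-and-train sequence above can be sketched as a small script. This is only an illustration under a few assumptions: cargo-pgo is installed, the script is run from the repository root, and the instrumented binary ends up under `target/<triple>/release/` (the x86_64 Linux triple below matches my test machine and is an assumption elsewhere). The `DRY_RUN` switch and the `run` helper are hypothetical conveniences, not part of any Slint tooling.

```shell
#!/bin/sh
# Sketch of the PGO workflow for slint-fmt with cargo-pgo.
# By default the commands are only printed; set DRY_RUN=0 to execute them.
set -eu

DRY_RUN="${DRY_RUN:-1}"
EXAMPLES_DIR="${EXAMPLES_DIR:-examples}"
# Assumption: target triple of the test machine; adjust for other platforms.
TARGET_TRIPLE="${TARGET_TRIPLE:-x86_64-unknown-linux-gnu}"
INSTRUMENTED_BIN="target/${TARGET_TRIPLE}/release/slint-fmt"

# Print a command and, unless in dry-run mode, execute it.
# The command's stdout is discarded, in case the formatter prints formatted source.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@" > /dev/null
    fi
}

pgo_workflow() {
    # 1. Build an instrumented binary.
    run cargo pgo build -- --bin slint-fmt
    # 2. Training: run the instrumented formatter on every .slint example.
    run find "$EXAMPLES_DIR" -name '*.slint' -exec "$INSTRUMENTED_BIN" {} +
    # 3. Rebuild using the collected profiles.
    run cargo pgo optimize build -- --bin slint-fmt
}

pgo_workflow
```

Running it as-is only lists the three steps; `DRY_RUN=0 sh pgo.sh` (the file name is arbitrary) would actually perform them.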
Results
I got the following results (with the `hyperfine` benchmark tool):
hyperfine --warmup 500 --min-runs 5000 './slint-fmt_release skipped_input_files' './slint-fmt_optimized skipped_input_files'
Benchmark 1: ./slint-fmt_release skipped_input_files
Time (mean ± σ): 18.0 ms ± 1.0 ms [User: 14.1 ms, System: 3.7 ms]
Range (min … max): 17.3 ms … 36.9 ms 5000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Benchmark 2: ./slint-fmt_optimized skipped_input_files
Time (mean ± σ): 15.8 ms ± 0.8 ms [User: 11.9 ms, System: 3.8 ms]
Range (min … max): 15.3 ms … 33.6 ms 5000 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.
Summary
./slint-fmt_optimized skipped_input_files ran
1.14 ± 0.08 times faster than ./slint-fmt_release skipped_input_files
where `slint-fmt_release` is the default Release build and `slint-fmt_optimized` is the Release + PGO-optimized build.
At least in the scenario above, PGO helps with achieving better performance with `slint-fmt`.
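As a quick sanity check, the summary line can be reproduced from the mean and σ values above using first-order error propagation for a ratio. The `speedup` helper below is hypothetical, and because it works from the rounded figures in the output, its uncertainty can differ slightly from what hyperfine prints.

```shell
#!/bin/sh
# Ratio of two mean run times with a propagated uncertainty.
# Arguments: slow_mean slow_sd fast_mean fast_sd (all in the same unit).
speedup() {
    awk -v a="$1" -v sa="$2" -v b="$3" -v sb="$4" 'BEGIN {
        r = a / b                                     # relative speed
        err = r * sqrt((sa / a) ^ 2 + (sb / b) ^ 2)   # propagated sigma
        printf "%.2f +/- %.2f\n", r, err
    }'
}

# Rounded values from the hyperfine output above (milliseconds).
speedup 18.0 1.0 15.8 0.8
```

With these rounded inputs it prints `1.14 +/- 0.09`, in line with the reported 1.14 ± 0.08.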
Further steps
I can suggest the following action points:
- Perform more PGO benchmarks on Slint tools. If they show improvements, add a note to the documentation about the possible performance gains for Slint tools with PGO.
- Provide an easier way (e.g. a build option) to build the tools with PGO. This can be helpful for end users and maintainers, since they will be able to optimize Slint tools for their own workloads.
- Optimize pre-built binaries with PGO (if it's possible to prepare a generic enough profile).
Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.
Here are some examples of how PGO optimization is already integrated into other projects:
- Rustc: a CI script for the multi-stage build
- GCC:
- Clang: Docs
- Python:
- Go: Bash script
- V8: Bazel flag
- ChakraCore: Scripts
- Chromium: Script
- Firefox: Docs
- Thunderbird has PGO support too
- PHP - Makefile command and old Centminmod scripts
- MySQL: CMake script
- YugabyteDB: GitHub commit
- FoundationDB: Script
- Zstd: Makefile
- Foot: Scripts
- Windows Terminal: GitHub PR
- Pydantic-core: GitHub PR
- file.d: GitHub PR
- OceanBase: CMake flag
I am not sure how critical performance is for Slint right now. If it isn't a top priority at the moment, this could still be a nice feature to add in the future.