Evaluate using Profile-Guided Optimization (PGO) #3909

@zamazan4ik

Hi!

Recently I evaluated Profile-Guided Optimization (PGO) on multiple projects. The results are here. E.g. PGO results for LLVM-related tooling are here. According to the tests, PGO usually helps with compilers and compiler-like workloads (such as static analyzers or code formatters); e.g. Clang compiles about 20% faster with PGO. Given this, I think trying to optimize Slint tools like the compiler, the code formatter, and the LSP with PGO would be a good idea. I have already run some PGO benchmarks on slint-fmt and want to share my results here.

Test environment

  • Fedora 39
  • Linux kernel 6.5.11
  • AMD Ryzen 9 5900X
  • 48 GiB RAM
  • SSD Samsung 980 Pro 2 TiB
  • Compiler - Rustc 1.73
  • Slint version: latest master branch at the time of testing, commit df7657dc2d1fb37f17ff0f7285af68bf6704fd22

Benchmark

For benchmarking, I ran slint-fmt on all .slint files in the examples directory with slint-fmt <file names>. The PGO training phase used the same files. The Release build was done with cargo build --release --bin slint-fmt. The PGO build was done with cargo-pgo: cargo pgo build -- --bin slint-fmt, then a run on the training workload, then cargo pgo optimize build -- --bin slint-fmt.
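
The steps above can be sketched as a small script. This is only an illustration, not the exact commands used for the measurements: the function names (step, pgo_build_slint_fmt) are hypothetical, the script assumes it runs from the Slint repository root with cargo-pgo installed, and it assumes the instrumented binary ends up under target/<host-triple>/release/.

```shell
#!/usr/bin/env bash
# Sketch of the PGO workflow described above, using cargo-pgo.
# Assumptions (not from the original post): run from the Slint repository
# root, cargo-pgo is installed, and the instrumented binary is placed in
# target/<host-triple>/release/.

# Run a command, or only print it when DRY_RUN=1.
step() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pgo_build_slint_fmt() {
    local triple="$1"   # e.g. x86_64-unknown-linux-gnu (see: rustc -vV)
    local bin="target/$triple/release/slint-fmt"

    # 1. Build an instrumented binary.
    step cargo pgo build -- --bin slint-fmt
    # 2. Training run: pass every .slint file under examples/ to slint-fmt.
    step find examples -name '*.slint' -exec "$bin" {} +
    # 3. Rebuild using the collected profiles.
    step cargo pgo optimize build -- --bin slint-fmt
}
```

To preview the commands without running anything, call DRY_RUN=1 pgo_build_slint_fmt "$(rustc -vV | sed -n 's/^host: //p')". Dropping DRY_RUN=1 runs the three steps for real.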

All benchmarks were run on the same machine, with the same background "noise". The benchmark was repeated multiple times, and the results are reproducible.

Results

I got the following results (with the hyperfine benchmarking tool):

hyperfine --warmup 500 --min-runs 5000 './slint-fmt_release skipped_input_files' './slint-fmt_optimized skipped_input_files'

Benchmark 1: ./slint-fmt_release skipped_input_files
  Time (mean ± σ):      18.0 ms ±   1.0 ms    [User: 14.1 ms, System: 3.7 ms]
  Range (min … max):    17.3 ms …  36.9 ms    5000 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Benchmark 2: ./slint-fmt_optimized skipped_input_files
  Time (mean ± σ):      15.8 ms ±   0.8 ms    [User: 11.9 ms, System: 3.8 ms]
  Range (min … max):    15.3 ms …  33.6 ms    5000 runs

  Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs. It might help to use the '--warmup' or '--prepare' options.

Summary
  ./slint-fmt_optimized skipped_input_files ran
    1.14 ± 0.08 times faster than ./slint-fmt_release skipped_input_files

where slint-fmt_release is the default Release build and slint-fmt_optimized is the Release + PGO-optimized build.

At least in the scenario above, PGO improves slint-fmt performance: the mean run time drops from 18.0 ms to 15.8 ms.
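
As a quick sanity check of the summary line, the ratio of the two mean times can be recomputed by hand. The speedup helper below is a hypothetical name introduced only for this illustration. It reproduces the ratio of the means, not hyperfine's ± uncertainty estimate.

```shell
# speedup BASELINE_MS OPTIMIZED_MS
# Prints the ratio of the means and the relative time saved.
speedup() {
    LC_ALL=C awk -v base="$1" -v opt="$2" 'BEGIN {
        printf "%.2fx faster, %.1f%% less time\n", base / opt, (base - opt) / base * 100
    }'
}

# Mean times reported by hyperfine above (Release vs. PGO-optimized).
speedup 18.0 15.8   # prints: 1.14x faster, 12.2% less time
```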

Further steps

I can suggest the following action points:

  • Perform more PGO benchmarks on Slint tools. If they show improvements, add a note to the documentation about the possible performance gains from building Slint tools with PGO.
  • Provide an easier way (e.g. a build option) to build the Slint tools with PGO. This can help end users and maintainers, since they will be able to optimize the tools for their own workloads.
  • Optimize pre-built binaries with PGO (if it's possible to prepare a generic enough profile).

Testing Post-Link Optimization techniques (like LLVM BOLT) would be interesting too (Clang and Rustc already use BOLT in addition to PGO), but I recommend starting with regular PGO.

Here are some examples of how PGO optimization is already integrated into other projects:

I am not sure how critical performance is for Slint right now. If it is not a top priority at the moment, this could be a nice feature to add in the future.
