speed up RodasTableau constructors #2712

oscardssmith
Member

Fixes #2710.
It turns out that hvncat is much more expensive than it should be and isn't constant-folded.
Before:

julia> @benchmark OrdinaryDiffEqRosenbrock.Rodas4PTableau(Float64, Float64)
BenchmarkTools.Trial: 10000 samples with 4 evaluations per sample.
 Range (min … max):   6.920 μs …   7.793 ms  ┊ GC (min … max):  0.00% … 99.63%
 Time  (median):     11.370 μs               ┊ GC (median):     0.00%
 Time  (mean ± σ):   13.247 μs ± 105.500 μs  ┊ GC (mean ± σ):  11.22% ±  1.41%

                ▂▆███▆▂▁                                        
  ▃█▇▆▄▂▂▂▁▁▁▁▂▆█████████▆▅▄▄▃▃▂▂▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  6.92 μs         Histogram: frequency by time         21.5 μs <
 Memory estimate: 22.20 KiB, allocs estimate: 142.

After:

julia> @benchmark OrdinaryDiffEqRosenbrock.Rodas4PTableau(Float64, Float64)
BenchmarkTools.Trial: 10000 samples with 1000 evaluations per sample.
 Range (min … max):  1.939 ns … 11.273 ns  ┊ GC (min … max): 0.00% … 0.00%
 Time  (median):     2.219 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   2.208 ns ±  0.198 ns  ┊ GC (mean ± σ):  0.00% ± 0.00%

                    ▂                             █▂          
  ▁▂▄▂▁▁▁▂▂▁▁▁▃▃▁▁▁▁██▂▁▁▁▁▃▂▂▂▃▆▁▁▁▁▂▃▂▁▁▂▄▄▂▁▁▁▆██▂▂▁▁▂▃▄▂ ▂
  1.94 ns        Histogram: frequency by time        2.38 ns <

 Memory estimate: 0 bytes, allocs estimate: 0.

julia> @benchmark OrdinaryDiffEqRosenbrock.Rodas4PTableau(Float32, Float32)
BenchmarkTools.Trial: 10000 samples with 909 evaluations per sample.
 Range (min … max):  124.638 ns …   1.764 μs  ┊ GC (min … max):  0.00% … 87.86%
 Time  (median):     139.304 ns               ┊ GC (median):     0.00%
 Time  (mean ± σ):   169.512 ns ± 179.489 ns  ┊ GC (mean ± σ):  15.16% ± 12.63%

  █▅▃                                                           ▁
  ████▇▇▇▅▃▅▅▅▅▃▃▃▁▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▄▃▁▅▅▆▄▆▇▇█ █
  125 ns        Histogram: log(frequency) by time       1.38 μs <

 Memory estimate: 704 bytes, allocs estimate: 10.
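
The diff itself is not shown in this conversation, so the following is only a minimal sketch of the general idea, not the PR's actual change. It assumes the cost comes from rebuilding a coefficient matrix from an array literal on every constructor call, and it contrasts that with returning a matrix that was built once. The names `build_literal`, `cached`, and `A_F64`, as well as the coefficient values, are hypothetical and are not taken from OrdinaryDiffEqRosenbrock.

```julia
# Hypothetical 3x3 coefficients (NOT the real Rodas4P values).
# A matrix literal is lowered to a concatenation call that runs at runtime,
# so every call to `build_literal` allocates a fresh Matrix.
build_literal(::Type{T}) where {T} = T[0 0 0; 1 0 0; 2 3 0]

# Build the Float64 matrix once, when the file is loaded.
const A_F64 = build_literal(Float64)

# Fast path: hand back the precomputed matrix without rebuilding it.
cached(::Type{Float64}) = A_F64

# Generic fallback: convert the stored coefficients to the requested
# element type. This path still allocates a new Matrix{T} per call.
cached(::Type{T}) where {T} = Matrix{T}(A_F64)
```

Note that in this sketch `cached(Float64)` returns the same mutable array on every call, so callers must not modify it. Whether the real constructors take this approach or another one (for example, avoiding the concatenation call in some other way) is not stated here.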

@oscardssmith
Member Author

Note that as of Julia 1.12, this probably won't be nearly as important due to JuliaLang/julia#39729, but we might as well write it this way anyway...

@oscardssmith
Member Author

Any idea if https://github.com/SciML/OrdinaryDiffEq.jl/actions/runs/15029774249/job/42239283382?pr=2712 could be related? I have a hard time imagining this changing anything...

@ChrisRackauckas
Member

No, that's on master, and I haven't had time to track down what it is.

@ChrisRackauckas ChrisRackauckas merged commit 2c592c5 into SciML:master May 14, 2025
135 of 150 checks passed

Successfully merging this pull request may close these issues.

Rodas4P() solver dominated by 2 allocations. Any way to get around this?