Memory optimizations #90

Closed · wants to merge 1 commit

Conversation

fatkodima

As part of sidekiq/sidekiq#5768, I noticed that this gem allocates quite a lot of memory, and I was able to reduce it in this PR.

I ran a slightly tweaked version of Sidekiq's benchmark on Ruby 3.2.0:

diff --git a/bin/sidekiqload b/bin/sidekiqload
index 8159d424..cf52e78c 100755
--- a/bin/sidekiqload
+++ b/bin/sidekiqload
@@ -57,7 +57,7 @@ end

 class Loader
   def initialize
-    @iter = ENV["GC"] ? 10 : 500
+    @iter = ENV["GC"] ? 10 : 100
     @count = Integer(ENV["COUNT"] || 1_000)
     @latency = Integer(ENV["LATENCY"] || 1)
   end
@@ -130,7 +130,7 @@ class Loader

   def monitor
     @monitor = Thread.new do
-      GC.start
+      # GC.start
       loop do
         sleep 0.2
         qsize = Sidekiq.redis do |conn|
$ THREADS=1 LATENCY=0 PROFILE=1 bundle exec ruby-memory-profiler --scale-bytes --out tmp/memory_profiler.txt bin/sidekiqload

Before

Total allocated: 697.19 MB (6431073 objects)
Total retained:  2.57 MB (22652 objects)

allocated memory by gem
-----------------------------------
 231.40 MB  json-2.6.2
 194.04 MB  sidekiq/lib
 175.63 MB  redis-client-0.12.1
  50.74 MB  connection_pool-2.3.0
  22.96 MB  other
   9.69 MB  lib
   3.32 MB  concurrent-ruby-1.1.10
   2.64 MB  ruby-prof-1.4.5
   1.78 MB  activesupport-7.0.4.2
   1.67 MB  activerecord-7.0.4.2
   1.18 MB  rake-13.0.6
 858.80 kB  yard-0.9.28
 398.82 kB  sqlite3-1.6.0-x86_64-darwin
 376.38 kB  activemodel-7.0.4.2
 322.39 kB  i18n-1.12.0
 110.02 kB  toxiproxy-2.0.2
  57.38 kB  after_commit_everywhere-1.3.0
  10.78 kB  bundler-2.3.22
   4.02 kB  rubygems

allocated memory by file
-----------------------------------
 231.28 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/json-2.6.2/lib/json/common.rb
 106.24 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/redis-client-0.12.1/lib/redis_client/ruby_connection/resp3.rb
  60.06 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/processor.rb
  48.08 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/redis-client-0.12.1/lib/redis_client/ruby_connection/buffered_io.rb
  44.91 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/client.rb
  42.63 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/connection_pool-2.3.0/lib/connection_pool.rb
...

After

Total allocated: 600.07 MB (5626286 objects)
Total retained:  2.57 MB (22682 objects)

allocated memory by gem
-----------------------------------
 231.46 MB  json-2.6.2
 194.04 MB  sidekiq/lib
  78.64 MB  redis-client/lib
  50.74 MB  connection_pool-2.3.0
  22.96 MB  other
   9.69 MB  lib
   3.32 MB  concurrent-ruby-1.1.10
   2.44 MB  ruby-prof-1.4.5
   1.78 MB  activesupport-7.0.4.2
   1.67 MB  activerecord-7.0.4.2
   1.18 MB  rake-13.0.6
 858.80 kB  yard-0.9.28
 398.82 kB  sqlite3-1.6.0-x86_64-darwin
 376.28 kB  activemodel-7.0.4.2
 322.80 kB  i18n-1.12.0
 110.02 kB  toxiproxy-2.0.2
  57.47 kB  after_commit_everywhere-1.3.0
  10.76 kB  bundler-2.3.22
   4.02 kB  rubygems

allocated memory by file
-----------------------------------
 231.34 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/json-2.6.2/lib/json/common.rb
  60.06 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/processor.rb
  48.08 MB  /Users/fatkodima/Desktop/oss/redis-client/lib/redis_client/ruby_connection/buffered_io.rb
  44.91 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/client.rb
  42.63 MB  /Users/fatkodima/.asdf/installs/ruby/3.2.0/lib/ruby/gems/3.2.0/gems/connection_pool-2.3.0/lib/connection_pool.rb
  39.23 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/fetch.rb
  24.04 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/middleware/chain.rb
  16.82 MB  /Users/fatkodima/Desktop/oss/sidekiq/lib/sidekiq/job_logger.rb
  12.06 MB  /Users/fatkodima/Desktop/oss/redis-client/lib/redis_client/decorator.rb
   9.27 MB  /Users/fatkodima/Desktop/oss/redis-client/lib/redis_client/ruby_connection/resp3.rb
   8.88 MB  /Users/fatkodima/Desktop/oss/redis-client/lib/redis_client/command_builder.rb
....

@@ -54,6 +54,9 @@ def new_buffer
String.new(encoding: Encoding::BINARY, capacity: 127)
end

SIZE_TO_STRING = Hash.new { |h, k| h[k] = k.to_s }
Author

I am wondering if this is actually thread-safe? If not, we can just prepopulate an array of, for example, 100 sizes and use it instead of this hash.

Collaborator

Well, thread-safe can mean a lot of things, but on MRI it's acceptably safe (it may generate some extra strings in case of a race, but they'll be GC'd). On TruffleRuby or JRuby, however, I believe you may get problems.

But either way I don't think this is a good idea.

  • That hash can grow unbounded and will never be shrunk or reclaimed. That is basically a memory leak, and that hash will have to be marked regularly, slowing down GC pauses.
  • These small strings are entirely embedded, have no references, and are never held onto. So from a GC perspective they take very little time.

Generally speaking, allocations aren't necessarily a problem; they might be if they cause GC to trigger more often and GC is slow for other reasons. But here I really don't think it's worth it.
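
A tiny illustration of the unbounded-growth concern (the loop over 100,000 distinct sizes is made up for the example):

# The memoizing hash from the diff keeps one entry (key plus string) per
# distinct size ever seen, and nothing ever removes them.
SIZE_TO_STRING = Hash.new { |h, k| h[k] = k.to_s }

100_000.times { |i| SIZE_TO_STRING[i] }
puts SIZE_TO_STRING.size # => 100000 entries, all retained (and marked by GC) until exit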

Author

We can use something like

SIZE_TO_STRING = (0..100).to_a.map(&:to_s)
def size_to_string(size)
  SIZE_TO_STRING[size] || size.to_s
end

Assuming sizes should not be large numbers most of the time.

This saves 20 MB per 100k Sidekiq jobs, but yes, your points about these micro-optimizations are still valid.

Member

The memory space metric has to be interpreted carefully. All these strings will be under the embedded string limit, so they'll all use just one object slot without any associated malloc, which means very little impact on GC performance.

Allocations (both memsize and object count) can be an interesting proxy for code performance, but they have to be carefully interpreted. Fewer allocations don't always mean faster. Allocating an embedded object is just a pointer bump; it's incredibly cheap. It would be costly if the string were larger and had to call malloc.
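
A quick way to see the embedded vs. malloc'd distinction with ObjectSpace.memsize_of (exact numbers depend on the Ruby version and build, so treat them as illustrative):

require "objspace"

short = 42.to_s     # tiny payload, fits inside the object slot itself
long  = "x" * 1_000 # well past any embedding limit, needs a malloc'd buffer

puts ObjectSpace.memsize_of(short) # roughly just the slot size, no extra malloc
puts ObjectSpace.memsize_of(long)  # slot plus ~1000 bytes of heap-allocated data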

Member

I wouldn't be surprised if the size_to_string method call plus the hash lookup ended up being slower than the embedded string allocation.

Collaborator

require 'benchmark/ips'

SIZE_TO_STRING = (0..100).to_a.map(&:to_s)
def size_to_string(size)
  SIZE_TO_STRING[size] || size.to_s
end

puts "== hit =="
Benchmark.ips do |x|
  x.report("Integer#to_s") { 42.to_s }
  x.report("size_to_string") { size_to_string(42) }
  x.compare!(order: :baseline)
end

puts "== miss =="
Benchmark.ips do |x|
  x.report("Integer#to_s") { 420.to_s }
  x.report("size_to_string") { size_to_string(420) }
  x.compare!(order: :baseline)
end

puts "minor_gc: #{GC.stat(:minor_gc_count)}"

Results:

$ RUBY_GC_HEAP_INIT_SLOTS=1000000 ruby -v /tmp/int_to_str.rb 
ruby 3.2.0 (2022-12-25 revision a528908271) [arm64-darwin22]
RUBY_GC_HEAP_INIT_SLOTS=1000000 (default value: 10000)
== hit ==
Warming up --------------------------------------
        Integer#to_s     1.314M i/100ms
      size_to_string     1.431M i/100ms
Calculating -------------------------------------
        Integer#to_s     13.236M (± 2.4%) i/s -     67.021M in   5.066487s
      size_to_string     14.139M (± 2.0%) i/s -     71.528M in   5.061079s

Comparison:
        Integer#to_s: 13236033.7 i/s
      size_to_string: 14139247.5 i/s - 1.07x  (± 0.00) faster

== miss ==
Warming up --------------------------------------
        Integer#to_s     1.311M i/100ms
      size_to_string   896.525k i/100ms
Calculating -------------------------------------
        Integer#to_s     13.077M (± 2.7%) i/s -     65.554M in   5.016665s
      size_to_string      8.820M (± 4.1%) i/s -     44.826M in   5.091855s

Comparison:
        Integer#to_s: 13076807.3 i/s
      size_to_string:  8820024.3 i/s - 1.48x  (± 0.00) slower

minor_gc: 242

That's a very small gain on hit, but a big loss on miss. I really don't think it's worth it.

Author

Reverted this change.

Made a simple micro benchmark:

# frozen_string_literal: true

require "bundler/inline"

gemfile(true) do
  source "https://rubygems.org"

  git_source(:github) { |repo| "https://github.com/#{repo}.git" }

  gem "benchmark-ips"
end

arr = (1..100).to_a + (101..120).to_a

SIZE_TO_STRING = (0..100).to_a.map(&:to_s)
def size_to_string(size)
  SIZE_TO_STRING[size] || size.to_s
end

Benchmark.ips do |x|
  x.report("to_s") do
    arr.each(&:to_s)
  end

  x.report("cached") do
    arr.each do |e|
      size_to_string(e)
    end
  end

  x.compare!
end
Warming up --------------------------------------
                to_s     5.879k i/100ms
              cached     7.666k i/100ms
Calculating -------------------------------------
                to_s     58.378k (± 1.5%) i/s -    293.950k in   5.036388s
              cached     76.132k (± 1.5%) i/s -    383.300k in   5.035706s

Comparison:
              cached:    76132.4 i/s
                to_s:    58378.1 i/s - 1.30x  (± 0.00) slower

Comment on lines +103 to +104
@buffer.clear
RESP3.dump(command, @buffer)
Collaborator

So String#clear here frees the malloc'd region: https://bugs.ruby-lang.org/issues/17790.

So this saves allocating a string slot, but it will malloc anyway, which will eventually trigger a GC.

If we want to be smart, we want to re-use that malloc'd region, but we have to be careful not to end up with a giant buffer, so we'd need to clear it if it grows past a certain size. Unfortunately, the only way to get the size of the malloc is via ObjectSpace.memsize_of(str). I'd need to check its performance first.

We also have to be super careful not to leak data, as that happened in the past with a similar optimization: ruby/net-protocol#19
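
A minimal sketch of the capping idea (the 16 KB threshold and the recycle_buffer helper are assumptions, not redis-client API; and since String#clear itself releases the region, this only bounds a buffer that is reused without clearing):

require "objspace"

MAX_BUFFER_BYTES = 16 * 1024 # arbitrary cap, would need tuning

def recycle_buffer(buffer)
  # memsize_of reports the slot plus any malloc'd capacity; once the buffer
  # has ballooned past the cap, drop it and start over from a small one.
  if ObjectSpace.memsize_of(buffer) > MAX_BUFFER_BYTES
    String.new(encoding: Encoding::BINARY, capacity: 127)
  else
    buffer
  end
end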

Author

If we want to be smart, we want to re-use that malloc'd region

Can we currently do this from Ruby? I see the proposed patch to Ruby is not merged yet.

Member

Can we currently do this from Ruby?

Well, if you pass the string to #read without clearing it, it will re-use that space.
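
A small illustration of that read-side reuse, with StringIO standing in for the socket and the sizes made up:

require "stringio"

io  = StringIO.new("a" * 10_000)
buf = String.new(encoding: Encoding::BINARY, capacity: 4096)

total = 0
# Passing an explicit output buffer lets each read refill the same string
# (and its malloc'd region) instead of allocating a fresh one per call.
while io.read(4096, buf)
  total += buf.bytesize
end
puts total # => 10000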

Author

But we are using it just for write in this case (write and write_multi), so this is not yet possible?

Collaborator

Ah yeah, sorry, I missed that we were passing it to RESP3.dump. Yeah, I see no solution here.

I think if we call clear we start over from an empty string, so we lose the pre-allocation benefits.

@@ -77,6 +77,8 @@ def initialize(config, connect_timeout:, read_timeout:, write_timeout:)
read_timeout: read_timeout,
write_timeout: write_timeout,
)

@buffer = String.new(encoding: Encoding::BINARY, capacity: 127)
Collaborator

That 127 is quite arbitrary; it would be worth putting some thought into it, or making it configurable.

Author

This was taken from the buffer definition it already uses:

def new_buffer
String.new(encoding: Encoding::BINARY, capacity: 127)
end

@casperisfine
Collaborator

@fatkodima to summarize, I don't think SIZE_TO_STRING is a good idea, so please revert it.

For the buffer re-use, it might make sense but I'll have to carefully review it.

fatkodima force-pushed the memory-optimizations branch from 12d4437 to 0ff376d on February 8, 2023 11:17
@fatkodima
Author

Feel free to close if it is not worth it. And thank you for taking the time to give a 💪 review, as always!

@byroot
Member

byroot commented Feb 8, 2023

Yeah, I don't think either of these opts is clear-cut enough, so I'll close.

Also, people for whom this matters can use hiredis-client, which will save more memory than anything we can do on the Ruby side.

Thanks for trying to improve redis-client!

byroot closed this on Feb 8, 2023
fatkodima deleted the memory-optimizations branch on February 8, 2023 12:44