[benchmarking-cli] Add --keys-limit=Option<usize> and --random-seed=Option<u64>#10884
let first_key = self
    .params
    .random_seed
    .map(|seed| sp_storage::StorageKey(blake2_256(&seed.to_be_bytes()[..]).to_vec()));
If there are fewer than keys_limit keys behind first_key, this will "break". We should instead just load all the keys and then use sample_iter.
This can then directly replace the shuffle call below.
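To make the concern concrete, here is a minimal sketch of the shortfall. It assumes an in-memory BTreeSet as a stand-in for the storage backend, and keys_after is a hypothetical helper, not a Substrate API. It only shows that starting iteration at a seed-derived key can yield fewer than keys_limit keys:

```rust
use std::collections::BTreeSet;
use std::ops::Bound;

/// Hypothetical stand-in for "iterate storage keys starting after `first_key`".
/// Returns at most `limit` keys that sort strictly after `first_key`.
fn keys_after(all: &BTreeSet<Vec<u8>>, first_key: &[u8], limit: usize) -> Vec<Vec<u8>> {
    // Explicit bound types keep type inference unambiguous.
    let bounds: (Bound<&[u8]>, Bound<&[u8]>) = (Bound::Excluded(first_key), Bound::Unbounded);
    all.range::<[u8], _>(bounds)
        .take(limit) // never more than `limit`, but possibly fewer
        .cloned()
        .collect()
}
```

With keys [1] through [5], a start key of [3], and a limit of 4, only [4] and [5] come back, so the benchmark would silently run on 2 keys instead of 4.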
sample_iter seems to create new values (IIUC). What if we use choose_multiple here?
let mut keys: Vec<_> = client.storage_keys(hash, None, None)?.collect();
let (mut rng, _) = new_rng(self.params.random_seed);
keys = keys.choose_multiple(&mut rng, self.params.keys_limit.unwrap_or(keys.len())).cloned().collect();
Yeah you can also use choose_multiple.
Thanks @bkchr. After discussing with the team: loading all keys is exactly what breaks the workflow (OOM) for our chains with huge storage.
But we get your first_key concern. Let me try a different approach here and I will ping you back.
@bkchr I just pushed a new approach that fetches more keys when necessary.
LMK what you think about it.
@arturgontijo I'm fine with the approach, but can we get this into some shared function? :D
It can probably take two lambdas to abstract the different ways to read the entries.
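As a rough illustration of the "two lambdas" idea, the sketch below pulls keys page by page until a limit is reached, so only one page is held in memory at a time. Everything here is an assumption for illustration: the name for_each_key_limited, the closure signatures, and the paging scheme are hypothetical and are not the function that was added in the PR.

```rust
/// Hypothetical shared helper: visits up to `limit` keys.
///
/// * `fetch_page(after, want)` returns up to `want` keys that come after `after`
///   (or from the start when `after` is `None`). This abstracts *where* keys come from.
/// * `visit(key)` is called once per key. This abstracts *what* is done with each entry
///   (for example a read or a write benchmark step).
///
/// Returns the number of keys visited, which may be less than `limit`
/// if the source runs out of keys.
fn for_each_key_limited<K, F, V>(
    limit: usize,
    page_size: usize,
    mut fetch_page: F,
    mut visit: V,
) -> usize
where
    K: Clone,
    F: FnMut(Option<&K>, usize) -> Vec<K>,
    V: FnMut(&K),
{
    let mut seen = 0;
    let mut last: Option<K> = None;
    while seen < limit {
        // Ask only for what is still missing, capped by the page size.
        let want = page_size.min(limit - seen);
        // Defensive `take`: ignore extra keys if the source returns too many.
        let page: Vec<K> = fetch_page(last.as_ref(), want).into_iter().take(want).collect();
        if page.is_empty() {
            break; // source exhausted (or page_size was 0)
        }
        for key in &page {
            visit(key);
        }
        seen += page.len();
        last = page.last().cloned(); // resume after the last key of this page
    }
    seen
}
```

A caller would pass one closure that wraps the main-trie or child-trie key iteration and another that performs the read or write being measured; the trade-off is that the trait bounds get noisier as more of the differences are pushed into the closures.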
Added a shared function in 6ed88c4.
I tried to simplify it even more but the complexity (mostly trait bounds) was getting too high.
Hey @bkchr if you have time, could you please take another look at this one? Thanks
@arturgontijo please merge master.
…r/bench-keys-limit # Conflicts: # substrate/utils/frame/benchmarking-cli/src/storage/read.rs # substrate/utils/frame/benchmarking-cli/src/storage/write.rs
You probably don't have OOM issues for the child tries, only for the main trie, right?
Yeah, we don't get OOM for child keys, as our chain does not use child tries.
But, for completeness, I'll implement the same logic for them too.
Head branch was pushed to by a user without write access
@arturgontijo sorry, I have not seen your ping. Could you please add a prdoc to fix CI? Then we can merge it.
@bkchr done...not sure how to fix the failing checks though =/
5e8782a
Description
This PR adds two optional new params to the benchmark CLI subcommand:
1 - --keys-limit=N: Limits the number of keys processed during read and write benchmarks.
2 - --random-seed=M: Provides deterministic randomness for benchmark reproducibility by seeding the random number generator used for key shuffling.
The motivation here is that, when dealing with huge storage (multiple terabytes), the benchmark workflow could easily eat all of the target machine's resources, making it impossible (or very expensive) to complete.
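The intended semantics of the two params can be sketched on a small in-memory list. This is an illustration only, under stated assumptions: select_keys is a hypothetical name, the SplitMix64 generator is a std-only stand-in for whatever seeded RNG the CLI actually uses, and shuffling a fully loaded list is exactly what the PR avoids for huge storage.

```rust
/// Tiny SplitMix64 generator, used here only as a std-only stand-in for a seeded RNG.
struct SplitMix64(u64);

impl SplitMix64 {
    fn next(&mut self) -> u64 {
        self.0 = self.0.wrapping_add(0x9E3779B97F4A7C15);
        let mut z = self.0;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58476D1CE4E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D049BB133111EB);
        z ^ (z >> 31)
    }
}

/// Hypothetical helper: shuffles `keys` deterministically from `random_seed`
/// (Fisher-Yates), then keeps at most `keys_limit` of them.
/// `None` for `keys_limit` means "no limit".
fn select_keys<K>(mut keys: Vec<K>, keys_limit: Option<usize>, random_seed: u64) -> Vec<K> {
    let mut rng = SplitMix64(random_seed);
    // Fisher-Yates shuffle. The modulo introduces a negligible bias,
    // which is acceptable for a sketch.
    for i in (1..keys.len()).rev() {
        let j = (rng.next() % (i as u64 + 1)) as usize;
        keys.swap(i, j);
    }
    if let Some(limit) = keys_limit {
        keys.truncate(limit); // no-op when limit >= keys.len()
    }
    keys
}
```

Calling select_keys twice with the same seed and limit returns the same keys in the same order, which is the reproducibility property that --random-seed is meant to provide.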