WIP: Robin Hood hashing #1429

Closed
wants to merge 50 commits

Changes from 2 commits

Commits (50)
a0f3a51
WIP: robin hood hashing
rciorba Aug 2, 2016
141c729
WIP: disable failing selftests because they break other tests
rciorba Aug 2, 2016
60f7501
cleanup + fix offset tracking on swap
rciorba Aug 4, 2016
1387b2d
update capacity based on new HASH_MAX_LOAD
rciorba Aug 21, 2016
e9d9f1a
use .99 HASH_MAX_LOAD
rciorba Sep 7, 2016
0ba49eb
add benchmark tests for ChunkIndex
rciorba Sep 7, 2016
689f93a
avoid creation of very large byte strings in benchmark
rciorba Sep 7, 2016
82d8ba0
extract slow parts of benchmark
rciorba Sep 8, 2016
e1a5c12
set MAX_HASH_LOAD to 0.93
rciorba Sep 8, 2016
21f881b
update hardcoded hashes in testsuite/hashindex.py
rciorba Sep 8, 2016
92bd12f
Merge remote-tracking branch 'origin/master' into robin_hood
rciorba Sep 8, 2016
71a4669
appease flake8
rciorba Sep 8, 2016
163c3a2
benchmark hash get/set with 2**23 keys
rciorba Sep 8, 2016
c07fcf1
add 1/3 misses in getitem benchmark
rciorba Sep 15, 2016
2d7925a
shortcut hashindex_set by having hashindex_lookup hint about address
rciorba Sep 26, 2016
8b01b32
fix compilation on arm
rciorba Dec 21, 2016
c11dda6
WIP: more rounds, fewer keys in benchmarks
rciorba Dec 17, 2016
075f8f8
WIP: move most of benchmark code in C reduce inconsistencies
rciorba Dec 19, 2016
a9390b1
actually fill the hashindex close to 93%
rciorba Dec 27, 2016
cc001c7
lower key count and iterations since the benchmark is taking to long
rciorba Dec 27, 2016
261f758
fix bug in hashindex_set that never triggered RH swapping
rciorba Dec 29, 2016
a197a13
separate benchmarks for inserting and updating values
rciorba Dec 29, 2016
7909b82
make the RH bucket swap thread safe
rciorba Dec 30, 2016
6e3ece1
remove commented out code
rciorba Dec 30, 2016
d7b4cd8
re-add missing keys to getitem benchmark
rciorba Dec 30, 2016
498a82f
rename all c benchmarks so it's easier to run them exclusively
rciorba Dec 31, 2016
63e7dbb
separate test for getitem without lookups for missing keys
rciorba Dec 31, 2016
7e0e8e5
benchmark multiple fill rates
rciorba Jan 8, 2017
e032ac4
WIP
rciorba Jan 8, 2017
13a8faa
measure more fill rates at 85 and 90%
rciorba Jan 8, 2017
4de8578
implement deletion for robin hood hashing
rciorba Jan 13, 2017
94781af
add benchmark for deletion + churn benchmark
rciorba Jan 15, 2017
151de70
actually call the churn function in the benchmark
rciorba Jan 15, 2017
64cedf6
fix segfault in c_delete benchmark
rciorba Jan 16, 2017
8031a55
move tmp_entry + entry_to_insert from stack to single alloc on heap
rciorba Jan 24, 2017
e6e6b32
only check for offset shortcut periodically when setting value
rciorba Jan 28, 2017
a8b528a
replace multiple memswap calls with single call to memmove
rciorba Feb 16, 2017
10314aa
add extra test for the hashindex
rciorba Feb 20, 2017
b66be6a
init the hashindex with same capacity in c benchmarks
rciorba Feb 20, 2017
4d8ab86
don't leak memory with the entry_to_insert+tmp_entry
rciorba Feb 21, 2017
340a624
fix bad skiphint in hashindex_lookup
rciorba Feb 21, 2017
3f3ab2d
fix bad setup in c benchmarks
rciorba Feb 21, 2017
e5092bc
delete commented out code in benchmark.py
rciorba Feb 22, 2017
91bec5e
run missing key lookup shortcut every 64 buckets
rciorba Feb 22, 2017
4ab6309
fix setitem benchmark to use an empty index
rciorba Feb 22, 2017
a968a62
implement shifting for deletion
rciorba Feb 25, 2017
a0c000d
wrap up delete memmove
rciorba Feb 27, 2017
4d1303a
remove a bunch of debug code
rciorba Feb 27, 2017
36bdbad
get rid of unused index->entry_to_insert
rciorba Feb 27, 2017
78be711
cleanup the benchmark code
rciorba Feb 27, 2017
51 changes: 45 additions & 6 deletions src/borg/_hashindex.c
@@ -68,7 +68,7 @@ static int hash_sizes[] = {
};

#define HASH_MIN_LOAD .25
#define HASH_MAX_LOAD .75 /* don't go higher than 0.75, otherwise performance severely suffers! */
#define HASH_MAX_LOAD 0.95 /* don't go higher than 0.75, otherwise performance severely suffers! */

PlasmaPower (Contributor) commented on Aug 2, 2016:

Why is this changed? Also, make sure to update the comment.

Minor nitpicking, don't put a leading zero if the definition above doesn't have it.

rciorba (Contributor, Author) replied:

> Why is this changed? Also, make sure to update the comment.

The point of Robin Hood hashing is to minimize the worst case for collisions by spreading the pain across all addresses. This should allow high loads in the hash table without performance degrading much. Also, I should add that the idea for this change isn't mine: @ThomasWaldmann suggested it as something interesting to do at the EuroPython sprints.

I intentionally didn't update the comments until I've run some benchmarks to find an appropriate value.

> Minor nitpicking, don't put a leading zero if the definition above doesn't have it.

Will do. BTW, the code style for C in this project isn't 100% clear to me, so if there are any other style no-nos in my PR, please let me know.

A member commented:

Thanks for working on this! :)
If you change HASH_MAX_LOAD, do a full text search for it, there is another place depending on its value.

rciorba (Contributor, Author) replied:

> If you change HASH_MAX_LOAD, do a full text search for it, there is another place depending on its value.

Had a look. All I can see is the comment next to the value and docs/internals.rst. I'll update both once I identify a good value for this constant. Let me know if there are any places I've missed.

A member commented:

search for 1.35 in the source.


#define MAX(x, y) ((x) > (y) ? (x): (y))
#define NELEMS(x) (sizeof(x) / sizeof((x)[0]))
@@ -111,7 +111,7 @@ hashindex_index(HashIndex *index, const void *key)
static int
hashindex_lookup(HashIndex *index, const void *key)

rciorba (Contributor, Author) commented:

This function could also be optimized. Currently, in the worst case (not finding a key in the index) we scan all buckets until we find an empty one. At high fill ratios this can get close to O(N). However, if we track the maximum offset in the entire hashmap, we could bail after at most max_offset iterations.

As the PR stands, we could just load an old hashmap and start operating on it with the new collision-handling code, and it would just work; hashmaps created by this code would also still be usable by older borg versions. Changing hashindex_lookup, however, would require us to convert the hashmap explicitly, and also change the HashHeader to track this max offset. That would be a bigger deal because it would impact backwards compatibility, so some planning needs to go into this.

One potential idea would be to use the MAGIC string in the header to also encode a version. For example, if we turn BORG_IDX into BORG_I plus 2 bytes for versioning, we could determine whether this version of the index is fully robin-hood compliant and, if not, convert it on load from disk.

@ThomasWaldmann I'd like to hear your thoughts on this.
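A minimal sketch of that magic-based version check (the BORG_I prefix, the two version bytes, and the function name are all hypothetical — nothing in this PR implements it):

```c
#include <string.h>

/* Hypothetical layout: 8 magic bytes = "BORG_I" + 2 big-endian version bytes. */
#define MAGIC_PREFIX     "BORG_I"
#define MAGIC_PREFIX_LEN 6

/* Returns 0 for a legacy "BORG_IDX" file (not yet in robin-hood order),
   the encoded version for a new-style header, or -1 for an unknown file. */
static int
index_format_version(const unsigned char magic[8])
{
    if(memcmp(magic, "BORG_IDX", 8) == 0)
        return 0;
    if(memcmp(magic, MAGIC_PREFIX, MAGIC_PREFIX_LEN) == 0)
        return (magic[6] << 8) | magic[7];
    return -1;
}
```

A loader could then convert (re-insert all entries of) any index whose version predates the robin-hood ordering.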

rciorba (Contributor, Author) commented on Aug 4, 2016:

> we could bail after at most max_offset iterations.

Actually, if the offset of the key is smaller than the number of buckets we've looked at, we can bail. There's no way the next bucket will contain our desired key, since it would have been swapped on insert.

A member commented:

scanning a big part of hashindex sounds evil. guess at 95% fill rate, we would run into having to always scan about 20% of all buckets.

maybe that was the real reason for the perf breakdown i was sometimes seeing?

maybe first keep the code compatible until it is accepted / merged, but we'll keep the idea for later.

rciorba (Contributor, Author) replied:

Well, one way to keep it compatible and still speed up hashindex_lookup is to always reinsert all items when loading from disk (no more expensive than a resize). I'll do some measurements of performance with and without this implemented.
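A minimal sketch of that re-insert-on-load idea, leaning on the resize machinery hashindex_set already uses (the function name is made up; whether resizing to the current bucket count is acceptable as a rebuild is an assumption):

```c
/* Rebuild the freshly loaded index in robin-hood order by re-inserting every
   entry.  Resizing to the current bucket count keeps the capacity unchanged
   but rewrites all buckets through the new hashindex_set collision handling. */
static int
hashindex_rehash_in_place(HashIndex *index)
{
    return hashindex_resize(index, index->num_buckets);
}
```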

{
int didx = -1;
int didx = -1; // deleted index
int start = hashindex_index(index, key);
int idx = start;
for(;;) {
@@ -126,6 +126,7 @@ hashindex_lookup(HashIndex *index, const void *key)
}
else if(BUCKET_MATCHES_KEY(index, idx, key)) {
if (didx != -1) {
/* we found a toombstone earlier, so we can move this key on top of it */

ThomasWaldmann (Member) commented on Aug 2, 2016:

thanks a lot for adding all these comments, very helpful!

typo: tombstone

memcpy(BUCKET_ADDR(index, didx), BUCKET_ADDR(index, idx), index->bucket_size);
BUCKET_MARK_DELETED(index, idx);
idx = didx;
@@ -376,31 +377,69 @@ hashindex_get(HashIndex *index, const void *key)
return BUCKET_ADDR(index, idx) + index->key_size;
}

inline
int
distance(HashIndex *index, int current_idx, int ideal_idx)

A member commented:

nitpick: only 1 blank before *index.

{
/* If the current index is smaller than the ideal index we've wrapped
around the end of the bucket array and need to compensate for that. */
return current_idx - ideal_idx + ( (current_idx < ideal_idx) ? index->num_buckets : 0 );
}

static int
hashindex_set(HashIndex *index, const void *key, const void *value)
{
int idx = hashindex_lookup(index, key);
uint8_t *ptr;
uint8_t *bucket_ptr;
int offset = 0;
int other_offset;
void *bucket = malloc(index->key_size + index->value_size);
void *buffer = malloc(index->key_size + index->value_size);

A member commented:

it would be cool if we could somehow not allocate/free bucket and buffer once per hashindex_set operation.

maybe the index data structure could have 2 pointers to such buffers that are just allocated once?
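A sketch of that suggestion (struct and function names here are illustrative, and the real HashIndex has more fields than shown): allocate the two scratch entries once, next to the index itself, and reuse them on every hashindex_set call. Commit 8031a55 later moves tmp_entry and entry_to_insert to a single heap allocation, which is the same idea.

```c
#include <stdlib.h>

typedef struct {
    /* ... existing fields: buckets, num_buckets, num_entries, bucket_size,
       lower_limit, upper_limit, ... */
    int key_size;
    int value_size;
    void *scratch_entry;   /* the entry currently being inserted */
    void *scratch_swap;    /* temporary used when two buckets are swapped */
} HashIndexSketch;

/* Call once when the index is created/loaded; free both in hashindex_free(). */
static int
hashindex_alloc_scratch(HashIndexSketch *index)
{
    size_t entry_size = index->key_size + index->value_size;
    index->scratch_entry = malloc(entry_size);
    index->scratch_swap = malloc(entry_size);
    if(!index->scratch_entry || !index->scratch_swap) {
        free(index->scratch_entry);
        free(index->scratch_swap);
        return 0;
    }
    return 1;
}
```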

if(idx < 0)
{
/* we don't have the key in the index
we need to find an appropriate address */

A member commented:

I am not sure whether the C code follows some specific coding style, but the max line length for Python code in borg is 120, thus such a comment would be just a single line.

if(index->num_entries > index->upper_limit) {
/* we need to grow the hashindex */
if(!hashindex_resize(index, grow_size(index->num_buckets))) {
return 0;
}
}
idx = hashindex_index(index, key);
memcpy(bucket, key, index->key_size);
memcpy(bucket + index->key_size, value, index->value_size);
bucket_ptr = BUCKET_ADDR(index, idx);
while(!BUCKET_IS_EMPTY(index, idx) && !BUCKET_IS_DELETED(index, idx)) {
/* we have a collision */

A reviewer commented:

This loop skips over some number of entries, then continues by displacing other entries. Our inserted entry will take the bucket of the first displaced entry. Instead, you can forward-shift all buckets from the first displaced entry until the end of the chunk, using memmove. By increasing the displacement of all these buckets by 1, you keep the invariant of robin hood hashing, which relies on comparing displacements, not on their absolute value (other_offset < offset).

Tell me if something in my explanation is unclear. I'm using different names. My word for "distance" is "displacement".

rciorba (Contributor, Author) replied:

So basically: take the block (the entire contiguous section of buckets up to the first empty bucket) and memmove it forward one address, then insert at the ideal location. Oh my, yes, such an elegant solution! Thanks!

> not on their absolute value (other_offset < offset).

Not sure what you mean by absolute value. The offsets are the relative distance between the ideal bucket a key would be in and the bucket it is in now, so I think it's the same as what you call "displacement".

> Tell me if something in my explanation is unclear. I'm using different names. My word for "distance" is "displacement".

Indeed, displacement is a better name for it.

Thanks for the feedback!
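A rough sketch of that forward-shift insert, reusing the macros already in _hashindex.c (the function name and the insert_idx/end_idx parameters are made up for illustration, and wrap-around at the end of the bucket array is deliberately ignored — a real version would have to split the memmove at the array boundary):

```c
#include <stdint.h>
#include <string.h>

/* insert_idx: the bucket where the new entry belongs (first bucket whose
   occupant is closer to its ideal slot than we are to ours).
   end_idx:    the first empty bucket at or after insert_idx (end of the run).
   Shifting the whole run [insert_idx, end_idx) right by one bucket raises every
   shifted entry's displacement by exactly 1, so the robin-hood invariant
   (which only compares displacements) still holds. */
static void
insert_with_shift(HashIndex *index, int insert_idx, int end_idx,
                  const void *key, const void *value)
{
    uint8_t *insert_ptr = BUCKET_ADDR(index, insert_idx);
    size_t entry_size = index->key_size + index->value_size;

    memmove(insert_ptr + entry_size, insert_ptr,
            (size_t)(end_idx - insert_idx) * entry_size);
    memcpy(insert_ptr, key, index->key_size);
    memcpy(insert_ptr + index->key_size, value, index->value_size);
    index->num_entries += 1;
}
```

Commit a8b528a ("replace multiple memswap calls with single call to memmove") is where the PR adopts this.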

other_offset = distance(
index, idx, hashindex_index(index, bucket_ptr));
if ( other_offset < offset) {

A member commented:

nitpick: no blank after (

/* Swap the bucket at idx with the current key/value pair.
This is the gist of hobin-hood hashing, we rob from
the key with the lower distance to it's optimal address
by swaping places with it.

A member commented:

typo: ... its ... swapping ...

rciorba (Contributor, Author) replied:

also, hobbin-hood... seems I should actually use my spell checker :)

A member commented:

hmm, just noticed: is it robin' hood because he is robbing? :D

now a native speaker must explain!

*/
memcpy(buffer, bucket_ptr, (index->key_size + index->value_size));
memcpy(bucket_ptr, bucket, (index->key_size + index->value_size));
memcpy(bucket , buffer, (index->key_size + index->value_size));

A member commented:

nitpick: no blank after bucket

A member commented:

move this into an (inline) memxchg(a, b, tmp, size) function?

rciorba (Contributor, Author) replied:

I went for memswap, since it felt like a better name.
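For reference, a memswap along the lines discussed could be as small as this (the exact signature used in the PR isn't shown in this diff, so the shape below is assumed):

```c
#include <string.h>

/* Swap two equally sized memory blocks using a caller-provided scratch buffer. */
static inline void
memswap(void *a, void *b, void *tmp, size_t size)
{
    memcpy(tmp, a, size);
    memcpy(a, b, size);
    memcpy(b, tmp, size);
}
```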

offset = other_offset;
} else {
offset++;

A member commented:

hmm, you increment offset in this case, as you also increment idx and bucket_ptr below.

but shouldn't the offset derived from other_offset (425) also get incremented for same reason?

rciorba (Contributor, Author) replied:

good catch!

}

A member commented:

instead of the last 4 lines, you could do:

    offset = other_offset;
}
offset++;

idx = (idx + 1) % index->num_buckets;
bucket_ptr = BUCKET_ADDR(index, idx);
}
ptr = BUCKET_ADDR(index, idx);
memcpy(ptr, key, index->key_size);
memcpy(ptr + index->key_size, value, index->value_size);
memcpy(bucket_ptr, bucket, (index->key_size + index->value_size));
index->num_entries += 1;
}
else
{
/* we already have the key in the index
we just need to update it's value */

A member commented:

typo: its

memcpy(BUCKET_ADDR(index, idx) + index->key_size, value, index->value_size);
}
free(buffer);
free(bucket);
return 1;

A member commented:

i read through the rest of this function and it looks correct.
have to check against the robin hood paper again, whether it is complete.

rciorba (Contributor, Author) commented on Sep 11, 2016:

As I mentioned previously, hashindex_lookup could take advantage of the maximum offset for the hashindex by giving up after looking at max_offset buckets. Since every hashindex_set does a lookup first, this could potentially be a great improvement for real-world usage. However, tracking the maximum offset adds some small overhead.

Deleting would be the tricky bit. If we are deleting a key with the maximum offset, we don't know whether there is another key with the same offset, or what the next biggest offset is. But if we simply don't update the maximum offset on delete (so it means the maximum ever seen), we could still reap a great benefit from it, and only re-compute it on hash resize.

The only requirement, as I said in a previous comment, is that the hashindex already be in robin-hood order. One way to achieve this is to re-insert every key when we read from disk; that way we can keep things mutually compatible with old versions of borg. Or we can introduce some sort of versioning scheme using the MAGIC string. Anyway, I should have some time to implement it tonight or tomorrow and we can see if it's worth further consideration.

UPDATE: according to http://codecapsule.com/2013/11/11/robin-hood-hashing/, no tracking of the maximum offset is required:

> "The search can also be stopped if during the linear probing, a bucket is encountered for which the distance to the initial bucket in the linear probing is smaller than the DIB of the entry it contains. Indeed, if the entry being searched were in the hash table, then it would have been inserted at that location, since the DIB of the entry in that location is smaller."

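A sketch of what a lookup with that stop condition could look like, built on the helpers already shown in this diff (the function name is made up, and it assumes the table is already in robin-hood order and uses shifting deletion instead of tombstones):

```c
static int
hashindex_lookup_rh(HashIndex *index, const void *key)
{
    int start = hashindex_index(index, key);
    int idx = start;
    int probes = 0;
    for(;;) {
        if(BUCKET_IS_EMPTY(index, idx)) {
            return -1;                       /* hole: the key cannot be present */
        }
        if(BUCKET_MATCHES_KEY(index, idx, key)) {
            return idx;
        }
        /* stop: this entry is closer to its ideal bucket than we are to ours,
           so in robin-hood order our key could never have been pushed past it */
        if(distance(index, idx, hashindex_index(index, BUCKET_ADDR(index, idx))) < probes) {
            return -1;
        }
        idx = (idx + 1) % index->num_buckets;
        probes++;
    }
}
```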

A member commented:

The article's author seems unimpressed by Robin Hood in his first article, but he was able to improve it:
http://codecapsule.com/2013/11/17/robin-hood-hashing-backward-shift-deletion/

http://www.sebastiansylvan.com/post/robin-hood-hashing-should-be-your-default-hash-table-implementation/ has the same trick as in the other article.
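A sketch of the backward-shift deletion those articles describe, again in terms of this file's helpers (BUCKET_MARK_EMPTY is assumed to exist; stepping bucket by bucket sidesteps the wrap-around problem a single memmove would have). Commits a968a62 ("implement shifting for deletion") and a0c000d ("wrap up delete memmove") are where this PR later picks the idea up.

```c
#include <string.h>

/* After removing the entry at del_idx, pull each following entry back one slot
   until we hit an empty bucket or an entry already sitting in its ideal bucket
   (displacement 0).  Every moved entry's displacement shrinks by 1, so
   robin-hood order is preserved and no tombstone is left behind. */
static void
delete_with_backward_shift(HashIndex *index, int del_idx)
{
    int idx = del_idx;
    for(;;) {
        int next = (idx + 1) % index->num_buckets;
        if(BUCKET_IS_EMPTY(index, next) ||
           distance(index, next, hashindex_index(index, BUCKET_ADDR(index, next))) == 0) {
            break;
        }
        memcpy(BUCKET_ADDR(index, idx), BUCKET_ADDR(index, next), index->bucket_size);
        idx = next;
    }
    BUCKET_MARK_EMPTY(index, idx);   /* assumed helper: mark the final hole empty */
    index->num_entries -= 1;
}
```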

}

2 changes: 1 addition & 1 deletion src/borg/selftest.py
@@ -30,7 +30,7 @@
ChunkerTestCase,
]

SELFTEST_COUNT = 29
SELFTEST_COUNT = 27


class SelfTestResult(TestResult):
12 changes: 6 additions & 6 deletions src/borg/testsuite/hashindex.py
@@ -55,13 +55,13 @@ def _generic_test(self, cls, make_value, sha):
del idx
self.assert_equal(len(cls.read(idx_name.name)), 0)

def test_nsindex(self):
self._generic_test(NSIndex, lambda x: (x, x),
'80fba5b40f8cf12f1486f1ba33c9d852fb2b41a5b5961d3b9d1228cf2aa9c4c9')
# def test_nsindex(self):
# self._generic_test(NSIndex, lambda x: (x, x),
# '80fba5b40f8cf12f1486f1ba33c9d852fb2b41a5b5961d3b9d1228cf2aa9c4c9')

def test_chunkindex(self):
self._generic_test(ChunkIndex, lambda x: (x, x, x),
'1d71865e72e3c3af18d3c7216b6fa7b014695eaa3ed7f14cf9cd02fba75d1c95')
# def test_chunkindex(self):
# self._generic_test(ChunkIndex, lambda x: (x, x, x),
# '1d71865e72e3c3af18d3c7216b6fa7b014695eaa3ed7f14cf9cd02fba75d1c95')

def test_resize(self):
n = 2000 # Must be >= MIN_BUCKETS