Skip to content

[ML] After Boost 1.83 upgrade murmur hash is no longer faster than Boost hash #2574

Open
@droberts195

Description

@droberts195
Contributor

This is not a bug, more of an observation that's interesting to make searchable in an issue.

Since we upgraded to Boost 1.83 the test that asserts that murmur hash lookup performance is better than the default Boost hash lookup performance regularly fails, even with an 8% tolerance on the timings.

BOOST_TEST_REQUIRE(murmurLookupTime < (defaultLookupTime * 108) / 100);

Some examples of timings from before the Boost upgrade:

2023-08-16 09:08:30,805755 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1692174718318253470/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1855ms, murmur insert runtime = 1467ms
2023-08-16 09:08:30,805765 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1692174718318253470/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 5110ms, murmur lookup runtime = 2882ms

2023-06-12 14:12:54,350673 UTC [4280] DEBUG /buildkite/builds/bk-agent-elastic-k8s-1686576616312423140/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 2067ms, murmur insert runtime = 1683ms
2023-06-12 14:12:54,350683 UTC [4280] DEBUG /buildkite/builds/bk-agent-elastic-k8s-1686576616312423140/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 6868ms, murmur lookup runtime = 4048ms

2023-09-21 08:36:59,722971 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695283442532840168/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1214ms, murmur insert runtime = 864ms
2023-09-21 08:36:59,722979 UTC [4304] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695283442532840168/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 3493ms, murmur lookup runtime = 1779ms

The most obvious thing to note here is the massive variation in performance that the BuildKite Kubernetes cluster that we run on delivers. So we cannot compare the 3 sample runs with each other. But within each run it's pretty clear that murmur hash is significantly faster for lookups than Boost hash.

Since the Boost upgrade from 1.77 to 1.83 we have these:

2023-09-26 11:25:45,478041 UTC [4311] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695724698789569410/elastic/ml-cpp-snapshot-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1998ms, murmur insert runtime = 2143ms
2023-09-26 11:25:45,478067 UTC [4311] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695724698789569410/elastic/ml-cpp-snapshot-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 2193ms, murmur lookup runtime = 2250ms

2023-09-29 09:53:44,909002 UTC [4286] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695978320865836751/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@301 default insert runtime = 1889ms, murmur insert runtime = 1971ms
2023-09-29 09:53:44,909012 UTC [4286] DEBUG /buildkite/builds/bk-agent-prod-k8s-1695978320865836751/elastic/ml-cpp-staging-builds/lib/core/unittest/CHashingTest.cc@303 default lookup runtime = 2147ms, murmur lookup runtime = 2342ms

The first is an example of where the test passed due to the 8% tolerance, but Boost hash was still faster. The second is an example of a test failure as Boost hash was faster by more than 8%.

Based on these figures it would be going too far to say Boost hash is now better. The performance of the two hash algorithms is roughly the same now. This means there's no value in removing murmur hash from our codebase, but equally no benefit in jumping through hoops to use it in preference to Boost hash in future code changes.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @droberts195

        Issue actions

          [ML] After Boost 1.83 upgrade murmur hash is no longer faster than Boost hash · Issue #2574 · elastic/ml-cpp