[BUG] sporadic concurrent_modification_exception during query in 2.14 #14032

Closed
@janheise

Describe the bug

As the screenshot shows, a ConcurrentModificationException is thrown while a query is being executed.

Screenshot 2024-06-06 at 15 11 43

Graylog users who started using OpenSearch 2.14 reported this as a problem in their queries, so we started to investigate.

The output of an msearch that carries this exception looks like the following:

"failed":1,"failures":[{"shard":0,"index":"graylog_0",
"node":"jxRdA49HT4uuwWu7VVGyjw","reason":{"type":"concurrent_modification_exception","reason":null}}]}

So there is no stack trace and no log output at all.

While trying to reproduce the problem, I was lucky to have the debugger attached; it caught the exception, which resulted in the screenshot above.

The following line in private void updateStaleCountOnCacheInsert(CleanupKey cleanupKey) throws the exception:

cleanupKeyToCountMap.computeIfAbsent(shardId, k -> new HashMap<>()).merge(cleanupKey.readerCacheKeyId, 1, Integer::sum);

This line was introduced with #12707, if I'm correct, which also means that the problem could, and probably should, already have hit in 2.13?
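To illustrate why that line is fragile under concurrent searches: the inner map it creates is a plain HashMap, which is not thread-safe, so two threads calling merge on the same inner map at the same time can trip its modification check. The sketch below shows one possible thread-safe variant, assuming both map levels can be ConcurrentHashMap instances. The class StaleCountSketch, its methods, and the Integer/String key types are hypothetical stand-ins; the real field types are not shown in this issue, and this is not necessarily how the problem was fixed in OpenSearch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the stale-count bookkeeping; not OpenSearch code.
final class StaleCountSketch {
    // Assumed shape: shard id -> (reader cache key id -> count).
    private final Map<Integer, Map<String, Integer>> counts = new ConcurrentHashMap<>();

    // Same call shape as the failing line, but the inner map is concurrent too,
    // so merge() is atomic per key even when many threads insert at once.
    void onCacheInsert(int shardId, String readerCacheKeyId) {
        counts.computeIfAbsent(shardId, k -> new ConcurrentHashMap<>())
              .merge(readerCacheKeyId, 1, Integer::sum);
    }

    int count(int shardId, String readerCacheKeyId) {
        return counts.getOrDefault(shardId, Map.of()).getOrDefault(readerCacheKeyId, 0);
    }

    // Hammers a single shard from several threads and returns the summed counts.
    static int runConcurrentInserts(int threads, int insertsPerThread) {
        StaleCountSketch sketch = new StaleCountSketch();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        CountDownLatch start = new CountDownLatch(1);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                try {
                    start.await(); // release all workers at the same moment
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                for (int i = 0; i < insertsPerThread; i++) {
                    sketch.onCacheInsert(0, "reader-" + (i % 4));
                }
            });
        }
        start.countDown();
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        int total = 0;
        for (int k = 0; k < 4; k++) {
            total += sketch.count(0, "reader-" + k);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(runConcurrentInserts(8, 10_000));
    }
}
```

With ConcurrentHashMap at both levels, no increment is lost and no ConcurrentModificationException is thrown, so the summed count equals threads × insertsPerThread. Swapping the inner map back to a HashMap makes the result timing-dependent, which would be consistent with the error only appearing once every few thousand queries.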

The error condition seems to be a bit awkward to reproduce:

With a Graylog instance that had a random message generator running, the attached script/query reproduced the error quite consistently, about once every 2,500 to 3,000 queries, against OpenSearch 2.14 in Docker.

When running OpenSearch via ./gradlew run with the debugger attached, it takes about 40-50k queries until the error shows up.

msearch3-loop.sh.txt

msearch3.req.txt

The query stays identical but fails at some point. I think there needs to be some traffic on the index so that the query is evaluated every time and not cached.

Let me know if you need more information.

Related component

Search

To Reproduce

We're working on a setup.

Expected behavior

No ConcurrentModificationException should occur.

Metadata

Labels

  • Search: Search query, autocomplete ...etc
  • bug: Something isn't working
  • v2.15.0: Issues and PRs related to version 2.15.0

Projects

Status

✅ Done
