Clarifying concurrency-based SLA testing for rosetta endpoints #357
linconvidal
started this conversation in
Ideas
@matiwinnetou brought up a crucial perspective: we should focus explicitly on measuring how many concurrent virtual users each individual endpoint can handle before breaching a defined SLA threshold (say, p95 and p99 latencies below 1 second). In essence, this means testing endpoints individually and progressively increasing concurrency until we clearly identify the maximum number of simultaneous users each endpoint can reliably support. The endpoint that first breaches this latency SLA under load defines our overall concurrency limit. Establishing this simple yet clear initial metric gives us a foundation we can then iteratively refine.
TLDR: we’re trying to answer the question “How many concurrent users can the system handle and still stay below 1 second at p95 or p99?”
Currently, our artillery tests have been inspired by existing setups that primarily use phases based on arrival rates. This method aligns closely with Artillery's recommended testing philosophy, which emphasizes a hybrid workload model:
Artillery explicitly recommends this hybrid model to avoid the pitfalls of coordinated omission—a problem common with purely closed-loop (fixed concurrency) tests, where slow responses artificially reduce request rates, masking performance bottlenecks.
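As a rough illustration, an open-loop, arrival-rate-based phase of the kind our current tests use might look like the sketch below. The target URL, durations, rates, and request body are placeholders, not our actual configuration:

```yaml
# Hypothetical sketch of an open-loop (arrival-rate) Artillery test.
# New virtual users arrive at a fixed/ramping rate regardless of how
# slowly the server responds, which avoids coordinated omission.
config:
  target: "http://localhost:8082"   # assumed Rosetta deployment URL
  phases:
    - duration: 60        # seconds
      arrivalRate: 5      # start at 5 new virtual users per second
      rampTo: 50          # ramp to 50 arrivals/second by the end of the phase
      name: "ramp-up"
scenarios:
  - flow:
      - post:
          url: "/search/transactions"
          json: {}        # placeholder body; real Rosetta requests need a network_identifier
```

Note that nothing in this config bounds concurrency directly: if responses slow down, the number of in-flight virtual users simply grows, which is exactly why this style answers "requests per second" questions rather than "concurrent users" questions.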
However, this hybrid approach doesn't inherently measure concurrency levels explicitly. @matiwinnetou's request highlights precisely this nuance: we need clarity on concurrency limits rather than just request rates. Therefore, to clearly meet both Artillery’s philosophy and our explicit concurrency-measurement requirements, I propose a carefully structured adaptation:
Artillery's `maxVusers` phase parameter (which caps the maximum number of concurrent virtual users) lets us clearly and effectively simulate stable concurrency levels. I suggest conducting these tests incrementally, stepping up through concurrency levels (such as 1, 2, 4, 8, 12, 16… concurrent users) and observing precisely when and where we exceed our defined latency thresholds. This structured incremental testing clarifies exactly which endpoints handle load effectively and which ones need attention.
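A minimal sketch of such a stepped run, assuming a hypothetical local target (durations, rates, and the endpoint are placeholders): each phase offers far more arrivals than the cap allows, so `maxVusers` holds actual concurrency roughly constant at the level we want to measure.

```yaml
# Hypothetical sketch: stepping through fixed concurrency levels per endpoint.
# arrivalRate is set deliberately high so the pool stays saturated;
# maxVusers is what actually caps concurrency in each phase.
config:
  target: "http://localhost:8082"   # assumed Rosetta deployment URL
  phases:
    - duration: 60
      arrivalRate: 100
      maxVusers: 1
      name: "concurrency-1"
    - duration: 60
      arrivalRate: 100
      maxVusers: 2
      name: "concurrency-2"
    - duration: 60
      arrivalRate: 100
      maxVusers: 4
      name: "concurrency-4"
    # ...continue stepping (8, 12, 16, ...) while watching p95/p99 per phase
scenarios:
  - flow:
      - post:
          url: "/search/transactions"
          json: {}        # placeholder body; endpoint-specific in practice
```

Running one such script per endpoint, the first phase whose p95/p99 exceeds 1 second marks that endpoint's concurrency limit.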
The ideal outcome from such a test would look like this:
In this hypothetical scenario, we'd quickly see /search/transactions identified as our critical performance bottleneck, clearly limiting the overall concurrency of our system to 10 users. This clarity directs our immediate optimization efforts effectively.

I have documented this approach in more detail on the Wiki for future reference/discussion.