add stats for text embedding processors with flags #1332

Conversation

@will-hwang will-hwang commented May 19, 2025

Description

Enhance stats for the text embedding processor, including stats for when the skip_existing option is enabled.

Updated Response

{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.1.0",
		"processors": {
			"ingest": {
				"text_chunking_delimiter_processors": 0,
				"text_chunking_fixed_length_processors": 0,
				"text_embedding_processors_in_pipelines": 1,
				"text_embedding_skip_existing_processors": 1,
				"text_chunking_processors": 0
			}
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_chunking_executions": 0,
				"text_embedding_executions": 2,
				"text_embedding_skip_existing_executions": 2,
				"text_chunking_fixed_length_executions": 0,
				"text_chunking_delimiter_executions": 0
			}
		}
	},
	"nodes": {
		"rMyVPGp2SsWL-sLQ3HSjCQ": {
			"processors": {
				"ingest": {
					"text_chunking_executions": 0,
					"text_embedding_executions": 2,
					"text_embedding_skip_existing_executions": 2,
					"text_chunking_fixed_length_executions": 0,
					"text_chunking_delimiter_executions": 0
				}
			}
		}
	}
}
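For orientation, the "all_nodes" section in the response above is the per-node event stats aggregated across the cluster (there is a single node here, so the numbers match). A minimal sketch of that aggregation, with illustrative names (`aggregate`, the stat keys) that are not the plugin's actual API:

```java
import java.util.HashMap;
import java.util.Map;

public class AllNodesAggregationSketch {
    // Sums each event stat across the per-node maps, the way the
    // "all_nodes" section mirrors the per-node "nodes" section.
    static Map<String, Long> aggregate(Map<String, Map<String, Long>> perNode) {
        Map<String, Long> total = new HashMap<>();
        for (Map<String, Long> nodeStats : perNode.values()) {
            nodeStats.forEach((stat, value) -> total.merge(stat, value, Long::sum));
        }
        return total;
    }
}
```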

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@heemin32 (Collaborator)

Could you update the PR description with updated response of the stats api?

@q-andy (Contributor) left a comment


lgtm, minor nits

increment(stats, InfoStatName.TEXT_EMBEDDING_PROCESSORS);
Object skipExisting = processorConfig.get(TextEmbeddingProcessor.SKIP_EXISTING);
if (Objects.nonNull(skipExisting) && skipExisting.equals(Boolean.TRUE)) {
    increment(stats, InfoStatName.TEXT_EMBEDDING_SKIP_EXISTING_PROCESSORS);
}
@will-hwang (Author):
It reads the value directly as a boolean, not as a map. I think it's cleaner this way.
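The null check and the boolean comparison in the snippet above can also be collapsed into one step; here is a self-contained sketch (the counter map and `countProcessor` are illustrative stand-ins, not the plugin's API) using the null-safe `Boolean.TRUE.equals(...)`:

```java
import java.util.HashMap;
import java.util.Map;

public class SkipExistingStatSketch {
    // Illustrative stand-in for the plugin's info-stat counters.
    static final Map<String, Long> STATS = new HashMap<>();

    static void increment(String statName) {
        STATS.merge(statName, 1L, Long::sum);
    }

    // Counts every text embedding processor, and additionally counts it
    // as skip_existing-enabled when the config flag is Boolean.TRUE.
    // Boolean.TRUE.equals(null) is false, so no explicit null check is needed.
    static void countProcessor(Map<String, Object> processorConfig) {
        increment("text_embedding_processors_in_pipelines");
        if (Boolean.TRUE.equals(processorConfig.get("skip_existing"))) {
            increment("text_embedding_skip_existing_processors");
        }
    }
}
```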

@@ -72,6 +72,7 @@ public void doExecute(
generateAndSetInference(ingestDocument, processMap, inferenceList, handler);
return;
}
EventStatsManager.increment(EventStatName.TEXT_EMBEDDING_PROCESSOR_SKIP_EXISTING_EXECUTIONS);

Not about this change, but I think I noticed a bug in my initial PR: we should be incrementing the stat even when we run batch execute, but currently we only increment during single execute, so it might not be counted when pipelines run in batch. Do you think you could include that change in this PR as well? If not, I can try to catch it in the next one.

@will-hwang (Author):

Sure, I can add it to this PR.

q-andy commented May 20, 2025

Currently we increment the stat whenever we have an execution that has the option enabled. Should we also have a stat to record when we have a "cache hit" and successfully skip an inference, or do you think that would be redundant?

cc: @heemin32

@will-hwang (Author)

Currently we increment the stat whenever we have an execution that has the option enabled. Should we also have a stat to record when we have a "cache hit" and successfully skip an inference, or do you think that would be redundant?

cc: @heemin32

@q-andy It really comes down to what insight we want to gain from these stats.

  1. Do we want to know how many domains are using certain features?
  2. Do we also want to know how the features perform once enabled?

I think we should start with 1 (which this PR addresses), and scope out how, and to what extent we should support 2.

@heemin32 (Collaborator)

For cache_hit stats, we can wait until there is an ask from users with the feature.

@heemin32 (Collaborator)

I'm not sure if the current stat name text_embedding_skip_existing_processors is appropriate, since it's not a standalone processor. Would it make more sense to track how many processors have the skip option enabled, regardless of type?

Also, do we really need a separate execution stat like text_embedding_skip_existing_executions? Is it providing distinct value?

@will-hwang (Author)

@heemin32 The naming convention follows the one for text chunking, which has different available algorithm options (link). As for execution stats, they would capture how many times a processor with the skip_existing flag was executed, which is different from how many processors have the flag enabled. If we're okay with just having the latter, I'm okay with that too. But it seems like EventStat and InfoStat share the same pattern for other processors like text chunking and normalization.

@junqiu-lei (Member)

@will-hwang You might need to rebase onto the main branch to pass the CI.

@will-hwang will-hwang force-pushed the optimzed_embedding_processor_stats branch from e9ac623 to 2e3afdd Compare May 22, 2025 21:31
@heemin32 (Collaborator)

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?

My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

@will-hwang (Author) commented May 28, 2025

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?

My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

@heemin32 If we are okay with simply tracking adoption, I think we can remove the EventStats metrics and keep only the one for InfoStats. The change would probably need to be made for the other processors too. If we do make this change, for what cases should event execution metrics be emitted?

@martin-gaievski (Member)

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?

My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

Detailed metrics would be useful too; we could see which search configuration is used most and invest our efforts there to improve relevance or other aspects of search. If the number of metrics is critical for the infra team, then we can keep only the number of processors; that is P0.

@heemin32 (Collaborator)

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?
My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

@heemin32 if we are okay with simply tracking adoption, i think we can remove the eventStats metrics and only keep one for infoStats. The change will probably need to be made for other processors too then. If we do make this change, for what cases should event execution metrics be emitted?

For event execution, we need to add metrics that cannot be retrieved from info stats. For example, the number of neural query executions, whose data is not available from cluster info.

@will-hwang (Author)

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?
My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

@heemin32 if we are okay with simply tracking adoption, i think we can remove the eventStats metrics and only keep one for infoStats. The change will probably need to be made for other processors too then. If we do make this change, for what cases should event execution metrics be emitted?

For event execution, we need to add metrics which cannot be retrieved from eventStats. For example, number of neural query execution of which data is not available from cluster info.

sounds reasonable to me. What do others think?
@q-andy @martin-gaievski

@heemin32 (Collaborator)

Regarding the stats APIs in the neural plugin — do we really need to track processor executions metrics? Wouldn’t the number of processors alone be sufficient to measure adoption?
My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

detailed metrics would be useful too, we can see which search configuration is most used one and invest there our efforts to improve relevance or other aspects of search. If number of metrics is critical for infra team then we can have only number of processors, that is P0.

I agree that more data can be better, but we also have to consider that adding a metric isn't without cost. Also, there's a chance that a single user could generate a large number of calls to a specific processor, which might skew the data and not accurately reflect its true popularity.
Perhaps tracking the number of processors used would be a more reliable metric for measuring adoption and it alone might be sufficient?

@@ -109,6 +110,7 @@ public void doBatchExecute(List<String> inferenceList, Consumer<List<?>> handler
@Override
public void subBatchExecute(List<IngestDocumentWrapper> ingestDocumentWrappers, Consumer<List<IngestDocumentWrapper>> handler) {
try {
EventStatsManager.increment(EventStatName.TEXT_EMBEDDING_PROCESSOR_EXECUTIONS);
@bzhangam (Collaborator) commented Jun 2, 2025

Should we count this by the number of docs we are processing? I feel we may want to know how many docs are processed by the processor.

Or we may want to use another event name for the batch processing use case; otherwise it can be confusing if we rely on this event to tell how many docs we have processed.

@will-hwang (Author):

I think we're more concerned with the number of executions than with how many docs are processed per execution, in general for other processors as well.
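The two alternatives discussed in this thread can be sketched side by side; assuming a hypothetical batch entry point (`subBatchExecute` here is illustrative, not the plugin code), an execution counter grows by one per call while a per-document counter grows with the batch size:

```java
import java.util.List;

public class BatchStatSketch {
    // Illustrative counters for the two alternatives in this thread.
    static long executions = 0; // what the PR increments: one per execution
    static long documents = 0;  // the per-document alternative suggested above

    static void subBatchExecute(List<String> ingestDocuments) {
        executions++;                        // one event per batch call
        documents += ingestDocuments.size(); // grows with the batch size
    }
}
```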


q-andy commented Jun 2, 2025

My main concern is scalability. The stats APIs have limitations in that area, and once we add a metric, it's difficult to remove it later. That’s why I’d prefer to keep the metrics as minimal as possible.

Had a chat with the infra team; the primary concern is the high memory consumption seen on large clusters when calling APIs like _node/stats and _node/state, due to large response payloads. For production clusters they mitigate this by calling multiple times and filtering for specific stats. I opened #1360 to give an option to reduce the size of the payload, and opened an issue to add more filtering options in #1363.

Based on this, for 3.1 it should be okay to add granular stats; the caller side can filter them as needed if we run into scalability concerns.
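Caller-side filtering of a large stats payload can be as simple as selecting keys by prefix. A sketch under the assumption that the flattened stats arrive as a name-to-count map (`filterByPrefix` is a hypothetical helper, not part of the plugin):

```java
import java.util.Map;
import java.util.TreeMap;

public class StatsFilterSketch {
    // Keeps only the stats whose names start with the given prefix,
    // e.g. "text_embedding_", trimming the payload on the caller side.
    static Map<String, Long> filterByPrefix(Map<String, Long> stats, String prefix) {
        Map<String, Long> filtered = new TreeMap<>();
        stats.forEach((name, count) -> {
            if (name.startsWith(prefix)) {
                filtered.put(name, count);
            }
        });
        return filtered;
    }
}
```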

@heemin32 heemin32 merged commit c0faee8 into opensearch-project:main Jun 2, 2025
47 of 50 checks passed

codecov bot commented Jun 2, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 0.00%. Comparing base (979a9fc) to head (2e3afdd).
Report is 4 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff              @@
##               main   #1332       +/-   ##
============================================
- Coverage     82.62%       0   -82.63%     
============================================
  Files           149       0      -149     
  Lines          7257       0     -7257     
  Branches       1164       0     -1164     
============================================
- Hits           5996       0     -5996     
+ Misses          811       0      -811     
+ Partials        450       0      -450     
