[Feature branch] Add Neural Stats API #1208

Merged

@q-andy q-andy commented Mar 4, 2025

Description

Implements the Neural Stats API framework design proposed in #1196. This initial PR sets up the foundation for the framework to track event and state stats throughout the neural search plugin and expose their values via an API.

  • Event-based stats
    • Event stats are recorded in code at a node level (processor executions, documents ingested, etc)
    • When an API call is made, all node-level maps are fetched via transport action and returned in the response.
  • State stats
    • State stats are defined by helper functions that populate state stat values
    • When an API call is made, the functions are invoked and the information is added to the response on demand
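
To make the two stat kinds concrete, here is a minimal sketch of the split described above. It is not the code in this PR: `StatsSketch`, `increment`, `registerState`, and `sumAcrossNodes` are hypothetical names, the per-node maps are plain in-memory maps rather than the result of a transport action, and treating `all_nodes` as a per-stat sum is an assumption.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical names throughout; this is not the plugin's actual class layout.
final class StatsSketch {
    // Event stats: one counter per stat name, held in memory on each node.
    private final Map<String, LongAdder> eventCounters = new ConcurrentHashMap<>();
    // State stats: functions that are only evaluated when the stats API is called.
    private final Map<String, Supplier<Object>> stateSuppliers = new ConcurrentHashMap<>();

    // Called from code paths such as a processor execution.
    void increment(String statName) {
        eventCounters.computeIfAbsent(statName, k -> new LongAdder()).increment();
    }

    void registerState(String statName, Supplier<Object> supplier) {
        stateSuppliers.put(statName, supplier);
    }

    // Copy of this node's event counters, as it would be returned to the coordinating node.
    Map<String, Long> nodeSnapshot() {
        Map<String, Long> snapshot = new HashMap<>();
        eventCounters.forEach((name, counter) -> snapshot.put(name, counter.sum()));
        return snapshot;
    }

    // State stats are computed on demand, at request time.
    Map<String, Object> stateSnapshot() {
        Map<String, Object> snapshot = new HashMap<>();
        stateSuppliers.forEach((name, supplier) -> snapshot.put(name, supplier.get()));
        return snapshot;
    }

    // Assumption: the all_nodes section is the per-stat sum of the node-level values.
    static Map<String, Long> sumAcrossNodes(Map<String, Map<String, Long>> perNode) {
        Map<String, Long> totals = new HashMap<>();
        for (Map<String, Long> nodeStats : perNode.values()) {
            nodeStats.forEach((name, count) -> totals.merge(name, count, Long::sum));
        }
        return totals;
    }
}
```

In this sketch, a processor would call `increment("text_embedding_executions")` on each execution, while a state stat such as the cluster version would be registered once as a supplier and evaluated only when a request arrives.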

See RFC for more details.

Initial implementation includes 3 stats:

  • Text embedding processor executions
  • Text embedding processors in pipelines
  • Cluster version

Related Issues

Resolves #1196 #1104
Related: #1146

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

Example requests

GET /_plugins/_neural/stats
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"cluster_version": "3.0.0",
	"processors": {
		"ingest": {
			"text_embedding_processors_in_pipelines": 0
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": 0
			}
		}
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors": {
				"ingest": {
					"text_embedding_executions": 0
				}
			}
		}
	}
}
GET /_plugins/_neural/stats?include_metadata=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"cluster_version": {
		"value": "3.0.0",
		"stat_type": "settable"
	},
	"processors": {
		"ingest": {
			"text_embedding_processors_in_pipelines": {
				"value": 0,
				"stat_type": "countable"
			}
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": {
					"value": 0,
					"stat_type": "timestamped_counter",
					"trailing_interval_value": 0,
					"minutes_since_last_event": 29018783
				}
			}
		}
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors": {
				"ingest": {
					"text_embedding_executions": {
						"value": 0,
						"stat_type": "timestamped_counter",
						"trailing_interval_value": 0,
						"minutes_since_last_event": 29018783
					}
				}
			}
		}
	}
}
GET /_plugins/_neural/stats/text_embedding_executions?include_metadata=true&flat_keys=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"all_nodes.processors.ingest.text_embedding_executions": {
		"value": 0,
		"stat_type": "timestamped_counter",
		"trailing_interval_value": 0,
		"minutes_since_last_event": 29018784
	},
	"nodes": {
		"r-hVPa7-Ra6FBbdpfVrHjg": {
			"processors.ingest.text_embedding_executions": {
				"value": 0,
				"stat_type": "timestamped_counter",
				"trailing_interval_value": 0,
				"minutes_since_last_event": 29018784
			}
		}
	}
}

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Andy Qin <[email protected]>
@q-andy q-andy left a comment

Addressed review comments. What's left is changing the response format according to the discussion above (breaking the high-level categories out as unflattened sections, like all_nodes), adding examples, and adding BWC tests.

.map(String::toLowerCase)
.collect(Collectors.toSet());

private NeuralSearchSettingsAccessor settingsAccessor;

Don't static fields typically go before instance fields?

Signed-off-by: Andy Qin <[email protected]>

q-andy commented Mar 12, 2025

Added BWC tests. They don't run yet because there are no earlier versions to test against, but I included them as a basis for future 3.x versions.

Updated response formatting:

Default

GET {{ _.base_url }}/_plugins/_neural/stats
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.0.0",
		"processors": {
			"ingest": {
				"text_embedding_processors_in_pipelines": 0
			}
		}
	},
	"all_nodes": {
		"processors": {
			"ingest": {
				"text_embedding_executions": 0
			}
		}
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors": {
				"ingest": {
					"text_embedding_executions": 0
				}
			}
		}
	}
}

Flatten

GET {{ _.base_url }}/_plugins/_neural/stats?flat_stat_paths=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": "3.0.0",
		"processors.ingest.text_embedding_processors_in_pipelines": 0
	},
	"all_nodes": {
		"processors.ingest.text_embedding_executions": 0
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors.ingest.text_embedding_executions": 0
		}
	}
}
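
For readers wondering how the flattened form relates to the default one, here is a rough sketch of the transformation, assuming each top-level section (`info`, `all_nodes`, each node entry) is flattened independently by joining nested keys with dots. `FlatPathsSketch` and `flatten` are hypothetical names, not the plugin's implementation, and the sketch treats every nested map as a path segment, so it only covers responses without metadata objects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper; illustrates dot-joined stat paths only.
final class FlatPathsSketch {
    // Turns {"processors": {"ingest": {"x": 0}}} into {"processors.ingest.x": 0}.
    static Map<String, Object> flatten(Map<String, Object> nested) {
        Map<String, Object> flat = new LinkedHashMap<>();
        flattenInto("", nested, flat);
        return flat;
    }

    @SuppressWarnings("unchecked")
    private static void flattenInto(String prefix, Map<String, Object> node, Map<String, Object> out) {
        for (Map.Entry<String, Object> entry : node.entrySet()) {
            String path = prefix.isEmpty() ? entry.getKey() : prefix + "." + entry.getKey();
            if (entry.getValue() instanceof Map) {
                // Nested section: keep descending and extend the path.
                flattenInto(path, (Map<String, Object>) entry.getValue(), out);
            } else {
                // Leaf value: emit it under the full dotted path.
                out.put(path, entry.getValue());
            }
        }
    }
}
```

Applied to the `all_nodes` section of the default response, this would yield the single key `processors.ingest.text_embedding_executions`, matching the flattened example above.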

Flatten w/ metadata

GET {{ _.base_url }}/_plugins/_neural/stats?flat_stat_paths=true&include_metadata=true
{
	"_nodes": {
		"total": 1,
		"successful": 1,
		"failed": 0
	},
	"cluster_name": "integTest",
	"info": {
		"cluster_version": {
			"value": "3.0.0",
			"stat_type": "info_string"
		},
		"processors.ingest.text_embedding_processors_in_pipelines": {
			"value": 0,
			"stat_type": "info_counter"
		}
	},
	"all_nodes": {
		"processors.ingest.text_embedding_executions": {
			"value": 0,
			"stat_type": "timestamped_event_counter",
			"trailing_interval_value": 0,
			"minutes_since_last_event": 29028938
		}
	},
	"nodes": {
		"Qs3osnyfTz6AokNiPB2uRQ": {
			"processors.ingest.text_embedding_executions": {
				"value": 0,
				"stat_type": "timestamped_event_counter",
				"trailing_interval_value": 0,
				"minutes_since_last_event": 29028938
			}
		}
	}
}
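
As a rough illustration of the metadata fields on a timestamped event counter, the sketch below keeps a running total, per-minute buckets for a trailing window, and the time of the last event. Everything here is an assumption for illustration: `TimestampedCounterSketch`, the injectable clock, the window length, and the bucket layout are hypothetical and may differ from the PR. The large `minutes_since_last_event` values in the examples look consistent with minutes elapsed since the Unix epoch, which is what this sketch reports when no event has been recorded yet.

```java
import java.util.TreeMap;
import java.util.function.LongSupplier;

// Hypothetical counter; not the plugin's actual implementation.
final class TimestampedCounterSketch {
    private static final long MILLIS_PER_MINUTE = 60_000L;

    private final LongSupplier clockMillis;   // injectable clock, e.g. System::currentTimeMillis
    private final int trailingMinutes;        // assumed window length for trailing_interval_value
    private final TreeMap<Long, Long> perMinute = new TreeMap<>();
    private long total = 0;
    private long lastEventMillis = 0;         // 0 means no event has been recorded yet

    TimestampedCounterSketch(LongSupplier clockMillis, int trailingMinutes) {
        this.clockMillis = clockMillis;
        this.trailingMinutes = trailingMinutes;
    }

    synchronized void increment() {
        long now = clockMillis.getAsLong();
        long minute = now / MILLIS_PER_MINUTE;
        perMinute.merge(minute, 1L, Long::sum);
        // Drop buckets that have fallen out of the trailing window.
        perMinute.headMap(minute - trailingMinutes, true).clear();
        total++;
        lastEventMillis = now;
    }

    // "value": events recorded since the counter was created.
    synchronized long value() {
        return total;
    }

    // "trailing_interval_value": events within the last trailingMinutes minutes.
    synchronized long trailingIntervalValue() {
        long minute = clockMillis.getAsLong() / MILLIS_PER_MINUTE;
        long sum = 0;
        for (long count : perMinute.tailMap(minute - trailingMinutes, false).values()) {
            sum += count;
        }
        return sum;
    }

    // "minutes_since_last_event": minutes since the epoch if nothing was ever recorded.
    synchronized long minutesSinceLastEvent() {
        return (clockMillis.getAsLong() - lastEventMillis) / MILLIS_PER_MINUTE;
    }
}
```

Constructing it with `System::currentTimeMillis` and a window of, say, 5 minutes gives a counter whose total keeps growing while the trailing value drops back to zero once events stop.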

@q-andy q-andy force-pushed the neural-stats branch 3 times, most recently from 720ea81 to 647d730 Compare March 13, 2025 18:15
Signed-off-by: Andy Qin <[email protected]>

@martin-gaievski martin-gaievski left a comment

Looks good to me, thanks Andy. Please address my comment about rebasing onto the latest main.

codecov bot commented Mar 13, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 80.90%. Comparing base (57124dd) to head (82504ce).
Report is 1 commits behind head on feature/neural-stats-api.

Additional details and impacted files
@@                      Coverage Diff                       @@
##             feature/neural-stats-api    #1208      +/-   ##
==============================================================
- Coverage                       81.83%   80.90%   -0.93%     
+ Complexity                       2607     1423    -1184     
==============================================================
  Files                             190      115      -75     
  Lines                            8922     5001    -3921     
  Branches                         1520      803     -717     
==============================================================
- Hits                             7301     4046    -3255     
+ Misses                           1028      643     -385     
+ Partials                          593      312     -281     


@q-andy q-andy force-pushed the neural-stats branch 2 times, most recently from 8648e13 to 82504ce Compare March 14, 2025 16:26

@vibrantvarun vibrantvarun left a comment

Looks good. Nice job, Andy.

@vibrantvarun

@heemin32 if all your comments are resolved, can we merge this?

@heemin32

@heemin32 if all your comments are resolved, can we merge this?

I think my comments are all resolved. G2G.

@vibrantvarun vibrantvarun merged commit 89e6932 into opensearch-project:feature/neural-stats-api Mar 14, 2025
90 of 96 checks passed
This was referenced Mar 31, 2025
Labels
v3.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC] Neural Plugin Stats API
4 participants