Description
Is your feature request related to a problem? Please describe
System ingest pipelines, added in #17817, are a new pipeline type that allows plugins to dynamically inject and execute processors based on the index mapping or other conditional checks. In essence, system pipelines let plugin developers implement features that completely abstract pipeline and processor execution away from the user.
Currently, system ingest pipelines are used to support the semantic field (RFC), where we chunk the semantic field text and generate embeddings for it without the user having to configure pipelines manually.
While system pipelines were developed with semantic field in mind, the extension points are flexible and may be used to develop future field types implemented by other plugins as well.
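To make the idea above concrete for readers who have not looked at #17817, here is a minimal sketch of mapping-driven processor injection. Every name in it (`SystemPipelineSketch`, `Processor`, `SystemProcessorFactory`, `resolve`, `SEMANTIC`) is hypothetical and is not the OpenSearch API; documents and mappings are modeled as plain maps, and whitespace splitting stands in for real chunking and embedding.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical sketch only: none of these types are OpenSearch APIs.
public class SystemPipelineSketch {

    /** A processor transforms a document, modeled here as a flat map. */
    interface Processor extends UnaryOperator<Map<String, Object>> {}

    /** A plugin-supplied factory that inspects the index mapping (field name -> field type). */
    interface SystemProcessorFactory {
        /** Returns a processor if this factory applies to the mapping, otherwise null. */
        Processor createIfApplicable(Map<String, String> fieldTypes);
    }

    /** Builds the system pipeline for an index by asking every registered factory. */
    static List<Processor> resolve(Map<String, String> fieldTypes,
                                   List<SystemProcessorFactory> factories) {
        List<Processor> pipeline = new ArrayList<>();
        for (SystemProcessorFactory factory : factories) {
            Processor processor = factory.createIfApplicable(fieldTypes);
            if (processor != null) {
                pipeline.add(processor);
            }
        }
        return pipeline;
    }

    /** Example factory: contributes a processor only when a "semantic" field is mapped. */
    static final SystemProcessorFactory SEMANTIC = fieldTypes -> {
        if (!fieldTypes.containsValue("semantic")) {
            return null; // nothing to inject for this index
        }
        return doc -> {
            Map<String, Object> out = new HashMap<>(doc);
            for (Map.Entry<String, String> field : fieldTypes.entrySet()) {
                Object value = doc.get(field.getKey());
                if ("semantic".equals(field.getValue()) && value instanceof String) {
                    // Placeholder for chunking and embedding: split on whitespace.
                    String text = (String) value;
                    out.put(field.getKey() + "_chunks", List.of(text.split("\\s+")));
                }
            }
            return out;
        };
    };
}
```

In this sketch the user never configures a pipeline: the processor appears only because the mapping contains a semantic field, which is the kind of abstraction the requested documentation would need to explain.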
Describe the solution you'd like
We should document in detail, outside of the source code itself, the nuances of how system pipelines work, such as:
- Interactions with bulk update (see also #18276, "[Feature Request] Support system generated ingest pipelines for bulk update operations")
- Order of pipeline execution
- System pipeline caching
- Configurability options (enable/disable cluster setting)
Since this documentation is targeted at developers, I believe it should go in a README in the code base itself rather than on the documentation website, which is for users of OpenSearch.
Related component
Indexing
Describe alternatives you've considered
Not writing explicit documentation and leaving future developers to read the code.
Additional context
No response