Description
What/Why
What are you proposing?
With the 2.15 release there has been a lot of progress in improving search latency for hybrid query (changes done under the umbrella of #704). The team needs to continue looking for ways to optimize latency.
I collected some data on the breakdown of request time across the different search phases/steps:
| request execution phase | time span | overall percentage of time |
|---|---|---|
| shard-level / collecting scores | 0% to 81% (81% of total) | 81% |
| processor / coordinator | 81% to 92% (11% of total) | 92% |
| network | 92% to 100% (8% of total) | 100% |
The time span column indicates the proportion of the total query execution time that each phase consumes, measured from the client's perspective. For example, the shard-level/collecting-scores phase starts at 0% of the total query time and ends at 81%, meaning it accounts for 81% of the total query execution time. The subsequent phases follow similarly.
Given that shard-level search contributes significantly to the total query execution time, it is reasonable to focus optimization efforts there initially.
At a high level, here are some areas where the team can apply effort:
- **document iterator**: in today's implementation we use `DisiWrapper` to iterate over the results of a single sub-query and `DisiPriorityQueue` to collect scores for one doc ID. A few foundational ideas underpin this approach: iterate one doc ID at a time, and advance the iterators of every sub-query so they all point to the same doc ID before calling `Scorer.score()`. This brings some limitations, e.g. we cannot do bulk/block iteration over a set of documents (see the first sketch after this list).
- **optimizations in special cases**: we can optimize some special cases; for example, if two or more sub-queries can be rewritten to the same Lucene-level query, we can execute only one of them and reuse its scores for the others (see the second sketch after this list).
- **caching strategies**: implementing smarter caching mechanisms to reduce redundant computations.
- **algorithmic improvements**: optimizing existing algorithms or introducing new ones that can handle hybrid queries more efficiently.
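For illustration, here is a minimal, self-contained sketch of the doc-at-a-time score collection idea described in the first bullet. It is not the actual `HybridQueryScorer` implementation: it replaces `DisiPriorityQueue` with a naive minimum-doc-ID scan, and the `ScoreConsumer` callback is a hypothetical stand-in for the real collector. It only shows why every sub-query iterator has to be aligned on the same doc ID before scoring, and why bulk/block iteration does not fit this model.

```java
import java.io.IOException;
import java.util.List;

import org.apache.lucene.search.DocIdSetIterator;
import org.apache.lucene.search.Scorer;

/**
 * Simplified illustration of doc-at-a-time score collection for a hybrid query.
 * Not the real implementation: DisiPriorityQueue is replaced by a linear scan
 * for the smallest doc ID across sub-query iterators.
 */
final class DocAtATimeSketch {

    /** Hypothetical callback that receives per-sub-query scores for one doc. */
    interface ScoreConsumer {
        void accept(int docId, float[] subQueryScores) throws IOException;
    }

    static void collect(List<Scorer> subScorers, ScoreConsumer consumer) throws IOException {
        // capture and position every sub-query iterator on its first matching doc
        DocIdSetIterator[] iterators = new DocIdSetIterator[subScorers.size()];
        for (int i = 0; i < subScorers.size(); i++) {
            iterators[i] = subScorers.get(i).iterator();
            iterators[i].nextDoc();
        }
        float[] scores = new float[subScorers.size()];
        while (true) {
            // the next doc to score is the smallest doc ID any sub-query points to
            int targetDoc = DocIdSetIterator.NO_MORE_DOCS;
            for (DocIdSetIterator it : iterators) {
                targetDoc = Math.min(targetDoc, it.docID());
            }
            if (targetDoc == DocIdSetIterator.NO_MORE_DOCS) {
                break; // all sub-queries are exhausted
            }
            for (int i = 0; i < subScorers.size(); i++) {
                if (iterators[i].docID() == targetDoc) {
                    scores[i] = subScorers.get(i).score(); // sub-query matches this doc
                    iterators[i].nextDoc();                // move past it for the next round
                } else {
                    scores[i] = 0f;                        // sub-query does not match this doc
                }
            }
            consumer.accept(targetDoc, scores);
        }
    }
}
```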
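And a rough sketch of the special-case idea from the second bullet: detect sub-queries that rewrite to the same Lucene query, execute only the first occurrence, and let the duplicates reuse its scores. The class name and the reuse-index representation are hypothetical and not part of the neural-search codebase; the actual design for #1236 may differ.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;

/**
 * Hypothetical sketch: if several hybrid sub-queries rewrite to the same Lucene
 * query, execute that query once and reuse its scores for the duplicates.
 */
final class DuplicateSubQueryDedup {

    /**
     * Returns, for each sub-query position, the position whose results it can reuse.
     * A position that maps to itself must be executed; any other position is a duplicate.
     */
    static int[] deduplicate(List<Query> subQueries, IndexSearcher searcher) throws IOException {
        int[] reuseFrom = new int[subQueries.size()];
        Map<Query, Integer> firstSeen = new HashMap<>();
        for (int i = 0; i < subQueries.size(); i++) {
            // Query rewriting normalizes the query; equal rewritten queries on the
            // same index produce identical per-document scores
            Query rewritten = searcher.rewrite(subQueries.get(i));
            Integer existing = firstSeen.putIfAbsent(rewritten, i);
            reuseFrom[i] = existing == null ? i : existing;
        }
        return reuseFrom;
    }
}
```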
Individual issues/features related to this request:
Shard level search optimizations
- [Performance Optimization] Shard level search - change score collection approach from random access by same doc ID to collecting scores for all doc IDs in sub-queries #1234
- [Performance Optimization] Shard level search - optimize for special cases #1236
- [BLOG] Reducing hybrid query latency in OpenSearch 3.1 with efficient score collection project-website#3829 - request for blog post
Instrumentation
- [FEATURE] Workload for semantic search that has vector data and text fields opensearch-benchmark-workloads#341 - OSB workload with vector data and text fields
- [Feature Request] Add capability for reading statistics from search phase results processors OpenSearch#17558 - Add statistics from search phase results processors
- [FEATURE] Add support for microbenchmarks #1237 - Add support for microbenchmarks