Description
This issue tracks the drafting of the implementation plan that corresponds to the following item in the search relevancy sandbox project proposal: Staging Elasticsearch reindex DAGs for both potential index types (these will be subsets of the full data refresh).
From the project plan:
This plan will describe the DAG or DAGs which will be used to create/update both the proportional-by-provider and production-data-volume indices.
It will also describe the mechanism by which maintainers can rapidly switch which index the staging API uses. This could be done in two ways: a DAG that allows changing the primary index alias, or a set of changes to the API that would allow queries to specify which index they use. The implementation plan should explore and describe both options.
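To make the first option concrete, here is a minimal sketch of the request body a DAG task could send to Elasticsearch's `POST /_aliases` endpoint to repoint an alias in one atomic call. The function name and the index names in the usage note are hypothetical and chosen only for illustration; the actual DAG design is what the implementation plan is meant to decide.

```python
from typing import Any, Dict, List, Optional


def build_alias_swap_actions(
    alias: str, old_index: Optional[str], new_index: str
) -> Dict[str, Any]:
    """Build a body for Elasticsearch's `POST /_aliases` endpoint.

    All actions in a single `_aliases` request are applied atomically,
    so the alias never points at zero indices during the swap.
    """
    actions: List[Dict[str, Any]] = []
    # Only detach the alias if it currently points at a different index.
    if old_index is not None and old_index != new_index:
        actions.append({"remove": {"index": old_index, "alias": alias}})
    actions.append({"add": {"index": new_index, "alias": alias}})
    return {"actions": actions}
```

For example, calling `build_alias_swap_actions("image", "image-proportional", "image-prod-volume")` (hypothetical index names) yields one `remove` action followed by one `add` action. Sending that body to the staging cluster is left out of the sketch.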
There was some discussion on this in #1107 (link), from @zackkrida and @sarayourfriend:
(Zack)
Feels quite important to me and like it may be a bit under-explained by this proposal. As written, the proposal seems to suggest that the DAGs would be used to allow for creating a new staging index that replaces the default, "live" index the staging API is pointed at. Is that correct?
Alternatively, I can see a lot of value in having DAGs that allow for creating and updating staging indices with different qualities, but having a mechanism that allows developers to switch between them much more rapidly, perhaps even a query param-based solution (which would allow, for example, to switch indices via a checkbox on the staging.openverse.org/preferences page).
To put this all another way, I'm wondering if requirement #5 should be a bit more specific in terms of what "easily" means.
...
(Madison)
That's a great point, but I actually think this would be a good thing to determine in the implementation plan! It was my intention to have a single alias (like production, at least at the current moment) and allow maintainers to swap between these aliases. However, I do quite like the idea of an API setting which might allow dynamically changing which index is used at query time! That would bring increased complexity to the plan and the project that I wasn't initially considering, but if we scope it out as part of the plan, potentially starting with the rapid swapping but laying out a plan for what changing the index at query time might look like, we could even add that capability down the line outside of the bounds of this specific effort!...
(Sara)
FWIW, if anyone is looking at this later during implementation planning, this probably doesn't need to be more complicated than a query parameter that determines the index, but we'd need to be careful to juggle things in light of https://docs.openverse.org/projects/proposals/detecting_sensitive_textual_content/20230308-implementation_plan_filtering_and_designating_results_with_sensitive_textual_content.html
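For reference during planning, the query-parameter option could look roughly like the sketch below. It is not taken from the Openverse API: the parameter name `index`, the default index, and the allowlisted index names are all assumptions made up for illustration, and the sketch does not address the interaction with the sensitive textual content plan linked above.

```python
from typing import Mapping

# Hypothetical names, for illustration only.
DEFAULT_INDEX = "image"
ALLOWED_INDICES = frozenset(
    {"image", "image-proportional", "image-prod-volume"}
)


def resolve_index(query_params: Mapping[str, str], param_name: str = "index") -> str:
    """Pick the Elasticsearch index for a request from its query params.

    Falls back to the default index when the parameter is absent and
    rejects any value outside the allowlist, so callers cannot query
    arbitrary indices on the cluster.
    """
    requested = query_params.get(param_name)
    if requested is None:
        return DEFAULT_INDEX
    if requested not in ALLOWED_INDICES:
        raise ValueError(f"Unknown index: {requested!r}")
    return requested
```

Keeping an explicit allowlist means a request can only select indices that maintainers have deliberately exposed, and requests without the parameter behave exactly as they do today.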