specs: VEP-2272 Complete Data Job Configuration Persistence Part 2#2302

Merged
mivanov1988 merged 7 commits into main from
person/miroslavi/spec-complete-job-config-persistence-in-db-part-2
Jun 26, 2023

@mivanov1988
Contributor

This change introduces VEP-2272, which proposes improving Versatile Data Kit by switching the source of truth from Kubernetes to a database.

Signed-off-by: Miroslav Ivanov <miroslavi@vmware.com>

@mivanov1988 mivanov1988 changed the title [DRAFT] specs: VEP-2272 Complete Data Job Configuration Persistence Part 2 specs: VEP-2272 Complete Data Job Configuration Persistence Part 2 Jun 22, 2023
Contributor

@antoniivanov antoniivanov left a comment

Thanks for the proposal. Looks pretty solid.

Let's be more specific in a couple of places though.

Collaborator

@dakodakov dakodakov left a comment

Overall looks good. Please address the two minor comments I posted.

@antoniivanov
Contributor

antoniivanov commented Jun 23, 2023

How will we migrate?

Once we are ready, we will upgrade the Control Service; this will create the new database table (but it will be empty). We then need a script that populates it with the correct information from the existing environment. Would this script be executed by the Helm chart (with a post-upgrade Helm hook, https://helm.sh/docs/topics/charts_hooks/)?

What about rollback in case of issues? If we roll back, the second table would remain, right? But that won't be an issue; the Control Service would simply use the previous way (annotations).

@mivanov1988
Contributor Author

> How will we migrate?
>
> Once we are ready, we will upgrade the Control Service; this will create the new database table (but it will be empty). We then need a script that populates it with the correct information from the existing environment. Would this script be executed by the Helm chart (with a post-upgrade Helm hook, https://helm.sh/docs/topics/charts_hooks/)?
>
> What about rollback in case of issues? If we roll back, the second table would remain, right? But that won't be an issue; the Control Service would simply use the previous way (annotations).

Basically, we will execute the following steps:

  1. Provide a feature flag that switches where deployments are read from (the database or Kubernetes). By default, the Control Service will read the deployments from Kubernetes.
  2. Enable writes to both the database and Kubernetes.
  3. Execute a manual script that synchronizes deployments between Kubernetes and the database. At the moment I don't think it is necessary to use Helm for that, but we might consider it during development.
  4. Switch the feature flag.

About the rollback, you are absolutely right: we just need to switch the feature flag back and everything will work as expected.
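
The four steps and the rollback can be sketched roughly as follows. This is only an illustration under stated assumptions, not the Control Service's code: `DeploymentStore`, `DeploymentService`, `read_from_database`, and `synchronize` are hypothetical names, both stores are plain in-memory dictionaries, and the sync is assumed to copy from Kubernetes (the current source of truth) into the database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DeploymentStore:
    """Hypothetical in-memory stand-in for Kubernetes or the database."""
    name: str
    deployments: Dict[str, dict] = field(default_factory=dict)


class DeploymentService:
    """Hypothetical service that dual-writes and reads behind a feature flag."""

    def __init__(self, kubernetes: DeploymentStore, database: DeploymentStore,
                 read_from_database: bool = False):
        self.kubernetes = kubernetes
        self.database = database
        # Step 1: the flag defaults to reading from Kubernetes.
        self.read_from_database = read_from_database

    def write(self, job_name: str, config: dict) -> None:
        # Step 2: every write goes to both stores.
        self.kubernetes.deployments[job_name] = dict(config)
        self.database.deployments[job_name] = dict(config)

    def read(self, job_name: str) -> Optional[dict]:
        # The flag alone decides which store answers reads.
        source = self.database if self.read_from_database else self.kubernetes
        return source.deployments.get(job_name)


def synchronize(kubernetes: DeploymentStore, database: DeploymentStore) -> int:
    """Step 3 (assumed direction): copy missing or differing deployments
    from Kubernetes into the database. Returns how many entries changed."""
    changed = 0
    for job_name, config in kubernetes.deployments.items():
        if database.deployments.get(job_name) != config:
            database.deployments[job_name] = dict(config)
            changed += 1
    return changed
```

Step 4 is then `service.read_from_database = True`, and a rollback is the same assignment with `False`. Because writes keep going to both stores, flipping the flag in either direction does not lose data in this sketch.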

@murphp15
Contributor

That is really well written and clear.

Contributor

@antoniivanov antoniivanov left a comment

Thanks, Miro, for this proposal. It seems sound, thorough, and well-reasoned to me.
It addresses a clear need to improve the efficiency of data job configuration and deployment in VDK.

So consider this proposal approved by me. Once it's approved by every reviewer, you can move its status to implementable.

Two notes to consider further; they can be filled in at a later stage:

  1. With the shift to the database being the primary source of truth, ensuring the resilience, backup, and high availability of the database becomes even more important. Let's make sure we do our calculations right. I made a comment about that.

  2. We need to make sure we have effective monitoring and telemetry so that any synchronization issues are detected (by monitoring) and can be analyzed historically (by telemetry). As part of the implementation, we need to come up with something for it. Enhancing our telemetry is fairly easy with the "Measurable" annotation (for telemetry) and MeterRegistry (for monitoring).
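
One possible shape for the synchronization check in note 2 is sketched below. This is a rough illustration, not the Control Service's implementation: `find_out_of_sync_jobs` and `SyncMonitor` are hypothetical names, and the in-memory gauge value and history list merely stand in for what MeterRegistry (monitoring) and the "Measurable" annotation (telemetry) would record.

```python
from typing import Dict, List


def find_out_of_sync_jobs(kubernetes: Dict[str, dict],
                          database: Dict[str, dict]) -> List[str]:
    """Return the sorted names of jobs whose deployment differs between the
    two stores, including jobs present in only one of them."""
    all_jobs = set(kubernetes) | set(database)
    return sorted(job for job in all_jobs
                  if kubernetes.get(job) != database.get(job))


class SyncMonitor:
    """Hypothetical stand-in for a gauge (monitoring) plus a history log
    (telemetry); a real setup would export these instead of storing them."""

    def __init__(self) -> None:
        self.out_of_sync_gauge = 0          # current number of drifted jobs
        self.history: List[List[str]] = []  # drifted job names per check

    def check(self, kubernetes: Dict[str, dict],
              database: Dict[str, dict]) -> List[str]:
        drifted = find_out_of_sync_jobs(kubernetes, database)
        self.out_of_sync_gauge = len(drifted)
        self.history.append(drifted)
        return drifted
```

Running `check` periodically would let an alert fire whenever the gauge is non-zero, while the history supports later analysis of which jobs drifted and when.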

@mivanov1988
Contributor Author

Thank you for the review!

@mivanov1988 mivanov1988 merged commit f53d0e6 into main Jun 26, 2023
@mivanov1988 mivanov1988 deleted the person/miroslavi/spec-complete-job-config-persistence-in-db-part-2 branch June 26, 2023 11:28
5 participants