[server] Fix mismatch collation issue in workspace instance metrics query #20933

Merged: roboquat merged 1 commit into main from hw/db-coll on Jul 1, 2025

mustard-mh (Contributor) commented Jun 30, 2025

Description

Search the codebase for d_b_org_env_var, d_b_workspace_instance_metrics, DBOrgEnvVar, and DBWorkspaceInstanceMetrics (the tables altered in the last DB migration), check their usages, and make sure the joined columns use the same collation.
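
The check described above can be sketched in TypeScript. This is a minimal illustration, not code from this PR: it assumes the column collations have already been read from information_schema.COLUMNS (TABLE_NAME, COLUMN_NAME, COLLATION_NAME), and the types ColumnCollation and JoinPair and the function findCollationMismatches are hypothetical names introduced here.

```typescript
// One row of information_schema.COLUMNS, reduced to the fields used here.
interface ColumnCollation {
    table: string;
    column: string;
    collation: string | null; // null for non-character columns
}

// A pair of columns that some query joins on (hypothetical helper type).
interface JoinPair {
    left: [string, string]; // [table, column]
    right: [string, string]; // [table, column]
}

// Returns one human-readable line per join whose two sides have different collations.
function findCollationMismatches(columns: ColumnCollation[], joins: JoinPair[]): string[] {
    const byKey = new Map<string, string | null>();
    for (const c of columns) {
        byKey.set(`${c.table}.${c.column}`, c.collation);
    }
    const problems: string[] = [];
    for (const { left, right } of joins) {
        const l = left.join(".");
        const r = right.join(".");
        const lc = byKey.get(l);
        const rc = byKey.get(r);
        // Unknown or non-character columns are skipped rather than reported.
        if (lc && rc && lc !== rc) {
            problems.push(`${l} (${lc}) vs ${r} (${rc})`);
        }
    }
    return problems;
}
```

Feeding it the pair d_b_workspace_instance_metrics.instanceId and d_b_workspace_instance.id, with the collations listed in the script further down, would report exactly the join this PR touches.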

Related Issue(s)

Fixes CLC-1500

How to test

  • Open workspace in preview
  • Check the Insight page, it should work as usual

Verify in real case

  • Alter tables with queries below (generated by Claude.ai)
-- MySQL queries to standardize collation to utf8mb4_0900_ai_ci
-- Based on the current schema analysis

-- First, alter the database default collation
ALTER DATABASE gitpod CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- Tables that need collation changes to utf8mb4_0900_ai_ci:

-- 1. d_b_app_installation (currently utf8mb4_general_ci)
ALTER TABLE d_b_app_installation CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 2. d_b_audit_log (currently latin1_swedish_ci)
ALTER TABLE d_b_audit_log CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 3. d_b_auth_provider_entry (currently utf8mb4_general_ci)
ALTER TABLE d_b_auth_provider_entry CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 4. d_b_blocked_repository (currently latin1_swedish_ci)
ALTER TABLE d_b_blocked_repository CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 5. d_b_code_sync_collection (currently latin1_swedish_ci)
ALTER TABLE d_b_code_sync_collection CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 6. d_b_code_sync_resource (currently latin1_swedish_ci)
ALTER TABLE d_b_code_sync_resource CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 7. d_b_cost_center (currently utf8mb4_general_ci)
ALTER TABLE d_b_cost_center CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 8. d_b_email_domain_filter (currently utf8mb4_general_ci)
ALTER TABLE d_b_email_domain_filter CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 9. d_b_free_credits (currently latin1_swedish_ci)
ALTER TABLE d_b_free_credits CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 10. d_b_gitpod_token (currently utf8mb4_general_ci)
ALTER TABLE d_b_gitpod_token CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 11. d_b_identity (currently utf8mb4_general_ci)
ALTER TABLE d_b_identity CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 12. d_b_installation_admin (currently latin1_swedish_ci)
ALTER TABLE d_b_installation_admin CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 13. d_b_license_key (currently latin1_swedish_ci)
ALTER TABLE d_b_license_key CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 14. d_b_linked_in_profile (currently utf8mb4_general_ci)
ALTER TABLE d_b_linked_in_profile CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 15. d_b_oauth_auth_code_entry (currently latin1_swedish_ci)
ALTER TABLE d_b_oauth_auth_code_entry CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 16. d_b_oidc_client_config (currently latin1_swedish_ci)
ALTER TABLE d_b_oidc_client_config CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 17. d_b_one_time_secret (currently utf8mb4_general_ci)
ALTER TABLE d_b_one_time_secret CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 18. d_b_org_env_var (already utf8mb4_0900_ai_ci - no change needed)
-- ALTER TABLE d_b_org_env_var CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 19. d_b_org_settings (currently utf8mb4_general_ci)
ALTER TABLE d_b_org_settings CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 20. d_b_personal_access_token (currently latin1_swedish_ci)
ALTER TABLE d_b_personal_access_token CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 21. d_b_prebuild_info (currently utf8mb4_general_ci)
ALTER TABLE d_b_prebuild_info CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 22. d_b_prebuilt_workspace (currently utf8mb4_general_ci)
ALTER TABLE d_b_prebuilt_workspace CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 23. d_b_prebuilt_workspace_updatable (currently utf8mb4_general_ci)
ALTER TABLE d_b_prebuilt_workspace_updatable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 24. d_b_project (currently utf8mb4_general_ci)
ALTER TABLE d_b_project CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 25. d_b_project_env_var (currently utf8mb4_general_ci)
ALTER TABLE d_b_project_env_var CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 26. d_b_project_info (currently utf8mb4_general_ci)
ALTER TABLE d_b_project_info CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 27. d_b_project_usage (currently utf8mb4_general_ci)
ALTER TABLE d_b_project_usage CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 28. d_b_repository_white_list (currently utf8mb4_general_ci)
ALTER TABLE d_b_repository_white_list CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 29. d_b_snapshot (currently utf8mb4_general_ci)
ALTER TABLE d_b_snapshot CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 30. d_b_stripe_customer (currently latin1_swedish_ci)
ALTER TABLE d_b_stripe_customer CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 31. d_b_team (currently utf8mb4_general_ci)
ALTER TABLE d_b_team CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 32. d_b_team_membership (currently utf8mb4_general_ci)
ALTER TABLE d_b_team_membership CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 33. d_b_team_membership_invite (currently utf8mb4_general_ci)
ALTER TABLE d_b_team_membership_invite CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 34. d_b_token_entry (currently utf8mb4_general_ci)
ALTER TABLE d_b_token_entry CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 35. d_b_usage (currently latin1_swedish_ci)
ALTER TABLE d_b_usage CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 36. d_b_user (currently utf8mb4_general_ci)
ALTER TABLE d_b_user CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 37. d_b_user_env_var (currently utf8mb4_general_ci)
ALTER TABLE d_b_user_env_var CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 38. d_b_user_ssh_public_key (currently utf8mb4_general_ci)
ALTER TABLE d_b_user_ssh_public_key CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 39. d_b_volume_snapshot (currently utf8mb4_general_ci)
ALTER TABLE d_b_volume_snapshot CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 40. d_b_webhook_event (currently utf8mb4_general_ci)
ALTER TABLE d_b_webhook_event CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 41. d_b_workspace (currently utf8mb4_general_ci)
ALTER TABLE d_b_workspace CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 42. d_b_workspace_cluster (currently utf8mb4_general_ci)
ALTER TABLE d_b_workspace_cluster CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 43. d_b_workspace_instance (currently utf8mb4_general_ci)
ALTER TABLE d_b_workspace_instance CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 44. d_b_workspace_instance_metrics (already utf8mb4_0900_ai_ci - no change needed)
-- ALTER TABLE d_b_workspace_instance_metrics CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 45. d_b_workspace_instance_user (currently utf8mb4_general_ci)
ALTER TABLE d_b_workspace_instance_user CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 46. d_b_workspace_report_entry (currently utf8mb4_general_ci)
ALTER TABLE d_b_workspace_report_entry CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;

-- 47. migrations (currently latin1_swedish_ci)
ALTER TABLE migrations CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
  • Check the Insight page
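
A script like the one above does not have to be written by hand. The following is a rough sketch, assuming the current table collations were read from information_schema.TABLES (TABLE_NAME, TABLE_COLLATION); the TableCollation type and buildConvertStatements function are hypothetical names, not part of the Gitpod codebase.

```typescript
// One row of information_schema.TABLES, reduced to the fields used here.
interface TableCollation {
    table: string;
    collation: string;
}

const TARGET_CHARSET = "utf8mb4";
const TARGET_COLLATION = "utf8mb4_0900_ai_ci";

// Emits one ALTER TABLE statement per table that is not yet on the target collation.
// Tables already on the target (such as d_b_org_env_var in the list above) are skipped.
function buildConvertStatements(
    tables: TableCollation[],
    collation: string = TARGET_COLLATION,
    charset: string = TARGET_CHARSET,
): string[] {
    return tables
        .filter((t) => t.collation !== collation)
        .map((t) => `ALTER TABLE ${t.table} CONVERT TO CHARACTER SET ${charset} COLLATE ${collation};`);
}
```

The sketch only builds strings; it does not quote identifiers or run anything, so the output should still be reviewed before it is executed against a real database.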

Documentation

Preview status

Gitpod was successfully deployed to your preview environment.

Build Options

Build
  • /werft with-werft
    Run the build with werft instead of GHA
  • leeway-no-cache
  • /werft no-test
    Run Leeway with --dont-test
Publish
  • /werft publish-to-npm
  • /werft publish-to-jb-marketplace
Installer
  • analytics=segment
  • with-dedicated-emulation
  • workspace-feature-flags
    Add desired feature flags to the end of the line above, space separated
Preview Environment / Integration Tests
  • /werft with-local-preview
    If enabled this will build install/preview
  • /werft with-preview
  • /werft with-large-vm
  • /werft with-gce-vm
    If enabled this will create the environment on GCE infra
  • /werft preemptible
    Saves cost. Untick this only if you're really sure you need a non-preemptible machine.
  • with-integration-tests=all
    Valid options are all, workspace, webapp, ide, jetbrains, vscode, ssh. If enabled, with-preview and with-large-vm will be enabled.
  • with-monitoring

/hold

"wsi.metrics",
DBWorkspaceInstanceMetrics,
"wsim",
"wsim.instanceId COLLATE utf8mb4_general_ci = wsi.id COLLATE utf8mb4_general_ci",
Member

The 2nd COLLATE is superfluous.
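
A short illustration of this point: in MySQL an explicit COLLATE clause has coercibility 0 and a plain column value has coercibility 2, so a single COLLATE on one side already decides the collation used for the comparison. The helper below is a hypothetical sketch (joinOnWithCollation is not a function from this PR) that builds such a one-sided join condition and rejects anything that does not look like a collation name.

```typescript
// Builds a join condition with an explicit COLLATE on the left operand only.
// The explicit clause outranks the implicit column collation on the right side.
function joinOnWithCollation(left: string, right: string, collation: string): string {
    // Guard against accidental SQL injection via the collation name.
    if (!/^[A-Za-z0-9_]+$/.test(collation)) {
        throw new Error(`Unexpected collation name: ${collation}`);
    }
    return `${left} COLLATE ${collation} = ${right}`;
}
```

Called as joinOnWithCollation("wsim.instanceId", "wsi.id", "utf8mb4_general_ci"), it yields the condition from the diff above without the second COLLATE.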

Member

But also... why not leave it as-is, looking at the mess in some of the tables schemas... 🫠

Contributor Author

But also... why not leave it as-is, looking at the mess in some of the tables schemas... 🫠

I did it because of the original request XD "Maybe we can fix that in the query itself (there is a way to force a collation on the fly)."

Contributor Author

Yep, updating the schema would be a better idea; let's merge the current PR first.

geropl (Member) commented Jul 1, 2025

@mustard-mh The test instructions are not correct: they always work because there is no collation difference between the metrics and instance tables.
Instead, this is a good test:

ALTER TABLE d_b_workspace_instance_metrics CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci; -- different from d_b_workspace_instance
SELECT * FROM d_b_workspace_instance wsi JOIN d_b_workspace_instance_metrics me ON me.instanceId = wsi.id; -- should FAIL
SELECT * FROM d_b_workspace_instance wsi JOIN d_b_workspace_instance_metrics me ON me.instanceId COLLATE utf8mb4_general_ci = wsi.id; -- should WORK

Otherwise the change works! 👍
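
Why the second query fails and the third works can be modelled with MySQL's coercibility rule. The sketch below is a deliberately simplified model, not MySQL's full algorithm: it only distinguishes explicit COLLATE clauses (coercibility 0) from column values (coercibility 2), assumes both operands use non-binary collations of the same character set, and uses hypothetical names (Operand, comparisonCollation).

```typescript
// An operand of a comparison: its collation and whether it came from an explicit COLLATE clause.
type Operand = { collation: string; explicit: boolean };

// Returns the collation MySQL would use for `a = b`, or throws when the mix is illegal.
// Simplified: ignores literals, _bin collations, and Unicode/non-Unicode conversion.
function comparisonCollation(a: Operand, b: Operand): string {
    const ca = a.explicit ? 0 : 2; // lower coercibility wins
    const cb = b.explicit ? 0 : 2;
    if (ca !== cb) {
        return ca < cb ? a.collation : b.collation;
    }
    if (a.collation === b.collation) {
        return a.collation;
    }
    // Same coercibility but different collations: MySQL reports an illegal mix.
    throw new Error(`Illegal mix of collations (${a.collation}) and (${b.collation})`);
}
```

Under this model, two plain columns with different collations throw, while adding COLLATE to one side resolves the comparison to that collation; adding the same COLLATE to both sides gives the same result, which is why the second clause was called superfluous in the review.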

geropl (Member) left a comment

Changes LGTM, tested and works! ✔️

@roboquat roboquat merged commit e93eb6c into main Jul 1, 2025
77 of 78 checks passed
@roboquat roboquat deleted the hw/db-coll branch July 1, 2025 10:07
3 participants