Skip to content

airflow db clean command errors when attempting to delete rows still being referenced #59474

@rjh-yext

Description

@rjh-yext

Apache Airflow version

Other Airflow 3 version (please specify below)

If "Other Airflow 3 version" selected, which one?

apache/airflow:3.1.3 Docker image

What happened?

When using the airflow db clean command, an error occurs when deleting rows from the dag_version table. This is caused by a violation of the task_instance_dag_version_id_fkey constraint added by revision id 3ac9e5732b1f, see error message below.

What appears to be happening is that there can be rows in the task_instance table that are not being deleted because they do not fall within the age limit criteria, that reference rows in the dag_version table that are being deleted because they do fall within the age limit criteria.

The db clean command is removing rows from both the task_instance and dag_version table based solely on whether they fit the age limit criteria, and does not take into consideration any fk constraints. I believe this means:

  • task_instance uses columns start_date or updated_at
  • dag_version uses last_updated or created_at

Error message:

sqlalchemy.exc.IntegrityError: (MySQLdb.IntegrityError) (1451, 'Cannot delete or update a parent row: a foreign key constraint fails (airflow.task_instance, CONSTRAINT task_instance_dag_version_id_fkey FOREIGN KEY (dag_version_id) REFERENCES dag_version (id) ON DELETE RESTRICT)') 06:02:09  [SQL: DELETE FROM dag_version USING dag_version, _airflow_deleted__dag_version__20251215060205 WHERE dag_version.id = _airflow_deleted__dag_version__20251215060205.id]

What you think should happen instead?

No response

How to reproduce

Have a dag that has not been updated for a period of time, but is run recently.

Run airflow db clean with a date before the recent dag run, but after the date the dag was last updated.

We should expect the dag_version row to be deleted, but the task_instance row to be kept, which causes the problem.

Operating System

Debian GNU/Linux 12 (bookworm)

Versions of Apache Airflow Providers

No response

Deployment

Other Docker-based deployment

Deployment details

No response

Anything else?

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

Metadata

Metadata

Assignees

Labels

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions