
[mount] Addition of "checkAndRepairXfsFilesystem" inadvertently prevents XFS self-recovery via mounting #141

Closed

Description

@nktpro

PR #126 added an extra step that runs xfs_repair before mounting an XFS file system. However, instead of automatically correcting FS issues caused by prior unclean shutdowns, it actually prevents that auto-recovery from happening, which leads to complete unavailability of the corresponding volume and subsequently requires manual human intervention.
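
For context, the pre-mount flow this results in looks roughly like the sketch below. It is illustrative only: the function name, signature, and simplified xfs_repair/mount invocations are stand-ins, not the actual checkAndRepairXfsFilesystem code.

```go
package mounter

import (
	"fmt"
	"os/exec"
)

// mountXfs sketches the behaviour introduced by PR #126: xfs_repair runs
// before the filesystem has ever been mounted. Names and commands are
// illustrative stand-ins, not the real mounter code.
func mountXfs(device, target string) error {
	// Pre-mount repair step added by PR #126.
	if out, err := exec.Command("xfs_repair", device).CombinedOutput(); err != nil {
		// After an unclean shutdown the XFS log is dirty, so xfs_repair
		// refuses to proceed and the volume never reaches the mount below.
		return fmt.Errorf("'xfs_repair' found errors on device %s but could not correct them: %s", device, out)
	}
	// Only reached when xfs_repair succeeds; a volume that merely needs its
	// log replayed by a mount never gets this far.
	return exec.Command("mount", "-t", "xfs", device, target).Run()
}
```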

The sequence of events is as follows:

  1. A node loss / unclean shutdown occurs.
  2. A stateful pod is restarted on another healthy node; its volume is re-attached to the new node.
  3. xfs_repair is run against the volume. The relevant logs look like the following (these are from rook-ceph, but the same applies to any other user of the mounter):
Filesystem corruption was detected for /dev/rbd1, running xfs_repair to repair
ID: 29 Req-ID: 0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507 failed to mount device path (/dev/rbd1) to staging path (/var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-423b5a86-7c03-43c4-a7e9-4921934016de/globalmount/0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507) for volume (0001-0009-rook-ceph-0000000000000001-9adb43bf-4e25-11ea-aa19-2ecc193be507) error 'xfs_repair' found errors on device /dev/rbd1 but could not correct them: Phase 1 - find and verify superblock...

Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.
  4. The volume is consequently prevented from being mounted, and manual intervention is required.
  5. All that is needed at that point is to manually mount the volume, which replays the XFS log automatically, then unmount it and restart the corresponding pod.

Note that step #5 is exactly what happened automatically prior to this change: the volume was simply mounted, without any attempt at an FS check or xfs_repair, and it corrected itself as part of being mounted, as per XFS's design.

The recommended fix is to attempt xfs_repair only if mounting actually fails, as a last resort. There should be no need to run xfs_repair before a mount has failed.

Alternatively, don't bail out when xfs_repair returns an error; let the mount attempt happen anyway. The filesystem will then either fix itself, or the mount will fail with another error.
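
A rough sketch of the recommended mount-first ordering, under the same caveats as above (illustrative names and simplified commands, not the actual mounter code):

```go
package mounter

import (
	"fmt"
	"os/exec"
)

// mountXfs sketches the proposed ordering: mount first, and treat xfs_repair
// strictly as a last resort. Names and commands are illustrative stand-ins.
func mountXfs(device, target string) error {
	// Mounting replays a dirty XFS log automatically, which covers the
	// ordinary unclean-shutdown case without any repair step.
	if err := exec.Command("mount", "-t", "xfs", device, target).Run(); err == nil {
		return nil
	}
	// Only when the mount itself fails does xfs_repair get involved.
	if out, err := exec.Command("xfs_repair", device).CombinedOutput(); err != nil {
		return fmt.Errorf("mount failed and xfs_repair could not correct %s: %s", device, out)
	}
	// Retry the mount after a successful repair.
	return exec.Command("mount", "-t", "xfs", device, target).Run()
}
```

The alternative fix would instead keep the current ordering but simply drop the early return on an xfs_repair error and proceed to the mount regardless.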

Relevant issue from rook-ceph repo: rook/rook#4914

CC'ing @27149chen

Labels

kind/bug: Categorizes issue or PR as related to a bug.
priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now.
