Open
Description
// The path to which the volume MAY be staged. It MUST be an
// absolute path in the root filesystem of the process serving this
// request, and MUST be a directory. The CO SHALL ensure that there
// is only one `staging_target_path` per volume. The CO SHALL ensure
// that the path is directory and that the process serving the
// request has `read` and `write` permission to that directory. The
// CO SHALL be responsible for creating the directory if it does not
// exist.
// This is a REQUIRED field.
string staging_target_path = 3;
The staging target path is carved out by the CO and passed to the SP. From the spec, it is not clear whether the SP has to stage the volume on this exact target path.
Is it mandatory?
If it is not mandatory, I see a couple of issues.
- The SP doesn't have a mechanism to pass the new target path to the CO via `NodeStageVolumeResponse`.
- The cleanup operation for a staged volume actually happens on the path provided by the CO (e.g. Kubernetes). Here I am referring to the `removeMountDir` code in Kubernetes.
func removeMountDir(plug *csiPlugin, mountPath string) error {
klog.V(4).Info(log("removing mount path [%s]", mountPath))
....
if !mnt {
klog.V(4).Info(log("dir not mounted, deleting it [%s]", mountPath))
if err := os.Remove(mountPath); err != nil && !os.IsNotExist(err) {
return errors.New(log("failed to remove dir [%s]: %v", mountPath, err))
}
....
Any thoughts?
humblec commented on Aug 30, 2019
@msau42 @jsafrane
Madhu-1 commented on Aug 30, 2019
If it becomes mandatory, then since the staging path is a directory, what about block volumes?
msau42 commented on Aug 30, 2019
The CO will pass around the staging_target_directory from NodeStageVolume to NodePublishVolume. As an SP, you stage your volume however you like inside the staging_target_directory. For example, you can stage it directly at the staging_target_directory, or at a subdirectory or file underneath. You can also store other files/metadata inside the staging_target_directory if you like.
gnufied commented on Aug 30, 2019
There is an interesting consequence of the SP storing metadata under the staging target dir: I am noticing that if stage fails, then k8s tries to remove the staging directory (https://github.com/kubernetes/kubernetes/blob/master/pkg/volume/csi/csi_attacher.go#L330), but if it had files, that would fail. It is somewhat messy... :(
humblec commented on Aug 30, 2019
@msau42 thanks, but as mentioned in the problem description, if the SP puts a directory or file inside the staging target path directory, the `removeMountDir` call that gets triggered by a timeout or similar condition fails. The main reason is that it uses `if err := os.Remove(mountPath);`, in short `os.Remove`. Maybe `os.RemoveAll` could have helped here, but at present it fails with an error.
Afaict, all the code paths in `attacher` basically make use of the `staging target path` (`.../globalmount`) for their operations, such as cleanup after a failure while staging, and not the actual `staged` path inside the `staging target path`. This looks problematic.
One other issue I noticed: when the 2-minute timeout occurs while staging is in progress, the CO starts a cleanup process, which can fail with an error like `directory not empty`. At this point, the CO thinks that the high-level operation, i.e. staging itself, failed for the workload, and it puts the workload on a new node even if the pod has an `RWO` volume in it. This causes double staging (on both the old and the new node at the same time) for the workload, and thus filesystem data corruption, etc.
msau42 commented on Aug 30, 2019
Are you saying there is a problem with the normal Stage->Publish->Unpublish->Unstage flow? Or there's only a problem when Stage fails? I think it should be up to the SP to make sure to undo/cleanup when they abort an operation
humblec commented on Aug 31, 2019
@msau42 in short, kubernetes/kubernetes#82190 covers the issue I was mentioning.
Mention about the possibilities of using staging_target_path at NodeS…
jdef commented on Sep 10, 2019
Is there actually a spec issue here, or is this some k8s/CSI implementation issue?