[release-1.16] Add CI check for invalid characters in file paths#9689
[release-1.16] Add CI check for invalid characters in file paths#9689kaovilai wants to merge 1 commit intovelero-io:release-1.16from
Conversation
Go's module zip rejects filenames containing certain characters (shell special chars like " ' * < > ? ` |, path separators : \, and non-letter Unicode such as control/format characters). This caused a build failure when a changelog file contained an invisible U+200E LEFT-TO-RIGHT MARK (see PR velero-io#9552). Add a GitHub Actions workflow that validates all tracked file paths on every PR to catch these issues before they reach downstream consumers. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com> (cherry picked from commit 6d18d9b)
There was a problem hiding this comment.
Pull request overview
Adds a GitHub Actions workflow to validate tracked file paths on pull requests to ensure compatibility with Go module zip filename restrictions (catching invalid ASCII characters and non-letter Unicode such as control/format characters).
Changes:
- Introduces a new PR-triggered workflow to scan
git ls-filesoutput for invalid filename characters. - Emits GitHub Actions error annotations and rename suggestions when invalid paths are detected.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Characters explicitly rejected by Go's fileNameOK | ||
| # (path separators / and \ are inherent to paths so we check per-element) | ||
| bad_ascii = set('\"' + \"'\" + '*<>?\`|:') | ||
|
|
There was a problem hiding this comment.
bad_ascii is declared but never used. This adds noise and makes it unclear whether the check is meant to be a whitelist (via allowed_ascii) or a blacklist (via bad_ascii). Remove bad_ascii or incorporate it into the validation logic.
| # Characters explicitly rejected by Go's fileNameOK | |
| # (path separators / and \ are inherent to paths so we check per-element) | |
| bad_ascii = set('\"' + \"'\" + '*<>?\`|:') | |
| # ASCII characters allowed by Go's fileNameOK in addition to letters/digits | |
| # (path separators / and \ are inherent to paths so we check per-element) |
| clean = '/'.join( | ||
| ''.join(c for c in elem if is_ok(c)) | ||
| for elem in name.split('/') | ||
| ) | ||
| print(f'::error file={name}::File \"{name}\" contains invalid char {char_desc}') | ||
| bad_files.append((name, clean, char_desc)) |
There was a problem hiding this comment.
The ::error file={name}::... workflow command needs property value escaping (e.g., %, ,, \r, \n) to be parsed reliably by GitHub Actions. Since allowed_ascii includes characters like % and ,, a valid path containing them could break the annotation. Either escape name per the workflow command spec or omit the file= property and report the path only in the message.
| # Check each path element (split on /) | ||
| for element in name.split('/'): | ||
| for ch in element: | ||
| if not is_ok(ch): | ||
| cp = ord(ch) | ||
| char_name = unicodedata.name(ch, f'U+{cp:04X}') | ||
| char_desc = f'U+{cp:04X} ({char_name})' | ||
| # Build cleaned path by stripping invalid chars | ||
| clean = '/'.join( | ||
| ''.join(c for c in elem if is_ok(c)) | ||
| for elem in name.split('/') | ||
| ) | ||
| print(f'::error file={name}::File \"{name}\" contains invalid char {char_desc}') | ||
| bad_files.append((name, clean, char_desc)) | ||
| break |
There was a problem hiding this comment.
After finding an invalid character, the inner break only exits the character loop for the current path element; the code then continues scanning other elements and can emit multiple errors / duplicate entries for the same file. If you only want one report per file, consider breaking out of the element loop (or tracking that the file has already been flagged) once the first invalid character is found.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## release-1.16 #9689 +/- ##
=============================================
Coverage 59.84% 59.84%
=============================================
Files 370 370
Lines 32034 32034
=============================================
Hits 19172 19172
Misses 11362 11362
Partials 1500 1500 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Is it needed for release-1.16? IMO we only need to for |
Cherry-pick of #9553 to release-1.16.
Go's module zip rejects filenames containing certain characters (shell special chars, path separators, and non-letter Unicode such as control/format characters). This adds a GitHub Actions workflow that validates all tracked file paths on every PR to catch these issues before they reach downstream consumers.
No filename fixes needed on this branch.
Note
Responses generated with Claude