[26.0] Fix validation of certain classes of text validators in tools.#22280

Merged
mvdbeek merged 2 commits intogalaxyproject:release_26.0from
jmchilton:optional_text_validators_26_0
Mar 31, 2026

@jmchilton
Member

This is causing issues in the workflow work - a couple tools use this pattern that the tool request API was just rejecting:

        <param name="title" type="text" value="" optional="true" label="Report title" help="It is printed as page header">
            <sanitizer invalid_char="">
                <valid initial="string.letters,string.digits">
                    <add value=","/>
                    <add value=":"/>
                    <add value="-"/>
                    <add value="_"/>
                    <add value=" "/>
                    <add value="."/>
                </valid>
            </sanitizer>
            <validator type="regex">[0-9a-zA-Z,: _.-]+</validator>
        </param>

Rejecting empty values is the right thing to do with that regex - but Galaxy has a bug I guess that lets it through and so we just need to do that forever I think.

Random Claude Table:

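| Input | optional (no default) | optional value="" | optional value="" + regex |
|-------|----------------------|-------------------|--------------------------|
| {}    | None                 | ""                | ""                        |
| None  | None                 | None              | None                      |
| ""    | ""                   | ""                | ""                        |
| valid | value                | value             | value                     |
| bad   | value                | value             | fails                     |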

All rows now consistent across legacy, 21.01, and request formats.

How to test the changes?

(Select all options that apply)

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

jmchilton and others added 2 commits March 27, 2026 13:32
Optional text params with regex (or other) validators would reject
None/"" in request format even though the runtime correctly skips
validation for empty optional params. Add optional flag to validator
chain so None/"" short-circuit before reaching static validators.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

strictify() and _fill_default_for() in convert.py ignored
parameter.default_value for optional text params, hardcoding None
for absent values. Additionally both used `or` which swallowed
empty string defaults due to Python falsiness.

This caused inconsistent behavior for absent params on tools with
value="" across input formats:

| Input | optional (no default) | optional value="" | optional value="" + regex |
|-------|----------------------|-------------------|--------------------------|
| {}    | None                 | ""                | ""                        |
| None  | None                 | None              | None                      |
| ""    | ""                   | ""                | ""                        |
| valid | value                | value             | value                     |
| bad   | value                | value             | fails                     |

All rows now consistent across legacy, 21.01, and request formats.

Test tools added:
- gx_text_optional_with_empty_default (optional text, value="", no validator)
- gx_text_optional_with_empty_default_regex_validation (the wild case)
- gx_text_optional_regex_validation updated to have no default

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
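
The falsiness problem described in the second commit message can be sketched as follows. This is a minimal illustration, not the actual code of `strictify()` or `_fill_default_for()`: the function names and the `state` dict are hypothetical and only show why `or` cannot tell `""` apart from `None`.

```python
from typing import Any, Dict, Optional


def fill_absent_with_or(state: Dict[str, Any], name: str, default_value: Optional[str]) -> Optional[str]:
    # Buggy pattern (hypothetical): "" is falsy, so an explicit "" or a
    # value="" default falls through `or` and collapses to None.
    return state.get(name) or default_value or None


def fill_absent_explicit(state: Dict[str, Any], name: str, default_value: Optional[str]) -> Optional[str]:
    # Only substitute the default when the key is truly absent;
    # an explicit None or "" supplied by the caller is preserved.
    if name in state:
        return state[name]
    return default_value
```

Under these assumptions, `fill_absent_explicit({}, "title", "")` returns `""`, which matches the `{}` row of the table for a parameter declared with `value=""`, while the `or`-based variant returns `None` for the same input.
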
@nsoranzo
Member

> Rejecting empty values is the right thing to do with that regex

Haven't looked at the code yet (and I don't understand Claude's table), but I think that in your example optional="true" has precedence over the regex validator (i.e. an empty value should always be accepted).
But I may be misunderstanding all this completely, so feel free to ignore.

@jmchilton jmchilton changed the title Optional text validators 26 0 Fix validation of certain classes of text validators in tools. Mar 27, 2026
@jmchilton
Member Author

@nsoranzo I cannot comprehend that as a reasonable way to interpret these tools but you can and Galaxy can and that is enough - this "fixes" the validation work for your interpretation.

@github-actions github-actions Bot changed the title Fix validation of certain classes of text validators in tools. [26.0] Optional text validators 26 0 Mar 27, 2026
@mvdbeek
Member

mvdbeek commented Mar 28, 2026

It's a bit of an undefined thing isn't it? Optional should mean null/None is allowed, but how should we apply a regex to null/None anyway? An empty string is allowed on required text parameters, why would the optional make a difference here?

@jmchilton jmchilton changed the title [26.0] Optional text validators 26 0 [26.0] Optional text validators specification fixes Mar 28, 2026
@jmchilton jmchilton changed the title [26.0] Optional text validators specification fixes [26.0] Fix validation of certain classes of text validators in tools. Mar 28, 2026
@jmchilton
Member Author

I don't want to argue about whether this makes sense - I don't agree with y'all but I don't think I should have to right? I'm fixing a bug and making it work the way y'all seem to think it should?

@mvdbeek
Member

mvdbeek commented Mar 30, 2026

I think we all want this to work well? I agree with you? I don't think optional should take precedence over a validator?

@jmchilton
Member Author

jmchilton commented Mar 31, 2026

@mvdbeek I promised you this report at today's meeting - it took 30% of a 5-hour usage window but I got it. The scope of the problem is not small - I think it is too late to fix this without a profile version change. I have a PR forthcoming that hardens and tests the JSON schema generated from our models against parameter_specification - the pattern is capturable there and works fine. I wrote a whole separate pipeline in TypeScript that includes meta-model validation with 100% coverage of parameter_specification.xml and 95% JSON schema coverage that matches Galaxy's generation in Python - this is all handled in there. The purpose of having a parameter specification is exactly to catch these sorts of oddities, document them, and ensure all of our handling around them works fine, and it does. I wish the tool XML was more intuitive to me - but @nsoranzo thinks it makes sense and he is smarter than me - I think we just work around the disagreement and let the implementation describe the expected behavior for now.

Perhaps a place to start the pushback would be to implement a linter that looks at optional text parameters with regex validators and ensures the regex validator accepts the empty string? Because while @nsoranzo might think this makes sense as a syntax he would probably agree it would read clearer if

<validator type="regex">[0-9a-zA-Z,: _.-]+</validator>

was

<validator type="regex">[0-9a-zA-Z,: _.-]*</validator>
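
A rough sketch of what such a linter check could look like. Everything here is an assumption for illustration: `lint_optional_text_regex` is a hypothetical standalone function, not part of Galaxy's linting framework, and it assumes the regex validator uses `re.match` semantics (anchored at the start only).

```python
import re
import xml.etree.ElementTree as ET
from typing import List


def lint_optional_text_regex(tool_xml: str) -> List[str]:
    """Warn about optional text params whose regex validator rejects ""."""
    warnings = []
    root = ET.fromstring(tool_xml)
    for param in root.iter("param"):
        is_text = param.get("type") == "text"
        is_optional = param.get("optional", "false").lower() == "true"
        if not (is_text and is_optional):
            continue
        name = param.get("name") or param.get("argument") or "?"
        for validator in param.findall("validator"):
            if validator.get("type") != "regex":
                continue
            expression = validator.text or ""
            try:
                accepts_empty = re.match(expression, "") is not None
            except re.error:
                # An invalid expression is a different lint problem; skip it here.
                continue
            if not accepts_empty:
                warnings.append(
                    f"param '{name}': regex {expression!r} rejects the empty string"
                )
    return warnings
```

Run against the two validators above, a check like this would flag the `+` form and stay silent on the `*` form.
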

IWC Workflows Failing Validation: Optional Text Params with Regex Validators

Commit: 305a2e84 "Skip pydantic validators for optional text params with None/empty values"

Problem: Optional text parameters with regex validators reject None/"" during Pydantic
model validation, even though Galaxy's runtime correctly skips validation for empty optional
params. Without the fix, 57/120 IWC workflows fail roundtrip validation.

Fix: When optional=True and value is None or "", short-circuit before static validators.
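
The short-circuit can be sketched in plain Python. This is not the code from the commit: `validate_text` is a hypothetical helper, and it assumes the regex validator behaves like `re.match`.

```python
import re
from typing import Optional


def validate_text(value: Optional[str], pattern: str, optional: bool) -> Optional[str]:
    """Hypothetical helper: return the value if accepted, raise ValueError otherwise."""
    # Short-circuit: an unset optional parameter never reaches the regex.
    if optional and (value is None or value == ""):
        return value
    if value is None:
        raise ValueError("a value is required")
    if re.match(pattern, value) is None:
        raise ValueError(f"{value!r} does not match {pattern!r}")
    return value


# Pattern taken from the MultiQC "title" parameter quoted in this report.
MULTIQC_TITLE = r"[0-9a-zA-Z,: _.-]+"
```

With `optional=True`, both `None` and `""` pass through untouched; with `optional=False`, the same `""` reaches the regex and is rejected.
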


Example 1: MultiQC — title and comment

Affects 57 workflows (every IWC workflow using any MultiQC version).

Tool XML

tools-iuc/tools/multiqc/multiqc.xml:493-518

<param name="title" type="text" value="" optional="true" label="Report title" help="...">
    <sanitizer invalid_char="">
        <valid initial="string.letters,string.digits">
            <add value=","/><add value=":"/><add value="-"/>
            <add value="_"/><add value=" "/><add value="."/>
        </valid>
    </sanitizer>
    <validator type="regex">[0-9a-zA-Z,: _.-]+</validator>
</param>
<param name="comment" type="text" value="" optional="true" label="Custom comment" help="...">
    <sanitizer invalid_char="">
        <valid initial="string.letters,string.digits">
            <add value=","/><add value=":"/><add value="-"/>
            <add value="_"/><add value=" "/><add value="."/>
        </valid>
    </sanitizer>
    <validator type="regex">[0-9a-zA-Z,: _.-]+</validator>
</param>

IWC Workflow Step

iwc/workflows/transcriptomics/rnaseq-sr/rnaseq-sr.ga — step 23

tool_id: toolshed.g2.bx.psu.edu/repos/iuc/multiqc/multiqc/1.33+galaxy0

Relevant tool_state fragment:

{
  "title": "",
  "comment": ""
}

Both title and comment are "" in the workflow. The regex [0-9a-zA-Z,: _.-]+ requires
one-or-more characters, so "" fails the regex validator. Without the fix, Pydantic applies the
regex to "" and rejects it; with the fix, the validator short-circuits on empty optional values.


Example 2: bcftools_consensus — mark_del and absent

Affects 6+ workflows (SARS-CoV-2 variant calling pipelines).

Tool XML

tools-iuc/tools/bcftools/bcftools_consensus.xml:171-176

<param argument="--absent" type="text" value="" optional="true"
       label="Mark absent"
       help="Replace reference bases at positions absent from the VCF input with a custom character.">
    <validator type="regex">^.$</validator>
</param>
<param argument="--mark-del" type="text" value="" optional="true"
       label="Mark deletions"
       help="Instead of removing the reference base at deleted positions, replace the base with a custom character.">
    <validator type="regex">^.$</validator>
</param>

IWC Workflow Step

iwc/workflows/sars-cov-2-variant-calling/sars-cov-2-consensus-from-variation/consensus-from-variation.ga — step 21

tool_id: toolshed.g2.bx.psu.edu/repos/iuc/bcftools_consensus/bcftools_consensus/1.15.1+galaxy4

Relevant tool_state fragment:

{
  "sec_default": {
    "mark_del": "",
    "absent": ""
  }
}

The regex ^.$ requires exactly one character. "" fails. Same pattern: optional text param
left empty in the workflow, regex validator rejects empty string without the short-circuit.


Example 3: compleasm — specified_contigs

Affects 30 workflows (all VGP assembly pipelines).

Tool XML

tools-iuc/tools/compleasm/compleasm.xml:38-45

<param argument="--specified_contigs" type="text" optional="true"
       label="Specify the contigs to be evaluated"
       help="e.g. chr1 chr2 chr3. If not specified, all contigs will be evaluated">
    <sanitizer invalid_char="">
        <valid initial="string.letters,string.digits">
            <add value="_" />
        </valid>
    </sanitizer>
    <validator type="regex">[0-9a-zA-Z_ ]+</validator>
</param>

IWC Workflow Step

iwc/workflows/VGP-assembly-v2/Assembly-Hifi-only-VGP3/Assembly-Hifi-only-VGP3.ga — step 39

tool_id: toolshed.g2.bx.psu.edu/repos/iuc/compleasm/compleasm/0.2.6+galaxy3

Relevant tool_state fragment:

{
  "specified_contigs": null
}

The regex [0-9a-zA-Z_ ]+ requires one-or-more characters. The workflow stores null for this
unused optional param. Without the fix, Pydantic converts null to None and the regex
validator receives None, raising a type error or match failure.
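
The three patterns quoted in the examples can be checked directly against the empty string. The sketch below assumes `re.match` semantics for the regex validator; the `PATTERNS` dict and the `accepts` helper are illustrative names only.

```python
import re

# Patterns copied from the tool XML fragments quoted in the examples above.
PATTERNS = {
    "multiqc title/comment": r"[0-9a-zA-Z,: _.-]+",
    "bcftools_consensus absent/mark_del": r"^.$",
    "compleasm specified_contigs": r"[0-9a-zA-Z_ ]+",
}


def accepts(pattern: str, value: str) -> bool:
    # True when the pattern matches starting at the beginning of the value.
    return re.match(pattern, value) is not None


# For each pattern, record whether it rejects the empty string.
rejects_empty = {name: not accepts(pattern, "") for name, pattern in PATTERNS.items()}
```

Every entry in `rejects_empty` is `True`, which is why a workflow that stores `""` for these parameters fails validation unless empty optional values are skipped.
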


Summary of Affected IWC Workflow Categories

| Category | Failing Workflows | Primary Culprit Tool |
|----------|-------------------|----------------------|
| VGP assembly | 10 | compleasm, gfastats, hifiasm, genomescope, yahs |
| Amplicon | 6 | multiqc |
| Epigenetics | 6 | multiqc, cutadapt |
| SARS-CoV-2 | 6 | multiqc, bcftools_consensus, bcftools_annotate |
| Microbiome | 8 | multiqc, bakta, omark |
| Transcriptomics | 2 | multiqc, cutadapt |
| Variant calling | 3 | multiqc, bwa_mem, bcftools_norm |
| scRNAseq | 2 | multiqc |
| Bacterial genomics | 2 | multiqc, bakta |
| Genome annotation | 3 | multiqc, helixer, gffread |
| Metabolomics | 2 | multiqc, xcms |
| Proteomics | 1 | multiqc, pepquery2 |
| Virology | 3 | multiqc |
| Read preprocessing | 1 | multiqc, cutadapt |

Total: 57/120 IWC workflows fail without the fix.

@mvdbeek
Member

mvdbeek commented Mar 31, 2026

arg, ok, i guess this is probably also all because we just don't discriminate "" from null in client. It's good to have those examples, thank you!

@mvdbeek mvdbeek merged commit 7a6e4c9 into galaxyproject:release_26.0 Mar 31, 2026
59 of 66 checks passed
@github-project-automation github-project-automation Bot moved this from Needs Review to Done in Galaxy Dev - weeklies Mar 31, 2026
@jmchilton
Member Author

I get the "arg" trust me - want me to implement the linter?

@nsoranzo nsoranzo deleted the optional_text_validators_26_0 branch March 31, 2026 18:42
@nsoranzo
Member

To give a bit more details on why I think our current syntax makes sense to me: it feels natural to me that optional is applied as a first filter (set/not set), and the regex validator as a second filter if the parameter is set. All the other validators apply in the same way, and in fact https://docs.galaxyproject.org/en/master/dev/schema.html#tool-inputs-param-validator says:

> Note that validators for parameters with optional="true" are not executed if no value is given.

This keeps the regex simpler as well.

@github-actions

This PR was merged without a "kind/" label, please correct.

@ahmedhamidawan ahmedhamidawan modified the milestones: 26.1, 26.0 Apr 6, 2026
4 participants