Implement more tool validation APIs in Galaxy to mirror the Tool Shed. #22196

Draft
jmchilton wants to merge 3 commits into galaxyproject:dev from jmchilton:tool_schema_apis
Conversation

@jmchilton
Member

Add /parsed and versioned schema endpoints to Galaxy tools API.

Mirror ToolShed tool endpoints on Galaxy's API:

  • GET /api/tools/{id}/parsed — ParsedTool JSON (inputs + outputs)
  • GET /api/tools/{id}/versions/{v}/parsed — versioned variant
  • GET /api/tools/{id}/versions/{v}/parameter_request_schema
  • GET /api/tools/{id}/versions/{v}/parameter_landing_request_schema
  • GET /api/tools/{id}/versions/{v}/parameter_test_case_xml_schema

Enables galaxy-tool-cache to fetch tool metadata from a running Galaxy instance in addition to the ToolShed. This makes it easier to work around various Tool Shed bugs (#22181, #22007) found during development, and it is needed longer term for custom Galaxy tools on private Galaxy instances that are not deployed to the Tool Shed.
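
As a rough illustration of how a client might address these endpoints, here is a minimal sketch that only builds the URLs. The helper name `tool_metadata_urls`, the example host, and the choice to percent-encode the tool ID while leaving slashes intact are assumptions made for illustration; they are not part of this PR or of galaxy-tool-cache.

```python
from urllib.parse import quote

# Endpoint suffixes taken from the list in this PR description.
VERSIONED_SUFFIXES = (
    "parsed",
    "parameter_request_schema",
    "parameter_landing_request_schema",
    "parameter_test_case_xml_schema",
)


def tool_metadata_urls(base_url: str, tool_id: str, tool_version: str) -> dict:
    """Build URLs for the endpoints above (hypothetical helper, not part of this PR)."""
    # Assumption: slashes in the tool ID are left as-is, everything else is percent-encoded.
    root = f"{base_url.rstrip('/')}/api/tools/{quote(tool_id, safe='/')}"
    urls = {"parsed": f"{root}/parsed"}
    versioned = f"{root}/versions/{quote(tool_version, safe='')}"
    for suffix in VERSIONED_SUFFIXES:
        urls[f"versioned_{suffix}"] = f"{versioned}/{suffix}"
    return urls


# Example with a made-up host, tool ID, and version.
urls = tool_metadata_urls("https://galaxy.example.org", "cat1", "1.0.0")
```

Each resulting URL would be fetched with a plain GET; authentication and error handling are left out of the sketch.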

How to test the changes?

(Select all options that apply)

  • I've included appropriate automated tests.
  • This is a refactoring of components with existing test coverage.

License

  • I agree to license these and all my past contributions to the core galaxy codebase under the MIT license.

Comment thread lib/galaxy/tools/__init__.py Outdated
    def history_manager(self):
        return self.app.history_manager

    @property
Member

Have you looked at how expensive this is for complex tools? What happens when you repeatedly build a tool: do we build the dynamic pydantic models, and do they stay in memory? I guess this is a bit "late" as a review comment, but this seems like a potential way to exhaust memory?

Independently of the answer, I guess I would be in favor of not making that a property, since there is some non-trivial amount of work happening here?

Member Author

This doesn't generate those dynamic models though, right? This is just the meta model itself, which is pretty compact. I'll drop the property though. I guess things that use this do generate those dynamic models, though.

About the dynamic models we generate from the meta models - I have been repeatedly validating all of the IWC workflows in a single Python process, which spans many thousands of steps and presumably hundreds of tools (though I haven't seen that number), and it works surprisingly quickly and I've never hit any sort of memory or performance issue. But that is just one data point - I cannot make any promises about production, obviously.

Do you want me to write unit tests that repeatedly call these things and try to determine whether they leak memory or something?

Comment thread lib/galaxy/webapps/galaxy/api/tools.py Outdated
    @router.get(
        "/api/tools/{tool_id}/parsed",
        operation_id="tools__parsed",
        summary="Return Galaxy's meta model description of the tool's inputs and outputs.",
Member

Maybe schema, metaschema, or meta model would be more appropriate as route paths? It's all parsed at this point (except for raw?).

Member Author

I agree the name sucks - it is worse in the API, but obviously it is reflecting ParsedTool, which... maybe makes some sense but isn't the best. It is a tool that has been parsed but has no runtime component - but you could say that about any of the JSON outputs from /api/tools/{tool_id}. I don't have a good replacement name but I'll work on it - I agree this stinks.

Member Author

I went on a walk to brainstorm new names - they mostly all sucked:

Interop
Schema
MetaSchema (because the tool inputs are used to derive particular JSONSchema)
Shed (bad)
Standalone
Runtimeless
Runtimefree
Isolated
SourceObject (meh)
Source (it isn’t source)
Interface
Shape
Parboiled (not raw but low-level)

Of these, I think I only like interop better than parsed. The model was developed for the Tool Shed, but I also have a project now that can take these models and do both TypeScript and TypeScript-derived JSON Schema validation in TypeScript of basically all of the thousands of parameters_specification.yml tests. I think this proves the interoperability. I'll redo this work and bake that change in.

        return json_schema_response_for_tool_state_model(LandingRequestToolState, inputs)

    @router.get(
        "/api/tools/{tool_id}/versions/{tool_version}/parameter_test_case_xml_schema",
Member

Do we want to have xml in the route name?

Member Author

Yeah - the XML test case schema is subtly different from the JSON test case schema, and this is reflected in the parameter state type or whatever it is called. This seems fine to me and was intentional.

jmchilton and others added 3 commits March 30, 2026 15:19
Mirror ToolShed tool endpoints on Galaxy's API:
- GET /api/tools/{id}/parsed — ParsedTool JSON (inputs + outputs)
- GET /api/tools/{id}/versions/{v}/parsed — versioned variant
- GET /api/tools/{id}/versions/{v}/parameter_request_schema
- GET /api/tools/{id}/versions/{v}/parameter_landing_request_schema
- GET /api/tools/{id}/versions/{v}/parameter_test_case_xml_schema

Enables galaxy-tool-cache to fetch tool metadata from a running Galaxy
instance instead of requiring ToolShed access.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@mvdbeek
Member

mvdbeek commented Apr 7, 2026

This needs a schema rebuild

@mvdbeek mvdbeek marked this pull request as draft April 7, 2026 12:47
@mvdbeek mvdbeek moved this from Needs Review to In Progress in Galaxy Dev - weeklies Apr 7, 2026
Projects

Status: In Progress

2 participants