Merged
35 commits
50b7fbd
Adds support for custom headers in URL fetch requests
davelopez Sep 18, 2025
a823471
Adds header encryption utilities using Vault system
davelopez Sep 18, 2025
19b3db7
Replaces hardcoded tool ID with a constant
davelopez Sep 18, 2025
b035a43
Adds header encryption/decryption for tool landing requests
davelopez Sep 18, 2025
37fb0a0
Adds integration test for encrypted sensitive headers in landing requ…
davelopez Sep 19, 2025
d66a7e5
Adds header encryption for workflow landings
davelopez Sep 19, 2025
a80d669
Add integration test for workflow landing header encryption
davelopez Sep 19, 2025
385d709
Simplifies sensitive header pattern matching
davelopez Sep 19, 2025
a9ede65
Adds logging for encryption/decryption failures in landing requests
davelopez Sep 19, 2025
e879af8
Refactors header encryption/decryption logic into helper methods
davelopez Sep 19, 2025
16fb142
Adds logging for missing vault keys in header decryption
davelopez Sep 19, 2025
a91fc73
Let encrypt/decrypt headers fail fast
davelopez Sep 22, 2025
6a87662
Adds recursive sensitive header detection utility
davelopez Sep 22, 2025
089a589
Enforce vault configuration when sensitive headers are present
davelopez Sep 22, 2025
6d29408
Introduce configurable URL header allow-list
davelopez Oct 7, 2025
9be9081
Use configurable patterns for header sensitivity
davelopez Oct 7, 2025
ae1e89b
Update encryption/decryption API for URL-aware config
davelopez Oct 7, 2025
3012450
Use URL-aware header encryption for landing requests
davelopez Oct 7, 2025
2a734e1
Update headers_encryption tests for URL config
davelopez Oct 7, 2025
7d3ecc4
Adds URL header allow-list management
davelopez Oct 7, 2025
71d6685
Replaces generic ValueErrors with specific exceptions
davelopez Oct 13, 2025
94dd7dd
Refactor UrlHeadersConfig to use ABC and Null Object pattern
davelopez Oct 14, 2025
8698bc1
Implement UrlHeadersConfiguration and UrlHeadersConfigFactory
davelopez Oct 14, 2025
f69086b
Adapt LandingRequestManager to use UrlHeadersConfigFactory
davelopez Oct 14, 2025
0c96fa4
Add utility for configuring allowed URL headers in tests
davelopez Oct 14, 2025
3b6d015
Update headers encryption tests for new config factory and exceptions
davelopez Oct 14, 2025
037868f
Add integration tests for URL headers configuration
davelopez Oct 14, 2025
09a2aea
Adds unit tests for URL header pattern matching
davelopez Oct 14, 2025
d29b61a
Adds sample for URL header configuration
davelopez Oct 14, 2025
1a71ca8
Adds URL headers config to mock app
davelopez Oct 15, 2025
d93f093
Rebuild config
davelopez Oct 15, 2025
343d4b8
Removes unused null config factory method
davelopez Oct 15, 2025
848f9f0
Updates sample config with sensitive auth headers
davelopez Oct 16, 2025
962388c
Use correct dataset_populator method after rebase
davelopez Oct 28, 2025
a12a082
Adds docs for enabling HTTP headers in fetch
davelopez Jan 27, 2026
7 changes: 7 additions & 0 deletions client/src/api/schema/schema.ts
@@ -23479,6 +23479,13 @@ export interface components {
            extra_files?: components["schemas"]["ExtraFiles"] | null;
            /** Hashes */
            hashes?: components["schemas"]["FetchDatasetHash"][] | null;
            /**
             * Headers
             * @description Optional headers to include in the URL fetch request
             */
            headers?: {
                [key: string]: string;
            } | null;
            /**
             * Info
             * @description Free text field that can be used to store arbitrary information about the dataset. This used to be prominently
1 change: 1 addition & 0 deletions config/url_headers_conf.yml.sample
152 changes: 152 additions & 0 deletions doc/source/admin/enable_headers_in_fetch_requests.md
@@ -0,0 +1,152 @@
# Enabling HTTP Headers in Fetch Requests

Galaxy allows users to **fetch remote data by URL** (for example via _Upload → Paste/Fetch data_ or via APIs that retrieve external resources).
By default, Galaxy **does not forward any custom HTTP headers** when fetching URLs. This restriction is intentional and is part of Galaxy’s security model.

Starting with recent Galaxy releases, administrators can **explicitly allow a controlled set of HTTP headers** to be sent with fetch requests, based on the target URL. This enables integrations with authenticated services (e.g. APIs requiring `Authorization` headers) while maintaining strict security boundaries.

This document explains **how to safely enable HTTP headers for fetch requests**, how the allow‑list mechanism works, and how to configure it.

## Why Header Allow‑Listing Is Required

Allowing arbitrary headers in server‑side HTTP requests is dangerous. Without restrictions, users could:

- Access internal services (SSRF attacks)
- Exfiltrate credentials via forwarded headers
- Abuse Galaxy as a proxy to privileged networks

To prevent this, Galaxy implements **explicit header allow‑listing with URL pattern matching**:

- **No headers are allowed by default**
- Each allowed header must be explicitly configured
- Headers are only sent to URLs that match defined patterns
- Sensitive headers can be stored securely using Galaxy’s Vault

## Configuration Overview
Member

This seems fine as a global config but I'd love a follow-up where this can be set on a per-file source basis, so you can allow amazon headers for s3 etc

Member

(and we could then allow sensible defaults)

Contributor Author

You mean directly as part of a File Source config parameter or set of parameters?

Member

I'm not sure of the difference? I was thinking that our file sources should automatically allow accepting and relaying relevant headers for that type of file source, so you don't need to allowlist, for instance, X-Amz-Security-Token, but we'd block this by default for http/https

Contributor Author

davelopez Jan 27, 2026

I meant something like this?

class HeaderEntry(StrictModel):
    name: str
    sensitive: bool = False


class HeaderConfig(StrictModel):
    headers: list[HeaderEntry]


class S3FSFileSourceConfiguration(FsspecBaseFileSourceConfiguration):
    anon: bool = False
    endpoint_url: Optional[str] = None
    bucket: Optional[str] = None
    secret: Optional[str] = None
    key: Optional[str] = None
    allow_headers: Optional[HeaderConfig] = None

BTW, thanks for the merge!

Member

mvdbeek Jan 27, 2026

I would make it

class S3FSFileSourceConfiguration(FsspecBaseFileSourceConfiguration):
    anon: bool = False
    endpoint_url: Optional[str] = None
    bucket: Optional[str] = None
    secret: Optional[str] = None
    key: Optional[str] = None
    allow_headers: Optional[HeaderConfig] = [AmzSecretHeaderEntry, OtherDefault HeaderEntryYouMightWantToAdd]


Header forwarding for fetch requests is controlled via a dedicated configuration file:

```yaml
galaxy:
  url_headers_config_file: url_headers_conf.yml
```

This file defines:

- Which **HTTP headers** are allowed
- For which **URL patterns** they may be sent
- Whether headers are **sensitive** (stored encrypted in the Vault)

If this configuration file is **not set or empty**, **no headers will ever be forwarded**.

## url_headers_conf.yml Format

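The configuration file holds a YAML list of rules; in the shipped `url_headers_conf.yml.sample` this list sits under a top-level `patterns:` key. Each rule applies to the URLs matched by its `url_pattern`.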

### Basic Structure

```yaml
- url_pattern: "https://api.example.org/.*"
  headers:
    - name: Authorization
      sensitive: true
    - name: X-API-Key
      sensitive: true
```

### Fields

| Field | Description |
| --------------------- | -------------------------------------------------------- |
| `url_pattern` | A regular expression matched against the full URL |
| `headers` | List of allowed HTTP headers for matching URLs |
| `headers[].name` | Exact HTTP header name (case‑insensitive) |
| `headers[].sensitive` | Whether the header value is stored securely in the Vault |

## Sensitive vs Non‑Sensitive Headers

### Sensitive Headers

Sensitive headers (for example `Authorization`, `X-API-Key`, `Cookie`) are:

- **Encrypted and stored in the Galaxy Vault**
- Never logged or exposed in plaintext
- Managed through Galaxy’s secure secrets infrastructure

Example:

```yaml
- url_pattern: "https://protected.example.com/.*"
  headers:
    - name: Authorization
      sensitive: true
```

### Non‑Sensitive Headers

Non‑sensitive headers may be stored in plain configuration and are typically used for:

- Feature flags
- API versioning
- Public metadata headers

Example:

```yaml
- url_pattern: "https://public.example.com/.*"
  headers:
    - name: X-Client-Version
      sensitive: false
```

## Multiple Rules and URL Matching

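Multiple rules may be defined. When a URL matches more than one `url_pattern`, the union of the headers allowed by all matching rules applies, and a header is treated as sensitive if any matching rule marks it as sensitive (this is the behavior described in `url_headers_conf.yml.sample`).

```yaml
- url_pattern: "https://api.github.com/.*"
  headers:
    - name: Authorization
      sensitive: true

- url_pattern: "https://raw.githubusercontent.com/.*"
  headers:
    - name: X-Client-Version
      sensitive: false
```

```{note}
Patterns are order-independent. Be careful with overly broad patterns such as `.*`, because the headers they allow are added to those of every more specific rule.
```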

## Using Headers in Practice

Once configured, users (or tools) may provide header values when performing fetch operations. Galaxy will:

1. Validate the target URL against the allow‑list
2. Filter headers to the allowed set
3. Securely inject sensitive headers at request time

Headers not explicitly allowed **will be silently dropped**.
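
The three steps above can be sketched in a few lines of Python. This is an illustration only, not Galaxy's implementation: the `RULES` list, `allowed_headers`, and `filter_headers` are hypothetical names, the use of `re.match` is an assumption about how patterns are applied, and the union-of-matching-rules behavior follows the comments in `url_headers_conf.yml.sample`.

```python
import re

# Hypothetical in-memory form of the rules; Galaxy's real loader and
# class names are not shown in this document.
RULES = [
    {
        "url_pattern": r"^https://api\.example\.org/.*",
        "headers": [{"name": "Authorization", "sensitive": True}],
    },
    {
        "url_pattern": r"^https://.*\.example\.org/.*",
        "headers": [{"name": "Accept", "sensitive": False}],
    },
]


def allowed_headers(url, rules):
    """Map lower-cased header name -> sensitive flag for every rule matching url."""
    allowed = {}
    for rule in rules:
        if re.match(rule["url_pattern"], url):
            for header in rule["headers"]:
                name = header["name"].lower()
                # Sensitive if ANY matching rule marks the header sensitive.
                allowed[name] = allowed.get(name, False) or header["sensitive"]
    return allowed


def filter_headers(url, headers, rules):
    """Keep only headers allowed for url; everything else is dropped."""
    allowed = allowed_headers(url, rules)
    return {k: v for k, v in headers.items() if k.lower() in allowed}


requested = {"authorization": "Bearer abc", "Accept": "text/plain", "X-Debug": "1"}
print(filter_headers("https://api.example.org/data", requested, RULES))
print(filter_headers("https://other.test/data", requested, RULES))
```

Header names are compared in lower case so that `authorization` and `Authorization` are treated alike, mirroring the case-insensitive matching described in the Fields table.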

## Security Best Practices

```{warning}
Only allow headers and URL patterns that are strictly necessary.
```

Recommended practices:

- Prefer **narrow URL patterns** over wildcards
- Mark authentication headers as `sensitive: true`
- Avoid allowing `Cookie` headers unless absolutely required
- Never allow headers for internal or private network ranges
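
To make the "narrow URL patterns" advice concrete, the following sketch compares a loosely written pattern with an anchored, escaped one using Python's standard `re` module. The URLs and the `matches` helper are made up for illustration, and whether Galaxy applies patterns with `re.match`, `re.search`, or something else is not stated here, so both are shown as assumptions.

```python
import re

loose = r"https://api.example.org/.*"      # unescaped dots, no anchor
strict = r"^https://api\.example\.org/.*"  # escaped dots, anchored at start

# A different host that only resembles the intended one.
lookalike = "https://apiXexample.org/data"
# The intended URL embedded in the query string of another host.
embedded = "https://evil.test/?next=https://api.example.org/data"


def matches(pattern, url, matcher=re.match):
    """Return True if the pattern matches the url with the given matcher."""
    return matcher(pattern, url) is not None


# An unescaped "." matches any character, so the loose pattern accepts the lookalike.
print("loose vs lookalike:", matches(loose, lookalike))
print("strict vs lookalike:", matches(strict, lookalike))

# Without "^", a search-style match finds the pattern inside another URL.
print("loose vs embedded (search):", matches(loose, embedded, re.search))
print("strict vs embedded (search):", matches(strict, embedded, re.search))
```

Escaping literal dots and anchoring with `^`, as the patterns in `url_headers_conf.yml.sample` do, avoids both of these unintended matches.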

## Troubleshooting

If headers are not being forwarded as expected:

1. Verify `url_headers_config_file` is configured in `galaxy.yml`
2. Confirm the URL matches the configured `url_pattern`
3. Check that the header name matches a configured `name` (matching is case-insensitive)
4. Ensure Galaxy has access to the configured Vault
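
Checks 2 and 3 can be reproduced outside Galaxy with a small helper like the one below. It is a hedged sketch: `explain` and `RULES` are hypothetical, the rules are written inline rather than loaded from `url_headers_conf.yml`, and `re.match` is an assumption about how Galaxy applies `url_pattern`.

```python
import re

# Hypothetical copy of the rules under test, written inline for the example.
RULES = [
    {
        "url_pattern": r"^https://api\.example\.org/.*",
        "headers": [{"name": "Authorization", "sensitive": True}],
    },
]


def explain(url, header_name, rules):
    """Return a short reason why a header would or would not be forwarded."""
    matching = [rule for rule in rules if re.match(rule["url_pattern"], url)]
    if not matching:
        return "no url_pattern matches this URL"
    names = {h["name"].lower() for rule in matching for h in rule["headers"]}
    if header_name.lower() not in names:
        return "URL matches, but the header is not listed in any matching rule"
    return "header is allowed for this URL"


print(explain("https://api.example.org/v1/items", "authorization", RULES))
print(explain("https://api.example.org/v1/items", "X-Debug", RULES))
print(explain("http://api.example.org/v1/items", "Authorization", RULES))
```

The last call uses `http://` rather than `https://`, a common reason for a pattern not matching.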
20 changes: 20 additions & 0 deletions doc/source/admin/galaxy_options.rst
@@ -5638,6 +5638,26 @@
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~
``url_headers_config_file``
~~~~~~~~~~~~~~~~~~~~~~~~~~~

:Description:
    Configuration file for URL request headers allow-list with URL
    pattern matching. This file defines which HTTP headers are allowed
    in URL fetch requests based on URL patterns, and whether they
    should be treated as sensitive (encrypted in the vault) or not. If
    no allow-list is specified, no headers will be allowed in URL
    requests. This provides fine-grained security control over what
    headers can be sent when Galaxy fetches external URLs on behalf of
    users, allowing different headers for different target domains or
    services.
    The value of this option will be resolved with respect to
    <config_dir>.
:Default: ``url_headers_conf.yml``
:Type: str


~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
``display_builtin_converters``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 change: 1 addition & 0 deletions doc/source/admin/index.rst
@@ -18,6 +18,7 @@ Galaxy Deployment & Administration
   jobs
   job_metrics
   authentication
   enable_headers_in_fetch_requests
   tool_panel
   data_tables
   mq
1 change: 1 addition & 0 deletions lib/galaxy/app_unittest_utils/galaxy_mock.py
@@ -291,6 +291,7 @@ def __init__(self, **kwargs):
        self.monitor_thread_join_timeout = 1
        self.integrated_tool_panel_config = None
        self.vault_config_file = kwargs.get("vault_config_file")
        self.url_headers_config_file = None
        self.max_discovered_files = 10000
        self.display_builtin_converters = True
        self.enable_notification_system = True
12 changes: 12 additions & 0 deletions lib/galaxy/config/sample/galaxy.yml.sample
@@ -3048,6 +3048,18 @@ galaxy:
  # <config_dir>.
  #vault_config_file: vault_conf.yml

  # Configuration file for URL request headers allow-list with URL
  # pattern matching. This file defines which HTTP headers are allowed
  # in URL fetch requests based on URL patterns, and whether they should
  # be treated as sensitive (encrypted in the vault) or not. If no
  # allow-list is specified, no headers will be allowed in URL requests.
  # This provides fine-grained security control over what headers can be
  # sent when Galaxy fetches external URLs on behalf of users, allowing
  # different headers for different target domains or services.
  # The value of this option will be resolved with respect to
  # <config_dir>.
  #url_headers_config_file: url_headers_conf.yml

  # Display built-in converters in the tool panel.
  #display_builtin_converters: true
115 changes: 115 additions & 0 deletions lib/galaxy/config/sample/url_headers_conf.yml.sample
@@ -0,0 +1,115 @@
# Allowed URL Headers Configuration
#
# This file defines which HTTP headers are allowed in URL fetch requests based
# on URL patterns, and whether they should be treated as sensitive (encrypted
# in the vault) or not.
#
# If no allow-list is specified or this file is empty/missing, NO headers will
# be allowed in URL requests.
#
# Configuration structure:
#   patterns:
#     - url_pattern: A regular expression pattern to match URLs
#       headers:
#         - name: The exact header name (case-insensitive)
#           sensitive: Whether this header contains sensitive information that should
#                      be encrypted when stored in the database (requires vault configuration)
#
# IMPORTANT:
# ------------------------------------
# When a URL matches MULTIPLE patterns, the union of all allowed headers is used.
# This means you can compose permissions from multiple patterns for flexibility.
#
# Example: A URL matching both pattern A (allows headers X, Y) and pattern B
# (allows headers Y, Z) will allow headers X, Y, and Z.
#
# Security: If ANY matching pattern marks a header as sensitive, it will be
# treated as sensitive (secure-by-default).
#
# The following examples are for illustration purposes only; please use only the minimum configuration for your needs.
# Examples:

patterns:
  # GitHub API access - allow authentication headers for GitHub URLs
  - url_pattern: "^https://api\\.github\\.com/.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: Accept
        sensitive: false
      - name: X-GitHub-Api-Version
        sensitive: false

  # Generic GitHub content (raw files, releases) - no auth needed
  - url_pattern: "^https://(raw\\.githubusercontent\\.com|github\\.com/.*/releases/download)/.*"
    headers:
      - name: Accept
        sensitive: false
      - name: Accept-Encoding
        sensitive: false

  # AWS S3 buckets - allow AWS authentication headers
  - url_pattern: "^https://.*\\.s3\\..+\\.amazonaws\\.com/.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: X-Amz-Date
        sensitive: false
      - name: X-Amz-Content-Sha256
        sensitive: false
      - name: X-Amz-Security-Token
        sensitive: true

  # Generic cloud storage APIs
  - url_pattern: "^https://.*\\.(googleapis\\.com|azure\\.com|digitaloceanspaces\\.com)/.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: X-API-Key
        sensitive: true
      - name: Accept
        sensitive: false

  # FTP over HTTP services
  - url_pattern: "^https?://ftp\\..*/.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: Accept
        sensitive: false

  # Academic/research data repositories
  - url_pattern: "^https://.*(zenodo\\.org|figshare\\.com|dryad\\.org|dataverse\\.org)/.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: X-API-Key
        sensitive: true
      - name: Accept
        sensitive: false

  # HTTPS URLs - basic headers only (most restrictive for unknown sources)
  - url_pattern: "^https://.*"
    headers:
      - name: Authorization
        sensitive: true
      - name: X-Auth-Token
        sensitive: true
      - name: X-API-Key
        sensitive: true
      - name: Accept
        sensitive: false
      - name: Accept-Language
        sensitive: false
      - name: Accept-Encoding
        sensitive: false
      - name: Cache-Control
        sensitive: false

# Security notes:
# - All matching patterns contribute their allowed headers (union of permissions)
# - If ANY pattern marks a header as sensitive, it's treated as sensitive
# - Only add headers that are absolutely necessary for your use case
# - When in doubt, mark headers as sensitive to ensure encryption
# - Patterns are order-independent, making configuration more composable
# - HTTP (non-HTTPS) URLs are generally not recommended and may be blocked
14 changes: 14 additions & 0 deletions lib/galaxy/config/schemas/config_schema.yml
@@ -4162,6 +4162,20 @@ mapping:
        desc: |
          Vault config file.

      url_headers_config_file:
        type: str
        default: url_headers_conf.yml
        path_resolves_to: config_dir
        required: false
        desc: |
          Configuration file for URL request headers allow-list with URL pattern matching.
          This file defines which HTTP headers are allowed in URL fetch requests based
          on URL patterns, and whether they should be treated as sensitive (encrypted
          in the vault) or not. If no allow-list is specified, no headers will be
          allowed in URL requests. This provides fine-grained security control over
          what headers can be sent when Galaxy fetches external URLs on behalf of users,
          allowing different headers for different target domains or services.

      display_builtin_converters:
        type: bool
        default: true