What happened?
The default agentFactory returned by getNodeHttpConfigurationDefaults (in otlp-exporter-base/src/configuration/otlp-node-http-configuration.ts) is httpAgentFactoryFromOptions({ keepAlive: true }), which constructs new https.Agent({ keepAlive: true }) per export. This bypasses https.globalAgent, the agent that Node replaces with EnvHttpProxyAgent when NODE_USE_ENV_PROXY=1 (or, on Node 22+ with appropriate flags, when HTTPS_PROXY/HTTP_PROXY are set).
Effect: in any runtime that injects a proxy via env vars (Kubernetes pods behind an egress proxy, sandboxed runtimes that L7-proxy outbound HTTP, dev environments with a local capture proxy), the default OTLP HTTP exporter silently fails to traverse the proxy. Users only discover this when their backend stays empty.
This is acknowledged territory — #5835 was closed pointing at the merged work in #5711 / #5719, which DID add httpAgentOptions factory support so users can pass a proxy-aware agent themselves. That feature is great. This issue is the follow-up: the default behavior still ignores env-proxy, so users have to read the docs, learn that the SDK overrides globalAgent, find a proxy-agent library, and wire it in. The diagnostic surface is a silent failure — the exporter dies with ECONNREFUSED or EAI_AGAIN and the SDK's diag logger isn't wired by default, so operators see nothing.
In plain English: if the user's environment sets HTTPS_PROXY=…, the OTel exporter ignores it and tries to connect directly. In sandboxed/cloud environments where the proxy was the only path out, spans stop landing — and the user gets no visible error.
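The mechanism described above can be illustrated with Node's standard library alone. This is a minimal sketch, not the SDK's code: it only shows that an agent built the way the issue describes the default factory is a separate object from https.globalAgent, so whatever is installed on the global agent does not apply to it. The variable names are illustrative.

```typescript
import * as https from 'node:https';

// Mirrors what the issue says the default factory does: build a new agent.
const freshAgent = new https.Agent({ keepAlive: true });

// A freshly constructed agent is a distinct object from https.globalAgent,
// so replacing or reconfiguring the global agent does not affect it.
const bypassesGlobalAgent = freshAgent !== https.globalAgent;

console.log('fresh agent bypasses globalAgent:', bypassesGlobalAgent);

// Release any sockets the agent might hold (none here, but good hygiene).
freshAgent.destroy();
```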
Steps to Reproduce
Tested against:
@opentelemetry/sdk-trace-node (SDK v2)
@opentelemetry/otlp-exporter-base v0.203.0
@opentelemetry/exporter-trace-otlp-http v0.203.0
Node v22.22.2
Setup:
Stand up an OTel collector reachable via a public HTTPS hostname (e.g. through a Cloudflare quick tunnel: cloudflared tunnel --url http://localhost:4318).
Run any OTLP-emitting process in a container that has HTTPS_PROXY=http://<your-proxy-host>:<port> injected into its environment. The proxy is the only path that can reach the public collector hostname; direct connect/DNS fails.
Use the default OTel SDK configuration (no explicit httpAgentOptions, no factory override) and point it at the public collector hostname.
Generate spans.
Watch the backend stay empty. With default log levels, the SDK produces no error.
Add diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.DEBUG) and observe Error: getaddrinfo EAI_AGAIN <hostname> (or ECONNREFUSED) — proving the SDK is bypassing the proxy.
As a control, in the same environment, node -e 'https.request({hostname: "<same-host>", ...})' succeeds (200) — Node's default https.globalAgent IS replaced with EnvHttpProxyAgent when the env vars are set, so the standard https module works. Only the SDK's default-factory path fails.
Expected Result
When the user has not explicitly provided an agentFactory AND the runtime has env-proxy configured (any of NODE_USE_ENV_PROXY=1, HTTPS_PROXY, https_proxy, HTTP_PROXY, http_proxy), the default factory should return globalAgent (proxy-aware) instead of a fresh Agent.
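The detection condition listed above could be expressed as a small predicate. This is a sketch under assumptions: the helper name isEnvProxyConfigured and the Env type are hypothetical and not part of the SDK, and the environment is passed in as an argument so the check is easy to test.

```typescript
// Hypothetical helper; name and signature are illustrative, not SDK API.
type Env = Record<string, string | undefined>;

function isEnvProxyConfigured(env: Env): boolean {
  // NODE_USE_ENV_PROXY must be exactly '1'; the proxy URL variables only
  // need to be set to a non-empty value.
  return (
    env['NODE_USE_ENV_PROXY'] === '1' ||
    Boolean(
      env['HTTPS_PROXY'] ||
        env['https_proxy'] ||
        env['HTTP_PROXY'] ||
        env['http_proxy']
    )
  );
}

console.log(isEnvProxyConfigured({ HTTPS_PROXY: 'http://proxy.example:3128' })); // true
console.log(isEnvProxyConfigured({})); // false
```

In a real process the argument would be process.env; taking it as a parameter avoids mutating global state in tests.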
Actual Result
Default factory always returns new Agent({ keepAlive: true }), which is not proxy-aware regardless of env vars. The exporter then attempts direct egress, fails, and silently retries.
Proposed fix
In httpAgentFactoryFromOptions (or in the default-config path), conditionally fall back to globalAgent when env-proxy is configured:
 export function httpAgentFactoryFromOptions(
   options: http.AgentOptions | https.AgentOptions
 ): HttpAgentFactory {
   return async protocol => {
     const isInsecure = protocol === 'http:';
     const module = isInsecure ? import('http') : import('https');
     const { Agent } = await module;
+
+    // Honor Node's env-proxy configuration: globalAgent IS the EnvHttpProxyAgent
+    // when NODE_USE_ENV_PROXY=1 or HTTPS_PROXY/http_proxy are set. A fresh
+    // `new Agent(...)` skips it. Users who want proxy-aware default behavior
+    // shouldn't have to wire a proxy-agent library by hand.
+    if (process.env.NODE_USE_ENV_PROXY === '1' ||
+        process.env.HTTPS_PROXY || process.env.https_proxy ||
+        process.env.HTTP_PROXY || process.env.http_proxy) {
+      const mod = await module;
+      return (mod as any).globalAgent;
+    }
+
     if (isInsecure) {
       // eslint-disable-next-line @typescript-eslint/no-unused-vars
       const { ca, cert, key, ...insecureOptions } = options as https.AgentOptions;
       return new Agent(insecureOptions);
     }
     return new Agent(options);
   };
 }
A user who explicitly opts out (passes their own factory or wants the non-proxy behavior) is not affected — only the default changes.
Validated locally as a runtime patch on v0.203.0 in a sandbox where openshell injects HTTPS_PROXY=http://10.200.0.1:3128 and the only reachable hostname is on the proxied side. Spans flow through the proxy as expected after the patch. Without the patch, exports silently fail with EAI_AGAIN against the openshell DNS proxy.
Alternatives considered
Always use globalAgent from the default factory. Simpler, but loses the SDK's keepAlive: true default. The patch above preserves keepAlive when env-proxy is not set.
Document that users must wire https-proxy-agent themselves. This is the existing path since #5719 (feat(exporter-otlp-*)!: support custom HTTP agents). It works, but it is high-friction, and the silent-failure mode means most users will only discover the gap when their backend is empty.
Auto-detect NO_PROXY patterns inside the factory (since EnvHttpProxyAgent already does this when used). Out of scope here: the proposed patch delegates to globalAgent, which already honors NO_PROXY / no_proxy.
Add OTEL_EXPORTER_OTLP_PROXY as a sibling env var. A plausible follow-up but not a substitute: Node's standard env vars are what most container/orchestration platforms inject by default.
Test plan
Add a unit test for httpAgentFactoryFromOptions: with NODE_USE_ENV_PROXY=1 set in process.env, the returned factory should return globalAgent. Without it, it should return a fresh Agent.
Add an integration test: spin a small HTTP-CONNECT proxy + mock OTLP receiver in the existing test harness; assert spans flow through the proxy when HTTPS_PROXY points at it.
Regression: confirm no existing httpAgentFactoryFromOptions callers break.
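The unit test in the plan above might look roughly like the following. This is a self-contained sketch, not the repository's test: pickHttpsAgent is a hypothetical stand-in for the patched factory, and it takes the environment as a parameter instead of reading process.env so the assertions do not depend on the machine running them.

```typescript
import * as assert from 'node:assert';
import * as https from 'node:https';

type Env = Record<string, string | undefined>;

// Hypothetical stand-in for the patched factory (HTTPS case only).
function pickHttpsAgent(env: Env, options: https.AgentOptions): https.Agent {
  const proxied =
    env['NODE_USE_ENV_PROXY'] === '1' ||
    Boolean(
      env['HTTPS_PROXY'] ||
        env['https_proxy'] ||
        env['HTTP_PROXY'] ||
        env['http_proxy']
    );
  // With env-proxy configured, defer to the process-wide agent;
  // otherwise keep the existing behavior of building a fresh agent.
  return proxied ? https.globalAgent : new https.Agent(options);
}

// Case 1: env-proxy set, so the factory should hand back globalAgent.
const viaProxy = pickHttpsAgent({ NODE_USE_ENV_PROXY: '1' }, { keepAlive: true });
assert.strictEqual(viaProxy, https.globalAgent);

// Case 2: no env-proxy, so the factory should build a fresh agent.
const direct = pickHttpsAgent({}, { keepAlive: true });
assert.notStrictEqual(direct, https.globalAgent);
assert.ok(direct instanceof https.Agent);
direct.destroy();
```

A test against the real httpAgentFactoryFromOptions would instead set and restore process.env around each case, since the proposed patch reads it directly.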
Risk / blast radius
Behavioral change: users behind a proxy who ALSO relied on the SDK's keepAlive: true default will see it overridden by globalAgent's defaults (which are already proxy-aware via EnvHttpProxyAgent, but may not pin keepAlive the same way). Mitigation: env-proxy users today are broken at the default level, so any working configuration they have is via explicit factory injection, and that path is unaffected.
Detectable side effect: users who set HTTPS_PROXY for other tools and didn't expect OTel to honor it will now have OTel honor it. Mitigation: this matches Node 22+ default behavior for fetch — bringing OTel into line with standard Node networking conventions.
Open questions
Is httpAgentFactoryFromOptions the right home for the conditional, or should the default-config builder (getNodeHttpConfigurationDefaults) check env-proxy and select between two factories? The former is narrower; the latter is more explicit.
Should the SDK emit a warning at start when it detects env-proxy and falls back to globalAgent, so operators have a visible signal? (I'd say yes — silent magic is exactly the failure mode this fixes.)
Preferred env-var matrix — just HTTPS_PROXY/http_proxy, or also OTEL_EXPORTER_OTLP_PROXY / similar OTel-specific override?
Independent of this fix: would the maintainers entertain wiring diag.setLogger automatically when OTEL_LOG_LEVEL is set in the env? Operators who hit silent-failure modes have to add a code change today to surface the underlying SDK error. Out of scope here — flagging for context.
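For the start-up warning raised in the questions above, one possible shape is a warn-once helper. This is only a sketch under assumptions: createEnvProxyWarner, its message text, and the injected warn callback are all hypothetical, and a real implementation would presumably route through the SDK's diag logger. It reports variable names only, since proxy URLs can carry credentials.

```typescript
type Env = Record<string, string | undefined>;

const PROXY_URL_VARS = ['HTTPS_PROXY', 'https_proxy', 'HTTP_PROXY', 'http_proxy'];

// Hypothetical helper: returns a function that emits at most one warning
// when env-proxy configuration is detected.
function createEnvProxyWarner(warn: (message: string) => void) {
  let warned = false;
  return (env: Env): boolean => {
    const active = PROXY_URL_VARS.filter(name => Boolean(env[name]));
    if (env['NODE_USE_ENV_PROXY'] === '1') {
      active.unshift('NODE_USE_ENV_PROXY');
    }
    if (active.length === 0 || warned) {
      return false;
    }
    warned = true;
    // Only variable names are logged, never their values.
    warn(
      `OTLP exporter: env-proxy detected (${active.join(', ')}); ` +
        'falling back to globalAgent'
    );
    return true;
  };
}

const warnOnce = createEnvProxyWarner(message => console.warn(message));
warnOnce({ HTTPS_PROXY: 'http://proxy.example:3128' }); // emits the warning
warnOnce({ HTTPS_PROXY: 'http://proxy.example:3128' }); // stays silent
```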
Real-world repro context: this surfaced while debugging a sandbox-runtime deployment where HTTPS_PROXY is the only egress path, the OTel SDK's exporter silently failed, and root-causing took ~3 hours of investigation through the SDK source. Happy to share full ClickHouse query output, gateway log, and SDK debug logs if useful.