Replies: 1 comment
This is going to be the approach. Async doesn't mean unlimited capacity. Ultimately you have a single event loop on a single thread. If that's getting saturated, multiple workers is going to be the way to scale. If you can dig into something specific, and show where that's coming up (likely in your ASGI server, which is responsible for the handshake), then there might be possible optimisations. (But short of such a demonstration, there's not really much to say.)
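For context on that suggestion, one way to get more than one event loop per dyno is to run the ASGI application under Gunicorn with Uvicorn's worker class. The sketch below is illustrative only: the project module, worker count, and choice of server are assumptions, not details from this report.

```
web: gunicorn myproject.asgi:application -k uvicorn.workers.UvicornWorker --workers 4 --bind 0.0.0.0:$PORT
```

Adding dynos achieves the same thing one process at a time; either way, the point is more event loops sharing the handshake load.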
WebSocket Handshake Timeout During Peak Traffic - H13 Connection Closed Without Response
Summary
We are experiencing massive WebSocket handshake timeouts during peak server usage, resulting in hundreds to thousands of H13 errors ("Connection closed without response") within minutes. The issue affects both long-lived and short-lived connections and is temporarily resolved by scaling dynos.
Environment
Error Details
Heroku Logs
Current Configuration
Procfile:
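As an illustration only (not the configuration from this report), a typical Channels deployment on Heroku runs Daphne as the web process; the project module name below is a placeholder:

```
web: daphne -b 0.0.0.0 -p $PORT myproject.asgi:application
```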
Channel Layers (settings.py):
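Likewise illustrative, assuming the common channels_redis backend with the Redis add-on URL exposed as REDIS_URL:

```python
import os

CHANNEL_LAYERS = {
    "default": {
        "BACKEND": "channels_redis.core.RedisChannelLayer",
        "CONFIG": {
            # Heroku exposes the Redis add-on via the REDIS_URL config var
            "hosts": [os.environ.get("REDIS_URL", "redis://localhost:6379")],
            # Optional limits on per-channel buffering and message lifetime
            "capacity": 1500,
            "expiry": 10,
        },
    },
}
```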
Problem Analysis
Consumer Implementation Issues
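As an illustration of the kind of pattern this heading refers to (the consumer and query below are invented, not taken from the project), any database work done in connect() before accept() keeps the handshake open until that work finishes, and under peak load those waits are exactly when the router starts returning H13:

```python
from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer
from django.contrib.auth import get_user_model


class NotificationConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # The handshake cannot complete until this query returns, because
        # accept() is only called afterwards.
        user = await database_sync_to_async(self._load_user)()
        await self.channel_layer.group_add(f"user_{user.pk}", self.channel_name)
        await self.accept()

    def _load_user(self):
        # Hypothetical per-connection query standing in for heavier setup work.
        return get_user_model().objects.get(pk=self.scope["user"].pk)
```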
Scaling Behavior
Expected Behavior
WebSocket connections should complete the handshake within a reasonable time, even during peak traffic.
Actual Behavior
During peak usage:
Questions for Maintainers
Is there a recommended pattern for handling heavy database operations during WebSocket connect without blocking the handshake? (A sketch of one such pattern follows these questions.)
Should channels provide better guidance on async vs sync consumers for production use?
Are there built-in mechanisms to defer operations until after handshake completion?
What are the recommended timeout values for production deployments with high concurrent connections?
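On the first and third questions, a pattern that is often suggested (a sketch under assumptions, not an official channels recommendation) is to call accept() immediately and defer the expensive work to a background task, so the handshake never waits on the database:

```python
import asyncio

from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer
from django.contrib.auth import get_user_model


class NotificationConsumer(AsyncWebsocketConsumer):
    async def connect(self):
        # Finish the handshake first so the router never times out waiting
        # on application code.
        await self.accept()
        # Run the heavy setup after the handshake, off the hot path.
        self._setup_task = asyncio.create_task(self._post_connect_setup())

    async def _post_connect_setup(self):
        user = await database_sync_to_async(self._load_user)()
        await self.channel_layer.group_add(f"user_{user.pk}", self.channel_name)

    def _load_user(self):
        # Hypothetical per-connection query standing in for heavier setup work.
        return get_user_model().objects.get(pk=self.scope["user"].pk)

    async def disconnect(self, code):
        # Cancel the deferred setup if the client goes away before it finishes.
        task = getattr(self, "_setup_task", None)
        if task is not None:
            task.cancel()
```

The trade-off is that messages arriving before the setup task completes see a connection that is accepted but not yet registered in its groups, so any state the handlers rely on has to tolerate that window.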
Additional Context
Using django-db-connection-pool[postgresql] for database connection pooling.
Reproducible Test Case
The issue can be reproduced by opening a large number of concurrent WebSocket connections against a consumer that performs database work in connect().
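For driving that kind of load in a test environment, a small client script can open many handshakes concurrently and report how many fail; the endpoint URL and concurrency below are placeholders, and the script uses the third-party websockets package:

```python
import asyncio

import websockets  # pip install websockets

URL = "ws://localhost:8000/ws/notifications/"  # placeholder endpoint
CONCURRENCY = 500


async def one_client() -> float:
    # Time the handshake alone; open_timeout bounds how long we wait for it.
    loop = asyncio.get_running_loop()
    start = loop.time()
    async with websockets.connect(URL, open_timeout=30):
        return loop.time() - start


async def main() -> None:
    results = await asyncio.gather(
        *(one_client() for _ in range(CONCURRENCY)), return_exceptions=True
    )
    failures = [r for r in results if isinstance(r, Exception)]
    times = [r for r in results if isinstance(r, float)]
    print(f"failed handshakes: {len(failures)} / {CONCURRENCY}")
    if times:
        print(f"slowest successful handshake: {max(times):.2f}s")


if __name__ == "__main__":
    asyncio.run(main())
```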
We would appreciate guidance on best practices for handling this scenario, and on whether this indicates a bug in channels or a configuration/implementation issue.