Closed
Description
- asyncpg version: 0.28.0
- PostgreSQL version: 14
- Python version: 3.10
- Platform: OSX
- Do you use pgbouncer?: yes (transactional mode)
- Did you install asyncpg with pip?: yes
Started from sqlalchemy/sqlalchemy#10226.
asyncpg uses prepared statements even though they're disabled.
To reproduce (thanks @zzzeek for the script!):
import asyncpg
import asyncio
import uuid

async def main():
    conn = await asyncpg.connect(
        user="scott",
        password="tiger",
        host="127.0.0.1",
        database="test",
        port=6432,
        statement_cache_size=0,
    )

    pps = await conn.prepare("select 1", name=f'__asyncpg_{uuid.uuid4()}__')
    rows = await pps.fetch()

    # remove this 'del' and the error goes away
    del pps

    (await conn.fetchrow(";"))

asyncio.run(main())
Error
Traceback (most recent call last):
  File "/user/app/script.py", line 57, in <module>
    asyncio.run(main())
  File "/user/.pyenv/versions/3.10.3/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/user/.pyenv/versions/3.10.3/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/user/app/script.py", line 53, in main
    (await conn.fetchrow(";"))
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 678, in fetchrow
    data = await self._execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1658, in _execute
    result, _ = await self.__execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1683, in __execute
    return await self._do_execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1730, in _do_execute
    result = await executor(stmt, None)
  File "asyncpg/protocol/protocol.pyx", line 201, in bind_execute
asyncpg.exceptions.InvalidSQLStatementNameError: unnamed prepared statement does not exist
HINT:
NOTE: pgbouncer with pool_mode set to "transaction" or
"statement" does not support prepared statements properly.
You have two options:
* if you are using pgbouncer for connection pooling to a
  single server, switch to the connection pool functionality
  provided by asyncpg, it is a much better option for this
  purpose;
* if you have no option of avoiding the use of pgbouncer,
  then you can set statement_cache_size to 0 when creating
  the asyncpg connection object.
Activity
elprans commented on Aug 15, 2023
statement_cache_size=0 does not mean that asyncpg will not use prepared statements at all, only that it will not attempt to use named prepared statements and re-use them. Unnamed prepared statements will still be used as they're an essential part of the PostgreSQL Extended Query protocol.
Unfortunately, I'm unable to reproduce the error using your script (I tried on Python 3.10 and 3.11).
zzzeek commented on Aug 15, 2023
hi @elprans -
this error reproduces if you run it against pgbouncer in "transaction" mode, here is a config file I used for pgbouncer to reproduce this:
also, can we look at the error message that asyncpg is emitting, which is also in the FAQ - it suggests using "statement_cache_size = 0" in order to mitigate the issue, however our test case here shows that this is not sufficient. if you were able to reproduce this, would it be considered a bug, or is the whole notion of using "transaction" mode in pgbouncer something that asyncpg should say is explicitly not supported?
elprans commented on Aug 15, 2023
OK, thanks, can repro now.
It's technically a bug. Or, rather, insufficiently strong mitigation. For now a solution would be to force another Parse before Bind/Execute when statement_cache_size == 0, which would impose a certain performance penalty. Avoiding the second Parse might be possible, but not without a big code reorg.
That said, ps = conn.prepare(); ps.execute() would obviously still not work as it faces the same problem of prepared statement persistence. I don't think lying about statement "preparedness" here is a good idea, so you'll need to apply a similar workaround in SQLAlchemy for the pgbouncer case (i.e. only use prepare() for introspection and instead of calling ps.fetch() redo the query with conn.fetch()).
When prepared statements are disabled, avoid relying on them harder
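The workaround suggested in the comment above (prepare only for introspection, then re-run the query through conn.fetch()) could look roughly like the sketch below. It is an illustration under assumptions, not SQLAlchemy's or asyncpg's actual code: introspect_then_fetch is a hypothetical helper name, and conn is assumed to be an asyncpg connection opened with statement_cache_size=0.

```python
import uuid


async def introspect_then_fetch(conn, query, *args):
    """Hypothetical helper: use prepare() only to learn the result columns.

    `conn` is assumed to be an asyncpg connection (anything exposing the same
    prepare() and fetch() coroutines will do).
    """
    # A unique name avoids clashing with statements left on other backends.
    ps = await conn.prepare(query, name=f"__asyncpg_{uuid.uuid4()}__")
    columns = [attr.name for attr in ps.get_attributes()]

    # Deliberately not calling ps.fetch(): behind a transaction-pooled
    # PgBouncer, the backend that parsed `ps` may no longer be ours.
    rows = await conn.fetch(query, *args)
    return columns, rows
```

The cost is that the query is parsed twice, once for introspection and once for execution.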
zzzeek commented on Aug 16, 2023
when used in this mode (and as seen in the test case) we are giving our prepared statements names and they run on pgbouncer without issue. the issue here is that a totally unrelated statement that is using conn.fetch() appears to be polluted due to previous, unrelated activity on the connection. why would conn.fetch() do anything with prepared statements when it is not using conn.prepare()?
We can "fix" this issue on our end by using conn.prepare() with a name in all cases. indeed the poster here first proposed we simply revert our change that uses conn.fetch() for this one particular operation that does not require overhead of prepared statements.
additionally it seems, though maybe you can clarify, that it would be quite wasteful to do conn.prepare(), get the attributes for the result, then do conn.fetch() anyway?
elprans commented on Aug 16, 2023
This is surprising to me. This is from the PgBouncer FAQ:
"How to use prepared statements with transaction pooling?
To make prepared statements work in this mode would need PgBouncer to keep track of them internally, which it does not do. So the only way to keep using PgBouncer in this mode is to disable prepared statements in the client."
fetch() relies on the PostgreSQL Extended Query protocol, where prepared statements (named or unnamed) are part of the normal flow. Asyncpg relies on prepared statement introspection in order to actually be able to encode argument data in Bind and decode row data in DataRow responses. The issue here is that PgBouncer seems to be completely broken with respect to implicit transactions at the protocol level.
elprans commented on Aug 16, 2023
Here's an illustration of what PgBouncer sends to Postgres in the above script (I modified fetchrow(";") to fetchrow('select $1::int', 1)):
Note two things:
- SET client_encoding (this is what causes the unnamed prepared statement to get blown up)
- parse and subsequent bind/execute run on different backends despite being logically part of the same transaction (there is no Sync message between the Parse and the Bind).
There is no reason why this wouldn't happen with explicit named prepared statements.
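To make the second point concrete, here is a toy model in plain Python. It is not PgBouncer or asyncpg code: the Backend and ToyPooler classes and the fetch function are hypothetical, and the round-robin hand-off is an assumption made purely for illustration. It only shows why a Bind/Execute that lands on a different backend than its Parse cannot find the unnamed statement.

```python
import itertools


class Backend:
    """Toy stand-in for one PostgreSQL server session (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.unnamed = None  # SQL held by the unnamed prepared statement

    def parse(self, sql):
        # Parse with an empty statement name replaces the unnamed statement.
        self.unnamed = sql

    def bind_execute(self):
        # Bind/Execute can only use a statement parsed in this same session.
        if self.unnamed is None:
            raise LookupError("unnamed prepared statement does not exist")
        return f"{self.name}: {self.unnamed}"


class ToyPooler:
    """Assumption: each message batch goes to the next backend, round-robin."""

    def __init__(self, backends):
        self._backends = itertools.cycle(backends)

    def next_backend(self):
        return next(self._backends)


def fetch(pooler, sql):
    pooler.next_backend().parse(sql)             # first batch: Parse
    return pooler.next_backend().bind_execute()  # second batch: Bind/Execute


print(fetch(ToyPooler([Backend("A")]), "select 1"))  # one backend: works
try:
    fetch(ToyPooler([Backend("A"), Backend("B")]), "select 1")
except LookupError as exc:
    print("two backends:", exc)
```

With a single backend the sequence succeeds; with two, the Bind/Execute reaches a session that never saw the Parse.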
zzzeek commented on Aug 16, 2023
So it's weird they say that, because I think I understand the problem now and that FAQ is not really accurate. I've googled a bit for other takes on this issue, and the issue really is nothing more than that "transaction" mode is going to give you a different connection for each command, as you've illustrated in your example above. So what the FAQ does not say is that prepared statements only fail to work if you did not actually begin a transaction. Because that's what "transaction" mode means; pgbouncer will give you the same connection for all commands if you open a transaction and work inside that.
So we can make the above test case work just like this:
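The snippet that followed is not preserved in this copy of the thread. As a minimal sketch of the idea, assuming the change is simply to wrap each step of the reproduction script in an explicit transaction with asyncpg's conn.transaction() (the function name run_in_transactions is hypothetical, and conn is assumed to be an asyncpg connection opened as in the script above):

```python
import uuid


async def run_in_transactions(conn):
    """Hypothetical rework of the reproduction script's body.

    `conn` is assumed to be an asyncpg connection created with
    statement_cache_size=0 and pointed at a transaction-pooled PgBouncer.
    """
    # Inside an explicit transaction the pooler keeps the client on one
    # server connection, so the prepare and the fetch share a backend.
    async with conn.transaction():
        pps = await conn.prepare("select 1", name=f"__asyncpg_{uuid.uuid4()}__")
        rows = await pps.fetch()
        del pps

    # The "ping" gets its own transaction instead of running outside one.
    async with conn.transaction():
        ping = await conn.fetchrow(";")

    return rows, ping
```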
the other part of it is that the prepared statement we use explicitly has a name, using the feature added by #837; otherwise asyncpg will name it something that might already be present on a different PG connection that nonetheless looks like the same connection to asyncpg, as pgbouncer switches connections around behind the scenes. This itself could also be mitigated by using server_reset_query_always, while potentially though not necessarily setting server_reset_query to use DEALLOCATE ALL instead of DISCARD ALL. Again pgbouncer's docs here are misleading when they use terminology like "This setting is for working around broken setups that run applications that use session features over a transaction-pooled PgBouncer."
So what's happening in SQLAlchemy is that we've wrapped asyncpg behind a layer that provides pep-249 compatibility, which includes having all SQL commands implicitly auto-begin a transaction. That's why things work for us. This particular "SELECT ;" thing is a liveness ping that occurs outside of any transaction... on the asyncpg driver. On other drivers like pg8000, the DBAPI always uses a transaction.
So. Great news for asyncpg, I can fix this on my end just by making sure our "ping" is run inside of a transaction. clearly asyncpg can't do much here, would be nice if pgbouncer had less opinionated documentation that is not so confusing to everyone.
elprans commented on Aug 16, 2023
Yes, except they completely overlook that implicit transactions exist and break those.
I think I'm still going to merge the "redo the Parse if statement cache is off" PR.
zzzeek commented on Aug 16, 2023
sounds great!
When prepared statements are disabled, avoid relying on them harder (#…