Skip to content

asyncpg uses prepared statements despite they're disabled #1058

Closed
@ionsome

Description

@ionsome
  • asyncpg version: 0.28.0
  • PostgreSQL version: 14
  • Python version: 3.10
  • Platform: OSX
  • Do you use pgbouncer?: yes (transactional mode)
  • Did you install asyncpg with pip?: yes

Started from sqlalchemy/sqlalchemy#10226.

asyncpg uses prepared statements despite they're disabled.

To reproduce (thanks @zzzeek for the script!):

import asyncpg
import asyncio
import uuid

async def main():
    conn = await asyncpg.connect(
        user="scott",
        password="tiger",
        host="127.0.0.1",
        database="test",
        port=6432,
        statement_cache_size=0,
    )

    pps = await conn.prepare("select 1", name=f'__asyncpg_{uuid.uuid4()}__')
    rows = await pps.fetch()

    # remove this 'del' and the error goes away
    del pps

    (await conn.fetchrow(";"))


asyncio.run(main())

Error

Traceback (most recent call last):
  File "/user/app/script.py", line 57, in <module>
    asyncio.run(main())
  File "/user/.pyenv/versions/3.10.3/lib/python3.10/asyncio/runners.py", line 44, in run
    return loop.run_until_complete(main)
  File "/user/.pyenv/versions/3.10.3/lib/python3.10/asyncio/base_events.py", line 646, in run_until_complete
    return future.result()
  File "/user/app/script.py", line 53, in main
    (await conn.fetchrow(";"))
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 678, in fetchrow
    data = await self._execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1658, in _execute
    result, _ = await self.__execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1683, in __execute
    return await self._do_execute(
  File "/user/app/venv/lib/python3.10/site-packages/asyncpg/connection.py", line 1730, in _do_execute
    result = await executor(stmt, None)
  File "asyncpg/protocol/protocol.pyx", line 201, in bind_execute
asyncpg.exceptions.InvalidSQLStatementNameError: unnamed prepared statement does not exist
HINT:  
NOTE: pgbouncer with pool_mode set to "transaction" or
"statement" does not support prepared statements properly.
You have two options:

* if you are using pgbouncer for connection pooling to a
  single server, switch to the connection pool functionality
  provided by asyncpg, it is a much better option for this
  purpose;

* if you have no option of avoiding the use of pgbouncer,
  then you can set statement_cache_size to 0 when creating
  the asyncpg connection object.

Activity

elprans

elprans commented on Aug 15, 2023

@elprans
Member

statement_cache_size=0 does not mean that asyncpg will not use prepared statements at all, only that it will not attempt to use named prepared statements and re-use them. Unnamed prepared statements will still be used as they're an essential part of PostgreSQL Extended Query protocol.

Unfortunately, I'm unable to reproduce the error using your script (I tried on Python 3.10 and 3.11).

zzzeek

zzzeek commented on Aug 15, 2023

@zzzeek

hi @elprans -

this error reproduces if you run it against pgbouncer in "transaction" mode, here is a config file I used for pgbouncer to reproduce this:

[databases]
test = host=127.0.0.1 dbname=test auth_user=scott password=tiger

[pgbouncer]
pool_mode = transaction
listen_port = 6432
listen_addr = localhost
auth_type = md5
auth_file = users.txt
logfile = pgbouncer.log
pidfile = pgbouncer.pid
admin_users = someuser
stats_users = stat_collector

also, can we look at the error message that asyncpg is emitting, which is also in the FAQ - it suggests using "statement_cache_size = 0" in order to mitigate the issue, however our test case here shows that this is not sufficient. if you were able to reproduce this, would it be considered a bug, or is the whole notion of using "transaction" mode in pgbouncer something that asyncpg should say is explicitly not supported?

elprans

elprans commented on Aug 15, 2023

@elprans
Member

this error reproduces if you run it against pgbouncer in "transaction" mode, here is a config file I used for pgbouncer to reproduce this

OK, thanks, can repro now.

if you were able to reproduce this, would it be considered a bug, or is the whole notion of using "transaction" mode in pgbouncer something that asyncpg should say is explicitly not supported?

It's technically a bug. Or, rather, insufficiently strong mitigation. For now a solution would be to force another Parse before Bind/Execute when statement_cache_size == 0, which would impose a certain performance penalty. Avoiding the second Parse might be possible, but not without a big code reorg.

That said, ps = conn.prepare(); ps.execute() would obviously still not work as it faces the same problem of prepared statement persistence. I don't think lying about statement "preparedness" here is a good idea, so you'll need to apply a similar workaround in SQLAlchemy for the pgbouncer case (i.e. only use prepare() for introspection and instead of calling ps.fetch() redo the query with conn.fetch()).

added a commit that references this issue on Aug 16, 2023
43d5198
zzzeek

zzzeek commented on Aug 16, 2023

@zzzeek

when used in this mode (and as seen in the test case) we are giving our prepared statements names and they run on pgbouncer without issue. the issue here is that a totally unrelated statement that is using conn.fetch() appears to be polluted due to previous, unrelated activity on the connection. why would conn.fetch() do anything with prepared statements when it is not using conn.prepare() ?

We can "fix" this issue on our end by using conn.prepare() with a name in all cases. indeed the poster here first proposed we simply revert our change that uses conn.fetch() for this one particular operation that does not require overhead of prepared statements.

additionally it seems, though maybe you can clarify, that it would be quite wasteful to do conn.prepare(), get the attributes for the result, then do conn.fetch() anyway?

elprans

elprans commented on Aug 16, 2023

@elprans
Member

when used in this mode (and as seen in the test case) we are giving our prepared statements names and they run on pgbouncer without issue

This is surprising to me. This is from the PgBouncer FAQ:

"How to use prepared statements with transaction pooling?

To make prepared statements work in this mode would need PgBouncer to keep track of them internally, which it does not do. So the only way to keep using PgBouncer in this mode is to disable prepared statements in the client."

why would conn.fetch() do anything with prepared statements when it is not using conn.prepare()

fetch() relies on PostgreSQL Extended Query protocol, where prepared statements (named or unnamed) are part of the normal flow. Asyncpg relies on prepared statement introspection in order to actually be able to encode argument data in Bind and decode row data in RowData responses. The issue here is that PgBouncer seems to be completely broken with respect to implicit transactions at the protocol level.

elprans

elprans commented on Aug 16, 2023

@elprans
Member

Here's an illustration of what PgBouncer sends to Postgres in the above script (I modified fetchrow(";") to fetchrow('select $1::int', 1)):

[2653098] LOG:  duration: 0.104 ms  statement: SET client_encoding='''utf-8''';
[2653098] LOG:  duration: 0.118 ms  parse __asyncpg_790766f0-8d39-42e6-9089-8afaa114e054__: select 1
[2653098] LOG:  duration: 0.057 ms  bind __asyncpg_790766f0-8d39-42e6-9089-8afaa114e054__: select 1
[2653098] LOG:  duration: 0.008 ms  execute __asyncpg_790766f0-8d39-42e6-9089-8afaa114e054__: select 1
[2653099] LOG:  duration: 0.121 ms  statement: SET client_encoding='''utf-8''';
[2653099] LOG:  duration: 0.159 ms  parse <unnamed>: select $1::int
[2653098] LOG:  duration: 0.025 ms  statement: SET client_encoding='''utf-8''';
[2653098] LOG:  duration: 0.063 ms  parse <unnamed>: select $1::int
[2653098] LOG:  duration: 0.020 ms  bind <unnamed>: select $1::int
[2653098] DETAIL:  parameters: $1 = '1'
[2653098] LOG:  duration: 0.003 ms  execute <unnamed>: select $1::int
[2653098] DETAIL:  parameters: $1 = '1'

Note two things:

  1. injection of SET client_encoding (this is what causes the unnamed prepared statement to get blown up)
  2. change of PID indicating that the initial parse and subsequent bind/execute run on different backend despite being logically part of the same transaction (there is no Sync message between the Parse and the Bind).

There is no reason why this wouldn't happen with explicit named prepared statements.

added a commit that references this issue on Aug 16, 2023
606a368
zzzeek

zzzeek commented on Aug 16, 2023

@zzzeek

This is surprising to me. This is from the PgBouncer FAQ:

"How to use prepared statements with transaction pooling?

To make prepared statements work in this mode would need PgBouncer to keep track of them internally, which it does not do. So the only way to keep using PgBouncer in this mode is to disable prepared statements in the client."

So it's weird they say that, because I think I understand the problem now and that FAQ is not really accurate. Ive googled a bit for other takes on this issue, and the issue really is nothing more than "transaction" mode is going to give you a different connection for each command, as you've illustrated in your example above. So what the FAQ does not say is that prepared statements won't work, if you did not actually begin a transaction. Because that's what "transaction" mode means; pgbouncer will give you the same connection for all commands if you open a transaction and work inside that.

So we can make the above test case work just like this:

async def main():
    conn = await asyncpg.connect(
        user="scott",
        password="tiger",
        host="127.0.0.1",
        database="test",
        port=6432,
        statement_cache_size=0,
    )

    async with conn.transaction():
        pps = await conn.prepare("select 1", name=f'__asyncpg_{uuid.uuid4()}__')
        rows = await pps.fetch()

        # del or not, doesn't matter
        del pps

        (await conn.fetchrow(";"))

the other part of it is that the prepared statement we use explicitly has a name using the feature added by #837, that is asyncpg otherwise will name it something that might be already present on a different PG connection that nonetheless to asyncpg looks like the same connection, as pgbouncer switches connections around behind the scenes. This itself could also be mitigated by using server_reset_query_always , while potentially though not necessarily setting server_reset_query to use DEALLOCATE ALL instead of DISCARD ALL. Again pgbouncer's docs here are misleading when they use terminology like "This setting is for working around broken setups that run applications that use session features over a transaction-pooled PgBouncer."

So what's happening in SQLAlchemy is that we've wrapped asyncpg behind a layer that provides pep-249 compatibility, which includes, all SQL commands implicitly auto-begin a transaction. That's why things work for us. This particular "SELECT ;" thing is a liveness ping, that occurs outside of any transaction.....on the asyncpg driver. On other drivers like pg8000, the DBAPI always uses a transaction.

So. Great news for asyncpg, I can fix this on my end just by making sure our "ping" is run inside of a transaction. clearly asyncpg can't do much here, would be nice if pgbouncer had less opinionated documentation that is not so confusing to everyone.

elprans

elprans commented on Aug 16, 2023

@elprans
Member

if you did not actually begin a transaction. Because that's what "transaction" mode means; pgbouncer will give you the same connection for all commands if you open a transaction and work inside that.

Yes, except they completely overlook that implicit transactions exist and break those.

clearly asyncpg can't do much here

I think I'm still going to merge the "redo the Parse if statement cache is off" PR.

zzzeek

zzzeek commented on Aug 16, 2023

@zzzeek

sounds great!

added a commit that references this issue on Aug 17, 2023
cbf64e1

9 remaining items

Loading
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      Participants

      @zzzeek@elprans@wwarne@nsbradford@ionsome

      Issue actions

        asyncpg uses prepared statements despite they're disabled · Issue #1058 · MagicStack/asyncpg