Program hangs during termination #6538

Closed
@Gigioliva

Description

Polars version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of Polars.

Issue description

I use Polars to read several Parquet files. The reads run inside an asyncio event loop via run_in_executor. The code is roughly:

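import asyncio

import polars


def _parquet_file_to_df(file_path: str, columns: tuple[str, ...]) -> polars.DataFrame:
    # An empty tuple means "read every column".
    return polars.read_parquet(
        file_path, columns=list(columns) if columns else None, use_pyarrow=True
    )


async def main() -> None:
    loop = asyncio.get_running_loop()
    for file in [...]:  # list of parquet file paths (elided)
        # Run the blocking read in the default thread pool executor.
        # The callable was elided as `func`; passing () for columns is a placeholder.
        await loop.run_in_executor(None, _parquet_file_to_df, file, ())


if __name__ == "__main__":
    asyncio.run(main())
    print("done")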

Sometimes my program does not terminate, even after printing done. Here are some pieces of the stack trace:

[Four screenshots of the stack trace, taken 2023-01-28 between 19:05 and 19:09; images not available]

I am not sure the problem is 100% related to Polars. Can you help me understand the bug?
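
One way to narrow this down is to check which threads are still alive after done is printed, and to dump their stacks if the process has not exited a few seconds later. The following is only a diagnostic sketch using the standard library; the helper names live_threads and arm_hang_watchdog and the 10 second timeout are assumptions for illustration, not part of Polars or pyarrow. Note that faulthandler only shows Python-level frames, so threads created by native code may not appear.

```python
import faulthandler
import sys
import threading


def live_threads() -> list[tuple[str, bool]]:
    """Return (name, is_daemon) for every live Python thread except the main one."""
    main = threading.main_thread()
    return [(t.name, t.daemon) for t in threading.enumerate() if t is not main]


def arm_hang_watchdog(timeout_s: float = 10.0) -> None:
    """Dump all Python thread stacks to stderr if the process is still alive after timeout_s."""
    # exit=False: only report, do not force the process to exit.
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=sys.stderr)


if __name__ == "__main__":
    # asyncio.run(main()) would go here, as in the snippet above.
    print("done")
    # Non-daemon threads listed here are joined at interpreter shutdown
    # and can therefore keep the process from exiting.
    for name, daemon in live_threads():
        print(f"still alive: {name} (daemon={daemon})", file=sys.stderr)
    arm_hang_watchdog()
```

If a non-daemon thread shows up in the list, the hang is likely in joining that thread at shutdown; if the list is empty and the process still hangs, the cause is more likely in native code or in an exit handler.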

Reproducible example

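import asyncio

import polars


def _parquet_file_to_df(file_path: str, columns: tuple[str, ...]) -> polars.DataFrame:
    # An empty tuple means "read every column".
    return polars.read_parquet(
        file_path, columns=list(columns) if columns else None, use_pyarrow=True
    )


async def main() -> None:
    loop = asyncio.get_running_loop()
    for file in [...]:  # list of parquet file paths (elided)
        # Run the blocking read in the default thread pool executor.
        # The callable was elided as `func`; passing () for columns is a placeholder.
        await loop.run_in_executor(None, _parquet_file_to_df, file, ())


if __name__ == "__main__":
    asyncio.run(main())
    print("done")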

Expected behavior

The program terminates after printing done.

Installed versions

---Version info---
Polars: 0.15.16
Index type: UInt32
Platform: Linux-5.13.0-52-generic-x86_64-with-glibc2.35
Python: 3.10.6 (main, Nov 14 2022, 16:10:14) [GCC 11.3.0]
---Optional dependencies---
OpenBLAS WARNING - could not determine the L2 cache size on this system, assuming 256k
pyarrow: 10.0.1
pandas: 1.5.3
numpy: 1.24.1
fsspec: <not installed>
connectorx: <not installed>
xlsx2csv: <not installed>
deltalake: <not installed>
matplotlib: 3.6.3

Labels: bug (Something isn't working), python (Related to Python Polars)
