Description
I create a filesystem using from adlfs import AzureBlobFileSystem. I use that filesystem to do pd.read_parquet(..., filesystem=filesystem) and df.to_parquet(..., filesystem=filesystem).
When I serve my app with --autoreload I get
ERROR: This event loop is already running
Exception ignored in atexit callback: <bound method BaseServer._atexit of <panel.io.server.Server object at 0x7f70461b4cd0>>
Traceback (most recent call last):
  File "/home/jovyan/repos/mt-pm-reporting/.venv/lib/python3.11/site-packages/bokeh/server/server.py", line 290, in _atexit
    self.stop(wait=False)
  File "/home/jovyan/repos/mt-pm-reporting/.venv/lib/python3.11/site-packages/panel/io/server.py", line 354, in stop
    self._loop.asyncio_loop.run_until_complete(stop_autoreload())
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 629, in run_until_complete
    self._check_running()
  File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 588, in _check_running
    raise RuntimeError('This event loop is already running')
I believe the line self._loop.asyncio_loop.run_until_complete(stop_autoreload()) above needs to handle the case where the loop is already running. It is already running because, with autoreload, the script is run once before the server actually starts, and that run triggers the adlfs async functionality.
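The failure mode can be reproduced with asyncio alone. The sketch below is only an illustration: stop_autoreload_stub and run_or_schedule are hypothetical names, not Panel or Bokeh APIs. It shows that calling run_until_complete on a running loop raises the same RuntimeError, and one possible guard that schedules the coroutine instead. Whether scheduling is appropriate inside an atexit callback is an open question that the traceback does not settle.

```python
import asyncio


async def stop_autoreload_stub():
    """Hypothetical stand-in for the coroutine passed to run_until_complete."""
    return "stopped"


def run_or_schedule(loop, coro):
    """Run coro to completion, or schedule it if the loop is already running.

    Hypothetical helper, not part of Panel or Bokeh. When the loop is
    running, this must be called from the loop's own thread.
    """
    if loop.is_running():
        return loop.create_task(coro)
    return loop.run_until_complete(coro)


async def reproduce():
    loop = asyncio.get_running_loop()
    coro = stop_autoreload_stub()
    try:
        # Same call pattern as in the traceback, made while the loop runs.
        loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        error = str(exc)
    else:
        error = None
    # The guarded variant returns a task instead of raising.
    result = await run_or_schedule(loop, stop_autoreload_stub())
    return error, result


error, result = asyncio.run(reproduce())
print(error)
print(result)
```

When the loop is not running, run_or_schedule falls through to run_until_complete, so the existing behaviour is unchanged in that case.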
Minimum, reproducible example
panel==1.4.4, adlfs==2024.4.1
- Create an Azure Blob Storage and a container with the name test
- Set the environment variable: export AZURE_CONNECTIONSTRING_TEST="DefaultEndpointsProtocol=https;AccountName=.."
- Run the file below with python app.py to create the data
- Serve the file below with panel serve app.py --autoreload --index app
import os

import pandas as pd
import panel as pn
from adlfs import AzureBlobFileSystem

azure_blob_connection_string = os.environ["AZURE_CONNECTIONSTRING_TEST"]
filesystem = AzureBlobFileSystem(connection_string=azure_blob_connection_string)
path = "test/panel_adlfs.parquet"


def read_data():
    return pd.read_parquet(path=path, engine="pyarrow", filesystem=filesystem)


def write_data():
    # Replace any existing file so the example starts from a known state.
    if filesystem.exists(path):
        filesystem.rm(path)
    df = pd.DataFrame({"x": [1, 2, 3, 4]})
    df.to_parquet(path, engine="pyarrow", filesystem=filesystem)


if __name__ == "__main__":
    # python app.py: create the data and print it back.
    write_data()
    print(read_data())
elif pn.state.served:
    # panel serve app.py: read the data and display it.
    data = read_data()
    pn.panel(data).servable()
The interesting thing is that if you raise an exception in the script, for example raise ValueError(), the application will serve and show you the error. When you remove the ValueError the application will reload and work fine. I.e. the problem is that something else gets a chance to start the ioloop before Panel with autoreload.
Work Around
You can work around the problem if you make sure not to use the adlfs filesystem the first time the script is executed.
import os

import pandas as pd
import panel as pn
from adlfs import AzureBlobFileSystem

azure_blob_connection_string = os.environ["AZURE_CONNECTIONSTRING_TEST"]
filesystem = AzureBlobFileSystem(connection_string=azure_blob_connection_string)
path = "test/panel_adlfs.parquet"


def read_data():
    return pd.read_parquet(path=path, engine="pyarrow", filesystem=filesystem)


def write_data():
    if filesystem.exists(path):
        filesystem.rm(path)
    df = pd.DataFrame({"x": [1, 2, 3, 4]})
    df.to_parquet(path, engine="pyarrow", filesystem=filesystem)


def can_load():
    # Any execution after the first one may always load.
    if "can_load" in pn.state.cache:
        return True
    # First execution: only load when autoreload is off.
    pn.state.cache["can_load"] = True
    return not pn.config.autoreload


if __name__ == "__main__":
    write_data()
    print(read_data())
elif pn.state.served:
    if not can_load():
        pn.panel("Please reload ...").servable()
    else:
        data = read_data()
        pn.panel(data).servable()
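To see why the can_load gate skips only the first execution, here is a minimal sketch of the same logic with a plain dict standing in for pn.state.cache and a boolean argument standing in for pn.config.autoreload. Both stand-ins are assumptions made so the snippet runs without Panel or Azure.

```python
def can_load(cache, autoreload):
    """Same gate as in the workaround, with the Panel state passed in explicitly."""
    if "can_load" in cache:
        return True
    cache["can_load"] = True
    return not autoreload


cache = {}  # stand-in for pn.state.cache, shared across executions
first = can_load(cache, autoreload=True)    # first execution under --autoreload
second = can_load(cache, autoreload=True)   # any later execution
no_autoreload = can_load({}, autoreload=False)  # first execution without --autoreload

print(first, second, no_autoreload)
```

With autoreload on, the first call reports that the data should not be loaded and every later call allows it. Without autoreload the gate never blocks.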