A repository for learning and understanding memory usage in data applications.
Memory profiling is done with memray.
Each folder is a use case and contains:
- a notebook for playing around
- a Python script to execute with memray
- 001-hello-world-setup
- 002-generate-tpch-data-with-duckdb
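As an illustration of the kind of Python script a use case folder contains, here is a minimal sketch of a workload that allocates enough memory to show up in a profile. It is not taken from the repository; the names `build_rows` and `main` and the row count are hypothetical.

```python
"""Hypothetical hello-world workload for memray (not from the repository)."""


def build_rows(n_rows: int) -> list[dict[str, int]]:
    # Allocate a list of small dicts so the profiler has allocations to record.
    return [{"id": i, "value": i * 2} for i in range(n_rows)]


def main() -> None:
    rows = build_rows(100_000)
    total = sum(row["value"] for row in rows)
    print(f"rows={len(rows)} total={total}")


if __name__ == "__main__":
    main()
```

Keeping the allocation inside a named function makes it easier to spot in the resulting flame graph.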
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
memray run --native <script>
Look at the log output and create an HTML page with the command provided there.
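The two steps above (record a capture, then render it to HTML) could also be driven from Python. The sketch below is one possible way to do it, assuming that `memray-results` is a directory and that captures are named after the script; the helper names `memray_commands` and `profile` are hypothetical. It uses only the `memray run` and `memray flamegraph` subcommands with their `--native` and `--output` options.

```python
import shutil
import subprocess
from pathlib import Path


def memray_commands(
    script: str, results_dir: str = "memray-results"
) -> tuple[list[str], list[str]]:
    """Build the `memray run` and `memray flamegraph` command lines."""
    stem = Path(script).stem
    # Assumption: capture and report live side by side in results_dir.
    capture = str(Path(results_dir) / f"{stem}.bin")
    report = str(Path(results_dir) / f"{stem}.html")
    run_cmd = ["memray", "run", "--native", "--output", capture, script]
    report_cmd = ["memray", "flamegraph", "--output", report, capture]
    return run_cmd, report_cmd


def profile(script: str, results_dir: str = "memray-results") -> None:
    """Record a capture for `script`, then render it as an HTML flame graph."""
    if shutil.which("memray") is None:
        raise RuntimeError("memray not found; activate the virtual environment first")
    Path(results_dir).mkdir(exist_ok=True)
    for cmd in memray_commands(script, results_dir):
        subprocess.run(cmd, check=True)
```

Calling `profile("<script>")` from the activated virtual environment would then run both commands in sequence. Running the commands by hand, as described above, gives the same result.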
- extension
- results are written to memray-results
- you can install the Live Server extension in VS Code to open them with a right click
- it is important to understand the graph representations
- find the PID of the process: import os; pid = os.getpid(); print(pid)
- attach live to that process with memray
  - start a terminal as root
  - activate the virtual environment that has memray installed
  - execute memray attach <pid>
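To try the live attach steps above, it helps to have a process that prints its PID and then stays alive long enough to attach to. The following is a minimal sketch under that assumption; `attach_command`, `run_workload`, the round count, and the 1 MiB chunk size are all hypothetical choices, not part of the repository.

```python
import os
import time


def attach_command(pid: int) -> str:
    """Return the command to type in the second (root) terminal."""
    return f"memray attach {pid}"


def run_workload(rounds: int = 60, pause_seconds: float = 1.0) -> int:
    """Print the PID, then allocate 1 MiB per round so there is something to observe."""
    pid = os.getpid()
    print(f"PID: {pid}")
    print(f"In a second terminal: {attach_command(pid)}")
    chunks = []
    for _ in range(rounds):
        chunks.append(bytearray(1024 * 1024))  # keep a reference so memory grows
        time.sleep(pause_seconds)
    return len(chunks)
```

Calling `run_workload()` at the end of a script keeps the process alive for about a minute with the default arguments, which should leave enough time to switch terminals and attach.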
Some material to understand core concepts