Commit e0198c8

Merge pull request #6 from togethercomputer/fede/test-fix
simple test fix
2 parents b633ac3 + c977937 commit e0198c8

3 files changed: +131 -34 lines changed

README.md

Lines changed: 6 additions & 34 deletions
Original file line number | Diff line number | Diff line change
@@ -72,43 +72,15 @@ The ReAct agent supports two execution modes for running Python code:
7272

7373
**Docker Mode**: All code execution and file processing happens locally in your Docker container.
7474

75-
## ⚠️ Docker Mode Session Isolation Limitations
75+
## ⚠️ Docker Mode Limitations
7676

77-
**Important**: While Docker mode provides basic session isolation for variables, it has significant limitations:
77+
**Important**: Docker mode has session isolation limitations and security considerations for local development.
7878

79-
### ✅ What IS Isolated:
80-
- **User variables**: `x = 1` in one session won't affect another session
81-
- **Session state**: Each session maintains its own execution context
79+
- **Session isolation**: While user variables are isolated between sessions, module modifications and global state changes affect all sessions
80+
- **Host directory access**: The container has read-write access to specific host directories
81+
- **Best for**: Single-user local development and data analysis workflows
8282

83-
### ❌ What is NOT Isolated:
84-
- **Module modifications**: Changes to imported libraries affect ALL sessions
85-
- **Global state changes**: Modifications to `sys.path`, `os.environ`, etc. are shared
86-
- **Library monkey-patching**: Modifying `json.dumps`, `numpy` settings, etc. corrupts other sessions
87-
88-
### Examples of Problematic Code:
89-
```python
90-
# These operations will affect ALL sessions:
91-
import json
92-
json.dumps = custom_function # ❌ Breaks all sessions
93-
94-
import sys
95-
sys.path.append('/custom/path') # ❌ Affects all sessions
96-
97-
import os
98-
os.environ['KEY'] = 'value' # ❌ Global environment change
99-
```
100-
101-
### Docker Mode is OK For:
102-
- **Data analysis workflows**: Reading CSV/JSON files, pandas operations, statistical analysis
103-
- **Machine learning**: Training models, feature engineering, model evaluation
104-
- **Visualization**: Creating plots with matplotlib, seaborn, plotly
105-
- **Standard data science**: EDA, data cleaning, hypothesis testing
106-
- **Single-user development** and **testing environments**
107-
108-
### When to Use TCI Mode Instead:
109-
- **Multi-user environments** where sessions must be completely isolated
110-
- **Production applications** with concurrent users
111-
- **Workflows that modify global state** (if unavoidable)
83+
📖 **For detailed technical information, security warnings, and setup instructions, see the [Interpreter README](interpreter/README.md)**
11284

11385
## 🛠️ Usage
11486

interpreter/README.md

Lines changed: 123 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,123 @@
1+
# 🐳 Python Code Interpreter Service
2+
3+
A containerized Python code execution service that provides an API for running Python code in isolated sessions. This service is designed for local development and experimentation.
4+
5+
## ⚠️ Local Development Tool
6+
7+
**This is intended for local development only** - not for production deployment. The service allows arbitrary code execution and is designed to run in a disposable Docker container.
8+
9+
## 🚀 Quick Start
10+
11+
1. **Start the service:**
12+
```bash
13+
cd interpreter
14+
docker-compose up --build -d
15+
```
16+
17+
2. **Verify it's running:**
18+
```bash
19+
curl http://localhost:8000/health
20+
```
21+
22+
3. **Stop the service:**
23+
```bash
24+
docker-compose down
25+
```
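The `curl` check in step 2 can fail if it runs before the container has finished starting. The sketch below polls the same `/health` URL from Python until it answers. The function name `wait_for_health`, the timeout values, and the assumption that a healthy service answers with HTTP 200 are illustrative and not taken from the repository.

```python
import time
import urllib.request


def wait_for_health(url="http://localhost:8000/health", timeout=30.0, interval=1.0):
    """Poll `url` until it answers with HTTP 200 or `timeout` seconds elapse.

    Hypothetical helper: assumes a healthy service returns status 200.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval) as response:
                if response.status == 200:
                    return True
        except OSError:
            # Connection refused, timeout, or HTTP error: not ready yet.
            pass
        time.sleep(interval)
    return False
```

Calling `wait_for_health()` right after `docker-compose up --build -d` returns `True` once the endpoint responds, or `False` if the timeout elapses first.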
26+
27+
## 🔒 Security Model
28+
29+
### ✅ Container Isolation
30+
- All code runs inside a Docker container
31+
- Container is **disposable** - can be rebuilt if corrupted
32+
- No access to host system beyond mounted directories
33+
34+
### 🚨 Host Directory Access Warning
35+
36+
The Docker container has **read-write access** to specific host directories:
37+
38+
```yaml
39+
volumes:
40+
- ../eval/kaggle_data/spooky:/app/spooky:rw
41+
- ../eval/kaggle_data/jigsaw:/app/jigsaw:rw
42+
```
43+
44+
**⚠️ This means executed code can:**
45+
- **Modify or delete** files in `../eval/kaggle_data/spooky`
46+
- **Modify or delete** files in `../eval/kaggle_data/jigsaw`
47+
- **Create new files** in these directories
48+
49+
**✅ Code CANNOT access:**
50+
- Your home directory
51+
- Other projects
52+
- System files
53+
- Any directories outside the mounted volumes
54+
55+
### 🔧 Recommended: Read-Only Mounts
56+
57+
For safer operation, consider changing to read-only mounts in `docker-compose.yml`:
58+
59+
```yaml
60+
volumes:
61+
- ../eval/kaggle_data/spooky:/app/spooky:ro # read-only
62+
- ../eval/kaggle_data/jigsaw:/app/jigsaw:ro # read-only
63+
```
64+
65+
## ⚠️ Session Isolation Limitations
66+
67+
While the service provides basic session isolation, it has important limitations:
68+
69+
### ✅ What IS Isolated:
70+
- **User variables**: `x = 1` in one session won't affect another session
71+
- **Session state**: Each session maintains its own execution context
72+
73+
### ❌ What is NOT Isolated:
74+
- **Module modifications**: Changes to imported libraries affect ALL sessions
75+
- **Global state changes**: Modifications to `sys.path`, `os.environ`, etc. are shared
76+
- **Library monkey-patching**: Modifying `json.dumps`, `numpy` settings, etc. corrupts other sessions
77+
78+
### Examples of Problematic Code:
79+
```python
80+
# These operations will affect ALL sessions:
81+
import json
82+
json.dumps = custom_function # ❌ Breaks all sessions
83+
84+
import sys
85+
sys.path.append('/custom/path') # ❌ Affects all sessions
86+
87+
import os
88+
os.environ['KEY'] = 'value' # ❌ Global environment change
89+
```
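Why variables are isolated while module changes leak can be reproduced outside the service. The sketch below assumes each session is modeled as its own globals dictionary passed to `exec()` inside a single Python process. That is an assumption for illustration, not a description of how `session_manager.py` is actually written.

```python
import json

# Assumed model: one globals dict per session, all in the same process.
session_a = {}
session_b = {}

# User variables land in the per-session dict, so they do not collide.
exec("x = 1", session_a)
exec("x = 2", session_b)

# Modules are cached process-wide, so both sessions import the same `json` object.
original_dumps = json.dumps
try:
    exec("import json\njson.dumps = lambda obj, **kw: 'patched'", session_a)
    exec("import json\nresult = json.dumps({'a': 1})", session_b)
    leaked = session_b["result"]  # session B sees session A's patch
finally:
    json.dumps = original_dumps  # restore so the rest of the process is unaffected

print(session_a["x"], session_b["x"], leaked)
```

Under this model, the only way to contain such a patch is a separate process or container per session, which is the trade-off the limitations above describe.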
90+
91+
### Safe for Single-User Local Development:
92+
- **Data analysis workflows**: Reading CSV/JSON files, pandas operations
93+
- **Machine learning**: Training models, feature engineering
94+
- **Visualization**: Creating plots with matplotlib, seaborn
95+
- **Standard data science**: EDA, data cleaning, hypothesis testing
96+
97+
## 📁 File Structure
98+
99+
- `main.py` - FastAPI application with endpoints
100+
- `code_executor.py` - Core code execution logic
101+
- `session_manager.py` - Session handling and isolation
102+
- `download_data.py` - Data download from HuggingFace
103+
- `Dockerfile` - Container configuration
104+
- `docker-compose.yml` - Service orchestration
105+
- `requirements.txt` - Python dependencies
106+
107+
## 🧹 Cleanup
108+
109+
To completely remove the service and data:
110+
111+
```bash
112+
# Stop and remove containers
113+
docker-compose down
114+
115+
# Remove downloaded data
116+
rm -rf downloaded_data/
117+
118+
# Remove custom uploaded data
119+
rm -rf custom_data/
120+
121+
# Remove Docker images (optional)
122+
docker image prune -f
123+
```

tests/test_functional.py

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@
44
import tempfile
55
import requests
66
import asyncio
7+
import pytest
78
from open_data_scientist.codeagent import ReActDataScienceAgent
89
from concurrent.futures import ThreadPoolExecutor, as_completed
910

@@ -295,6 +296,7 @@ def test_end_to_end_agent_analysis():
295296
os.unlink(temp_file_path)
296297

297298

299+
@pytest.mark.asyncio
298300
async def test_concurrent_sessions_isolation():
299301
"""Test that 50 concurrent math operations across different sessions work correctly."""
300302
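The added `@pytest.mark.asyncio` marker matters because calling an `async def` function only creates a coroutine object; something has to drive it on an event loop. The marker comes from the pytest-asyncio plugin, which does that for the test. The standard-library sketch below shows the difference without pytest. `add_in_session` and `check_concurrent_sessions` are hypothetical stand-ins, not the repository's test code.

```python
import asyncio
import inspect


async def add_in_session(session_id, a, b):
    """Hypothetical stand-in for one session's remote code execution."""
    await asyncio.sleep(0)  # yield to the event loop, as a network call would
    return session_id, a + b


async def check_concurrent_sessions():
    """Run 50 additions concurrently and check each session's result."""
    results = await asyncio.gather(*(add_in_session(i, i, i) for i in range(50)))
    assert results == [(i, 2 * i) for i in range(50)]
    return len(results)


# Calling the coroutine function only builds a coroutine object; the body has not run.
pending = check_concurrent_sessions()
ran_without_loop = not inspect.iscoroutine(pending)
pending.close()  # avoid a "never awaited" warning

# An event loop must drive the coroutine; the plugin does this inside pytest.
completed = asyncio.run(check_concurrent_sessions())
```

Without the marker (or the plugin's auto mode), pytest does not await the coroutine, so the assertions inside the test body never execute.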
