Commit b75226b

PopSoda2002 authored and zhaochenyang20 committed
[Feat.] Enable grafana to show metrics (sgl-project#4718)
Co-authored-by: zhaochenyang20 <[email protected]>
1 parent 3bd1ee4 commit b75226b

6 files changed, 115 additions and 8 deletions

docs/references/production_metrics.md

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@
 
 SGLang exposes the following metrics via Prometheus. The metrics are namespaced by `$name` (the model name).
 
-An example of the monitoring dashboard is available in [examples/monitoring/grafana.json](../examples/monitoring/grafana.json).
+An example of the monitoring dashboard is available in [examples/monitoring/grafana.json](../examples/monitoring/grafana/dashboards/json/sglang-dashboard.json).
 
 Here is an example of the metrics:
 
@@ -150,7 +150,7 @@ In a new Grafana setup, ensure that you have the `Prometheus` data source enable
 
 If not, click `Add data source` -> `Prometheus`, set Prometheus URL to `http://localhost:9090`, and click `Save & Test`.
 
-To import the Grafana dashboard, click `+` -> `Import` -> `Upload JSON file` -> `Upload` and select [grafana.json](../examples/monitoring/grafana.json).
+To import the Grafana dashboard, click `+` -> `Import` -> `Upload JSON file` -> `Upload` and select [grafana.json](../examples/monitoring/grafana/dashboards/json/sglang-dashboard.json).
 
 ### Troubleshooting
 
examples/monitoring/README.md

Lines changed: 76 additions & 0 deletions
@@ -0,0 +1,76 @@
+# SGLang Monitoring Setup
+
+This directory contains a ready-to-use monitoring setup for SGLang using Prometheus and Grafana.
+
+## Prerequisites
+
+- Docker and Docker Compose installed
+- SGLang server running with metrics enabled
+
+## Usage
+
+1. Start your SGLang server with metrics enabled:
+
+```bash
+python -m sglang.launch_server --model-path meta-llama/Meta-Llama-3.1-8B-Instruct --port 30000 --enable-metrics
+```
+
+By default, the metrics server will run on `127.0.0.1:30000`.
+
+2. Start the monitoring stack:
+
+```bash
+cd examples/monitoring
+docker compose up
+```
+
+3. Access the monitoring interfaces:
+- Grafana: [http://localhost:3000](http://localhost:3000)
+- Prometheus: [http://localhost:9090](http://localhost:9090)
+
+Default Grafana login credentials:
+- Username: `admin`
+- Password: `admin`
+
+You'll be prompted to change the password on first login.
+
+4. The SGLang dashboard will be automatically available in the "SGLang Monitoring" folder.
+
+## Troubleshooting
+
+### Port Conflicts
+If you see errors like "port is already allocated":
+
+1. Check if you already have Prometheus or Grafana running:
+```bash
+docker ps | grep -E 'prometheus|grafana'
+```
+
+2. Stop any conflicting containers:
+```bash
+docker stop <container_id>
+```
+
+3. Ensure no other services are using ports 9090 and 3000:
+```bash
+lsof -i :9090
+lsof -i :3000
+```
+
+### Connection Issues
+If Grafana cannot connect to Prometheus:
+1. Check that both services are running
+2. Verify the datasource configuration in Grafana
+3. Check that your SGLang server is properly exposing metrics
+
+## Configuration
+
+- Prometheus configuration: `prometheus.yaml`
+- Docker Compose configuration: `docker-compose.yaml`
+- Grafana datasource: `grafana/datasources/datasource.yaml`
+- Grafana dashboard configuration: `grafana/dashboards/config/dashboard.yaml`
+- SGLang dashboard JSON: `grafana/dashboards/json/sglang-dashboard.json`
+
+## Customization
+
+You can customize the monitoring setup by modifying the configuration files as needed.
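
The README's last connection check ("your SGLang server is properly exposing metrics") can be done without Grafana. The following is a minimal sketch, not part of this commit: it assumes the server serves Prometheus text format at the path `/metrics` on `127.0.0.1:30000` (the path is an assumption based on the usual Prometheus convention), and the helper names `metric_names` and `fetch_metric_names` are hypothetical.

```python
import urllib.request


def metric_names(exposition_text: str) -> set[str]:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        # The metric name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


def fetch_metric_names(url: str = "http://127.0.0.1:30000/metrics") -> set[str]:
    """Fetch the endpoint (assumed URL) and return the metric names it exposes."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return metric_names(resp.read().decode("utf-8"))


# Offline sample in the same text format; the label values are made up.
SAMPLE = """\
# TYPE sglang:time_to_first_token_seconds histogram
sglang:time_to_first_token_seconds_bucket{le="0.5",model_name="demo"} 3.0
sglang:time_to_first_token_seconds_count{model_name="demo"} 4.0
"""
```

Calling `fetch_metric_names()` against a running server should return names that start with `sglang:` if metrics are enabled; an empty set or a connection error points at the server rather than at Prometheus or Grafana.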

examples/monitoring/docker-compose.yaml

Lines changed: 17 additions & 5 deletions
@@ -1,16 +1,28 @@
+version: '3'
 services:
   prometheus:
     image: prom/prometheus:latest
+    container_name: prometheus
     network_mode: host
-    ports:
-      - "9090:9090"
     volumes:
-      - ${PWD}/prometheus.yaml:/etc/prometheus/prometheus.yml
+      - ./prometheus.yaml:/etc/prometheus/prometheus.yml
+    command:
+      - '--config.file=/etc/prometheus/prometheus.yml'
+      - '--storage.tsdb.path=/prometheus'
 
   grafana:
     image: grafana/grafana:latest
+    container_name: grafana
     network_mode: host
+    volumes:
+      - ./grafana/datasources:/etc/grafana/provisioning/datasources
+      - ./grafana/dashboards/config:/etc/grafana/provisioning/dashboards
+      - ./grafana/dashboards/json:/var/lib/grafana/dashboards
+    environment:
+      - GF_AUTH_ANONYMOUS_ENABLED=true
+      - GF_AUTH_ANONYMOUS_ORG_ROLE=Viewer
+      - GF_AUTH_BASIC_ENABLED=false
+      - GF_USERS_ALLOW_SIGN_UP=false
+      - GF_DASHBOARDS_DEFAULT_HOME_DASHBOARD_PATH=/var/lib/grafana/dashboards/sglang-dashboard.json
     depends_on:
       - prometheus
-    ports:
-      - "3000:3000"
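
Both services use `network_mode: host` and this change drops the `ports:` mappings, so Prometheus and Grafana bind directly to the host's ports 9090 and 3000. A small pre-flight check can show whether those ports are already taken before `docker compose up`. This is a sketch only; `port_in_use` and `busy_ports` are hypothetical helper names, and the check only detects listeners reachable on `127.0.0.1`.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return s.connect_ex((host, port)) == 0


def busy_ports(ports=(9090, 3000)) -> list[int]:
    """List which of the stack's host ports already have a listener."""
    return [p for p in ports if port_in_use(p)]
```

An empty list from `busy_ports()` suggests the stack can start; any port it reports would need to be freed first, for example with the `lsof` commands from the README.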

examples/monitoring/grafana/dashboards/config/dashboard.yaml

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+apiVersion: 1
+providers:
+  - name: 'SGLang'
+    orgId: 1
+    folder: 'SGLang Monitoring'
+    type: file
+    disableDeletion: false
+    updateIntervalSeconds: 10
+    allowUiUpdates: false
+    options:
+      path: /var/lib/grafana/dashboards

examples/monitoring/grafana.json renamed to examples/monitoring/grafana/dashboards/json/sglang-dashboard.json

Lines changed: 1 addition & 1 deletion
@@ -388,7 +388,7 @@
 },
 "disableTextWrap": false,
 "editorMode": "code",
-"expr": "histogram_quantile(0.95, sum by (le) (rate(sglang:time_to_first_token_seconds_bucket[$__rate_interval])))\r\n",
+"expr": "histogram_quantile(0.5, sum by (le) (rate(sglang:time_to_first_token_seconds_bucket[$__rate_interval])))\r\n",
 "fullMetaSearch": false,
 "hide": false,
 "includeNullMetadata": true,
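
The only dashboard change is the quantile argument: this query now reports the median (p50) time to first token instead of p95. To illustrate what that argument does, here is a simplified Python sketch of how a quantile is estimated from cumulative histogram buckets by linear interpolation. It approximates the behaviour of PromQL's `histogram_quantile` and is not Prometheus's implementation; the bucket bounds and counts are made-up illustration values.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) pairs.

    Buckets must be sorted by upper bound, and the last bound must be +inf.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    total = buckets[-1][1]
    if total == 0:
        return math.nan  # no observations
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0  # the first bucket is assumed to start at 0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                return prev_bound  # rank is in the +Inf bucket: use the last finite bound
            # Assume observations are spread evenly inside the bucket.
            fraction = (rank - prev_count) / (count - prev_count)
            return prev_bound + (bound - prev_bound) * fraction
        prev_bound, prev_count = bound, count
    return prev_bound


# Hypothetical TTFT buckets in seconds: (le, cumulative count).
EXAMPLE = [(0.1, 10), (0.5, 60), (1.0, 90), (2.0, 100), (math.inf, 100)]
p50 = histogram_quantile(0.5, EXAMPLE)
p95 = histogram_quantile(0.95, EXAMPLE)
```

With these sample buckets the median falls inside the 0.1 to 0.5 second bucket while p95 falls inside the 1.0 to 2.0 second bucket, which is why the two settings can make the same panel look very different. In the dashboard query the counts are per-second rates from `rate(...)`, but the interpolation works the same way.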

examples/monitoring/grafana/datasources/datasource.yaml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+apiVersion: 1
+datasources:
+  - name: Prometheus
+    type: prometheus
+    access: proxy
+    url: http://localhost:9090
+    isDefault: true
+    editable: false
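
The provisioned datasource points Grafana at `http://localhost:9090`. When a panel stays empty, the same URL can be queried directly through Prometheus's HTTP API (`/api/v1/query`) to see whether data is arriving at all. The sketch below is an illustration under that assumption; `query_url`, `instant_values` and `run_query` are hypothetical helper names, and the sample payload is hand-written in the shape of an instant-vector response.

```python
import json
import urllib.parse
import urllib.request

PROMETHEUS_URL = "http://localhost:9090"  # same URL as the provisioned datasource


def query_url(expr: str, base: str = PROMETHEUS_URL) -> str:
    """Build an instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?" + urllib.parse.urlencode({"query": expr})


def instant_values(payload: dict) -> dict[str, float]:
    """Map each series' label set (as a JSON string) to its current value."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    return {
        json.dumps(series["metric"], sort_keys=True): float(series["value"][1])
        for series in payload["data"]["result"]
    }


def run_query(expr: str) -> dict[str, float]:
    """Run the query against a live Prometheus (requires the stack to be up)."""
    with urllib.request.urlopen(query_url(expr), timeout=5) as resp:
        return instant_values(json.load(resp))


# Hand-written sample response for the built-in `up` metric.
SAMPLE_RESPONSE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [{"metric": {"job": "sglang"}, "value": [1700000000.0, "1"]}],
    },
}
```

For example, `run_query("up")` returns one entry per scrape target, with a value of 1 for targets Prometheus can reach and 0 for those it cannot. The `job` label value `sglang` in the sample is a placeholder; the real value depends on `prometheus.yaml`, which is not shown in this diff.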
