Commit d6d3b8a

feat: add references for builtin ai agent (#1751)
1 parent 27381db commit d6d3b8a

4 files changed: +129 -28 lines changed


internal/agent/system_prompt.txt

Lines changed: 83 additions & 28 deletions
@@ -102,24 +102,35 @@ Example:
 ```yaml
 type: graph
 steps:
-  - name: fetch-data
+  - id: fetch_data
     command: curl -o data.json https://api.example.com/data
     depends: []
-  - name: process
+  - id: process
     command: python process.py
     depends:
-      - fetch-data
-  - name: notify
+      - fetch_data
+  - id: notify
     command: echo "Done"
     depends:
       - process
 ```
 
-When referencing step output with `${step_id.stdout}`, `${step_id.stderr}`, or `${step_id.exit_code}`, the step MUST have an `id:` field. Only `id:` registers a step in the reference map; `name:` alone does not work.
+**Step ID rules:** Always set `id:` on every step. Omit `name:` — it auto-fills from `id`. Regex: `^[a-zA-Z][a-zA-Z0-9_]*$` (no hyphens — use underscores). Max 40 chars. Reserved words: `env`, `params`, `args`, `stdout`, `stderr`, `output`, `outputs`.
 
 Use `script:` (not `command:`) when a step needs multi-line shell logic, pipes, or variables. `command:` is for single-line commands.
 
-For passing data between steps: use `output: VAR` + `$VAR` for small safe values (IDs, counts). Use `${step_id.stdout}` (file path reference) for large or untrusted content — this avoids shell expansion issues.
+**Passing data between steps:**
+- `output: VAR` captures stdout **content** into `${VAR}`. For JSON output, extract fields with `${VAR.key}`.
+- `${step_id.stdout}` is a **file path** to the step's stdout log, not the content. Use `cat "${step_id.stdout}"` to read it.
+- Use `output:` + `${VAR}` for small safe values (IDs, counts). Use `${step_id.stdout}` (file path) for large or untrusted content.
+- Resolution priority: `${foo.bar}` checks step references first, then JSON path on variables.
+
+**env ordering:** Use list-of-maps to preserve evaluation order. `env: {A: foo, B: ${A}/bar}` may fail because Go maps iterate randomly. Use:
+```yaml
+env:
+  - A: foo
+  - B: ${A}/bar
+```
 
 Parameter values with spaces must be quoted: `dagu start dag -- name="John Doe"`. Unquoted `name=John Doe` splits into two separate parameters.
 </correctness>
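
Note on the step ID rules in this hunk: the three constraints (regex, 40-character limit, reserved words) can be checked mechanically. The following is a minimal sketch, assuming the rules are applied exactly as the prompt text states them. `validateStepID`, `stepIDPattern`, `reservedStepIDs`, and `maxStepIDLen` are hypothetical names for illustration and are not taken from the Dagu codebase.

```go
package main

import (
	"fmt"
	"regexp"
)

// stepIDPattern mirrors the regex quoted in the prompt text.
var stepIDPattern = regexp.MustCompile(`^[a-zA-Z][a-zA-Z0-9_]*$`)

// reservedStepIDs lists the reserved words quoted in the prompt text.
var reservedStepIDs = map[string]bool{
	"env": true, "params": true, "args": true, "stdout": true,
	"stderr": true, "output": true, "outputs": true,
}

// maxStepIDLen is the length limit quoted in the prompt text.
const maxStepIDLen = 40

// validateStepID is a hypothetical helper, not Dagu's actual validator.
// It returns nil when id satisfies all three rules.
func validateStepID(id string) error {
	switch {
	case len(id) > maxStepIDLen:
		return fmt.Errorf("step id %q exceeds %d characters", id, maxStepIDLen)
	case !stepIDPattern.MatchString(id):
		return fmt.Errorf("step id %q must match %s", id, stepIDPattern)
	case reservedStepIDs[id]:
		return fmt.Errorf("step id %q is a reserved word", id)
	}
	return nil
}

func main() {
	// One valid id, one with a hyphen, one reserved word.
	for _, id := range []string{"fetch_data", "fetch-data", "stdout"} {
		fmt.Printf("%-12s -> %v\n", id, validateStepID(id))
	}
}
```

With these rules, the old `fetch-data` name from the removed lines is rejected because of the hyphen, which matches the rename to `fetch_data` in the example above.
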
@@ -265,6 +276,25 @@ Rules:
 </memory_management>
 {{end}}
 
+{{if .ReferencesDir}}
+<builtin_knowledge>
+Built-in reference documents are available at {{.ReferencesDir}}/. Use `read` to load them when you need detailed information beyond what `dagu schema` provides.
+
+Available references:
+- `schema.md` — Complete DAG YAML schema (top-level and step-level fields)
+- `executors.md` — All executor types with full configuration details
+- `cli.md` — All CLI subcommands with flags
+- `env.md` — Execution and configuration environment variables
+- `pitfalls.md` — Critical pitfalls and how to avoid them
+- `codingagent.md` — Integrating AI coding agents (Claude Code, Codex, Gemini, etc.) into DAG workflows
+
+Load a reference when:
+- A user asks about a specific executor, CLI command, or env var and `dagu schema` doesn't cover the detail
+- You need to write a DAG that uses coding agents (claude -p, codex exec, gemini -p, etc.)
+- You want to double-check a pitfall before authoring a DAG
+</builtin_knowledge>
+{{end}}
+
 <reference>
 Use `dagu schema` and `dagu example` via bash to look up DAG YAML structure and see examples:
 - `dagu schema dag` — root-level DAG fields
@@ -287,47 +317,72 @@ Available in all steps without declaration:
 - `DAG_RUN_WORK_DIR` — per-run temporary working directory
 - `DAGU_PARAMS_JSON` / `DAG_PARAMS_JSON` — all resolved params as JSON
 
-### Step References
-Use `${step_id.stdout}`, `${step_id.stderr}`, `${step_id.exit_code}` to reference a completed step's log file path or exit code. Slicing supported: `${step_id.stdout:0:5}`.
+### Lifecycle Hooks
 ```yaml
-steps:
-  - id: fetch
-    command: curl -s https://api.example.com/data
-  - id: process
-    script: |
-      cat "${fetch.stdout}" | jq '.items[]'
-    depends: [fetch]
+handler_on:
+  init:
+    command: echo "starting"
+  success:
+    command: echo "succeeded"
+  failure:
+    command: echo "failed with status ${DAG_RUN_STATUS}"
+  exit:
+    command: echo "always runs"
 ```
 
-### Parameters
+### Retry and Continue
 ```yaml
-params:
-  - NAME: "default"
-  - COUNT: "10"
+steps:
+  - id: flaky_step
+    command: curl http://api.example.com/data
+    retry_policy:
+      limit: 3
+      interval_sec: 10
+    continue_on:
+      failed: true
 ```
 
 ### Sub-DAGs
-Use `call:` to invoke another DAG. Define inline (after `---`) or in a separate file.
 ```yaml
 steps:
   - id: sub_task
-    call: other-dag
-    params: "KEY=$VALUE"
-    output: RESULT
-```
+    type: dag
+    call: child-workflow
+    params:
+      input_file: /data/input.csv
 
-Parallel execution:
-```yaml
   - id: fan_out
     call: worker
     parallel:
       items: ["A", "B", "C"]
-    output: RESULTS
 ```
 
-### Validation
+### Conditional Routing
+Routes map patterns to lists of existing step names.
+```yaml
+steps:
+  - id: check
+    command: echo "error"
+    output: RESULT
+  - id: route
+    type: router
+    value: ${RESULT}
+    routes:
+      "ok": [success_path]
+      "re:err.*": [error_path]
+    depends: [check]
+  - id: success_path
+    command: echo "success"
+  - id: error_path
+    command: echo "handling error"
+```
+
+### Validation and Inspection
 ```bash
+dagu config # show resolved paths (DAGs dir, logs, data)
 dagu validate my_dag.yaml # validate structure
 dagu dry my_dag.yaml -- p=val # dry run without executing
+dagu status my-dag # latest run status (tree view)
+dagu status --run-id=<id> my-dag # specific run
 ```
 </reference>

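Note on the env ordering rule added to `<correctness>` in the first hunk of this file: the rule exists because `B: ${A}/bar` can only resolve once `A` has been evaluated. The sketch below shows the effect of evaluation order, assuming each value is expanded against the entries declared before it. `envEntry` and `resolveEnv` are hypothetical names for illustration; Dagu's real resolver is not shown in this diff.

```go
package main

import (
	"fmt"
	"os"
)

// envEntry is one key/value pair from a list-of-maps env block.
type envEntry struct {
	Key, Value string
}

// resolveEnv is a hypothetical resolver. It expands each value in
// declaration order, looking up references only among earlier entries.
func resolveEnv(entries []envEntry) map[string]string {
	resolved := make(map[string]string, len(entries))
	for _, e := range entries {
		resolved[e.Key] = os.Expand(e.Value, func(name string) string {
			return resolved[name] // empty string when not yet defined
		})
	}
	return resolved
}

func main() {
	// A is defined before B refers to it.
	ordered := []envEntry{{"A", "foo"}, {"B", "${A}/bar"}}
	fmt.Println(resolveEnv(ordered)["B"]) // prints "foo/bar"

	// B is evaluated first, so ${A} expands to nothing.
	reversed := []envEntry{{"B", "${A}/bar"}, {"A", "foo"}}
	fmt.Println(resolveEnv(reversed)["B"]) // prints "/bar"
}
```

A slice keeps the declared order on every run. A Go map has no specified iteration order, so the map form of `env:` could hit either case.
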
internal/agent/types.go

Lines changed: 2 additions & 0 deletions
@@ -442,4 +442,6 @@ type EnvironmentInfo struct {
 	WorkingDir string
 	// BaseConfigFile is the path to the base configuration file.
 	BaseConfigFile string
+	// ReferencesDir is the directory containing built-in reference documents.
+	ReferencesDir string
 }

internal/persis/fileagentskill/examples.go

Lines changed: 38 additions & 0 deletions
@@ -96,6 +96,44 @@ func SeedExampleSkills(baseDir string) bool {
 	return true
 }
 
+const builtinKnowledgeEmbedDir = "examples/dagu/references"
+
+// SeedReferences extracts built-in reference documents to the given directory.
+// These are read-only knowledge files the AI agent can read on demand.
+// Returns the directory path if successful, empty string on failure.
+// Files are always overwritten on each startup to keep them up-to-date with the binary.
+func SeedReferences(destDir string) string {
+	if err := os.MkdirAll(destDir, skillDirPermissions); err != nil {
+		slog.Warn("Failed to create builtin knowledge directory", "dir", destDir, "error", err)
+		return ""
+	}
+
+	err := fs.WalkDir(exampleSkillsFS, builtinKnowledgeEmbedDir, func(path string, d fs.DirEntry, err error) error {
+		if err != nil || d.IsDir() {
+			return err
+		}
+		relPath := strings.TrimPrefix(path, builtinKnowledgeEmbedDir+"/")
+		destPath := filepath.Join(destDir, relPath)
+
+		data, readErr := exampleSkillsFS.ReadFile(path)
+		if readErr != nil {
+			slog.Warn("Failed to read embedded knowledge file", "path", path, "error", readErr)
+			return nil
+		}
+
+		if err := os.WriteFile(destPath, data, filePermissions); err != nil {
+			slog.Warn("Failed to write knowledge file", "path", destPath, "error", err)
+		}
+		return nil
+	})
+	if err != nil {
+		slog.Warn("Failed to walk embedded knowledge files", "error", err)
+		return ""
+	}
+
+	return destDir
+}
+
 // hasExistingSkills checks if the directory already contains skill subdirectories.
 func hasExistingSkills(baseDir string) bool {
 	entries, err := os.ReadDir(baseDir)

internal/service/frontend/server.go

Lines changed: 6 additions & 0 deletions
@@ -179,6 +179,11 @@ func NewServer(ctx context.Context, cfg *config.Config, dr exec.DAGStore, drs ex
 		}
 	}
 
+	// Seed built-in knowledge references to data dir (not git-synced).
+	fileagentskill.SeedReferences(
+		filepath.Join(cfg.Paths.DataDir, "agent", "references"),
+	)
+
 	var agentSkillStore agent.SkillStore
 	skillsDir := filepath.Join(cfg.Paths.DAGsDir, "skills")
 	if fileagentskill.SeedExampleSkills(skillsDir) && agentConfigStore != nil {
@@ -676,6 +681,7 @@ func initAgentAPI(ctx context.Context, store *fileagentconfig.Store, modelStore
 			ConfigFile:     paths.ConfigFileUsed,
 			WorkingDir:     paths.DAGsDir,
 			BaseConfigFile: paths.BaseConfig,
+			ReferencesDir:  filepath.Join(paths.DataDir, "agent", "references"),
 		},
 	})
 
