Instructions for Claude Code working on this Terraform Provider for Proxmox VE.
Never violate these — they cause bugs, test failures, or provider misbehavior.
| Never Do | Reason |
|---|---|
| Start work without a GitHub issue | All work must be tracked |
| Make assumptions without verification | Always verify with code/tests/mitmproxy |
| Skip acceptance tests | Tests reproduce and verify fixes |
| Commit without running linter | Always make lint first |
| Commit without explicit user request | User controls git operations |
| Add changes beyond what's requested | Only implement what's asked |
| Always Do | Reason |
|---|---|
| Verify GitHub issue exists first | No issue = flag deficiency, offer to help |
| Ask questions when uncertain | Never assume; clarify before proceeding |
| Create acceptance test BEFORE fixing | Proves issue exists, proves fix works |
| Verify API calls with mitmproxy | Tests passing ≠ correct API calls |
| Maintain session state for multi-step work | Enables context recovery across sessions |
| Run full checklist before completion | See Production Readiness Checklist |
All work on fixes or features MUST have a corresponding GitHub issue.
- Verify issue exists — Search for an existing issue
- If no issue exists — Flag deficiency, do NOT proceed
Flag this to the user:
"No GitHub issue found for this work. All fixes and features must be tracked with an issue before implementation begins."
Then offer to help create one:
- Ask: "Would you like me to help draft a GitHub issue?"
- Determine type: Bug or Feature/Enhancement
- Draft content following the template structure
- Provide draft for user to submit at: https://github.com/bpg/terraform-provider-proxmox/issues/new/choose
- Wait for issue number before proceeding
| Artifact | Format | Example |
|---|---|---|
| Branch | {type}/{issue}-{desc} | fix/1234-clone-timeout |
| Plans | .dev/{issue}_PLAN.md | .dev/1234_PLAN.md |
| PR body | .dev/{issue}_PR_BODY.md | .dev/1234_PR_BODY.md |
| Session state | .dev/{issue}_SESSION_STATE.md | .dev/1234_SESSION_STATE.md |
| Test names | Descriptive, NO issue numbers | TestAccResourceVMClone |
| VM names | Descriptive, NO issue numbers | test-vm-clone |
| Commits | Conventional, NO issue numbers | fix(vm): handle clone timeout |
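
The naming table above can be read as a small set of format strings. The sketch below is illustrative only: `artifactNames` and its parameters are hypothetical helpers, not part of the repository.

```go
package main

import "fmt"

// artifactNames derives the per-issue artifact names from the conventions
// table. Hypothetical helper for illustration; the repository may not
// contain anything like it.
func artifactNames(kind string, issue int, desc string) (branch, plan, prBody, session string) {
	branch = fmt.Sprintf("%s/%d-%s", kind, issue, desc)
	plan = fmt.Sprintf(".dev/%d_PLAN.md", issue)
	prBody = fmt.Sprintf(".dev/%d_PR_BODY.md", issue)
	session = fmt.Sprintf(".dev/%d_SESSION_STATE.md", issue)
	return branch, plan, prBody, session
}

func main() {
	branch, plan, prBody, session := artifactNames("fix", 1234, "clone-timeout")
	fmt.Println(branch)
	fmt.Println(plan)
	fmt.Println(prBody)
	fmt.Println(session)
}
```

Issue numbers appear only in branch and `.dev/` file names; test names, VM names, and commit subjects stay free of them.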
make build # Build provider binary
make lint # Run Go linter (auto-fixes formatting and most issues)
make test # Run unit tests
make docs # Regenerate Framework resource/datasource docs (not SDK)
./testacc TestName # Run specific acceptance test
npx --yes markdownlint-cli2 --fix "path/to/*.md" # Lint markdown files

Never manually format or lint code. Always use the appropriate linter tool.
| File type | Linter command | When to run |
|---|---|---|
| Go (.go) | make lint | After editing any .go file |
| Markdown | npx --yes markdownlint-cli2 --fix "file.md" | After editing any .md file |
./testacc TestAccResourceVM # Run single test
./testacc "TestAccResource.*" # Run tests matching pattern
./testacc --tier light # Light tests only (~30s)
./testacc --tier medium # Medium tests only (~3 min)
./testacc --tier heavy # Heavy tests only (~15 min)
./testacc --tier light,medium # Combine tiers
./testacc --tier all # All tiers with smart parallelism (~15 min)
./testacc --resource vm # All VM-related tests
./testacc --resource sdn # All SDN tests
./testacc --no-proxy TestName # Run without mitmproxy
./testacc TestName -- -count 2 # Pass flags through to go test

Tests are classified via //testacc:tier=X annotations in test files:
| Tier | Description | Parallelism | Time |
|---|---|---|---|
| light | API-only, no VMs or containers | -p 8 | ~30s |
| medium | Simple VMs with unique IDs | -p 4 | ~3 min |
| heavy | Cloud images, shared state | -p 1 | ~15 min |
Resource targeting via //testacc:resource=X annotations: vm, container, firewall, sdn, file, pool, acme, access, backup, ha, hardwaremapping, metrics, options, replication, apt, datastores, storage, network, misc
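
As a rough illustration of the annotation format described above, the sketch below splits a `//testacc:key=value` comment into its key and value. `parseAnnotation` is a hypothetical helper written for this example; how the real `testacc` script reads annotations is not shown here and may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// parseAnnotation extracts key and value from a "//testacc:key=value"
// comment line. Hypothetical helper; the actual testacc tooling may parse
// annotations differently.
func parseAnnotation(line string) (key, value string, ok bool) {
	rest, found := strings.CutPrefix(strings.TrimSpace(line), "//testacc:")
	if !found {
		return "", "", false
	}
	key, value, ok = strings.Cut(rest, "=")
	if !ok || key == "" || value == "" {
		return "", "", false
	}
	return key, value, true
}

func main() {
	lines := []string{"//testacc:tier=light", "//testacc:resource=vm", "// plain comment"}
	for _, line := range lines {
		key, value, ok := parseAnnotation(line)
		fmt.Println(key, value, ok)
	}
}
```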
Requires testacc.env with:
TF_ACC=1
PROXMOX_VE_API_TOKEN="root@pam!<token>=<value>"
PROXMOX_VE_ENDPOINT="https://<host>:8006/"
PROXMOX_VE_SSH_AGENT="true"
PROXMOX_VE_SSH_USERNAME="root"
# Optional: PROXMOX_VE_ACC_NODE_NAME, PROXMOX_VE_ACC_NODE_SSH_ADDRESS, etc.

Run /bpg:ready to execute automatically.

- make build: Must pass
- make lint: Must show 0 issues
- make test: All unit tests pass
- ./testacc TestAccYourFeature: Acceptance tests pass
- /bpg:debug-api: Verify API calls with mitmproxy
- make docs: Regenerate Framework docs if schema changed
- /bpg:prepare-pr: Generate PR body from template
See CONTRIBUTING.md. Key rules:
- Format: {type}({scope}): {description}
- Types: feat, fix, chore
- Scopes: vm, lxc, provider, core, docs, ci
- Lowercase, no period, under 72 chars, NO issue numbers
- DCO sign-off required: use git commit -s (adds Signed-off-by line)
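
The commit rules above can be checked mechanically. The following is a minimal sketch, not the project's actual check: `checkSubject` is a hypothetical function, and it assumes that "lowercase" means the description starts with a lowercase letter, that the 72-character limit applies to the whole subject line, and that an issue number looks like `#1234`.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// subjectPattern encodes {type}({scope}): {description} with the types and
// scopes listed above. Illustrative only.
var subjectPattern = regexp.MustCompile(`^(feat|fix|chore)\((vm|lxc|provider|core|docs|ci)\): (.+)$`)

// issueRef matches references such as "#1234" (assumed issue-number form).
var issueRef = regexp.MustCompile(`#\d+`)

// checkSubject returns nil when the subject line follows the listed rules.
func checkSubject(subject string) error {
	m := subjectPattern.FindStringSubmatch(subject)
	if m == nil {
		return fmt.Errorf("%q does not match {type}({scope}): {description}", subject)
	}
	desc := m[3]
	switch {
	case len(subject) >= 72:
		return fmt.Errorf("subject is %d chars, want under 72", len(subject))
	case desc[0] >= 'A' && desc[0] <= 'Z':
		return fmt.Errorf("description must start lowercase")
	case strings.HasSuffix(desc, "."):
		return fmt.Errorf("description must not end with a period")
	case issueRef.MatchString(desc):
		return fmt.Errorf("description must not reference an issue number")
	}
	return nil
}

func main() {
	subjects := []string{
		"fix(vm): handle clone timeout",
		"fix(vm): Handle clone timeout.",
		"fix(vm): handle clone timeout (#1234)",
	}
	for _, s := range subjects {
		fmt.Printf("%q: %v\n", s, checkSubject(s))
	}
}
```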
Use parallel agents for independent tasks to speed up work:
Good candidates for parallel execution:
- Research tasks (explore different parts of codebase simultaneously)
- Running independent test suites
- Searching for patterns across different directories
- Gathering context from multiple unrelated files
Not suitable for parallel execution:
- Tasks with dependencies (B needs output of A)
- File modifications (risk of conflicts)
- Sequential workflows (test → fix → verify)
How to request: Ask for agents to run "in parallel" explicitly.
LLMs have no memory between sessions. Externalize state to files:
- Session state file — The agent's memory across context resets
- Update before ANY context switch — End of session, new task, long operation
- Write "next action" for a stranger — Assume no prior context
- User decisions — Never re-ask; record in session state
- Agent assumptions — Make explicit; mark verified/rejected
- Reasoning — "Why" matters more than "what"
- Form hypothesis → test → record result
- Prevents circular debugging across sessions
- Use "Hypotheses Tested" table in session state
- Cache code patterns and file locations in session state
- Record dead ends so they're not re-explored
- Note key file:line references for quick restoration
- Each commit = working, resumable state
- If session dies mid-work, resume from last commit
- "Tests pass" ≠ correct behavior
- Verify with mitmproxy when available, OR use behavioral assertions in tests (uptime checks, API status queries) to prove the behavior change
- Include evidence in PR proof of work section
For long-running tasks:
- Checkpoint frequently: Update session state after every successful test run
- Summarize completed work: Don't keep raw exploration in context; distill findings
- Chunk large changes: Break into atomic commits to create resume points
- Use /bpg:resume: Start new sessions by loading session state, not from memory
When things go wrong:
- Test failures: Record in session state, add to "Hypotheses Tested", don't mark complete
- API errors: Capture in mitmproxy log, document in session state
- Context loss: Always resume from session state file using /bpg:resume
- Blocked work: Update session status to "Blocked", document blocker, move to next task
When handing off work:
- To another agent: Ensure "Quick Context Restore" is complete and current
- To human: Create PR using /bpg:prepare-pr, reference session state location
- From human: Use /bpg:resume, ask about any "Unverified" assumptions
- Go 1.25+ required
- golangci-lint 2.8.0: installed automatically by make lint
- Line length limit: 150 characters (enforced by linter)
- Comment line wrap: ~120 characters (not 70–80; the linter allows 150, so narrow wrapping wastes vertical space)
- Dual-provider: SDK v2 (proxmoxtf/) and Plugin Framework (fwprovider/)
- New features: Framework only; SDK is feature-frozen
├── proxmox/ # Shared API client
│ └── retry/ # Unified retry logic (TaskOperation, APICallOperation, PollOperation)
├── fwprovider/ # Framework provider ← NEW CODE HERE
│ ├── test/ # Shared test utilities and acceptance tests
│ ├── config/ # Provider configuration types (Resource, DataSource)
│ ├── attribute/ # Attribute helpers (ResourceID, CheckDelete, IsDefined)
│ ├── types/ # Custom attribute types (stringset, etc.)
│ └── validators/ # Custom validators
├── proxmoxtf/ # Legacy SDK provider (feature-frozen)
├── utils/ # Shared utilities (maps, sets, strings, IP)
├── .dev/ # Development tools, plans, and session files
├── example/ # Example Terraform configurations
├── templates/ # Doc templates for Framework resources/datasources
└── docs/ # Provider documentation (mixed: see Documentation section)
proxmox.Client
├── Node(name) → nodes.Client
├── Cluster() → cluster.Client
├── Access() → access.Client
├── Pool() → pools.Client
├── Storage() → storage.Client
├── Version() → version.Client
├── API() → api.Client (raw HTTP)
└── SSH() → ssh.Client
- Verify GitHub issue exists — Flag deficiency if not
- Create branch: fix/{issue}-description
- Create session state: .dev/{issue}_SESSION_STATE.md
- Create acceptance test that reproduces the issue
- Verify test fails with current code
- Implement fix
- Verify test passes
- Run linter: make lint
- Verify with mitmproxy
- Complete checklist
- Verify GitHub issue exists — Flag deficiency if not
- Create branch: feat/{issue}-description
- Create session state: .dev/{issue}_SESSION_STATE.md
- Implement in Framework provider only (fwprovider/)
- Add validation, acceptance tests, documentation
- Complete checklist
Each resource has 3 files: resource_*.go (CRUD), *_model.go (API mapping), resource_*_test.go (acceptance tests). Client access flows through config.Resource → cfg.Client.Domain().SubClient().
schema.StringAttribute{
Required: true,
Validators: []validator.String{
stringvalidator.OneOf("a", "b"),
},
}
resp.Diagnostics.AddError("Unable to Create Resource", err.Error())

Error diagnostic conventions: New code should use "Unable to [Action] [Resource]" format (see ADR-005). Include the resource name/ID in the summary (e.g., fmt.Sprintf("Unable to Read VM %q", name)), because domain clients do not reliably include it in err.Error(). No trailing period. Pass err.Error() as the detail string; never double-wrap. Legacy prefixes ("Could not", "Error") are acceptable in existing code.
In a datasource, attributes that are purely output (populated by the provider during Read) must be Computed: true only — never Optional. This applies to all attributes except lookup keys (which are Required).
| Attribute role | Schema flags | Example |
|---|---|---|
| Lookup key | Required: true | id, node_name |
| Read-only output | Computed: true | name, status, tags, cpu block |
Why not Optional on outputs? Optional on a datasource output lets users write values in config that are silently ignored — misleading UX and confusing docs (attributes appear under "Optional" instead of "Read-Only").
Nil API values in Computed fields: After Read, Computed attributes must have a known value — null means "unknown" which is only valid during planning. Convert nil API pointers to sensible defaults: "" for strings, false for bools, empty collections for sets/maps. Use types.StringValue("") instead of types.StringPointerValue(nil).
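
The nil-to-default conversion can be sketched with a small generic helper. `valueOr` is a hypothetical name introduced for this example; in provider code its result would be wrapped with the Framework constructors, for instance `types.StringValue(valueOr(apiName, ""))`.

```go
package main

import "fmt"

// valueOr returns the pointed-to value, or fallback when the API left the
// field unset (nil pointer). Hypothetical helper for illustration.
func valueOr[T any](p *T, fallback T) T {
	if p == nil {
		return fallback
	}
	return *p
}

func main() {
	var name *string // field absent from the API response
	running := true

	fmt.Printf("%q\n", valueOr(name, ""))   // prints ""
	fmt.Println(valueOr(&running, false))  // prints true
	fmt.Println(len(valueOr[[]string](nil, []string{}))) // prints 0
}
```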
Nested blocks in datasources (e.g., cpu, vga, rng): The datasource should have its own DataSourceSchema() with Computed: true on the block and all inner attributes. Do not reuse ResourceSchema() which has Optional: true, Computed: true for resource write semantics.
When the Proxmox API uses comma-separated strings (e.g., vmid=100,101,102), always expose them as Terraform list or set attributes — never as raw comma-separated strings. Convert in toAPI() (join) and fromAPI() (split). See ADR-004 for details and code examples.
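
A minimal sketch of the two conversion directions follows. It is not the ADR-004 code: `joinIDs` and `splitIDs` are hypothetical names, and trimming whitespace and skipping empty elements are assumptions about how tolerant the split should be.

```go
package main

import (
	"fmt"
	"strings"
)

// joinIDs is the toAPI direction: list elements become one comma-separated
// string. Hypothetical helper.
func joinIDs(ids []string) string {
	return strings.Join(ids, ",")
}

// splitIDs is the fromAPI direction: it returns an empty, non-nil slice for
// an empty string and drops blank elements. Hypothetical helper.
func splitIDs(raw string) []string {
	out := []string{}
	for _, part := range strings.Split(raw, ",") {
		if part = strings.TrimSpace(part); part != "" {
			out = append(out, part)
		}
	}
	return out
}

func main() {
	ids := splitIDs("100,101,102")
	fmt.Println(len(ids), ids)
	fmt.Println(joinIDs(ids))
	fmt.Println(len(splitIDs("")))
}
```

Returning an empty slice rather than a slice containing one empty string avoids a spurious element when the API omits the field.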
Three operation types — choose based on the API call pattern:
// Async UPID tasks (create, clone, delete, start):
op := retry.NewTaskOperation("name", retry.WithRetryIf(retry.IsTransientAPIError))
op.DoTask(ctx, dispatchFn, waitFn)
// Synchronous blocking calls (PUT /config):
op := retry.NewAPICallOperation("name", retry.WithRetryIf(retry.ErrorContains("got timeout")))
op.Do(ctx, fn)
// Polling loops (wait for status, config unlock):
op := retry.NewPollOperation("name", retry.WithRetryIf(func(err error) bool { ... }))
op.DoPoll(ctx, fn)

Delete predicate trap: ErrResourceDoesNotExist can arrive via HTTP 500, so IsTransientAPIError alone will match it. Delete operations must combine predicates:
retry.WithRetryIf(func(err error) bool {
return retry.IsTransientAPIError(err) && !errors.Is(err, api.ErrResourceDoesNotExist)
})

See ADR-005: Error Handling for full details.
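
To see why the combined predicate matters, the sketch below reproduces the trap with stand-ins. Everything in it is hypothetical: `httpError`, `isTransient`, and `errResourceDoesNotExist` only imitate `retry.IsTransientAPIError` and `api.ErrResourceDoesNotExist`, and treating every 5xx as transient is an assumption made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Stand-ins for the real sentinel and a generic API failure.
var (
	errResourceDoesNotExist = errors.New("resource does not exist")
	errBusy                 = errors.New("service unavailable")
)

// httpError wraps an underlying error with an HTTP status code.
type httpError struct {
	code int
	err  error
}

func (e *httpError) Error() string { return fmt.Sprintf("HTTP %d: %v", e.code, e.err) }
func (e *httpError) Unwrap() error { return e.err }

// isTransient imitates a transient-error predicate: any 5xx counts (assumption).
func isTransient(err error) bool {
	var he *httpError
	return errors.As(err, &he) && he.code >= 500
}

// retryDelete is the combined predicate: retry transient failures, but stop
// as soon as the resource is already gone.
func retryDelete(err error) bool {
	return isTransient(err) && !errors.Is(err, errResourceDoesNotExist)
}

func main() {
	gone := &httpError{code: 500, err: errResourceDoesNotExist}
	busy := &httpError{code: 503, err: errBusy}

	fmt.Println(isTransient(gone), retryDelete(gone)) // true false
	fmt.Println(isTransient(busy), retryDelete(busy)) // true true
}
```

With the transient predicate alone, a delete of an already-removed resource would keep retrying until the operation gives up.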
"key": {
Type: schema.TypeString,
Required: true,
ValidateDiagFunc: validation.ToDiagFunc(
validation.StringInSlice([]string{"a", "b"}, false)),
}

When fixing validation issues, update BOTH providers where applicable.
- VMs with started = true need boot disk with cloud image; use stop_on_destroy = true
- Naming: Descriptive names only, NO issue numbers
- API verification: Use /bpg:debug-api for mitmproxy workflow
- Behavioral assertions: When verifying side effects (reboots, state changes), use direct API checks in test check functions rather than relying only on Terraform state attributes. Example: use te.NodeClient().VM(vmID).GetVMStatus(ctx) to check uptime before/after to detect reboots. See resource_vm_hotplug_test.go and resource_vm_disks_test.go for patterns.
- TDD acceptance tests: Tests MUST actually fail without the fix. If a test passes both with and without the fix, it doesn't prove anything; add behavioral assertions (uptime, status, API checks) that detect the actual behavior change.
- Functional coverage: Tests must cover ALL major use cases for the resource, not just one happy path. Different input modes (e.g., all vs vmid vs pool), list attributes with multiple elements, compound fields, nested objects, and import round-trips must each have test scenarios. PRs with insufficient functional coverage will be rejected. See ADR-006.
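
The uptime-based reboot check mentioned in the list above comes down to one comparison. The sketch below is a simplified, standalone version: `rebooted` is a hypothetical helper, the values are plain seconds, and in a real test the two samples would come from the VM status call rather than literals.

```go
package main

import "fmt"

// rebooted reports whether a VM restarted between two uptime samples.
// Without a reboot, uptimeAfter is roughly uptimeBefore + elapsed; a reboot
// resets the counter, so the later sample falls short of that sum.
// tolerance absorbs sampling jitter. All values are in seconds.
// Hypothetical helper for illustration.
func rebooted(uptimeBefore, uptimeAfter, elapsed, tolerance int) bool {
	return uptimeAfter+tolerance < uptimeBefore+elapsed
}

func main() {
	fmt.Println(rebooted(100, 130, 30, 5)) // false: uptime kept growing
	fmt.Println(rebooted(100, 20, 30, 5))  // true: counter was reset
	fmt.Println(rebooted(10, 50, 60, 5))   // true: reset, then ran 50s
}
```

Comparing against the elapsed time, rather than only checking that the second sample is smaller, also catches a reboot that happened long enough ago for uptime to exceed the first sample.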
Docs under docs/ are a mix of auto-generated and manually maintained files.
| Provider | Docs generation | Edit where |
|---|---|---|
| Framework (fwprovider/) | Auto-generated by make docs from schema + optional templates/ overrides | Edit templates/resources/<name>.md.tmpl (or templates/data-sources/<name>.md.tmpl). If no custom template exists, docs come from the schema MarkdownDescription fields in Go code. |
| SDK (proxmoxtf/) | Manually maintained | Edit docs/ files directly |
Key rules:
- make docs only regenerates Framework resource/datasource docs and guides with templates; SDK docs are untouched
- Manual edits to docs/ files for Framework resources will be lost on make docs; always edit the template or schema description instead
- Manual edits to docs/ files for SDK resources are safe; they are the source of truth
- Custom templates in templates/ override default tfplugindocs generation for specific Framework resources
Guides use two patterns:
| Pattern | Source of truth | Examples |
|---|---|---|
| A (template-driven) | templates/guides/<name>.md.tmpl with {{ codefile }} directives; examples in examples/guides/<name>/ | clone-vm, vm-lifecycle |
| B (direct markdown) | docs/guides/<name>.md edited directly; inline HCL blocks | multi-node, upgrade, migration-vm-clone, cloned-vm |
For Pattern A guides, edit the template — docs/guides/<name>.md is auto-generated by make docs and will be overwritten.
For multi-step work, maintain session state using .dev/SESSION_STATE_TEMPLATE.md.
Location: .dev/{issue}_SESSION_STATE.md
Key sections to maintain:
- Quick Context Restore — For fast agent bootstrap
- User Decisions — Prevent re-asking
- Assumptions Made — Track verification status
- Context Gathered — Save re-reading files
- Hypotheses Tested — For debugging sessions
Update triggers:
- Before ending session
- Before context-heavy operations
- After completing a phase
- When blocked or switching tasks
| Do | Don't |
|---|---|
| Be concise and direct | Apologize |
| Use technical terminology | Summarize changes made |
| Explain reasoning | Make up information |
| Admit uncertainty | Show implementation unless asked |
| Skill | Purpose |
|---|---|
| /bpg:start-issue | Start work on a GitHub issue (branch + session state) |
| /bpg:resume | Resume work from a previous session |
| /bpg:ready | Run production readiness checklist |
| /bpg:debug-api | Debug API calls with mitmproxy |
| /bpg:prepare-pr | Prepare PR body from template with proof of work |
See .dev/README.md for detailed workflow documentation and how skills connect together.
- CONTRIBUTING.md — Contributing guide
- docs/adr/ — Architecture Decision Records and reference examples
- .dev/DEBUGGING.md — Debugging guide
- .dev/SESSION_STATE_TEMPLATE.md — Session template
- Proxmox API
- Terraform Plugin Framework