Skip to content

Execution Call Graph to detect cyclic tasks #2280

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

trulede
Copy link
Contributor

@trulede trulede commented Jun 1, 2025

A prototype of an Execution Call Graph, can be used to detect cyclic task calls, and for profiling. It uses the same techniques as used for the Reader graph.

The initial implementation is using graph features to detect duplicate edges and cycles. Because of the way the graph hash is currently generated, it's also possible to use duplicate vertices as a detection method (but currently disabled). That might change, depending on how the call graph (and hash) are realised.

Some very simple test data exists, to prove the concept, and more would be needed to cover all possible call patterns. Quick experiment with these commands:

go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml call
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml dep
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml multi-cmd-cycle
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml multi-cmd-ok
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml multi-dep-cycle
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml multi-dep-ok

Next? more and more examples to check the graph algorithm, more thinking about the graph itself (if the idea/algo is correct), adding profiling data, a way to print the call graph (similar to the reader).

@trulede trulede changed the title Execution Call Graph Execution Call Graph to detect cyclic tasks Jun 1, 2025
@trulede
Copy link
Contributor Author

trulede commented Jun 2, 2025

Some testing of for loops. Works, but, if the call graph converges (e.g. task calls task with no vars, so hash is identical, which means identical vertex) then ErrEdgeAlreadyExists will occur.

Possible solution:

  • Change the parent hash of a vertex, from the task name (i.e. t.Task) to a function of the parent task. Such a change would result is a unique call history and thus no convergence. Need to consider if such a graph is more, or less, correct.
  • Depend on the cycle detection, and consume the ErrEdgeAlreadyExists error. This is tested and works.

Real question for consideration is if the call graph should really converge? Or if the call graph should be made unique? When the graph converges, call information is lost (e.g. profiling information or other tracking). Some of that would also depend on how the run is configured, for instance, convergence may be correct when run=once.

With run=always task calls would be this:
A -> B1 -> C -> D
A -> B2 -> C -> D

but with convergence of the call graph you end up with:
A -> B1 -> C -> D
A -> B2 -> C' ** converges on C

In some ways that is correct (for a graph) but not really correct for a call graph ... unless run=once when the second C should never be called ... another thing to check.

Quite a few variations checked:
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-ok
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-ok-duplicate-item
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-cycle
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-task
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-task-cycle
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-dep
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-nodep
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-nodep-var
go run ./cmd/task -v --dir testdata/cyclic/Taskfile_callgraph.yml for-item-nodep-var-var

Something similar needs to be done with sources, since it might be necessary to add them (sources) to the task call hash.

@trulede
Copy link
Contributor Author

trulede commented Jun 4, 2025

The graph seems to work, and the theory behind it is becoming more robust. I think the sources for a task will need to be added to the hash, but this needs to be proven with a test. - sources are added to task variables (special vars) which servers the required purpose ... however, it's possible those special vars are added after the graph vertex is created ... in which case it might be necessary to update the vertex hash value, or inject another edge/vertex to capture that constraint. I suspect that updating the vertex hash value will be sufficient.

Also, worthwhile covering some of the previous reported cases: #820

And, I think it could be useful to be able to disable this with an envar, since ... sometimes people know what they are doing ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant