Skip to content

get_subgraph fails when nodes share a common prefix #6924

@Joehunk

Description

@Joehunk

Checked other resources

  • This is a bug, not a usage question.
  • I added a clear and descriptive title that summarizes this issue.
  • I used the GitHub search to find a similar question and didn't find it.
  • I am sure that this is a bug in LangGraph rather than my code.
  • The bug is not resolved by updating to the latest stable version of LangGraph (or the specific integration package).
  • This is not related to the langchain-community package.
  • I posted a self-contained, minimal, reproducible example. A maintainer can copy it and run it AS IS.

Reproduction Steps / Example Code (Python)

from langgraph.graph import StateGraph

    class State(TypedDict):
        do_not_care: str

    def generic_node(state: State) -> State:
        # Just need a properly typed node; we will not be running this graph
        return state

    grandchild_graph_builder = StateGraph(State)
    grandchild_graph_builder.add_node("grandchild", generic_node)
    grandchild_graph_builder.set_entry_point("grandchild")
    grandchild_graph_builder.add_edge("grandchild", END)
    grandchild_graph = grandchild_graph_builder.compile()

    child_graph_builder = StateGraph(State)
    child_graph_builder.add_node("child_subgraph", grandchild_graph)
    child_graph_builder.set_entry_point("child_subgraph")
    child_graph_builder.add_edge("child_subgraph", END)
    child_graph = child_graph_builder.compile()

    parent_graph_builder = StateGraph(State)
    parent_graph_builder.add_node("common_prefix", child_graph)
    parent_graph_builder.add_node("common_prefix_2", child_graph)
    parent_graph_builder.set_entry_point("common_prefix")
    parent_graph_builder.add_edge("common_prefix", "common_prefix_2")
    parent_graph_builder.add_edge("common_prefix_2", END)

    parent_graph = parent_graph_builder.compile()

    # It's pretty intricate to get this to fail against the previous
    # implementation of get_subgraphs. You need the following conditions:
    # * A parent graph with 2 subgraphs
    # * Subgraph A added before subgraph B
    # * Subgraph A's name is a prefix of subgraph B's name
    # * Subgraph B must also contain a subgraph (call it subgraph C)
    # * You must call get_subgraphs with the namespace 'subgraph_b|subgraph_c'
    #
    # The old implementation used prefix matching to find subgraphs,
    # so it would chop of subgraph A's name from subgraph B's name and
    # look for subgraph C under the chopped-off name, which would fail.
    subgraphs = parent_graph.get_subgraphs(
        namespace="common_prefix_2|child_subgraph",
        recurse=True,
    )
    assert len(list(subgraphs)) == 1

Error Message and Stack Trace (if applicable)

(assert at the end of example code fails)

Description

Pregel.get_subgraphs() had a subtle bug due to the fact that it matches node names with namespaces using startswith. This created the possibility (which I did in fact hit in production) that 2 nodes whose names share a common prefix, and both contain subgraphs, could cause the algorithm to malfunction. It would find the first node (whose name was a prefix of the second), chop off that many characters leaving a partial namespace, and recurse on that.

System Info

This is a logic bug not affected by system info. I put up a PR to fix it: #6366

Metadata

Metadata

Assignees

No one assigned

    Labels

    bugSomething isn't workingexternal

    Type

    No fields configured for Bug.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions