Skip to content

Implement serde, getTuple(), and list() in BaseCheckpointSaver#119

Merged
Nuno Campos (nfcampos) merged 18 commits intolangchain-ai:mainfrom
andrewnguonly:migrate-checkpoint
Apr 23, 2024
Merged

Implement serde, getTuple(), and list() in BaseCheckpointSaver#119
Nuno Campos (nfcampos) merged 18 commits intolangchain-ai:mainfrom
andrewnguonly:migrate-checkpoint

Conversation

@andrewnguonly
Copy link
Copy Markdown
Contributor

@andrewnguonly Andrew Nguonly (andrewnguonly) commented Apr 19, 2024

Summary

To get checkpointing functionality up to parity with LangGraph's Python implementation, the BaseCheckpointSaver class needs to be updated to define serde, getTuple() and list(). Downstream dependencies (e.g. MemorySaver) are updated accordingly.

Implementation

  1. Move deepCopy() to /checkpoint/base.ts.
  2. Update BaseCheckpointSaver: Define serde attribute. Define getTuple() and list() methods. Update methods to return Promise.
  3. Update MemorySaver: Update storage attribute type. Update method implementations.
  4. Implement SqliteSaver.

To Do

  • Implement serde in MemorySaver.
  • Implement SQLite checkpointer and tests.
  • Add unit tests for getTuple() and list().
  • Clean up the code a bit.
  • Update SQLite schema to use camel_case.
  • Add changes for SqliteSaver from this PR: Add FewShotExamples managed value langgraph#332

@andrewnguonly Andrew Nguonly (andrewnguonly) marked this pull request as draft April 19, 2024 03:15
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is named checkpoints.test.ts instead of checkpoint.test.ts because the jest --testPathIgnorePatterns=\\.int\\.test.ts config in package.json is ignoring the later file name for some reason. The regex is correct, but there seems to be a bug with jest.

I'm not sure what the best solution is. I tried updating the regex to require at least one character before the first dot (.), but I couldn't get jest to pick up any files.

}

const CheckpointThreadId: ConfigurableFieldSpec = {
id: "threadId",
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nuno Campos (@nfcampos), general question: The JS implementation serializes threadId and threadTs in camelCase (as opposed to snake_case in Python). Is there any value in making them both one way or the other?

Hypothetical scenario: Python graph checkpoints state, pauses execution, and JS graph (or some other runtime) resumes execution.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep would be nice to use same keys

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll make the changes in this PR.

}

export abstract class BaseCheckpointSaver {
at: CheckpointAt = CheckpointAt.END_OF_RUN;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change this to CheckpointAt.END_OF_STEP.


export interface SerializerProtocol {
// eslint-disable-next-line @typescript-eslint/no-explicit-any
dumps(obj: any): any;
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what the right input/output types are for dumps() and loads().

Copy link
Copy Markdown
Contributor

@nfcampos Nuno Campos (nfcampos) Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in JS the most common would be unknown -> string -> unknown

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me know if you like this approach better: 348572c (commit)

I implemented generics on the BaseCheckpointSaver class and the SerializerProtocol interface. This ensures that a type incompatible serializer cannot be passed to the checkpoint saver constructor. For example, the SqliteSaver requires the checkpoint to be serialized to a string. A serializer that doesn't implement this correctly will raise an error. It's also nice to help developers implement dumps() and loads() with the correct types.

That being said, I'm not sure how much value this provides.

) {
super(serde, at);
this.db = new Database(connStringOrLocalPath);
this.setup().catch((error) =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't there a race condition here? you could end up calling one of the methods before setup finishes

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. Let me fix.

I reproduced the race condition by adding a sleep statement.

private async setup(): Promise<void> {
  // sleep for 3 seconds
  await new Promise((resolve) => setTimeout(resolve, 3000));
  ...
}

The unit tests failed.

private async setup(): Promise<void> {
try {
this.db.exec(`
CREATE TABLE IF NOT EXISTS checkpoints (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

prob worth enabling wal mode? I know I haven't done that in py, prob should there too

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I haven't done that in py, prob should there too

Added this to my To Do list for later.

loads(data: L): D;
}

export abstract class BaseCheckpointSaver<D, L> {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D here should always be Checkpoint

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. I removed D and hardcoded Checkpoint in the internal implementation of BaseCheckpointSaver.

@nfcampos Nuno Campos (nfcampos) merged commit c27db27 into langchain-ai:main Apr 23, 2024
@andrewnguonly Andrew Nguonly (andrewnguonly) deleted the migrate-checkpoint branch April 23, 2024 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants