feat(sdk): add preview config to createFileSystemSerdes #523
Open · ParidelPooya wants to merge 10 commits into main
Conversation
Add optional preview configuration that stores a subset of the serialized
value inline in the checkpoint envelope alongside the file pointer. This
makes data visible in the console and API without reading the full file.
A preview is generated whenever data is written to a file: always in ALWAYS
mode, and in OVERFLOW mode when the payload exceeds the threshold. Inline
payloads in OVERFLOW mode do not get a preview, since the full data is already
in the checkpoint.
New types:
- PreviewMode (INCLUDE_ALL | EXCLUDE_ALL): default visibility strategy
- FieldMatchMode (ANYWHERE | PATH): how field names are matched
- PreviewField: { name, match? } field selector
- PreviewConfig: { mode, include?, exclude?, mask?, maskString?, maxPreviewBytes? }
Priority rules:
- exclude always wins (even over mask)
- mask implies visibility — masked fields are shown (with maskString) unless excluded
- maxPreviewBytes (default 4096) caps the preview size
Also renames FileSystemSerdesConfig.mode to storageMode for clarity and
adds full unit test coverage for all preview behaviors.
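The shapes described above can be sketched as follows. This is an illustrative reconstruction from the PR description, not the PR's actual source; the exact declarations (and the `visibilityFor` helper, which only demonstrates the priority rules) are assumptions.

```typescript
// Sketch of the new preview types, as described in the PR summary.
enum PreviewMode { INCLUDE_ALL = "INCLUDE_ALL", EXCLUDE_ALL = "EXCLUDE_ALL" }
enum FieldMatchMode { ANYWHERE = "ANYWHERE", PATH = "PATH" }

interface PreviewField {
  name: string;
  match?: FieldMatchMode; // assumed to default to ANYWHERE
}

interface PreviewConfig {
  mode: PreviewMode;        // default visibility strategy
  include?: PreviewField[]; // shown when mode is EXCLUDE_ALL
  exclude?: PreviewField[]; // always wins, even over mask
  mask?: PreviewField[];    // shown as maskString unless excluded
  maskString?: string;      // replacement text for masked fields
  maxPreviewBytes?: number; // default 4096
}

type Visibility = "hidden" | "masked" | "shown";

// Hypothetical helper illustrating only the priority rules:
// exclude > mask > mode default.
function visibilityFor(path: string, cfg: PreviewConfig): Visibility {
  const matches = (fields?: PreviewField[]) =>
    (fields ?? []).some((f) =>
      (f.match ?? FieldMatchMode.ANYWHERE) === FieldMatchMode.PATH
        ? path === f.name
        : path.split(".").includes(f.name)
    );
  if (matches(cfg.exclude)) return "hidden"; // exclude always wins
  if (matches(cfg.mask)) return "masked";    // masked fields stay visible
  if (cfg.mode === PreviewMode.INCLUDE_ALL) return "shown";
  return matches(cfg.include) ? "shown" : "hidden";
}

const cfg: PreviewConfig = {
  mode: PreviewMode.INCLUDE_ALL,
  exclude: [{ name: "password" }],
  mask: [{ name: "user.ssn", match: FieldMatchMode.PATH }],
};
```

For example, with `cfg` above, `password` is hidden anywhere it occurs, `user.ssn` is masked, and everything else is shown.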
Commits (messages as displayed, some truncated):
- … keys: buildPreview now mirrors the original object structure in the preview, creating intermediate objects as needed. This is more readable and consistent with the original data shape.
- Guard against dangerous keys (__proto__, constructor, prototype) in setNestedValue to prevent prototype pollution when building preview objects from user-controlled field names.
- Check all path segments upfront before traversal, use hasOwnProperty instead of the 'in' operator, and create intermediate objects with Object.create(null) to eliminate the prototype chain entirely.
- Replace imperative traversal with reduceRight to build the nested structure from inside out, then merge via deepMerge. This eliminates the dynamic obj[userInput] assignment pattern that static analyzers flag as a prototype pollution risk, while the DANGEROUS_KEYS upfront check remains as defense in depth.
- Refactor buildPreview to collect flat (path, value) pairs first, then build the nested result using reduceRight + JSON spread: no mutation, no dynamic obj[key] assignment. Remove the deepMerge and setNestedValue helpers entirely. This eliminates all patterns that static analyzers flag as a prototype pollution risk.
- …d of O(n²)): Track the maxPreviewBytes budget incrementally, using the flat path as a key estimate instead of re-serializing the full accumulated object for each field. Build the nested result once from the accepted pairs at the end.
- Replace reduce+JSON.parse with direct O(1)-per-field traversal. Keys are safe at this point, since dangerous keys were already filtered during the collect phase.
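The collect-then-rebuild strategy these commits converge on can be sketched as below. The names `collectPairs` and `rebuild` are illustrative, not the PR's actual identifiers; like the snippets under review, this sketch treats arrays as leaves.

```typescript
const DANGEROUS_KEYS = new Set(["__proto__", "constructor", "prototype"]);

// Phase 1: walk the value and collect flat (path, value) pairs,
// filtering dangerous keys so they never reach the rebuild step.
function collectPairs(
  obj: unknown,
  prefix = "",
  out: Array<[string, unknown]> = []
): Array<[string, unknown]> {
  if (obj === null || typeof obj !== "object" || Array.isArray(obj)) {
    out.push([prefix, obj]); // arrays are treated as leaves in this sketch
    return out;
  }
  for (const key of Object.keys(obj)) {
    if (DANGEROUS_KEYS.has(key)) continue; // prototype-pollution guard
    const path = prefix ? `${prefix}.${key}` : key;
    collectPairs((obj as Record<string, unknown>)[key], path, out);
  }
  return out;
}

// Phase 2: rebuild the nested preview from the accepted pairs in one
// O(1)-per-field pass, using null-prototype intermediate objects.
function rebuild(pairs: Array<[string, unknown]>): Record<string, unknown> {
  const result: Record<string, unknown> = Object.create(null);
  for (const [path, value] of pairs) {
    const parts = path.split(".");
    let node = result;
    for (let i = 0; i < parts.length - 1; i++) {
      if (typeof node[parts[i]] !== "object" || node[parts[i]] === null) {
        node[parts[i]] = Object.create(null);
      }
      node = node[parts[i]] as Record<string, unknown>;
    }
    node[parts[parts.length - 1]] = value;
  }
  return result;
}
```

`rebuild(collectPairs(v))` round-trips a plain nested object while dropping any `__proto__`/`constructor`/`prototype` keys.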
yaythomas reviewed on May 8, 2026
This is relatively complicated.
I wonder if just providing the caller with the ability to specify their own transform means the SDK can be less opinionated about this?
```typescript
export interface FileSystemSerdesConfig {
  storageMode?: FileSystemSerdesMode;
  preview?: (value: unknown) => unknown;
  maxPreviewBytes?: number; // keep as sanity guard, default 4096
}
```

Usage:

```typescript
createFileSystemSerdes("/mnt/s3", {
  preview: (v) => ({ id: v.id, status: v.status, email: "***" }),
});

// or with a utility:
import { omit } from "lodash";
createFileSystemSerdes("/mnt/s3", {
  preview: (v) => omit(v, ["password", "ssn"]),
});
```

You could combine with a helper for masking:

```typescript
export const maskFields = (keys: string[], maskString = "***") =>
  (v: any) => keys.reduce((acc, k) => ({ ...acc, [k]: maskString }), v);
```

e.g. `maskFields(["ssn"])`.
```typescript
 * filesystem via S3 Files, enabling durable, shared state across invocations
 * and parallel function instances without checkpoint size constraints.
 *
/** @internal */
```
```typescript
if (
  obj[key] !== null &&
  typeof obj[key] === "object" &&
  !Array.isArray(obj[key])
```

yaythomas: does this `!Array.isArray` check mean `{ items: [{ secret: "xyz" }] }` would not be masked?
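A minimal repro of this concern, assuming the traversal recurses only into plain objects (the `walkPaths` name is illustrative):

```typescript
// Mirrors the guard in the snippet above: only plain objects are
// descended into, so anything inside an array is never visited.
function walkPaths(obj: Record<string, unknown>, prefix = ""): string[] {
  const paths: string[] = [];
  for (const key of Object.keys(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const val = obj[key];
    if (val !== null && typeof val === "object" && !Array.isArray(val)) {
      paths.push(...walkPaths(val as Record<string, unknown>, path));
    } else {
      paths.push(path); // an array is treated as a single leaf
    }
  }
  return paths;
}
```

Under this assumption, `{ items: [{ secret: "xyz" }] }` yields only the path `items`; `secret` is never seen, so a mask or exclude rule on it cannot apply.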
```typescript
if (obj === null || typeof obj !== "object") return;
for (const key of Object.keys(obj)) {
  if (DANGEROUS_KEYS.has(key)) continue;
  const path = pathPrefix ? `${pathPrefix}.${key}` : key;
```

yaythomas: `{ user: { email: "x" } }` and `{ "user.email": "x" }` will both end up as `user.email`? So PATH and ANYWHERE match on either:

```typescript
if (mode === FieldMatchMode.PATH) {
  return path === field.name;
}
...
return path.split(".").includes(field.name);
```

What then happens on the rebuild?
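The ambiguity the reviewer points out can be shown directly: any dot-joined flattening (sketched below as a hypothetical `flatten`) maps both shapes to the same path string, so the matcher cannot tell them apart.

```typescript
// Dot-joined flattening, as in the snippet above: a nested key and a
// literal key containing a dot collapse to the same path string.
function flatten(obj: Record<string, unknown>, prefix = ""): string[] {
  const out: string[] = [];
  for (const key of Object.keys(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    const val = obj[key];
    if (val !== null && typeof val === "object" && !Array.isArray(val)) {
      out.push(...flatten(val as Record<string, unknown>, path));
    } else {
      out.push(path);
    }
  }
  return out;
}
```

Both `flatten({ user: { email: "x" } })` and `flatten({ "user.email": "x" })` produce `["user.email"]`, which is exactly why the rebuild step cannot know which shape to restore.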
```typescript
let node = result;
for (let i = 0; i < parts.length - 1; i++) {
  if (typeof node[parts[i]] !== "object" || node[parts[i]] === null) {
    node[parts[i]] = {};
```

yaythomas: if you had `result.user = "arb"`, and then later `user.email`, this ends up as `node['user'] = {}`, and you lose `"arb"`?
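A minimal repro of the overwrite the reviewer describes, using a hypothetical `setPath` helper built from the loop above: when a scalar already occupies a slot that a later path needs as an intermediate object, the scalar is replaced.

```typescript
// Same traversal as the snippet above, wrapped for demonstration.
function setPath(
  result: Record<string, unknown>,
  path: string,
  value: unknown
): void {
  const parts = path.split(".");
  let node = result;
  for (let i = 0; i < parts.length - 1; i++) {
    if (typeof node[parts[i]] !== "object" || node[parts[i]] === null) {
      node[parts[i]] = {}; // clobbers any scalar already stored here
    }
    node = node[parts[i]] as Record<string, unknown>;
  }
  node[parts[parts.length - 1]] = value;
}

const result: Record<string, unknown> = { user: "arb" };
setPath(result, "user.email", "x");
// result is now { user: { email: "x" } } and "arb" is lost
```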