Commit 473bc22

Merge pull request #9 from glideapps/new-architecture
New architecture
2 parents de24ea2 + 5612e18 commit 473bc22

42 files changed: +4046 -1874 lines changed

ARCHITECTURE.md

Lines changed: 313 additions & 0 deletions
@@ -0,0 +1,313 @@
# JSON Schema to Zod Converter Architecture

## Overview

This document describes the modular architecture for converting JSON Schema to Zod schemas. The architecture reflects JSON Schema's compositional nature, where each property adds independent constraints that combine to form the complete validation.

## Core Concepts

### Two-Phase Processing

The converter operates in two distinct phases based on a fundamental architectural principle:

**🔑 Key Principle: Refinement handlers should only be used when Zod doesn't support the operations natively.**

1. **Primitive Phase**: Handlers that use Zod's built-in constraint methods (`.min()`, `.max()`, `.regex()`, etc.)
2. **Refinement Phase**: Handlers that add custom validation logic through Zod's `.refine()` method for operations Zod cannot express natively

### Type Schemas

During the first phase, we maintain a `TypeSchemas` object that tracks the state of each possible type:

```typescript
interface TypeSchemas {
  string?: z.ZodTypeAny | false;
  number?: z.ZodTypeAny | false; // integers are numbers with .int() constraint
  boolean?: z.ZodTypeAny | false;
  null?: z.ZodNull | false;
  array?: z.ZodArray<any> | false;
  tuple?: z.ZodTuple<any> | false;
  object?: z.ZodObject<any> | false;
}
```

Each type can be in one of three states:
- `undefined`: Type is still allowed (no constraints have excluded it)
- `false`: Type is explicitly disallowed
- `z.Zod*`: Type with accumulated constraints (including literals, enums, and unions)

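The three states can be illustrated with a small dependency-free sketch. A schema is modeled here as a plain predicate rather than a Zod type, and `Check`, `Slot`, `isAllowed`, and `constrain` are hypothetical names used only for illustration; they are not part of the converter.

```typescript
// Stand-in for a Zod schema: a predicate over unknown values (assumption for illustration).
type Check = (value: unknown) => boolean;

// One slot of TypeSchemas: undefined = still allowed, false = excluded, Check = constrained.
type Slot = Check | false | undefined;

// A type remains a candidate unless it has been explicitly excluded.
function isAllowed(slot: Slot): boolean {
  return slot !== false;
}

// Add a constraint to a slot, starting from `base` when the slot is still undefined.
function constrain(slot: Slot, base: Check, extra: Check): Slot {
  if (slot === false) return false; // excluded types stay excluded
  const current = slot ?? base;
  return (value) => current(value) && extra(value);
}

const isString: Check = (v) => typeof v === "string";
const atLeast3: Check = (v) => typeof v === "string" && v.length >= 3;

const untouched: Slot = undefined;
const constrained = constrain(untouched, isString, atLeast3); // undefined -> constrained
const excluded = constrain(false, isString, atLeast3); // false stays false
```

The point of the sketch is that `false` is absorbing: once a type is excluded, later constraints on that type are ignored.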
## Architecture Components

### 1. Primitive Handlers

These handlers operate during the first phase and modify the `TypeSchemas`:

```typescript
interface PrimitiveHandler {
  apply(types: TypeSchemas, schema: JSONSchema): void;
}
```

#### Implemented Primitive Handlers:
- **TypeHandler**: Sets types to `false` if not in the `type` array
- **ConstHandler**: Handles const values by creating literals
- **EnumHandler**: Handles enum validation with appropriate Zod types
- **ImplicitStringHandler**: Enables string type when string constraints are present without explicit type
- **MinLengthHandler**: Applies `.min()` to string (Zod native support)
- **MaxLengthHandler**: Applies `.max()` to string (Zod native support)
- **PatternHandler**: Applies `.regex()` to string (Zod native support)
- **MinimumHandler**: Applies `.min()` to number (Zod native support)
- **MaximumHandler**: Applies `.max()` to number (Zod native support)
- **ExclusiveMinimumHandler**: Applies `.gt()` to number (Zod native support)
- **ExclusiveMaximumHandler**: Applies `.lt()` to number (Zod native support)
- **MultipleOfHandler**: Applies `.multipleOf()` to number (Zod native support)
- **MinItemsHandler**: Applies `.min()` to array (Zod native support)
- **MaxItemsHandler**: Applies `.max()` to array (Zod native support)
- **ItemsHandler**: Configures array element validation if arrays still allowed
- **TupleHandler**: Detects tuple arrays and marks them as tuple type
- **PropertiesHandler**: Creates initial object schema with known properties

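As an illustration of how a handler such as TypeHandler might exclude types, here is a minimal dependency-free sketch. It is not the converter's implementation: schemas are modeled as predicates, `applyTypeKeyword` and `TYPE_SLOTS` are hypothetical names, the `tuple` slot is left untouched to mirror the worked example later in this document, and mapping `"integer"` onto the number slot is an assumption based on the `TypeSchemas` comment.

```typescript
// Stand-in for a Zod schema: a predicate over unknown values (assumption for illustration).
type Check = (value: unknown) => boolean;
type TypeName = "string" | "number" | "boolean" | "null" | "array" | "tuple" | "object";
type TypeSchemas = Partial<Record<TypeName, Check | false>>;

// Slots governed directly by the `type` keyword in this sketch; `tuple` is left alone.
const TYPE_SLOTS: TypeName[] = ["string", "number", "boolean", "null", "array", "object"];

function applyTypeKeyword(types: TypeSchemas, schema: { type?: string | string[] }): void {
  if (schema.type === undefined) return; // no `type` keyword: every type stays allowed
  const declared = new Set<string>(Array.isArray(schema.type) ? schema.type : [schema.type]);
  // Assumption: "integer" is served by the number slot.
  if (declared.has("integer")) declared.add("number");
  for (const name of TYPE_SLOTS) {
    if (!declared.has(name)) types[name] = false;
  }
}

const types: TypeSchemas = {};
applyTypeKeyword(types, { type: ["string", "number"] });
// string and number remain undefined (still allowed); the other governed slots are false.
```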
### 2. Refinement Handlers

These handlers operate during the second phase on the combined schema:

```typescript
interface RefinementHandler {
  apply(zodSchema: z.ZodTypeAny, schema: JSONSchema): z.ZodTypeAny;
}
```

#### Implemented Refinement Handlers:

**✅ Legitimate Refinement Handlers (Zod doesn't support natively):**
- **UniqueItemsHandler**: Custom validation for array uniqueness (Zod has no native unique constraint)
- **NotHandler**: Complex logical negation validation (Zod has no native `.not()`)
- **AllOfHandler**: Complex schema intersection logic
- **AnyOfHandler**: Complex anyOf validation logic
- **OneOfHandler**: Complex oneOf validation (exactly one must match)
- **EnumComplexHandler**: Complex object/array equality in enums
- **ConstComplexHandler**: Complex object/array equality for const values
- **MetadataHandler**: Description and title annotations

**⚠️ Edge Case Handlers (legitimate but specific):**
- **ProtoRequiredHandler**: Special handler for `__proto__` security protection
- **EmptyEnumHandler**: Handles empty enum arrays (always invalid)
- **EnumNullHandler**: Handles null in enum when type doesn't include null
- **PrefixItemsHandler**: Handles Draft 2020-12 prefixItems validation
- **ObjectPropertiesHandler**: Object validation for union types (primitive work moved to PropertiesHandler)

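To make the refinement idea concrete, the following sketch shows the kind of predicate a uniqueness check could pass to `.refine()`. It is an assumption about how such a check might look, not the code of UniqueItemsHandler; `jsonEqual` and `hasUniqueItems` are hypothetical helpers.

```typescript
// Structural equality in the JSON sense: key order is ignored, arrays compare element-wise.
function jsonEqual(a: unknown, b: unknown): boolean {
  if (a === b) return true;
  if (typeof a !== "object" || typeof b !== "object" || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  if (Array.isArray(a) && Array.isArray(b)) {
    return a.length === b.length && a.every((item, i) => jsonEqual(item, b[i]));
  }
  const objA = a as Record<string, unknown>;
  const objB = b as Record<string, unknown>;
  const keysA = Object.keys(objA);
  const keysB = Object.keys(objB);
  return (
    keysA.length === keysB.length &&
    keysA.every(
      (key) => Object.prototype.hasOwnProperty.call(objB, key) && jsonEqual(objA[key], objB[key])
    )
  );
}

// Predicate suitable for `.refine()`: non-arrays pass, so the check is a no-op for other types.
function hasUniqueItems(value: unknown): boolean {
  if (!Array.isArray(value)) return true;
  return value.every((item, i) =>
    value.slice(0, i).every((earlier) => !jsonEqual(item, earlier))
  );
}
```

Letting non-arrays pass is what allows a refinement like this to be attached to a union of several types without rejecting the non-array members.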
## Processing Flow

### Phase 1: Type-Specific Constraints

1. Initialize empty `TypeSchemas` object
2. Run all primitive handlers in sequence
3. Each handler:
   - Checks if it has relevant constraints in the schema
   - For each type it affects, checks if that type is still allowed (`!== false`)
   - If allowed and constraint applies, either:
     - Creates initial type schema if `undefined`
     - Adds constraints to existing type schema

### Phase 2: Build Union and Apply Refinements

1. Convert remaining `undefined` types to their most permissive schemas:
   - `string` → `z.string()`
   - `number` → `z.number()`
   - `array` → `z.array(z.any())`
   - `tuple` → handled by TupleItemsHandler
   - `object` → `z.object({}).passthrough()`
   - etc.

2. Filter out `false` types and create union of allowed types:
   - 0 types → `z.never()`
   - 1 type → that type's schema
   - 2+ types → `z.union([...])`

3. Run all refinement handlers on the resulting schema

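The two phases above can be sketched end to end without any dependency. This is a simplified model under stated assumptions, not the converter itself: schemas are predicates instead of Zod types, handlers are plain functions instead of classes, the `tuple` slot is omitted, and `convert`, `DEFAULTS`, and `minLength` are hypothetical names.

```typescript
// Stand-in for a Zod schema: a predicate over unknown values (assumption for illustration).
type Check = (value: unknown) => boolean;
type TypeName = "string" | "number" | "boolean" | "null" | "array" | "object";
type TypeSchemas = Partial<Record<TypeName, Check | false>>;
type Schema = Record<string, unknown>;
type PrimitiveHandler = (types: TypeSchemas, schema: Schema) => void;
type RefinementHandler = (check: Check, schema: Schema) => Check;

// Most permissive check per type, used for slots that are still `undefined`.
const DEFAULTS: Record<TypeName, Check> = {
  string: (v) => typeof v === "string",
  number: (v) => typeof v === "number",
  boolean: (v) => typeof v === "boolean",
  null: (v) => v === null,
  array: (v) => Array.isArray(v),
  object: (v) => typeof v === "object" && v !== null && !Array.isArray(v),
};

function convert(
  schema: Schema,
  primitives: PrimitiveHandler[],
  refinements: RefinementHandler[]
): Check {
  // Phase 1: every primitive handler mutates the shared TypeSchemas.
  const types: TypeSchemas = {};
  for (const handler of primitives) handler(types, schema);

  // Phase 2, step 1 and 2: drop excluded types, default the untouched ones.
  const members: Check[] = [];
  for (const name of Object.keys(DEFAULTS) as TypeName[]) {
    const slot = types[name];
    if (slot !== false) members.push(slot ?? DEFAULTS[name]);
  }

  // Union of the remaining members; with zero members nothing matches (like z.never()).
  let combined: Check = (value) => members.some((member) => member(value));

  // Phase 2, step 3: refinements wrap the combined check in order.
  for (const refine of refinements) combined = refine(combined, schema);
  return combined;
}

// Example primitive handler: minLength constrains only the string slot.
const minLength: PrimitiveHandler = (types, schema) => {
  const min = schema.minLength;
  if (typeof min !== "number" || types.string === false) return;
  const current = types.string ?? DEFAULTS.string;
  types.string = (v) => current(v) && (v as string).length >= min;
};

const check = convert({ minLength: 3 }, [minLength], []);
```

In this model `check` rejects `"ab"` but still accepts numbers, because `minLength` only constrains the string member of the union.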
## Example: Processing a Complex Schema

Given this JSON Schema:
```json
{
  "type": ["string", "number"],
  "minimum": 5,
  "minLength": 3,
  "pattern": "^[A-Z]",
  "uniqueItems": true
}
```

**Phase 1 (Primitive Handlers):**
1. TypeHandler: marks `boolean`, `null`, `array`, `object` as `false`
2. MinimumHandler: sets `number` to `z.number().min(5)`
3. MinLengthHandler: sets `string` to `z.string().min(3)`
4. PatternHandler: updates `string` to `z.string().min(3).regex(/^[A-Z]/)`

**Result after Phase 1:**
```typescript
{
  string: z.string().min(3).regex(/^[A-Z]/),
  number: z.number().min(5),
  boolean: false,
  null: false,
  array: false,
  tuple: undefined,
  object: false
}
```

**Phase 2:**
1. Create union: `z.union([z.string().min(3).regex(/^[A-Z]/), z.number().min(5)])`
2. UniqueItemsHandler: adds a refinement that only checks arrays; since no array type is allowed here, it never rejects a value

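To see which values the resulting union accepts, the sketch below mirrors its two branches with plain predicates so that it runs without Zod. `stringBranch`, `numberBranch`, and `accepts` are hypothetical names; the predicates are a stand-in for the Zod schemas, not their implementation.

```typescript
// Mirrors z.string().min(3).regex(/^[A-Z]/)
const stringBranch = (v: unknown): boolean =>
  typeof v === "string" && v.length >= 3 && /^[A-Z]/.test(v);

// Mirrors z.number().min(5)
const numberBranch = (v: unknown): boolean => typeof v === "number" && v >= 5;

// Mirrors the union: a value is valid if either branch accepts it.
const accepts = (v: unknown): boolean => stringBranch(v) || numberBranch(v);

const samples: unknown[] = ["Abc", "abc", "Ab", 5, 4, true, ["A", "A"]];
const verdicts = samples.map((sample) => accepts(sample));
// verdicts: [true, false, false, true, false, false, false]
```

The array sample is rejected by the union itself, which is why the `uniqueItems` refinement never gets a chance to matter for this schema.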
## Implementation Status

### Test Results
- **Total tests**: 1355 (999 active, 356 skipped)
- **Passing**: 999 tests
- **Failing**: 0 tests
- **Skipped**: 356 tests (JSON Schema features not supported by Zod)

### Known Limitations
1. **`__proto__` property validation**: Zod's `passthrough()` strips this property for security. Solved with ProtoRequiredHandler using `z.any()` when `__proto__` is required.
2. **Unicode grapheme counting**: JavaScript string length counts UTF-16 code units rather than grapheme clusters, so length checks can disagree for strings outside the Basic Multilingual Plane. The affected test was added to the skip list as a platform limitation.
3. **Complex schema combinations**: Some edge cases with deeply nested `allOf`, `anyOf`, `oneOf` combinations may not perfectly match JSON Schema semantics.

## Benefits

1. **Modularity**: Each JSON Schema keyword is handled by a dedicated handler
2. **Composability**: Handlers don't need to know about each other
3. **Type Safety**: Type-specific constraints are only applied to appropriate types
4. **Extensibility**: New keywords can be supported by adding new handlers
5. **Maintainability**: Clear separation between constraint types
6. **Correctness**: Reflects JSON Schema's additive constraint model
7. **Testability**: Each handler can be tested independently
8. **Performance**: Native Zod operations are faster than custom refinements
9. **Better Type Inference**: Primitive handlers create proper Zod types with built-in validation
10. **Architectural Clarity**: Clear distinction between schema construction vs. custom validation

## Implementation Guidelines

### Adding a New Primitive Handler

**Use primitive handlers when Zod has native support for the constraint (e.g., `.min()`, `.max()`, `.regex()`).**

1. Determine which type(s) the constraint affects
2. Create handler that checks if those types are still allowed
3. Apply constraints using Zod's built-in methods (prefer native over custom logic)
4. Add type guards when working with `z.ZodTypeAny` to ensure type safety
5. Consider if the constraint should enable a type implicitly (like `ImplicitStringHandler`)

Example:
```typescript
export class MyConstraintHandler implements PrimitiveHandler {
  apply(types: TypeSchemas, schema: JSONSchema.BaseSchema): void {
    // `MySchema`, `myConstraint`, and `myMethod` are placeholders for the keyword being added
    const mySchema = schema as JSONSchema.MySchema;
    if (mySchema.myConstraint === undefined) return;

    // Skip entirely if strings have already been excluded
    if (types.string !== false) {
      const currentString = types.string || z.string();
      // Type guard: only ZodString exposes the native string methods
      if (currentString instanceof z.ZodString) {
        types.string = currentString.myMethod(mySchema.myConstraint);
      }
    }
  }
}
```

### Adding a New Refinement Handler

**Only use refinement handlers when Zod doesn't support the operation natively.**

1. Use for constraints that:
   - **Cannot be expressed with Zod's built-in constraints** (e.g., uniqueItems, complex object equality)
   - Apply complex logical operations (e.g., not, anyOf, oneOf)
   - Require custom validation across multiple types
   - Handle edge cases or security concerns
2. Handler receives the complete schema after type union
3. Return schema with added `.refine()` validation
4. **Avoid using refinements for operations Zod supports natively** (e.g., string length, number ranges)

Example:
```typescript
export class MyRefinementHandler implements RefinementHandler {
  apply(zodSchema: z.ZodTypeAny, schema: JSONSchema.BaseSchema): z.ZodTypeAny {
    // `MySchema`, `myConstraint`, and `validateMyConstraint` are placeholders
    const mySchema = schema as JSONSchema.MySchema;
    // Compare against undefined so falsy constraint values (0, false, "") are not skipped
    if (mySchema.myConstraint === undefined) return zodSchema;

    return zodSchema.refine(
      (value: unknown) => {
        // Custom validation logic
        return validateMyConstraint(value, mySchema.myConstraint);
      },
      { message: "Value does not satisfy myConstraint" }
    );
  }
}
```

### Handler Order

- **Primitive handlers**: Order matters for some handlers:
  - ConstHandler and EnumHandler should run before TypeHandler
  - ImplicitStringHandler should run before other string handlers
  - TupleHandler should run before ItemsHandler
  - Others can run in any order (they're independent)

- **Refinement handlers**: Should be ordered by complexity/dependencies:
  - Special cases first (ProtoRequiredHandler, EmptyEnumHandler)
  - Logical combinations (AllOf, AnyOf, OneOf)
  - Type-specific refinements (TupleItems, ArrayItems, ObjectProperties)
  - General refinements (Not, UniqueItems)
  - Metadata handlers last

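One way to keep the primitive ordering rules from regressing is to encode them as data and check a proposed handler order against them. The sketch below is only an illustration of that idea; `PRIMITIVE_ORDER_RULES` and `findOrderViolations` are hypothetical and do not exist in the converter, and treating MinLengthHandler, MaxLengthHandler, and PatternHandler as the "other string handlers" is an assumption.

```typescript
// Each pair [earlier, later] encodes "earlier must run before later".
const PRIMITIVE_ORDER_RULES: Array<[string, string]> = [
  ["ConstHandler", "TypeHandler"],
  ["EnumHandler", "TypeHandler"],
  ["ImplicitStringHandler", "MinLengthHandler"],
  ["ImplicitStringHandler", "MaxLengthHandler"],
  ["ImplicitStringHandler", "PatternHandler"],
  ["TupleHandler", "ItemsHandler"],
];

// Returns the violated rules for a proposed order; an empty array means the order is valid.
function findOrderViolations(
  order: string[],
  rules: Array<[string, string]>
): Array<[string, string]> {
  const position = new Map<string, number>();
  order.forEach((name, index) => position.set(name, index));
  return rules.filter(([earlier, later]) => {
    const a = position.get(earlier);
    const b = position.get(later);
    // Rules that mention an unregistered handler are ignored.
    return a !== undefined && b !== undefined && a > b;
  });
}

const violations = findOrderViolations(["TypeHandler", "ConstHandler"], PRIMITIVE_ORDER_RULES);
// violations contains the single rule ["ConstHandler", "TypeHandler"]
```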
## Future Enhancements

1. **Additional JSON Schema Keywords**: Support for more keywords like `dependencies`, `if/then/else`, `contentMediaType`, etc.
2. **Performance Optimization**: Cache converted schemas for repeated conversions
3. **Better Error Messages**: Provide more descriptive validation error messages
4. **Schema Version Support**: Handle different JSON Schema draft versions
5. **Bidirectional Conversion**: Improve Zod to JSON Schema conversion fidelity

## Architectural Evolution

### Key Insight: Native vs. Custom Validation

During development, we discovered that **many operations initially implemented as refinement handlers should actually be primitive handlers** because Zod supports them natively. This led to a major architectural insight:

**❌ Anti-pattern: Using refinements for Zod-native operations**
```typescript
// WRONG: Using refinement for string length (Zod supports .min() natively)
return zodSchema.refine(
  (value: any) => typeof value !== "string" || value.length >= minLength,
  { message: "String too short" }
);
```

**✅ Correct pattern: Using primitive handlers for Zod-native operations**
```typescript
// CORRECT: Using Zod's native .min() method
if (types.string !== false) {
  const current = types.string || z.string();
  // Guard needed because the slot is typed as z.ZodTypeAny, which has no .min()
  if (current instanceof z.ZodString) {
    types.string = current.min(minLength);
  }
}
```

### Migration Examples

1. **String Constraints**: Moved from `StringConstraintsHandler` (refinement) to `ImplicitStringHandler` + existing primitive handlers
2. **Array Items**: Moved from `ArrayItemsHandler` (refinement) to enhanced `ItemsHandler` (primitive) + `PrefixItemsHandler` (refinement for edge cases)
3. **Tuple Handling**: Moved from `TupleItemsHandler` (refinement) to `TupleHandler` (primitive)

### Benefits of This Evolution

- **Performance**: Native Zod methods are faster than custom refinements
- **Type Safety**: Better TypeScript inference with proper Zod types
- **Maintainability**: Less custom validation code to maintain
- **Coverage**: Eliminated unreachable code paths in refinement handlers

## Conclusion

The modular two-phase architecture successfully addresses the need for a clean, extensible design where each JSON Schema property is handled by independent modules. The key insight about **preferring native Zod operations over custom refinements** has significantly improved the architecture's performance, maintainability, and correctness.

This approach makes the codebase more maintainable, testable, and easier to extend with new JSON Schema features while leveraging Zod's full capabilities.

CLAUDE.md

Lines changed: 2 additions & 1 deletion
@@ -25,7 +25,8 @@ This file provides guidance to Claude Code (claude.ai/code) when working with co
 ## Architecture
 - Maintain dual module support (CJS/ESM) for all exports
 - Keep conversion logic modular with single-responsibility functions
-- Write tests for each feature and edge cases, achieving 100% coverage
+- Write tests for each feature and edge cases, achieving 100% line and branch coverage
+- Never go back on supported features! If something works, it has to keep working.
 - Target ES2018 for maximum compatibility (Node 10+)
 - Use esbuild for bundling with optimized output
 - Follow semantic versioning for releases
