This comprehensive guide covers Combi-Parse's advanced regex functionality and type safety features, including the revolutionary type-level regex engine and various regex parsing functions.
Combi-Parse provides several regex-related functions for different use cases:
regex()- Basic runtime regex parsingtypedRegex()- Type-safe regex with basic type inferenceadvancedTypedRegex()- Full type-level regex analysis (performance-sensitive)literalRegex()- Regex with literal type validation- Type-level regex engine - Compile-time regex pattern analysis
The fundamental regex parser that uses JavaScript RegExp for pattern matching:
import { regex } from 'combi-parse';
// Basic usage
const identifier = regex(/[a-zA-Z_][a-zA-Z0-9_]*/);
identifier.parse("myVariable"); // → "myVariable"
const number = regex(/[0-9]+(\.[0-9]+)?/);
number.parse("42.5"); // → "42.5"
// Email pattern
const email = regex(/[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}/);
email.parse("user@example.com"); // → "user@example.com"When to use regex():
- Standard regex parsing needs
- Complex patterns that don't need compile-time analysis
- Performance-critical code where type safety is less important
- Working with existing RegExp objects
A type-safe regex parser that validates patterns at compile time while providing basic type safety:
import { typedRegex } from 'combi-parse';
// Type-safe pattern compilation
const hello = typedRegex('hello');
hello.parse('hello'); // Type: string (matches 'hello')
// Character classes with type safety
const digit = typedRegex('[0-9]');
digit.parse('5'); // Type: string (but validated at compile time)
// Alternation patterns
const greeting = typedRegex('hi|hello|hey');
greeting.parse('hi'); // Type: string (matches 'hi'|'hello'|'hey')
// Optional patterns
const color = typedRegex('colou?r');
color.parse('color'); // Type: string (matches 'color'|'colour')Advantages:
- Compile-time pattern validation
- Better error messages for invalid patterns
- Type safety without performance overhead
- Compatible with all TypeScript environments
When to use typedRegex():
- When you want compile-time pattern validation
- For patterns that should be type-safe but performance is critical
- In environments where advanced type analysis might be too slow
The most advanced regex parser that provides exact union types of all possible matching strings:
import { advancedTypedRegex } from 'combi-parse';
// ✅ Simple patterns work well
const hello = advancedTypedRegex('hello');
hello.parse('hello'); // Type: 'hello'
const animal = advancedTypedRegex('cat|dog');
animal.parse('cat'); // Type: 'cat' | 'dog'
const digit = advancedTypedRegex('[0-9]');
digit.parse('5'); // Type: '0'|'1'|'2'|'3'|'4'|'5'|'6'|'7'|'8'|'9'
const optional = advancedTypedRegex('colou?r');
optional.parse('color'); // Type: 'color' | 'colour'
// Escape sequences
const wordChar = advancedTypedRegex('\\w');
wordChar.parse('a'); // Type: 'a'|'b'|...|'z'|'A'|...|'Z'|'0'|...|'9'|'_'
// Infinite patterns use template literals
const oneOrMore = advancedTypedRegex('a+');
oneOrMore.parse('aaa'); // Type: `a${string}`
const zeroOrMore = advancedTypedRegex('a*b');
zeroOrMore.parse('aaab'); // Type: 'b' | `a${string}`Complex patterns can slow down or crash the TypeScript server:
// ⚠️ These patterns may cause TypeScript server issues
const complex = advancedTypedRegex('(hello|hi)+ (world|earth)*');
const identifier = advancedTypedRegex('[a-zA-Z_][a-zA-Z0-9_]*');
const email = advancedTypedRegex('[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,}');
// ✅ Use simple version for complex patterns
const complexSafe = typedRegex('(hello|hi)+ (world|earth)*');
const identifierSafe = typedRegex('[a-zA-Z_][a-zA-Z0-9_]*');Safe patterns for advancedTypedRegex():
- Simple literals:
'hello','abc' - Simple alternations:
'cat|dog','red|green|blue' - Single character classes:
'[0-9]','[abc]' - Simple quantifiers:
'a?','b+'(on single characters)
When to use advancedTypedRegex():
- When you need precise union types
- For simple, finite patterns
- In development environments where TypeScript performance is acceptable
- For domain-specific languages with limited vocabularies
Combines regex efficiency with literal type validation:
import { literalRegex } from 'combi-parse';
// Boolean parsing with validation
const boolean = literalRegex(/true|false/, ['true', 'false'] as const);
boolean.parse('true'); // Type: 'true' | 'false'
// HTTP methods
const httpMethod = literalRegex(
/GET|POST|PUT|DELETE|PATCH/,
['GET', 'POST', 'PUT', 'DELETE', 'PATCH'] as const
);
httpMethod.parse('GET'); // Type: 'GET'|'POST'|'PUT'|'DELETE'|'PATCH'
// CSS colors
const cssColor = literalRegex(
/red|blue|green|yellow|black|white/,
['red', 'blue', 'green', 'yellow', 'black', 'white'] as const
);
// Programming language operators
const operator = literalRegex(
/\+\+|--|[+\-*\/=<>!]=?|&&|\|\|/,
['++', '--', '+', '-', '*', '/', '=', '==', '!=', '<', '>', '<=', '>=', '&&', '||'] as const
);Advantages:
- Regex performance for initial matching
- Type safety through literal validation
- Runtime validation that the match is in the expected set
- Works with complex regex patterns
When to use literalRegex():
- When you have a finite set of valid values
- For parsing enums or keyword sets
- When regex performance is important but type safety is required
- For domain-specific languages with known vocabularies
Combi-Parse includes a complete regex engine that operates entirely at the TypeScript type level:
The type-level regex engine supports:
- Literals:
'abc','hello' - Character Classes:
'[abc]','[0-9]','[a-z]' - Escape Sequences:
'\\d'(digits),'\\w'(word chars),'\\s'(whitespace) - Quantifiers:
'*'(zero or more),'+'(one or more),'?'(optional) - Alternation:
'cat|dog' - Grouping:
'(ab)+c' - Wildcards:
'.'(any printable character)
import type { CompileRegex, Enumerate } from 'combi-parse';
// Compile patterns to NFAs (Non-deterministic Finite Automata)
type HelloRegex = CompileRegex<'hello'>;
type DigitRegex = CompileRegex<'[0-9]'>;
type ChoiceRegex = CompileRegex<'cat|dog'>;
// Enumerate all possible matching strings
type HelloStrings = Enumerate<HelloRegex>; // 'hello'
type DigitStrings = Enumerate<DigitRegex>; // '0'|'1'|'2'|...|'9'
type ChoiceStrings = Enumerate<ChoiceRegex>; // 'cat'|'dog'
// Complex patterns
type OptionalRegex = CompileRegex<'ab?c'>;
type OptionalStrings = Enumerate<OptionalRegex>; // 'ac'|'abc'
type RepeatingRegex = CompileRegex<'a+'>;
type RepeatingStrings = Enumerate<RepeatingRegex>; // `a${string}`
// Character classes with ranges
type HexRegex = CompileRegex<'[0-9a-fA-F]'>;
type HexStrings = Enumerate<HexRegex>; // '0'|'1'|...|'9'|'a'|...|'f'|'A'|...|'F'
// Grouping with quantifiers
type GroupRegex = CompileRegex<'(ab)+c'>;
type GroupStrings = Enumerate<GroupRegex>; // `ab${string}c`Use the type system to validate patterns at compile time:
// Valid patterns compile successfully
type ValidPattern = CompileRegex<'hello|world'>; // ✅
// Invalid patterns cause compile errors
type InvalidPattern = CompileRegex<'[unclosed'>; // ❌ Compile error
// Use in conditional types
type IsValidPattern<P extends string> =
CompileRegex<P> extends never ? false : true;
type Test1 = IsValidPattern<'hello'>; // true
type Test2 = IsValidPattern<'[invalid'>; // false// Define language tokens with type safety
const keywords = advancedTypedRegex('if|else|while|for|return|function');
// Type: Parser<'if'|'else'|'while'|'for'|'return'|'function'>
const operators = literalRegex(
/\+\+|--|[+\-*\/=<>!]=?|&&|\|\|/,
['++', '--', '+', '-', '*', '/', '=', '==', '!=', '<', '>', '<=', '>=', '&&', '||'] as const
);
// Type: Parser<'++' | '--' | '+' | '-' | ...>
const booleanLiteral = advancedTypedRegex('true|false');
// Type: Parser<'true'|'false'>
// Combine in a language parser
const token = choice([
keywords.map(kw => ({ type: 'keyword', value: kw })),
operators.map(op => ({ type: 'operator', value: op })),
booleanLiteral.map(bool => ({ type: 'boolean', value: bool === 'true' })),
regex(/[a-zA-Z_][a-zA-Z0-9_]*/).map(id => ({ type: 'identifier', value: id }))
]);// Type-safe configuration parsing
const logLevel = advancedTypedRegex('debug|info|warn|error');
const environment = advancedTypedRegex('development|staging|production');
const protocol = advancedTypedRegex('http|https');
const configParser = genParser(function* () {
yield str('log_level=');
const logLevel = yield logLevel;
yield str('\n');
yield str('environment=');
const env = yield environment;
yield str('\n');
yield str('protocol=');
const protocol = yield protocol;
return {
logLevel, // Type: 'debug'|'info'|'warn'|'error'
environment: env, // Type: 'development'|'staging'|'production'
protocol // Type: 'http'|'https'
};
});// Type-safe HTTP response parsing
const httpStatus = advancedTypedRegex('200|201|400|401|403|404|500|502|503');
const contentType = literalRegex(
/application\/json|text\/html|text\/plain|application\/xml/,
['application/json', 'text/html', 'text/plain', 'application/xml'] as const
);
const httpResponse = genParser(function* () {
yield str('HTTP/1.1 ');
const status = yield httpStatus;
yield str('\r\n');
yield str('Content-Type: ');
const contentType = yield contentType;
yield str('\r\n\r\n');
const body = yield regex(/.*/);
return {
status, // Type: '200'|'201'|'400'|'401'|'403'|'404'|'500'|'502'|'503'
contentType, // Type: 'application/json'|'text/html'|'text/plain'|'application/xml'
body
};
});-
Use
regex()when:- Working with complex patterns
- Performance is critical
- Type safety is not essential
- Pattern is dynamic or not known at compile time
-
Use
typedRegex()when:- You want compile-time pattern validation
- Performance is important
- Basic type safety is sufficient
- Working in constrained TypeScript environments
-
Use
advancedTypedRegex()when:- You need precise union types
- Pattern is simple and finite
- Development environment performance is acceptable
- Type safety is critical
-
Use
literalRegex()when:- You have a finite set of valid results
- Need regex performance with type safety
- Parsing enums or fixed vocabularies
- Want runtime validation of expected values
Simple patterns (safe for advancedTypedRegex()):
// ✅ These compile quickly
'hello'
'cat|dog|bird'
'[0-9]'
'[abc]'
'a?'
'hello|world'Medium patterns (use typedRegex()):
// ⚠️ These may slow compilation
'[a-zA-Z]+'
'\\d{2,4}'
'(hello|hi) (world|earth)'
'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+'Complex patterns (use regex()):
// ❌ These can crash TypeScript server with advancedTypedRegex
'([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\\.[a-zA-Z]{2,})'
'\\b(?:[0-9]{1,3}\\.){3}[0-9]{1,3}\\b'
'(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'// Create a type that validates regex patterns
type ValidatedRegex<P extends string> =
CompileRegex<P> extends never
? { error: 'Invalid regex pattern' }
: { pattern: P; compiled: CompileRegex<P> };
// Usage
type Valid = ValidatedRegex<'hello|world'>; // ✅ { pattern: ..., compiled: ... }
type Invalid = ValidatedRegex<'[unclosed'>; // ❌ { error: 'Invalid regex pattern' }// Create types that only accept valid regex strings
type RegexPattern = string & { __brand: 'regex' };
const createTypedRegex = <P extends string>(
pattern: P extends ValidatedRegex<P>['pattern'] ? P : never
): Parser<Enumerate<CompileRegex<P>>> => {
return advancedTypedRegex(pattern);
};
// Usage - only valid patterns are accepted
const validParser = createTypedRegex('hello|world'); // ✅
// const invalidParser = createTypedRegex('[unclosed'); // ❌ Compile error// Compose patterns at the type level
type Digit = '[0-9]';
type Letter = '[a-zA-Z]';
type Identifier = `${Letter}(${Letter}|${Digit})*`;
// Use composed patterns
const identifierParser = typedRegex('' as Identifier);// The type system provides helpful error messages
const invalidParser = advancedTypedRegex('[');
// Error: Type '[' is not a valid regex pattern
// Use type assertions for debugging
type DebugPattern = CompileRegex<'[0-9]+'>;
// Hover in IDE to see the compiled NFA structureconst safeRegexParser = <P extends string>(pattern: P) => {
try {
return typedRegex(pattern);
} catch (error) {
console.error(`Invalid regex pattern: ${pattern}`, error);
return fail(`Invalid regex: ${pattern}`);
}
};// Use helper types to debug complex patterns
type DebugEnumeration<P extends string> = {
pattern: P;
compiled: CompileRegex<P>;
enumerated: Enumerate<CompileRegex<P>>;
};
// Example
type Debug = DebugEnumeration<'[0-9]+'>;
// Hover in IDE to see: { pattern: '[0-9]+', compiled: ..., enumerated: `0${string}` | `1${string}` | ... }This comprehensive regex and type safety system makes Combi-Parse uniquely powerful for building type-safe parsers while providing flexibility for different performance and complexity requirements.