Commit 9031a90

Update for migrating compiler info to the notes repo
1 parent 8c51c74 commit 9031a90

8 files changed: 19 additions and 543 deletions

Architectural-Overview.md

Lines changed: 5 additions & 140 deletions
Original file line number | Diff line number | Diff line change
@@ -1,150 +1,15 @@
11
## Layer Overview
2-
3-
![Architectural overview.](https://raw.githubusercontent.com/wiki/Microsoft/TypeScript/images/architecture.png)
4-
5-
* **Core TypeScript Compiler**
6-
* **Parser:** Starting from a set of sources, and following the productions of the language grammar, to generate an Abstract Syntax Tree (AST)
7-
8-
* **Binder:** Linking declarations contributing to the same structure using a Symbol (e.g. different declarations of the same interface or module, or a function and a module with the same name). This allows the type system to reason about these named declarations.
9-
10-
* **Type resolver/ Checker:** Resolving types of each construct, checking semantic operations and generate diagnostics as appropriate.
11-
12-
* **Emitter:** Output generated from a set of inputs (.ts and .d.ts) files can be one of: JavaScript (.js), definitions (.d.ts), or source maps (.js.map)
13-
14-
* **Pre-processor:** The "Compilation Context" refers to all files involved in a "program". The context is created by inspecting all files passed in to the compiler on the command line, in order, and then adding any files they may reference directly or indirectly through `import` statements and `/// <reference path=... />` tags.
15-
The result of walking the reference graph is an ordered list of source files, that constitute the program.
16-
When resolving imports, preference is given to ".ts" files over ".d.ts" files to ensure the most up-to-date files are processed.
17-
The compiler does a node-like process to resolve imports by walking up the directory chain to find a source file with a .ts or .d.ts extension matching the requested import.
18-
Failed import resolution does not result in an error, as an ambient module could be already declared.
19-
20-
* **Standalone compiler (tsc):** The batch compilation CLI. Mainly handle reading and writing files for different supported engines (e.g. Node.js)
21-
22-
* **Language Service:** The "Language Service" exposes an additional layer around the core compiler pipeline that are best suiting editor-like applications.
23-
The language service supports the common set of a typical editor operations like statement completions, signature help, code formatting and outlining, colorization, etc... Basic re-factoring like rename, Debugging interface helpers like validating breakpoints as well as TypeScript-specific features like support of incremental compilation (--watch equivalent on the command-line). The language service is designed to efficiently handle scenarios with files changing over time within a long-lived compilation context; in that sense, the language service provides a slightly different perspective about working with programs and source files from that of the other compiler interfaces.
24-
> Please refer to the [[Using the Language Service API]] page for more details.
25-
26-
* **Standalone Server (tsserver):** The `tsserver` wraps the compiler and services layer, and exposes them through a JSON protocol.
27-
> Please refer to the [[Standalone Server (tsserver)]] for more details.
28-
292
## Data Structures
303

31-
* **Node:** The basic building block of the Abstract Syntax Tree (AST). In general node represent non-terminals in the language grammar; some terminals are kept in the tree such as identifiers and literals.
32-
33-
* **SourceFile:** The AST of a given source file. A SourceFile is itself a Node; it provides an additional set of interfaces to access the raw text of the file, references in the file, the list of identifiers in the file, and mapping from a position in the file to a line and character numbers.
34-
35-
* **Program:** A collection of SourceFiles and a set of compilation options that represent a compilation unit. The program is the main entry point to the type system and code generation.
36-
37-
* **Symbol:** A named declaration. Symbols are created as a result of binding. Symbols connect declaration nodes in the tree to other declarations contributing to the same entity. Symbols are the basic building block of the semantic system.
38-
39-
* **Type:** Types are the other part of the semantic system. Types can be named (e.g. classes and interfaces), or anonymous (e.g. object types).
40-
41-
* **Signature:** There are three types of signatures in the language: call, construct and index signatures.
4+
Moved to [the glossary](https://github.com/microsoft/TypeScript-Compiler-Notes/blob/main/GLOSSARY.md) of the complier-notes repo.
425

436
## Overview of the compilation process
447

45-
The process starts with preprocessing.
46-
The preprocessor figures out what files should be included in the compilation by following references (`/// <reference path=... />` tags and `import` statements).
47-
48-
The parser then generates AST `Node`s.
49-
These are just an abstract representation of the user input in a tree format.
50-
A `SourceFile` object represents an AST for a given file with some additional information like the file name and source text.
51-
52-
The binder then passes over the AST nodes and generates and binds `Symbol`s.
53-
One `Symbol` is created for each named entity.
54-
There is a subtle distinction but several declaration nodes can name the same entity.
55-
That means that sometimes different `Node`s will have the same `Symbol`, and each `Symbol` keeps track of its declaration `Node`s.
56-
For example, a `class` and a `namespace` with the same name can *merge* and will have the same `Symbol`.
57-
The binder also handles scopes and makes sure that each `Symbol` is created in the correct enclosing scope.
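The binding step described in the removed paragraph above can be sketched with a toy model. This is not the compiler's binder: `ToyDeclaration`, `ToySymbol`, and `bindDeclarations` are hypothetical names invented for illustration, and the sketch ignores scopes and the rules for which declarations are allowed to merge. It only shows the idea that several declarations naming the same entity end up attached to one symbol.

```typescript
// Toy model only: these are NOT the compiler's Node or Symbol types.
interface ToyDeclaration {
  kind: "class" | "namespace" | "function" | "interface";
  name: string;
}

interface ToySymbol {
  name: string;
  declarations: ToyDeclaration[]; // every declaration that names this entity
}

// Create one symbol per named entity. A later declaration with the same
// name is attached to the existing symbol instead of creating a new one.
function bindDeclarations(declarations: ToyDeclaration[]): Map<string, ToySymbol> {
  const table = new Map<string, ToySymbol>();
  for (const declaration of declarations) {
    let symbol = table.get(declaration.name);
    if (symbol === undefined) {
      symbol = { name: declaration.name, declarations: [] };
      table.set(declaration.name, symbol);
    }
    symbol.declarations.push(declaration);
  }
  return table;
}

// A class and a namespace named `Shape` share a single symbol.
const symbols = bindDeclarations([
  { kind: "class", name: "Shape" },
  { kind: "namespace", name: "Shape" },
  { kind: "function", name: "area" },
]);
```

In this sketch the `Shape` symbol records both of its declarations, which mirrors the statement that each `Symbol` keeps track of its declaration `Node`s.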
58-
59-
Generating a `SourceFile` (along with its `Symbol`s) is done through calling the `createSourceFile` API.
60-
61-
So far, `Symbol`s represent named entities as seen within a single file, but several declarations can merge multiple files, so the next step is to build a global view of all files in the compilation by building a `Program`.
62-
63-
A `Program` is a collection of `SourceFile`s and a set of `CompilerOptions`.
64-
A `Program` is created by calling the `createProgram` API.
65-
66-
From a `Program` instance a `TypeChecker` can be created.
67-
`TypeChecker` is the core of the TypeScript type system.
68-
It is the part responsible for figuring out relationships between `Symbols` from different files, assigning `Type`s to `Symbol`s, and generating any semantic `Diagnostic`s (i.e. errors).
8+
Moved to the root readme of the [complier-notes repo](https://github.com/microsoft/TypeScript-Compiler-Notes).
699

70-
The first thing a `TypeChecker` will do is to consolidate all the `Symbol`s from different `SourceFile`s into a single view, and build a single Symbol Table by "merging" any common `Symbol`s (e.g. `namespace`s spanning multiple files).
71-
72-
After initializing the original state, the `TypeChecker` is ready to answer any questions about the program.
73-
Such "questions" might be:
74-
* What is the `Symbol` for this `Node`?
75-
* What is the `Type` of this `Symbol`?
76-
* What `Symbol`s are visible in this portion of the AST?
77-
* What are the available `Signature`s for a function declaration?
78-
* What errors should be reported for a file?
79-
80-
The `TypeChecker` computes everything lazily; it only "resolves" the necessary information to answer a question.
81-
The checker will only examine `Node`s/`Symbol`s/`Type`s that contribute to the question at hand and will not attempt to examine additional entities.
82-
83-
An `Emitter` can also be created from a given `Program`.
84-
The `Emitter` is responsible for generating the desired output for a given `SourceFile`; this includes `.js`, `.jsx`, `.d.ts`, and `.js.map` outputs.
85-
8610
## Terminology
8711

88-
##### **Full Start/Token Start**
89-
90-
Tokens themselves have what we call a "full start" and a "token start". The "token start" is the more natural version, which is the position in the file where the text of a token begins. The "full start" is the point at which the scanner began scanning since the last significant token. When concerned with trivia, we are often more concerned with the full start.
91-
92-
Function | Description
93-
---------|------------
94-
`ts.Node.getStart` | Gets the position in text where the first token of a node started.
95-
`ts.Node.getFullStart` | Gets the position of the "full start" of the first token owned by the node.
96-
97-
#### **Trivia**
98-
99-
Syntax trivia represent the parts of the source text that are largely insignificant for normal understanding of the code, such as whitespace, comments, and even conflict markers.
100-
101-
Because trivia are not part of the normal language syntax (barring ECMAScript ASI rules) and can appear anywhere between any two tokens, they are not included in the syntax tree. Yet, because they are important when implementing a feature like refactoring and to maintain full fidelity with the source text, they are still accessible through our APIs on demand.
102-
103-
Because the `EndOfFileToken` can have nothing following it (neither token nor trivia), all trivia naturally precedes some non-trivia token, and resides between that token's "full start" and the "token start"
104-
105-
It is a convenient notion to state that a comment "belongs" to a `Node` in a more natural manner though. For instance, it might be visually clear that the `genie` function declaration owns the last two comments in the following example:
106-
107-
```TypeScript
108-
var x = 10; // This is x.
109-
110-
/**
111-
* Postcondition: Grants all three wishes.
112-
*/
113-
function genie([wish1, wish2, wish3]: [Wish, Wish, Wish]) {
114-
while (true) {
115-
}
116-
} // End function
117-
```
118-
119-
This is despite the fact that the function declaration's full start occurs directly after `var x = 10;`.
120-
121-
We follow [Roslyn's notion of trivia ownership](https://github.com/dotnet/roslyn/wiki/Roslyn%20Overview#syntax-trivia) for comment ownership. In general, a token owns any trivia after it on the same line up to the next token. Any comment after that line is associated with the following token. The first token in the source file gets all the initial trivia, and the last sequence of trivia in the file is tacked onto the end-of-file token, which otherwise has zero width.
122-
123-
For most basic uses, comments are the "interesting" trivia. The comments that belong to a Node which can be fetched through the following functions:
124-
125-
Function | Description
126-
---------|------------
127-
`ts.getLeadingCommentRanges` | Given the source text and position within that text, returns ranges of comments between the first line break following the given position and the token itself (probably most useful with `ts.Node.getFullStart`).
128-
`ts.getTrailingCommentRanges` | Given the source text and position within that text, returns ranges of comments until the first line break following the given position (probably most useful with `ts.Node.getEnd`).
129-
130-
As an example, imagine this portion of a source file:
131-
132-
```TypeScript
133-
debugger;/*hello*/
134-
//bye
135-
/*hi*/ function
136-
```
137-
138-
The full start for the `function` keyword begins at the `/*hello*/` comment, but `getLeadingCommentRanges` will only return the last 2 comments:
139-
140-
```
141-
d e b u g g e r ; / * h e l l o * / _ _ _ _ _ [CR] [NL] _ _ _ _ / / b y e [CR] [NL] _ _ / * h i * / _ _ _ _ f u n c t i o n
142-
                  ↑                                     ↑       ↑                       ↑                   ↑
143-
                  full start                      look for      first comment           second comment      token start
144-
                                              leading comments
145-
                                                starting here
146-
```
147-
148-
Appropriately, calling `getTrailingCommentRanges` on the end of the debugger statement will extract the `/*hello*/` comment.
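The `debugger` example above can be reproduced with a small self-contained sketch. This is a toy trivia scanner, not the compiler's scanner or its `ts.getLeadingCommentRanges`/`ts.getTrailingCommentRanges` functions; `ToyComment` and `scanTrivia` are hypothetical names. It assumes the ownership rule as described in the removed text: comments before the first line break trail the previous token, and comments after it lead the next token.

```typescript
// Toy model of trivia scanning; this is NOT the compiler's scanner.
interface ToyComment {
  text: string;
  afterLineBreak: boolean; // true if a line break precedes it in this trivia run
}

// Walk the trivia that begins at `fullStart` and stop at the first
// character that is neither whitespace nor part of a comment.
function scanTrivia(text: string, fullStart: number): { comments: ToyComment[]; tokenStart: number } {
  const comments: ToyComment[] = [];
  let sawLineBreak = false;
  let pos = fullStart;
  while (pos < text.length) {
    const ch = text[pos];
    if (ch === "\r" || ch === "\n") {
      sawLineBreak = true;
      pos++;
    } else if (ch === " " || ch === "\t") {
      pos++;
    } else if (text.startsWith("//", pos)) {
      // Single-line comment: runs up to, but not including, the line break.
      let end = pos;
      while (end < text.length && text[end] !== "\r" && text[end] !== "\n") end++;
      comments.push({ text: text.slice(pos, end), afterLineBreak: sawLineBreak });
      pos = end;
    } else if (text.startsWith("/*", pos)) {
      // Multi-line comment: runs through the closing delimiter.
      const close = text.indexOf("*/", pos + 2);
      const end = close === -1 ? text.length : close + 2;
      comments.push({ text: text.slice(pos, end), afterLineBreak: sawLineBreak });
      pos = end;
    } else {
      break; // first significant character: the token start
    }
  }
  return { comments, tokenStart: pos };
}

const source = "debugger;/*hello*/     \r\n    //bye\r\n  /*hi*/    function";
const fullStart = "debugger;".length; // trivia for `function` begins right after `;`
const { comments, tokenStart } = scanTrivia(source, fullStart);
const trailingOfDebugger = comments.filter(c => !c.afterLineBreak).map(c => c.text);
const leadingOfFunction = comments.filter(c => c.afterLineBreak).map(c => c.text);
```

Here `trailingOfDebugger` holds `/*hello*/`, `leadingOfFunction` holds `//bye` and `/*hi*/`, and `tokenStart` points at the `f` of `function`, matching the diagram above.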
12+
### **Full Start/Token Start**
13+
### **Trivia**
14914

150-
In the event that you are concerned with richer information of the token stream, `createScanner` also has a `skipTrivia` flag which you can set to `false`, and use `setText`/`setTextPos` to scan at different points in a file.
15+
See [the Scanner](https://github.com/microsoft/TypeScript-Compiler-Notes/blob/main/codebase/src/compiler/scanner.md) in the complier-notes repo.

Common-Errors.md

Lines changed: 1 addition & 14 deletions
@@ -1,14 +1 @@
1-
# Introduction
2-
3-
The list below captures some of the commonly confusing error messages that you may encounter when using the TypeScript language and Compiler
4-
5-
# Commonly Confusing Errors
6-
## "tsc.exe" exited with error code 1.
7-
8-
*Fixes:*
9-
* check file-encoding is UTF-8 — https://typescript.codeplex.com/workitem/1587
10-
11-
## external module XYZ cannot be resolved
12-
13-
*Fixes:*
14-
* check if module path is case-sensitive — https://typescript.codeplex.com/workitem/2134
1+
Deprecated doc

Compiler-Internals.md

Lines changed: 1 addition & 68 deletions
Original file line numberDiff line numberDiff line change
@@ -1,68 +1 @@
1-
This page details the compiler implementation and its philosophy.
2-
Because it focuses on implementation, it's necessarily out-of-date and incomplete.
3-
4-
Before reading this page, be sure to read the [[Architectural Overview]] first.
5-
6-
# General Design
7-
8-
## Laziness
9-
10-
To support language services that respond interactively, the compiler is lazy: it does not calculate any information until it is required.
11-
This allows it to respond quickly when the language service requests the type of a variable or its members.
12-
Unfortunately, laziness also makes the compiler code more complicated.
13-
14-
As an overview, after parsing is complete, the binder does nothing but identify symbols.
15-
The checker then waits until a particular symbol is requested to calculate type information, etc.
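The laziness described in the removed section above can be illustrated with a minimal sketch of on-demand, memoized resolution. This is not how the checker is implemented: `LazyTypeTable`, `getTypeOf`, and `resolveCount` are hypothetical names, and types are represented as plain strings for brevity.

```typescript
// Toy model of lazy, memoized resolution; this is NOT the compiler's checker.
class LazyTypeTable {
  private readonly initializers: Map<string, () => string>;
  private readonly resolved = new Map<string, string>();
  resolveCount = 0; // how many times real work was actually performed

  constructor(initializers: Map<string, () => string>) {
    this.initializers = initializers;
  }

  // Compute the type of `name` only when it is first requested, then reuse it.
  getTypeOf(name: string): string | undefined {
    const cached = this.resolved.get(name);
    if (cached !== undefined) return cached;
    const initializer = this.initializers.get(name);
    if (initializer === undefined) return undefined;
    this.resolveCount++;
    const type = initializer();
    this.resolved.set(name, type);
    return type;
  }
}

const table = new LazyTypeTable(
  new Map<string, () => string>([
    ["x", () => "number"],
    ["y", () => "string"],
  ]),
);
const first = table.getTypeOf("x");  // does the work for `x`
const second = table.getTypeOf("x"); // served from the cache
```

Nothing is computed for `y` because nothing asked for it, which is the trade-off the text describes: fast answers to individual questions at the cost of more bookkeeping in the code.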
16-
17-
## Immutability
18-
19-
Each phase of the compiler (parser, binder, etc -- see below for details) treats data structures from the previous phases as immutable.
20-
In addition, data structures created within each phase are not usually modified after their creation.
21-
This requires a look-aside table in some cases.
22-
For example, because the binder only looks at one file at a time, the checker needs a merged-symbols table to track merged declarations.
23-
It checks whether a symbol has an entry in the merged-symbols table each time before it uses a symbol.
24-
25-
# Parser
26-
27-
The parser is a recursive descent parser.
28-
It's pretty resilient to small changes, so if you search for function names matching the thing you want to change, you can probably get away with not having to think about the whole parser.
29-
There aren't any surprises in the general implementation style here.
30-
31-
## Incremental parsing
32-
33-
## ECMAScript parsing contexts
34-
35-
## ECMAScript automatic semicolon insertion
36-
37-
## JSDoc parsing
38-
39-
# Binder
40-
41-
# Checker
42-
43-
The checker is almost 20,000 lines long, and does almost everything that's not syntactic -- it's the second of two semantic passes, after binding, which is the first semantic pass.
44-
Since the semantics of a entire program can change dramatically with a couple of keystrokes (e.g. renaming a class), a new checker gets created every time the language service requests information.
45-
Creating a checker is cheap, though, because the compiler as a whole is so lazy.
46-
You just have to create some basic types and get the binder to build the global symbol table.
47-
48-
## Grammatical checking
49-
50-
## Overload resolution
51-
52-
## Type argument inference
53-
54-
### Type argument fixing
55-
56-
# Transformer
57-
58-
The transformer is nearing completion to replace the emitter.
59-
The change in name is because the *emitter* translated TypeScript to JavaScript.
60-
The *transformer* transforms TypeScript or JavaScript (various versions) to JavaScript (various versions) using various module systems.
61-
The input and output are basically both trees from the same AST type, just using different features.
62-
There is still a small printer that writes any AST back to text.
63-
64-
## Rewriting & synthesized nodes
65-
66-
## Sourcemap generation
67-
68-
# Language service
1+
> ### This page has moved to https://github.com/microsoft/TypeScript-Compiler-Notes/

Contributing-to-TypeScript.md

Lines changed: 2 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,8 @@ To log a bug, just use the GitHub issue tracker. Confirmed bugs will be labelled
88

99
Before we can accept a pull request from you, you'll need to sign the Contributor License Agreement (CLA). See the "Legal" section of the [CONTRIBUTING.md guide](https://github.com/Microsoft/TypeScript/blob/main/CONTRIBUTING.md). That document also outlines the technical nuts and bolts of submitting a pull request. Be sure to follow our [[Coding Guidelines|coding-guidelines]].
1010

11+
You can learn more about the compiler's codebase at https://github.com/microsoft/TypeScript-Compiler-Notes/
12+
1113
### Suggestions
1214

1315
We're also interested in your feedback in future of TypeScript. You can submit a suggestion or feature request through the issue tracker. To make this process more effective, we're asking that these include more information to help define them more clearly. Start by reading the [[TypeScript Design Goals]] and refer to [[Writing Good Design Proposals]] for information on how to write great feature proposals.

Node-Target-Mapping.md

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
11
## Recommended Node TSConfig settings
22

33
You can let TypeScript compile as little as possible by knowing what the baseline support
4-
for ECMAScript features are available in your node version.
4+
for ECMAScript features are available in your node version
5+
6+
You can also use https://github.com/tsconfig/bases/ to find `tsconfig.json`s to extend, simplifying your own JSON files to just the options for your project.
57

68
To update this file, you can use [node.green](https://node.green) to map to the different options in [microsoft/typescript@src/lib](https://github.com/Microsoft/TypeScript/tree/main/src/lib)
79

Tooling-On-The-Compiler-Repo.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -19,7 +19,7 @@ The bot which adds/removes labels and assignees lives at [microsoft/TypeScript-r
1919

2020
### Repros
2121

22-
A scheduled task which evaulates code samples generated in [the Bug Workbench](https://www.typescriptlang.org/dev/bug-workbench).
22+
A scheduled task which evaluates code samples generated in [the Bug Workbench](https://www.typescriptlang.org/dev/bug-workbench).
2323

2424
This automation runs via a [daily GitHub Action](https://github.com/microsoft/TypeScript/blob/master/.github/workflows/twoslash-repros.yaml) where the majority of the code lives at [`microsoft/TypeScript-Twoslash-Repro-Action`](https://github.com/microsoft/TypeScript-Twoslash-Repro-Action)
2525
