Avoid unnecessary string join and split

https://github.com/microsoft/vscode/blob/4e5d793a7828f1baf62e9e06ce5137023bd17c06/extensions/ipynb/src/helpers.ts#L364-L377

Currently, the stream output conversion (from VS Code types to Jupyter) performs an unnecessary string concatenation followed by a split. This slows down the conversion and increases memory use and GC pressure in V8:

* Stream output is either `CellOutputMimeTypes.stderr` or `CellOutputMimeTypes.stdout`, so `convertOutputMimeToJupyterOutput` will always return a `string` here and the `Array.isArray(curr)` branch is never taken. Meanwhile, `prev.concat(curr)` allocates a new array on every iteration.
* `splitMultilineString(outputs.join(''))` can slow down the process significantly. It first joins all the strings and then splits the result on line breaks. The split forces V8 to flatten the concatenated string produced by `outputs.join('')`, which doubles the memory usage.

We can probably run `splitMultilineString` on each output separately and then concatenate the last line of each output with the first line of the next one (split first, then join).

	function convertStreamOutput(output: NotebookCellOutput): JupyterOutput {
		const outputs = output.items
			.filter((opit) => opit.mime === CellOutputMimeTypes.stderr || opit.mime === CellOutputMimeTypes.stdout)
			.map((opit) => convertOutputMimeToJupyterOutput(opit.mime, opit.data as Uint8Array) as string)
			.reduceRight<string[]>((prev, curr) => (Array.isArray(curr) ? prev.concat(...curr) : prev.concat(curr)), []);

		const streamType = getOutputStreamType(output) || 'stdout';

		return {
			output_type: 'stream',
			name: streamType,
			text: splitMultilineString(outputs.join(''))
		};
	}
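
A minimal sketch of the "split first, then join" idea, not the actual fix. It assumes the chunks are already decoded strings in the order they should appear. `splitLines` is a hypothetical stand-in for `splitMultilineString`, assumed to keep the trailing `\n` on every line except possibly the last. `splitChunksIntoLines` is likewise an illustrative name, not part of the extension.

```typescript
// Hypothetical stand-in for splitMultilineString (assumption): splits text
// into lines, keeping the trailing '\n' on every line except possibly the last.
function splitLines(text: string): string[] {
	const lines: string[] = [];
	let start = 0;
	let idx = text.indexOf('\n', start);
	while (idx !== -1) {
		lines.push(text.slice(start, idx + 1));
		start = idx + 1;
		idx = text.indexOf('\n', start);
	}
	if (start < text.length) {
		// Trailing partial line without a line break.
		lines.push(text.slice(start));
	}
	return lines;
}

// Illustrative helper (assumption): splits each chunk on its own, then
// stitches a trailing partial line onto the first line of the next chunk,
// so the full text is never joined into one large string.
function splitChunksIntoLines(chunks: string[]): string[] {
	const result: string[] = [];
	for (const chunk of chunks) {
		const lines = splitLines(chunk);
		if (lines.length === 0) {
			continue; // Empty chunk contributes nothing.
		}
		let i = 0;
		const last = result.length - 1;
		if (last >= 0 && !result[last].endsWith('\n')) {
			// Previous chunk ended mid-line: merge it with this chunk's first line.
			result[last] += lines[0];
			i = 1;
		}
		for (; i < lines.length; i++) {
			result.push(lines[i]);
		}
	}
	return result;
}
```

For example, `splitChunksIntoLines(['a\nb', 'c\n', 'd'])` returns `['a\n', 'bc\n', 'd']`, the same result as splitting the joined string `'a\nbc\nd'`. Only the short boundary lines are concatenated.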

Avoid unnecessary string join and split #129370