Include column offset information in the tokenizer #97997

Closed
@lysnikolaou

Description

@lysnikolaou
Member

The tokenizer currently tracks only the current line number (plus the starting line number for strings that span multiple lines). This makes computing column offsets harder, because they have to be derived by pointer arithmetic on the pointers to the beginning and end of each token, which gets even more complicated when line continuations or multiline tokens are involved.

Feature or enhancement

All of this can be significantly simplified if we keep a column offset counter in the tokenizer state.
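
To illustrate the idea, here is a minimal sketch of a toy tokenizer that carries a column counter in its state. It is not CPython's tokenizer (which is written in C): `ToyTokenizer`, `Token`, and `_advance` are hypothetical names made up for this example, and the sketch counts characters, whereas the `col_offset` values on AST nodes are UTF-8 byte offsets. The point is only that a token's start column can be read straight off the counter, even after a backslash line continuation, with no arithmetic on start and end positions.

```python
from dataclasses import dataclass


@dataclass
class Token:
    text: str
    lineno: int      # 1-based line on which the token starts
    col_offset: int  # 0-based column at which the token starts


class ToyTokenizer:
    """Hypothetical toy tokenizer; only illustrates the column counter."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.pos = 0
        self.lineno = 1
        self.col_offset = 0  # the extra piece of state proposed above

    def _advance(self) -> None:
        # Every consumed character goes through here, so the line and
        # column counters can never drift out of sync with `pos`.
        ch = self.source[self.pos]
        self.pos += 1
        if ch == "\n":
            self.lineno += 1
            self.col_offset = 0
        else:
            self.col_offset += 1

    def tokens(self) -> list[Token]:
        out = []
        src = self.source
        while self.pos < len(src):
            ch = src[self.pos]
            if ch.isspace():
                self._advance()
                continue
            if src.startswith("\\\n", self.pos):
                # Line continuation: consume the backslash and the newline.
                self._advance()
                self._advance()
                continue
            # The token's position is simply the current counter values.
            start, line, col = self.pos, self.lineno, self.col_offset
            if ch.isalnum() or ch == "_":
                while self.pos < len(src) and (
                    src[self.pos].isalnum() or src[self.pos] == "_"
                ):
                    self._advance()
            else:
                self._advance()  # any other character is a 1-char token
            out.append(Token(src[start:self.pos], line, col))
        return out
```

For example, `ToyTokenizer("x = 1 + \\\n    yy\n").tokens()` reports `yy` at line 2, column 4, because the counter was reset when the newline after the continuation backslash was consumed.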

Activity

@lysnikolaou self-assigned this on Oct 6, 2022
added a commit that references this issue on Oct 6, 2022

pythongh-97997: Add col_offset field to tokenizer and use that for AS…

added a commit that references this issue on Oct 7, 2022

gh-97997: Add col_offset field to tokenizer and use that for AST nodes (

3de08ce
added a commit that references this issue on Oct 8, 2022
added a commit that references this issue on Oct 11, 2022

pythongh-97997: Add col_offset field to tokenizer and use that for AS…

Metadata

Labels

type-feature: A feature request or enhancement
