JSON Support #590
This is now ready for review. We support all of the above-noted functionality. Some known limitations:
This feature is experimental and requires ClickHouse 22.6.1. cc @genzgd
Why not something more generic, like

    type Named struct {
        Name   string
        Column Interface
    }

Or am I missing something important?
@ernado, do you mean wrapping each column here in the Named struct? That seems possible, but the JSON column is like any other column: when AppendRow is called, it is called on the Column itself. We would need either an exception here to access the name, or to move AppendRow to the Named struct. This seemed like a bigger change, but I'm open to the idea. Unlike other formats, we need the names of the columns because this information has to be encoded in the data, since JSON is encoded as tuples. We also need the names of the columns at scan time; this is handled in the tuple, as that is the format in which the data is returned. Ultimately, although this led to some duplication of the name property, I thought it would have less impact. I might be wrong.
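To make the trade-off discussed above concrete, here is a minimal sketch of the wrapper approach. It is not the driver's code: `Interface` is a hypothetical, pared-down stand-in for the real column interface, and `StringColumn` is a toy type invented for illustration. It shows the "move AppendRow to the Named struct" option, and why the wrapped column itself still never learns its own name.

```go
package main

import "fmt"

// Interface is a hypothetical, pared-down stand-in for the driver's column
// interface. The real interface has more methods; this is an assumption.
type Interface interface {
	AppendRow(v any) error
	Rows() int
}

// Named is the generic wrapper suggested in the review comment.
type Named struct {
	Name   string
	Column Interface
}

// StringColumn is a toy column (hypothetical) that has no idea what it is called.
type StringColumn struct{ values []string }

func (c *StringColumn) AppendRow(v any) error {
	s, ok := v.(string)
	if !ok {
		return fmt.Errorf("expected string, got %T", v)
	}
	c.values = append(c.values, s)
	return nil
}

func (c *StringColumn) Rows() int { return len(c.values) }

// AppendRow on the wrapper can use the name (here, only to annotate errors),
// but it delegates to the wrapped column, which still cannot see the name.
// A column that must encode names into its data would need more than this.
func (n Named) AppendRow(v any) error {
	if err := n.Column.AppendRow(v); err != nil {
		return fmt.Errorf("column %q: %w", n.Name, err)
	}
	return nil
}

func main() {
	col := Named{Name: "title", Column: &StringColumn{}}
	fmt.Println(col.AppendRow("hello"), col.Column.Rows())
	fmt.Println(col.AppendRow(42))
}
```

In this sketch the name lives only in the wrapper, so every caller would have to go through `Named.AppendRow` rather than the column's own method, which is the larger change mentioned in the reply.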
Amazing amount of work. At some point it probably needs another pass to reduce some duplication but LGTM for now
Add handling for nulls/nil at the beginning of a JSON slice
Great work!
Relies on ClickHouse/ClickHouse#37482
This support comes in a few steps:
Insert time
Query time (users are left to serialize to a string here)
Maps and structs can be interleaved as required.
Changes are significant as we need names with the sub-columns at query and insert time.
cc @CurtizJ