🧹 Clarification: Optional properties and null vs. undefined when used in languages like Python that only has a single None type #1586

@Ark-kun

Description

Specification section

?

What is unclear?

Please help us.

Pydantic v2 started converting Python's Optional[str] type to the JSON Schema {"anyOf":[{"type":"string"}, {"type":"null"}]} instead of an optional string property. This breaks many existing tools that use JSON Schemas, but the maintainer claims that JSON Schema is designed this way. See pydantic/pydantic#7161.
Please help us get clarity on whether this is really what the JSON Schema spec intends.

I want to ask whether this is indeed the intention of JsonSchema design and if it's not the case, then hopefully the maintainers can be persuaded to restore the previous behavior.

Problem background:
JavaScript has null and undefined types.
Python has a single None value (a singleton). It is used automatically in some cases. For example, when a function does not return anything, the actual returned value is None.

Let's look at this simple JsonSchema that has an optional field:

{
  "title": "Something",
  "type": "object",
  "properties": {
    "requiredProp": {"type": "string"},
    "optionalProp": {"type": "string"}
  },
  "required": ["requiredProp"]
}

Now let's try to represent such schema using Python:

from typing import Optional
from pydantic import BaseModel

class Something(BaseModel):
  requiredProp: str
  optionalProp: Optional[str]

For this type, Pydantic v2 produces the following JsonSchema:

{
  "title": "Something",
  "type": "object",
  "properties": {
    "requiredProp": {
      "title": "Requiredprop",
      "type": "string"
    },
    "optionalProp": {
      "title": "Optionalprop",
      "anyOf": [
        {"type": "string"},
        {"type": "null"}
      ]
    }
  },
  "required": ["requiredProp", "optionalProp"]
}

Notice that "optionalProp" is listed as required and its type declaration is {"anyOf":[{"type":"string"}, {"type":"null"}]}.

And if we slightly change the class to add the default value:

from typing import Optional
from pydantic import BaseModel

class Something(BaseModel):
  requiredProp: str
  optionalProp: Optional[str] = None

some_obj = Something(requiredProp="foo")

The generated schema becomes

{
  "title": "Something",
  "type": "object",
  "properties": {
    "requiredProp": {
      "title": "Requiredprop",
      "type": "string"
    },
    "optionalProp": {
      "title": "Optionalprop",
      "anyOf": [
        {"type": "string"},
        {"type": "null"}
      ]
    }
  },
  "required": ["requiredProp"]
}

The optionalProp type declaration still remains {"anyOf":[{"type":"string"}, {"type":"null"}]}.

So it's not possible to generate a normal optional string property.
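For tools that need the older shape, one possible workaround is to post-process the generated schema. The sketch below uses only the standard library; `collapse_nullable_optionals` is a hypothetical helper, not part of Pydantic or of any JSON Schema library, and it only looks at top-level properties. Note that it changes the meaning of the schema: after collapsing, an explicit null is no longer accepted for that property.

```python
import copy


def collapse_nullable_optionals(schema: dict) -> dict:
    """Return a copy where non-required `anyOf: [X, {"type": "null"}]` properties become X.

    Hypothetical helper: the resulting schema no longer accepts an explicit null.
    """
    result = copy.deepcopy(schema)
    required = set(result.get("required", []))
    for name, prop in result.get("properties", {}).items():
        if name in required or "anyOf" not in prop:
            continue
        non_null = [s for s in prop["anyOf"] if s != {"type": "null"}]
        if len(non_null) == 1 and len(non_null) < len(prop["anyOf"]):
            # Replace the anyOf wrapper with the single remaining subschema,
            # keeping sibling keywords such as "title".
            del prop["anyOf"]
            prop.update(non_null[0])
    return result


# The schema from the second example above (property has a None default).
generated = {
    "title": "Something",
    "type": "object",
    "properties": {
        "requiredProp": {"title": "Requiredprop", "type": "string"},
        "optionalProp": {
            "title": "Optionalprop",
            "anyOf": [{"type": "string"}, {"type": "null"}],
        },
    },
    "required": ["requiredProp"],
}

simplified = collapse_nullable_optionals(generated)
print(simplified["properties"]["optionalProp"])
# {'title': 'Optionalprop', 'type': 'string'}
```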

Is it the intention of JsonSchema that programming languages that do not have the undefined/null duality of Javascript cannot adhere to simple JSON schemas with simple optional properties?

Would it be OK to treat Python's None as Javascript's undefined in cases of optional function/constructor parameters or are these types considered to be fundamentally different?

Proposal

I propose to clarify that in non-JS languages optional properties with the default None/NULL/nil value can be treated as Javascript's undefined and can be described using JsonSchema's optional property mechanism.

Do you think this work might require an [Architectural Decision Record (ADR)]? (significant or noteworthy)

No

Activity

gregsdennis (Member) commented on Feb 21, 2025

I think @Julian is probably the best person to comment on Python-specific things.


I do have a question: would you consider this to be valid data?

{
  "requiredProp": "",
  "optionalProp": null
}

Specifically, is a null value interpreted by your code the same as the property just being absent?

I'd guess that the Pydantic folks might think it is valid if the property is optional (null and absence are the same), whereas maybe you don't.


As far as JSON Schema is concerned, a property with a null value is distinct from the absence of that property. This is the design intent of JSON Schema.
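The distinction can be seen from Python with the standard library `json` module. This is only a small illustration of the two documents in question, not a schema validation:

```python
import json

# The document from the question: the key is present and its value is null.
with_null = json.loads('{"requiredProp": "", "optionalProp": null}')
# The same document with the property left out entirely.
absent = json.loads('{"requiredProp": ""}')

# JSON null decodes to Python None, but the key still exists in the dict.
print("optionalProp" in with_null)  # True
print("optionalProp" in absent)     # False

# dict.get() hides the difference: both lookups return None.
print(with_null.get("optionalProp") is absent.get("optionalProp"))  # True
```

A schema that declares "optionalProp" as {"type": "string"} accepts the second document but rejects the first, because null is a value and it is not a string.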

Julian (Member) commented on Feb 21, 2025

(The JSON Schema spec doesn't cover schema generation from a language's types, so a bit of this discussion will always be groundless. But nevertheless, yes, opinions below.)

What you're asking is mostly about "shortcomings" in the typing annotation system in Python really more than anything else I think. And I put "shortcomings" in quotes here because the case where this matters -- at least when it comes to classes -- is one I would call a bad idea in Python, so I don't personally cry too hard about it not being possible.

Optional[str] in Python, as it seems has been pointed out in the ticket there, is simply shorthand for str | None.
There is no way to express the concept of "might not exist" as part of a normal class. E.g. for your example:

class Something:
  requiredProp: str
  optionalProp: Optional[str] = None

I disagree that even this expresses the JSON / JSON Schema notion of "optionalProp may not be present". That notion is expressed by typing.NotRequired in the case of dicts, and in the case of classes it... does not exist (above I called it a bad idea, and I think it is for any use case other than using class syntax to generate schemas).

Specifically, for classes it really would look like:

class _S1:
    requiredProp: str
    optionalProp: str

class _S2:
    requiredProp: str

Something = _S1 | _S2  # union alias; the | operator on classes needs Python 3.10+

but there's no shorthand for that, and clearly it's untenable for multiple such properties -- and again I think for normal Python classes it's ridiculous to design one which sometimes doesn't have an attribute (but this is the direct parallel to dicts not having a key).

> I propose to clarify that in non-JS languages optional properties with the default None/NULL/nil value can be treated as Javascript's undefined and can be described using JsonSchema's optional property mechanism.

In short, I'd disagree with that both from a JSON Schema perspective and from a Python developer's perspective, though one not really familiar with Pydantic's norms.

I'm not saying this solves the upstream problem, just that "treat None specially" seems very wrong. To me my first guess would be an annotation a la NotRequired for non-TypedDicts is the right shape of solution.

jdesrosiers (Member) commented on Feb 21, 2025

As Julian said, the spec doesn't cover how schemas map to a language's type system, but I can share my opinion.

First, a couple of things to keep in mind. Remember that JSON Schema describes JSON, not JavaScript, and undefined is not a feature of JSON. The absence of a value is effectively the same concept, but you can't assign something to be undefined ({ "foo": undefined }) like you can in JavaScript. Also, as Greg pointed out, null isn't the same as undefined. In JSON, null is a value, not an indicator of the absence of a value as it is in most languages. In the same way that boolean is a type with two possible values (true and false), null is a type with one possible value (null). When a JSON Schema says a property is null, it means the property must be present with the value null.

IMO, that makes JSON's null a JSON-specific concept that should be avoided unless you're specifically trying to model JSON that has nulls, and you definitely shouldn't equate it to common concepts of null/nil/None/etc. Since there's no concept in Python that translates to JSON's concept of null, I wouldn't expect it to ever generate schemas that use null. I think it makes the most sense to equate Python's None with the absence of a value in JSON.

So, Something(requiredProp="foo", optionalProp=None) should be considered equivalent to { "requiredProp": "foo" }.

I don't think using a None default value should make any difference to the generated schema. The instance created from instance1 = Something(requiredProp="foo", optionalProp=None) and instance2 = Something(requiredProp="foo") are indistinguishable. Both instance1.optionalProp and instance2.optionalProp have None. Therefore the JSON representation should be the same as well.
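The claim about the two instances can be checked with a small sketch. A standard library dataclass stands in for the Pydantic model here, which is an assumption made to keep the example dependency-free:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Something:
    requiredProp: str
    optionalProp: Optional[str] = None


instance1 = Something(requiredProp="foo", optionalProp=None)
instance2 = Something(requiredProp="foo")

# The object keeps no record of whether None was passed or defaulted.
print(instance1 == instance2)  # True
print(instance2.optionalProp)  # None
```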

This approach also has the benefit of resulting in simpler and more idiomatic JSON Schemas.

Again, there's no official correct or incorrect way to do this. This is just my recommendation.

Julian (Member) commented on Feb 21, 2025

(Responding again in case you didn't know the below, Jason; if you did and still think your way, it's obviously fine to disagree.)

> Since there's no concept in Python that translates to JSON's concept of null, I wouldn't expect it to ever generate schemas that use null. I think it makes the most sense to equate Python's None with the absence of a value in JSON.

Python's None serializes as null, and it's very common to have None wherever you'd like in Python as a real value, so the equivalence is there already / I think that ship has long sailed, which is why I disagreed (strongly) with:

> So, Something(requiredProp="foo", optionalProp=None) should be considered equivalent to { "requiredProp": "foo" }.
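The None-to-null mapping is easy to confirm with the standard library `json` module:

```python
import json

# Python's json module maps None to JSON null and back again.
encoded = json.dumps({"requiredProp": "foo", "optionalProp": None})
print(encoded)  # {"requiredProp": "foo", "optionalProp": null}

decoded = json.loads(encoded)
print(decoded["optionalProp"] is None)  # True
```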

jdesrosiers (Member) commented on Feb 21, 2025

Thanks for the correction Julian! It's been a while since I've written Python and didn't remember that correctly. That means that Python's None is equivalent to JSON's null. In that case, generating schemas from Python using JSON null is logically sound.

However, JSON that uses null to represent absent values is not idiomatic JSON; it makes schemas unnecessarily complex and awkward, and renders some JSON Schema keywords unusable. That's the problem that originally motivated this question. So, I still think it would be best to equate None with not-present even though null isn't technically wrong. I recognize that this could cause some friction in the Python ecosystem, which serializes nulls by default. If it's not too hard to get around that, you could make the lives of the users of your JSON and JSON Schemas much easier.
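Getting around the default serialization can be fairly small in practice. The following is a minimal sketch using a standard library dataclass as a stand-in for the model; `to_json_omitting_none` is a hypothetical helper name, and it only filters top-level fields.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class Something:
    requiredProp: str
    optionalProp: Optional[str] = None


def to_json_omitting_none(obj) -> str:
    """Serialize a dataclass, leaving out top-level fields whose value is None."""
    data = {key: value for key, value in asdict(obj).items() if value is not None}
    return json.dumps(data)


print(to_json_omitting_none(Something(requiredProp="foo")))
# {"requiredProp": "foo"}
print(to_json_omitting_none(Something(requiredProp="foo", optionalProp="bar")))
# {"requiredProp": "foo", "optionalProp": "bar"}
```

Pydantic v2 has an `exclude_none=True` option on `model_dump` and `model_dump_json` that serves a similar purpose. It affects the serialized data only, so the generated schema would still need to agree with it.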

gregsdennis (Member) commented on Feb 28, 2025

@Ark-kun does the above answer your questions?

          🧹 Clarification: Optional properties and null vs. undefined when used in languages like Python that only has a single None type · Issue #1586 · json-schema-org/json-schema-spec