Skip to content

JSON Protocol #1880

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 60 commits into from
Aug 24, 2019
Merged

JSON Protocol #1880

merged 60 commits into from
Aug 24, 2019

Conversation

mpharrigan
Copy link
Collaborator

@mpharrigan mpharrigan commented Jul 26, 2019

Protocol to support serializing cirq objects

  • cirq.to_json
  • cirq.read_json

The former looks for a magic method _json_dict_(self) which returns a dictionary keyed by strings and whose values are either basic python types or are objects that support the SupportsJSON protocol. They dictionary must have an entry with key "cirq_type", to be used during loading. The latter looks for a magic classmethod _from_json_dict_(cls, **kwargs) or falls back to calling the constructor.

_json_dict_ is required for objects to support serialization, so cirq.protocols.to_json_dict is provided as convenience to implementers. You must only provide a list of attribute names which will be included in the dictionary.

During deserialization, the "cirq_type" value is passed to an ordered list of functions. Each function either resolves the string name into a python type or the next function is called. A default mapping from class name is provided as a dictionary.

There is a test fixture that tests any object exposed at the top cirq-level import by roundtripping to JSON. There is a blacklist of classes for which it doesn't make sense to serialize, and another "xfail" list for objects that haven't been upgraded to support serialization yet, but there's no reason not for them to.

This pull request includes support for Qubits, Gates, Operations, Moments, Circuits, PauliString, and a couple of other things. I recommend reviewing and merging this PR which extends support for these high-priority objects and then working on the rest over time. Any new objects should implement the serialization protocol unless there's good reason not to.

This pull request makes some small changes to objects, mostly adding @propertys to provide a consistent expectation that arguments passed to the constructor are available under the same name as attributes/properties.

@googlebot googlebot added the cla: yes Makes googlebot stop complaining. label Jul 26, 2019
and remove redundant helper functions
Copy link
Contributor

@Strilanc Strilanc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like how classes have to declare which attributes will be initialized/read/etc. This is too inflexible w.r.t. how state is stored vs exposed. We should do something more generic that requires a bit more work per gate, such as this:

class XPowGate:
    ...
    def _json_(self):
        return {'exponent': 0.5, 'global_shift': 0.5}

    @staticmethod
    def _from_json_id_():
        return 'XPow'

    @staticmethod
    def _from_json_(args)
        return XPowGate(args.json['exponent'], args.json['global_shift'])

class GateOperation:
    ...
    def _json_(self):
        return {'qs': [e._json_() for e in self.qubits], 'gate': gate._json_()}

    @staticmethod
    def _from_json_id_():
        return 'gate'

    @staticmethod
    def _from_json_(args)
        gate = args.from_json(json['gate'])
        assert isinstance(gate, cirq.Gate)
        qubits = [Qid.from_json(e) for e in json['qs']]
        return GateOperation(gate, qubits)

And when people call cirq.from_json they can pass in an optional list of classes to be parsed. By default cirq includes all its internal gates in the parsing.

@mpharrigan
Copy link
Collaborator Author

What is args in the above?

I'm not sure exactly what is different in your example vs. what I've implemented other than GateOperation being responsible for calling _json_ on its children rather than having the encoder do it.

How does from_json know what type to construct?

from cirq.type_workarounds import NotImplementedType


# Note for developers:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably have a test that checks that every public class is serializable.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sketched out an implementation of this in the latest commit. There are some classes that we probably don't want or need to make serializable and there are some that are lower priority that I don't want to block this PR from being merged

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one problem with the existing implementation is that it leaves out singletons whose class is private

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reworked the logic to try anything that's not a function or module, with an explicit blacklist

pass


def to_json_dict(obj, attribute_names):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a bit worried that we're straying into territory that should be solved at the python level instead of at a library level. I don't have a better suggestion, there's just an alarm going off in my head about it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dataclasses can use the asdict helper. In general, I think hacking into the __dict__ is a little too magic

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Synced offline with @Strilanc. His objection was to the concept of serialization and that it should be done at the python level. Python's built-in serialization is pickle which is too brittle for our case; python's json module in the standard library is what we exploit here and follow the general pattern for extending that library

@mpharrigan
Copy link
Collaborator Author

@dabacon addressed concerns and added support for sympy.Symbol

Copy link
Collaborator

@dabacon dabacon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

import cirq
self._crd = {
'CCXPowGate': cirq.CCXPowGate,
'CCZPowGate': cirq.CCZPowGate,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to repeat the names twice? You could have a list of the classes, and then {a.__name__: a for a in ...}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The way I structured it this way is in the eventuality that we re-name a class (or something) and want to support backwards compatible loading where the string name won't match the (new) class name. Of course this could still be tacked on with the scheme you mention but I think it would end up being more confusing.

Let me know what you think and I can make the change if you think it's necessary

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that eventuality, we can create the dictionary as {easy_case} + {hard_stuff} and have the best of both worlds. This is not a blocking change.

@dabacon dabacon added the automerge Tells CirqBot to sync and merge this PR. (If it's running.) label Aug 24, 2019
@CirqBot CirqBot added the front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. label Aug 24, 2019
@CirqBot CirqBot merged commit 237f64f into quantumlib:master Aug 24, 2019
@CirqBot CirqBot removed automerge Tells CirqBot to sync and merge this PR. (If it's running.) front_of_queue_automerge CirqBot uses this label to indicate (and remember) what's being merged next. labels Aug 24, 2019
@mpharrigan mpharrigan deleted the json-protocol branch January 28, 2020 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Makes googlebot stop complaining.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants