Implement decomposition for BitonicSort
#1089
@tanujkhattar @fdmalone ptal
| with pytest.raises(NotImplementedError): | ||
| bloq.decompose_bloq() | ||
| bloq = Comparator(2**4) | ||
| assert bloq.t_complexity().t == 88 - 4 - 7 * 4 |
why the -7 * 4?
I had a reason for this, but can't remember it anymore. I think the previous `t_complexity` assumed somewhere that CSwap costs t=14, but the current implementation has t=7.
The paper assumes CSWAP is t=4 using 1 ancilla, right? We should probably open an issue to track that though. Textbook CSWAP should be t=7.
CSWAP can be reduced to 1 Toffoli as follows:
(circuit image not shown)
The Toffoli can then be implemented using 4 T gates, either with a relative-phase Toffoli or with an ancilla and an AND gate.
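For readers without the figure, here is a minimal classical sketch of one standard CSWAP-to-Toffoli reduction (CNOT, Toffoli, CNOT). That this is the circuit from the missing image is an assumption; the function names are hypothetical and not part of Qualtran. Since every gate involved permutes computational basis states, checking all 8 bit patterns is enough to confirm the identity.

```python
from itertools import product


def cswap_via_toffoli(c: int, a: int, b: int) -> tuple[int, int, int]:
    """Apply CNOT, Toffoli, CNOT to classical bits (c, a, b)."""
    a ^= b      # CNOT: control b, target a
    b ^= c & a  # Toffoli: controls c and a, target b
    a ^= b      # CNOT: control b, target a
    return c, a, b


def cswap_reference(c: int, a: int, b: int) -> tuple[int, int, int]:
    """Textbook definition: swap a and b only when c is 1."""
    return (c, b, a) if c else (c, a, b)


# Collect any basis state on which the two definitions disagree.
mismatches = [
    bits
    for bits in product((0, 1), repeat=3)
    if cswap_via_toffoli(*bits) != cswap_reference(*bits)
]
assert mismatches == []
```

The two CNOTs are Clifford gates, so the T count of this construction is that of its single Toffoli: 7 for the textbook Toffoli, or 4 with the relative-phase or AND-with-ancilla variants mentioned above.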
fdmalone
left a comment
So nice! I'm glad someone finally did this.
tanujkhattar
left a comment
+1, great to have this implemented!
qualtran/bloqs/arithmetic/sorting.py
Outdated
| """Given k numbers in [0, L), compare every pair that is `offset` apart. | ||
| Args: |
Please add more details to the docstring: do we compare and swap, or do we just store the result of the comparison somewhere? Are the RIGHT junk registers storing the result of the comparison, or are they supposed to be completely ignored?
updated all docstrings.
qualtran/bloqs/arithmetic/sorting.py
Outdated
| Args: | ||
| bitsize: Number of bits used to represent each integer. | ||
| L: Upper-bound (excluded) on the input integers. | ||
| k: Number of integers in each half |
Please specify that k must be a power of 2. Also specify the complexity in terms of T / Toffoli costs, ancilla cost and circuit depth.
Also, would it be a lot more work if we want to support merging two lists of different sizes?
It isn't too difficult, but the edge cases need to be handled with care. I wanted to get the power-of-2 case working first, and will implement the general case in a follow-up.
qualtran/bloqs/arithmetic/sorting.py
Outdated
| r"""Sort k numbers in the range {0, ..., L-1}. | ||
| Args: | ||
| L: Upper-bound (excluded) on the input integers. |
Please specify that k must be a power of 2. Also specify the complexity in terms of T / Toffoli costs, ancilla cost and circuit depth.
added to docstring
| return Signature( | ||
| [ | ||
| Register("xs", BoundedQUInt(self.bitsize, self.L), shape=(self.k,)), | ||
| Register("junk", QBit(), shape=(self.num_comparisons,), side=Side.RIGHT), |
Why do we have junk RIGHT registers everywhere? Are we saving on any uncomputation? Can you also expand the docstring to explain this?
This is the comparator circuit from the reference:
(circuit image not shown)
I wrote the signature similar to the `And` bloq, where the result bit is RIGHT.
I think it's worth highlighting this in the docstring -- i.e. the junk stores the result of the comparisons. This is necessary to make the sorting bloq reversible, so you can apply the unitary in reverse and get back the original sequence of unsorted elements if needed.
updated the docstring explaining this.
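To illustrate the reversibility point above, here is a minimal classical sketch (not the Qualtran implementation). It assumes each comparator records one bit saying whether it swapped its inputs; the function names, the bit convention, and the hand-written 4-input network are all hypothetical. Replaying the comparators in reverse order with the recorded bits recovers the original input.

```python
def compare_and_swap_forward(xs, comparators):
    """Run the network, recording one junk bit per comparator."""
    xs = list(xs)
    junk = []
    for lo, hi in comparators:
        out_of_order = int(xs[lo] > xs[hi])  # assumed convention: 1 means "swapped"
        junk.append(out_of_order)
        if out_of_order:
            xs[lo], xs[hi] = xs[hi], xs[lo]
    return xs, junk


def compare_and_swap_backward(xs, junk, comparators):
    """Undo the network by replaying the swaps in reverse order."""
    xs = list(xs)
    for (lo, hi), bit in zip(reversed(comparators), reversed(junk)):
        if bit:
            xs[lo], xs[hi] = xs[hi], xs[lo]
    return xs


# A 4-input bitonic sorting network written out by hand (illustrative data).
# A pair (lo, hi) means "afterwards xs[lo] <= xs[hi]".
NETWORK_4 = [(0, 1), (3, 2), (0, 2), (1, 3), (0, 1), (2, 3)]

original = [3, 1, 2, 0]
sorted_xs, junk = compare_and_swap_forward(original, NETWORK_4)
restored = compare_and_swap_backward(sorted_xs, junk, NETWORK_4)
assert sorted_xs == sorted(original)
assert restored == original
```

Without the junk bits the sorted output alone does not determine the input, which is why the comparison results have to be kept.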
qualtran/bloqs/arithmetic/sorting.py
Outdated
| If each half has length $k$, then the merge network uses $k * (\log{k} + 1)$ comparisons | ||
| when $k$ is a power of 2. |
Please also add information about what the junk register stores, since it can be potentially useful depending upon the context.
Added a brief description (similar to the one in BitonicSort).
Did you also want a description of the order of the ancillas?
The order is not as important, but we should at least specify that the ancilla qubits store the results of intermediate comparisons.
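As a sanity check on the $k (\log{k} + 1)$ comparison count quoted in the docstring, here is a classical sketch of a bitonic-style merge of two sorted halves of length $k$. It uses the textbook variant whose first layer compares element `i` with element `2k - 1 - i`; whether the PR's merge bloq uses this exact wiring is an assumption, and all names are hypothetical.

```python
import random


def bitonic_merge_layers(k):
    """Comparator layers that merge two ascending halves of length k (k a power of 2)."""
    if k < 1 or k & (k - 1):
        raise ValueError("k must be a power of 2")
    n = 2 * k
    # First layer: compare mirrored positions across the two halves.
    layers = [[(i, n - 1 - i) for i in range(k)]]
    # Remaining layers: half-cleaners with strides k/2, k/4, ..., 1.
    stride = k // 2
    while stride >= 1:
        layers.append([(i, i + stride) for i in range(n) if (i & stride) == 0])
        stride //= 2
    return layers


def apply_layers(xs, layers):
    """Apply each comparator (lo, hi) so that xs[lo] <= xs[hi] afterwards."""
    xs = list(xs)
    for layer in layers:
        for lo, hi in layer:
            if xs[lo] > xs[hi]:
                xs[lo], xs[hi] = xs[hi], xs[lo]
    return xs


rng = random.Random(0)
for k in (1, 2, 4, 8):
    log_k = k.bit_length() - 1
    layers = bitonic_merge_layers(k)
    # 1 + log k layers of k comparators each.
    assert sum(len(layer) for layer in layers) == k * (1 + log_k)
    left = sorted(rng.randrange(100) for _ in range(k))
    right = sorted(rng.randrange(100) for _ in range(k))
    assert apply_layers(left + right, layers) == sorted(left + right)
```

Each comparator in this sketch would correspond to one ancilla bit holding the comparison result, matching the junk register size discussed above.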
qualtran/bloqs/arithmetic/sorting.py
Outdated
| The bitonic sorting network requires $\frac{k}{2} \frac{\log{k} (\log{k} + 1)}{2}$ total comparisons, | ||
| and has depth $\frac{\log{k} (\log{k} + 1)}{2}$, when $k$ is a power of 2. Each comparison generates | ||
| one ancilla qubit, so the total size of junk register equals the number of comparisons. |
This is great!
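The counts in that docstring can be checked with a small classical sketch of the standard bitonic sorting network for $k$ a power of 2. This is an illustration only; the function names are hypothetical and the wiring is the usual textbook one, which may differ in ordering from the bloq's decomposition.

```python
import random


def bitonic_sort_layers(k):
    """Return the comparator layers of a bitonic sorting network on k wires."""
    if k < 1 or k & (k - 1):
        raise ValueError("k must be a power of 2")
    layers = []
    size = 2
    while size <= k:
        stride = size // 2
        while stride >= 1:
            layer = []
            for i in range(k):
                j = i ^ stride
                if j > i:
                    # Blocks alternate between ascending and descending order.
                    ascending = (i & size) == 0
                    layer.append((i, j) if ascending else (j, i))
            layers.append(layer)
            stride //= 2
        size *= 2
    return layers


def apply_layers(xs, layers):
    """Apply each comparator (lo, hi) so that xs[lo] <= xs[hi] afterwards."""
    xs = list(xs)
    for layer in layers:
        for lo, hi in layer:
            if xs[lo] > xs[hi]:
                xs[lo], xs[hi] = xs[hi], xs[lo]
    return xs


rng = random.Random(0)
for k in (2, 4, 8, 16):
    log_k = k.bit_length() - 1
    layers = bitonic_sort_layers(k)
    depth = log_k * (log_k + 1) // 2
    assert len(layers) == depth
    assert sum(len(layer) for layer in layers) == (k // 2) * depth
    data = [rng.randrange(100) for _ in range(k)]
    assert apply_layers(data, layers) == sorted(data)
```

Every layer holds exactly $k/2$ comparators, so the total is depth times $k/2$, which is also the size of the junk register if each comparison produces one ancilla.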
tanujkhattar
left a comment
LGTM % some final comments
qualtran/bloqs/arithmetic/sorting.py
Outdated
| $$ | ||
| The above expression is at most $k / 2$. | ||
| Each comparison generates one ancilla qubit, which are aggregated into the `junk` register. |
Suggested change:
| - Each comparison generates one ancilla qubit, which are aggregated into the `junk` register. |
| + Each comparison generates one ancilla qubit which stores the result of the comparison. The ancillas |
| + are aggregated into the `junk` register. |
updated
qualtran/bloqs/arithmetic/sorting.py
Outdated
| + \max((k\mod 2\delta) - \delta, 0) | ||
| $$ | ||
| The above expression is at most $k / 2$. |
There seems to be a break after "this requires, . The above expression".
We should finish the "this requires," line. Maybe something like
| - The above expression is at most $k / 2$. |
| + number of comparisons. The above expression is at most $k / 2$. |
oops, fixed
qualtran/bloqs/arithmetic/sorting.py
Outdated
| If each half has length $k$, then the merge network uses $k (1+\log{k})$ comparisons | ||
| when $k$ is a power of 2. | ||
| Each comparison generates one ancilla qubit, which are aggregated into the `junk` register. |
ditto: Let's specify that the ancilla qubit stores the result of comparisons
updated
Decomposition for sorting lists whose size is a power of 2. Partially addresses #219
Described briefly in https://www.nature.com/articles/s41534-018-0071-5, circuit based on wiki.