Qubit allocation / deallocation / move #3240
Comments
There's an argument to be made that #4100 will fulfill the majority of this feature request. In short:
Qubit-moving is not explicitly covered in that PR, but it does add a "reorder" method to the |
"reorder" would only handle moving if they were in the same qubit entanglement set. If they weren't, of course you could join the two qubit sets then reorder, which would do it but then you've got an unnecessarily joined set. An optimization would be to add logic to core_iterator that explicitly looks for swap gates, and then renames the corresponding two qubits in the two qubit sets (requires a new |
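To make the swap-renaming idea above concrete, here is a minimal pure-Python sketch. It is not code from either PR: `QubitSet`, `owner`, and `swap_by_rename` are hypothetical names, and each set's state is assumed to be a plain list of amplitudes. The point is only that a swap between qubits in different, unentangled sets can be applied by exchanging labels, without joining the sets or touching any amplitudes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QubitSet:
    """Hypothetical stand-in for one independently simulated group of qubits."""
    qubits: List[str]      # axis order of this set's state
    state: List[complex]   # amplitudes, length 2 ** len(qubits)


def swap_by_rename(owner: Dict[str, QubitSet], a: str, b: str) -> None:
    """Apply SWAP(a, b) by exchanging the two labels; no amplitudes change."""
    set_a, set_b = owner[a], owner[b]
    ia, ib = set_a.qubits.index(a), set_b.qubits.index(b)
    # The axis that used to be called `a` is now called `b`, and vice versa.
    set_a.qubits[ia], set_b.qubits[ib] = b, a
    owner[a], owner[b] = set_b, set_a


# q0 starts in |1> on its own; q1 and q2 share a set in |00>.
s0 = QubitSet(["q0"], [0, 1])
s1 = QubitSet(["q1", "q2"], [1, 0, 0, 0])
owner = {"q0": s0, "q1": s1, "q2": s1}

swap_by_rename(owner, "q0", "q1")
# s0 now describes q1 (in |1>), and s1 describes q0 and q2 (in |00>).
```

The two sets keep their original sizes, which is the benefit over joining them first and reordering afterwards.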
#4169 handles the qubit-moving part.
The PRs mentioned above provide implicit qubit (de-)allocation in simulation - i.e., there is no "allocate qubit" operation, the simulators simply detect changes in qubit entanglement and adjust accordingly. @dabacon, does this fulfill the requirements for ion trap simulation (particularly with regard to "moving" qubits), or are separate, explicit qubit (de-)allocation operations also necessary?
From @mpharrigan: could we add support for tracing qubits out (rather than measuring them)?
Add optimization that ensures independent qubit sets are simulated independently.

This is done by adding join, extract, and reorder methods to ActOnArgs, and updating SimulatorBase with the logic to merge qubit sets when necessary and split them when possible.

This optimization is enabled or disabled via a new parameter in the simulator constructors: `split_entangled_qubits`. Currently the PR has this set to True by default, though perhaps it should be disabled by default lest it breaks anything? The MPS simulator does not yet have `extract` defined and thus there's no option to enable this feature in MPS simulator's constructor yet, though nothing prevents this from being added later.

The perf boost of this implementation is limited because each StepResult still requires the full product state. It's still a speedup because full product state calculations will only have to occur once per moment rather than once per operation, but not as nice as avoiding full product state calculations entirely.

*That* optimization will be available in a subsequent PR that never creates the full product state if possible: StepResults will join the product state only on demand, and sampling will sample each substate independently and zip up the results, avoiding the full state join. The WIP is here: https://github.com/daxfohl/Cirq/compare/split...daxfohl:sample?expand=1.

I ramped up the number of qubits in the benchmarks to 25 for sparse and 12 for DM:

From master:

```
(cirq-py3) dax@DESKTOP-Q5MLJ3J:~/cirq$ time pytest dev_tools/profiling/benchmark_simulators_test.py
platform linux -- Python 3.8.5, pytest-5.4.3, py-1.10.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/dax/cirq
plugins: cov-2.5.1, asyncio-0.12.0, benchmark-3.2.3
collected 5 items

dev_tools/profiling/benchmark_simulators_test.py ..... [100%]

real 0m16.973s
user 0m15.754s
sys 0m3.862s
```

From split branch (the current PR):

```
(cirq-py3) dax@DESKTOP-Q5MLJ3J:~/cirq$ time pytest dev_tools/profiling/benchmark_simulators_test.py
platform linux -- Python 3.8.5, pytest-5.4.3, py-1.10.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/dax/cirq
plugins: cov-2.5.1, asyncio-0.12.0, benchmark-3.2.3
collected 5 items

dev_tools/profiling/benchmark_simulators_test.py ..... [100%]

real 0m10.073s
user 0m9.082s
sys 0m3.805s
```

From sample branch (future iteration mentioned above):

```
(cirq-py3) dax@DESKTOP-Q5MLJ3J:~/cirq$ time pytest dev_tools/profiling/benchmark_simulators_test.py
platform linux -- Python 3.8.5, pytest-5.4.3, py-1.10.0, pluggy-0.13.1
benchmark: 3.2.3 (defaults: timer=time.perf_counter disable_gc=False min_rounds=5 min_time=0.000005 max_time=1.0 calibration_precision=10 warmup=False warmup_iterations=100000)
rootdir: /home/dax/cirq
plugins: cov-2.5.1, asyncio-0.12.0, benchmark-3.2.3
collected 5 items

dev_tools/profiling/benchmark_simulators_test.py ..... [100%]

real 0m2.885s
user 0m3.523s
sys 0m2.597s
```

Initial PR for #3240
Closes #882
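The "sample each substate independently and zip up the results" idea can be sketched in a few lines of pure Python. This is an illustration under stated assumptions, not the code on the sample branch: `SubState`, `sample_substate`, and `sample_all` are hypothetical names, and each substate is assumed to be given directly as a probability vector over its own basis states.

```python
import random
from typing import Dict, List, Sequence, Tuple

# Hypothetical representation: (qubit names, probabilities over that set's basis states).
SubState = Tuple[Sequence[str], Sequence[float]]


def sample_substate(sub: SubState, rng: random.Random) -> Dict[str, int]:
    """Draw one basis state from a single substate and split it into bits."""
    qubits, probs = sub
    index = rng.choices(range(len(probs)), weights=probs)[0]
    n = len(qubits)
    # Big-endian convention: the first qubit is the most significant bit.
    return {q: (index >> (n - 1 - i)) & 1 for i, q in enumerate(qubits)}


def sample_all(subs: List[SubState], repetitions: int, seed: int = 0) -> List[Dict[str, int]]:
    """Sample every substate separately and merge the per-qubit results."""
    rng = random.Random(seed)
    shots = []
    for _ in range(repetitions):
        shot: Dict[str, int] = {}
        for sub in subs:
            shot.update(sample_substate(sub, rng))
        shots.append(shot)
    return shots


# A Bell pair on (q0, q1) and an unentangled q2 in |1>.
bell = (["q0", "q1"], [0.5, 0.0, 0.0, 0.5])
one = (["q2"], [0.0, 1.0])
shots = sample_all([bell, one], repetitions=5)
```

In this toy case the sampler stores 4 + 2 probabilities rather than the 8 a joined three-qubit state would need; the gap grows exponentially with the number of independent qubits.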
@mpharrigan @viathor Note that all the PRs for this are merged into master now. As for tracing out: if you add reset gates to qubits you no longer need, the simulator separates those qubits out. So, a Bell state between q0, q1, with reset on q1:
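The effect of that reset can be checked by hand. The following is a minimal pure-Python sketch that does not call Cirq; the helper names are hypothetical. It relies on the standard fact that a reset channel on q1 maps a two-qubit state ρ to Tr_q1(ρ) ⊗ |0⟩⟨0|, so the result is a product state whose q0 factor is exactly the partial trace over q1.

```python
from typing import List

Matrix = List[List[float]]


def bell_density_matrix() -> Matrix:
    """Density matrix of (|00> + |11>)/sqrt(2), basis index = 2*q0 + q1."""
    amp = [2 ** -0.5, 0.0, 0.0, 2 ** -0.5]
    # Amplitudes are real, so no complex conjugate is needed.
    return [[a * b for b in amp] for a in amp]


def trace_out_second_qubit(rho: Matrix) -> Matrix:
    """Partial trace over q1 of a two-qubit density matrix."""
    return [
        [sum(rho[2 * i + k][2 * j + k] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# After the reset, the two qubits factor into independent single-qubit states.
rho_q0 = trace_out_second_qubit(bell_density_matrix())  # maximally mixed
rho_q1 = [[1.0, 0.0], [0.0, 0.0]]                       # |0><0|
```

Here `rho_q0` comes out as the maximally mixed state, which no single-qubit state vector can represent.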
Does this satisfy the needs for tracing out? Do we want an optimizer that appends reset gates to unused/unmeasured qubits? Or can we mark this as done?
@95-martin-orion looks like this can be closed now.
Just to be clear: the user needs to use DensityMatrixSimulator in order to get the correct behavior for de-allocating qubits. It's strange we let the user shoot themselves in the foot by allowing |
Re-opening. I couldn't get de-allocation to work in any meaningful way, see #4360
Re-closing. Turns out there were other issues, see #4360
Problem: A common optimization in simulators is to let qubits exist for only part of the simulation. Qubits can be allocated and deallocated as the simulation runs, which can result in smaller simulation steps (less memory and less time). This type of optimization is useful in quantum error correction and in open systems simulation. In addition, there are architectures in which moving a qubit around is a useful concept, and one good way to implement this is to treat move (swap) operations as a separate type of operation. Simulators can often interpret such an operation as a tensor index re-ordering, which can be used to speed up simulation.
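As a rough sketch of why allocation and deallocation shrink the work per step, consider a state vector stored as a plain Python list. The `allocate` and `deallocate` helpers below are hypothetical and only cover the simplest case: a new qubit is added in |0⟩ as the least significant bit, and a qubit is dropped only if it is already back in |0⟩.

```python
from typing import List

State = List[complex]


def allocate(state: State) -> State:
    """Append a fresh qubit in |0> as the least significant bit (doubles the size)."""
    out: State = []
    for amp in state:
        out.extend([amp, 0j])
    return out


def deallocate(state: State, tol: float = 1e-12) -> State:
    """Drop the last qubit, which must be in |0> (halves the size)."""
    # Odd indices are the amplitudes where the last qubit is |1>.
    if any(abs(a) > tol for a in state[1::2]):
        raise ValueError("last qubit is not in |0>; it cannot be dropped without loss")
    return state[0::2]


plus = [2 ** -0.5, 2 ** -0.5]   # one qubit in |+>
bigger = allocate(plus)         # two qubits, 4 amplitudes
smaller = deallocate(bigger)    # back to 2 amplitudes
```

Every gate applied while the extra qubit is absent acts on a vector half the size, which is where the memory and time savings come from.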
Rough requirements: