Skip to content

Add support for "flat_sample": sampling without keys for performance #3808

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
Strilanc opened this issue Feb 15, 2021 · 11 comments
Open

Add support for "flat_sample": sampling without keys for performance #3808

Strilanc opened this issue Feb 15, 2021 · 11 comments
Labels
area/measurements area/performance area/result kind/feature-request Describes new functionality status/needs-agreed-design We want to do this, but it needs an agreed upon design before implementation status/rfc-needed This needs an RFC document. triage/accepted A consensus emerged that this bug report, feature request, or other action should be worked on

Comments

@Strilanc
Copy link
Contributor

Currently Cirq has a bit of a performance problem. One of the contributing factors to this is that there are very few "escape hatches" that you can use to opt out of features that slow things down. This suggestion is to add one such escape hatch.

A "flat sample" is a 2d numpy uint8 array where the major index is "repetition index" and the minor axis lists off the output bits ordered by moment, then by within-moment order, then by qubit order within each measurement operation. If we want to go even more extreme, the array could be bit packed to reduce space usage by 8x although then it's much harder for people to use.

To be explicit:

k = 0
for t, moment in enumerate(circuit):
    for op in moment:
        if isinstance(op.gate, cirq.MeasurementGate):
            for q in op.qubits:
                print(f"flat index {k} refers to the measurement of qubit {q} in moment {t}")
                k += 1

Flat measurements can be more performant than non-flat measurements because they don't require scattering values across memory as they are collected. They also implicitly avoid key collision issues to do with measurements being repeated in a loop. This makes them much more amenable to e.g. error correction circuits.

I propose that we modify Sampler to have a flat_sample(circuit) method that by default delegates to Sampler.run, but which we then write more efficient implementations for in cirq.Simulator.

@Strilanc Strilanc added the kind/feature-request Describes new functionality label Feb 15, 2021
@daxfohl
Copy link
Collaborator

daxfohl commented Feb 16, 2021

Do you have an example of when the cost of key lookup would be significant compared to the cost of doing the simulation itself? It would be valuable to be able to provide before/after benchmarks vs the cost of maintaining the feature.

@daxfohl
Copy link
Collaborator

daxfohl commented Feb 17, 2021

Also I wonder if this would be better done using a strategy pattern instead of inheritance. So, you set (or preferably inject) repetition_sampling_strategy instead of requiring a method override. That would allow you to pick and choose which strategy you want rather than having it bound to the simulator class and dealing with inheritance. (There are a number of other things in simulators that I feel are orthogonal concerns and may be better to extract into strategy patterns as well, but that's a different topic).

(Also, granted I'm still new to Python so if there's a more Pythonic way of doing this then that's fine too.)

@maffoo
Copy link
Contributor

maffoo commented Feb 17, 2021

I like the idea of a packed representation because it's very natural to get from hardware. A different way to support something like this would be to make cirq.Result an abstract type instead of a concrete type. Then samplers would be free to make their own implementations using whatever packed data representation they want, as long as they support "unpacking" to access data for particular measurements by key, per the Result API.

Note, the v1 engine API defined exactly this sort of packed result format: https://github.com/quantumlib/Cirq/blob/master/cirq/google/api/v1/program.proto#L35 (there, the results are actually bit packed, giving the ~8x size reduction mentioned by @Strilanc). We ended up changing this in the v2 api to use a different format with results indexed by key, but TBH I think it'd be worth revisiting this decision for performance reasons, especially if the cirq.Result type itself becomes abstract and we have implementations for packed data.

@maffoo
Copy link
Contributor

maffoo commented Feb 17, 2021

A somewhat related issue is #3233 which wants to generalize the Result type along other axes such as allowing ints (for qudit measurements) or IQ points (for lower-level hardware readout info) instead of bits. In those cases, as here, I think decoupling the cirq.Result interface from particular concrete implementations would be very helpful.

@daxfohl
Copy link
Collaborator

daxfohl commented Feb 20, 2021

Also is this something that would have to be done in the simulator, or is it something that could be done in post processing? If the latter, we can make a function that takes in a measurement map and a circuit, and output a json-like object that has all the nested subcircuit and repetition measurements organized as nested objects and arrays. Then that should be easy analyze efficiently.

Basically, does the perf bottleneck primarily occur during the simulation itself, or in the post analysis? If the latter, then maybe the above could help resolve the issue.

@balopat balopat added area/performance triage/accepted A consensus emerged that this bug report, feature request, or other action should be worked on triage/discuss Needs decision / discussion, bring these up during Cirq Cynque and removed triage/accepted A consensus emerged that this bug report, feature request, or other action should be worked on labels Feb 23, 2021
@balopat
Copy link
Contributor

balopat commented Feb 23, 2021

+1 to dax's comment - I would like to understand a bit more the use case(s) where the performance issue shows itself, please comment on that @Strilanc - alternatively we can discuss on Cirq Cynque (added the label).

Regarding the implementation: I'm open towards this - I like @maffoo's abstract Result direction + revisiting the program proto + tying it in with the IQ points solution potentially.

@daxfohl
Copy link
Collaborator

daxfohl commented Feb 23, 2021

Maybe the simplest solution is to change the classes where you measure and store results from Dict to OrderedDict. Then you can reduce that to raw ordered list of measurements and pack it as much as you want after the fact. This way you don't have to inject functionality into all the simulators or add extra layers of abstraction.

Alternatively, a function that takes a circuit and gives you the ordered list of measurements it produces. You can then create a packed ordered measurement array from that and the result dictionary.

Gut feeling is we probably want to avoid abstracting Result here. I don't see a case for these two representations to be used interchangeably. Trying to force them into an inheritance hierarchy may just cause confusion in the end. Better to keep them as two separate things for now.

@daxfohl
Copy link
Collaborator

daxfohl commented Feb 23, 2021

I just saw there's a measurement_keys protocol, but it returns a set instead of a list. If we change that to list, then it looks like it returns everything in order (it worked on all the test cases I tried anyway: subcircuits, repetition ids, key mappings). From which you can do the second option I mentioned above. Would that solve the problem?

@Strilanc
Copy link
Contributor Author

Strilanc commented Feb 23, 2021

Do you have an example of when the cost of key lookup would be significant compared to the cost of doing the simulation itself?

It's not so much that it's a significant runtime cost as that it is one of the cuts in a death by a thousand cuts.

...Using a bit packed format instead of a bool-per-byte format often gives noticeable speed benefits just due to touching less memory, but I wouldn't expect most data sources and consumers to want to deal with the hassle of packed bits.

I think it's notable that the flat format, and the bit packed format, are much closer to what is actually produced by the hardware. This suggests there may be implementation convenience benefits, and removal of conversion costs.

@daxfohl
Copy link
Collaborator

daxfohl commented Feb 24, 2021

If we want this optimization during simulation, we could invert the above option. Have the record_measurement_result function record it in a packed array (which will be only a single code change this once #3841 is merged, which creates an ActOnArgs base class that contains this implementation), and then use the new measurement_keys_in_order protocol mentioned above to unpack the array into a dictionary before returning the run result. And of course provide a run_without_unpacking that returns the raw array.

This approach would give the functionality to all the _act_on_ simulators (sparse, DM, clifford, maybe mps soon). @Strilanc, which simulators are most important here? Are you primarily concerned about these, or qsim, etc?

@balopat
Copy link
Contributor

balopat commented Feb 24, 2021

Cirq Cynque:

  • unanimous agreement around getting this done: abstract Result class + having the ability for different runtime to use whatever native result format and then provide converter methods to it
  • needs detailed design (a little RFC would be nice!)

@balopat balopat added status/needs-agreed-design We want to do this, but it needs an agreed upon design before implementation triage/accepted A consensus emerged that this bug report, feature request, or other action should be worked on status/rfc-needed This needs an RFC document. and removed triage/discuss Needs decision / discussion, bring these up during Cirq Cynque labels Feb 24, 2021
CirqBot pushed a commit that referenced this issue Feb 7, 2022
Adds a `ClassicalDataStore` class so we can keep track of which qubits are associated to which measurements.

Closes #3232. Initially this was created as part 14 (of 14) of https://tinyurl.com/cirq-feedforward to enable qudits in classical conditions, by storing and using dimensions of the measured qubits when calculating the integer value of each measurement when resolving sympy expressions. However it may have broader applicability.

This approach also sets us up to more easily add different types of measurements (#3233, #4274). It will also ease the path to #3002 and #4449., as we can eventually pass this into `Result` rather than the raw `log_of_measurement_results` dictionary. (The return type of `_run` will have to be changed to `Sequence[C;assicalDataStoreReader]`.

Related: #887, #3231 (open question @95-martin-orion whether this closes those or not)

This PR contains a `ClassicalDataStoreReader` and `ClassicalDataStoreBase` parent "interface" for the `ClassicalDataStore` class as well. This will allow us to swap in different representations that may have different performance characteristics. See #3808 for an example use case. This could be done by adding an optional `ClassicalDataStore` factory method argument to the `SimulatorBase` initializer, or separately to sampler classes.

(Note this is an alternative to #4778 for supporting qudits in sympy classical control expressions, as discussed here: https://github.com/quantumlib/Cirq/pull/4778/files#r774816995. The other PR was simpler and less invasive, but a bit hacky. I felt even though bigger, this seemed like the better approach and especially fits better with our future direction, and closed the other one).

**Breaking Changes**:
1. The abstract method `SimulatorBase._create_partial_act_on_args` argument `log_of_measurement_results: Dict` has been changed to `classical_data: ClassicalData`. Any third-party simulators that inherit `SimulatorBase` will need to update their implementation accordingly.
2. The abstract base class `ActOnArgs.__init__` argument `log_of_measurement_results: Dict` is now copied before use. For users that depend on the pass-by-reference semantics (this should be rare), they can use the new `classical_data: ClassicalData` argument instead, which is pass-by-reference.
rht pushed a commit to rht/Cirq that referenced this issue May 1, 2023
…mlib#4781)

Adds a `ClassicalDataStore` class so we can keep track of which qubits are associated to which measurements.

Closes quantumlib#3232. Initially this was created as part 14 (of 14) of https://tinyurl.com/cirq-feedforward to enable qudits in classical conditions, by storing and using dimensions of the measured qubits when calculating the integer value of each measurement when resolving sympy expressions. However it may have broader applicability.

This approach also sets us up to more easily add different types of measurements (quantumlib#3233, quantumlib#4274). It will also ease the path to quantumlib#3002 and quantumlib#4449., as we can eventually pass this into `Result` rather than the raw `log_of_measurement_results` dictionary. (The return type of `_run` will have to be changed to `Sequence[C;assicalDataStoreReader]`.

Related: quantumlib#887, quantumlib#3231 (open question @95-martin-orion whether this closes those or not)

This PR contains a `ClassicalDataStoreReader` and `ClassicalDataStoreBase` parent "interface" for the `ClassicalDataStore` class as well. This will allow us to swap in different representations that may have different performance characteristics. See quantumlib#3808 for an example use case. This could be done by adding an optional `ClassicalDataStore` factory method argument to the `SimulatorBase` initializer, or separately to sampler classes.

(Note this is an alternative to quantumlib#4778 for supporting qudits in sympy classical control expressions, as discussed here: https://github.com/quantumlib/Cirq/pull/4778/files#r774816995. The other PR was simpler and less invasive, but a bit hacky. I felt even though bigger, this seemed like the better approach and especially fits better with our future direction, and closed the other one).

**Breaking Changes**:
1. The abstract method `SimulatorBase._create_partial_act_on_args` argument `log_of_measurement_results: Dict` has been changed to `classical_data: ClassicalData`. Any third-party simulators that inherit `SimulatorBase` will need to update their implementation accordingly.
2. The abstract base class `ActOnArgs.__init__` argument `log_of_measurement_results: Dict` is now copied before use. For users that depend on the pass-by-reference semantics (this should be rare), they can use the new `classical_data: ClassicalData` argument instead, which is pass-by-reference.
harry-phasecraft pushed a commit to PhaseCraft/Cirq that referenced this issue Oct 31, 2024
…mlib#4781)

Adds a `ClassicalDataStore` class so we can keep track of which qubits are associated to which measurements.

Closes quantumlib#3232. Initially this was created as part 14 (of 14) of https://tinyurl.com/cirq-feedforward to enable qudits in classical conditions, by storing and using dimensions of the measured qubits when calculating the integer value of each measurement when resolving sympy expressions. However it may have broader applicability.

This approach also sets us up to more easily add different types of measurements (quantumlib#3233, quantumlib#4274). It will also ease the path to quantumlib#3002 and quantumlib#4449., as we can eventually pass this into `Result` rather than the raw `log_of_measurement_results` dictionary. (The return type of `_run` will have to be changed to `Sequence[C;assicalDataStoreReader]`.

Related: quantumlib#887, quantumlib#3231 (open question @95-martin-orion whether this closes those or not)

This PR contains a `ClassicalDataStoreReader` and `ClassicalDataStoreBase` parent "interface" for the `ClassicalDataStore` class as well. This will allow us to swap in different representations that may have different performance characteristics. See quantumlib#3808 for an example use case. This could be done by adding an optional `ClassicalDataStore` factory method argument to the `SimulatorBase` initializer, or separately to sampler classes.

(Note this is an alternative to quantumlib#4778 for supporting qudits in sympy classical control expressions, as discussed here: https://github.com/quantumlib/Cirq/pull/4778/files#r774816995. The other PR was simpler and less invasive, but a bit hacky. I felt even though bigger, this seemed like the better approach and especially fits better with our future direction, and closed the other one).

**Breaking Changes**:
1. The abstract method `SimulatorBase._create_partial_act_on_args` argument `log_of_measurement_results: Dict` has been changed to `classical_data: ClassicalData`. Any third-party simulators that inherit `SimulatorBase` will need to update their implementation accordingly.
2. The abstract base class `ActOnArgs.__init__` argument `log_of_measurement_results: Dict` is now copied before use. For users that depend on the pass-by-reference semantics (this should be rare), they can use the new `classical_data: ClassicalData` argument instead, which is pass-by-reference.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/measurements area/performance area/result kind/feature-request Describes new functionality status/needs-agreed-design We want to do this, but it needs an agreed upon design before implementation status/rfc-needed This needs an RFC document. triage/accepted A consensus emerged that this bug report, feature request, or other action should be worked on
Projects
Status: No status
Development

No branches or pull requests

6 participants