
When recovery is enabled more deliveries can be acked after recovery than the user expects #395


Closed
caspermout opened this issue Aug 17, 2018 · 15 comments

Comments

@caspermout

I found the following code in the class RecoveryAwareChannelN:

@Override
public void basicAck(long deliveryTag, boolean multiple) throws IOException {
    long realTag = deliveryTag - activeDeliveryTagOffset;
    // 0 tag means ack all when multiple is set
    if (realTag > 0 || (multiple && realTag == 0)) {
        transmit(new Basic.Ack(realTag, multiple));
        metricsCollector.basicAck(this, deliveryTag, multiple);
    }
}

When activeDeliveryTagOffset and deliveryTag are equal (which can happen directly after a channel is reconnected/recovered), realTag is 0, so with multiple set this method acks all outstanding messages instead of acking nothing.
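To make the failure mode concrete, here is a small standalone sketch. It is not the client's actual code; the class and method names (AckOffsetDemo, wouldTransmit) are invented for illustration. It only mirrors the condition from basicAck above:

```java
// Illustrative sketch of the condition in RecoveryAwareChannelN.basicAck.
// Shows how a stale pre-recovery delivery tag equal to the offset slips
// through the guard when multiple is set.
public class AckOffsetDemo {

    // Same boolean logic as the guard in basicAck above.
    static boolean wouldTransmit(long deliveryTag, long offset, boolean multiple) {
        long realTag = deliveryTag - offset;
        return realTag > 0 || (multiple && realTag == 0);
    }

    public static void main(String[] args) {
        long offset = 5; // 5 deliveries happened before recovery

        // First delivery after recovery has tag 6 -> realTag 1: ack is sent, as expected.
        System.out.println(wouldTransmit(6, offset, false));

        // A stale pre-recovery tag equal to the offset -> realTag 0.
        // With multiple=true, Basic.Ack(0, true) is still transmitted,
        // which the broker interprets as "ack everything outstanding".
        System.out.println(wouldTransmit(5, offset, true));

        // Without multiple, the same stale tag is correctly dropped.
        System.out.println(wouldTransmit(5, offset, false));
    }
}
```

The second case is the problem: the caller intended to ack one (now stale) delivery, but the frame that goes out acks every unacked delivery on the channel.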

The output of the attached sample application contains PRECONDITION_FAILED - unknown delivery tag 1: the channel tries to ack a message that was already acked, because the ack of the first message after connection recovery unintentionally acked all outstanding messages.

example-output.txt
TestRabbitRecover.txt

  • RabbitMQ version
    3.6.15

  • Erlang version
    Erlang 19.3.6.7

  • Client library version (for all libraries used)
    com.rabbitmq/amqp-client/4.5.0

@michaelklishin
Contributor

So what’s the proposed solution? However imperfect, this versioning feature has worked quite well for a number of years.

@michaelklishin michaelklishin changed the title Possible loss of messages caused by RecoveryAwareChannelN When recovery is enabled more deliveries can be acked after recovery than the user expects Aug 17, 2018
@michaelklishin
Contributor

@acogoluegnes do you have an opinion?

@caspermout
Author

See pull request #397

@michaelklishin
Contributor

@acogoluegnes I assume this can go into 4.8.0 safely?

@acogoluegnes
Contributor

@michaelklishin Yes, it can. We may have to release new RCs for 5.4.0 and 4.8.0, but we can also fit #398 into them.

@caspermout
Author

@michaelklishin I cannot find a 4.8.x version that contains this fix. Is there one?

@acogoluegnes
Contributor

@caspermout Looks like it didn't make it to 4.x.x-stable. I'll backport it and release 4.8.3.

@acogoluegnes
Contributor

Didn't make it to 5.x.x-stable :-(

@caspermout
Author

Thank you

@acogoluegnes acogoluegnes modified the milestones: 4.8.0, 4.8.3 Oct 8, 2018
@acogoluegnes
Contributor

acogoluegnes commented Oct 8, 2018

@caspermout
Author

Thanks!

@antenko

antenko commented Apr 11, 2019

Didn't make it to 5.x.x-stable :-(

Hi!
Are you going to release this bugfix in 5.3.x or 5.x.x?
We have the same issue.

@michaelklishin
Contributor

There were multiple 5.x releases since October 2018.

2e6bf27 was cherry-picked to 5.x.x-stable and 5.6.x-stable as 1a8ffed.

@michaelklishin
Contributor

So 1a8ffed is available as of 5.5.3 and 5.7.0. I don't think it's in 5.6.0 but would be in 5.6.1 if it ships. I see no reason to stick to 5.3.x over any of the above versions, in particular 5.7.0.
