TTL expire message
Hello,
We are testing the broker configuration option link-heartbeat-interval. Our scenario simulates a network failure on one link of a two-link network.
The message TTL is 18 seconds. We capture messages at the sender's and receiver's exchanges, and we measure the message delivery rate by matching up message IDs at both ends.
With link-heartbeat-interval = 2, no messages are lost. With link-heartbeat-interval = 10, messages are lost. However, with both settings we still see the discards-ttl-expired counter increase in qpid-stat.
As I understand the acquire/acknowledge process, when a message is marked as redelivered, the other consumer (the 2nd link) can acquire and acknowledge it. But from what I'm seeing, the 2nd consumer gets the message but is still not able to acknowledge it.
Could you give me a better understanding of this whole process? And how can I avoid the discards-ttl-expired messages?
Thank you.
Huong,
Responses
Hello,
I don't understand how 2 links between 2 brokers can be established and used for unidirectional routes. Could you please provide the qpid-route commands you used to configure this?
It is also interesting that there are discards-ttl-expired messages in the link-heartbeat-interval=2 scenario, where you report no message loss. One explanation could be that the 2nd consumer received the messages but was in fact unable to acknowledge them, so the broker expired them after a while. Client acknowledgement should not depend on the redelivered flag of the message, so I suspect either an unexpected bug in the qpid client library (which language is it?) or different behaviour in your application for redelivered messages.
I guess the link-heartbeat-interval=10 scenario has message loss because the source broker sends a connection.heartbeat every 10 seconds and expects a reply with the same AMQP frame from the destination broker. If no reply is received within 10s, the link is closed. So:
- at time T, src.broker sends a heartbeat, the last one that dst.broker responds to
- at time T+X (X<10s), a network failure happens
- at time T+10, the first unanswered heartbeat is sent
- at time T+20, the link is assumed lost and is closed
- one second later, src.broker tries to reconnect (I don't know how promptly it tries to fail over to the 2nd link, since I don't understand how you set this up)
So a message sent at time T+X can be successfully redelivered to dst.broker only after approx. 21-X seconds. If you add some latency from the sender to src.broker and from dst.broker to the receiver, the 18-second TTL can easily be reached. For link-heartbeat-interval=2, the same math (two intervals plus the one-second reconnect delay) gives only about 5-X seconds.
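To make the arithmetic concrete, here is a rough Python sketch of the timeline above. It is only a model of my guess, not of what the broker really does: the function names, the one-second reconnect delay and the extra_latency parameter are assumptions for illustration. The link is taken to be closed two heartbeat intervals after the last answered heartbeat, which gives 21-X seconds for an interval of 10 and 5-X seconds for an interval of 2.

```python
def redelivery_delay(heartbeat_interval, failure_offset, reconnect_delay=1.0):
    """Approximate seconds a message sent at T+X waits before redelivery.

    Assumed model (from the timeline above, not verified against the broker):
      T                -> last heartbeat that gets a reply
      T + X            -> network failure, message is sent (X = failure_offset)
      T + interval     -> first unanswered heartbeat
      T + 2 * interval -> link assumed lost and closed
      + reconnect_delay -> src.broker tries to reconnect
    """
    if not 0 <= failure_offset < heartbeat_interval:
        raise ValueError("failure_offset must satisfy 0 <= X < heartbeat_interval")
    link_closed_at = 2 * heartbeat_interval
    return link_closed_at + reconnect_delay - failure_offset


def may_expire(ttl, heartbeat_interval, failure_offset, extra_latency=0.0):
    """True if the modelled delay plus other latency reaches the message TTL."""
    delay = redelivery_delay(heartbeat_interval, failure_offset)
    return delay + extra_latency >= ttl


if __name__ == "__main__":
    ttl = 18.0  # message TTL from the original post
    for interval in (2, 10):
        for x in (0.0, 1.0):
            delay = redelivery_delay(interval, x)
            print(f"interval={interval} X={x}: delay={delay}s "
                  f"may_expire={may_expire(ttl, interval, x)}")
```

In this model, with an interval of 10 any message sent less than 3 seconds after the last answered heartbeat waits more than 18 seconds and so can expire, while with an interval of 2 the delay never exceeds 5 seconds. That would match the loss you see with 10, but it would not explain the discards-ttl-expired count with 2.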
