Bug fixes for sync flush and add_tracker #91
Conversation
Note: ideally, if order preservation is needed with a retry mechanism, users should call
@@ -527,8 +529,16 @@ void BufferedProducer<BufferType>::async_flush() {

template <typename BufferType>
void BufferedProducer<BufferType>::flush() {
    async_flush();
    wait_for_acks();
    CounterGuard<size_t> counter_guard(flushes_in_progress_);
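For orientation, here is a minimal sketch of how the pieces visible in this hunk (CounterGuard, flushes_in_progress_, async_flush, wait_for_acks) might fit together; the loop and the get_buffer_size() check are illustrative assumptions, not the actual patch:

```cpp
// Sketch only, not the actual change: a synchronous flush that waits for all
// acknowledgements and keeps flushing while the buffer is non-empty.
template <typename BufferType>
void BufferedProducer<BufferType>::flush() {
    // Track that a flush is in progress for the lifetime of this call.
    CounterGuard<size_t> counter_guard(flushes_in_progress_);
    do {
        async_flush();    // hand the currently buffered messages to rdkafka
        wait_for_acks();  // block until every delivery report has arrived
    } while (get_buffer_size() > 0); // assumed: failed messages were re-queued by the delivery callback
}
```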
I understand that this is trying to preserve the ordering, but this is extremely inefficient. I'm not sure this should be the default behavior. Many applications just care about the entire batch of messages being produced; they don't care too much about the exact ordering in which that happens. Maybe this belongs on a separate method or something like that.
Well, I guess there's
I think kafka could actually reject messages 2 and 4 but not 3. There's a partition leader for each partition, so if the leader of partition 3 is up but the ones for 2 and 4 are having issues, I think this could happen. I think adding a default parameter may work? There should be some documentation along the lines of "Be aware that preserving the order can add significant time overhead to producing messages" or something like that.
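To make the flag idea concrete, a hypothetical sketch; `preserve_order` and `async_produce_front()` are not existing API, just an illustration of the trade-off being discussed:

```cpp
// Hypothetical opt-in flag, defaulting to the cheaper "whole batch" behavior.
template <typename BufferType>
void BufferedProducer<BufferType>::flush(bool preserve_order) {
    if (!preserve_order) {
        // Common case: push the whole batch out and wait for all delivery reports.
        async_flush();
        wait_for_acks();
        return;
    }
    // Order-preserving path: send and acknowledge one message at a time so a
    // failed message can be retried before anything queued after it goes out.
    // This is where the "significant time overhead" warning would apply.
    while (get_buffer_size() > 0) {
        async_produce_front();  // hypothetical helper: produce only the front of the buffer
        wait_for_acks();
    }
}
```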
Ok, I'll put a flag in the flush function. In reference to what you're saying, my understanding was that the batches are per-partition, and in that case they only go to the leader of that partition. If messages in the batch are intermixed (i.e. one queue per topic instead of per-partition) then your assumption would be correct. However, producing in batches is not necessarily MUCH more performant: you get more efficient bandwidth utilization, but then you run into IP fragmentation, reassembly and similar issues. Batching also means delaying sending the message to the remote application, so while one client is waiting to batch, say, 100 messages, another non-batched client may have already sent them and received 50 acks. Only proper benchmarking (and that's very network-setup dependent) can say what the performance increase is. BTW, if you don't mind sharing, how do you use the BufferedProducer? Do you do most buffering inside it and try not to use rdkafka's internal buffers? Or do you only buffer when throttling? Or do you buffer when you need to do ack counting?
Right, which means that if messages 1 and 3 go to partition 1 and message 2 goes to partition 2, the batch for (1, 3) can fail while the batch for 2 succeeds, so that could happen. The idea of batching is to remove the RTT between the client and kafka. You don't want to go to the kafka server and say "hey, I'm producing this one message" 100 times when you could go to the server a single time and say "hey, I'm producing these 100 messages". Requests/responses take time to travel over the wire, so you don't want to make more of them than you have to.
I normally can split my data into chunks, so I usually gather data for that chunk and once I'm done I generate the output data and simply call
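The chunk-then-flush pattern described above might look roughly like this with cppkafka's BufferedProducer; the broker address, topic name and std::string buffer type are placeholders:

```cpp
#include <cppkafka/configuration.h>
#include <cppkafka/message_builder.h>
#include <cppkafka/utils/buffered_producer.h>
#include <string>
#include <vector>

using namespace cppkafka;

// Illustrative usage: buffer one chunk of output data, then flush it in one go.
void produce_chunk(BufferedProducer<std::string>& producer,
                   const std::vector<std::string>& chunk) {
    // Gather the whole chunk into the producer's local buffer first...
    for (const std::string& payload : chunk) {
        producer.add_message(MessageBuilder("my_topic").payload(payload)); // placeholder topic
    }
    // ...then send everything at once and wait for the delivery reports.
    producer.flush();
}

int main() {
    Configuration config = {{"metadata.broker.list", "localhost:9092"}}; // placeholder broker
    BufferedProducer<std::string> producer(config);
    produce_chunk(producer, {"record 1", "record 2", "record 3"});
}
```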
Thanks for the input. My data is quite different from yours and ordering is very important, but I have it configurable by the app, so if ordering is not important for a certain topic, it can be configured to just do normal batching. What I meant earlier about the "batches per-partition" is that the output rdkafka queues are per-partition, so when a batch is sent for partition A, all messages in that batch are for the same partition, so all can fail, or perhaps just the first X messages in the batch. But maybe I'm wrong. Anyway... thanks for reviewing this!!
Right, but you can add messages for multiple partitions, so when you flush you could send multiple batches, one (or more?) per partition, and each individual batch can fail. So if some partition leader is broken, that particular batch will fail but the rest will succeed, which means you could have the situation you mentioned where 2 succeeds but 1 and 3 fail because they belong to another batch for a different broker.
Two bugs were fixed:

- SenderType == Sync was not set for sync_producer.
- The BufferedProducer implementation would post the failed Message from within the on_delivery_report callback, thus guaranteeing original order if a message was to be retried.
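A rough illustration of the second point; pending_acks_, do_add_message() and the Builder-from-Message conversion are assumed names for the sketch, not the actual code, but they show why retrying from inside the delivery-report handler keeps the original order:

```cpp
// Sketch only: when a delivery report says a message failed, re-post it
// immediately from inside the handler, ahead of anything produced later,
// so a retried message keeps its original relative position.
template <typename BufferType>
void BufferedProducer<BufferType>::on_delivery_report(const Message& message) {
    --pending_acks_;                      // assumed counter: this delivery attempt has completed
    if (message.get_error()) {
        do_add_message(Builder(message)); // assumed helper: re-queue the failed message right away
    }
}
```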