
header support implementation #115


Merged: 7 commits merged into mfontanini:master from the headers branch on Oct 16, 2018

Conversation

@accelerated (Contributor) commented Oct 2, 2018

Hi @mfontanini, please take a look. This is just the first pass of the implementation for review purposes only. I haven't written any tests yet and also have not decorated methods with doxygen comments as things may change.

Some considerations:

  1. I have not implemented the name-based getter function rd_kafka_header_get. The reason is that the more generic rd_kafka_header_get_all can be used as a forward/reverse iterator and can actually have an end() function, which can be used for bounds checking when iterating and matches the STL container API well. With the name-based getter there is no way of guaranteeing what end() will be, so iteration can only be implemented as a forward iterator with a while loop. In any case the number of headers won't be very large, and anyone using the generic iterator over all elements can simply check for the header name they want, so there's no loss in functionality and the code is less complicated.
    The rdkafka iterator API is strange because it lets you iterate by index, but there's no API to delete a header at a certain position; instead it deletes by name, which can have duplicates, in which case one delete operation can remove entire portions of the underlying vector (I checked the rdkafka code and the header list is indeed a vector).
  2. The HeaderList<HeaderType> Message::detach_header_list() function could be implemented in a more C++ way as HeaderList<HeaderType> Message::get_header_list() &&, which would be an overload of the const get_header_list() but would apply only to rvalue messages (see the sketch after this list). The effect would be the same. Not sure what your preference is.
  3. There is a potential performance issue with the Producer::produce(const MessageBuilder) and Producer::produce(const Message) functions. The problem is that rd_kafka_producev() takes ownership of the header list, and because both MessageBuilder and Message are const, you can't make the underlying handle non-owning (which would be the fastest). Also, a user might produce multiple messages with the same builder object, so you have to maintain ownership in the const case,
    which forces you to clone the header list each time. Perhaps we can add two more produce functions with move semantics which would take MessageBuilder&& and Message&&?
  4. Inside the HeaderList class I made the following assumptions: if the list object is non-owning, a copy does not invoke clone; if on the other hand you own the handle, then a copy operation clones the headers. You could also have an owning object which on copy simply returns a non-owning object, but I don't think that's the proper logic.
  5. I have also added the Message::get_latency() API, which was missing.
  6. I will split the HeaderIterator and HeaderList into separate files in the next commit.
  7. Note that Message has a HeaderList<Buffer> type which is non-owning, whereas MessageBuilder has a HeaderList<T> where T matches the BufferType template parameter.
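
A standalone sketch of the ref-qualified overload idea from point 2 (simplified types for illustration only, not cppkafka's actual Message/HeaderList):

#include <iostream>
#include <string>
#include <utility>

class MessageSketch {
public:
    // Called on lvalues: the message keeps its headers and hands out a copy/view.
    std::string get_header_list() const & {
        return headers_;
    }
    // Called on rvalues: the headers are moved out, equivalent to detach_header_list().
    std::string get_header_list() && {
        return std::move(headers_);
    }
private:
    std::string headers_ = "header data";
};

int main() {
    MessageSketch msg;
    std::string copied = msg.get_header_list();               // const & overload
    std::string detached = std::move(msg).get_header_list();  // && overload
    std::cout << copied << " / " << detached << '\n';
}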

@mfontanini (Owner)

Sweet! Thanks for the PR, this is great. I'll have a look at it in more detail later but on a first pass it seems alright.

@mfontanini (Owner)

I presume given point 1, there can be multiple headers with the same name?

@accelerated (Contributor, author)

Yes, any number. The rdkafka list is a vector. Any add or remove operations will invalidate iterators.

@mfontanini (Owner) left a comment

Alright, I managed to get back to this and have a look at the code. It looks good, just a few comments. Regarding your third comment about the produce performance, I think that's a good idea. Maybe having something like this would work?

void produce(MessageBuilder&& b) {
    // move them
    do_produce(b, move(b.header_list()));
}

void produce(const MessageBuilder& b) {
    // copy them
    do_produce(b, MessageBuilder::HeaderListType(b.header_list()));
}

using ValueType = BufferType;
Header() = default;

Header(const std::string& name,
@mfontanini (Owner):

Since name is being stored here, I would normally take it by value and then std::move it.

@accelerated (Contributor, author), Oct 7, 2018:

BufferType is also stored... shouldn't it also be passed by value and moved in this case?

@accelerated (Contributor, author):

Can you please comment on this... I'm not sure if you also want the BufferType to be passed by value instead of by lvalue/rvalue reference?

@mfontanini (Owner):

Given a BufferType can't be moved automatically, I don't think this one should take a value. Probably just have both constructors take a string by value.

const BufferType& value)
: name_(name),
value_(make_value(value)) {
assert(!name.empty());
@mfontanini (Owner):

I haven't played around with headers but just in case: is it impossible for a header to have an empty name? It looks like the header is defined as just a string in the protocol so I don't know if the brokers validate the length before accepting them. I know this wouldn't make sense but I'd rather be sure this won't backfire.

@accelerated (Contributor, author):

OK, I will remove the assert and add a null-name test to see if Kafka likes it.


// prefix increment
HeaderIterator& operator++()
{
@mfontanini (Owner):

Minor: brace on line above please (same for all methods below)

/**
* \brief Gets the message latency in microseconds as measured from the produce() call.
*/
int64_t get_latency() const {
@mfontanini (Owner):

This should probably return std::chrono::microseconds
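
A minimal sketch of the suggested signature (requires <chrono>; rd_kafka_message_latency() is the librdkafka call that reports produce latency in microseconds, and handle_ is assumed to be the message handle member of the surrounding class):

std::chrono::microseconds get_latency() const {
    return std::chrono::microseconds(rd_kafka_message_latency(handle_.get()));
}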

// prefix decrement
HeaderIterator& operator--()
{
if (index_ == 0) {
@mfontanini (Owner):

I think these types of checks should be asserts. If people don't know how to use an iterator, it's their problem. I'd rather take the undefined behavior way like STL does.

// prefix increment
HeaderIterator& operator++()
{
if (*this == header_list_.end()) {
@mfontanini (Owner):

Same comment as below regarding the use of asserts for this.

throw Exception("Over bounds");
}
if (++index_ >= header_list_.size()) {
*this = header_list_.end();
@mfontanini (Owner):

The mixed comparisons against header_list_.size() and header_list_.end() are kind of confusing. Why can't this just use the index for everything rather than assigning end to itself? e.g. this method's implementation could simply be:

HeaderIterator& operator++() {
    assert(index_ < header_list_.size());
    ++index_;
    return *this;
}

I think the reason why you currently can't do this is because your implementation of HeaderList::end returns Iterator{make_non_owning(handle_.get()), -1} (btw I think that -1 is wrong given the index is a size_t, that should be at least throwing a warning). If that method instead returned Iterator{make_non_owning(handle_.get()), size()} then you could just use the index to identify the iterator and you wouldn't have to compare against header_list_.end() all the time. Plus, this is in line with what e.g. std::vector does where end returns a one-past-the-end iterator.

Am I missing something as to why it couldn't be implemented that way? It just seems like there are too many checks here and the methods could be a lot simpler. Right now the logic for operator-- seems overly complicated just because when you're at end, your index is not actually pointing at size() but at -1 (which I presume means numeric_limits<size_t>::max() in this context).

@accelerated (Contributor, author):

Ok so there's a bit of a story about this particular implementation. I know it can easily be achieved with the last index (and I will change the code to reflect your simpler approach), however my reasons initially were the following:

  1. I assumed that the rdkafka header list implementation was a list (turns out it's a vector) and because we don't have access to the underlying container I still wanted to keep the list iterator behavior where:
auto endIt = list.end();
list.emplace_back(...); //add one or more elements or delete them
assert (list.end() == endIt); //end() is immutable no matter how many operations you do

//obviously this is not valid for a vector, and since the underlying 
//container is a vector, we can just put a warning that any operation 
//on the list will invalidate all iterators (including end())

In order to achieve immutability, I had to create this -1 (max size_t) trick and assign it to the iterator if the rdkafka getter returns the ...NO_ENT error.

  2. I initially had two more overloads for name-based iteration (begin(name) and end(name)) to support that API as well... and when you iterate by name, the last element is not necessarily the highest index in the list. Even if I use rd_kafka_header_get_last to find the last element, it does not return the index it's at, so I have no clue what the next position is... hence both of these overloads needed a way to indicate end().

Since we now know it's a vector, and we don't care about the name-based iterator, using your simplified logic is better! Just thought I'd give you my reasoning so you don't think I wrote overly complicated code for no reason :).
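
For reference, with end() defined as index == size(), the decrement reduces to something like this (a sketch reusing the index_ member from the diff above):

// prefix decrement
HeaderIterator& operator--() {
    assert(index_ > 0); // decrementing begin() is undefined behavior, as in the STL
    --index_;
    return *this;
}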

HeaderType at(size_t index) const; //throws
HeaderType front() const;
HeaderType back() const;
HeaderType back(const std::string& name) const;
@mfontanini (Owner):

For some reason I feel like we shouldn't even wrap rd_kafka_header_get_last or it shouldn't be called back. I think the semantics are a bit confusing and I don't really see the point of it. Why would someone want to find the last occurrence of a header with a particular name? You also can't call this function find because it doesn't really find the header, it gets the last one with that name. So, I think it may be less confusing to just remove it altogether.

@accelerated (Contributor, author):

Yeah, I just had it for completeness, but I agree with you, it seems like a totally useless function. I'll remove it.

Error HeaderList<HeaderType>::add(const HeaderType& header) {
assert(handle_);
return rd_kafka_header_add(handle_.get(),
header.get_name().c_str(), header.get_name().size(),
@mfontanini (Owner):

Minor: I'd rather do header.get_name().data() for consistency (same in next 2 methods).

* \brief Detaches the message's header list
*/
template <typename HeaderType>
HeaderList<HeaderType> detach_header_list() {
@mfontanini (Owner):

Following up on your comment, I think this is fine. I'd rather be explicit about wanting to detach them than having to do move(message).detach_header_list().

@accelerated (Contributor, author) commented Oct 7, 2018

Regarding your suggestion above:

void produce(MessageBuilder&& b) {
    // move them
    do_produce(b, move(b.header_list()));
}

void produce(const MessageBuilder& b) {
    // copy them
    do_produce(b, MessageBuilder::HeaderListType(b.header_list()));
}

No, this will not work. What needs to be done for the MessageBuilder class is to have a method to detach the header list handle, then create a non-owning header list with it and pass that into producev. Simply moving the list will not work, because moving just transfers ownership of the unique_ptr elsewhere... what we need to do is get rid of that ownership altogether, since rdkafka will assume it. So the HeaderList class must have a method rdkafka_header_list_t* HeaderList::detach() which calls release() on the pointer and returns the handle; then you can do: make_non_owning(b.header_list().detach()).

In the case of Message&&, since the Message object is non-owning, you can't just pass the handle into rdkafka as it will take ownership and then the handle would have two owners. In this case you need to detach the list (as above) and pass it into rdkafka, e.g. make_non_owning(message.detach_header_list().detach()), which calls the same HeaderList::detach() method. One thing to remember is that Message::detach_header_list() creates an owning header list, so that in turn must also be detached.

For the const-ref methods you have to clone. There's no option there.

PS: regarding detaching the handle from the HeaderList object there are two ways:

  1. rdkafka_header_handle* HeaderList::detach(), which releases the pointer and returns the handle. After this the HeaderList object is unusable.
  2. void HeaderList::make_non_owning(), which internally removes the deleter function from the unique_ptr, i.e. creates a new unique_ptr without a deleter. This preserves the HeaderList object but converts it from owner to non-owner. Maybe we can call it convert_to_non_owning or release_ownership?

Not sure what you prefer. Please indicate and I will implement it.
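
For illustration, a standalone sketch of the two options using a unique_ptr with a toggleable deleter (simplified names, not cppkafka's actual HeaderList; the rdkafka destroy call is left commented out so the snippet builds without librdkafka):

#include <memory>

struct rd_kafka_headers_s;                 // opaque rdkafka handle, forward declaration only
using rd_kafka_headers_t = rd_kafka_headers_s;

struct HeaderListDeleter {
    bool owning = true;
    void operator()(rd_kafka_headers_t* handle) const {
        if (owning && handle) {
            // rd_kafka_headers_destroy(handle); // real cleanup when linked against librdkafka
        }
    }
};

class HeaderListSketch {
public:
    explicit HeaderListSketch(rd_kafka_headers_t* handle)
    : handle_(handle) {}

    // Option 1: give up the handle entirely; the object is unusable afterwards.
    rd_kafka_headers_t* detach() {
        return handle_.release();
    }

    // Option 2: keep the handle but stop owning it (release_ownership / release).
    void release_ownership() {
        handle_.get_deleter().owning = false;
    }

private:
    std::unique_ptr<rd_kafka_headers_t, HeaderListDeleter> handle_;
};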

@mfontanini (Owner)

I see your point, makes sense. I think I would go the release_ownership way but I'd call it release so it's in line with unique_ptr/shared_ptr.

updated travis file with v0.11.5
@accelerated (Contributor, author) commented Oct 10, 2018

Not really sure why all these tests are failing, everything passes in my environment. Also the consumer tests haven't changed at all. Could it be some timing issue on the build server where Travis runs?

===============================================================================
All tests passed (544 assertions in 37 test cases)

@mfontanini (Owner)

Hmm it's weird because the last two (#1 and #2) builds failed on exactly the same tests. If this was a timing issue I'd expect it to be more unpredictable.

Unrelated to failing tests but I forgot to mention: the header support should be allowed to be disabled via some compilation flag. Otherwise nobody would be able to use this with rdkafka < whatever version header support was introduced in.

@accelerated (Contributor, author)

Yeah, makes sense... we can do something similar to the Queue memory leak issue. I'll fix it. The producer code will look somewhat uglier, as that's the entry point of use...
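
A rough sketch of such a guard (CPPKAFKA_HAVE_HEADERS is a hypothetical macro name; RD_KAFKA_VERSION is librdkafka's hex version macro, and header support appeared around librdkafka 0.11.4):

#include <librdkafka/rdkafka.h>

// Enable the header wrappers only when the installed rdkafka supports them.
#if RD_KAFKA_VERSION >= 0x000b04ff
#define CPPKAFKA_HAVE_HEADERS 1
#endif

#ifdef CPPKAFKA_HAVE_HEADERS
// HeaderList / HeaderIterator declarations and the header-aware produce()
// overloads would only be compiled inside this block.
#endif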

@accelerated (Contributor, author)

Good to go!

@accelerated (Contributor, author)

It might well be that the rdkafka changes from 0.11.0 to 0.11.5 make a difference in timing. They made many performance improvements, which may be why the same tests failed. Perhaps your consumer runner class needs a tweak?

* \param name The header name
* \param value The header data to be moved
*/
Header(const std::string name,
@mfontanini (Owner):

This shouldn't be const, otherwise it won't be moved. Same with the overload above.

@accelerated (Contributor, author):

Silly me. Of course...good catch!
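
For reference, a minimal sketch of the agreed signature for the first overload (name taken by value, non-const, then moved; name_/value_/make_value are the members used in the diff above):

Header(std::string name,
       const BufferType& value)
: name_(std::move(name)),     // by-value parameter moved into the member
  value_(make_value(value)) {
}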

src/producer.cpp Outdated

void Producer::produce(MessageBuilder&& builder) {
MessageBuilder temp(std::move(builder)); //owns header list after the move
do_produce(temp, MessageBuilder::HeaderListType(temp.header_list().release_handle())); //move headers
@mfontanini (Owner):

Am I missing something as to why you can't just use builder without moving it into a temporary? Normally after moving the moved object is left in an unspecified state, so if your worries were in the lines of "we should still move it away" then it's fine, since it's unspecified I think moving only what's necessary (the headers in this case) is alright.

@accelerated (Contributor, author):

Yeah, that was the reason; I wanted to make sure a moved-from object will be empty afterwards (not just the headers), regardless of whether it's called from a temp rvalue or from a moved lvalue. OK, will remove it.

src/producer.cpp Outdated

void Producer::produce(Message&& message) {
Message temp(std::move(message)); //rdkafka still owns the header list at this point
do_produce(temp, HeaderList<Message::HeaderType>(temp.detach_header_list<Message::HeaderType>())); //move headers
@mfontanini (Owner):

I think the HeaderList<Message::HeaderType> is unnecessary, unless I'm missing something.

@accelerated (Contributor, author):

Yes you're right, I'll remove it.

CHECK(messages[0].get_header_list() == messages[1].get_header_list());
CHECK(messages[0].get_header_list().get_handle() != messages[1].get_header_list().get_handle());
}

@mfontanini (Owner):

Good job on the tests!

@mfontanini (Owner)

I think this looks good! I'm going to clone it over the weekend, play around a bit and see if I can fix the test issues and we can probably merge it after that.

@mfontanini (Owner) left a comment

This looks good. It looks like tests started failing sporadically when I upgraded my kafka broker to 2.0.0 (it was ~0.10 before). As far as I can see, when a consumer unsubscribes and another subscribes right after (such is the case when running tests) kafka doesn't immediately trigger a partition assignment. I presume this is to give it some grace period or something like that, maybe to reduce the number of rebalances.

It seems like if you increase the timeout in the BasicConsumerRunner from 20 seconds to something > 30 (35?), fewer tests fail. I still can't find a way to make it work every time though.

//Create an owning header list and copy it
HeaderList<BufferHeader> list(3), list2(3);
string payload1 = "payload1", payload2 = "payload2", payload3 = "payload3";
list.add(BufferHeader(std::string("header1"), payload1));
@mfontanini (Owner):

Just for the record, this can simply be

list.add({ "header1", payload1 });

No need to change it, but just to point out there's no need to be that explicit.

@accelerated (Contributor, author):

Sure, but how will the compiler know to instantiate a BufferHeader vs. a StringHeader with that initializer list? Both can be created with the same arguments.

@mfontanini (Owner):

HeaderList::add takes a HeaderType, which for that particular instantiation (list) is BufferHeader, so there's no ambiguity about which one you're referring to.
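
In other words, reusing the names from the test snippet above (illustrative, not a standalone example):

HeaderList<BufferHeader> list(3);
std::string payload1 = "payload1";
// The braced list constructs the list's HeaderType, i.e. a BufferHeader:
list.add({ "header1", payload1 });
// ...equivalent to the more explicit form:
list.add(BufferHeader(std::string("header1"), payload1));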

@accelerated (Contributor, author) commented Oct 16, 2018

Instead of using a consumer runner, can't you produce say N messages, then start the consumer with relative offsets (last-N) since there are no local offsets cached at this point, read N messages till EOF and then compare? This way you eliminate the runner altogether.

@mfontanini (Owner)

Well, some tests require simply subscribing and there could be some uncommitted messages in the topic you subscribe to. Also you need to make sure that you get the messages at some point, as if there's some bug and you don't produce them, you don't want the runner to be stuck forever. It's a pretty hacky solution but it kinda worked until it didn't.

@accelerated (Contributor, author)

The consumer block can be solved easily by only polling a fixed number of times or with a max timer. I'm not sure what happens if you have uncommitted messages in the topic and you open the consumer with the relative offset "latest". Is it latest committed + 1, or just latest time-wise, i.e. the top of the queue?
Could be worth testing. Alternatively, you create 500 topics and each test uses a fresh topic. Clean slate.

@mfontanini (Owner)

Creating 500 topics will still be an issue if a previous run on the same topic fails. Plus, that's not very maintainable :P. If you keep using the same consumer group id, the auto offset reset policy is irrelevant, because that's a fallback for when the consumer doesn't have any committed offset for a topic/partition; since there have been previous runs of the tests and offsets were committed, it never kicks in. Now, if you used a random consumer group id on each run with the reset policy "latest", that would work. I'm not a fan of this approach, but if it's the only way to make it work reliably then that's fine.

@accelerated (Contributor, author)

Yeah, I see what you mean. Ideally a consumer group id made from a GUID would be unique for each test, especially a v1 GUID, which is time-based. Alternatively, maybe blowing away the topics and recreating them on startup ensures everything is empty. I was assuming the Travis sandbox would be in a blank state each time you run Kafka, meaning no topic or partition leftovers from previous runs.

@mfontanini (Owner)

Travis does run in a clean state, but the issue is that all tests share the same topics and consumer groups. I think using a random string as the consumer group is enough here. Something like 16 alphanums generated via random_device + mt19937 + uniform_int_distribution should be enough.
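
A quick sketch of such a generator (the helper name is hypothetical):

#include <cstddef>
#include <random>
#include <string>

std::string random_group_id(std::size_t length = 16) {
    static const std::string alphanums =
        "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    std::random_device rd;
    std::mt19937 gen(rd());
    std::uniform_int_distribution<std::size_t> dist(0, alphanums.size() - 1);
    std::string id;
    for (std::size_t i = 0; i < length; ++i) {
        id += alphanums[dist(gen)];    // append one random alphanumeric character
    }
    return id;
}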

@accelerated (Contributor, author)

Right, each test gets a new producer/consumer pair. One other possibility is to implement the admin API and create/tear down topics dynamically? Not too familiar with that API though...
In any case, this test rehaul is a big operation; I wonder if you can merge the allocator and headers PRs and track the test refactor as a new issue where you can ask for help? For now my bandwidth is very limited...

@mfontanini (Owner)

Yeah, I think we can merge this. I wouldn't go the admin API way because that only works in some recent version of rdkafka so you wouldn't be able to test it locally. Can you just rebase this one? I should have some time in the afternoon today or tomorrow so I can give it a try.

@mfontanini (Owner)

Thanks!

@mfontanini merged commit fbe3759 into mfontanini:master on Oct 16, 2018
@accelerated deleted the headers branch on November 12, 2018