
Compatibility with Phprdkafka 4.0 #959

Merged: 2 commits, Sep 15, 2020

Conversation

Steveb-p
Contributor

@Steveb-p Steveb-p commented Oct 4, 2019

@see https://github.com/arnaud-lb/php-rdkafka/releases/tag/4.0.0

Since poll no longer gets called on destruct/shutdown, we should provide error handling (#749).

I'm marking this as a draft PR, since otherwise people might lose messages when the PHP process shuts down before a message is acknowledged by the Kafka broker.

@Steveb-p Steveb-p changed the title Allow usage of Phprdkafka 4.0 Compatibility with Phprdkafka 4.0 Oct 4, 2019
@nick-zh nick-zh mentioned this pull request Oct 7, 2019
2 tasks
@anam-hossain

@Steveb-p What code changes are required to handle this issue? Do you have an example? We are facing the same issue. Thanks in advance.

@Steveb-p
Contributor Author

@anam-hossain #749 needs to be resolved.

I've seen your response about flush in the phprdkafka repository. I'll try to work on it soon™.

@Steveb-p Steveb-p marked this pull request as ready for review October 26, 2019 12:58
@Steveb-p
Contributor Author

Steveb-p commented Oct 26, 2019

@TiMESPLiNTER @makasim @nick-zh could you take a look at this PR and check whether I'm doing it right? This should roughly replicate the current behavior for phprdkafka 3.x and wait for message delivery unless a special configuration (shutdown_timeout) is present.

There is still the "issue" of setDefaultTopicConf being deprecated (btw @nick-zh, it might be worth adding that to the phprdkafka documentation?). What is the expected way of handling this? How should TopicConf be delivered to an instance?

EDIT: @nick-zh actually, it looks like not passing TopicConf instance into newTopic method (https://arnaud.le-blanc.net/php-rdkafka/phpdoc/rdkafka.newtopic.html) causes NULL to be returned. I've noticed it when running automated tests here.

I've got local tests failing, probably due to some connectivity issue. Integration tests are failing when sending a message 😕


    // Compatibility with phprdkafka 4.0.
    if (isset($this->producer) && method_exists($this->producer, 'flush')) {
        $this->producer->flush($this->config['shutdown_timeout'] ?? -1);
Contributor

Maybe throw an error or log something if the return value is not RD_KAFKA_RESP_ERR_NO_ERROR. Sorry, I'm not sure what the enqueue convention is for something like that.

Contributor Author

As we discussed: throwing an exception here would change the behavior and force wrapping libraries (like enqueue-bundle) to catch it, so I've left it out.

However, I do agree that it is something that this package eventually should do, since otherwise potential delivery errors will be hidden from end user.
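
One way to surface flush failures without throwing, as discussed in this thread, is to check the return code and hand it to a logger. The following is a minimal sketch, not the code merged in this PR: `flushAndReport` and the `$log` callback are hypothetical names, and the producer is only assumed to expose `flush(int): int` the way `RdKafka\Producer` does in phprdkafka 4.0.

```php
<?php

// Hypothetical helper, not part of enqueue/rdkafka: flush and report a
// non-success return code through a logger callback instead of throwing.
function flushAndReport(object $producer, int $timeoutMs, callable $log): bool
{
    // phprdkafka 3.x has no flush(); treat that as "nothing to do".
    if (!method_exists($producer, 'flush')) {
        return true;
    }

    // The extension defines RD_KAFKA_RESP_ERR_NO_ERROR (0); fall back to 0
    // so the sketch also runs where the extension is not loaded.
    $noError = defined('RD_KAFKA_RESP_ERR_NO_ERROR') ? RD_KAFKA_RESP_ERR_NO_ERROR : 0;

    $result = $producer->flush($timeoutMs);
    if ($result !== $noError) {
        $log(sprintf('Kafka flush did not complete cleanly (code %d)', $result));

        return false;
    }

    return true;
}
```

A caller could pass something like `[$logger, 'warning']` as `$log`, which keeps the behavior non-throwing for wrapping libraries while still making delivery problems visible.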

@Steveb-p
Contributor Author

Steveb-p commented Oct 27, 2019

The issues were stemming from changes in the image the Dockerfile is based on (the PHP version changed to 7.3), which caused the phprdkafka extension not to be loaded, so the stubs were being used instead 😕 . I'd say we drop stub loading in bootstrap.php in this situation? It's only really relevant when running unit tests, and it shouldn't really be that way: it makes those classes behave like phpunit's mocks :/

I've streamlined the Dockerfile, in particular splitting out the loading of external OS dependencies for the build and sorting them alphabetically.

@nick-zh
Contributor

nick-zh commented Oct 28, 2019

Except for the error handling, I think the changes look OK.
Regarding the default topic conf, @Steveb-p: all the configs that were set there can be set in the normal config object.
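
To illustrate the suggestion above, here is a small sketch of folding former topic-level settings into a single configuration object. `applyKafkaSettings` is a hypothetical helper, and the object is only assumed to expose `set(string, string)` as `RdKafka\Conf` does; whether a given topic-level property is accepted on the global configuration should be checked against the librdkafka configuration reference.

```php
<?php

// Hypothetical helper: apply global settings plus former TopicConf settings
// to one configuration object exposing set(string, string).
function applyKafkaSettings(object $conf, array $global, array $topic): void
{
    // Array union keeps the left-hand value, so global keys win on conflict.
    foreach ($global + $topic as $name => $value) {
        $conf->set($name, (string) $value);
    }
}

// Usage with the real extension (assumes phprdkafka is loaded):
// $conf = new RdKafka\Conf();
// applyKafkaSettings(
//     $conf,
//     ['metadata.broker.list' => 'localhost:9092'], // assumed broker address
//     ['message.timeout.ms' => 5000]                // formerly set on TopicConf
// );
```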

@stale

stale bot commented Nov 27, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Nov 27, 2019
@stale stale bot removed the wontfix label Nov 27, 2019
@maks-rafalko maks-rafalko left a comment

I would like to bump this PR and activate further development.

We have upgraded quite a big application to PHP 7.4. All our consumers are now stuck and can not be restarted when php-rdkafka:^3 is used.

bin/console messenger:consume --limit=1

So, after 1 message is processed, the process is "frozen" and does not return exit code 0.

There was no such issue with PHP 7.3 + php-rdkafka 3.

Fortunately, upgrading to php-rdkafka 4 solves the issue.

I have forked enqueue/rdkafka and installed the fork to the project - and it works as expected.

Questions: are you guys interested in merging this one? How can I help? Can we merge and release a new major version without waiting for #749 or should we continue to work with our fork?

Probably, as a compromise we can merge this one without tagging a major version, so that we can use dev-master but not the fork.


PHP 7.3 works fine with php-rdkafka 3
PHP 7.4 does not work correctly with php-rdkafka 3
PHP 7.4 works fine with php-rdkafka 4

@Steveb-p Steveb-p self-assigned this Feb 3, 2020
This allows compatibility with phprdkafka 4.0
@Steveb-p
Contributor Author

I've revisited the branch and removed all changes unrelated to actual compatibility with phprdkafka 4.0.

Now all this does is introduce shutdown_timeout configuration property that takes effect when working with phprdkafka 4.0.
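
As a sketch of how the new property is read, the snippet below mirrors the `$this->config['shutdown_timeout'] ?? -1` expression from the diff. `resolveShutdownTimeout` is a hypothetical helper, and the surrounding config keys (`global`, the broker address) are assumptions for illustration only.

```php
<?php

// Hypothetical helper mirroring the expression used in the diff:
// $this->config['shutdown_timeout'] ?? -1
function resolveShutdownTimeout(array $config): int
{
    return (int) ($config['shutdown_timeout'] ?? -1);
}

$config = [
    'global' => ['metadata.broker.list' => 'localhost:9092'], // assumed broker address
    'shutdown_timeout' => 5000, // wait at most 5000 ms for outstanding deliveries
];

echo resolveShutdownTimeout($config), PHP_EOL; // prints 5000
echo resolveShutdownTimeout([]), PHP_EOL;      // prints -1 (the default)
```

With the key omitted, the value passed to flush is -1, which is the default discussed later in this thread.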

Other changes will be opened as separate PRs.

@Steveb-p Steveb-p requested a review from makasim February 14, 2020 20:19
@schroedingerskatze42

@makasim are there any contributions needed to merge this?

@Egari

Egari commented Sep 14, 2020

I would also very much like to use enqueue/rdkafka on PHP 7.4 with phprdkafka 4.

@Steveb-p @makasim any prospect on seeing this PR reviewed and/or merged?

@makasim
Member

makasim commented Sep 14, 2020

> @makasim are there any contributions needed to merge this?

I've no idea.

@Steveb-p is it good to go?

@Steveb-p
Contributor Author

@makasim It should be. I think it was working the last time I worked with it, but a lot of time has passed and I'm not sure. At the very least, I'll need to take a look at it again.

@makasim
Member

makasim commented Sep 14, 2020

Looks good to me.

    $producer = new VendorProducer($this->getConf());

    if (isset($this->config['log_level'])) {
        $producer->setLogLevel($this->config['log_level']);
Contributor

@nick-zh nick-zh Sep 14, 2020

If this is setLogLevel of RdKafka\Producer, be aware that it has been deprecated. Just having log_level in RdKafka\Conf is fine ✌️
@Steveb-p I do tend to forget which producer is which in enqueue, so forgive me if my assumption is not correct.

Contributor Author

@nick-zh you're correct, this is directly calling setLogLevel on Kafka Producer instance.

At this point I'm not really going to remove this deprecation, since in general PHP frameworks and applications should only log the deprecation (similarly to the default topic case). I'll address this in a separate PR.

@Steveb-p
Contributor Author

@makasim I'd say it's good to go.

There is a side effect of shutdown function being registered that mimics pre-4.0 kafka extension behavior, but I think it's... acceptable? For 3.0 versions it will be simply a no-op since flush does not exist.
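
The guard described above can be sketched roughly as follows. This is an illustration under assumptions, not the merged code: `makeShutdownFlusher` is a hypothetical name, and `$producer` stands in for the vendor producer held by the enqueue wrapper.

```php
<?php

// Hypothetical sketch: build a shutdown callback that flushes only when the
// producer actually has a flush() method.
function makeShutdownFlusher(object $producer, int $timeoutMs): callable
{
    return static function () use ($producer, $timeoutMs): void {
        // flush() is exposed from phprdkafka 4.0 on; with 3.x this is a no-op.
        if (method_exists($producer, 'flush')) {
            $producer->flush($timeoutMs);
        }
    };
}

// Usage (assumes $producer is an RdKafka\Producer instance):
// register_shutdown_function(makeShutdownFlusher($producer, -1));
```

Because the callback checks `method_exists` at call time, the same registration is safe with either major version of the extension.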

@makasim makasim merged commit caf0afa into php-enqueue:master Sep 15, 2020
@nikolaposa

nikolaposa commented Nov 10, 2021

Hi @Steveb-p, 👋 sorry for waking up this PR. Do you know what the preferred value of the shutdown_timeout configuration is for a production environment? I see that the default value is -1, which I guess means "no timeout"?

Also, in general, do you have recommendations for a low-latency Producer configuration? In the php-rdkafka docs I've found this: https://github.com/arnaud-lb/php-rdkafka#performance--low-latency-settings, but I'm wondering if there's more? What I had in mind was something like this:

socket.timeout.ms: 50
queue.buffering.max.ms: 1
connection.max.idle.ms: 1000
topic.metadata.refresh.interval.ms: 500

I'm also wondering whether queue.buffering.max.ms: 1 means that there's no need for explicit flush() since there will be no batching of messages. 🤔

Thanks for all the great work you did here! 👍

@nick-zh
Contributor

nick-zh commented Nov 10, 2021

Hey @nikolaposa

These two links should help: latency 1 and latency 2, from the C library.
If you don't want to lose messages, you should always call flush when you end the producer.

@nikolaposa

Thank you @nick-zh. 👍 Is flush() automatically called with default setup of this library? I see this register_shutdown_function call https://github.com/php-enqueue/enqueue-dev/pull/959/files#diff-3e3dc85ef34c5e358f190d3af6de75fc1f81596044f068d0cddc6d916d029c32R109. Are there any side effects if it is called more than once in a single request? For example, manually and then in register_shutdown_function?

@nick-zh
Contributor

nick-zh commented Nov 10, 2021

No worries. Yeah, sorry, I forgot about this; it seems it will flush when you call close().
Regarding side effects: yes and no. If you are only producing one message, or calling it manually after you are done producing, then no; the second flush will be instant.
If you produce more than one message, calling it after each message will increase the delay.
Many settings and calls depend on your use case and what you are trying to achieve (how many messages, accepted delay time, etc.).
If you are just sending one message (e.g. triggered by a REST request), then not much optimization is needed, besides maybe the shutdown. It is a whole other story for a long-running process with many messages; there the settings that I linked will help.
I hope that helps 😄
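
The advice above about flushing once rather than after every message could look roughly like this. It is a sketch under assumptions: `produceBatch` is a hypothetical helper, `$producer` and `$topic` are assumed to behave like `RdKafka\Producer` and `RdKafka\ProducerTopic` from phprdkafka 4.0 (so `flush` is available), and error handling is left out.

```php
<?php

// Hypothetical helper: enqueue a batch of payloads and flush once at the end.
function produceBatch(object $producer, object $topic, iterable $payloads, int $timeoutMs): int
{
    // RD_KAFKA_PARTITION_UA (-1) leaves partition selection to librdkafka;
    // fall back to -1 so the sketch runs without the extension loaded.
    $partition = defined('RD_KAFKA_PARTITION_UA') ? RD_KAFKA_PARTITION_UA : -1;

    foreach ($payloads as $payload) {
        $topic->produce($partition, 0, $payload);
        $producer->poll(0); // serve delivery callbacks without blocking
    }

    // One flush for the whole batch, rather than one per message.
    return $producer->flush($timeoutMs);
}
```

For a single message triggered by a web request, the flush registered at shutdown is usually the only one needed; a helper like this matters mostly for long-running producers.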

@nikolaposa

Thank you once again Nick, that's very useful to know.

@felippeduarte
Contributor

felippeduarte commented Aug 8, 2023

Hey. According to the rdkafka docs (https://arnaud.le-blanc.net/php-rdkafka-doc/phpdoc/rdkafka.flush.html), flush returns an integer, but the Producer class here declares it as void. Can we change that? It would make things easier for some of our processes that need the return code.
Basically, this would change

    public function flush(int $timeout): void
    {
        // Flush method is exposed in phprdkafka 4.0
        if (method_exists($this->producer, 'flush')) {
            $this->producer->flush($timeout);
        }
    }

to

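    public function flush(int $timeout): ?int
    {
        // Flush method is exposed in phprdkafka 4.0
        if (method_exists($this->producer, 'flush')) {
            return $this->producer->flush($timeout);
        }

        // phprdkafka 3.x has no flush(); return null explicitly, since falling
        // off the end of a function declared ?int raises a TypeError.
        return null;
    }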


9 participants