Skip to content

zbus: remove k_malloc dependency for ZBUS_RUNTIME_OBSERVERS #79777

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

JordanYates
Copy link
Collaborator

Remove the dependency on the system heap existing when enabling ZBUS_RUNTIME_OBSERVERS. This means the overhead of the runtime list is applied to all observers in the system, instead of just the observers that are added at runtime. However this overhead is only 8 bytes per observer (2 pointers). It is unlikely that this overhead is larger than the cost of enabling the system heap for applications that don't already have it enabled.

Remove the dependency on the system heap existing when enabling
`ZBUS_RUNTIME_OBSERVERS`. This means the overhead of the runtime list
is applied to all observers in the system, instead of just the observers
that are added at runtime. However this overhead is only 8 bytes per
observer (2 pointers). It is unlikely that this overhead is larger
than the cost of enabling the system heap for applications that don't
already have it enabled.

Signed-off-by: Jordan Yates <[email protected]>
@rodrigopex
Copy link
Collaborator

@JordanYates, are you trying to reuse a single node (observer) in multiple linked lists (channel's observer list)? Are you sure that will work when you have a set of observers for different channels? I tried that in the very beginning, and several inconsistencies appeared. I am preparing an example showing those inconsistencies.

@JordanYates
Copy link
Collaborator Author

@JordanYates, are you trying to reuse a single node (observer) in multiple linked lists (channel's observer list)? Are you sure that will work when you have a set of observers for different channels? I tried that in the very beginning, and several inconsistencies appeared. I am preparing an example showing those inconsistencies.

No, I am just trying to register a node at runtime without needing malloc. The observer itself is already part of program memory, I don't see a great advantage in requiring a 8 byte malloc to enable it to be used.

@rodrigopex
Copy link
Collaborator

Suppose you have channel C1, C2 and C3; and listeners L1, L2, and L3.

Please try this with your implementation:

zbus_chan_add_obs(&C1, &L1, K_NO_WAIT);

zbus_chan_add_obs(&C1, &L2, K_NO_WAIT);

zbus_chan_add_obs(&C2, &L1, K_NO_WAIT);

zbus_chan_add_obs(&C2, &L3, K_NO_WAIT);

zbus_chan_add_obs(&C3, &L1, K_NO_WAIT);

zbus_chan_pub(&C1, &msg, K_NO_WAIT);

zbus_chan_pub(&C2, &msg, K_NO_WAIT);

zbus_chan_pub(&C3, &msg, K_NO_WAIT);

Let me know how many listeners were notified by VDED.

@JordanYates
Copy link
Collaborator Author

Well that is obviously not going to work because adding the same node to two different linked lists is wrong.
I was not aware that adding the same observer to multiple channels was something that was intended or possible.
I can't see any mention of it in the docs or sample, and it obviously isn't tested since CI passes on the PR.

What is the reasoning behind the feature? It's not like struct zbus_observer is some huge consumer of RAM, its two pointers and an enum by default.

@rodrigopex
Copy link
Collaborator

Let's improve tests. 1 listener can observe many channels simultaneously. Doing what you are suggesting we are going to break compatibility.

@rodrigopex
Copy link
Collaborator

rodrigopex commented Nov 3, 2024

Let's improve docs as well. This is a very important feature.

@JordanYates
Copy link
Collaborator Author

Doing what you are suggesting we are going to break compatibility.

Compatibility with what? It's an undocumented and untested implementation detail at this point.

Let's improve docs as well. This is a very important feature.

Why? As I said previously, struct zbus_observer is a ~12 byte structure, why is it so important that it support arbitrary numbers of channels, when the cost is requiring k_malloc?

@rodrigopex
Copy link
Collaborator

This feature enables the developer to use a single observer to handle several channels notifications even if they have different messages. The only tricky is to use a union on the receiving message.

@rodrigopex
Copy link
Collaborator

I am going to add documentation about it. Using the same observer to several channels would make the design of software simpler and clearer. I will provide a sample and add that to the documentation. But it cannot be removed, it could cause several problems to them that uses this "hidden" feature. Several people uses that. In my point of view it would be very prejudicial to the bus.

@JordanYates
Copy link
Collaborator Author

This feature enables the developer to use a single observer to handle several channels notifications even if they have different messages. The only tricky is to use a union on the receiving message.

I understand that is what it enables, what has not been explained is why this is preferable to one observer per channel (for runtime observers, static observers don't use anything touched by this PR). My reasoning is:

  • Not all applications enable k_malloc in the first place
  • The overhead of each k_malloc is approximately the size of struct zbus_observer (8 vs 12 bytes)
  • Individual observers can still point back to the same output implementation (common k_msgq, callback or message_fifo)

Using the same observer to several channels would make the design of software simpler and clearer. I will provide a sample and add that to the documentation.

That's a blanket statement I am pretty skeptical of, but happy to wait and see what the sample looks like.

@rodrigopex
Copy link
Collaborator

I don't see that as a good reason for the runtime observers to be different from the static ones. Another point is that if you are not relying on malloc, you have everything allocated statically. In this way, you can use the ZBUS_CHAN_ADD_OBS to do what you are suggesting and disable the observation individually or totally in runtime.

I strongly suggest not changing that, as removing the malloc would reduce the feature's power and flexibility. If one cannot use malloc, use static observers.

@rodrigopex
Copy link
Collaborator

Please, if you want to continue with that, open an RFC because it will affect people's code. If the other members of the community think it is progress, we can add that to the code. Thank you.

@kartben
Copy link
Collaborator

kartben commented Nov 4, 2024

Another point is that if you are not relying on malloc, you have everything allocated statically

But I think what @JordanYates is saying is simply that by means of only adding an 8-byte overhead to each observer, it would be possible to not even need dynamic allocation. Maybe there could be two configurable modes based on whether people do mind about that overhead (ex. they have tons of observers) and want malloc, but the default could be the one @JordanYates proposes?

@rodrigopex
Copy link
Collaborator

But I think what @JordanYates is saying is simply that by means of only adding an 8-byte overhead to each observer, it would be possible to not even need dynamic allocation.

@kartben, I understood what he was saying. However, doing that will break compatibility and drastically decrease the feature's power and flexibility.

Maybe there could be two configurable modes based on whether people do mind about that overhead (ex. they have tons of observers) and want malloc, but the default could be the one @JordanYates proposes?

Another variation for zbus configuration for that? Adding another configuration variant to zbus unnecessarily will not be good. There are so many variations and possibilities, and people struggle with that.

Again, if you need static code, you already have ZBUS_CHAN_ADD_OBS. Please point out an example where "the only one channel runtime observer" is required so I can show you how to use it statically.

In the end, you are suggesting adding another kind of runtime observer and not removing the oldest style, right @kartben? If so, it is still necessary to change this PR. An RFC may be good because we should discuss that at the Architecture Group.

@JordanYates
Copy link
Collaborator Author

There is not much point progressing anything until the PR with documentation and sample is opened by @rodrigopex, as there is nothing in-tree showing how a single observer on multiple channels provides benefits. Until that exists in some form, we're debating against a ghost.

Copy link

github-actions bot commented Jan 4, 2025

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

@github-actions github-actions bot added the Stale label Jan 4, 2025
@JordanYates JordanYates removed the Stale label Jan 4, 2025
@JordanYates
Copy link
Collaborator Author

Still waiting on the documentation, sample and tests that demonstrate the functionality that is blocking this PR.

@rodrigopex
Copy link
Collaborator

Still waiting on the documentation, sample and tests that demonstrate the functionality that is blocking this PR.

As I said, the PR breaks compatibility with the current code behavior. Nowadays, runtime and static observers behave the same in all aspects other than allocation time. It is implicit that they should behave equally since no comments tell they must be different, right? One single line of comment should be enough to enforce the idea. Will that be enough? Another good thing is to add tests to enforce equal behavior.

@rodrigopex
Copy link
Collaborator

@JordanYates, we have two ways to proceed here:

(a) change the PR to keep the current behavior and add only one observation for runtime observers as an option/config to disable the malloc feature. That would increase some complexity to the setup, but as it should be enabled by the user, it could be seen as a tunning feature;

(b) If you'd like to proceed with that as it is, please open an RFC requesting that change, and we can talk about that at the Architecture WG Meeting.

Sounds good?

@JordanYates
Copy link
Collaborator Author

As I said, the PR breaks compatibility with the current code behavior.

Current code behavior is not a standalone reason to not make changes. Bugs are "current code behavior", yet we fix them (not suggesting the current behavior is a bug).

Your argument is that it is currently possible to attach the same listener to multiple channels, which is true. It is however not documented, tested or used anywhere in tree, and has no reasoning for how this might be a useful feature. Since it has no explicit version in the API overview, I am assuming that zbus is still an experimental API. I don't see any reason why proposing a change to an undocumented, untested, unused feature on an experimental API warrants a trip to the architecture WG.

I am not trying to push this PR over your valid concerns, I am hoping to get some (any) example of why the feature could be useful so I can weigh up whether the advantage makes sense at the cost of requiring k_malloc.

@pdgendt
Copy link
Collaborator

pdgendt commented Jan 8, 2025

Just wanted to weigh in a bit here;

I've had some concerns as well for using malloc in a subsystem like zbus, and would prefer to have the default use a linked list as proposed in this PR.

It's somewhat trivial to create multiple runtime observer wrappers to call the same callbacks to mimic the allocing mechanism. Both have tradeoffs. but allocating should be a last resort or opt-in, IMHO.

@rodrigopex
Copy link
Collaborator

@pdgendt

Just wanted to weigh in a bit here;

I've had some concerns as well for using malloc in a subsystem like zbus, and would prefer to have the default use a linked list as proposed in this PR.

It's somewhat trivial to create multiple runtime observer wrappers to call the same callbacks to mimic the allocing mechanism. Both have tradeoffs. but allocating should be a last resort or opt-in, IMHO.

The runtime implementation for things usually relies on dynamic allocation. I don't see this as a problem.

Another point is that one of the most requested zbus features (MSG Subscribers) relies on dynamic allocation (and has a static alternative). So, as message subscribers become popular, people end up enabling the heap. The event manager (from Nordic) and ZMK event manager (if I recall correctly) do the same for message distribution.

For me, we are talking about an optimization or a variant similar to message subscribers, where we can offer a solution for those who do not want to use dynamic allocation. We can offer the user a limited version of runtime observers that does not rely on the heap, which is good. However, in my point of view, removing the current implementation is not an improvement. It will cause compatibility issues and unnecessarily reduce the feature set already deployed.

@JordanYates
Copy link
Collaborator Author

The runtime implementation for things usually relies on dynamic allocation. I don't see this as a problem.

I disagree on this point, the majority of runtime implementations take a pointer to an object (which could be allocated through malloc) with no further allocation required. A search for = k_malloc( on the tree (excluding tests) shows only 87 usages, with the majority being in WiFi, networking and OTA update code.

@brandon-exact
Copy link
Contributor

brandon-exact commented Jan 20, 2025

There is not much point progressing anything until the PR with documentation and sample is opened by @rodrigopex, as there is nothing in-tree showing how a single observer on multiple channels provides benefits. Until that exists in some form, we're debating against a ghost.

FWIW: I have a "network manager" observer that subscribes to my "ethernet", "wifi" and "cellular" zbus channels; this change would break my application architecture.

Copy link

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: Samples Samples area: Zbus Release Notes To be mentioned in the release notes Stale
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants