[RFC] Proposed development plan for Zephyr's POSIX subsystem #17706
Comments
This is the promised RFC which was discussed at the recent Networking Forum(s) and Dev meeting(s). cc: @nashif, @mbolivar, @jukkar, @MaureenHelm, @galak, @PiotrZierhoffer, @tgorochowik, @dleach02.
@galak, please help me bring this to the TSC/Dev meetings as required to review/discuss this RFC. I'm especially concerned, as I'm on vacation next week, and then there's very little time to merge the long-pending patches for 2.0. As the RFC argues in the pertinent section, all the available patches follow the same course of development that has already taken place over the last 6-12 months, including the 1.14 release preparation. Thanks.
I have lots of issues with this so-called RFC, which advocates creating a special development process for the POSIX subsystem, but let's start with some sections:
For the headers, include/posix is not ideal and conflicts with our definition of what include/ should contain (Zephyr public APIs). For POSIX (and other abstractions) there are at least 2 options:
I am leaning towards option (2); we should do the same with the CMSIS RTOS APIs and any other abstractions in the future.
We should not rush to get this into 2.0. IMO this item is big enough to be a 2.1 item; I do not see us solving all issues with POSIX for 2.0, and it was never on the roadmap anyway. So let's tread slowly and get this right for 2.1.
Recently, there were a few similar and related questions I commented on, which I'd like to summarize here for future reference (this summary should be pretty compatible with, and follow from, the original RFC above). The questions in point are along the lines of "Why does the Zephyr POSIX subsystem include function fun1()? It's not a POSIX function, it's a BSD (or similar) function" and "Function fun2() [taken from Linux or similar] is not in POSIX, so why do we talk about the POSIX subsystem here?". To answer these, I'd like to remind that there are 2 meanings of the word POSIX:
This whole RFC advocates and emphasizes the importance of point 2. That is, the purpose of the Zephyr POSIX subsystem isn't just to put a badge by the Zephyr name saying "conforms to POSIX xxxx.y-zz", but also to enable porting and reusing a multitude of real-world software and knowledge on Zephyr. In this regard, besides purely standard-defined APIs, there's a kind of extended API, which can be called "extended POSIX", and which perhaps can be subdivided into:
a) "sub-POSIX", or well-known functions which didn't get into the POSIX standard(s) for various reasons. One of the known reasons is that the POSIX standard has an affinity for the System V branch of Unix, omitting support for BSD-derived functionality. That doesn't mean such functionality doesn't exist or is useless. As long as it's useful for porting, or for compatibility with existing software, we could add BSD-compatible functions.
Hopefully, with the above detailed outline, specific questions can be answered:
Q: Why does the Zephyr POSIX subsystem include some BSD-heritage functions which aren't in POSIX?
Q: It would be nice to add some function from Linux to work with other Zephyr POSIX subsystem components (file descriptors, sockets, poll, etc.). In which way should such a Linux function be added?
@cfriedt can you please take a look and close if this is already covered in other RFCs?
@nashif - this is great - thanks for finding these slightly older POSIX issues. You've helped me find the specific standards that we should be implementing for the embedded profile :-) https://ieeexplore.ieee.org/document/1342418 I think this RFC is still relevant in terms of content. The direction here is a bit vague, but I think it agrees with the current one. I'll close this issue but add a reference to the POSIX LTSv3 Roadmap.
Summary
This RFC seeks to transform and extend Zephyr's POSIX subsystem, which was initially conceived to implement just a small embedded profile specification of POSIX, into a subsystem with wider coverage of the full POSIX standard. While doing so, it doesn't seek to establish a specific (sub)set of the POSIX standard to implement. Instead, it seeks to establish a process and criteria that allow incremental and gradual development and addition of new features, based on the needs of Zephyr stakeholders and the community.
Mission statement
There're 2 ways to develop software for a particular system:
Zephyr is a small, efficient RTOS, and thus p.1 was the initial scope. But the importance of p.2 should not be underestimated. The author of this RFC and the growing community of Zephyr users think that the inability to port existing software, or the extra hurdles in doing so, is becoming a growing blocker on the way to wider Zephyr adoption and usage.
This RFC seeks to remedy the situation and enable large-scale application porting to Zephyr by paying close attention to the implementation of the standard OS API (the POSIX standard). At the same time, it seeks to do so in a sustainable, manageable and lean way, following the principles of agile software development and an open-source, community-driven process.
Motivation
Zephyr includes many subsystems, which are largely disjoint. One such subsystem is the BSD Sockets(-like) subsystem, initially written by the author of this RFC. It was initially developed as a proof-of-concept, alternative networking API to Zephyr's own (ad hoc) networking API. There were 3 main ideas why adding a BSD Sockets(-like) API to Zephyr would be useful:
Of these, p.1 was the initial motivation, while p.2 helped the BSD Sockets(-like) API achieve the status of the official networking API, when it became clear that it provides a good answer to kernel-vs-userspace separation challenges and resource protection needs.
However, leveraging p.3 took some time to gather momentum, with real work starting only at the beginning of this year (2019). Even the first porting experiment (see the retrospective section below) exposed big issues with the BSD Sockets(-like) subsystem. As a reminder, this section starts with "Zephyr includes many subsystems, which are largely disjoint." Then, throughout the text, the sockets subsystem is called "BSD Sockets(-like)". The existing sockets subsystem largely follows the spirit of the BSD Sockets API, but it lacks a lot of functionality and features a lot of small-ish differences if the POSIX BSD Sockets API is taken to the letter. It also doesn't integrate well with other Zephyr subsystems, like the existing (also very incomplete) POSIX subsystem and the C library.
All that led to the following observed issues:
Over time, while fighting with the issues described above, the solution became apparent: different Zephyr subsystems should be integrated together under the auspices of the POSIX standard, following it closely to the letter, not just in spirit.
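To make the integration point concrete, here is a minimal sketch of the kind of code that existing POSIX applications commonly contain: a socket is created with the sockets API, but then driven through the generic file-descriptor calls (write(), poll(), read(), close()). This is plain host-side POSIX code written for illustration, not Zephyr code; the function name echo_through_fds is hypothetical, and nothing here implies that each of these calls (e.g. socketpair() or AF_UNIX) is available in Zephyr.

```c
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Sends msg through one end of a socket pair and reads it back from the
 * other end using only generic fd calls. Returns the number of bytes
 * received, or -1 on any failure.
 */
int echo_through_fds(const char *msg, char *out, size_t out_size)
{
    int fds[2];
    ssize_t n = -1;
    size_t len = strlen(msg);

    if (out_size == 0 || socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) {
        return -1;
    }

    /* A socket is written to like any other file descriptor. */
    if (write(fds[0], msg, len) == (ssize_t)len) {
        struct pollfd pfd = { .fd = fds[1], .events = POLLIN };

        /* poll() waits on the socket just as it would on any other fd. */
        if (poll(&pfd, 1, 1000) == 1 && (pfd.revents & POLLIN)) {
            n = read(fds[1], out, out_size - 1);
            if (n >= 0) {
                out[n] = '\0';
            }
        }
    }

    close(fds[0]);
    close(fds[1]);
    return (int)n;
}
```

Code like this only ports cleanly when sockets, file descriptors, poll() and the C library agree on a single descriptor space and on the standard signatures, which is the integration this section argues for.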
Implementation process
Developing a complete implementation of POSIX IEEE 1003.1 is no simple task, due to the breadth and depth of the standard. It would take dozens of man-years to finish that task. There are no such (formally allocatable) resources in the Zephyr community, so this RFC doesn't lay out a specific plan to implement "full POSIX". Instead, the proposal is to focus on the practical side of things. As the previous sections said, the main motivation is to be able to port/reuse existing application software on Zephyr; that's why we're interested in POSIX, and not the other way around. Thus, development of the POSIX subsystem should be primarily driven by active porting efforts:
This is essentially a lean/agile development methodology, where development is driven by short-term needs. As long as the development goes in the right direction (more POSIX functionality gets implemented, even if not completely; CI passes; there are no obvious mistakes and no noticeable/avoidable technical debt is added), a change gets merged and the process immediately repeats with the next development iteration.
Of course, besides the community-driven new-feature process, there's also a maintenance process working in the usual way:
This process might have more of a background priority and lower intensity, but otherwise follows the same agile workflow as the feature process, with the same acceptance criteria (as long as a change improves the situation and doesn't deteriorate it, it's good to go).
Relationship to the existing Zephyr POSIX subsystem
This effort is supposed to be fully based on the existing POSIX subsystem, and is intended to continue its development in a continuous, seamless, sustainable fashion. There's no intention to replace it, tear it off, beat it with sticks, or anything like that. There may be a need for deep bug-fixes or wide refactors, but too-deep and too-wide cases should be rare, and each case would be handled on an as-needed basis, following the usual process (big non-trivial changes get RFCed and discussed, etc., while normal changes follow the agile process described above).
It should be noted that the elaboration of the existing POSIX subsystem has been going on for quite some time now, and this RFC effectively just captures these existing practices, so that the entire Zephyr community is in the loop.
During the initial discussion of the development process of the POSIX subsystem (i.e. the subject of this RFC), it was pointed out that the POSIX subsystem was initially intended to implement the PSE52 profile of POSIX. The author of this RFC (also a maintainer of the POSIX subsystem for the last half a year and the author of many changes to it) has to admit that such a claim came as a surprise. A lot of time while preparing this RFC was spent trying to understand this situation and how the historical plans for PSE52 profile development affect this RFC. Below, the situation with PSE52 is traced in detail:
So, web search works well; there are no other references in the docs.
3. The 1.11.0 changelog links to #1291 dated 2017-08-30, which indeed talks about implementing the PSE52 subset of POSIX.
4. The answer would be implied by the doc search above, but let's double-check:
I.e., there are no further mentions of PSE52 in the tree, in particular, no config options specifically for PSE52.
So, how is the existing POSIX subsystem described by the config options?
I.e. the main POSIX option describes itself as implementing the "big" POSIX (IEEE 1003.1), not some subset (IEEE 1003.13). For full disclosure, this description comes from a patch by the author of this RFC, dated 2018-09-27: 8dc69e0. This proves the point that the changes described in this document didn't start today or yesterday, but have been in progress for quite some time. (After making a number of changes like that to the POSIX subsystem, the author of this RFC volunteered to be a maintainer of the subsystem, to progress it along the vision now formally described in this document.)
However, I didn't mention IEEE 1003.1 off the top of my head; I essentially just copied it from the description of one of the previous POSIX patches, eb0aaca:
That commit was made by one of the original authors of the POSIX subsystem.
Hopefully, the evidence presented is enough to make the following summary:
marketing materials
obscure subsets of a well-known API, than the whole API, which allows to work with real-world applications. Again, let's please not go there.)
lib/pse52. The exact difference would be that this RFC would start with a proposal to rename it to lib/posix. But per p.2, it was done in a future-proof way from the start, so there's nothing to worry about here.
Location of the headers
Another question raised during pre-discussion of this RFC was the location of the newly added POSIX headers. The previous section should provide a pretty obvious and natural response: the existing POSIX subsystem has its headers in include/posix, thus any extension to it would also have its headers in include/posix.
There was speculation that (some?) new headers might be put directly in include/. It would be quite inconsistent and unsustainable to have some POSIX headers under one path and others under another (also a 3rd, 4th, etc.?). An example was given based on a particular header located in a subdirectory, include/posix/arpa/inet.h (re: https://github.com/zephyrproject-rtos/zephyr/pull/16621/files), speculating that it could just as well go into include/arpa/inet.h. But that's a very peculiar example. There are many POSIX headers, and the majority of them go into the top-level include directory, not a subdirectory like arpa/ above. Again, we don't want to make confusing rules of what goes where based on such criteria. This won't be sustainable and will lead to mistakes, conflicts, duplication, etc.
The interesting follow-up question, however, is whether using the include/posix/ location as was done originally was a good move, and whether it would make sense to revisit it now.
Generally, the header space should be structured and ordered. More specifically, there should be proper namespaces for native Zephyr headers vs POSIX headers. Failing that, there will be confusion and conflicts, again. We actually had an example of that, so such claims come from actual experience: b4b108d.
The ideal situation would be that Zephyr native headers are namespaced, e.g. located in include/zephyr/ and included as #include <zephyr/...>, while POSIX headers are at their natural locations mandated by the standard. This would 100% resolve any risk of namespace conflicts (including with other 3rd-party projects). Unfortunately, the Zephyr TSC recently discussed this question and decided not to move Zephyr headers under zephyr/, so for the mid-term (2-3 years) we're locked out from the opportunity to resolve this issue completely, until further experience and leveraging Zephyr in real-world conditions prompt another iteration on this matter.
Then, in the current conditions, sticking with the existing include/posix/ makes good sense: it's an already tried and tested solution, which doesn't give non-conflict guarantees as strong as the one described above, but still provides the bare-minimum required namespace separation. While this solution requires some overhead in managing include paths, and indeed some recent elaborations of that aren't merged yet (#15937), at least it's by now well understood that these elaborations are required and how to do them.
Retrospective: Existing 3rd-party applications porting projects
Immediate scope of work