net: socket: syscall for socketpair(2) #24813

Merged
merged 2 commits into from
May 10, 2020
Conversation

cfriedt
Member

@cfriedt cfriedt commented Apr 29, 2020

Working:

  • non-blocking reads / writes
  • blocking reads / writes
  • send(2) / recv(2) / sendto(2) / recvfrom(2) / sendmsg(2)
  • select(2)
  • poll(2)

Fixes #24366
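
For readers who have not used the API, the blocking and non-blocking cases listed above can be sketched with plain POSIX calls. This is a minimal illustration rather than code from this PR: the helper names spair_roundtrip() and spair_nonblock_read_errno() are made up for the example, and it assumes a POSIX host such as Linux. Whether the same source builds unchanged on Zephyr depends on which socket and POSIX-name options are enabled.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Blocking case: send a short message from one end of a socket pair
 * to the other. Returns 0 if the message arrives intact, -1 otherwise.
 * (Hypothetical helper, not part of the PR.)
 */
int spair_roundtrip(void)
{
	int sv[2];
	char buf[8] = {0};
	const char msg[] = "hello";
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}

	if (write(sv[0], msg, sizeof(msg)) == (ssize_t)sizeof(msg) &&
	    read(sv[1], buf, sizeof(buf)) == (ssize_t)sizeof(msg) &&
	    strcmp(buf, msg) == 0) {
		ret = 0;
	}

	close(sv[0]);
	close(sv[1]);
	return ret;
}

/*
 * Non-blocking case: set O_NONBLOCK on one end and read while nothing
 * is queued. Returns the errno observed, 0 if read() did not fail, or
 * -1 on setup errors. (Hypothetical helper, not part of the PR.)
 */
int spair_nonblock_read_errno(void)
{
	int sv[2];
	char c;
	int err = 0;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}

	int flags = fcntl(sv[0], F_GETFL, 0);

	if (flags < 0 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) < 0) {
		err = -1;
	} else if (read(sv[0], &c, 1) < 0) {
		err = errno;
	}

	close(sv[0]);
	close(sv[1]);
	return err;
}
```

On a POSIX host, the second helper is expected to report EAGAIN (or EWOULDBLOCK, which may be the same value), since an empty non-blocking socket has nothing to return.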

*** Booting Zephyr OS build zephyr-v2.2.0-2516-g63d10feeb7b8  ***
Running test suite socketpair
===================================================================
starting test - test_socketpair_AF_LOCAL__SOCK_STREAM__0
PASS - test_socketpair_AF_LOCAL__SOCK_STREAM__0
===================================================================
starting test - test_socketpair_AF_UNIX__SOCK_STREAM__0
PASS - test_socketpair_AF_UNIX__SOCK_STREAM__0
===================================================================
starting test - test_socketpair_write_nonblock
PASS - test_socketpair_write_nonblock
===================================================================
starting test - test_socketpair_read_nonblock
PASS - test_socketpair_read_nonblock
===================================================================
starting test - test_socketpair_write_block
PASS - test_socketpair_write_block
===================================================================
starting test - test_socketpair_read_block
PASS - test_socketpair_read_block
===================================================================
starting test - test_socketpair_close_one_end_and_write_to_the_other
PASS - test_socketpair_close_one_end_and_write_to_the_other
===================================================================
starting test - test_socketpair_close_one_end_and_read_from_the_other
PASS - test_socketpair_close_one_end_and_read_from_the_other
===================================================================
starting test - test_socketpair_expected_failures
PASS - test_socketpair_expected_failures
===================================================================
starting test - test_socketpair_unsupported_calls
PASS - test_socketpair_unsupported_calls
===================================================================
starting test - test_socketpair_fcntl
PASS - test_socketpair_fcntl
===================================================================
starting test - test_socketpair_poll_timeout
PASS - test_socketpair_poll_timeout
===================================================================
starting test - test_socketpair_poll_timeout_nonblocking
PASS - test_socketpair_poll_timeout_nonblocking
===================================================================
starting test - test_socketpair_poll_immediate_data
PASS - test_socketpair_poll_immediate_data
===================================================================
starting test - test_socketpair_poll_delayed_data
PASS - test_socketpair_poll_delayed_data
===================================================================
starting test - test_socketpair_poll_close_remote_end_POLLIN
PASS - test_socketpair_poll_close_remote_end_POLLIN
===================================================================
starting test - test_socketpair_poll_close_remote_end_POLLOUT
PASS - test_socketpair_poll_close_remote_end_POLLOUT
===================================================================
Test suite socketpair succeeded
===================================================================
PROJECT EXECUTION SUCCESSFUL

@zephyrbot zephyrbot added the area: Networking, area: API, and area: Tests labels Apr 29, 2020
@zephyrbot
Collaborator

zephyrbot commented Apr 29, 2020

All checks are passing now.

checkpatch (informational only, not a failure)

-:1411: WARNING:IF_0: Consider removing the code enclosed by this #if 0 and its #endif
#1411: FILE: tests/net/socket/socketpair/src/main.c:79:
+#if 0

- total: 0 errors, 1 warnings, 2453 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

Your patch has style problems, please review.

NOTE: Ignored message types: AVOID_EXTERNS BRACES CONFIG_EXPERIMENTAL CONST_STRUCT DATE_TIME FILE_PATH_CHANGES MINMAX NETWORKING_BLOCK_COMMENT_STYLE PRINTK_WITHOUT_KERN_LEVEL SPDX_LICENSE_TAG SPLIT_STRING VOLATILE

NOTE: If any of the errors are false positives, please report
      them to the maintainers.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

@cfriedt
Member Author

cfriedt commented Apr 29, 2020

Ugh... let me take care of those gitlint issues

@cfriedt cfriedt requested review from aescolar and daor-oti as code owners May 4, 2020 00:58
@zephyrbot zephyrbot added the area: native port label May 4, 2020
@stephanosio
Member

-:1242: WARNING:SPDX_LICENSE_TAG: Missing or malformed SPDX-License-Identifier tag in line 1
#1242: FILE: tests/net/socket/socketpair/src/main.c:1:

Please ignore these warnings (see #24921)

@stephanosio stephanosio marked this pull request as draft May 4, 2020 06:18
@cfriedt cfriedt changed the title from "[WIP] net: socket: syscall for socketpair(2)" to "net: socket: syscall for socketpair(2)" May 4, 2020
@cfriedt cfriedt marked this pull request as ready for review May 4, 2020 17:06
@cfriedt
Member Author

cfriedt commented May 4, 2020

I omitted the poll(2) tests because they are somewhat exercised by select(2).

Please suggest any additional testing or technical changes that you feel are necessary.

@cfriedt
Member Author

cfriedt commented May 4, 2020

Looks as though the shippable failures have nothing to do with this change - seems to be Bluetooth related.

@aescolar aescolar removed their request for review May 5, 2020 06:38
@pfalcon
Collaborator

pfalcon commented May 5, 2020 via email

@pfalcon
Collaborator

pfalcon commented May 5, 2020

sockets.c, which is currently where much of the poll machinery lives.

That definitely becomes a problem; I submitted #24966 about it. Even if it were done now, it would be unlikely to get into the 2.3 release (needless risk), so it makes no sense to rush it now (I, for one, am still preoccupied with other tasks). In the meantime, this code would need to deal with the current situation (work around it, or whatever).

@pfalcon
Collaborator

pfalcon commented May 5, 2020

I omitted the poll(2) tests because they are somewhat exercised by select(2).
Please suggest any additional testing or technical changes that you feel are necessary.

I would suggest, and even urge, removing any calls to select() from the code in this PR. select() is explicitly defined as being implemented on top of poll(), and as a legacy and inefficient API. The only test we need for it is that select() <-> poll() marshalling works correctly (and we have such a test).
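
To make the suggestion concrete, here is a rough sketch of what a poll()-based readiness check on a socket pair could look like, covering the timeout and immediate-data cases. It is written against plain POSIX on a host system. The helper name spair_poll_in() and its parameters are invented for illustration and are not taken from the PR's test suite.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Poll one end of a socket pair for POLLIN with the given timeout.
 * If send_first is non-zero, one byte is written to the peer first.
 * Returns 0 on timeout, 1 when the descriptor is readable, -1 on error.
 * (Hypothetical helper, not part of the PR.)
 */
int spair_poll_in(int send_first, int timeout_ms)
{
	int sv[2];
	struct pollfd pfd;
	int ret;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}

	if (send_first && write(sv[1], "x", 1) != 1) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	pfd.fd = sv[0];
	pfd.events = POLLIN;
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret == 1 && !(pfd.revents & POLLIN)) {
		/* Ready for some reason other than readable data. */
		ret = -1;
	}

	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Calling spair_poll_in(0, 50) exercises the timeout path, and spair_poll_in(1, 50) the immediate-data path, without touching select() or fd_set at all.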

Member

@jukkar jukkar left a comment

Looks good. I am very happy to see such extensive testing for this, thumbs up for that.
Some minor comments about coding style. I did not comment on coding style in the tests as there was so much there, but I would appreciate it if you could check those files too and fix the relevant style issues.

@pfalcon
Collaborator

pfalcon commented May 5, 2020

In networking code, it would be better to use NET_DBG().

We first should decide whether unix domain sockets are "networking code" per se. I didn't look thru the code in detail yet, but @cfriedt, I'd urge you to consider whether this code should belong to subsys/net/lib/, or lib/posix/ instead.

The network variant adds thread id to debug print which is quite useful in debugging.

IMHO, this is an example of "feudalism" in Zephyr's subsystems. If it's quite useful, then certainly it should have been contributed to the generic logging (and e.g. made configurable per-logger, like currently default logging level is).

@cfriedt
Member Author

cfriedt commented May 5, 2020

In networking code, it would be better to use NET_DBG().

We first should decide whether unix domain sockets are "networking code" per se. I didn't look thru the code in detail yet, but @cfriedt, I'd urge you to consider whether this code should belong to subsys/net/lib/, or lib/posix/ instead.

It's a tricky subject. Technically, they are sockets, but they don't depend on IP support. I'd be happy keeping them in net until #24966 is resolved, and would be happy to refactor as part of that change at some point.

Edit: @pfalcon - it's not possible to just put it in lib/posix as-is, is it? I guess, technically, I could, and then just use Kconfig to make it depend on NET_SOCKETS. That way, later refactoring would just mean removing the dependency.

Edit: @pfalcon - I tried the above, but for some reason it could not find zsock_impl_socketpair() to link with. Probably best to save that for another day, but I do agree it should eventually go in lib/posix.

@pfalcon
Collaborator

pfalcon commented May 5, 2020

I'd be happy keeping them in net until #24966 is resolved, and would be happy to refactor as part of that change at some point.

Sounds good.

@cfriedt
Member Author

cfriedt commented May 5, 2020

Looks good. I am very happy to see such extensive testing for this, thumbs up for that.
Some minor comments about coding style. I did not comment on coding style in the tests as there was so much there, but I would appreciate it if you could check those files too and fix the relevant style issues.

I will do. So much work to get it passing checkpatch already though.

checkpatch doesn't fix all of these things. I know there is a program that supposedly does, but I forget its name. clang-format is pretty great, but it doesn't work perfectly atm.

Edit: uncrustify is the one #21392

@carlescufi
Member

@jukkar @pfalcon @cfriedt I would really like to get support for socket pairs into 2.3. Any chance we can get there before Friday?

@pfalcon pfalcon closed this May 6, 2020
@pfalcon pfalcon reopened this May 6, 2020
Collaborator

@pfalcon pfalcon left a comment

@cfriedt, please split addition of the functionality from the addition of tests/samples into separate commits. (Rule of thumb: better to have well-separated commits, it's much easier to squash them together than split them.)

Also, as suggested, please make sure tests use poll(), not select().

Thanks.

@pfalcon
Collaborator

pfalcon commented May 7, 2020

Sure, we can remove SOCK_RAW. I've just seen it used with socketpair in the past.

The question is how that is supposed to work. I myself don't have knowledge of the entire POSIX/extended POSIX APIs. But I maintain the idea that the whole purpose of those APIs in Zephyr is to aid software portability (and I'm specifically interested in maintaining that outlook because native Zephyr APIs don't often demonstrate that level of design consideration/conciseness/generality). So, there would need to be some references for how a particular functionality should work, so we can check that the Zephyr implementation adheres to it. And as even the POSIX standard offers only so much detail, that ultimately means writing tests to check the behavior of a particular well-known, well-developed, popular system, Linux being an obvious choice.

But that definitely requires quite an extra effort, so I'm all for limiting the scope. More specifically, I'm trying to see if it's possible to extend the "depth" of scope and implement 2 distinct features in one go; that's why I suggested looking into the possibility of implementing socketpair() in terms of pipe(). I'm all OK with limiting the "breadth" of scope: starting with fewer supported socket types (down to just 1), not supporting extended operations like recvfrom/sendto, making some support API kernel-only for starters, and leaving synchronization of such internal to API clients - all that is OK by me for an initial version (which can be extended later as needed, based on the actual use cases). (I write this down as a kind of abstracted feedback, and that's definitely my personal outlook, which may not be shared by other developers.)

The specific concern with SOCK_RAW is that in the case of UDP/TCP it causes sockets to work at the level of packets, and packets are themselves datagrams, and working with datagrams over sockets is a peculiar matter, so you wisely didn't include SOCK_DGRAM in this initial implementation.

Now, I don't know what SOCK_RAW could mean in the case of AF_UNIX, but I fully support leaving it out of scope of the initial implementation. And if the need for it arises, I'd love to see test-like examples, with source which can be compiled for both Zephyr and (for example) Linux, so consistent behavior can be demonstrated.

@cfriedt
Member Author

cfriedt commented May 7, 2020

Sure, we can remove SOCK_RAW. I've just seen it used with socketpair in the past.

The question is how that is supposed to work. I myself don't have knowledge of the entire POSIX/extended POSIX APIs. But I maintain the idea that the whole purpose of those APIs in Zephyr is to aid software portability (and I'm specifically interested in maintaining that outlook because native Zephyr APIs don't often demonstrate that level of design consideration/conciseness/generality). So, there would need to be some references for how a particular functionality should work, so we can check that the Zephyr implementation adheres to it. And as even the POSIX standard offers only so much detail, that ultimately means writing tests to check the behavior of a particular well-known, well-developed, popular system, Linux being an obvious choice.

I do have pretty detailed knowledge of POSIX APIs. It's better to have a limited subset of functionality than broken functionality, or OS-specific functionality. E.g. I know from experience that socketpair works equally well with either SOCK_STREAM or SOCK_RAW on several different OS's.

But that definitely requires quite an extra effort, so I'm all for limiting the scope. More specifically, I'm trying to see if it's possible to extend the "depth" of scope and implement 2 distinct features in one go; that's why I suggested looking into the possibility of implementing socketpair() in terms of pipe(). I'm all OK with limiting the "breadth" of scope: starting with fewer supported socket types (down to just 1), not supporting extended operations like recvfrom/sendto, making some support API kernel-only for starters, and leaving synchronization of such internal to API clients - all that is OK by me for an initial version (which can be extended later as needed, based on the actual use cases). (I write this down as a kind of abstracted feedback, and that's definitely my personal outlook, which may not be shared by other developers.)

Right. Originally, I completely agreed that implementing socketpair in terms of pipe was a good idea, and then I remembered that bidirectional pipes are non-portable. Some POSIX systems support them, others don't. The spec does not say that a pipe cannot be bidirectional, but it's wrong for an application to assume that it is. Since Zephyr pipes are also unidirectional, it makes sense not to force a square peg into a round hole, and since it would also have consumed excessive (dynamically allocated) resources, it seemed like the wrong approach. No judgement. It's smart to try to reuse code that way.
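
The difference in descriptor cost can be sketched on a POSIX host as follows. This is only an illustration of the argument above, not the PR's implementation, and the helper names spair_both_directions() and pipes_both_directions() are hypothetical. A single socket pair carries data both ways over two descriptors, while the portable pipe-based equivalent needs two pipes and four descriptors.

```c
#include <sys/socket.h>
#include <unistd.h>

/*
 * One socket pair, two descriptors, traffic in both directions.
 * Returns 0 if a byte sent from each end arrives at the other end.
 * (Hypothetical helper, not part of the PR.)
 */
int spair_both_directions(void)
{
	int sv[2];
	char at_1 = 0, at_0 = 0;
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}

	if (write(sv[0], "A", 1) == 1 && write(sv[1], "B", 1) == 1 &&
	    read(sv[1], &at_1, 1) == 1 && read(sv[0], &at_0, 1) == 1 &&
	    at_1 == 'A' && at_0 == 'B') {
		ret = 0;
	}

	close(sv[0]);
	close(sv[1]);
	return ret;
}

/*
 * Portable pipe-based equivalent: one pipe per direction, so four
 * descriptors in total. Each pipe is only relied on to carry data
 * from its write end [1] to its read end [0].
 * (Hypothetical helper, not part of the PR.)
 */
int pipes_both_directions(void)
{
	int ab[2], ba[2];
	char a = 0, b = 0;
	int ret = -1;

	if (pipe(ab) != 0) {
		return -1;
	}
	if (pipe(ba) != 0) {
		close(ab[0]);
		close(ab[1]);
		return -1;
	}

	if (write(ab[1], "A", 1) == 1 && write(ba[1], "B", 1) == 1 &&
	    read(ab[0], &a, 1) == 1 && read(ba[0], &b, 1) == 1 &&
	    a == 'A' && b == 'B') {
		ret = 0;
	}

	close(ab[0]);
	close(ab[1]);
	close(ba[0]);
	close(ba[1]);
	return ret;
}
```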

The specific concern with SOCK_RAW is that in the case of UDP/TCP it causes sockets to work at the level of packets, and packets are themselves datagrams, and working with datagrams over sockets is a peculiar matter, so you wisely didn't include SOCK_DGRAM in this initial implementation.

On every POSIX OS I've ever used (Linux, Darwin, QNX, my own RTOSs, probably BSDs), a RAW socket is the same as a STREAM when dealing with AF_UNIX, but it is kind of sloppy, and is not in POSIX, so best to cut it out. It's probably a bad habit I picked up from my time at a certain fruity phone manufacturer.

Now, I don't know what SOCK_RAW could mean in the case of AF_UNIX, but I fully support leaving it out of scope of the initial implementation. And if the need for it arises, I'd love to see test-like examples, with source which can be compiled for both Zephyr and (for example) Linux, so consistent behavior can be demonstrated.

Agreed.

@cfriedt
Member Author

cfriedt commented May 8, 2020

All tests are successful now, except for the very last one - test_socketpair_poll_close_remote_end_POLLOUT().

spair_write(): 414: count: 1 is_connected: 1 avail: 1 is_nonblock: 0 will_block: 0
zsock_poll_prepare_ctx(): 726: 
ASSERTION FAIL [!z_is_thread_state_set(_kernel.current, ((1UL << (4))))] @ /home/cfriedt/workspace/zephyrproject/zephyr/kernel/sched.c:1187
	
@ /home/cfriedt/workspace/zephyrproject/zephyr/lib/os/assert.c:45
E: >>> ZEPHYR FATAL ERROR 4: Kernel panic on CPU 0
E: Current thread: 0x56690e60 (unknown)
E: Halting system

@pfalcon
Collaborator

pfalcon commented May 8, 2020

@cfriedt: This has a single commit again (instead of 2 commits for subsys vs tests), marked as Draft, commit message is not descriptive of the feature added. Is this targeted for 2.3? (If so, these points should be addressed, to (potentially) allow merging as an initial implementation to be elaborated during the freeze window).

@pfalcon
Collaborator

pfalcon commented May 8, 2020

@cfriedt : Bottom line, please try to run your tests via sanitycheck, and see if they expose some bug in it.

I assumed you didn't have a chance to look into that, so looking myself. For reference, command to reproduce it is:

./SANITYCHECK.sh --testcase-root tests/net/socket/socketpair/

Where SANITYCHECK.sh is:

./scripts/sanitycheck --no-clean --enable-slow --inline-logs -j2 "$@"

(You obviously will have your own convenience wrappers.)

And --testcase-root there is purely to speed up matters; you can run ./scripts/sanitycheck alone and wait a couple of hours ;-).

And well, looking at your testcase.yaml, it's not hard to spot the problem:

common:
  depends_on:
tests:
  net.socket.socketpair:
    min_ram: 21
    tags: net socket userspace

That depends_on looks suspicious. You know, it's YAML syntax for a mapping, and as you don't specify a value after the :, it's, well, None. But if it's None, you kinda shouldn't specify it. Or at least, sanitycheck doesn't expect to see those empty values, and chokes, like you've seen. Besides the obvious choice of removing the empty stuff (then you won't have a template to extend later!), there's another one, which most people seem to follow - fill it in with actual values, even if dummy (e.g. repeated). Here it could be:

common:
  tags: net socket userspace
tests:
  net.socket.socketpair:
    min_ram: 21

(I.e. I killed the empty depends_on:, but then common: becomes empty! Too shy to kill that, so I moved "tags:" there. Why didn't I move min_ram: there? Because "net.socket.socketpair:" would become empty! I hope you've got the rules of juggling by now.)

So, I'm not trying to submit any patch for sanitycheck itself - fixing the leaf error of "local variable 'v' referenced before assignment" won't lead to any "correct" behavior; it will likely just hit another error one level up. The real solution would be to better specify the testcase.yaml syntax, allow empty/placeholder values, and raise awareness so people aren't afraid to use them, etc., etc. And that's quite a different kind of task.

@pfalcon
Collaborator

pfalcon commented May 8, 2020

Btw, after these manipulations, what I get from sanitycheck is:

[ 36%] Building C object zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/sockets.c.obj
[ 37%] Building C object zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/sockets_select.c.obj
[ 37%] Building C object zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/getnameinfo.c.obj
[ 38%] Building C object zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/sockets_misc.c.obj
[ 38%] Building C object zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/socketpair.c.obj
/home/pfalcon/projects-3rdparty/Embedded/Zephyr/zephyr/subsys/net/lib/sockets/socketpair.c: In function 'z_vrfy_zsock_socketpair':
/home/pfalcon/projects-3rdparty/Embedded/Zephyr/zephyr/subsys/net/lib/sockets/socketpair.c:327:10: fatal error: syscalls/zsock_spair_mrsh.c: No such file or directory
  327 | #include <syscalls/zsock_spair_mrsh.c>
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
zephyr/CMakeFiles/zephyr.dir/build.make:439: recipe for target 'zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/socketpair.c.obj' failed
make[2]: *** [zephyr/CMakeFiles/zephyr.dir/subsys/net/lib/sockets/socketpair.c.obj] Error 1
CMakeFiles/Makefile2:1061: recipe for target 'zephyr/CMakeFiles/zephyr.dir/all' failed
make[1]: *** [zephyr/CMakeFiles/zephyr.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2

@cfriedt
Member Author

cfriedt commented May 8, 2020

@cfriedt: This has a single commit again (instead of 2 commits for subsys vs tests), marked as Draft, commit message is not descriptive of the feature added. Is this targeted for 2.3? (If so, these points should be addressed, to (potentially) allow merging as an initial implementation to be elaborated during the freeze window).

Right - sorry. I'm mostly concerned with getting it working before the commits are polished.

@pfalcon
Collaborator

pfalcon commented May 8, 2020

@cfriedt: Please un-draft this PR so I can dismiss my -1. I don't see a way to do that now, and the draft state is my only guess. (I'm on holiday on Monday otherwise, and don't want my -1 to block this, just in case.)

@cfriedt
Member Author

cfriedt commented May 9, 2020

All tests are successful now, except for the very last one - test_socketpair_poll_close_remote_end_POLLOUT().

Fixed. I was just missing a k_thread_join()
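
The test code itself is not shown in this thread, so the following is only a guess at the general shape of the problem, written with POSIX threads on a host system. Here pthread_join() stands in for Zephyr's k_thread_join(), and the names close_remote() and poll_until_remote_closes() are hypothetical. A helper thread closes the remote end while the main thread polls the local end, and the main thread joins the helper before tearing anything down.

```c
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Helper thread: wait about 20 ms, then close the remote end. */
static void *close_remote(void *arg)
{
	int fd = *(const int *)arg;

	poll(NULL, 0, 20); /* poll() with no fds acts as a short delay */
	close(fd);
	return NULL;
}

/*
 * Poll the local end until the helper closes the remote end, then
 * confirm that read() reports end-of-file. Returns 0 on success.
 * (Hypothetical helper; pthreads used as a stand-in for k_thread.)
 */
int poll_until_remote_closes(void)
{
	int sv[2];
	pthread_t helper;
	struct pollfd pfd;
	char c;
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
		return -1;
	}
	if (pthread_create(&helper, NULL, close_remote, &sv[1]) != 0) {
		close(sv[0]);
		close(sv[1]);
		return -1;
	}

	pfd.fd = sv[0];
	pfd.events = POLLIN;
	pfd.revents = 0;

	/* A closed peer wakes the poller; read() then returns 0 (EOF). */
	if (poll(&pfd, 1, 1000) == 1 && read(sv[0], &c, 1) == 0) {
		ret = 0;
	}

	/*
	 * Join before returning: sv[] lives on this stack frame and the
	 * helper must be finished before the test tears things down.
	 */
	pthread_join(helper, NULL);
	close(sv[0]);
	return ret;
}
```

Without the join, the function could return while the helper is still running, which is the kind of teardown race a missing k_thread_join() would plausibly cause.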

@cfriedt cfriedt marked this pull request as ready for review May 9, 2020 04:26
@cfriedt cfriedt requested a review from pfalcon May 9, 2020 04:27
@cfriedt
Member Author

cfriedt commented May 9, 2020

327 | #include <syscalls/zsock_spair_mrsh.c>

Fixed that. Not sure what's up with sanitycheck now.

@pfalcon pfalcon dismissed their stale review May 9, 2020 06:58

Thanks, unblocking PR from my side, IMHO it's in good shape to be merged as EXPERIMENTAL (as it's marked in Kconfig).

@pfalcon
Collaborator

pfalcon commented May 9, 2020

@cfriedt

Not sure what's up with sanitycheck now.

I'd suggest to spend a couple of minutes to click thru Shippable CI UI to familiarize yourself with it. You'd soon arrive at https://app.shippable.com/github/zephyrproject-rtos/zephyr/runs/72444/5/tests, which shows what's wrong.

@pfalcon
Collaborator

pfalcon commented May 9, 2020

@jukkar: Please consider at least removing your stale -1 review; I believe @cfriedt did great work addressing the review comments.

@cfriedt
Member Author

cfriedt commented May 9, 2020

@cfriedt

Not sure what's up with sanitycheck now.

I'd suggest to spend a couple of minutes to click thru Shippable CI UI to familiarize yourself with it. You'd soon arrive at https://app.shippable.com/github/zephyrproject-rtos/zephyr/runs/72444/5/tests, which shows what's wrong.

I did run sanitycheck - just couldn't see it at the time (was falling asleep).

Looks like the closing paren is missing for z_oops though

Member

@jukkar jukkar left a comment

LGTM

Collaborator

@pfalcon pfalcon left a comment

Thanks!

@carlescufi : Please consider for merging.

cfriedt added 2 commits May 10, 2020 07:53
Working:

* non-blocking reads / writes
* blocking reads / writes
* send(2) / recv(2) / sendto(2) / recvfrom(2) / sendmsg(2)
* select(2)
* poll(2)

Fixes #24366

Signed-off-by: Christopher Friedt <[email protected]>
Tests for issue #24366

Signed-off-by: Christopher Friedt <[email protected]>
@cfriedt
Member Author

cfriedt commented May 10, 2020

@carlescufi - I've disabled the tests that require userspace + threads + permission fixes until I figure it out.

@pfalcon
Collaborator

pfalcon commented May 10, 2020

@cfriedt: CI may randomly fail, yes. (Over roughly the last month it has happened more often than in previous years.) A quick way to restart the build is to close and reopen the PR. (Doing that now.)

@pfalcon pfalcon closed this May 10, 2020
@pfalcon pfalcon reopened this May 10, 2020
@carlescufi carlescufi merged commit 3bfc765 into zephyrproject-rtos:master May 10, 2020
@cfriedt cfriedt deleted the issue/24366/syscall-for-socketpair-2 branch May 18, 2020 01:16
Labels
area: API, area: Kernel, area: native port, area: Networking, area: Tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

syscall for socketpair(2)
6 participants