libc: thread-safe newlib #21518

kaidoho · 2019-12-19T12:11:53Z

The aim of this PR is to utilize the functionality provided by newlib for thread-safety.

RFC #21519

Relevant Links

Documentation

Newlib Documentation on __malloc_lock

Newlib Documentation on Reentrancy

Mailing Lists

Newlib discussion on --enable-newlib-retargetable-locking

[PATCH, newlib] Allow locking routine to be retargeted

What are the __retarget_lock functions?

Add newlib reent struct to each k_thread struct and set newlib's global impure_ptr to point to the reent struct of the current thread after context switch. Signed-off-by: Markus Bernd Moessner <[email protected]>

kaidoho · 2019-12-19T12:13:42Z

RFC #21519

zephyrbot · 2019-12-19T12:14:53Z

Some checks failed. Please fix and resubmit.

checkpatch issues

-:198: ERROR:SPACING: space prohibited before that ',' (ctx:WxW)
#198: FILE: lib/libc/newlib/libc-hooks.c:360:
+		sys_sem_take((struct sys_sem *) lock , K_FOREVER);
 		                                     ^

-:216: ERROR:SPACING: space prohibited before that close parenthesis ')'
#216: FILE: lib/libc/newlib/libc-hooks.c:378:
+		sys_sem_give((struct sys_sem *) lock );

-:243: ERROR:POINTER_LOCATION: "(foo*)" should be "(foo *)"
#243: FILE: lib/libc/newlib/libc-hooks.c:405:
+		sys_mutex_lock((struct sys_mutex*) lock, K_FOREVER);

- total: 3 errors, 0 warnings, 239 lines checked

NOTE: For some of the reported defects, checkpatch may be able to
      mechanically convert to the typical style using --fix or --fix-inplace.

Your patch has style problems, please review.

NOTE: Ignored message types: AVOID_EXTERNS BRACES CONFIG_EXPERIMENTAL CONST_STRUCT DATE_TIME FILE_PATH_CHANGES MINMAX NETWORKING_BLOCK_COMMENT_STYLE PRINTK_WITHOUT_KERN_LEVEL SPLIT_STRING VOLATILE

NOTE: If any of the errors are false positives, please report
      them to the maintainers.

Tip: The bot edits this comment instead of posting a new one, so you can check the comment's history to see earlier messages.

stephanosio · 2019-12-19T14:35:51Z

arch/arm/core/swap.c

+#ifdef CONFIG_NEWLIB_LIBC
+	_impure_ptr = &_current->base.k_reent;
+#endif


Note that arch_swap is only used for co-operative task switching. When a task is preempted, this function is not called.

_impure_ptr should be updated in z_arm_pendsv, which is the actual common task switching function.

Line 228 of swap_helper.S would be a good place to add this.

zephyr/arch/arm/core/swap_helper.S

Lines 225 to 229 in 816d2c4

isb

#endif

ldr r4, =_thread_offset_to_callee_saved

do we need to make this change in the context switch code for all arches or is there something special about ARM that _impure_ptr needs to be updated like this?

also what about SMP?

do we need to make this change in the context switch code for all arches

Yes, struct reent is basically per-thread context for newlib.

https://github.com/bminor/newlib/blob/b61dc22adaf82114eee3edce91cc3433bcd27fe5/newlib/libc/include/sys/reent.h#L377-L424
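To make the point about per-thread context concrete, here is a minimal host-side sketch of the idea under discussion: each thread carries its own reentrancy block, and the common context-switch path repoints a global at it. All names (`sketch_reent`, `sketch_thread`, `sketch_impure_ptr`, `sketch_switch_to`) are hypothetical stand-ins and not Zephyr or newlib symbols; a real implementation would do the pointer update in the architecture's switch code (for ARM, the z_arm_pendsv path mentioned above).

```c
#include <stddef.h>

/* Hypothetical stand-in for newlib's struct _reent (per-thread libc state). */
struct sketch_reent {
	int err; /* plays the role of the per-thread errno */
};

/* Hypothetical thread object that embeds its own reentrancy context. */
struct sketch_thread {
	const char *name;
	struct sketch_reent reent;
};

/* Plays the role of newlib's global _impure_ptr. */
static struct sketch_reent *sketch_impure_ptr;

/*
 * Assumed to be called from the common switch path, so that it runs for
 * both cooperative and preemptive switches.
 */
static void sketch_switch_to(struct sketch_thread *next)
{
	sketch_impure_ptr = &next->reent;
}

/* A libc-style routine that records an error in the current context. */
static void sketch_set_errno(int value)
{
	sketch_impure_ptr->err = value;
}

/* Switch between two threads; returns 1 if each kept its own errno. */
static int sketch_demo(void)
{
	struct sketch_thread a = { "a", { 0 } };
	struct sketch_thread b = { "b", { 0 } };

	sketch_switch_to(&a);
	sketch_set_errno(11);
	sketch_switch_to(&b);
	sketch_set_errno(22);

	return a.reent.err == 11 && b.reent.err == 22;
}
```

If the pointer update were missed on a preemptive switch, the second `sketch_set_errno()` call would overwrite thread a's value, which is the failure mode described for arch_swap.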

also what about SMP?

For ARM, SMP is not supported at the moment. I will look into this in the future.

these changes impact all arches so I think we will need to close on this before it can be merged.

I don't see how a global _impure_ptr updated on context switch could ever work in an SMP system, surely newlib has something for this...does this need to be stored in thread-local storage?

perhaps we need to override __getreent()?

perhaps we need to override __getreent()?

Looks like that would be the correct approach for SMP.

https://github.com/eblot/newlib/blob/2a63fa0fd26ffb6603f69d9e369e944fe449c246/newlib/libc/sys/linux/linuxthreads/getreent.c#L5-L10

One problem I see is that __getreent is not declared __weak, so it would not be override-able.

https://github.com/eblot/newlib/blob/2a63fa0fd26ffb6603f69d9e369e944fe449c246/newlib/libc/reent/getreent.c#L10-L14

I wonder if Zephyr should create a separate fork of newlib for this.
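As a rough illustration of why a lookup function suits SMP better than a single global, the sketch below keeps one "current thread" slot per CPU and resolves the reentrancy block through it. This is only a model: newlib's `__getreent()` takes no arguments, whereas the hypothetical `sketch_getreent()` here takes the CPU index explicitly because a host program has no CPU-id register to read. `SKETCH_NUM_CPUS` and the struct names are likewise assumptions.

```c
#include <stddef.h>

#define SKETCH_NUM_CPUS 2

/* Hypothetical stand-ins for newlib's struct _reent and a kernel thread. */
struct sketch_reent {
	int err;
};

struct sketch_thread {
	struct sketch_reent reent;
};

/* One "current thread" slot per CPU instead of one global pointer. */
static struct sketch_thread *sketch_current[SKETCH_NUM_CPUS];

/*
 * Hypothetical lookup in the spirit of __getreent(): return the context
 * of whichever thread is running on the given CPU.
 */
static struct sketch_reent *sketch_getreent(unsigned int cpu)
{
	return &sketch_current[cpu]->reent;
}

/* Two CPUs run two threads at once; returns 1 if the contexts stay apart. */
static int sketch_smp_demo(void)
{
	struct sketch_thread t0 = { { 0 } };
	struct sketch_thread t1 = { { 0 } };

	sketch_current[0] = &t0;
	sketch_current[1] = &t1;

	sketch_getreent(0)->err = 5;
	sketch_getreent(1)->err = 7;

	return t0.reent.err == 5 && t1.reent.err == 7;
}
```

With a single global pointer, both CPUs would resolve to whichever thread was switched in last; the per-CPU lookup avoids that without needing the pointer to be rewritten on every switch.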

stephanosio · 2019-12-19T14:41:59Z

include/kernel.h

@@ -493,6 +493,10 @@ struct _thread_base {
 	u8_t cpu_mask;
 #endif

+#ifdef CONFIG_NEWLIB_LIBC


It would be desirable to add a separate newlib config symbol that enables reent support (e.g. CONFIG_NEWLIB_LIBC_REENT).

This symbol can be default y if MULTITHREADING so as to only enable reent support when multi-threading is enabled.

If there is definitely only one thread running I think that's a good idea.

yeah this would be good, we have some use-cases for disabling multithreading and using Zephyr more like a HAL (bootloaders, for example)
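A small sketch of what the suggested symbol would mean on the C side, assuming the name CONFIG_NEWLIB_LIBC_REENT proposed above. The macros are defined by hand here only to mimic "default y if MULTITHREADING"; in a real build they would come from Kconfig, and the struct names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for newlib's struct _reent. */
struct sketch_reent {
	int err;
	char scratch[32];
};

/* Normally generated by Kconfig; defined by hand for this sketch only. */
#define CONFIG_MULTITHREADING 1

/* Mimics "default y if MULTITHREADING" for the proposed symbol. */
#if defined(CONFIG_MULTITHREADING) && !defined(CONFIG_NEWLIB_LIBC_REENT)
#define CONFIG_NEWLIB_LIBC_REENT 1
#endif

/* Hypothetical thread base: the reent block exists only when enabled. */
struct sketch_thread_base {
	unsigned char prio;
#ifdef CONFIG_NEWLIB_LIBC_REENT
	struct sketch_reent k_reent;
#endif
};

/* Per-thread memory cost of the feature in this configuration. */
static size_t sketch_thread_overhead(void)
{
#ifdef CONFIG_NEWLIB_LIBC_REENT
	return sizeof(struct sketch_reent);
#else
	return 0;
#endif
}
```

Single-threaded builds such as the bootloader case would leave the symbol unset and pay no per-thread cost.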

What do you think about the other switch, which reserves memory for the locks? Is it OK to have one? Would you set the default value to 0? Actually, I don't like setting it to 0, as only the malloc_lock hooks work in this scenario, but I was concerned that existing applications could run out of memory if I chose too small a default value.

lib/libc/newlib/libc-hooks.c

andyross

The feature seems sound (though I'm no expert on the newlib locking design), but the memory storage for the locks seems kinda wrong?

lib/libc/newlib/libc-hooks.c

andyross · 2019-12-19T15:54:47Z

lib/libc/newlib/libc-hooks.c

+
+SYS_MEM_POOL_DEFINE(z_nl_lock_pool, NULL,
+		NEWLIB_LOCK_FRAG_SIZE, NEWLIB_LOCK_POOL_SIZE,
+		1, sizeof(void *), NEWLIB_LOCK_SECTION);


This creates a dependency between newlib and mempool that didn't exist before. Generally those have been either/or: an application will use a heap managed by mempool or by newlib, not both. Now we need to include both variants. Isn't there a way to repurpose the newlib heap code to do this?

And if there's not, you probably want to be looking at the Zephyr mem_slab and not mem_pool, as AFAICT all allocations are of the same object size.

we do have a pool of objects here, slab would be better...however, mem slabs can't be used from user mode. sys_mem_pool can, the whole object lives in user memory (we route it to the libc memory domain with NEWLIB_LOCK_SECTION)

I used malloc in the first place, but it's kind of heavy. Is there any action for me?
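For reference, the "all allocations are the same size" observation is what makes a slab-style allocator attractive. The following is a generic fixed-block free-list sketch, not Zephyr's mem_slab or sys_mem_pool API; `SKETCH_NUM_LOCKS`, `sketch_lock`, and the function names are hypothetical, and the code is deliberately not thread-safe.

```c
#include <stddef.h>

#define SKETCH_NUM_LOCKS 4

/* Hypothetical lock object; every allocation has exactly this size. */
struct sketch_lock {
	int count;
	void *owner;
};

/* A free block stores the link to the next free block in place. */
union sketch_block {
	union sketch_block *next;
	struct sketch_lock lock;
};

static union sketch_block sketch_blocks[SKETCH_NUM_LOCKS];
static union sketch_block *sketch_free_list;

/* Thread all blocks onto the free list. */
static void sketch_slab_init(void)
{
	sketch_free_list = NULL;
	for (size_t i = 0; i < SKETCH_NUM_LOCKS; i++) {
		sketch_blocks[i].next = sketch_free_list;
		sketch_free_list = &sketch_blocks[i];
	}
}

/* Pop one block; returns NULL when the slab is exhausted. */
static struct sketch_lock *sketch_slab_alloc(void)
{
	union sketch_block *blk = sketch_free_list;

	if (blk == NULL) {
		return NULL;
	}
	sketch_free_list = blk->next;
	return &blk->lock;
}

/* Push a block back onto the free list. */
static void sketch_slab_free(struct sketch_lock *lock)
{
	union sketch_block *blk = (union sketch_block *)lock;

	blk->next = sketch_free_list;
	sketch_free_list = blk;
}

/* Exhaust the slab, then free and reuse one block; returns 1 on success. */
static int sketch_slab_demo(void)
{
	struct sketch_lock *held[SKETCH_NUM_LOCKS];

	sketch_slab_init();
	for (size_t i = 0; i < SKETCH_NUM_LOCKS; i++) {
		held[i] = sketch_slab_alloc();
		if (held[i] == NULL) {
			return 0;
		}
	}
	if (sketch_slab_alloc() != NULL) {
		return 0; /* a fifth allocation must fail */
	}
	sketch_slab_free(held[0]);
	return sketch_slab_alloc() == held[0];
}
```

Allocation and free are constant-time and there is no fragmentation, but the user-mode constraint raised above still applies: the backing array would have to live in memory the calling thread can access.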

andyross · 2019-12-19T15:55:41Z

lib/libc/newlib/libc-hooks.c

+static LIBC_DATA SYS_SEM_DEFINE(nl_at_quick_exit_sem, 1, 1);
+static LIBC_DATA SYS_SEM_DEFINE(nl_tz_sem, 1, 1);
+static LIBC_DATA SYS_SEM_DEFINE(nl_dd_hash_sem, 1, 1);
+static LIBC_DATA SYS_SEM_DEFINE(nl_arc4random_sem, 1, 1);


I'm a little confused: why does newlib need recursive locking in some subsystems and not others? Is that something specific to this patch or to newlib?

Newlib - not my idea
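For readers unfamiliar with the distinction, here is a single-threaded model of the two lock kinds. It is only a sketch: the types and functions are hypothetical, the operations are not atomic, and "try" variants are used so that the case which would block (or self-deadlock) simply returns false. As far as I recall, newlib's documentation for __malloc_lock states that the lock may be taken again by the thread that already holds it, which is the situation the recursive variant covers.

```c
#include <stdbool.h>

/* Recursive lock: the owner may re-acquire; depth counts nested holds. */
struct sketch_rlock {
	int owner;          /* 0 means unowned */
	unsigned int depth;
};

static bool sketch_rlock_try_acquire(struct sketch_rlock *l, int tid)
{
	if (l->depth == 0) {
		l->owner = tid;
		l->depth = 1;
		return true;
	}
	if (l->owner == tid) {
		l->depth++; /* nested acquire by the same thread */
		return true;
	}
	return false; /* held by another thread */
}

static void sketch_rlock_release(struct sketch_rlock *l)
{
	if (l->depth > 0 && --l->depth == 0) {
		l->owner = 0;
	}
}

/* Binary semaphore: no owner tracking, so a second take always fails. */
struct sketch_sem {
	unsigned int count;
};

static bool sketch_sem_try_take(struct sketch_sem *s)
{
	if (s->count == 0) {
		return false;
	}
	s->count--;
	return true;
}

static void sketch_sem_give(struct sketch_sem *s)
{
	s->count = 1;
}

/* Returns 1 if both lock kinds behave as described. */
static int sketch_lock_kinds_demo(void)
{
	struct sketch_rlock r = { 0, 0 };
	struct sketch_sem s = { 1 };

	bool nested = sketch_rlock_try_acquire(&r, 1) &&
		      sketch_rlock_try_acquire(&r, 1);
	bool other = sketch_rlock_try_acquire(&r, 2); /* still held by 1 */

	sketch_rlock_release(&r);
	sketch_rlock_release(&r);
	bool after = sketch_rlock_try_acquire(&r, 2); /* now free */

	bool first = sketch_sem_try_take(&s);
	bool second = sketch_sem_try_take(&s); /* would block for real */

	sketch_sem_give(&s);
	return nested && !other && after && first && !second;
}
```

A code path that re-enters a locked region from the same thread needs the first kind; with the second kind the nested take would wait on itself.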

andyross · 2019-12-19T16:00:32Z

lib/libc/newlib/libc-hooks.c

+
+	__ASSERT(lock, "failed to allocate memory for newlib lock");
+
+	(*lock)->pSemOrMtx = (void *) &((char *)lock)[sizeof(void *)];


Same comments here as in the other init function.

Removed the pSemOrMtx

andyross · 2019-12-19T16:14:47Z

lib/libc/newlib/libc-hooks.c

+
+	__ASSERT(lock, "failed to allocate memory for newlib lock");
+
+	(*lock)->pSemOrMtx = (void *) &((char *)lock)[sizeof(void *)];


I'm not understanding this function. You're taking a lock pointer as an argument (where is that defined?), but then throwing that value away and replacing it with a new heap block (which may be null, and is unchecked), then dereferencing whatever pointer happened to be stored in that uninitialized heap block to store a pointer into the same block?

I'm guessing that what you really want to be doing is allocating a block containing just the sem/mutex union and assigning that through the opaque pointer you're being passed?

Why is there a header containing pSemOrMtx if the pointer always points to the byte after its own address? Why not just cast the struct address in the first place?

You're taking a lock pointer as an argument (where is that defined?), but then throwing that value away and replacing it with a new heap block

Yeah this is confusing me too, some more detail on the intention here would be helpful, maybe leave a comment if this is truly correct (although right now it looks like the allocated lock simply leaks)

Fixed the possible null pointer dereference

I'm guessing that what you really want to be doing is allocating a block containing just the sem/mutex union and assigning that through the opaque pointer you're being passed?

Why is there a header containing pSemOrMtx if the pointer always points to the byte after its own address? Why not just cast the struct address in the first place?

You're right, I've changed that.
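The shape the reviewers describe (allocate a block holding just the lock object, check the allocation, then hand it back through the opaque pointer) could look roughly like the sketch below. It is not the PR's code: `sketch_lock_t` mirrors the idea of newlib's `_LOCK_T` handle, `malloc` stands in for whatever pool the implementation ends up using, and the int return value is an addition for illustration (as far as I know, newlib's retargetable init hook returns void).

```c
#include <stdlib.h>

/* Hypothetical lock object; the real one would wrap a sem or a mutex. */
struct sketch_lock {
	int count;
};

/* Opaque handle in the style of newlib's _LOCK_T (pointer to the lock). */
typedef struct sketch_lock *sketch_lock_t;

/*
 * Allocate a lock and store it through the caller's handle.
 * Returns 0 on success, -1 if the handle is NULL or allocation fails.
 */
static int sketch_lock_init(sketch_lock_t *lock)
{
	struct sketch_lock *blk;

	if (lock == NULL) {
		return -1;
	}

	blk = malloc(sizeof(*blk)); /* stand-in for the lock pool */
	if (blk == NULL) {
		return -1; /* nothing is dereferenced on failure */
	}

	blk->count = 1;
	*lock = blk; /* assign through the pointer we were given */
	return 0;
}

/* Release the lock and clear the caller's handle. */
static void sketch_lock_close(sketch_lock_t *lock)
{
	free(*lock);
	*lock = NULL;
}

/* Returns 1 if init fills the handle and close clears it. */
static int sketch_lock_demo(void)
{
	sketch_lock_t l = NULL;

	if (sketch_lock_init(&l) != 0) {
		return 0;
	}
	int ok = (l != NULL && l->count == 1);

	sketch_lock_close(&l);
	return ok && l == NULL;
}
```

No header pointing at "the byte after itself" is needed: the handle already is the address of the lock object, so the hooks can cast or dereference it directly.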

andrewboie

thanks for looking into this, our newlib bindings have needed some attention for a while.

andrewboie · 2019-12-20T00:58:10Z

lib/libc/newlib/libc-hooks.c

+
+
+#if !defined(_RETARGETABLE_LOCKING) || \
+	CONFIG_NEWLIB_LIBC_DYNAMIC_LOCK_MEM_SIZE == 0


what happens if this block isn't compiled? (i.e. someone set the dynamic lock mem size to 0)

What do you think about the other switch, which reserves memory for the locks? Is it OK to have one? Would you set the default value to 0? Actually, I don't like setting it to 0, as only the malloc_lock hooks work in this scenario, but I was concerned that existing applications could run out of memory if I chose too small a default value.

kaidoho · 2019-12-20T10:08:59Z

@stephanosio @andyross @andrewboie

do we need to make this change in the context switch code for all arches or is there something special about ARM that _impure_ptr needs to be updated like this?

also what about SMP?

This has an impact on all architectures. As described within the RFC, I've only added ARM (checkpatch shall complain until all others are there too, to avoid having a partial implementation going into Zephyr).

My main intent was to bring up the issue, perhaps it wasn't a good idea to support the RFC with a PR as it draws more attention to the implementation than the feature.

Let's take one step back:

Do you agree that it would be great to have a thread-safe newlib?

If so, there are two ways one can achieve this.

The one I show within this PR. It leaves newlib "as is" and only adds the hooks and impure_ptr switching to Zephyr.
Pro:

GNU ARM Embedded can be used without patches
Small changes which can be done in short amount of time

@stephanosio thought about patching newlib to have getreent available. That's partially what I considered when mentioning the alternative RTEMS route to go. Actually, I'd prefer to go the extra mile and have a target-OS-dependent toolchain (patching newlib and GCC for Zephyr). Why? Well, looking forward, the next issue that will arise is within libstdc++:

The C++ library string functionality requires a couple of atomic operations to provide thread-safety. If you don't take any special action, the library will use stub versions of these functions that are not thread-safe. They will work fine, unless your applications are multi-threaded.

If you want to provide custom, safe, versions of these functions, there are two distinct approaches. One is to provide a version for your CPU, using assembly language constructs. The other is to use the thread-safety primitives in your operating system.
https://gcc.gnu.org/onlinedocs/libstdc++/manual/internals.html#internals.thread_safety

No strings == no go to me.

Those functions go in libstdc++ there are no simple hooks - one has to either add a full OS / thread model to GCC / libstdc++, or tweak the single thread implementation by adding hooks which we can use like the newlib stuff.

Pro

Addresses not only C but also C++

No worries to drop this PR in favour of something better.

stephanosio · 2019-12-20T10:39:59Z

Do you agree that it would be great to have a thread-safe newlib?

Not just great, it is absolutely imperative if we are going to do anything useful with the newlib.

Maybe #21519 should be labeled a "bug" and "high priority" since this issue practically renders the newlib useless?

I can see that there are many projects that require the newlib (e.g. net and gui) and this means that there is the possibility of them "randomly" crashing from the thread safety issues at any moment.

Those functions go in libstdc++ there are no simple hooks - one has to either add a full OS / thread model to GCC / libstdc++, or tweak the single thread implementation by adding hooks which we can use like the newlib stuff.

@pabigot This sounds like something that must be addressed before we can say C++ is supported in the Zephyr, alongside many other issues.

pabigot · 2019-12-20T11:02:49Z

@pabigot This sounds like something that must be addressed before we can say C++ is supported in the Zephyr, alongside many other issues.

Agreed, added to #18554.

Zephyr has some features that make it difficult to guarantee mutex/thread-safety, regardless of language: ZLIs and meta-IRQs. For the purposes of newlib support we can ignore them.

Add a config switch to adjust the size of the memory reserved for newlib's dynamic locks. If size is set to 0, only malloc will be thread-safe. Add an implementation for the locking hooks exposed by newlib. Signed-off-by: Markus Bernd Moessner <[email protected]>

pabigot

I think this is going to be a good step forward; thanks for taking it on. Just a couple minor non-blocking comments in addition to ones already raised.

IMO the commit messages don't benefit from (1/2) and (2/2) in the subject line.

It might be worth a link in the commit message to newlib documentation on how to support reentrancy, or at least a reference to the #21519 where there are pointers to such documentation.

In particular while looking at this I wanted to know why there were public symbols being defined with non-Zephyr implementation-reserved identifiers like __lock___foo, and had to grep through the newlib source to get an answer. A comment above the definition noting that these are referenced from newlib when it's built for thread-support (assuming that's true) would help future maintainers understand what these things are.

For style the number of blank lines between definitions isn't consistent (one separation has four). Also I don't think Zephyr generally adds a space in (t)x casts. Using uncrustify could help reveal issues.

kaidoho · 2019-12-23T22:33:04Z

@pabigot Regarding the libstdc++ issue which came up during the discussion here - shouldn't we open a separate issue for that?

pabigot · 2019-12-23T22:49:04Z

@pabigot Documentation (at least what exists and is not in source), even the newlib discussions are linked in the RFC - and the first thing I did, was to interlink RFC and PR. I've now added a textual link and copied over the links to the documentation.

It's nice to have it in the github issue, but when we come back to this in six months for maintenance it'd be nice to have something in the code or commit message. Going back to the issue and PR given a commit SHA1 is not particularly difficult, but it's not trivial either. Mentioning the RFC issue number as #21519 in the commit message would make it easier; a short sentence in the code explaining things might even make it unnecessary.

Regarding the libstdc++ issue which came up during the discussion here - shouldn't we open a separate issue for that?

Perhaps. There is a tie to this issue in #18554. I'm not clear on exactly what else needs to be done for C++ support. I don't believe we're ever going to get C++ threads to be supported by Zephyr: there's too much resistance to C++ at the project level, and I don't believe the thread model is compatible.

carlescufi · 2020-01-16T16:58:35Z

@kaidoho this PR seems a bit stuck. An option to move it forward is to add the "Dev-review" label for it to be discussed in the dev review meeting. Another one is to continue the discussion here with @stephanosio and @andyross

kaidoho · 2020-01-16T19:24:50Z

@carlescufi you are right, on the one hand I am waiting for directions and on the other I began to look into GCC / newlib to find out what it takes to have them support a Zephyr thread model. This will end in an RFC with the aim to have a custom toolchain for Zephyr. Perhaps, one would implement this PR differently when GCC / newlib has to be patched anyway. So, I think it is best to write the RFC regarding the toolchain, link this RFC/PR, and then see which direction the discussion takes. Ok, or would you do differently?

alexanderwachter · 2020-09-03T17:00:27Z

@stephanosio @andyross @andrewboie @carlescufi @pabigot.
It seems that this PR is still relevant for newlib support. How do we proceed with that?

andrewboie · 2020-09-03T17:25:07Z

@stephanosio @andyross @andrewboie @carlescufi @pabigot.
It seems that this PR is still relevant for newlib support. How do we proceed with that?

The issue tracking this is currently tracked as "enhancement" and not "bug", and there isn't pressure applied at release time to resolve it since it doesn't contribute to the release bug count requirements. We rely solely on the motivation of the reporter/author to move it along.

I think it should be promoted to "bug", but agree to scope it for 2.5 and find a dedicated owner to see it through if @kaidoho isn't working on it.

github-actions · 2020-11-03T00:49:42Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

github-actions · 2021-03-22T01:25:53Z

This pull request has been marked as stale because it has been open (more than) 60 days with no activity. Remove the stale label or add a comment saying that you would like to have the label removed otherwise this pull request will automatically be closed in 14 days. Note, that you can always re-open a closed pull request at any time.

Build newlib library to be thread-safe in multithreaded environment. zephyrproject-rtos/zephyr#21518 zephyrproject-rtos/zephyr#21519 zephyrproject-rtos/zephyr#36201 https://sourceware.org/legacy-ml/newlib/2016/msg01165.html https://sourceware.org/git/?p=newlib-cygwin.git;a=commit;h=bd54749095ee45d7136b6e7c8a1e5218749c87b6 Error log: newlib/libc-hooks.c:310:1: note: in expansion of macro 'BUILD_ASSERT' BUILD_ASSERT(IS_ENABLED(_RETARGETABLE_LOCKING), "Retargetable locking must be enabled"); Signed-off-by: Naveen Saini <[email protected]> Tested-by: Jon Mason <[email protected]>

libc: Make newlib thread-safe (1/2)

330b806

Add newlib reent struct to each k_thread struct and set newlib's global impure_ptr to point to the reent struct of the current thread after context switch. Signed-off-by: Markus Bernd Moessner <[email protected]>

kaidoho requested review from andrewboie, andyross, galak, ioannisg, MaureenHelm and nashif as code owners December 19, 2019 12:11

kaidoho mentioned this pull request Dec 19, 2019

RFC: libc: thread-safe newlib #21519

Closed

zephyrbot added area: C Library C Standard Library area: ARM ARM (32-bit) Architecture area: API Changes to public APIs area: Kernel labels Dec 19, 2019

stephanosio requested changes Dec 19, 2019

stephanosio reviewed Dec 19, 2019

pavlohamov reviewed Dec 19, 2019

andyross requested changes Dec 19, 2019

andrewboie reviewed Dec 19, 2019

andrewboie reviewed Dec 20, 2019

stephanosio added area: Architectures and removed area: ARM ARM (32-bit) Architecture labels Dec 20, 2019

pabigot mentioned this pull request Dec 20, 2019

Tracking Issue for C++ Support as of release 2.1 #18554

Closed

8 tasks

zephyrbot added the area: ARM ARM (32-bit) Architecture label Dec 20, 2019

pabigot reviewed Dec 23, 2019


github-actions bot added has-conflicts Issue/PR has conflicts with another issue/PR and removed has-conflicts Issue/PR has conflicts with another issue/PR labels Jun 29, 2020

github-actions bot added the Stale label Nov 3, 2020

github-actions bot closed this Nov 17, 2020

andrewboie reopened this Nov 17, 2020

andrewboie removed the Stale label Nov 17, 2020

zephyrbot requested review from andrewboie, andyross, carlocaione and stephanosio January 8, 2021 22:54

zephyrbot assigned andyross Jan 8, 2021

andyross mentioned this pull request Mar 8, 2021

Newlib has no synchronization #33164

Closed

github-actions bot added the Stale label Mar 22, 2021

github-actions bot closed this Apr 6, 2021

stephanosio mentioned this pull request May 12, 2021

Fix newlib malloc thread safety issue #35227

Merged

