NMatrix Developer Guide

The goal of this page is to give a general overview of NMatrix for those who want to hack on the code, and to make it easier for new contributors to jump right in. It is still a work in progress, but will hopefully fill out over time.

Introduction to the organization of NMatrix

NMatrix from the C/C++ side

Organization

Components and how they work

Dense matrices

List matrices

Yale matrices

Playing nicely with the ruby garbage collector

Background

From within C/C++ code, ruby exposes objects as the typedef VALUE, which as of this writing is an unsigned long. Internally, a VALUE might be just the (slightly modified) value of a number (for things like fixnum), or a pointer to a ruby data structure cast to VALUE.
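As a rough illustration of how one unsigned long can carry either a small integer or a pointer, here is a minimal sketch of low-bit tagging. It assumes a scheme in which a set low bit means "immediate integer", which is how fixnums are commonly described for MRI. Every name here (toy_value, TOY_FIXNUM_FLAG, and the toy_* helpers) is hypothetical and is not part of ruby's API, and the sketch handles non-negative integers only to stay portable.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for ruby's VALUE so the sketch compiles without ruby.h. */
typedef unsigned long toy_value;

/* Assumed tagging scheme: a set low bit marks an immediate small integer. */
#define TOY_FIXNUM_FLAG 0x1UL

/* Encode a non-negative integer as a tagged value (2n + 1). */
static toy_value toy_int_to_value(long n) {
    assert(n >= 0);
    return ((toy_value)n << 1) | TOY_FIXNUM_FLAG;
}

/* A pointer to an aligned object has a clear low bit, so it needs no tag. */
static toy_value toy_ptr_to_value(void *p) {
    return (toy_value)(uintptr_t)p;
}

/* Distinguish the two cases by inspecting the low bit. */
static int toy_is_fixnum(toy_value v) {
    return (v & TOY_FIXNUM_FLAG) != 0;
}

/* Decode a tagged integer by dropping the tag bit. */
static long toy_value_to_int(toy_value v) {
    assert(toy_is_fixnum(v));
    return (long)(v >> 1);
}
```

Under these assumptions, the integer 21 is stored as 43, while the address of an ordinary aligned object is stored unchanged and is never mistaken for an integer.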

The ruby garbage collector is a mark-and-sweep collector. For ruby programs (ignoring C extensions for the moment), this means, in the simplest sense, that the interpreter keeps track of all objects in existence. When garbage collection runs, a first pass marks all objects that are accessible from the code, and a second pass frees all objects that are no longer accessible. In a C extension, this can cause problems: the C code might use ruby VALUEs that aren't accessible from the ruby side (and thus wouldn't be marked) but still shouldn't be freed. There are a number of mechanisms in place to prevent this problem:

Garbage collection only runs once C code has returned to ruby, or during a call to a ruby C API method. This means that you don't need to worry about something like a dedicated GC thread starting garbage collection at any arbitrary point in your function; only defined points can be problematic.
Data_Wrap_Struct: this is a ruby C API macro used to wrap a C struct in a ruby VALUE (see ruby's README.ext for more details). It lets you pass a marking function and a freeing function. When the ruby garbage collector marks this VALUE, it calls the marking function. By writing an appropriate marking function, you can mark VALUEs hidden in the C struct and prevent them from being garbage collected. For NMatrix, this mechanism is key to the implementation of object-dtype NMatrix objects.
Ruby checks the stack for VALUEs and pointers to VALUEs still in use by your C code. This is pretty neat. If, for instance, your code is:

VALUE x = ...;
rb_call_some_c_api_method();
return x;

Then ruby should see x on the stack and make sure not to garbage collect it during that API call. The same is true if x is a VALUE* to some VALUE(s) on the heap.
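To make the mark-and-sweep idea and the marking-function mechanism above more concrete, here is a toy model in plain C. It is only a sketch, not ruby's implementation: toy_obj, toy_storage, toy_gc_mark, toy_collect, and toy_demo are hypothetical names, the "heap" is a fixed array, and "freeing" just clears a flag. The mark field plays the role that the marking function passed to Data_Wrap_Struct plays in a real extension.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_obj toy_obj;
typedef void (*toy_mark_fn)(void *data);

/* A toy collector-managed object (hypothetical, not ruby's layout). */
struct toy_obj {
    int alive;         /* 0 once the sweep pass has "freed" it */
    int marked;        /* set by the mark pass */
    toy_mark_fn mark;  /* optional callback, in the role of a marking function */
    void *data;        /* wrapped C struct, opaque to the collector */
};

/* A C struct that hides references to managed objects. */
typedef struct {
    size_t count;
    toy_obj **elements;
} toy_storage;

/* Mark one object, then let its callback mark whatever it hides. */
static void toy_gc_mark(toy_obj *obj) {
    if (obj == NULL || obj->marked) return;
    obj->marked = 1;
    if (obj->mark != NULL) obj->mark(obj->data);
}

/* Marking function for toy_storage: marks every hidden element. */
static void toy_storage_mark(void *data) {
    toy_storage *s = data;
    for (size_t i = 0; i < s->count; i++) toy_gc_mark(s->elements[i]);
}

/* One full cycle: mark from the roots, then sweep. Returns objects freed. */
static size_t toy_collect(toy_obj *heap, size_t n,
                          toy_obj **roots, size_t nroots) {
    size_t freed = 0;
    for (size_t i = 0; i < nroots; i++) toy_gc_mark(roots[i]);
    for (size_t i = 0; i < n; i++) {
        if (heap[i].alive && !heap[i].marked) {
            heap[i].alive = 0;
            freed++;
        }
        heap[i].marked = 0; /* reset for the next cycle */
    }
    return freed;
}

/* Demo: heap[0] is a root wrapping a toy_storage that hides heap[1];
 * heap[2] is unreachable. Pass use_mark_fn = 0 to omit the callback. */
static size_t toy_demo(int use_mark_fn) {
    toy_obj heap[3] = {{0}};
    toy_obj *hidden[1] = { &heap[1] };
    toy_storage storage = { 1, hidden };
    toy_obj *roots[1] = { &heap[0] };

    for (size_t i = 0; i < 3; i++) heap[i].alive = 1;
    heap[0].data = &storage;
    heap[0].mark = use_mark_fn ? toy_storage_mark : NULL;

    assert(heap[1].alive && heap[2].alive);
    return toy_collect(heap, 3, roots, 1);
}
```

In this model, toy_demo(1) frees only the unreachable object, while toy_demo(0) also frees the hidden object that the C struct still refers to. The second case is the failure a marking function is meant to prevent.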

The problem

Two cases aren't sufficiently dealt with by these mechanisms.

You have a pointer on the stack to some struct that internally contains VALUEs, but you don't have a pointer to those VALUEs (or the VALUEs themselves) on the stack, and you want to make a ruby C API call. This could be solved simply by putting the VALUEs on the stack before the API call, if not for the second problem.
Optimizing compilers. If you're running the compiler with any optimizations turned on, it's hard to guarantee that any particular VALUE is actually on the stack when you need it to be. Given that NMatrix is a library for scientific computing, in which it's common to be CPU-limited, turning off optimizations is not ideal.

The typical solution

The typical solution to the problem of the optimizing compiler is to mark VALUEs as volatile, a keyword that (simplistically) indicates that some code that the compiler doesn't know about (whether hardware, another thread, etc.) might interact with the variable declared volatile. This generally means that the compiler won't optimize volatile variables out because there might be some unintended side effect.

To solve the problem using volatile:

Find everywhere there's a call to a ruby API method (or a call to an NMatrix method that calls a ruby API method, etc.).
Before each call, ensure that all VALUEs in use by the code (whether normally declared directly or as part of a struct, etc.) are stored in a volatile variable on the stack.
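The two steps above might look roughly like the following sketch. It compiles without ruby.h by using stand-ins: toy_value replaces VALUE, toy_pair is a hypothetical struct that hides two VALUEs, and toy_ruby_api_call is a hypothetical placeholder for any ruby C API call during which garbage collection could run. In a real extension the volatile locals would hold actual VALUEs.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ruby's VALUE so the sketch compiles without ruby.h. */
typedef unsigned long toy_value;

/* A C struct that hides two VALUEs from a scan of the stack. */
typedef struct {
    toy_value key;
    toy_value val;
} toy_pair;

/* Counts calls so a caller can confirm the "API call" happened. */
static size_t toy_api_calls = 0;

/* Hypothetical stand-in for a ruby C API call, i.e. a point where GC may run. */
static void toy_ruby_api_call(void) {
    toy_api_calls++;
}

static toy_value toy_val_after_api_call(const toy_pair *pair) {
    assert(pair != NULL);

    /* Step 2: copy every VALUE in use into a volatile local before the call,
     * so the compiler is discouraged from optimizing the copies away. */
    volatile toy_value pinned_key = pair->key;
    volatile toy_value pinned_val = pair->val;

    /* Step 1: this is a call site identified as a possible GC point. */
    toy_ruby_api_call();

    /* Reading the volatile locals after the call is meant to keep them
     * live across it. */
    (void)pinned_key;
    return pinned_val;
}
```

The cost of this pattern is that every such call site needs its own set of volatile copies, which is part of why it becomes unwieldy in a large code base.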

However, it's not completely clear whether this will prevent all optimizations that would cause issues with the garbage collector. Even if volatile does prevent all problematic optimizations, it's not clear that this is desirable from a performance perspective (however, more testing would be needed to figure this out). A reasonable interpretation of recent C++ specifications might also be that use of volatile is discouraged except for hardware interactions. Thus, just marking all VALUEs volatile is perhaps not ideal.
