Parallelism via Fiber Coroutines


Server applications such as web servers often need to service multiple requests in parallel. This can be achieved by using threads, event callbacks, or fiber coroutines. While fiber coroutines are less well known and understood, there are compelling reasons to use them instead of threads or callbacks.

A fiber coroutine is a code segment that runs with its own stack and cooperatively yields to other fibers when it needs to wait. Fibers can be viewed as threads, but only one fiber runs at a time. For Go programmers, fibers are similar to Go routines, while for JavaScript developers, fibers are comparable to async/await.

The Ioto embedded web server’s core uses fiber coroutines to serve multiple requests in parallel. Ioto is based on a single-threaded fiber coroutine architecture that employs a non-blocking, event-driven design capable of handling numerous inbound and outbound requests simultaneously with minimal CPU and memory resources. Ioto simplifies programming by eliminating the complexity of threads and the inelegance of event callbacks through the use of fiber coroutines.

Ioto’s fibers are integrated into the I/O system, enabling parallelism to be effortlessly supported. All Ioto services support fibers, making your user extension code straightforward, easy to debug, and maintainable in the long term. You can use a straight-line procedural programming model to read and write sockets, issue HTTP client requests, send MQTT messages, or respond to web server requests.

Models of Parallelism

To implement parallelism in an application, a developer has three choices:

  1. Threads
  2. Non-blocking APIs with callbacks
  3. Fiber coroutines

Threads

Programming with threads can be appealing at first; however, a multithreaded design can be problematic. Subtle programming errors due to timing-related issues, lock deadlocks, and race conditions can be extraordinarily difficult to detect and diagnose. All too often, they appear only in production deployments.

Although some developers excel in creating multithreaded designs, others may struggle when tasked with maintaining complex threaded code and debugging subtle race conditions and issues. Over time, a design that initially seemed reasonable can become increasingly challenging to maintain and support.

Callbacks

An alternative method for implementing parallelism involves the use of non-blocking APIs coupled with callbacks, which are often easier to test and debug than threaded designs. However, this approach often leads to decreased code quality due to the prevalent “callback-hell” phenomenon, where relatively simple algorithms become obfuscated as they are dispersed across cascading callbacks. The problem is especially pronounced in C, and in C++ code that does not use inline lambda functions for simplification. Consequently, linear algorithms are fragmented across multiple functions, and clear algorithms become increasingly difficult to decipher.

Fiber Coroutines

An appealing alternative for implementing parallelism is the use of fiber coroutines. A fiber coroutine is code that runs with its own stack and cooperatively yields to other fibers when it needs to wait.

Fibers can be thought of as threads, but only one fiber runs at a time, eliminating the need for thread locking or synchronization. For Go programmers, fibers are akin to Go routines, while for JavaScript developers, fibers are similar to async/await.

By allowing programs to overlap waiting for I/O or other events with useful compute tasks, fibers achieve parallelism without the complexities involved in other methods.

Fibers address the primary issue with multi-threaded programming where multiple threads access the same data at the same time, requiring complex locking to safeguard data integrity. Furthermore, they resolve the primary problem with non-blocking callbacks by enabling a procedural straight-line coding style.

Although not flawless, fibers provide an efficient solution for achieving parallelism. A single-threaded fiber design may not allow full utilization of all the CPU cores of a system within one program. However, for embedded device management, this is generally not a significant concern. Since device management applications are usually secondary to the primary role of the device, they should not monopolize the CPU cores of the device.

Parallelism Compared

Consider a threaded example:

#include <pthread.h>

int count = 0;
pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;

void increment() {
    pthread_mutex_lock(&mutex);
    count = count + 1;
    pthread_mutex_unlock(&mutex);
}

int getCount() {
    int c;
    pthread_mutex_lock(&mutex);
    c = count;
    pthread_mutex_unlock(&mutex);
    return c;
}

Now consider the fiber solution:

int count = 0;

void increment() {
    count = count + 1;
}

int getCount() {
    return count;
}

Since only one fiber is executing at any one time, there is no possibility of data collisions between fibers.

Callback Example

When implementing parallelism with callbacks, applications must employ non-blocking I/O. While blocking I/O is simpler, it prohibits the application from performing any other function while waiting for I/O to complete.

For instance, consider an application that must execute a REST HTTP request to retrieve some remote data. While waiting for the request to complete, the application is blocked and cannot perform any other task for several seconds. Non-blocking I/O resolves this issue, but creates another problem known as “callback hell”.

Consider this pseudo-example:

//  Issue a request and invoke the onData callback on completion
httpFetch("https://www.example.com", onData);
return;

//  First Callback
static void onData(HttpResult *result)
{
    if (!result) {
        //  The request failed, so issue another request to the backup site
        httpFetch("https://www.backup.com/", onComplete);
        return;
    }
    //  Otherwise, process the result here ...
}

//  Second Callback
static void onComplete(HttpResult *result)
{
    //  Now we are done and can process the backup result
}

As the level of callback nesting increases, the code’s intended purpose rapidly gets obscured.

The alternative Ioto code using fiber coroutines would look like this:

char *data = urlGet("https://www.example.com");
if (!data) {
    data = urlGet("https://www.backup.com/");
}

The calls to urlGet will yield and other fibers will run while waiting for I/O. When the request completes, this fiber is transparently resumed and execution continues.

Fiber-based code is more straightforward to write, debug, and maintain.

When we transitioned Ioto from callbacks to fibers, several of our algorithms shrank by more than 30% in lines of code.

Ioto Fibers in Practice

In practice, when working with Ioto, there is usually no need to explicitly program fiber yielding or resuming. The Ioto socket APIs are fiber-aware and will handle the yielding for you.

All Ioto services, including the web server, Url client, MQTT client, and AWS services, feature async APIs that are fiber-aware and will yield and resume automatically.

For example:

char buf[1024];
ssize_t nbytes;

while ((nbytes = rReadSocket(sock, buf, sizeof(buf))) > 0) {
    printf("Got body data %.*s\n", (int) nbytes, buf);
}

Ioto I/O API

Ioto builds fiber support into the lowest layer of the “R” portable runtime. The rReadSocket and rWriteSocket APIs support automatic fiber yielding: they yield and resume the current fiber as required, allowing other fibers to continue running while waiting for I/O. Note that only one fiber executes at any time.
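As an illustrative sketch, the loop below echoes received data back to the client using these calls. It assumes the socket type is RSocket and that rWriteSocket takes the socket, buffer, and length, mirroring the read example above; check the Runtime API for the exact signatures.

//  Sketch only: the RSocket type name and the rWriteSocket argument list are
//  assumptions that mirror the read example above
static void echoClient(RSocket *sock)
{
    char    buf[1024];
    ssize_t nbytes;

    while ((nbytes = rReadSocket(sock, buf, sizeof(buf))) > 0) {
        //  Both calls may yield this fiber; other fibers run while it waits
        if (rWriteSocket(sock, buf, nbytes) < 0) {
            break;
        }
    }
}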

Fiber API

Ioto also supports a low-level fiber API so you can construct your own fiber-enabled primitives.

Use rYieldFiber to yield the CPU and switch to another fiber. You must separately arrange for rResumeFiber to be called when the fiber should continue.
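The sketch below illustrates this pattern. The names getCurrentFiber and registerCompletion are hypothetical placeholders, and the argument lists of rYieldFiber and rResumeFiber are assumptions; consult the Runtime API for the actual calls.

//  Sketch only: getCurrentFiber and registerCompletion are hypothetical names,
//  and the rYieldFiber/rResumeFiber argument lists are assumptions
static void onServiceDone(void *fiber)
{
    //  Invoked when the external operation completes: resume the waiting fiber
    rResumeFiber(fiber, 0);
}

static void waitForService(void)
{
    void *fiber = getCurrentFiber();              //  Hypothetical: current fiber handle
    registerCompletion(onServiceDone, fiber);     //  Hypothetical external service API
    rYieldFiber(0);                               //  Suspend here until resumed
}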

Use rSpawnFiber to create a new fiber and immediately switch to it. For example:

void myFiberFunction(void *arg) {
    //  Code here runs inside its own fiber
}

rSpawnFiber(myFiberFunction, arg);
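For instance, a fiber could be spawned to run a blocking-style HTTP fetch without stalling other work. This sketch reuses urlGet and rSpawnFiber with the argument forms shown earlier in this article; the fetchStatus function and URL are illustrative only.

//  Illustrative sketch: fetchStatus and the URL are hypothetical
static void fetchStatus(void *arg)
{
    char *data;

    //  urlGet yields this fiber while waiting, so other fibers keep running
    data = urlGet("https://www.example.com/status");
    if (data) {
        printf("Status: %s\n", data);
    }
}

rSpawnFiber(fetchStatus, NULL);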

Integrating with External Services

But what should you do if you need to invoke an external service that will block?

You have two alternatives:

  1. Use Non-Blocking APIs
  2. Use threads

Using Non-Blocking APIs with External Services

Ioto provides a flexible centralized eventing and waiting mechanism that can support any service that provides a select() compatible file descriptor.

If the external service has a non-blocking API and provides a file descriptor that is compatible with select or epoll, you can use the Ioto runtime wait APIs to be signaled when the external service is complete.

To wait for I/O on a file descriptor, call rAllocWait to create a wait object and rSetWaitHandler to nominate an event function to invoke.

For example:

wait = rAllocWait(fd);
rSetWaitHandler(wait, fn, arg, R_READABLE);

The nominated function will be run on a fiber coroutine when I/O on the file descriptor (fd) is ready.
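For a slightly fuller sketch, the handler below is assumed to receive the arg value passed to rSetWaitHandler; the RWait type name, the handler signature, and the processServiceData helper are assumptions for illustration, so check the Runtime API for the exact forms.

//  Sketch only: the handler signature, RWait type name, and processServiceData
//  helper are assumptions for illustration
static void onReadable(void *arg)
{
    //  Runs on a fiber coroutine when the descriptor is readable, so fiber-aware
    //  calls can be made here without stalling other fibers
    processServiceData(arg);
}

static void watchService(int serviceFd, void *arg)
{
    RWait *wait = rAllocWait(serviceFd);
    rSetWaitHandler(wait, onReadable, arg, R_READABLE);
}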

Using Threads with External Services

The other option is to create a thread. However, you must take care to properly yield the fiber first. The runtime provides a convenient rSpawnThread API that will do this for you. It will create a thread, yield the current fiber, and then invoke your threadMain. When your threadMain exits, the fiber is automatically resumed.

For example:

rSpawnThread(threadMain, arg);

static void threadMain(void *arg)
{
    //  Runs on a separate thread while the calling fiber is suspended.
    //  Store the blocking result where the fiber can find it after it resumes.
    *((void**) arg) = getFromExternalService();
}

Manual Yield and Resume

Though rarely necessary, you may need to manually create fibers and yield and resume them explicitly.

The APIs for this are: rAllocFiber, rYieldFiber and rResumeFiber.

See the Runtime API for more details.

Summary

Ioto eliminates the complexity of threads and verbosity of callbacks by using fiber coroutines. The result is a simple, highly efficient design that simplifies implementing and debugging IoT and embedded services.

Want More Now?

To learn more about EmbedThis Ioto, please read:
