semaphores with negative init value

Credits: source

semaphores are a little asymmetric, let’s see how.

Semaphore is a synchronization construct in concurrent programming. But do you know about the asymmetric behavior of semaphores?

Let’s see some definitions.

A semaphore is an object with two methods Wait and Signal, a private integer counter and a private queue (of threads)

wait: Decrements the value of the semaphore variable by 1. If the new value of the semaphore variable is negative, the process executing wait is blocked (i.e., added to the semaphore’s queue). Otherwise, the process continues execution.

signal: Increments the value of semaphore variable by 1. After the increment, if the pre-increment value was negative (meaning there are processes waiting for a resource), it transfers a blocked process from the semaphore’s waiting queue to the ready queue.

Most students studying semaphores for the first time would summarize these definitions as,
* signal on a semaphore will increment its value by 1.
* wait on semaphore will decrement its value by 1. If the resulting value is negative, the thread will be blocked until the value turns positive.

Even though semaphores are used in multi-threaded code, we are going to see a simple example that simply prints a string in the terminal because the purpose of this article is to understand the subtle misconception.
In real life, mutex name is used to represent a binary semaphore. Here, for the lack of a better name, I’m going to use mutex as the name of semaphore. It is a counting semaphore.

Consider the following code.
Let’s call it program1.

// program1
mutex:= 0
print("Hello world")

One can explain it’s working like,
mutex is initialized to 0.
post(mutex) increments its value to 1
wait(mutex) will go through because mutex value is greater than 0.
It will print “Hello world” in the terminal.
And if we were to write and run actual code with above settings, it would print the string in the terminal.

Hmm, okay. The explanation sounds correct. And indeed we will get a message printed in the terminal, so it makes sense.

Guess what would be the output of the following program.
Let’s call it program2.
Note the mutex init value.

// program2
mutex:= -10

print("Hello world")

will it stay stuck on wait(mutex) or go through?

The correct answer is, it will go through, and it will print the message in the terminal.

If you thought “it will stay stuck” then this article is for you.
You must be thinking, how can the wait(mutex) go through with a negative mutex value?

That’s what I want to talk about.

While simplifying the wait and signal definitions for ease of understanding, we forgot to account for one important piece.

It is worth highlighting that, it is not just about the value of the semaphore.

But, before I explain what’s happening, guess what would be the output of the following program. Let’s call it program3.

// program3
mutex:= -10

print("Hello world")

will it print the message in the terminal?
The correct answer is, no. The program will stay stuck/blocked on the wait call.

You must be thinking, in program2, the mutex value was negative, and wait(mutex) went through. In the program3, the mutex value is negative and it doesn’t go through?

As I said, it is not just about the semaphore value.

It is about semaphore value and the post(signal) operation.

It means, whenever we do post on mutex, it un-blocks one of the thread/process in the blocked queue. And that is the essence of post operation.
And you can think of positive semaphore initialization values as bonus post operations.

So the value doesn’t have any effect on the semaphore behavior?
Remember, I said it's about semaphore value and post(signal) operation?

Okay, so what does it mean when I initialize a semaphore with a specific positive integer?
It means those many concurrent threads can access the given resource.
The wait operation is a way to check if the capacity is already exhausted or not.
That’s why the program3 didn’t run to completion. It will stay stuck on the wait(mutex) line.

But then, how come the program2 ran to completion?
Because of post operation.

Basically, a post unblocks a blocked thread.

Semaphores are kind of stateful. Meaning, if I start with a negative semaphore value, call a post on it, and if there is no thread in the blocked semaphore queue at that moment, it will remember that it is supposed to let one wait to go through.
Basically, a post will make a way for a wait to go through.
And that’s the explanation for program2.

Let’s do one more example just to make sure you understood it correctly.

// program4
mutex:= 1
print("Hello world")

Will this program run to completion of stay stuck?

The correct answer is, it will run to completion. It will print both the strings in the terminal.

The mutex is initialized to 1 ( read it as one bonus post operation)
Then there is a post(mutex) which will increment its value to 2.
So basically, two wait operations can go through.

So, in summary, it is not just about the semaphore value.
It is about the semaphore value and the post operations.

Hope this was useful.

Take care and have a good one!

HPC Engineer @ R&D Lab