Saturday, July 19, 2014

Concurrency and Race Conditions

                                                         Introduction


  • The problem of concurrency occurs when the system tries to do more than one thing at once.
  • With the use of SMP(symmetric multiprocessing, which means more than one processor or cores access the same memory) the chances of concurrent access has increased.
  • Race condition means uncontrolled access to the shared data.
  • This race condition leads to memory leak because the shared data is overwritten and the first wrote data is lost.
  • It is advised to avoid sharing of resource while writing a program, for eg. avoid using global variables.
  • It is also not possible to run away from sharing of resource as we have limited hardware and software resources.
  • So how do we deal with concurrent access?
Locking/Mutual Exclusion provides access management.


                                               Semaphores and Mutexes



  • A process in linux goes to sleep("blocks") when it has nothing to do.
  • Semaphores are used when the thread holding the lock can sleep.
  • Semaphore framwork consists of a single integer value plus a pair of functions (P and V).
  • Lets understand a semaphore operation-


A process which wants to access a critical region calls the function P, if the semaphone value is greater than 0 then we decrement the semaphore value by one and the process continues; if the semaphore ale is less than or equal to 0, then the process waits for some other process to release the lock.

A process which wants to get out of the critical region calls the function V, increments the semaphore value by one and wakes up the processes that are sleeping.



  • When we want our semaphore to be accessed by only one process at a time then such semaphore is called a Mutex(mutual exclusion).


                                          Linux Semaphore Implementation

  • To use semaphore in our file we must include the header file <asm/semaphore.h> header file.
  • We create semaphore directly and then set it to some value.   //void sema_init(struct semaphore *sem, int val);
  • For mutex also we can do similar things-
  • DECLARE_MUTEX(name);    //initialises the mutex with 1
  • DECLARE_MUTEX_LOCKED(name); //initialises the mutex with 0
If the mutex has to be allocated dynamically then we use either of these two functions-



void init_MUTEX(struct semaphore *sem);
void init_MUTEX_LOCKED(struct semaphore *sem);

In linux terminology the P function is called down and V function is called up function.


Semaphore Methods 
sema_init(struct semaphore *, int) //Initializes the dynamically created semaphore to the given count
init_MUTEX(struct semaphore *) //Initializes the dynamically created semaphore with a count of one
init_MUTEX_LOCKED(struct semaphore *) //Initializes the dynamically created semaphore with a count of zero (so it is initially locked)


down_interruptible (struct semaphore *) //Tries to acquire the given semaphore and enter interruptible sleep if it is contended
down(struct semaphore *) //Tries to acquire the given semaphore and enter uninterruptible sleep if it is contended
down_trylock(struct semaphore *) //Tries to acquire the given semaphore and immediately return nonzero if it is contended
up(struct semaphore *) //Releases the given semaphore and wakes a waiting task, if any


  • For different critical sections we use different semaphore names.

                                                ReaderWriter Semaphores

  • rwsems is a special type of semaphore, we need to include <linux/rwsem.h>
  • It is used rarely but is useful.
  • rwsem allows either one writer or an unlimited number of readers to hold the semaphore
  • rwsem must be initialised with void init_rwsem(struct rw_semaphore *sem);
  • For read only access


void down_read(struct rw_semaphore *sem);
int down_read_trylock(struct rw_semaphore *sem);
void up_read(struct rw_semaphore *sem);

  • Similary for writers we have 
void down_write(struct rw_semaphore *sem);
int down_write_trylock(struct rw_semaphore *sem);
void up_write(struct rw_semaphore *sem);
void downgrade_write(struct rw_semaphore *sem);// used when we want writers to finish write fast and read to take major timeslice

  • By default writers are given more priority over readers.
  • Thats why we use rwsemaphore when write access is used rarely and that too for a short period of time.

                                                               Completions

  • This feature was added in kernel version 2.4.7
  • This is a lightweight mechanism with only one task, it allows one thread to tell other one that the task is done.
  • The headerfile to be used for this is <linux/completion.h>



  • A completion can be created with:

DECLARE_COMPLETION(my_completion);


  • Or, if we want the completion to be created and initialized dynamically:

struct completion my_completion;
/* ... */
init_completion(&my_completion);


  • We wait for the completion by calling:

void wait_for_completion(struct completion *c);
  • This function performs an uninterruptible wait. 
  • If our code calls wait_for_completion and nobody ever completes the task, the result will be an unkillable process.




  • The actual completion event may be signalled by calling one of the following:

void complete(struct completion *c);
void complete_all(struct completion *c);


  • The two functions behave differently if more than one thread is waiting for the same completion event. complete wakes up only one of the waiting threads while complete_all allows all of them to proceed.







  • If we use complete_all, we must reinitialize the completion structure before reusing it.
  •  The macro:INIT_COMPLETION(struct completion c); can be used to quickly perform this reinitialization.
    • Lets see an example of completion variable being used.

    DECLARE_COMPLETION(comp);
    ssize_t complete_read (struct file *filp, char __user *buf, size_t count, loff_t
    *pos)
    {
     printk(KERN_DEBUG "process %i (%s) going to sleep\n",
     current->pid, current->comm);
     wait_for_completion(&comp);
     printk(KERN_DEBUG "awoken %i (%s)\n", current->pid, current->comm);
     return 0; /* EOF */
    }
    ssize_t complete_write (struct file *filp, const char __user *buf, size_t count,
     loff_t *pos)
    {
     printk(KERN_DEBUG "process %i (%s) awakening the readers...\n",
     current->pid, current->comm);
     complete(&comp);
     return count; /* succeed, to avoid retrial */
    }


    • We can see that the read operation will wait if the write opeation is being performed(till it finishes).



                                                                           Spinlocks


    • Unlike semaphores, spinklocks can be used in the code that can't sleep eg. interrupt handlers
    • Spinlock offers better performance than semaphore but there are some constraints also.
    • A spinlock is a mutual exclusion device that has only 2 states; locked/unlocked.
    • In coding perspective this lock sets/unsets a bit.
    • If the lock is unavailable then the code goes into a tight loop where it repeatedly checks the lock until it is available.
    • Care must be taken for setting and unsetting the bit as many threads are in loop that may set.
    • Spinlocks are designed to be used on a multiprocessor system.
    • Single processor system running a preemeptive code also behaves as an SMP system from the point of view of concurrency.
    • If a nonpreemeptive code running on a processor takes a lock then it would run till eternity  as  no other thread would ever be able to obtain the CPU to release the lock. For this reason, spinlock operation on a uniprocessor system is designed to perform NOOP(nooperation) and at someplaces whe change the masking statuses of some IRQs.
    • To use the spinlock we must include the header file s <linux/spinlock.h>
    • This initialization can be done at compile time by using spinlock_t my_lock = SPIN_LOCK_UNLOCKED;
    • Or the initialization can be done  at runtime with void spin_lock_init(spinlock_t *lock);
    • Before entering a critical section our code must obtain the requisite lock with: 
    void spin_lock(spinlock_t *lock);
    • Since all spinlock waits are, by their nature, uninterruptible so, once we callspin_lock, we will spin until the lock becomes available to our code .
    • To release a lock that we have obtained we call void spin_unlock(spinlock_t *lock);
    • Lets us consider a practical scenario
    We have taken a spinlock in our driver code and while holding the lock our driver experiences a function which will change the context of code for eg. copy_to_user() or put the process to sleep or say any interrupt comes that throws our driver code out of the processor. Our driver code is still holding the lock and won't free it for some other task.that wants it. 
    • This situation is undesirable so we acquire spinlock where the code is atomic and it can't sleep.
    • For example we dont use spinlock for malloc as it can sleep.
    • The kernel code takes care of spin lock and preemption, whenever spinlock is called the kernel blocks the preemption(interrupts) on that processor
    • Even in case of uniprocessor system preemption is disabled to avoid the race condition.
    • We must take care that our process must holdthe spinlock for a small period of time, beacuse longer is the hold longer is the wait for other process which whishes to acquire the lock,(the process might be a high priority one).
    • There are four functions that can lock a spinlock
    void spin_lock(spinlock_t *lock);//Normal one
    void spin_lock_irqsave(spinlock_t *lock, unsigned long flags); //disables interrupts (on
    the local processor only) before taking the spinlock and the previous interrupt state is stored in flags
    void spin_lock_irq(spinlock_t *lock);//used when we are sure nothing else might have diabled the interrupts already, it also disables interrupt
    void spin_lock_bh(spinlock_t *lock);//disables the software interrupt before taking the lock

    • We must use the proper lock in proper situation- eg. hardware interrupt and software interrupt or normal case.
    • Just as we saw the 4 ways to acquire the spinlock we have 4 ways to release also
    void spin_unlock(spinlock_t *lock);
    void spin_unlock_irqrestore(spinlock_t *lock, unsigned long flags);
    void spin_unlock_irq(spinlock_t *lock);
    void spin_unlock_bh(spinlock_t *lock);

    • There is also a set of nonblocking spinlock operations:
    • int spin_trylock(spinlock_t *lock);
    • int spin_trylock_bh(spinlock_t *lock);
    • These functions return nonzero on success (the lockwas obtained), 0 otherwise.
    • There is no “try” version that disables interrupts.


                                                  Reader/Writer Spinlocks


    • This is analogous to what we had seen in case of semaphores, a single writer and any number of readers.
    • rwlock_t my_rwlock = RW_LOCK_UNLOCKED; /* Static way */
    • rwlock_t my_rwlock;
    • rwlock_init(&my_rwlock); /* Dynamic way */
    • For readers, the following functions are available:

    void read_lock(rwlock_t *lock);

    void read_lock_irqsave(rwlock_t *lock, unsigned long flags);

    void read_lock_irq(rwlock_t *lock);

    void read_lock_bh(rwlock_t *lock);
    void read_unlock(rwlock_t *lock);
    void read_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
    void read_unlock_irq(rwlock_t *lock);
    void read_unlock_bh(rwlock_t *lock);


    void write_lock(rwlock_t *lock);
    void write_lock_irqsave(rwlock_t *lock, unsigned long flags);
    void write_lock_irq(rwlock_t *lock);
    void write_lock_bh(rwlock_t *lock);
    int write_trylock(rwlock_t *lock);
    void write_unlock(rwlock_t *lock);
    void write_unlock_irqrestore(rwlock_t *lock, unsigned long flags);
    void write_unlock_irq(rwlock_t *lock);
    void write_unlock_bh(rwlock_t *lock);





                                                Alternatives to Locking ?


    There are situations in kernel programming where atomic access can be set up
    without the need for full locking.

    Atromic Operation- These operations are performed in a single machine cycle.
    An atomic_t holds an int value on all supported architectures. 

    Bit operations-

    The atomic_t type is good for performing integer arithmetic. It doesn’t workas well,

    however, when you need to manipulate individual bits in an atomic manner. For that

    purpose, instead, the kernel offers a set of functions that modify or test single bits

    atomically. Because the whole operation happens in a single step, no interrupt (or
    other processor) can interfere.

    Seqlocks-
    Used when the resource to be protected is small, simple and frequently accessed.Here readers are allowed the free access to the resource but they check for their collision with the writers. And if a colission is detected then they retry to read.


    For any clarifications or if you have any suggestion please post comments below.


    8 comments:

    1. Pretty section of content. I just stumbled upon your weblpog and in accession capital to assert that I acquire actually enjoyed account
      your blog posts. Any wayy I'll be subscribing to your
      feeds and even I achievement you access consistently rapidly.


      Revidw my weblog; project management app (marshalldrum.net)

      ReplyDelete
    2. Very nice post here and thanks for it .I always like and such a super contents of these post.Excellent and very cool idea and great content of different kinds of the valuable information's.
      advanced excel training in bangalore

      ReplyDelete
    3. I really like the dear information you offer in your articles. I’m able to bookmark your site and show the kids check out up here generally. Im fairly positive theyre likely to be informed a great deal of new stuff here than anyone
      Java training in Tambaram | Java training in Velachery

      Java training in Omr | Oracle training in Chennai

      ReplyDelete
    4. I found your blog while searching for the updates, I am happy to be here. Very useful content and also easily understandable providing.. Believe me I did wrote an post about tutorials for beginners with reference of your blog. 
      Data Science training in rajaji nagar | Data Science with Python training in chenni

      Data Science training in electronic city | Data Science training in USA

      Data science training in pune | Data science training in kalyan nagar

      ReplyDelete
    5. I would like to thank you for your nicely written post, its informative and your writing style encouraged me to read it till end. Thanks

      devops online training

      aws online training

      data science with python online training

      data science online training

      rpa online training

      ReplyDelete
    6. Why these people put the link here,they shouldn't do this.

      ReplyDelete
    7. How they different from pthread familiy method (ex, pthread_mutex, pthread_rwlock,pthread_spin_lock )? The latter is warpper of the formmer?

      ReplyDelete
    8. Took me time to read all the comments, but I really enjoyed the article. It proved to be Very helpful to me and I am sure to all the commenters here! It’s always nice when you can not only be informed, but also entertained! https://ufcstreaminglinks.website/

      ReplyDelete