17
Like 18 people like this. Be (/issue/137) From Issue #137 September 2005 (/issue/137) Kernel Korner Sleeping in the Kernel Jul 28, 2005 By Kedar Sovani (/user/801531) in The old sleep_on() function won't work reliabl\ in an age of SMP s\stems and h\perthreaded processors. Here's how to make a process sleep in a safe, cross platform wa\. In Linux kernel programming, there are numerous occasions when processes wait until something occurs or when sleeping processes need to be woken up to get some work done. There are different ways to achieve these things. All of the discussion in this article refers to kernel mode execution. A reference to a process means execution in kernel space in the context of that process. Some kernel code examples have been reformatted to fit this print format. Line numbers refer to lines in the original file. The schedule() Function In Linux, the ready-to-run processes are maintained on a run queue. A ready-to-run process has the state TASK_RUNNING. Once the timeslice of a running process is over, the Linux scheduler picks up another appropriate process from the run queue and allocates CPU power to that process. A process also voluntarily can relinquish the CPU. The schedule() function could be used by a process to indicate voluntarily to the scheduler that it can schedule some other process on the processor. Once the process is scheduled back again, execution begins from the point where the process had stopped²that is, execution begins from the call to the schedule() function. At times, processes want to wait until a certain event occurs, such as a device to initialise, I/O to complete or a timer to expire. In such a case, the process is said to sleep on that event. A process can go to sleep using the schedule() function. The following code puts the executing process to sleep: sleeping_task = current; set_current_state(TASK_INTERRUPTIBLE); schedule(); func1(); /* The rest of the code */ Now, let's take a look at what is happening in there. In the first statement, we store a reference to this process' task structure. current, which really is a macro, gives a pointer to the executing process' task_structure. set_current_state changes the state of the currently executing process from TASK_RUNNING to

Sleeping in the Kernel

Embed Size (px)

Citation preview

Page 1: Sleeping in the Kernel

Like 18peoplelikethis. Be

(/issue/137)

From Issue #137September 2005 (/issue/137)

Kernel Korner ­ Sleeping in the KernelJul 28, 2005 By Kedar Sovani (/user/801531) in

The old sleep_on() function won't work reliably in anage of SMP systems and hyperthreaded processors.Here's how to make a process sleep in a safe, cross­platform way.

In Linux kernel programming, there are numerousoccasions when processes wait until somethingoccurs or when sleeping processes need to bewoken up to get some work done. There aredifferent ways to achieve these things.

All of the discussion in this article refers to kernelmode execution. A reference to a process meansexecution in kernel space in the context of thatprocess.

Some kernel code examples have been reformatted to fit this print format. Linenumbers refer to lines in the original file.

The schedule() Function

In Linux, the ready-to-run processes are maintained on a run queue. A ready-to-runprocess has the state TASK_RUNNING. Once the timeslice of a running process isover, the Linux scheduler picks up another appropriate process from the run queueand allocates CPU power to that process.

A process also voluntarily can relinquish the CPU. The schedule() function could beused by a process to indicate voluntarily to the scheduler that it can schedule someother process on the processor.

Once the process is scheduled back again, execution begins from the point where theprocess had stopped—that is, execution begins from the call to the schedule()function.

At times, processes want to wait until a certain event occurs, such as a device toinitialise, I/O to complete or a timer to expire. In such a case, the process is said tosleep on that event. A process can go to sleep using the schedule() function. Thefollowing code puts the executing process to sleep:

sleeping_task = current;set_current_state(TASK_INTERRUPTIBLE);schedule();func1();/* The rest of the code */

Now, let's take a look at what is happening in there. In the first statement, we storea reference to this process' task structure. current, which really is a macro, gives apointer to the executing process' task_structure. set_current_state changes thestate of the currently executing process from TASK_RUNNING to

Page 2: Sleeping in the Kernel

TASK_INTERRUPTIBLE. In this case, as mentioned above, the schedule() functionsimply should schedule another process. But that happens only if the state of thetask is TASK_RUNNING. When the schedule() function is called with the state asTASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, an additional step isperformed: the currently executing process is moved off the run queue beforeanother process is scheduled. The effect of this is the executing process goes tosleep, as it no longer is on the run queue. Hence, it never is scheduled by thescheduler. And, that is how a process can sleep.

Now let's wake it up. Given a reference to a task structure, the process could bewoken up by calling:

wake_up_process(sleeping_task);

As you might have guessed, this sets the task state to TASK_RUNNING and puts thetask back on the run queue. Of course, the process runs only when the schedulerlooks at it the next time around.

So now you know the simplest way of sleeping and waking in the kernel.

Interruptible and Uninterruptible Sleep

A process can sleep in two different modes, interruptible and uninterruptible. In aninterruptible sleep, the process could be woken up for processing of signals. In anuninterruptible sleep, the process could not be woken up other than by issuing anexplicit wake_up. Interruptible sleep is the preferred way of sleeping, unless thereis a situation in which signals cannot be handled at all, such as device I/O.

Lost Wake-Up Problem

Almost always, processes go to sleep after checking some condition. The lost wake-up problem arises out of a race condition that occurs while a process goes toconditional sleep. It is a classic problem in operating systems.

Consider two processes, A and B. Process A is processing from a list, consumer,while the process B is adding to this list, producer. When the list is empty, process Asleeps. Process B wakes A up when it appends anything to the list. The code lookslike this:

Process A:1 spin_lock(&list_lock);2 if(list_empty(&list_head)) 3 spin_unlock(&list_lock);4 set_current_state(TASK_INTERRUPTIBLE);5 schedule();6 spin_lock(&list_lock);7 89 /* Rest of the code ... */10 spin_unlock(&list_lock);

Process B:100 spin_lock(&list_lock);101 list_add_tail(&list_head, new_node);102 spin_unlock(&list_lock);103 wake_up_process(processa_task);

There is one problem with this situation. It may happen that after process Aexecutes line 3 but before it executes line 4, process B is scheduled on anotherprocessor. In this timeslice, process B executes all its instructions, 100 through 103.Thus, it performs a wake-up on process A, which has not yet gone to sleep. Now,process A, wrongly assuming that it safely has performed the check for list_empty,

Page 3: Sleeping in the Kernel

sets the state to TASK_INTERRUPTIBLE and goes to sleep.

Thus, a wake up from process B is lost. This is known as the lost wake-up problem.Process A sleeps, even though there are nodes available on the list.

This problem could be avoided by restructuring the code for process A in thefollowing manner:

Process A:

1 set_current_state(TASK_INTERRUPTIBLE);2 spin_lock(&list_lock);3 if(list_empty(&list_head)) 4 spin_unlock(&list_lock);5 schedule();6 spin_lock(&list_lock);7 8 set_current_state(TASK_RUNNING);910 /* Rest of the code ... */11 spin_unlock(&list_lock);

This code avoids the lost wake-up problem. How? We have changed our currentstate to TASK_INTERRUPTIBLE, before we test the condition. So, what haschanged? The change is that whenever a wake_up_process is called for a processwhose state is TASK_INTERRUPTIBLE or TASK_UNINTERRUPTIBLE, and theprocess has not yet called schedule(), the state of the process is changed back toTASK_RUNNING.

Thus, in the above example, even if a wake-up is delivered by process B at any pointafter the check for list_empty is made, the state of A automatically is changed toTASK_RUNNING. Hence, the call to schedule() does not put process A to sleep; itmerely schedules it out for a while, as discussed earlier. Thus, the wake-up nolonger is lost.

Here is a code snippet of a real-life example from the Linux kernel (linux-2.6.11/kernel/sched.c: 4254):

4253 /* Wait for kthread_stop */4254 set_current_state(TASK_INTERRUPTIBLE);4255 while (!kthread_should_stop()) 4256 schedule();4257 set_current_state(TASK_INTERRUPTIBLE);4258 4259 __set_current_state(TASK_RUNNING);4260 return 0;

This code belongs to the migration_thread. The thread cannot exit until thekthread_should_stop() function returns 1. The thread sleeps while waiting for thefunction to return 0.

As can be seen from the code, the check for the kthread_should_stop condition ismade only after the state is TASK_INTERRUPTIBLE. Hence, the wake-up receivedafter the condition check but before the call to schedule() function is not lost.

______________________

Page 4: Sleeping in the Kernel

Comments

Comment viewing optionsThreaded list ­ expanded Date ­ newest first 50 comments per page Save settings

Select your preferred way to display the comments and click "Save settings" to activate your changes.

lost wakeup in wait_event_interruptible? (/article/8144#comment­355534)Submitted by Anonymous (not verified) on Thu, 09/02/2010 ­ 13:11.

Hello,

I see a 'lost wakeup' even in latest wait_event_interruptible. I know race condition got solvedafter sleep_on. So, I just want to know how/why this works w/o any problem. I saw a similarquestion but no answers (code snippet from 2.6.27 below)

thanks,Shankar.

­­­­

a) Wakeup occurs immediately before the call to prepare_to_wait()

b) Call to prepare_to_wait() sets process state to TASK_INTERRUPTIBLE

c) While the prepare_to_wait exits but just before condition is evaluated again, a h/w interrupt comes in!

d) The process is now still marked as TASK_INTERRUPTIBLE and will therefore not be

re­scheduled and will never execute the call to finish_wait() ­ so it will sleep

unless/until an interruption comes along...

­­­­­

#define __wait_event_interruptible(wq, condition, ret) \do \DEFINE_WAIT(__wait); \\for ( ; ; ) \prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE); \if (condition) \break; \if (!signal_pending(current)) \schedule(); \continue; \ \ret = ­ERESTARTSYS; \break; \ \finish_wait(&wq, &__wait); \ while (0)

#define wait_event_interruptible(wq, condition) \( \int __ret = 0; \if (!(condition)) \__wait_event_interruptible(wq, condition, __ret); \__ret; \)

1person

Page 5: Sleeping in the Kernel

wait_event_interruptible ­ retuns error (ERESTARTSYS) (/article/8144#comment­350531)Submitted by Anonymous on Thu, 04/08/2010 ­ 07:13.

Hi,

I am using wait_for_interruptable inread function and the isr wakes it up using wake_up_interruptable.Some times wait_for_interruptable returns error ERESTARTSYS.

The application usage scenario is as follows: I have a process which has lots of threads. And one threadhas read() function. Sometimes wake_up_interruptable() function return error and the full user­application iskilled. But the system contines to run with out any issues.

There are two possibilities this problem can occur:1) There are many threads in my process, some thread has caused some problem which results in killing theprocess. So the process sends signal to the thread which is doing read and the read thread which is waitingon wake_up_interruptable() comes out with error. One more observation here is that the "release" functionof my driver is called when this happens.2) Second possibility is that ­ Some thing wrong is happening in the driver and it is giving the error forwait_for_interruptable() and this inturn kills the userspace process. I am not very sure about this. But canthis kind of thing happen?

please provide some inputs on this ..

Waitqueues in bottom halve context (/article/8144#comment­347701)Submitted by Anonymous (not verified) on Thu, 01/21/2010 ­ 00:44.

Can we use wait queue in bottom halve context?For example, we have LOC in the following sourceKernel Version: Linux­2.6.18.5Path: \net\xfrm\xfrm_policy.cLn: 919Which calls schedule() function.This function is called when packet is received and looks forpolicy in softirq context.Please correct me if i misunderstood.Thanks in advance

Re: Waitqueues in bottom halve context (/article/8144#comment­347703)Submitted by Sathish Kumar (not verified) on Thu, 01/21/2010 ­ 01:01.

The function xfrm_lookup() is called from both user contextand kernel(softirq) context.The process is put to sleep when arguemnt:flag of the abovefunction is ­EAGAIN.User Context:If this function is called from user context it may have ­EAGAINKernel Context:If this function is called from kernel context (softirq), it will be NULL.So the softirq context process will not be put to wait queue.

Thanks,Sathish Kumar

misprint (/article/8144#comment­333198)Submitted by Anonymous (not verified) on Wed, 02/11/2009 ­ 10:03.

In the line:"This call causes the smbiod to sleep only if the DATA_READY bit is set","set" should read "clear", and I think this is a misprint here.

SIGKILL arrival (/article/8144#comment­267310)Submitted by mus_cular_dude (not verified) on Mon, 07/09/2007 ­ 15:38.

how will wait_event_interruptible will behave if SIGKILL happens to the user space processthat is sleeping using this function? My system gives kernel panic when i do 'kill ­SIGKILLpid'. is it because of the reason that im sleeping in kernel and i delivered SIGKILL?

In the schedule function (/article/8144#comment­235660)Submitted by Vinay Raghavan (not verified) on Mon, 04/16/2007 ­ 18:16.

In the schedule function section, you have mentioned that a way to wake up a process thathas just called the schedule function is to call wake_up_process(sleeping_task);

Page 6: Sleeping in the Kernel

(/user/801531)

(/user/4357)

(/user/801531)

But how does another process have a reference to the task structure of a process that hasjust called schedule.

In the code you have shown the example to get a reference to the task structure. But that isbeing done in the process that is calling schedule.

Please elaborate.

Thanks,Vinay

store it somewhere (/article/8144#comment­246730)Submitted by Kedar (not verified) on Mon, 05/21/2007 ­ 01:32.

It is upto you where/how you want to store that reference to the sleepingprocess's task struct. You may embed it in the appropriate data structures in yourcode.

Waking up a sleeping process from user space (/article/8144#comment­212372)Submitted by Anonymous (not verified) on Fri, 01/19/2007 ­ 12:56.

What is a good way to wake up a process that is asleep in kernel space from a process thatis in user space?

Perhaps a system call which invokes 'wake_up_process'?

The 'kill' system call doesn't seem to work...

schedule_timeout_,uninterruptible() (/article/8144#comment­124778)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 01:11.

A patch that includes two new variants of schedule_timeout() has been includedin the ­mm kernel tree.

Almost all the calls to schedule_timeout(), set the state of the executing task toTASK_INTERRUPTIBLE / TASK_UNINTERRUPTIBLE. This patch embeds such functionality in thefollowing calls,

schedule_timeout_interruptible() and,schedule_timeout_uninterruptible()

Error in code samples (/article/8144#comment­124557)Submitted by rjbell4 (/user/4357) on Thu, 08/25/2005 ­ 19:09.

There is an error in the code samples. wait_event() and wait_event_interuptible()should not be passed the address of my_event, but my_event itself. That isbecause they are macros, and their implementations will wind up using theaddress­of operator (&) to take the address of the parameter they are passed.

Re : Error in code samples (/article/8144#comment­124777)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 00:59.

Yes, thanks for that correction. In section "Wait Queues", theillustrations should be like this :

1. wait_event(my_event, (event_present == 1) );2. wait_event_interruptible(my_event, (event_present == 1) );

That is because they are (/article/8144#comment­329305)Submitted by Anonymous (not verified) on Thu, 11/27/2008 ­ 08:44.

That is because they are macros, and their implementations will windup using the address­of operator (&) to take the address of theparameter they are passed.mırç mırç (http://www.mirc34.net/) Chat chat(http://www.eschat.net/) türkçe mirc türkçe mirc (http://www.mircturkey.org/)

Page 7: Sleeping in the Kernel

Like 18peoplelikethis. Be

(/issue/137)

From Issue #137September 2005 (/issue/137)

Kernel Korner ­ Sleeping in the KernelJul 28, 2005 By Kedar Sovani (/user/801531) in

The old sleep_on() function won't work reliably in anage of SMP systems and hyperthreaded processors.Here's how to make a process sleep in a safe, cross­platform way.

Wait Queues

Wait queues are a higher-level mechanism used toput processes to sleep and wake them up. In mostinstances, you use wait queues. They are neededwhen more than one process wants to sleep on theoccurrence of one or more than one event.

A wait queue for an event is a list of nodes. Eachnode points to a process waiting for that event. Anindividual node in this list is called a wait queueentry. Processes that want to sleep while the event occurs add themselves to this listbefore going to sleep. On the occurrence of the event, one or more processes on thelist are woken up. Upon waking up, the processes remove themselves from the list.

A wait queue could be defined and initialised in the following manner:

wait_queue_head_t my_event;init_waitqueue_head(&my_event);

The same effect could be achieved by using this macro:

DECLARE_WAIT_QUEUE_HEAD(my_event);

Any process that wants to wait on my_event could use either of the followingoptions:

1. wait_event(&my_event, (event_present == 1) );

2. wait_event_interruptible(&my_event, (event_present == 1) );

The interruptible version 2 of the options above puts the process to an interruptiblesleep, whereas the other (option 1) puts the process into an uninterruptible sleep.

In most instances, a process goes to sleep only after checking some condition for theavailability of the resource. To facilitate that, both these functions take anexpression as the second argument. The process goes to sleep only if the expressionevaluates to false. Care is taken to avoid the lost wake-up problem.

Old kernel versions used the functions sleep_on() and interruptible_sleep_on(), butthose two functions can introduce bad race conditions and should not be used.

Let's now take a look at some of the calls for waking up process sleeping on a wait

Page 8: Sleeping in the Kernel

queue:

1. wake_up(&my_event);: wakes up only one process from the wait queue.

2. wake_up_all(&my_event);: wakes up all the processes on the wait queue.

3. wake_up_interruptible(&my_event);: wakes up only one process from thewait queue that is in interruptible sleep.

Wait Queues: Putting It Together

Let us look at a real-life example of how wait queues are used. smbiod is the I/Othread that performs I/O operations for the SMB filesystem. Here is a code snippetfor the smbiod thread (linux-2.6.11/fs/smbfs/smbiod.c: 291):

291 static int smbiod(void *unused)292 293 daemonize("smbiod");294295 allow_signal(SIGKILL);296297 VERBOSE("SMB Kernel thread starting " "(%d)...\n", current­>pid);298299 for (;;) 300 struct smb_sb_info *server;301 struct list_head *pos, *n;302303 /* FIXME: Use poll? */304 wait_event_interruptible(smbiod_wait,305 test_bit(SMBIOD_DATA_READY, &smbiod_flags));...... /* Some processing */312313 clear_bit(SMBIOD_DATA_READY, &smbiod_flags);314... /* Code to perform the requested I/O */......337 338339 VERBOSE("SMB Kernel thread exiting (%d)...\n", current­>pid);340 module_put_and_exit(0);341 342

As is clear from the code, smbiod is a thread that runs in a continuous loop as itprocesses I/O requests. When there are no I/O requests to process, the thread goesto sleep on the wait queue smbiod_wait. This is achieved by callingwait_event_interruptible (line 304). This call causes the smbiod to sleep only if theDATA_READY bit is set. As mentioned earlier, wait_event_interruptible takes careto avoid the lost wake-up problem.

Now, when a process wants to get some I/O done, it sets the DATA_READY bit inthe smbiod_flags and wakes up the smbiod thread to perform I/O. This can be seenin the following code snippet (linux-2.6.11/fs/smbfs/smbiod.c: 57):

Page 9: Sleeping in the Kernel

57 void smbiod_wake_up(void)58 59 if (smbiod_state == SMBIOD_DEAD)60 return;61 set_bit(SMBIOD_DATA_READY, &smbiod_flags);62 wake_up_interruptible(&smbiod_wait);63

wake_up_interruptible wakes up one process that was sleeping on the smbiod_waitwaitqueue. The function smb_add_request (linux-2.6.11/fs/smbfs/request.c: 279)calls the smbiod_wake_up function when it adds new requests for processing.

______________________

Comments

Comment viewing optionsThreaded list ­ expanded Date ­ newest first 50 comments per page Save settings

Select your preferred way to display the comments and click "Save settings" to activate your changes.

lost wakeup in wait_event_interruptible? (/article/8144#comment­355534)Submitted by Anonymous (not verified) on Thu, 09/02/2010 ­ 13:11.

Hello,

I see a 'lost wakeup' even in latest wait_event_interruptible. I know race condition got solvedafter sleep_on. So, I just want to know how/why this works w/o any problem. I saw a similarquestion but no answers (code snippet from 2.6.27 below)

thanks,Shankar.

­­­­

a) Wakeup occurs immediately before the call to prepare_to_wait()

b) Call to prepare_to_wait() sets process state to TASK_INTERRUPTIBLE

c) While the prepare_to_wait exits but just before condition is evaluated again, a h/w interrupt comes in!

d) The process is now still marked as TASK_INTERRUPTIBLE and will therefore not be

re­scheduled and will never execute the call to finish_wait() ­ so it will sleep

unless/until an interruption comes along...

­­­­­

#define __wait_event_interruptible(wq, condition, ret) \

1person

Page 10: Sleeping in the Kernel

do \DEFINE_WAIT(__wait); \\for ( ; ; ) \prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE); \if (condition) \break; \if (!signal_pending(current)) \schedule(); \continue; \ \ret = ­ERESTARTSYS; \break; \ \finish_wait(&wq, &__wait); \ while (0)

#define wait_event_interruptible(wq, condition) \( \int __ret = 0; \if (!(condition)) \__wait_event_interruptible(wq, condition, __ret); \__ret; \)

wait_event_interruptible ­ retuns error (ERESTARTSYS) (/article/8144#comment­350531)Submitted by Anonymous on Thu, 04/08/2010 ­ 07:13.

Hi,

I am using wait_for_interruptable inread function and the isr wakes it up using wake_up_interruptable.Some times wait_for_interruptable returns error ERESTARTSYS.

The application usage scenario is as follows: I have a process which has lots of threads. And one threadhas read() function. Sometimes wake_up_interruptable() function return error and the full user­application iskilled. But the system contines to run with out any issues.

There are two possibilities this problem can occur:1) There are many threads in my process, some thread has caused some problem which results in killing theprocess. So the process sends signal to the thread which is doing read and the read thread which is waitingon wake_up_interruptable() comes out with error. One more observation here is that the "release" functionof my driver is called when this happens.2) Second possibility is that ­ Some thing wrong is happening in the driver and it is giving the error forwait_for_interruptable() and this inturn kills the userspace process. I am not very sure about this. But canthis kind of thing happen?

please provide some inputs on this ..

Waitqueues in bottom halve context (/article/8144#comment­347701)Submitted by Anonymous (not verified) on Thu, 01/21/2010 ­ 00:44.

Can we use wait queue in bottom halve context?For example, we have LOC in the following sourceKernel Version: Linux­2.6.18.5Path: \net\xfrm\xfrm_policy.cLn: 919Which calls schedule() function.This function is called when packet is received and looks forpolicy in softirq context.Please correct me if i misunderstood.Thanks in advance

Re: Waitqueues in bottom halve context (/article/8144#comment­347703)Submitted by Sathish Kumar (not verified) on Thu, 01/21/2010 ­ 01:01.

The function xfrm_lookup() is called from both user contextand kernel(softirq) context.The process is put to sleep when arguemnt:flag of the abovefunction is ­EAGAIN.User Context:If this function is called from user context it may have ­EAGAINKernel Context:If this function is called from kernel context (softirq), it will be NULL.So the softirq context process will not be put to wait queue.

Page 11: Sleeping in the Kernel

(/user/801531)

(/user/4357)

Thanks,Sathish Kumar

misprint (/article/8144#comment­333198)Submitted by Anonymous (not verified) on Wed, 02/11/2009 ­ 10:03.

In the line:"This call causes the smbiod to sleep only if the DATA_READY bit is set","set" should read "clear", and I think this is a misprint here.

SIGKILL arrival (/article/8144#comment­267310)Submitted by mus_cular_dude (not verified) on Mon, 07/09/2007 ­ 15:38.

how will wait_event_interruptible will behave if SIGKILL happens to the user space processthat is sleeping using this function? My system gives kernel panic when i do 'kill ­SIGKILLpid'. is it because of the reason that im sleeping in kernel and i delivered SIGKILL?

In the schedule function (/article/8144#comment­235660)Submitted by Vinay Raghavan (not verified) on Mon, 04/16/2007 ­ 18:16.

In the schedule function section, you have mentioned that a way to wake up a process thathas just called the schedule function is to call wake_up_process(sleeping_task);

But how does another process have a reference to the task structure of a process that hasjust called schedule.

In the code you have shown the example to get a reference to the task structure. But that is being done inthe process that is calling schedule.

Please elaborate.

Thanks,Vinay

store it somewhere (/article/8144#comment­246730)Submitted by Kedar (not verified) on Mon, 05/21/2007 ­ 01:32.

It is upto you where/how you want to store that reference to the sleepingprocess's task struct. You may embed it in the appropriate data structures in yourcode.

Waking up a sleeping process from user space (/article/8144#comment­212372)Submitted by Anonymous (not verified) on Fri, 01/19/2007 ­ 12:56.

What is a good way to wake up a process that is asleep in kernel space from a process thatis in user space?

Perhaps a system call which invokes 'wake_up_process'?

The 'kill' system call doesn't seem to work...

schedule_timeout_,uninterruptible() (/article/8144#comment­124778)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 01:11.

A patch that includes two new variants of schedule_timeout() has been includedin the ­mm kernel tree.

Almost all the calls to schedule_timeout(), set the state of the executing task toTASK_INTERRUPTIBLE / TASK_UNINTERRUPTIBLE. This patch embeds such functionality in thefollowing calls,

schedule_timeout_interruptible() and,schedule_timeout_uninterruptible()

Error in code samples (/article/8144#comment­124557)Submitted by rjbell4 (/user/4357) on Thu, 08/25/2005 ­ 19:09.

There is an error in the code samples. wait_event() and wait_event_interuptible()should not be passed the address of my_event, but my_event itself. That isbecause they are macros, and their implementations will wind up using theaddress­of operator (&) to take the address of the parameter they are passed.

Page 12: Sleeping in the Kernel

(/user/801531)

Re : Error in code samples (/article/8144#comment­124777)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 00:59.

Yes, thanks for that correction. In section "Wait Queues", theillustrations should be like this :

1. wait_event(my_event, (event_present == 1) );2. wait_event_interruptible(my_event, (event_present == 1) );

That is because they are (/article/8144#comment­329305)Submitted by Anonymous (not verified) on Thu, 11/27/2008 ­ 08:44.

That is because they are macros, and their implementations will windup using the address­of operator (&) to take the address of theparameter they are passed.mırç mırç (http://www.mirc34.net/) Chat chat(http://www.eschat.net/) türkçe mirc türkçe mirc (http://www.mircturkey.org/)

Page 13: Sleeping in the Kernel

Like 18peoplelikethis. Be

(/issue/137)

From Issue #137September 2005 (/issue/137)

Kernel Korner ­ Sleeping in the KernelJul 28, 2005 By Kedar Sovani (/user/801531) in

The old sleep_on() function won't work reliably in anage of SMP systems and hyperthreaded processors.Here's how to make a process sleep in a safe, cross­platform way.

Thundering Herd Problem

Another classical operating system problem arisesdue to the use of the wake_up_all function. Let usconsider a scenario in which a set of processes aresleeping on a wait queue, wanting to acquire alock.

Once the process that has acquired the lock isdone with it, it releases the lock and wakes up allthe processes sleeping on the wait queue. All theprocesses try to grab the lock. Eventually, only one of these acquires the lock andthe rest go back to sleep.

This behavior is not good for performance. If we already know that only one processis going to resume while the rest of the processes go back to sleep again, why wakethem up in the first place? It consumes valuable CPU cycles and incurs context-switching overheads. This problem is called the thundering herd problem. That iswhy using the wake_up_all function should be done carefully, only when you knowthat it is required. Otherwise, go ahead and use the wake_up function that wakes uponly one process at a time.

So, when would the wake_up_all function be used? It is used in scenarios whenprocesses want to take a shared lock on something. For example, processes waitingto read data on a page could all be woken up at the same moment.

Time-Bound Sleep

You frequently may want to delay the execution of your process for a given amountof time. It may be required to allow the hardware to catch up or to carry out anactivity after specified time intervals, such as polling a device, flushing data to diskor retransmitting a network request. This can be achieved by the functionschedule_timeout(timeout), a variant of schedule(). This function puts the processto sleep until timeout jiffies have elapsed. jiffies is a kernel variable that isincremented for every timer interrupt.

As with schedule(), the state of the process has to be changed toTASK_INTERRUPTIBLE/TASK_UNINTERRUPTIBLE before calling this function.If the process is woken up earlier than timeout jiffies have elapsed, the number ofjiffies left is returned; otherwise, zero is returned.

Let us take a look at a real-life example (linux-2.6.11/arch/i386/kernel/apm.c:1415):

Page 14: Sleeping in the Kernel

1415 set_current_state(TASK_INTERRUPTIBLE);1416 for (;;) 1417 schedule_timeout(APM_CHECK_TIMEOUT);1418 if (exit_kapmd)1419 break;1421 * Ok, check all events, check for idle.... * (and mark us sleeping so as not to.... * count towards the load average)..1423 */1424 set_current_state(TASK_INTERRUPTIBLE);1425 apm_event_handler();1426

This code belongs to the APM thread. The thread polls the APM BIOS for events atintervals of APM_CHECK_TIMEOUT jiffies. As can be seen from the code, thethread calls schedule_timeout() to sleep for the given duration of time, after whichit calls apm_event_handler() to process any events.

You also may use a more convenient API, with which you can specify time inmilliseconds and seconds:

1. msleep(time_in_msec);

2. msleep_interruptible(time_in_msec);

3. ssleep(time_in_sec);

msleep(time_in_msec); and msleep_interruptible(time_in_msec); accept the timeto sleep in milliseconds, while ssleep(time_in_sec); accepts the time to sleep inseconds. These higher-level routines internally convert the time into jiffies,appropriately change the state of the process and call schedule_timeout(), thusmaking the process sleep.

I hope that you now have a basic understanding of how processes safely can sleepand wake up in the kernel. To understand the internal working of wait queues andadvanced uses, look at the implementations of init_waitqueue_head, as well asvariants of wait_event and wake_up.

Acknowledgement

Greg Kroah-Hartman reviewed a draft of this article and contributed valuablesuggestions.

Kedar Sovani (www.geocities.com/kedarsovani(http://www.geocities.com/kedarsovani) ) works for Kernel Corporation as a kerneldeveloper. His areas of interest include security, filesystems and distributedsystems.

______________________

Page 15: Sleeping in the Kernel

Comments

Comment viewing optionsThreaded list ­ expanded Date ­ newest first 50 comments per page Save settings

Select your preferred way to display the comments and click "Save settings" to activate your changes.

lost wakeup in wait_event_interruptible? (/article/8144#comment­355534)Submitted by Anonymous (not verified) on Thu, 09/02/2010 ­ 13:11.

Hello,

I see a 'lost wakeup' even in latest wait_event_interruptible. I know race condition got solvedafter sleep_on. So, I just want to know how/why this works w/o any problem. I saw a similarquestion but no answers (code snippet from 2.6.27 below)

thanks,Shankar.

­­­­

a) Wakeup occurs immediately before the call to prepare_to_wait()

b) Call to prepare_to_wait() sets process state to TASK_INTERRUPTIBLE

c) While the prepare_to_wait exits but just before condition is evaluated again, a h/w interrupt comes in!

d) The process is now still marked as TASK_INTERRUPTIBLE and will therefore not be

re­scheduled and will never execute the call to finish_wait() ­ so it will sleep

unless/until an interruption comes along...

­­­­­

#define __wait_event_interruptible(wq, condition, ret) \do \DEFINE_WAIT(__wait); \\for ( ; ; ) \prepare_to_wait(&wq, &__wait, TASK_INTERRUPTIBLE); \if (condition) \break; \if (!signal_pending(current)) \schedule(); \continue; \ \ret = ­ERESTARTSYS; \break; \ \finish_wait(&wq, &__wait); \ while (0)

#define wait_event_interruptible(wq, condition) \( \int __ret = 0; \if (!(condition)) \__wait_event_interruptible(wq, condition, __ret); \__ret; \)

wait_event_interruptible ­ retuns error (ERESTARTSYS) (/article/8144#comment­350531)Submitted by Anonymous on Thu, 04/08/2010 ­ 07:13.

Hi,

I am using wait_for_interruptable inread function and the isr wakes it up using wake_up_interruptable.Some times wait_for_interruptable returns error ERESTARTSYS.

The application usage scenario is as follows: I have a process which has lots of threads. And one thread

www.nimsoft.com/network­monitoring Ads by Google

Page 16: Sleeping in the Kernel

has read() function. Sometimes wake_up_interruptable() function return error and the full user­application iskilled. But the system contines to run with out any issues.

There are two possibilities this problem can occur:1) There are many threads in my process, some thread has caused some problem which results in killing theprocess. So the process sends signal to the thread which is doing read and the read thread which is waitingon wake_up_interruptable() comes out with error. One more observation here is that the "release" functionof my driver is called when this happens.2) Second possibility is that ­ Some thing wrong is happening in the driver and it is giving the error forwait_for_interruptable() and this inturn kills the userspace process. I am not very sure about this. But canthis kind of thing happen?

please provide some inputs on this ..

Waitqueues in bottom halve context (/article/8144#comment­347701)Submitted by Anonymous (not verified) on Thu, 01/21/2010 ­ 00:44.

Can we use wait queue in bottom halve context?For example, we have LOC in the following sourceKernel Version: Linux­2.6.18.5Path: \net\xfrm\xfrm_policy.cLn: 919Which calls schedule() function.This function is called when packet is received and looks forpolicy in softirq context.Please correct me if i misunderstood.Thanks in advance

Re: Waitqueues in bottom halve context (/article/8144#comment­347703)Submitted by Sathish Kumar (not verified) on Thu, 01/21/2010 ­ 01:01.

The function xfrm_lookup() is called from both user contextand kernel(softirq) context.The process is put to sleep when arguemnt:flag of the abovefunction is ­EAGAIN.User Context:If this function is called from user context it may have ­EAGAINKernel Context:If this function is called from kernel context (softirq), it will be NULL.So the softirq context process will not be put to wait queue.

Thanks,Sathish Kumar

misprint (/article/8144#comment­333198)Submitted by Anonymous (not verified) on Wed, 02/11/2009 ­ 10:03.

In the line:"This call causes the smbiod to sleep only if the DATA_READY bit is set","set" should read "clear", and I think this is a misprint here.

SIGKILL arrival (/article/8144#comment­267310)Submitted by mus_cular_dude (not verified) on Mon, 07/09/2007 ­ 15:38.

how will wait_event_interruptible will behave if SIGKILL happens to the user space processthat is sleeping using this function? My system gives kernel panic when i do 'kill ­SIGKILLpid'. is it because of the reason that im sleeping in kernel and i delivered SIGKILL?

In the schedule function (/article/8144#comment­235660)Submitted by Vinay Raghavan (not verified) on Mon, 04/16/2007 ­ 18:16.

In the schedule function section, you have mentioned that a way to wake up a process thathas just called the schedule function is to call wake_up_process(sleeping_task);

But how does another process have a reference to the task structure of a process that hasjust called schedule.

In the code you have shown the example to get a reference to the task structure. But that is being done inthe process that is calling schedule.

Please elaborate.

Thanks,Vinay

Page 17: Sleeping in the Kernel

(/user/801531)

(/user/4357)

(/user/801531)

store it somewhere (/article/8144#comment­246730)Submitted by Kedar (not verified) on Mon, 05/21/2007 ­ 01:32.

It is upto you where/how you want to store that reference to the sleepingprocess's task struct. You may embed it in the appropriate data structures in yourcode.

Waking up a sleeping process from user space (/article/8144#comment­212372)Submitted by Anonymous (not verified) on Fri, 01/19/2007 ­ 12:56.

What is a good way to wake up a process that is asleep in kernel space from a process thatis in user space?

Perhaps a system call which invokes 'wake_up_process'?

The 'kill' system call doesn't seem to work...

schedule_timeout_,uninterruptible() (/article/8144#comment­124778)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 01:11.

A patch that includes two new variants of schedule_timeout() has been includedin the ­mm kernel tree.

Almost all the calls to schedule_timeout(), set the state of the executing task toTASK_INTERRUPTIBLE / TASK_UNINTERRUPTIBLE. This patch embeds such functionality in thefollowing calls,

schedule_timeout_interruptible() and,schedule_timeout_uninterruptible()

Error in code samples (/article/8144#comment­124557)Submitted by rjbell4 (/user/4357) on Thu, 08/25/2005 ­ 19:09.

There is an error in the code samples. wait_event() and wait_event_interuptible()should not be passed the address of my_event, but my_event itself. That isbecause they are macros, and their implementations will wind up using theaddress­of operator (&) to take the address of the parameter they are passed.

Re : Error in code samples (/article/8144#comment­124777)Submitted by Kedar Sovani (/user/801531) on Sat, 09/03/2005 ­ 00:59.

Yes, thanks for that correction. In section "Wait Queues", theillustrations should be like this :

1. wait_event(my_event, (event_present == 1) );2. wait_event_interruptible(my_event, (event_present == 1) );

That is because they are (/article/8144#comment­329305)Submitted by Anonymous (not verified) on Thu, 11/27/2008 ­ 08:44.

That is because they are macros, and their implementations will windup using the address­of operator (&) to take the address of theparameter they are passed.mırç mırç (http://www.mirc34.net/) Chat chat(http://www.eschat.net/) türkçe mirc türkçe mirc (http://www.mircturkey.org/)