LINUX KERNEL INTERNALS: Interrupts

INTERRUPTS

The kernel is responsible for servicing requests from hardware.
The CPU must process each request that a hardware device makes.
Because hardware devices run much slower than the CPU, they cannot exchange data or requests with the CPU synchronously.
There are two ways in which the CPU can find out about a request from a hardware device:

1. Polling
2. Interrupt

In polling, the CPU repeatedly checks every hardware device to see whether a request is available.
With interrupts, the CPU attends to a hardware device only when the device asks for service.
Polling is expensive because the CPU spends time checking devices that have nothing to report.
Interrupts are the better approach, because a device signals the CPU only when it has a request to be serviced.
Different devices are given different interrupt values, called IRQ (interrupt request) lines.
For example, on a classic PC, IRQ 0 is the timer interrupt and IRQ 1 is the keyboard interrupt.
An interrupt is physically produced by electronic signals originating from hardware devices and directed into input pins on an interrupt controller.
Some interrupt numbers are static and some interrupts are dynamically assigned.
Be it static or dynamic, the kernel must know which interrupt number is associated with which hardware.
The interrupt controller, in turn, sends a signal to the processor. The processor detects this signal and interrupts its current execution to handle the interrupt.
The processor can then notify the operating system that an interrupt has occurred, and the operating system can handle the interrupt appropriately.
Interrupt handlers in Linux need not be reentrant. When a given interrupt handler is executing, the corresponding interrupt line is masked out on all processors, preventing another interrupt on the same line from being received. Normally all other interrupts are enabled, so other interrupts are serviced, but the current line is always disabled.

Comparison between interrupts and exceptions

Exceptions occur synchronously with respect to the processor clock, while interrupts occur asynchronously.
That is why exceptions are often called synchronous interrupts.
Exceptions are produced by the processor while executing instructions either in response to a programming error (for example, divide by zero) or abnormal conditions that must be handled by the kernel (for example, a page fault).
Many processor architectures handle exceptions in a similar manner to interrupts, therefore, the kernel infrastructure for handling the two is similar.
Exceptions are of two types: traps and software interrupts.
A trap is a kind of exception whose main purpose is debugging (for example, notifying the debugger that an instruction has been reached); it can also occur during abnormal conditions.
A software interrupt occurs at the request of the programmer, for example to make a system call.

Interrupt Handler

Interrupt handlers are the C functions that get executed when an interrupt arrives.
Each interrupt is associated with a particular interrupt handler.
An interrupt handler is also known as an interrupt service routine (ISR).
Since interrupts can arrive at any time, interrupt handlers have to be short and quick.
At a minimum, the interrupt handler has to acknowledge the hardware; the rest of the work can be done at a later time.

Top Halves Versus Bottom Halves

An interrupt handler has two goals: 1. execute quickly and 2. perform a large amount of work.
Because of these conflicting goals, the processing of interrupts is split into two parts, or halves.
The interrupt handler is the top half.
It is run immediately upon receipt of the interrupt and performs only the work that is time critical, such as acknowledging receipt of the interrupt or resetting the hardware.
Work that can be performed later is delayed until the bottom half.
The bottom half runs in the future, at a more convenient time, with all interrupts enabled.
Let us consider a case where we need to collect data from a data card and then process it.
The most important job is to copy the data from the card into memory and free the card for incoming data; this is done in the top half.
The remaining work, processing the data, is done in the bottom half.
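The data card example can be modeled in plain user-space C. This is only a sketch of the split, not kernel code: the ring buffer, its size, and the function names top_half() and bottom_half() are all made up for illustration, and "processing" is reduced to summing the bytes.

```c
#include <stdint.h>

#define RING_SIZE 8  /* hypothetical capacity; must be a power of two */

struct ring {
    uint8_t  data[RING_SIZE];
    unsigned head;  /* next slot the top half writes */
    unsigned tail;  /* next slot the bottom half reads */
};

/* Top half: copy one byte from the (simulated) card and return at once.
 * Returns 0 on success, -1 if the buffer is full and the byte is dropped. */
static int top_half(struct ring *r, uint8_t card_byte)
{
    if (r->head - r->tail == RING_SIZE)
        return -1;
    r->data[r->head % RING_SIZE] = card_byte;
    r->head++;
    return 0;
}

/* Bottom half: runs later and does the slow processing (here, a sum). */
static unsigned long bottom_half(struct ring *r)
{
    unsigned long sum = 0;

    while (r->tail != r->head) {
        sum += r->data[r->tail % RING_SIZE];
        r->tail++;
    }
    return sum;
}
```

The top half does a fixed, small amount of work per interrupt, so the card is freed quickly; everything whose cost depends on the data is left to the bottom half.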

Registering an Interrupt Handler

Interrupt handlers are the responsibility of the driver managing the hardware. Each device has one associated driver and, if that device uses interrupts (and most do), then that driver registers one interrupt handler.
Drivers can register an interrupt handler and enable a given interrupt line for handling via the function

/* request_irq: allocate a given interrupt line */
int request_irq(unsigned int irq,
                irqreturn_t (*handler)(int, void *, struct pt_regs *),
                unsigned long irqflags,
                const char *devname,
                void *dev_id)

The first parameter, irq, specifies the interrupt number to allocate. For some devices, for example legacy PC devices such as the system timer or keyboard, this value is typically hard-coded. For most other devices, it is probed or otherwise determined programmatically and dynamically.
The second parameter, handler, is a function pointer to the actual interrupt handler that services this interrupt. This function is invoked whenever the operating system receives the interrupt. Note the specific prototype of the handler function: It takes three parameters and has a return value of irqreturn_t.
The third parameter, irqflags, might be either zero or a bit mask of one or more of the following flags:

SA_INTERRUPT This flag specifies that the given interrupt handler is a fast interrupt handler. Fast interrupt handlers run with all interrupts disabled on the local processor. This enables a fast handler to complete quickly, without possible interruption from other interrupts. By default (without this flag), all interrupts are enabled except the interrupt lines of any running handlers, which are masked out on all processors. Sans the timer interrupt, most interrupts do not want to enable this flag.
SA_SAMPLE_RANDOM This flag specifies that interrupts generated by this device should contribute to the kernel entropy pool. The kernel entropy pool provides truly random numbers derived from various random events. If this flag is specified, the timing of interrupts from this device is fed to the pool as entropy. Do not set this if your device issues interrupts at a predictable rate (for example, the system timer) or can be influenced by external attackers (for example, a networking device). On the other hand, most other hardware generates interrupts at nondeterministic times and is, therefore, a good source of entropy.
SA_SHIRQ This flag specifies that the interrupt line can be shared among multiple interrupt handlers. Each handler registered on a given line must specify this flag; otherwise, only one handler can exist per line. More information on shared handlers is provided in a following section.
The fourth parameter, devname, is an ASCII text representation of the device associated with the interrupt. For example, this value for the keyboard interrupt on a PC is "keyboard". These text names are used by /proc/irq and /proc/interrupts for communication with the user, which is discussed shortly.
The fifth parameter, dev_id, is used primarily for shared interrupt lines. When an interrupt handler is freed (discussed later), dev_id provides a unique cookie to allow the removal of only the desired interrupt handler from the interrupt line. Without this parameter, it would be impossible for the kernel to know which handler to remove on a given interrupt line. You can pass NULL here if the line is not shared, but you must pass a unique cookie if your interrupt line is shared (and unless your device is old and crusty and lives on the ISA bus, there is good chance it must support sharing). This pointer is also passed into the interrupt handler on each invocation. A common practice is to pass the driver's device structure: This pointer is unique and might be useful to have within the handlers and the Device Model.
On success, request_irq() returns zero. A nonzero value indicates error, in which case the specified interrupt handler was not registered. A common error is -EBUSY, which denotes that the given interrupt line is already in use (and either the current user or you did not specify SA_SHIRQ).
Note that request_irq() can sleep and therefore cannot be called from interrupt context or other situations where code cannot block.
On registration, an entry corresponding to the interrupt is created in /proc/irq.
The function proc_mkdir() is used to create new procfs entries. This function calls proc_create() to set up the new procfs entries, which in turn calls kmalloc() to allocate memory. Because kmalloc() can sleep, so can request_irq().
In a driver, requesting an interrupt line and installing a handler is done via request_irq():

if (request_irq(irqn, my_interrupt, SA_SHIRQ, "my_device", dev)) {
        printk(KERN_ERR "my_device: cannot register IRQ %d\n", irqn);
        return -EIO;
}

In this example, irqn is the requested interrupt line, my_interrupt is the handler, the line can be shared, the device is named "my_device," and we passed dev for dev_id. On failure, the code prints an error and returns. If the call returns zero, the handler has been successfully installed. From that point forward, the handler is invoked in response to an interrupt. It is important to initialize hardware and register an interrupt handler in the proper order to prevent the interrupt handler from running before the device is fully initialized.

Freeing an Interrupt Handler

When we unload our device driver, we must free the interrupt handler that we registered for the device.
This frees the interrupt line.

void free_irq(unsigned int irq, void *dev_id)

If the specified interrupt line is not shared, this function removes the handler and disables the line. If the interrupt line is shared, the handler identified via dev_id is removed, but the interrupt line itself is disabled only when the last handler is removed.

A call to free_irq() must be made from process context.
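The sharing rules for registering and freeing handlers can be captured in a small user-space model. This is a sketch under assumptions, not the kernel's implementation: struct model_line, model_request_irq(), model_free_irq(), MODEL_SHIRQ, and MAX_HANDLERS are hypothetical names, and returning -EINVAL for a shared request with a NULL dev_id is a choice made for this model.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_SHIRQ   0x1  /* stands in for SA_SHIRQ; the value is arbitrary */
#define MAX_HANDLERS  4    /* hypothetical per-line limit for this model */

struct model_line {
    void *dev_ids[MAX_HANDLERS]; /* cookies of the registered handlers */
    int   count;                 /* number of registered handlers */
    int   shared;                /* nonzero if the handlers allow sharing */
    int   enabled;               /* nonzero while the line is enabled */
};

/* Register a handler: a second handler is accepted only if both the
 * existing user and the new one asked for sharing. */
static int model_request_irq(struct model_line *line, unsigned long flags,
                             void *dev_id)
{
    int wants_share = (flags & MODEL_SHIRQ) != 0;

    if (wants_share && dev_id == NULL)
        return -EINVAL;          /* shared handlers need a unique cookie */
    if (line->count > 0 && !(line->shared && wants_share))
        return -EBUSY;           /* line already in use and not shareable */
    if (line->count == MAX_HANDLERS)
        return -ENOMEM;
    line->dev_ids[line->count++] = dev_id;
    line->shared = wants_share;
    line->enabled = 1;
    return 0;
}

/* Remove the handler identified by dev_id; disable the line only when
 * the last handler is gone. */
static void model_free_irq(struct model_line *line, void *dev_id)
{
    for (int i = 0; i < line->count; i++) {
        if (line->dev_ids[i] != dev_id)
            continue;
        line->dev_ids[i] = line->dev_ids[--line->count];
        break;
    }
    if (line->count == 0)
        line->enabled = 0;
}
```

The dev_id cookie is the only thing that tells handlers on a shared line apart, which is why the model refuses a shared registration without one.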

Interrupt handling concepts

Each CPU core has only one interrupt line coming to it from the interrupt controller, which itself has n input interrupt lines.
While a core is executing the handler for an interrupt, that interrupt is said to be in the active state. Suppose the same interrupt arrives again immediately: the new occurrence is marked pending, and the interrupt is then in the active-pending state. Only when the active interrupt is cleared will the pending one be taken for execution.
Now suppose we are in this active-pending situation and a higher-priority interrupt arrives. The lower-priority handler is temporarily interrupted and the higher-priority one is executed; when it is done, the lower-priority handler resumes. The lower-priority interrupt remains active while it is interrupted, because its context stays on the stack.
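The active and pending states of a single interrupt can be written down as a tiny state machine. This is a user-space sketch; the enum and the functions on_raise(), on_ack(), and on_done() are hypothetical names, and priorities and preemption by other interrupts are left out.

```c
enum irq_state {
    IRQ_INACTIVE,        /* nothing raised, nothing running */
    IRQ_PENDING,         /* raised, waiting for the core */
    IRQ_ACTIVE,          /* handler is running */
    IRQ_ACTIVE_PENDING   /* handler is running and the interrupt was raised again */
};

/* The device raises the interrupt. Raising it again while it is already
 * pending changes nothing: only one occurrence can be pending. */
static enum irq_state on_raise(enum irq_state s)
{
    switch (s) {
    case IRQ_INACTIVE: return IRQ_PENDING;
    case IRQ_ACTIVE:   return IRQ_ACTIVE_PENDING;
    default:           return s;
    }
}

/* The core starts running the handler for a pending interrupt. */
static enum irq_state on_ack(enum irq_state s)
{
    return s == IRQ_PENDING ? IRQ_ACTIVE : s;
}

/* The handler finishes and the active interrupt is cleared. */
static enum irq_state on_done(enum irq_state s)
{
    switch (s) {
    case IRQ_ACTIVE:         return IRQ_INACTIVE;
    case IRQ_ACTIVE_PENDING: return IRQ_PENDING;
    default:                 return s;
    }
}
```

Starting from IRQ_INACTIVE, the sequence raise, ack, raise, done leaves the interrupt in IRQ_PENDING, which matches the description: the second occurrence is serviced only after the first handler run has been cleared.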


On an SMP architecture, the Advanced Programmable Interrupt Controller (APIC) is used to route interrupts from peripherals to the CPUs.

The APIC decides which CPU receives an interrupt based on
1. the routing table,
2. the priority of the interrupt,
3. the load on the CPUs (a busier CPU is given fewer interrupts).

Let us consider the case of the same interrupt being received on an SMP system. Each core has its own local APIC, and there is one more APIC (the I/O APIC) for the external interrupt interface.

For example, suppose an interrupt is received on IRQ line 10. It goes through the external APIC and is routed to a particular CPU's APIC, say CPU0. The interrupt line is masked until the ISR has run, which means we will not receive another interrupt of the same type while the ISR is executing; a new occurrence is put in the pending state (only one can be pending).

Only once the ISR has finished is the interrupt line unmasked for future interrupts.

How to write an interrupt handler?

The declaration of interrupt handler:

static irqreturn_t intr_handler(int irq, void *dev_id, struct pt_regs *regs)

The first parameter, irq, is the numeric value (10, 11, 30, etc.) of the interrupt line the handler is servicing.

The second parameter, dev_id, is a generic pointer to the same dev_id that was given to request_irq() when the interrupt handler was registered. This value should be unique if it is intended for shared interrupt as it can act as a cookie to differentiate between multiple devices using the same interrupt handler.

The final parameter, regs, holds a pointer to a structure containing the processor registers and state before servicing the interrupt.

The return value of an interrupt handler is the special type irqreturn_t. An interrupt handler can return two special values, IRQ_NONE or IRQ_HANDLED, depending on whether or not it serviced the interrupt. The former is returned when the interrupt handler detects an interrupt for which its device was not the originator. The latter is returned if the interrupt handler was correctly invoked, and its device did indeed cause the interrupt. Alternatively, IRQ_RETVAL(val) may be used. If val is non-zero, this macro returns IRQ_HANDLED. Otherwise, the macro returns IRQ_NONE. These special values are used to let the kernel know whether devices are issuing spurious (that is, unrequested) interrupts. If all the interrupt handlers on a given interrupt line return IRQ_NONE, then the kernel can detect the problem. Note the curious return type, irqreturn_t, which is simply an int.

The interrupt handler is normally marked static because it is never called directly from another file.

The role of the interrupt handler depends entirely on the device and its reasons for issuing the interrupt. At a minimum, most interrupt handlers need to acknowledge to the device that they received the interrupt. More complex devices additionally need to send and receive data and perform extended work in the interrupt handler. As mentioned, the extended work is pushed as much as possible into the bottom half handler, which I have discussed in another post on Bottom Halves.

Shared Handlers

A shared handler is registered and executed much like a non-shared handler. There are three main differences:

The SA_SHIRQ flag must be set in the flags argument to request_irq().
The dev_id argument must be unique to each registered handler. A pointer to any per-device structure is sufficient; a common choice is the device structure as it is both unique and potentially useful to the handler. You cannot pass NULL for a shared handler!
The interrupt handler must be capable of distinguishing whether its device actually generated an interrupt. This requires both hardware support and associated logic in the interrupt handler. If the hardware did not offer this capability, there would be no way for the interrupt handler to know whether its associated device or some other device sharing the line caused the interrupt.
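The three requirements above can be illustrated with a user-space model of two devices sharing one line. This is a sketch, not driver code: irqreturn_t, IRQ_NONE, IRQ_HANDLED, and IRQ_RETVAL are redefined locally as stand-ins for the kernel definitions described earlier, the regs argument is dropped for brevity, and struct fake_dev, STATUS_IRQ_PENDING, fake_handler(), and dispatch_shared() are hypothetical.

```c
#include <stddef.h>

/* Local stand-ins for the kernel definitions described in the text. */
typedef int irqreturn_t;
#define IRQ_NONE       0
#define IRQ_HANDLED    1
#define IRQ_RETVAL(x)  ((x) != 0)

#define STATUS_IRQ_PENDING 0x1  /* hypothetical "I raised the interrupt" bit */

struct fake_dev {
    unsigned status;    /* simulated status register */
    unsigned serviced;  /* interrupts this device's handler has claimed */
};

/* Shared handler: dev_id tells it which device to check, and the status
 * register tells it whether that device actually raised the interrupt. */
static irqreturn_t fake_handler(int irq, void *dev_id)
{
    struct fake_dev *dev = dev_id;
    unsigned pending = dev->status & STATUS_IRQ_PENDING;

    (void)irq;
    if (pending) {
        dev->status &= ~STATUS_IRQ_PENDING;  /* acknowledge the device */
        dev->serviced++;
    }
    return IRQ_RETVAL(pending);
}

/* Call every handler on the line. The result is IRQ_NONE only if no
 * device claimed the interrupt, that is, if it was spurious. */
static irqreturn_t dispatch_shared(int irq, struct fake_dev **devs, size_t n)
{
    irqreturn_t ret = IRQ_NONE;

    for (size_t i = 0; i < n; i++)
        ret |= fake_handler(irq, devs[i]);
    return ret;
}
```

Without the status bit, fake_handler() would have no way to tell its own interrupt from its neighbour's, which is the hardware support the third point refers to.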

Interrupt Context

When executing an interrupt handler or bottom half, the kernel is in interrupt context. (Process context is the mode of operation the kernel is in while it is executing on behalf of a process, for example, executing a system call or running a kernel thread.)
In process context, the current macro points to the associated task. Furthermore, because a process is coupled to the kernel in process context, process context can sleep or otherwise invoke the scheduler.
Interrupt context is not associated with a process. The current macro is not relevant (although it points to the interrupted process). Without a backing process, interrupt context cannot sleep; how would it ever reschedule? Therefore, you cannot call certain functions from interrupt context. If a function sleeps, you cannot use it from your interrupt handler. This limits the functions that one can call from an interrupt handler.
Interrupt context is time critical because the interrupt handler interrupts other code. Code should be quick and simple. Busy looping is discouraged. This is a very important point; always keep in mind that your interrupt handler has interrupted other code (possibly even another interrupt handler on a different line!).
Because of this asynchronous nature, it is imperative that all interrupt handlers be as quick and as simple as possible. As much as possible, work should be pushed out from the interrupt handler and performed in a bottom half, which runs at a more convenient time.
The setup of an interrupt handler's stacks is a configuration option. Historically, interrupt handlers did not receive their own stacks. Instead, they would share the stack of the process that they interrupted.
The kernel stack is two pages in size; typically, that is 8KB on 32-bit architectures and 16KB on 64-bit architectures. Because in this setup interrupt handlers share the stack, they must be exceptionally economical with what data they allocate there. Of course, the kernel stack is limited to begin with, so all kernel code should be cautious.

Implementation of Interrupt Handling

The implementation of interrupt handling is architecture dependent and hardware dependent.

Path Of interrupt

A device issues an interrupt by sending an electric signal over its bus to the interrupt controller.
If the interrupt line is enabled (they can be masked out), the interrupt controller sends the interrupt to the processor.
In most architectures, this is accomplished by an electrical signal that is sent over a special pin to the processor. Unless interrupts are disabled in the processor (which can also happen), the processor immediately stops what it is doing, disables the interrupt system, and jumps to a predefined location in memory and executes the code located there. This predefined point is set up by the kernel and is the entry point for interrupt handlers.
The interrupt's journey in the kernel begins at this predefined entry point, just as system calls enter the kernel through a predefined exception handler.
For each interrupt line, the processor jumps to a unique location in memory and executes the code located there.
In this manner, the kernel knows the IRQ number of the incoming interrupt. The initial entry point simply saves this value and stores the current register values (which belong to the interrupted task) on the stack; then the kernel calls do_IRQ(). From here onward, most of the interrupt handling code is written in C; however, it is still architecture dependent.
The do_IRQ() function is declared as

unsigned int do_IRQ(struct pt_regs regs)

Because the C calling convention places function arguments at the top of the stack, the pt_regs structure contains the initial register values that were previously saved in the assembly entry routine. Because the interrupt value was also saved, do_IRQ() can extract it.
On x86, the code that extracts it is: int irq = regs.orig_eax & 0xff;
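To see why masking with 0xff recovers the interrupt number, here is a small worked example in user-space C. It rests on an assumption not stated in the text: that the i386 entry stub of that era saved the value (irq - 256) in orig_eax, so the IRQ number sits in the low byte. The helper name irq_from_orig_eax() is made up for illustration.

```c
/* Recover the IRQ number from the value the entry stub saved in orig_eax.
 * Assumption: the stub stored (irq - 256), so only the low byte is the IRQ. */
static int irq_from_orig_eax(long orig_eax)
{
    return (int)(orig_eax & 0xff);
}
```

For IRQ 10 the saved value would be 10 - 256 = -246, whose low byte is 0x0a, so the mask yields 10 again.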
After the interrupt line is calculated, do_IRQ() acknowledges the receipt of the interrupt and disables interrupt delivery on the line. On normal PC machines, these operations are handled by mask_and_ack_8259A(), which do_IRQ() calls.
Next, do_IRQ() ensures that a valid handler is registered on the line, and that it is enabled and not currently executing. If so, it calls handle_IRQ_event() to run the installed interrupt handlers for the line. On x86, handle_IRQ_event() is

asmlinkage int handle_IRQ_event(unsigned int irq, struct pt_regs *regs,
                                struct irqaction *action)
{
        int status = 1;
        int retval = 0;

        if (!(action->flags & SA_INTERRUPT))
                local_irq_enable();

        do {
                status |= action->flags;
                retval |= action->handler(irq, action->dev_id, regs);
                action = action->next;
        } while (action);

        if (status & SA_SAMPLE_RANDOM)
                add_interrupt_randomness(irq);

        local_irq_disable();

        return retval;
}

First, because the processor disabled interrupts, they are turned back on unless SA_INTERRUPT was specified during the handler's registration. Recall that SA_INTERRUPT specifies that the handler must be run with interrupts disabled. Next, each potential handler is executed in a loop.
If this line is not shared, the loop terminates after the first iteration. Otherwise, all handlers are executed. After that, add_interrupt_randomness() is called if SA_SAMPLE_RANDOM was specified during registration.
This function uses the timing of the interrupt to generate entropy for the random number generator.
Finally, interrupts are again disabled (do_IRQ() expects them still to be off) and the function returns. Back in do_IRQ(), the function cleans up and returns to the initial entry point, which then jumps to ret_from_intr().
The routine ret_from_intr() is, as with the initial entry code, written in assembly. This routine checks whether a reschedule is pending (this implies that need_resched is set).
If a reschedule is pending, and the kernel is returning to user-space (that is, the interrupt interrupted a user process), schedule() is called. If the kernel is returning to kernel-space (that is, the interrupt interrupted the kernel itself), schedule() is called only if the preempt_count is zero (otherwise it is not safe to preempt the kernel). After schedule() returns, or if there is no work pending, the initial registers are restored and the kernel resumes whatever was interrupted.
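The decision made on the way out of an interrupt can be condensed into one predicate. The real routine is written in assembly; the C function below is only a restatement of the rules in the paragraph above, and the name should_call_schedule() and its parameters are hypothetical.

```c
#include <stdbool.h>

/* Returns true if schedule() should be called when returning from an
 * interrupt, following the rules described for ret_from_intr(). */
static bool should_call_schedule(bool need_resched, bool returning_to_user,
                                 int preempt_count)
{
    if (!need_resched)
        return false;            /* no reschedule pending */
    if (returning_to_user)
        return true;             /* always safe to switch tasks here */
    return preempt_count == 0;   /* kernel may be preempted only if zero */
}
```

For example, should_call_schedule(true, false, 1) is false: a reschedule is pending, but the interrupted kernel code is not preemptible, so the kernel simply resumes it.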
On x86, the initial assembly routines are located in arch/i386/kernel/entry.S and the C methods are located in arch/i386/kernel/irq.c. Other supported architectures are similar.

`/proc/interrupts`

Procfs is a virtual filesystem that exists only in kernel memory and is typically mounted at /proc.
Reading or writing files in procfs invokes kernel functions that simulate reading or writing from a real file.
A relevant example is the /proc/interrupts file, which is populated with statistics related to interrupts on the system. Here is sample output from a uniprocessor PC:

      CPU0  
 0:   3602371   XT-PIC   timer
 1:   3048      XT-PIC   i8042
 2:   0         XT-PIC   cascade
 4:   2689466   XT-PIC   uhci-hcd, eth0
 5:   0         XT-PIC   EMU10K1
 12:  85077     XT-PIC   uhci-hcd
 15:  24571     XT-PIC   aic7xxx
NMI:  0 
LOC:  3602236 
ERR:  0

The first column is the interrupt line. On this system, interrupts numbered 0 through 2, 4, 5, 12, and 15 are present. Handlers are not installed on lines not displayed.
The second column is a counter of the number of interrupts received. A column is present for each processor on the system, but this machine has only one processor.
As you can see, the timer interrupt has received 3,602,371 interrupts, whereas the sound card (EMU10K1) has received none (which is an indication that it has not been used since the machine booted).
The third column is the interrupt controller handling this interrupt. XT-PIC corresponds to the standard PC programmable interrupt controller. On systems with an I/O APIC, most interrupts would list IO-APIC-level or IO-APIC-edge as their interrupt controller.
Finally, the last column is the device associated with this interrupt. This name is supplied by the devname parameter to request_irq(), as discussed previously. If the interrupt is shared, as is the case with interrupt number four in this example, all the devices registered on the interrupt line are listed.
procfs code is located primarily in fs/proc. The function that provides /proc/interrupts is, not surprisingly, architecture dependent and named show_interrupts().
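From user space, the four columns described above can be pulled out of a line with sscanf(). This is a minimal sketch that assumes the uniprocessor layout shown in the sample (a single count column); struct irq_stat and parse_irq_line() are hypothetical names, and lines such as the CPU0 header or the NMI, LOC, and ERR rows are simply rejected.

```c
#include <stdio.h>

struct irq_stat {
    int           irq;             /* first column: interrupt line */
    unsigned long count;           /* second column: interrupts received */
    char          controller[32];  /* third column: interrupt controller */
    char          devices[64];     /* last column: registered device names */
};

/* Parse one line of uniprocessor /proc/interrupts output.
 * Returns 0 on success, -1 if the line is not a numbered IRQ row. */
static int parse_irq_line(const char *line, struct irq_stat *out)
{
    int n = sscanf(line, " %d: %lu %31s %63[^\n]",
                   &out->irq, &out->count, out->controller, out->devices);

    return n == 4 ? 0 : -1;
}
```

On a multiprocessor machine there is one count column per CPU, so the format string would need to be extended; this sketch does not attempt that.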

Thursday, February 27, 2014
