Sunday, February 7, 2016

Linux Interrupt Handling on AMP/SMP/HMP

Linux interrupt handling:

ARMv7
  • Exception entry steps
    1. Push eight registers
    2. Read the vector table entry from memory at table base + (exception number × 4)
    3. Read SP from the vector table (on reset only: updates SP to the top of stack from the vector table)
    4. Update PC with the location read from the vector table
    5. Load the pipeline with instructions from the location pointed to by the vector table
    6. Update LR
  1. FIQ mode has its own dedicated banked registers, r8-r14. R14 is the link register, which holds the return address (+4) from the FIQ. If the FIQ handler can be written so that it uses only r8-r13, it benefits from these banked registers in two ways. First, it avoids the overhead of pushing and popping the registers used by the interrupt service routine (ISR), which can save a significant number of cycles on both entry to and exit from the ISR. Second, the handler can rely on values persisting in registers from one call to the next; for example, r8 may hold a pointer to a hardware device, and the handler can rely on the same value being in r8 the next time it is called.
  2. The FIQ vector sits at the end of the exception vector table (0x1C), so if the FIQ handler code is placed directly at the end of the vector table, no branch is required: the code can execute directly from 0x1C. This saves a few cycles on entry to the ISR.
  3. FIQ has higher priority than IRQ. This means that when the core takes an FIQ exception, it automatically masks out IRQs. An IRQ cannot interrupt the FIQ handler. The opposite is not true - the IRQ does not mask FIQs and so the FIQ handler (if used) can interrupt the IRQ. Additionally, if both IRQ and FIQ requests occur at the same time, the core will deal with the FIQ first.

http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0333h/I30195.html

Entering an ARM exception


SCR[3:1] determine the mode that the processor enters on an FIQ, IRQ, or external abort exception, see System control and configuration.
When handling an ARM exception the processor:
  1. Preserves the address of the next instruction in the appropriate LR. When the exception entry is from:
    ARM and Jazelle states:
    The processor writes the value of the PC into the LR, offset by a value, current PC + 4 or PC + 8 depending on the exception, that causes the program to resume from the correct place on return.
    Thumb state:
    The processor writes the value of the PC into the LR, offset by a value, current PC + 2, PC + 4 or PC + 8 depending on the exception, that causes the program to resume from the correct place on return.
    The exception handler does not have to determine the state when entering an exception. For example, in the case of a SVC, MOVS PC, R14_svc always returns to the next instruction regardless of whether the SVC was executed in ARM or Thumb state.
  2. Copies the CPSR into the appropriate SPSR.
  3. Forces the CPSR mode bits to a value that depends on the exception.
  4. Forces the PC to fetch the next instruction from the relevant exception vector.
The processor can also set the interrupt and imprecise abort disable flags to prevent otherwise unmanageable nesting of exceptions.

Note

Exceptions are always entered, handled, and exited in ARM state. When the processor is in Thumb state or Jazelle state and an exception occurs, the switch to ARM state takes place automatically when the exception vector address is loaded into the PC.

1. Interrupts and exceptions
Interrupts can be divided into synchronous interrupts and asynchronous interrupts:
(1) Synchronous interrupts, also known as exceptions, are generated by the CPU control unit while an instruction is executing. They are called synchronous because the CPU raises them only after an instruction has finished executing. Typical exceptions are page faults and division by zero.
(2) Asynchronous interrupts are generated by other hardware devices at arbitrary times with respect to the CPU clock.
1.1 ARM exception vector table
When an exception or interrupt occurs, the processor sets the PC to a specific memory address. That address lies within a fixed address range known as the vector table. The entries of the vector table are branch instructions that jump to the routine dedicated to handling that exception or interrupt.
The memory at address 0x00000000 is reserved for the vector table. On some processors the vector table can optionally be located at a higher address (starting at 0xffff0000). Operating systems such as Linux and Microsoft's embedded operating systems can take advantage of this feature.
When an exception or interrupt occurs, the processor suspends normal execution and starts loading instructions from the vector table (see Table 1-1). Each vector table entry contains a branch instruction that points to a specific routine.



The reset vector is the location of the first instruction the processor executes at power-on. This instruction branches to the initialization code.
The undefined instruction vector is used when the processor cannot decode an instruction.
The software interrupt vector is called when an SWI instruction is executed. The SWI instruction is often used as the mechanism for invoking an operating system routine.
The prefetch abort vector is used when the processor tries to fetch an instruction from an address without the proper access permissions; the abort actually takes place in the decode stage.
The data abort vector is similar to the prefetch abort vector, but is raised when an instruction tries to access data memory without the proper access permissions.
The interrupt request (IRQ) vector is used when external hardware interrupts the normal flow of execution of the processor. An IRQ can be raised only when IRQs are not masked in the CPSR.
The fast interrupt request (FIQ) vector is similar to the interrupt request vector, but is reserved for hardware that requires shorter response times. An FIQ can be raised only when FIQs are not masked in the CPSR.
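The vector descriptions above can be summarized as offsets from the table base. The following is a minimal sketch, not kernel code: the enum names and the helper vector_address are made up for illustration, while the offsets and the two base addresses are the classic ARM values mentioned in this section.

```c
#include <stdint.h>

/* Offsets of the eight classic ARM exception vectors from the table base. */
enum arm_vector {
    VEC_RESET          = 0x00,
    VEC_UNDEFINED      = 0x04,
    VEC_SWI            = 0x08,
    VEC_PREFETCH_ABORT = 0x0c,
    VEC_DATA_ABORT     = 0x10,
    VEC_RESERVED       = 0x14,
    VEC_IRQ            = 0x18,
    VEC_FIQ            = 0x1c
};

/*
 * Hypothetical helper: the address the core fetches from when the given
 * exception is taken. high_vectors selects the 0xffff0000 table base
 * instead of 0x00000000.
 */
static uint32_t vector_address(enum arm_vector vec, int high_vectors)
{
    uint32_t base = high_vectors ? 0xffff0000u : 0x00000000u;
    return base + (uint32_t)vec;
}
```

With high vectors enabled, as Linux normally configures it, an IRQ therefore starts executing at 0xffff0018.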
2. Data structures
Three main data structures are associated with interrupts: irq_desc, irq_chip and irqaction. The following describes these three data structures and the relationships between them.
The kernel uses irq_desc to describe an interrupt. The irq_desc structure is defined in include/linux/irq.h:

struct irq_desc {
    unsigned int        irq;            /* interrupt number */
    ……
    irq_flow_handler_t  handle_irq;     /* flow-layer interrupt handler */
    struct irq_chip     *chip;          /* interrupt controller specific operations */
    struct msi_desc     *msi_desc;
    void                *handler_data;
    void                *chip_data;
    struct irqaction    *action;        /* IRQ action list */
    unsigned int        status;         /* IRQ status */

    unsigned int        depth;
    ……
    const char          *name;
} ____cacheline_internodealigned_in_smp;


Among them, handle_irq points to the flow-layer interrupt handler, which mainly deals with the different trigger modes (level, edge). handler_data can point to arbitrary data for use by the handle_irq function. Whenever an interrupt occurs, architecture-specific code calls handle_irq, which uses the controller-specific methods provided in chip to carry out the low-level operations needed to process the interrupt. The kernel provides default flow handlers for the different interrupt types, such as handle_level_irq and handle_edge_irq.
Interrupt controller operations are encapsulated in irq_chip, which is also defined in include/linux/irq.h:

struct irq_chip {
    const char      *name;
    unsigned int    (*startup)(unsigned int irq);
    void            (*shutdown)(unsigned int irq);
    void            (*enable)(unsigned int irq);
    void            (*disable)(unsigned int irq);

    void            (*ack)(unsigned int irq);
    void            (*mask)(unsigned int irq);
    void            (*mask_ack)(unsigned int irq);
    void            (*unmask)(unsigned int irq);
    void            (*eoi)(unsigned int irq);

    void            (*end)(unsigned int irq);
    int             (*set_affinity)(unsigned int irq, const struct cpumask *dest);
    int             (*retrigger)(unsigned int irq);
    int             (*set_type)(unsigned int irq, unsigned int flow_type);
    int             (*set_wake)(unsigned int irq, unsigned int on);
    void            (*bus_lock)(unsigned int irq);
    void            (*bus_sync_unlock)(unsigned int irq);
    ……
    const char      *typename;
};
action points to a list of actions to execute when the interrupt occurs. A device driver that wants to be notified of the interrupt places its handler here. The irqaction structure is defined in include/linux/interrupt.h:
struct irqaction {
    irq_handler_t handler;
    unsigned long flags;
    const char *name;
    void *dev_id;
    struct irqaction *next;
    int irq;
    struct proc_dir_entry *dir;
    irq_handler_t thread_fn;
    struct task_struct *thread;
    unsigned long thread_flags;
};

Each handler corresponds to one instance of struct irqaction. The most important member of this structure is handler, a function pointer that is initialized through request_irq (described in detail in a later section). When the interrupt fires, the kernel calls the function it points to. name and dev_id together uniquely identify an irqaction instance: name identifies the device, and dev_id points to the device's data structure instance.
Note that handle_irq in irq_desc and handler in irqaction have different types. The two function pointer types are defined as:


typedef void (*irq_flow_handler_t)(unsigned int irq, struct irq_desc *desc);
typedef irqreturn_t (*irq_handler_t)(int, void *);

The kernel maintains a global irq_desc array to manage all interrupts. The array is defined in kernel/irq/handle.c:

struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
    [0 ... NR_IRQS-1] = {
        .status = IRQ_DISABLED,
        .chip = &no_irq_chip,
        .handle_irq = handle_bad_irq,
        .depth = 1,
        .lock = __RAW_SPIN_LOCK_UNLOCKED(irq_desc->lock),
    }
};
NR_IRQS is the maximum number of interrupts supported and is defined in arch/arm/mach-XXX/include/mach/irqs.h.
Figure 2-1 illustrates the relationships among these three data structures. In the figure, the chip members of the irq_desc entries all point to the same irq_chip instance.
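To make the relationship between the three structures concrete, here is a small user-space model. It is only a sketch: every mini_ name is hypothetical, the structures are cut down to a few fields, and the flow handler is a simplified stand-in for a level-triggered handler (mask, run the action list, unmask), not the kernel's handle_level_irq.

```c
#include <stddef.h>

typedef int (*mini_handler_t)(int irq, void *dev_id);

/* One registered device handler (models struct irqaction). */
struct mini_irqaction {
    mini_handler_t handler;
    void *dev_id;
    struct mini_irqaction *next;
};

/* Interrupt controller operations (models struct irq_chip). */
struct mini_irq_chip {
    const char *name;
    void (*mask)(unsigned int irq);
    void (*unmask)(unsigned int irq);
};

/* One interrupt line (models struct irq_desc). */
struct mini_irq_desc {
    unsigned int irq;
    void (*handle_irq)(unsigned int irq, struct mini_irq_desc *desc);
    struct mini_irq_chip *chip;
    struct mini_irqaction *action;
};

/* Simplified flow handler: mask the line, run every action, unmask. */
static void mini_handle_level_irq(unsigned int irq, struct mini_irq_desc *desc)
{
    struct mini_irqaction *act;

    if (desc->chip && desc->chip->mask)
        desc->chip->mask(irq);
    for (act = desc->action; act; act = act->next)
        act->handler((int)irq, act->dev_id);
    if (desc->chip && desc->chip->unmask)
        desc->chip->unmask(irq);
}

/* Append an action to the end of the descriptor's action list. */
static void mini_add_action(struct mini_irq_desc *desc,
                            struct mini_irqaction *act)
{
    struct mini_irqaction **p = &desc->action;

    while (*p)
        p = &(*p)->next;
    act->next = NULL;
    *p = act;
}

/* Example device handler: dev_id points to a call counter. */
static int mini_count_handler(int irq, void *dev_id)
{
    (void)irq;
    ++*(int *)dev_id;
    return 1;
}
```

The point of the split is visible even in this reduced form: the flow handler knows about trigger semantics, the chip knows about the controller, and the action list knows about devices, so two devices sharing a line simply appear as two entries in the same list.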









3. Interrupt initialization
ARM Linux interrupt initialization is carried out by three main functions: init_IRQ, early_trap_init and early_irq_init.
The overall Linux interrupt initialization process is shown in Figure 3-1:













Figure 3-1 Linux interrupt initialization process
early_trap_init is defined in arch/arm/kernel/traps.c and is called at the end of setup_arch. Its main job is to relocate the ARM exception vector table and the exception handler stubs.


void __init early_trap_init(void)
{
    unsigned long vectors = CONFIG_VECTORS_BASE;
    extern char __stubs_start[], __stubs_end[];
    extern char __vectors_start[], __vectors_end[];
    extern char __kuser_helper_start[], __kuser_helper_end[];
    int kuser_sz = __kuser_helper_end - __kuser_helper_start;

    /*
     * Copy the exception vector table to the vectors address
     * (usually 0xffff0000) and the exception handler stubs to
     * vectors + 0x200.
     */
    memcpy((void *)vectors, __vectors_start, __vectors_end - __vectors_start);
    memcpy((void *)vectors + 0x200, __stubs_start, __stubs_end - __stubs_start);
    memcpy((void *)vectors + 0x1000 - kuser_sz, __kuser_helper_start, kuser_sz);

    /*
     * Copy the signal return code into the vector page.
     */
    memcpy((void *)KERN_SIGRETURN_CODE, sigreturn_codes,
           sizeof(sigreturn_codes));

    /*
     * Make the copied code visible to the instruction stream.
     */
    flush_icache_range(vectors, vectors + PAGE_SIZE);
    modify_domain(DOMAIN_USER, DOMAIN_CLIENT);
}
This function relocates the exception vector table and the exception handler stubs defined in arch/arm/kernel/entry-armv.S: the vector table is copied to 0xFFFF_0000 and the stubs are copied to 0xFFFF_0200. It then calls modify_domain() to change the access rights of the page occupied by the vector table, restricting user-mode access to it. When an exception occurs, the ARM processor jumps to the vector table at 0xFFFF_0000 (when configured for high vectors), which is why this relocation is needed.
early_irq_init is defined in kernel/irq/handle.c:


struct irq_desc irq_desc[NR_IRQS] __cacheline_aligned_in_smp = {
    [0 ... NR_IRQS-1] = {
        .status = IRQ_DISABLED,
        .chip = &no_irq_chip,
        .handle_irq = handle_bad_irq,
        .depth = 1,
        .lock = __SPIN_LOCK_UNLOCKED(irq_desc->lock),
    }
};

static unsigned int kstat_irqs_all[NR_IRQS][NR_CPUS];
int __init early_irq_init(void)
{
    struct irq_desc *desc;
    int count;
    int i;

    init_irq_default_affinity();

    printk(KERN_INFO "NR_IRQS:%d\n", NR_IRQS);

    desc = irq_desc;
    count = ARRAY_SIZE(irq_desc);

    for (i = 0; i < count; i++) {
        desc[i].irq = i;
        alloc_desc_masks(&desc[i], 0, true);
        init_desc_masks(&desc[i]);
        desc[i].kstat_irqs = kstat_irqs_all[i];
    }
    return arch_early_irq_init();
}

kernel/irq/handle.c contains two versions of early_irq_init(); which one is compiled depends on the kernel configuration option CONFIG_SPARSE_IRQ. CONFIG_SPARSE_IRQ enables sparse IRQ support, which is useful for distribution kernels: it lets you define a high CONFIG_NR_CPUS value without consuming too much memory. Under normal circumstances this option is not configured.
The main job of early_irq_init is to initialize each element of the irq_desc[NR_IRQS] array used to manage interrupts (NR_IRQS is the total number of interrupts, defined in irqs.h). It sets the interrupt number of each element and points each element's kstat_irqs field (per-CPU IRQ statistics) at the corresponding row of the two-dimensional array kstat_irqs_all. alloc_desc_masks(&desc[i], 0, true) and init_desc_masks(&desc[i]) are empty functions on non-SMP platforms. arch_early_irq_init() is mainly used on x86 and PPC and is an empty function on other platforms.
init_IRQ is defined in arch/arm/kernel/irq.c:


void __init init_IRQ(void)
{
    int irq;

    for (irq = 0; irq < NR_IRQS; irq++)
        irq_desc[irq].status |= IRQ_NOREQUEST | IRQ_NOPROBE;

    init_arch_irq();
}
init_IRQ first walks the irq_desc interrupt descriptor table and sets the status of every interrupt to IRQ_NOREQUEST | IRQ_NOPROBE. It then calls init_arch_irq, a board-level function pointer that setup_arch initializes to mdesc->init_irq; mdesc->init_irq is usually defined in a board file under arch/arm/mach-xxx. init_arch_irq initializes the CPU's interrupt controller and typically uses a for loop like the following to initialize the irq_desc array:


for (irq = IRQ_TIMER0; irq <= IRQ_TIMER4; irq++) {
    /* register the irq_chip instance (the interrupt controller operations) in irq_desc[irq].chip */
    set_irq_chip(irq, &arch_chip);
    set_irq_handler(irq, handle_level_irq);
    set_irq_flags(irq, IRQF_VALID);
}

4. Registering interrupt handlers
This section covers how an interrupt handler is registered. In Linux, request_irq is used to register an interrupt handler; it is implemented as follows:


static inline int __must_check
request_irq(unsigned int irq, irq_handler_t handler, unsigned long flags,
        const char *name, void *dev)
{
    return request_threaded_irq(irq, handler, NULL, flags, name, dev);
}

int request_threaded_irq(unsigned int irq, irq_handler_t handler,
             irq_handler_t thread_fn, unsigned long irqflags,
             const char *devname, void *dev_id)
{
    struct irqaction *action;
    struct irq_desc *desc;
    int retval;

    /*
     * handle_IRQ_event() always ignores IRQF_DISABLED except for
     * the _first_ irqaction (sigh). That can cause oopsing, but
     * the behavior is classified as "will not fix" so we need to
     * start nudging drivers away from using that idiom.
     */
    if ((irqflags & (IRQF_SHARED|IRQF_DISABLED)) ==
                    (IRQF_SHARED|IRQF_DISABLED)) {
        pr_warning(
         "IRQ %d/%s: IRQF_DISABLED is not guaranteed on shared IRQs\n",
            irq, devname);
    }

#ifdef CONFIG_LOCKDEP
    /*
     * Lockdep wants atomic interrupt handlers:
     */
    irqflags |= IRQF_DISABLED;
#endif
    /*
     * Sanity-check: shared interrupts must pass in a real dev-ID,
     * otherwise we'll have trouble later trying to figure out
     * which interrupt is which (messes up the interrupt freeing
     * logic etc).
     */
    if ((irqflags & IRQF_SHARED) && !dev_id)
        return -EINVAL;

    desc = irq_to_desc(irq);
    if (!desc)
        return -EINVAL;

    if (desc->status & IRQ_NOREQUEST)
        return -EINVAL;

    if (!handler) {
        if (!thread_fn)
            return -EINVAL;
        handler = irq_default_primary_handler;
    }

    action = kzalloc(sizeof(struct irqaction), GFP_KERNEL);
    if (!action)
        return -ENOMEM;

    action->handler = handler;
    action->thread_fn = thread_fn;
    action->flags = irqflags;
    action->name = devname;
    action->dev_id = dev_id;

    chip_bus_lock(irq, desc);
    retval = __setup_irq(irq, desc, action);
    chip_bus_sync_unlock(irq, desc);

    if (retval)
        kfree(action);
    return retval;
}
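The argument checks at the top of request_threaded_irq are easy to lose among the surrounding code. The following user-space sketch isolates them so they can be exercised directly. It is an illustration only: model_request_check and MODEL_IRQF_SHARED are hypothetical names, the flag value is an assumption chosen for the demo, and the descriptor lookup is reduced to two booleans.

```c
#include <errno.h>
#include <stddef.h>

/* Assumed flag value for this model only; the kernel defines IRQF_SHARED itself. */
#define MODEL_IRQF_SHARED 0x00000080UL

/*
 * Mirrors the order of the sanity checks in request_threaded_irq:
 * returns 0 if registration would proceed to allocation, -EINVAL otherwise.
 */
static int model_request_check(unsigned long irqflags, const void *dev_id,
                               int has_handler, int has_thread_fn,
                               int desc_exists, int desc_norequest)
{
    /* A shared line needs a dev_id to tell the sharers apart on free. */
    if ((irqflags & MODEL_IRQF_SHARED) && !dev_id)
        return -EINVAL;

    /* The IRQ number must map to a descriptor. */
    if (!desc_exists)
        return -EINVAL;

    /* Lines still marked IRQ_NOREQUEST cannot be requested. */
    if (desc_norequest)
        return -EINVAL;

    /* Either a primary handler or a thread function is required. */
    if (!has_handler && !has_thread_fn)
        return -EINVAL;

    return 0;
}
```

One consequence worth noting: since init_IRQ marks every descriptor IRQ_NOREQUEST, a line that the board code never made valid fails the third check, so request_irq on it returns -EINVAL.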
5. Interrupt handling process
Recall from section 1 that when an exception or interrupt occurs, the processor sets the PC to a specific address and thereby jumps into the already initialized exception vector table. To understand the interrupt handling process, we therefore start from the exception vector table. In ARM Linux, the exception vector table and the exception handlers are in the assembly file arch/arm/kernel/entry-armv.S.
The exception vector table is implemented as follows:


    .globl  __vectors_start
__vectors_start:
    swi     SYS_ERROR0
    b       vector_und + stubs_offset
    ldr     pc, .LCvswi + stubs_offset
    b       vector_pabt + stubs_offset
    b       vector_dabt + stubs_offset
    b       vector_addrexcptn + stubs_offset
    b       vector_irq + stubs_offset       @ interrupt entry, vector_irq
    b       vector_fiq + stubs_offset

    .globl  __vectors_end
__vectors_end:
vector_irq + stubs_offset is the interrupt entry point. stubs_offset is added so that the branch still works after the code has been moved. First, look at how stubs_offset is calculated:
.equ stubs_offset, __vectors_start + 0x200 - __stubs_start
As mentioned in section 3, at boot the kernel copies the exception vector table to 0xFFFF_0000 and the exception handler stubs to 0xFFFF_0200. Figure 5-1 shows the memory layout of the exception vector table and the exception handlers before and after relocation.









When the assembler sees a b label instruction, it converts the label into an offset relative to the current PC (±32 MB) and writes that offset into the instruction encoding. Because the vector table and the stubs are both moved when the kernel starts, a vector table entry written simply as b vector_irq would no longer reach the relocated vector_irq: the encoding would still hold the offset computed before the move, whereas it must hold the offset that is valid after the move. Let offset denote the offset after relocation; as shown in Figure 5-1,


offset = L1+L2
     = [0x200 - (irq_PC_X - __vectors_start_X)] + (vector_irq_X - __stubs_start_X)
     = [0x200 - (irq_PC - __vectors_start)] + (vector_irq - __stubs_start)
     = 0x200 - irq_PC + __vectors_start + vector_irq - __stubs_start
     = vector_irq + (__vectors_start + 0x200 - __stubs_start) - irq_PC
stubs_offset = __vectors_start + 0x200 - __stubs_start

offset = vector_irq + stubs_offset - irq_PC, so the interrupt entry is written as "b vector_irq + stubs_offset"; the subtraction of irq_PC is done by the assembler at build time.
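The derivation can be checked numerically. The sketch below is a user-space model with made-up link-time addresses; the struct layout and all helper names are hypothetical. It computes the offset the assembler would encode and confirms that, once the table sits at 0xffff0000 and the stubs at 0xffff0200, the branch lands on the relocated vector_irq. The fixed pipeline adjustment of the real b encoding is left out because it is the same before and after the move.

```c
#include <stdint.h>

#define VECTORS_BASE 0xffff0000u   /* run-time address of the vector table */
#define STUBS_GAP    0x200u        /* stubs are copied to VECTORS_BASE + 0x200 */

/* Link-time addresses of the symbols involved (example values are invented). */
struct layout {
    uint32_t vectors_start;   /* __vectors_start */
    uint32_t stubs_start;     /* __stubs_start */
    uint32_t vector_irq;      /* vector_irq, inside the stubs */
    uint32_t irq_pc;          /* address of the IRQ branch inside the table */
};

/* stubs_offset = __vectors_start + 0x200 - __stubs_start */
static uint32_t stubs_offset(const struct layout *l)
{
    return l->vectors_start + STUBS_GAP - l->stubs_start;
}

/* Offset encoded for "b vector_irq + stubs_offset". */
static uint32_t encoded_offset(const struct layout *l)
{
    return l->vector_irq + stubs_offset(l) - l->irq_pc;
}

/* Where the relocated branch actually lands at run time. */
static uint32_t runtime_branch_target(const struct layout *l)
{
    uint32_t irq_pc_run = VECTORS_BASE + (l->irq_pc - l->vectors_start);
    return irq_pc_run + encoded_offset(l);
}

/* Where vector_irq lives after the stubs have been copied. */
static uint32_t runtime_vector_irq(const struct layout *l)
{
    return VECTORS_BASE + STUBS_GAP + (l->vector_irq - l->stubs_start);
}
```

For any layout the two run-time addresses agree, because the link-time terms __vectors_start, __stubs_start and irq_PC cancel exactly as in the algebra above.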
Before analyzing vector_irq, first look at what the ARM core does when an exception or interrupt causes a processor mode change, as shown in the following figure:

vector_irq is located between __stubs_start and __stubs_end and is defined as vector_stub irq, IRQ_MODE, 4, where vector_stub is a macro (defined in arch/arm/kernel/entry-armv.S). To make the analysis more intuitive, the vector_stub macro is expanded below:

/*
 * Interrupt dispatcher
 */
vector_irq:
    .if 4
    sub     lr, lr, #4      @ on IRQ entry lr_irq holds the address of the last
                            @ executed instruction + 8; the interrupt is taken
                            @ only after the current instruction completes, so
                            @ the return address is the next instruction (lr - 4)
    .endif

    @
    @ Save r0, lr_<exception> (parent PC) and spsr_<exception>
    @ (parent CPSR)
    @
    stmia   sp, {r0, lr}    @ save r0 and lr on the IRQ mode stack
    mrs     lr, spsr
    str     lr, [sp, #8]    @ save spsr on the IRQ mode stack

    @
    @ Prepare for SVC32 mode. IRQs remain disabled.
    @
    mrs     r0, cpsr
    eor     r0, r0, #(IRQ_MODE ^ SVC_MODE)  @ set the mode bits to SVC, without switching yet
    msr     spsr_cxsf, r0   @ store the result in spsr_irq

    @
    @ the branch table must immediately follow this code
    @
    and     lr, lr, #0x0f   @ lr holds the CPSR from before the interrupt; keep
                            @ the low mode bits, which tell whether the interrupt
                            @ arrived in user or kernel mode and serve as the
                            @ index into the jump table below
    mov     r0, sp          @ pass the IRQ mode sp in r0 as the argument to
                            @ __irq_usr or __irq_svc
    ldr     lr, [pc, lr, lsl #2]    @ pc reads as the address of this instruction
                            @ + 8, i.e. the base of the jump table; entries
                            @ are 4 bytes, so the index in lr is shifted left by 2
    movs    pc, lr          @ branch to handler in SVC mode
                            @ with the "s" suffix and pc as the destination, the
                            @ spsr of the current mode is copied to cpsr, which
                            @ completes the mode switch (IRQ to SVC) while
                            @ branching to the address in lr
ENDPROC(vector_irq)

.long __irq_usr @ 0 (USR_26 / USR_32)
.long __irq_invalid @ 1 (FIQ_26 / FIQ_32)
.long __irq_invalid @ 2 (IRQ_26 / IRQ_32)
.long __irq_svc @ 3 (SVC_26 / SVC_32)
.long __irq_invalid @ 4
.long __irq_invalid @ 5
.long __irq_invalid @ 6
.long __irq_invalid @ 7
.long __irq_invalid @ 8
.long __irq_invalid @ 9
.long __irq_invalid @ a
.long __irq_invalid @ b
.long __irq_invalid @ c
.long __irq_invalid @ d
.long __irq_invalid @ e
.long __irq_invalid @ f
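The table lookup above amounts to indexing by the low four bits of the saved status register. The C sketch below models that dispatch under the standard ARM mode encodings (USR 0x10, FIQ 0x11, IRQ 0x12, SVC 0x13); the enum and the function irq_dispatch are hypothetical names introduced for the illustration.

```c
#include <stdint.h>

/* Possible destinations of the jump table that follows vector_irq. */
enum irq_entry { IRQ_USR, IRQ_SVC, IRQ_INVALID };

/* Standard ARM processor mode encodings (low five bits of the CPSR). */
#define USR_MODE 0x10u
#define FIQ_MODE 0x11u
#define IRQ_MODE 0x12u
#define SVC_MODE 0x13u

/*
 * Models "and lr, lr, #0x0f" followed by the indexed load: the low four
 * bits of the interrupted mode's status register select the table entry.
 */
static enum irq_entry irq_dispatch(uint32_t spsr)
{
    static const enum irq_entry table[16] = {
        IRQ_USR,     IRQ_INVALID, IRQ_INVALID, IRQ_SVC,
        IRQ_INVALID, IRQ_INVALID, IRQ_INVALID, IRQ_INVALID,
        IRQ_INVALID, IRQ_INVALID, IRQ_INVALID, IRQ_INVALID,
        IRQ_INVALID, IRQ_INVALID, IRQ_INVALID, IRQ_INVALID
    };

    return table[spsr & 0x0fu];
}
```

Only two entries are valid because Linux runs either in user mode or in SVC mode when an IRQ can arrive; an IRQ taken in any other mode indicates a bug and ends up in __irq_invalid.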

If the interrupt occurred in user mode, control goes to __irq_usr, which is defined as follows (arch/arm/kernel/entry-armv.S):


    .align  5
__irq_usr:
    usr_entry                       @ save the interrupt context (analyzed later)
    kuser_cmpxchg_check
#ifdef CONFIG_TRACE_IRQFLAGS
    bl      trace_hardirqs_off
#endif
    get_thread_info tsk             @ get the address of the current process's
                                    @ thread_info and store it in tsk (r9);
                                    @ defined in entry-header.S
#ifdef CONFIG_PREEMPT
    @ if preemption is configured, increment the preempt count
    ldr     r8, [tsk, #TI_PREEMPT]  @ get the preempt counter
    add     r7, r8, #1              @ add 1 to forbid preemption
    str     r7, [tsk, #TI_PREEMPT]  @ write the incremented value back to
                                    @ thread_info on the kernel stack
#endif
    irq_handler                     @ call the interrupt handler (analyzed later)
#ifdef CONFIG_PREEMPT
    ldr     r0, [tsk, #TI_PREEMPT]  @ get the preempt counter
    str     r8, [tsk, #TI_PREEMPT]  @ restore preempt to its pre-interrupt value
    teq     r0, r7                  @ check that the handler left the counter unchanged
    strne   r0, [r0, -r0]           @ if not, force a fault by storing to address 0
#endif
#ifdef CONFIG_TRACE_IRQFLAGS
    bl      trace_hardirqs_on
#endif
    mov     why, #0                 @ why is r8, so r8 = 0
    b       ret_to_user             @ interrupt handled: restore the context and
                                    @ return to the interrupted point (analyzed later)
 UNWIND(.fnend        )
ENDPROC(__irq_usr)

usr_entry, used in the code above, is a macro that saves the interrupt context onto the stack:


    .macro  usr_entry
 UNWIND(.fnstart    )
 UNWIND(.cantunwind )               @ dont unwind the user space
    @ In the ATPCS the stack is full descending, so first move sp down by
    @ S_FRAME_SIZE (the size of struct pt_regs) to make room for the data.
    @ The sp here is the SVC mode stack pointer.
    sub     sp, sp, #S_FRAME_SIZE
    stmib   sp, {r1 - r12}

    ldmia   r0, {r1 - r3}
    add     r0, sp, #S_PC           @ here for interlock avoidance
    mov     r4, #-1                 @  ""  ""     ""        ""

    str     r1, [sp]                @ save the "real" r0 copied
                                    @ from the exception stack

    @
    @ We are now ready to fill in the remaining blanks on the stack:
    @
    @ r2 - lr_<exception>, already fixed up for correct return/restart
    @ r3 - spsr_<exception>
    @ r4 - orig_r0 (see pt_regs definition in ptrace.h)
    @
    @ Also, separately save sp_usr and lr_usr
    @
    stmia   r0, {r2 - r4}
    stmdb   r0, {sp, lr}^           @ save the user mode sp and lr on the SVC mode stack

    @
    @ Enable the alignment trap while in kernel mode
    @
    alignment_trap r0

    @
    @ Clear FP to mark the first stack frame
    @
    zero_fp
    .endm
The code above mainly fills in struct pt_regs, which is defined in include/asm/ptrace.h:
struct pt_regs {
    long uregs[18];
};

#define ARM_cpsr    uregs[16]
#define ARM_pc      uregs[15]
#define ARM_lr      uregs[14]
#define ARM_sp      uregs[13]
#define ARM_ip      uregs[12]
#define ARM_fp      uregs[11]
#define ARM_r10     uregs[10]
#define ARM_r9      uregs[9]
#define ARM_r8      uregs[8]
#define ARM_r7      uregs[7]
#define ARM_r6      uregs[6]
#define ARM_r5      uregs[5]
#define ARM_r4      uregs[4]
#define ARM_r3      uregs[3]
#define ARM_r2      uregs[2]
#define ARM_r1      uregs[1]
#define ARM_r0      uregs[0]
#define ARM_ORIG_r0 uregs[17]
As shown in Figure 5-2, the usr_entry macro fills the pt_regs structure as follows: it first saves r1 to r12 into ARM_r1 to ARM_ip (green), then stores the r0 value from the time of the interrupt into ARM_r0 (blue), then saves lr_irq (the address of the next instruction at the time of the interrupt), spsr_irq and r4 into ARM_pc, ARM_cpsr and ARM_ORIG_r0 (red), and finally saves the user mode sp and lr into ARM_sp and ARM_lr.
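The frame layout, and the byte offsets that the assembly refers to as S_PC, S_PSR and S_FRAME_SIZE, can be reproduced in a few lines of C. This is a model rather than kernel code: the MODEL_ macros and model_usr_entry are hypothetical, and int32_t stands in for the 4-byte long of 32-bit ARM so that the offsets come out the same on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of struct pt_regs with the 4-byte slots of 32-bit ARM. */
struct model_pt_regs {
    int32_t uregs[18];
};

/* Byte offsets into the frame, in the style of the S_* assembler constants. */
#define MODEL_S_R0         (0  * sizeof(int32_t))
#define MODEL_S_SP         (13 * sizeof(int32_t))
#define MODEL_S_LR         (14 * sizeof(int32_t))
#define MODEL_S_PC         (15 * sizeof(int32_t))
#define MODEL_S_PSR        (16 * sizeof(int32_t))
#define MODEL_S_OLD_R0     (17 * sizeof(int32_t))
#define MODEL_S_FRAME_SIZE (sizeof(struct model_pt_regs))

/*
 * Fill the frame the way usr_entry does for an interrupt taken in user mode:
 * r0-r12, then the user sp/lr, then the return address, the saved status
 * register, and -1 as orig_r0.
 */
static void model_usr_entry(struct model_pt_regs *regs, const int32_t r[13],
                            int32_t sp_usr, int32_t lr_usr,
                            int32_t lr_irq, int32_t spsr_irq)
{
    int i;

    for (i = 0; i <= 12; i++)
        regs->uregs[i] = r[i];      /* ARM_r0 .. ARM_ip */
    regs->uregs[13] = sp_usr;       /* ARM_sp */
    regs->uregs[14] = lr_usr;       /* ARM_lr */
    regs->uregs[15] = lr_irq;       /* ARM_pc: where to resume */
    regs->uregs[16] = spsr_irq;     /* ARM_cpsr: status before the interrupt */
    regs->uregs[17] = -1;           /* ARM_ORIG_r0 */
}
```

With 4-byte slots the frame is 72 bytes, the return address sits at offset 60 and the saved status register at offset 64, which is what the sub sp, sp, #S_FRAME_SIZE and add r0, sp, #S_PC instructions rely on.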








If the processor was in kernel mode before the interrupt, control goes to __irq_svc, which is defined as follows (arch/arm/kernel/entry-armv.S):

    .align  5
__irq_svc:
    svc_entry                       @ save the interrupt context

#ifdef CONFIG_TRACE_IRQFLAGS
    bl      trace_hardirqs_off
#endif
#ifdef CONFIG_PREEMPT
    get_thread_info tsk
    ldr     r8, [tsk, #TI_PREEMPT]  @ get the preempt counter
    add     r7, r8, #1              @ add 1 to forbid preemption
    str     r7, [tsk, #TI_PREEMPT]  @ write the incremented value back to
                                    @ thread_info on the kernel stack
#endif

    irq_handler                     @ call the interrupt handler (analyzed later)
#ifdef CONFIG_PREEMPT
    str     r8, [tsk, #TI_PREEMPT]  @ restore the pre-interrupt preempt counter
    ldr     r0, [tsk, #TI_FLAGS]    @ get flags
    teq     r8, #0                  @ is the preempt count 0?
    movne   r0, #0                  @ if the preempt count is not 0, r0 = 0
    tst     r0, #_TIF_NEED_RESCHED  @ AND r0 with _TIF_NEED_RESCHED
    blne    svc_preempt             @ nonzero result: kernel preemption is due,
                                    @ so reschedule
#endif

    ldr     r0, [sp, #S_PSR]        @ irqs are already disabled
    msr     spsr_cxsf, r0
#ifdef CONFIG_TRACE_IRQFLAGS
    tst     r0, #PSR_I_BIT
    bleq    trace_hardirqs_on
#endif
    svc_exit r4                     @ restore the interrupt context (analyzed later)
 UNWIND(.fnend        )
ENDPROC(__irq_svc)
svc_entry is a macro that saves the interrupt context onto the stack:

    .macro  svc_entry, stack_hole=0
 UNWIND(.fnstart        )
 UNWIND(.save {r0 - pc} )
    sub     sp, sp, #(S_FRAME_SIZE + \stack_hole)
 SPFIX( tst     sp, #4      )
 SPFIX( bicne   sp, sp, #4  )
    stmib   sp, {r1 - r12}

    ldmia   r0, {r1 - r3}
    add     r5, sp, #S_SP           @ here for interlock avoidance
    mov     r4, #-1                 @  ""  ""     ""        ""
    add     r0, sp, #(S_FRAME_SIZE + \stack_hole)
 SPFIX( addne   r0, r0, #4  )
    str     r1, [sp]                @ save the "real" r0 copied
                                    @ from the exception stack

    mov     r1, lr

    @
    @ We are now ready to fill in the remaining blanks on the stack:
    @
    @ r0 - sp_svc
    @ r1 - lr_svc
    @ r2 - lr_<exception>, already fixed up for correct return/restart
    @ r3 - spsr_<exception>
    @ r4 - orig_r0 (see pt_regs definition in ptrace.h)
    @
    stmia   r5, {r0 - r4}
    .endm
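The svc_entry macro fills the pt_regs structure as shown in Figure 5-2: it first saves r1 to r12 into ARM_r1 to ARM_ip (green), then stores the r0 value from the time of the interrupt into ARM_r0 (blue); because the interrupt occurred in SVC mode, it finally saves sp_svc, lr_svc, lr_irq, spsr_irq and r4 into ARM_sp, ARM_lr, ARM_pc, ARM_cpsr and ARM_ORIG_r0 (red).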







Saving the interrupt context involves three stack pointers: sp_usr, the user mode stack pointer; sp_svc, the kernel mode stack pointer; and sp_irq, the IRQ mode stack pointer. sp_usr points to the user mode stack, which is created in setup_arg_pages. sp_svc points to the kernel mode stack, which is created in alloc_thread_info. sp_irq is assigned in cpu_init and points to the global variable stacks.irq[0].
After the interrupt context is saved, execution enters the interrupt handler, irq_handler, defined in arch/arm/kernel/entry-armv.S:


    .macro  irq_handler
    get_irqnr_preamble r5, lr
1:  get_irqnr_and_base r0, r6, r5, lr   @ get the interrupt number into r0 (analyzed later)
    movne   r1, sp                  @ if the interrupt number is not 0, r1 = sp,
                                    @ i.e. the address of the pt_regs structure
    @
    @ routine called with r0 = irq number, r1 = struct pt_regs *
    @
    adrne   lr, 1b                  @ if r0 (the interrupt number) is not 0, set lr
                                    @ (the return address) to label 1, the
                                    @ get_irqnr_and_base line, so that all pending
                                    @ interrupts are handled in a loop
    bne     asm_do_IRQ              @ enter interrupt handling (analyzed later)
    ……
    .endm
get_irqnr_and_base determines the current interrupt number and is closely tied to the CPU, so it is not analyzed here. If the interrupt number obtained is not 0, it is placed in r0 as the first argument, the address of the pt_regs structure is placed in r1 as the second argument, and control jumps to the C function asm_do_IRQ for further processing. To avoid switching back and forth between assembly and C, the remaining assembly code (the return path) is analyzed first, and asm_do_IRQ afterwards. Looking back at __irq_usr and __irq_svc, once irq_handler completes, the exception handler must return to the point of interruption. If the interrupt occurred in user space, ret_to_user is called to restore the interrupted context and return to user space:


arch/arm/kernel/entry-common.S
ENTRY(ret_to_user)
ret_slow_syscall:
    disable_irq                     @ disable interrupts. This disable_irq is the
                                    @ assembler macro from asm/assembler.h that masks
                                    @ IRQs in the CPSR, not the C function
                                    @ disable_irq(irq), which is why no IRQ number
                                    @ is loaded into r0 first.
    ldr     r1, [tsk, #TI_FLAGS]    @ get thread_info->flags
    tst     r1, #_TIF_WORK_MASK     @ is there pending work?
    bne     work_pending            @ if so, go to work_pending, which mainly
                                    @ handles preemption of the user process
no_work_pending:                    @ no pending work: prepare to restore the
                                    @ interrupted context and return to user space
    /* perform architecture specific actions before user return */
    arch_ret_to_user r1, lr         @ call architecture-specific code

    restore_user_regs fast = 0, offset = 0  @ invoke restore_user_regs
ENDPROC(ret_to_user)

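The following macro restores the registers of the interrupted context, that is, the registers that were saved on the kernel stack when the interrupt occurred. Compare the code with the kernel stack contents shown in Figure 5-2:
.macro    restore_user_regs, fast = 0, offset = 0
    ldr    r1, [sp, #\offset + S_PSR]    @ load the cpsr value at the time of the interrupt from the kernel stack
    ldr    lr, [sp, #\offset + S_PC]!    @ load the address of the next instruction at the time of the interrupt from the kernel stack
    msr    spsr_cxsf, r1            @ write r1 to spsr_svc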
#if defined(CONFIG_CPU_32v6K)
    clrex                    @ clear the exclusive monitor
#elif defined (CONFIG_CPU_V6)
    strex    r1, r2, [sp]            @ clear the exclusive monitor
#endif
    .if    \fast
    ldmdb    sp, {r1 - lr}^    @ get calling r1 - lr
    .else
    ldmdb    sp, {r0 - lr}^ @ because of the ^, the values saved on the kernel stack are restored to the user-mode registers r0-lr
    .endif
    add    sp, sp, #S_FRAME_SIZE - S_PC 
    movs    pc, lr    @ load the address of the next instruction into pc, returning to the interrupted point, and restore the cpsr saved at interrupt time (which re-enables interrupts)
    .endm
If the interrupt occurred in kernel space, svc_exit is called to restore the interrupted context:

arch/arm/kernel/entry-header.S
.macro    svc_exit, rpsr
    msr    spsr_cxsf, \rpsr
#if defined(CONFIG_CPU_32v6K)
    clrex                    @ clear the exclusive monitor
    ldmia    sp, {r0 - pc}^            @ load r0 - pc, cpsr
#elif defined (CONFIG_CPU_V6)
    ldr    r0, [sp]
    strex    r1, r2, [sp]            @ clear the exclusive monitor
    ldmib    sp, {r1 - pc}^            @ load r1 - pc, cpsr
#else
    ldmia    sp, {r0 - pc}^            @ returning to kernel space is simpler: restore r0-pc and the cpsr, which also re-enables interrupts
#endif
    .endm
OK. Having analyzed all the interrupt-related assembly code, we now turn to the C code.
The asm_do_IRQ function is defined in arch/arm/kernel/irq.c:

asmlinkage void __exception asm_do_IRQ(unsigned int irq, struct pt_regs *regs)
{
    /* Save the new register-set pointer in a global per-CPU variable so that later handlers can access the register set. */
    struct pt_regs *old_regs = set_irq_regs(regs); 

    irq_enter();

    /*
     * Some hardware gives randomly wrong interrupts. Rather
     * than crashing, do something sensible.
     */
    if (unlikely(irq >= NR_IRQS)) { // check the interrupt number
        if (printk_ratelimit())
            printk(KERN_WARNING "Bad IRQ%u\n", irq);
        ack_bad_irq(irq);
    } else {
        generic_handle_irq(irq); // call the interrupt handler
    }

    /* AT91 specific workaround */
    irq_finish(irq);

    irq_exit();
    set_irq_regs(old_regs);
}
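The reason asm_do_IRQ keeps old_regs and restores it at the end becomes clearer when interrupts nest. The following is a minimal user-space sketch, assuming a single CPU; struct pt_regs is reduced to one field, and set_irq_regs_sketch, get_irq_regs_sketch and do_irq_sketch are hypothetical names, not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the kernel's struct pt_regs. */
struct pt_regs { unsigned long pc; };

/* The kernel keeps this pointer per CPU; one global models one CPU. */
static struct pt_regs *irq_regs_ptr;

/* Install a new register-frame pointer and hand back the previous one. */
static struct pt_regs *set_irq_regs_sketch(struct pt_regs *new_regs)
{
    struct pt_regs *old = irq_regs_ptr;

    irq_regs_ptr = new_regs;
    return old;
}

static struct pt_regs *get_irq_regs_sketch(void)
{
    return irq_regs_ptr;
}

/* pc of the frame seen by the most recently entered handler. */
static unsigned long innermost_pc;

/* Model of asm_do_IRQ where each entry is interrupted by the next
 * frame in the array, until the frames run out. */
static void do_irq_sketch(struct pt_regs *frames, int nframes)
{
    struct pt_regs *old_regs;

    if (nframes == 0)
        return;
    old_regs = set_irq_regs_sketch(&frames[0]);
    innermost_pc = get_irq_regs_sketch()->pc;    /* handler sees its own frame */
    do_irq_sketch(frames + 1, nframes - 1);      /* nested interrupt, if any */
    assert(get_irq_regs_sketch() == &frames[0]); /* nested entry restored our pointer */
    set_irq_regs_sketch(old_regs);
}
```

With two frames, the inner entry sees the second frame, and after both entries have returned the pointer is back to NULL, its value before the first interrupt.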
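asm_do_IRQ is the C entry point for interrupt handling. Its main job is to call the interrupt handlers registered with request_irq; the flow is shown in Figure 5-4:

Figure 5-4 asm_do_IRQ flow
set_irq_regs stores the pointer to the register structure in a global per-CPU variable, through which later code can access the register structure. Therefore, before interrupt handling starts, the old pointer held in that variable is saved, and it is restored once handling has finished. irq_enter updates some statistics: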

void irq_enter(void)
{
    int cpu = smp_processor_id();

    rcu_irq_enter();
    if (idle_cpu(cpu) && !in_interrupt()) {
        __irq_enter();
        tick_check_idle(cpu);
    } else
        __irq_enter();
}
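If the system has dynamic ticks enabled and no timer interrupt has occurred for a long time, tick_check_idle is called to update the global variable jiffies (dynamic ticks will be analyzed in a later write-up). The __irq_enter() macro is defined as follows: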

#define __irq_enter()                    \
    do {                        \
        account_system_vtime(current);        \
        add_preempt_count(HARDIRQ_OFFSET);    \
        trace_hardirq_enter();            \
    } while (0)
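add_preempt_count(HARDIRQ_OFFSET) increments the counter that tracks the nesting level of interrupt handlers. The counter is kept in the preempt_count field of the current process's thread_info structure:

Figure 5-5 preempt_count layout
The kernel divides preempt_count into five parts: bits 0-7 hold the preemption count, bits 8-15 the softirq count, bits 16-25 the hardirq count, bit 26 the non-maskable interrupt (NMI) count, and bit 28 the PREEMPT_ACTIVE flag.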
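The field layout just described can be written out as shifts and masks. The sketch below re-declares the constants in user space from the bit ranges given above; the values are assumptions tied to that description (other kernel versions use different widths), and FIELD_MASK, hardirq_depth, enter_hardirq and in_interrupt_sketch are hypothetical helper names.

```c
#include <assert.h>

/* Field widths taken from the layout described in the text. */
#define PREEMPT_BITS   8
#define SOFTIRQ_BITS   8
#define HARDIRQ_BITS   10
#define NMI_BITS       1

#define PREEMPT_SHIFT  0
#define SOFTIRQ_SHIFT  (PREEMPT_SHIFT + PREEMPT_BITS)   /* bit 8  */
#define HARDIRQ_SHIFT  (SOFTIRQ_SHIFT + SOFTIRQ_BITS)   /* bit 16 */
#define NMI_SHIFT      (HARDIRQ_SHIFT + HARDIRQ_BITS)   /* bit 26 */

#define FIELD_MASK(bits) ((1UL << (bits)) - 1)
#define PREEMPT_MASK   (FIELD_MASK(PREEMPT_BITS) << PREEMPT_SHIFT)
#define SOFTIRQ_MASK   (FIELD_MASK(SOFTIRQ_BITS) << SOFTIRQ_SHIFT)
#define HARDIRQ_MASK   (FIELD_MASK(HARDIRQ_BITS) << HARDIRQ_SHIFT)
#define NMI_MASK       (FIELD_MASK(NMI_BITS) << NMI_SHIFT)

/* Adding one OFFSET increments the corresponding field by one. */
#define SOFTIRQ_OFFSET (1UL << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET (1UL << HARDIRQ_SHIFT)

/* Current hardirq nesting level encoded in a preempt_count value. */
static unsigned long hardirq_depth(unsigned long count)
{
    return (count & HARDIRQ_MASK) >> HARDIRQ_SHIFT;
}

/* What add_preempt_count(HARDIRQ_OFFSET) does to the counter. */
static unsigned long enter_hardirq(unsigned long count)
{
    /* one more level would overflow the 10-bit field into the NMI bit */
    assert(hardirq_depth(count) < FIELD_MASK(HARDIRQ_BITS));
    return count + HARDIRQ_OFFSET;
}

/* Non-zero if any of the hardirq, softirq or NMI fields is non-zero. */
static int in_interrupt_sketch(unsigned long count)
{
    return (count & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK)) != 0;
}
```

Because each field has its own bit range, a single addition changes the hardirq nesting level without disturbing the preemption or softirq counts, and a single mask test answers "are we in any kind of interrupt context".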
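generic_handle_irq is architecture-independent and calls desc->handle_irq. During interrupt initialization this function pointer is set to a flow handler (handle_level_irq or handle_edge_irq) that matches the trigger type of the interrupt (level-triggered or edge-triggered). The flow handler then calls handle_IRQ_event, which walks the action list and calls the one or more handlers, action->handler, registered for that interrupt number; action->handler is what request_irq set up.
We start with handle_level_irq: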

void handle_level_irq(unsigned int irq, struct irq_desc *desc)
{
    struct irqaction *action;
    irqreturn_t action_ret;

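    spin_lock(&desc->lock); /* take the spinlock before accessing desc */
    mask_ack_irq(desc, irq); /* mask (and acknowledge) the interrupt line for this irq */

    /* On multiprocessor systems, avoid several CPUs handling the same
     * interrupt at once. If desc->status contains IRQ_INPROGRESS, the
     * interrupt is being handled on another CPU, so this CPU can
     * simply give up.
     */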
    if (unlikely(desc->status & IRQ_INPROGRESS)) 
        goto out_unlock;
    desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
    kstat_incr_irqs_this_cpu(irq, desc);

    /*
     * Bail out if no handler is registered for this interrupt
     * (desc->action is NULL) or if desc->status has IRQ_DISABLED set,
     * meaning the interrupt is disabled. Either condition is enough.
     */
    action = desc->action;
    if (unlikely(!action || (desc->status & IRQ_DISABLED)))
        goto out_unlock;

    desc->status |= IRQ_INPROGRESS; /* mark the interrupt as being handled */
    spin_unlock(&desc->lock); /* release the spinlock */

    /* call the handlers registered with request_irq (analyzed later) */
    action_ret = handle_IRQ_event(irq, action); 
    if (!noirqdebug)
        note_interrupt(irq, desc, action_ret);

    spin_lock(&desc->lock); /* retake the spinlock before accessing desc */
    desc->status &= ~IRQ_INPROGRESS; /* clear the "in progress" mark */

    /* If desc->status contains IRQ_ONESHOT, set IRQ_MASKED in
     * desc->status so that the interrupt stays masked. */
    if (unlikely(desc->status & IRQ_ONESHOT))
        desc->status |= IRQ_MASKED;
    /* Otherwise, if IRQ_DISABLED was not set in desc->status during
     * handling and desc->chip->unmask is not NULL, call the
     * chip-specific function it points to and unmask the interrupt.
     */
    else if (!(desc->status & IRQ_DISABLED) && desc->chip->unmask)
        desc->chip->unmask(irq);
out_unlock:
    spin_unlock(&desc->lock); /* release the spinlock */
}
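Stripped of locking and statistics, the control flow of handle_level_irq is: mask, decide whether to handle, handle, then unmask unless something says otherwise. A minimal single-threaded sketch follows, assuming arbitrary flag values; the SK_ flags, struct desc_sketch and handle_level_sketch are hypothetical, and the spinlock is omitted because nothing runs concurrently here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values are arbitrary for this sketch, not the kernel's. */
enum {
    SK_INPROGRESS = 1 << 0,
    SK_DISABLED   = 1 << 1,
    SK_ONESHOT    = 1 << 2,
    SK_MASKED     = 1 << 3,
};

struct desc_sketch {
    unsigned int status;
    bool line_masked;        /* models the controller's mask bit */
    int (*handler)(void);    /* NULL means no action is registered */
    unsigned int calls;      /* how many times the handler ran */
};

static void handle_level_sketch(struct desc_sketch *d)
{
    assert(d != NULL);
    d->line_masked = true;            /* mask_ack_irq(): silence the level */

    if (d->status & SK_INPROGRESS)    /* already being handled elsewhere */
        return;
    if (!d->handler || (d->status & SK_DISABLED))
        return;                       /* give up, the line stays masked */

    d->status |= SK_INPROGRESS;
    d->handler();                     /* stands in for handle_IRQ_event() */
    d->calls++;
    d->status &= ~SK_INPROGRESS;

    if (d->status & SK_ONESHOT)
        d->status |= SK_MASKED;       /* keep the line masked */
    else if (!(d->status & SK_DISABLED))
        d->line_masked = false;       /* chip->unmask() */
}

static int dummy_handler(void)
{
    return 1;
}
```

The line is masked first because a level-triggered source keeps asserting until the device is serviced; without the mask the same interrupt would fire again as soon as interrupts were re-enabled.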
Next comes handle_edge_irq, which is somewhat more complex than handle_level_irq:

void handle_edge_irq(unsigned int irq, struct irq_desc *desc)
{
    spin_lock(&desc->lock);
    desc->status &= ~(IRQ_REPLAY | IRQ_WAITING);
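    /*
     * If the interrupt is being handled by another CPU, is disabled,
     * or has no handler, do not handle it now; mark it pending and
     * mask it so that it can be dealt with later.
     */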
    if (unlikely((desc->status & (IRQ_INPROGRESS | IRQ_DISABLED)) ||
         !desc->action)) {
        desc->status |= (IRQ_PENDING | IRQ_MASKED);
        mask_ack_irq(desc, irq);
        goto out_unlock;
    }
    kstat_incr_irqs_this_cpu(irq, desc);

    /* Start handling the irq */
    if (desc->chip->ack)
        desc->chip->ack(irq);

    /* mark the interrupt as "in progress" */
    desc->status |= IRQ_INPROGRESS;

    do {
        struct irqaction *action = desc->action;
        irqreturn_t action_ret;

        if (unlikely(!action)) {
            desc->chip->mask(irq);
            goto out_unlock;
        }

        /*
         * If another interrupt arrived while this one was being
         * handled, the line may have been masked at that time.
         * If the interrupt is not disabled, unmask it.
         */
        if (unlikely((desc->status &
             (IRQ_PENDING | IRQ_MASKED | IRQ_DISABLED)) ==
             (IRQ_PENDING | IRQ_MASKED))) {
            desc->chip->unmask(irq);
            desc->status &= ~IRQ_MASKED;
        }

        desc->status &= ~IRQ_PENDING;
        spin_unlock(&desc->lock);
        /* call the handlers registered with request_irq (analyzed later) */
        action_ret = handle_IRQ_event(irq, action);
        if (!noirqdebug)
            note_interrupt(irq, desc, action_ret);
        spin_lock(&desc->lock);
        /* If the interrupt is not disabled and another interrupt is
         * waiting (IRQ_PENDING), loop to handle it.
         */
    } while ((desc->status & (IRQ_PENDING | IRQ_DISABLED)) == IRQ_PENDING);

    desc->status &= ~IRQ_INPROGRESS;
out_unlock:
    spin_unlock(&desc->lock);
}
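The interplay of IRQ_PENDING and IRQ_MASKED is easier to follow in a small simulation. In the sketch below the same edge interrupt fires again while its handler is still running, which is modelled by calling the flow handler re-entrantly from inside the handler. All names (the EK_ flags, struct edge_desc, handler_sketch, handle_edge_sketch) are hypothetical and the flag values are arbitrary; locking and the ack step are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum {
    EK_INPROGRESS = 1 << 0,
    EK_DISABLED   = 1 << 1,
    EK_PENDING    = 1 << 2,
    EK_MASKED     = 1 << 3,
};

struct edge_desc {
    unsigned int status;
    bool line_masked;                   /* models the controller's mask bit */
    unsigned int runs;                  /* how many times the handler ran */
    unsigned int edges_during_handler;  /* edges to inject while it runs */
};

static void handle_edge_sketch(struct edge_desc *d);

/* Device handler; optionally lets a new edge arrive while it runs. */
static void handler_sketch(struct edge_desc *d)
{
    d->runs++;
    if (d->edges_during_handler > 0) {
        d->edges_during_handler--;
        handle_edge_sketch(d);  /* models the same IRQ firing on another CPU */
    }
}

static void handle_edge_sketch(struct edge_desc *d)
{
    assert(d != NULL);
    if (d->status & (EK_INPROGRESS | EK_DISABLED)) {
        /* cannot handle it now: remember the edge and mask the line */
        d->status |= EK_PENDING | EK_MASKED;
        d->line_masked = true;
        return;
    }

    d->status |= EK_INPROGRESS;
    do {
        /* an edge was recorded while we were busy: unmask again */
        if ((d->status & (EK_PENDING | EK_MASKED | EK_DISABLED)) ==
            (EK_PENDING | EK_MASKED)) {
            d->line_masked = false;
            d->status &= ~EK_MASKED;
        }
        d->status &= ~EK_PENDING;
        handler_sketch(d);
    } while ((d->status & (EK_PENDING | EK_DISABLED)) == EK_PENDING);
    d->status &= ~EK_INPROGRESS;
}
```

An edge that arrives during handling is not lost: the second entry only records IRQ_PENDING and masks the line, and the first entry notices the flag when its handler returns and runs the handler once more.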
Whether the interrupt is level-triggered or edge-triggered, the registered handlers are ultimately called through handle_IRQ_event.

irqreturn_t handle_IRQ_event(unsigned int irq, struct irqaction *action)
{
    irqreturn_t ret, retval = IRQ_NONE;
    unsigned int status = 0;

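    /*
     * If IRQF_DISABLED was not set when the interrupt was registered,
     * hard interrupts are enabled here, which allows hard interrupts
     * to nest. Starting with kernel 2.6.36 the IRQF_DISABLED flag is
     * deprecated and interrupts are no longer enabled here, to avoid
     * the risk of a stack overflow caused by nested interrupts.
     * See http://lwn.net/Articles/380931/
     */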
    if (!(action->flags & IRQF_DISABLED))
        local_irq_enable_in_hardirq();

    do {
        trace_irq_handler_entry(irq, action);
        /* call the handler registered with request_irq */
        ret = action->handler(irq, action->dev_id);
        trace_irq_handler_exit(irq, action, ret);

        switch (ret) {
        case IRQ_WAKE_THREAD: /* threaded interrupt handling */
            /*
             * Set result to handled so the spurious check
             * does not trigger.
             */
            ret = IRQ_HANDLED;

            if (unlikely(!action->thread_fn)) {
                warn_no_thread(irq, action);
                break;
            }

            if (likely(!test_bit(IRQTF_DIED,
                     &action->thread_flags))) {
                set_bit(IRQTF_RUNTHREAD, &action->thread_flags);
                /* wake the handler thread registered with request_threaded_irq */
                wake_up_process(action->thread);
            }

            /* Fall through to add to randomness */
        case IRQ_HANDLED: /* the handler returned normally */
            status |= action->flags;
            break;

        default:
            break;
        }

        retval |= ret;
        action = action->next; /* move to the next handler */
    } while (action); /* call every handler registered on the same line (shared interrupt line) */

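    /* If IRQF_SAMPLE_RANDOM was specified at registration, call
     * add_interrupt_randomness to feed the interrupt timing into
     * the random number generator as entropy. */
    if (status & IRQF_SAMPLE_RANDOM)
        add_interrupt_randomness(irq);
    /* Interrupt enabling and disabling do not nest, so the earlier
     * state does not matter: interrupts were disabled on entry to
     * handle_IRQ_event, so they must be disabled again on exit.
     */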
    local_irq_disable();

    return retval;
}
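The core of handle_IRQ_event is the walk over the action list of a shared line. The sketch below keeps only that part; sk_irqreturn_t, struct action_sketch, struct dev_sketch, dev_handler and run_actions are hypothetical names, and the "raised" flag stands in for reading a device's interrupt status register, which is an assumption about how a handler decides whether the interrupt is its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { SK_IRQ_NONE = 0, SK_IRQ_HANDLED = 1 } sk_irqreturn_t;

/* One registered handler on a (possibly shared) interrupt line. */
struct action_sketch {
    sk_irqreturn_t (*handler)(int irq, void *dev_id);
    void *dev_id;                 /* identifies the device to the handler */
    struct action_sketch *next;   /* next handler sharing the line */
};

/* Hypothetical device: "raised" models its interrupt status bit. */
struct dev_sketch {
    bool raised;
    unsigned int serviced;
};

static sk_irqreturn_t dev_handler(int irq, void *dev_id)
{
    struct dev_sketch *dev = dev_id;

    (void)irq;
    if (!dev->raised)
        return SK_IRQ_NONE;       /* not our device, let others look */
    dev->raised = false;
    dev->serviced++;
    return SK_IRQ_HANDLED;
}

/* Call every handler on the line and OR the results together. */
static int run_actions(int irq, struct action_sketch *action)
{
    int retval = SK_IRQ_NONE;

    while (action) {
        assert(action->handler != NULL);
        retval |= action->handler(irq, action->dev_id);
        action = action->next;
    }
    return retval;
}
```

Every handler on the line is called on each interrupt, so a handler for a shared line has to check its own device and return the "none" value when the interrupt came from somewhere else; the combined result tells the caller whether anyone claimed it.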
After the handlers have been called, control returns to asm_do_IRQ. The most important remaining step is the interrupt exit function irq_exit:

void irq_exit(void)
{
    account_system_vtime(current);
    trace_hardirq_exit();
    sub_preempt_count(IRQ_EXIT_OFFSET);
    if (!in_interrupt() && local_softirq_pending())
        invoke_softirq();

#ifdef CONFIG_NO_HZ
    /* Make sure that timer wheel updates are propagated */
    rcu_irq_exit();
    if (idle_cpu(smp_processor_id()) && !in_interrupt() && !need_resched())
        tick_nohz_stop_sched_tick(0);
#endif
    preempt_enable_no_resched();
}
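irq_exit first subtracts IRQ_EXIT_OFFSET from the preempt_count counter to mark the exit from hard interrupt context; this pairs with the add_preempt_count call in irq_enter. On systems without kernel preemption, IRQ_EXIT_OFFSET = HARDIRQ_OFFSET; otherwise IRQ_EXIT_OFFSET = (HARDIRQ_OFFSET-1), which means that with kernel preemption enabled the kernel keeps preemption temporarily disabled when leaving the hard interrupt, because softirqs may have to be processed next.
irq_exit then uses the in_interrupt() macro to check whether it is still in interrupt context: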
#define irq_count()    (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_MASK | NMI_MASK))
#define in_interrupt()        (irq_count())

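The in_interrupt() macro shows that the kernel treats HARDIRQ, SOFTIRQ and NMI all as interrupt context. So when irq_exit finds that it is no longer in interrupt context and that softirqs are pending, it calls invoke_softirq() to run the softirq handlers (analyzed later).
After softirqs have been processed, if the kernel supports dynamic ticks, irq_exit does some dynamic-tick bookkeeping and then calls preempt_enable_no_resched() to re-enable kernel preemption. Back in asm_do_IRQ, the function finally calls set_irq_regs(old_regs) to restore the register-set pointer to its value before the interrupt. When asm_do_IRQ finishes, it returns to its entry point and execution is back in assembly code; see the earlier analysis of the assembly code.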
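The arithmetic behind IRQ_EXIT_OFFSET can be checked with a small model of the counter. This sketch assumes a preemptible kernel, so that IRQ_EXIT_OFFSET is HARDIRQ_OFFSET-1, and it reduces softirq processing to a flag and a run counter. Every name ending in _sk is hypothetical, and the mask value is derived from the bit layout given earlier (bits 8-26).

```c
#include <assert.h>
#include <stdbool.h>

#define HARDIRQ_OFFSET_SK   (1UL << 16)
#define IRQ_EXIT_OFFSET_SK  (HARDIRQ_OFFSET_SK - 1)  /* preemptible-kernel case */
#define IRQ_CONTEXT_MASK_SK 0x07ffff00UL             /* softirq | hardirq | NMI fields */

static unsigned long count_sk;             /* models preempt_count */
static bool softirq_pending_sk;            /* models local_softirq_pending() */
static unsigned int softirq_runs_sk;       /* how often softirqs were run */
static unsigned long count_seen_by_softirq_sk;

static void irq_enter_sk(void)
{
    count_sk += HARDIRQ_OFFSET_SK;         /* one more hardirq nesting level */
}

static void irq_exit_sk(void)
{
    assert(count_sk >= HARDIRQ_OFFSET_SK); /* must pair with irq_enter_sk() */

    /* hardirq level -1, preemption count +1 in a single subtraction */
    count_sk -= IRQ_EXIT_OFFSET_SK;

    /* run softirqs only when leaving the outermost interrupt */
    if (!(count_sk & IRQ_CONTEXT_MASK_SK) && softirq_pending_sk) {
        softirq_pending_sk = false;
        count_seen_by_softirq_sk = count_sk;
        softirq_runs_sk++;
    }

    count_sk -= 1;                         /* preempt_enable_no_resched() */
}
```

With two nested interrupts and a softirq raised inside the inner one, the inner exit skips softirq processing because the hardirq field is still non-zero. The outer exit runs it with the counter equal to 1, meaning preemption is still held off, and only the final decrement brings the counter back to 0.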
