Thursday, October 29, 2015

BARE METAL HA HA HA HA

Fundamentals ::: First stepping stones of making an embedded system. BARE METALS.
BARE METAL

METAL

http://www.wiki.xilinx.com/Linux+Drivers

IDEAS for future projects

OS development
Applications:

Simple OS with scheduler algorithm.
When too many interrupts (or other stuff) -> does not respect deadlines, misses deadlines. 
When deadlines are missed for too long, the processor takes time to generate bitstream and program an external fpga (using foss tools) which implements some function accelerated in hw. 
Use http://yosefk.com/blog/how-fpgas-work-and-why-youll-buy-one.html to get inspired on which function to accelerate.
Languages (for bare-metal development of OS):
Goes baremetal?

http://www.embedded.com/design/programming-languages-and-tools/4428704/2/Alternatives-to-C-C--for-system-programming-in-a-distributed-multicore-world

http://www.slideshare.net/xen_com_mgr/next-generation-cloud-rise-of-the-unikernel-v2-updated?next_slideshow=1

http://programmers.stackexchange.com/questions/274927/is-functional-language-without-runtime-written-in-c-possible

http://www.reddit.com/r/haskell/comments/29tgjd/ideal_programming_language_for_a_new_modern_os/

https://www.reddit.com/r/rust/comments/2web3t/is_rust_going_to_replace_cc_when_it_comes_to/

http://www.reddit.com/r/haskell/comments/2sahpi/why_no_embedded_systems/
- Projects:

http://repetae.net/computer/jhc/manual.html

https://zinc.rs/

Good for multicore?

http://www.embedded.com/design/programming-languages-and-tools/4438718/Programming-languages-for-multicore-systems-

Very good reading:

Educating Embedded Systems Hackers:
https://www.kth.se/polopoly_fs/1.580282!/wese2014.pdf

http://elib.dlr.de/96449/1/Thesis_report_Wei_final.pdf
MULTICORE RASPI 2

https://www.raspberrypi.org/forums/viewtopic.php?f=72&t=98904&start=25
https://github.com/PeterLemon/RaspberryPi/tree/master/SMP/SMPINIT (smp)
OPENCORES:
- K7 Gtx sim model. 8b10 from opencores, what about rx_align?
- Rchitecture
EYE OPEN ON:
- opencl openmp for embedded
- xen hypervisor
- functional haskell/erlang/rust

ARM Linux Exception handlers implementation


Aborts Exception Handling

Aborts can be generated either on failed instruction fetches (prefetch aborts) or failed data accesses (data aborts). They can come from the external memory system giving an error response on a memory access (indicating perhaps that the specified address does not correspond to real memory in the system).
Alternatively, the abort can be generated by the Memory Management Unit (MMU) of the processor. An operating system can use MMU aborts to dynamically allocate memory to applications.

 Prefetch Abort Implementation


1. After getting prefetch abort exception, current mode PC will store in exception_LR and 
    CPSR into exception_SPSR, then PC will point to prefetch abort vector address.
        __vectors_start:
        ...
             W(b)    vector_pabt + stubs_offset
      
2. So when ARM refers to the vector table it follows the branch and lands up here. At this moment ARM is in Abort mode, IRQs are disabled, LR contains PC of when abort occurred and SPSR contains CPSR of when abort occurred. Since we are in abort mode, so r13 (SP) is banked, so load SP with address of a small stack frame (size of 3 words) that we have created at cpu_init(). This is abort exception stack  
     
    .macro  vector_stub, name, mode, correction=0
     vector_\name: (in this case vector_pabt)
   ...
   
        @
        @ Save r0, lr_ (parent PC) and spsr_
        @ (parent CPSR)
        @
        stmia   sp, {r0, lr}            @ save r0, lr
        mrs     lr, spsr
        str     lr, [sp, #8]            @ save spsr


        @
        @ Prepare for SVC32 mode.  IRQs remain disabled.
        @
        mrs     r0, cpsr
        eor     r0, r0, #(\mode ^ SVC_MODE | PSR_ISETSTATE)
        msr     spsr_cxsf, r0

        movs    pc, lr                  @ branch to handler in SVC mode

3. After this basic setup is done, depending on the mode, in which ARM was, when exception occurred
    we switch to specific handler. We'll assume that ARM was executing in SVC mode, so we'll look into the
    details of __pabt_svc

       /*
        * Prefetch abort dispatcher
        * Enter in ABT mode, spsr = USR CPSR, lr = USR PC
        */
        vector_stub     pabt, ABT_MODE, 4

        .long   __pabt_usr                      @  0 (USR_26 / USR_32)
        .long   __pabt_invalid                  @  1 (FIQ_26 / FIQ_32)
        .long   __pabt_invalid                  @  2 (IRQ_26 / IRQ_32)
        .long   __pabt_svc                      @  3 (SVC_26 / SVC_32)

 4. __pabt_svc saves r0-12 on SVC mode stack (i.e kernel stack of process which was interrupted), reads
     LR and SPSR from temporary IRQ stack and saves them on SVC mode stack. After that it will call
     pabt_helper and increments the preempt count.
   
     _pabt_svc:
         svc_entry
         mov     r2, sp                          @ regs
         pabt_helper


5.
      .macro  pabt_helper
        @ PABORT handler takes pt_regs in r2, fault address in r4 and psr in r5
         
            #ifdef MULTI_PABORT
                     ldr     ip, .LCprocfns
                     mov     lr, pc
                     ldr     pc, [ip, #PROCESSOR_PABT_FUNC]
            #else
                     bl      CPU_PABORT_HANDLER
            #endif
      .endm

6.
     arch/arm/include/asm/glue-pf.h
     #define CPU_PABORT_HANDLER v7_pabort

     arch/arm/mm/pabort-v7.S
             .align  5
     ENTRY(v7_pabort)
             mrc     p15, 0, r0, c6, c0, 2           @ get IFAR
             mrc     p15, 0, r1, c5, c0, 1           @ get IFSR
             b       do_PrefetchAbort
     ENDPROC(v7_pabort)

7. here r0 contains Fault Addr Reg and r1 contains Fault Status Reg, based on status, respective function
    will be called

    arch/arm/mm/fault.c

    asmlinkage void __exception 
    do_PrefetchAbort(unsigned long addr, unsigned int ifsr, struct pt_regs *regs)

    arch/arm/mm/fsr-2level.c
         static struct fsr_info ifsr_info[] = {
                  ...
                 { do_translation_fault, SIGSEGV, SEGV_MAPERR,   "section translation fault"        },
                 { do_bad,               SIGSEGV, SEGV_ACCERR,   "page access flag fault"           },
                 { do_page_fault,        SIGSEGV, SEGV_MAPERR,   "page translation fault"           },
                 { do_sect_fault,        SIGSEGV, SEGV_ACCERR,   "section permission fault"         },
                 { do_sect_fault,        SIGSEGV, SEGV_ACCERR,   "section permission fault"         },
                  ... 
         }

8. On error case will do Unhandled prefetch abort and broadcast die notification
   
          if (!inf->fn(addr, ifsr | FSR_LNX_PF, regs))
                 return;
          printk(KERN_ALERT "Unhandled prefetch abort: %s (0x%03x) at 0x%08lx\n",
                inf->name, ifsr, addr);

          arm_notify_die("", regs, &info, ifsr, 0);

9. if we are in user mode will just send SIGSEGV and kill that process, or in SVC mode will trigger oops

    arch/arm/kernel/traps.c
          arm_notify_die:
                 if (user_mode(regs)) {
                       force_sig_info(info->si_signo, info, current);
                 } else {                
                      die(str, regs, err);
                 }

Data Abort Implementation


       Data abort and prefetch abort implementation is almost same. Only do_DataAbort() is get called.


Interrupt handler implementation

  • Interrupt Setup

  • Interrupt Handling 

      When a IRQ is raised, ARM stops what it is processing ( Asuming it is not processing a FIQ!),
      disables further IRQs (not FIQs), puts CPSR in SPSR, puts current PC to LR and swithes to IRQ
      mode, refers to the vector table and jumps to the exception handler. In our case it jumps to the
      exception handler of IRQ.          

ARM Linux Interrupt Handling

When a IRQ is raised, ARM stops what it is processing ( Asuming it is not processing a FIQ!), disables further IRQs (not FIQs),
puts CPSR in SPSR, puts current PC to LR and swithes to IRQ mode, refers to the vector table and jumps to the exception handler.
In our case it jumps to the exception handler of IRQ.

following is the snippet of code for exception handler code for IRQ (again from arch/arm/kernel/entry-armV.S file):

__vectors_start:
 ARM(   swi     SYS_ERROR0      )
 THUMB( svc     #0              )
 THUMB( nop                     )
W(b)    vector_und + stubs_offset
W(ldr)  pc, .LCvswi + stubs_offset
W(b)    vector_pabt + stubs_offset
W(b)    vector_dabt + stubs_offset
W(b)    vector_addrexcptn + stubs_offset
W(b)    vector_irq + stubs_offset
W(b)    vector_fiq + stubs_offset

.globl  __vectors_end
__vectors_end:


/*
 * Vector stubs.
 *
 * This code is copied to 0xffff0200 so we can use branches in the
 * vectors, rather than ldr's.  Note that this code must not
 * exceed 0x300 bytes.
 *
 * Common stub entry macro:
 *   Enter in IRQ mode, spsr = SVC/USR CPSR, lr = SVC/USR PC
 *
 * SP points to a minimal amount of processor-private memory, the address
 * of which is copied into r0 for the mode specific abort handler.
 */
.macro  vector_stub, name, mode, correction=0
.align  5

1. In our case vector_irq
vector_\name:
  
2.      .if \correction
sub     lr, lr, #\correction
.endif
3. So when ARM refers to the vector table it follows the branch and lands up here
   At this moment ARM is in IRQ mode, IRQs are disabled, LR contains PC of when interrupt occured and SPSR contains CPSR of when interrupt occured.
   Since we are in IRQ mode so r13 (SP) is banked, so we load SP with address of a small stack frame that we have created at cpu_init().
   This stack is only used when we are in IRQ mode.

@
@ Save r0, lr_ (parent PC) and spsr_
@ (parent CPSR)
@
stmia   sp, {r0, lr}            @ save r0, lr
mrs     lr, spsr
str     lr, [sp, #8]            @ save spsr

4. We save LR_ and SPSR_ on the temporary IRQ stack and we switch to SVC mode
  
@
@ Prepare for SVC32 mode.  IRQs remain disabled.
@
mrs     r0, cpsr
eor     r0, r0, #(\mode ^ SVC_MODE | PSR_ISETSTATE)
msr     spsr_cxsf, r0

@
@ the branch table must immediately follow this code
@
and     lr, lr, #0x0f
 THUMB( adr     r0, 1f                  )
 THUMB( ldr     lr, [r0, lr, lsl #2]    )
mov     r0, sp
 ARM(   ldr     lr, [pc, lr, lsl #2]    )

5. After this basic setup is done depending on the mode in which ARM was there when interrupt occured we switch to specific handler.
   We'll assume that ARM was executing in SVC mode, so we'll look ino the details of __irq_svc

/*
 * Interrupt dispatcher
 */
vector_stub     irq, IRQ_MODE, 4

.long   __irq_usr                       @  0  (USR_26 / USR_32)
.long   __irq_invalid                   @  1  (FIQ_26 / FIQ_32)
.long   __irq_invalid                   @  2  (IRQ_26 / IRQ_32)
.long   __irq_svc                       @  3  (SVC_26 / SVC_32)

6. __irq_svc saves r0-12 on SVC mode stack (i.e kernel stack of process which was interrupted), reads LR and SPSR from temporary IRQ stack
   and saves them on SVC mode stack. After that it will call irq_handler and increments the preemt count.

__irq_svc:
svc_entry
irq_handler

#ifdef CONFIG_PREEMPT
get_thread_info tsk
ldr     r8, [tsk, #TI_PREEMPT]          @ get preempt count
ldr     r0, [tsk, #TI_FLAGS]            @ get flags
teq     r8, #0                          @ if preempt count != 0
movne   r0, #0                          @ force flags to 0
tst     r0, #_TIF_NEED_RESCHED
blne    svc_preempt
#endif

#ifdef CONFIG_TRACE_IRQFLAGS
@ The parent context IRQs must have been enabled to get here in
@ the first place, so there's no point checking the PSR I bit.
bl      trace_hardirqs_on
#endif
svc_exit r5                             @ return from exception
 UNWIND(.fnend          )
ENDPROC(__irq_svc)


7. After this arch specific handler get called.

/*
 * Interrupt handling.
 */
.macro  irq_handler
#ifdef CONFIG_MULTI_IRQ_HANDLER
ldr     r1, =handle_arch_irq
mov     r0, sp
adr     lr, BSYM(9997f)
ldr     pc, [r1]
#else
arch_irq_handler_default
#endif
9997:
.endm


8. For the SoCs which are using ARM GIC:

arch/arm/kernel/setup.c:
 handle_arch_irq = mdesc->handle_irq

   arch/arm/mach-ux500/board-mop500.c:
  .handle_irq     = gic_handle_irq,

arch/arm/kernel/irq.c:
  handle_IRQ:generic_handle_irq

kernel/irq/irqdesc.c:
  generic_handle_irq:generic_handle_irq_desc

include/linux/irqdesc.h:
  generic_handle_irq_desc:desc->handle_irq

9. handle_irq is the actual call for the flow handler which we registered as handle_fasteoi_irq

kernel/irq/chip.c
handle_fasteoi_irq:handle_irq_event

kernel/irq/handle.c
handle_irq_event:handle_irq_event_percpu

kernel/irq/handle.c
handle_irq_event_percpu

do {
res = action->handler(irq, action->dev_id);
action = action->next;
} while (action);  

Thursday, October 22, 2015

Page Table 64B

https://bwidawsk.net/blog/index.php/2014/07/future-ppgtt-part-4-dynamic-page-table-allocations-64-bit-address-space-gpu-mirroring-and-yeah-something-about-relocs-too/

https://lwn.net/Articles/106177/


Monday, October 12, 2015

TZ

TZ Tegra Master: http://nv-tegra.nvidia.com/gitweb/?p=3rdparty/ote_partner/tlk.git;a=shortlog;h=refs/heads/master


https://github.com/ARM-software/arm-trusted-firmware


http://www.slideshare.net/linaroorg/hkg15502-arm-trusted-firmware-evolution?related=1

http://ds.arm.com/developer-resources/sample-code/

Program execution flow

The flow of program execution is shown below:
  secureStart     startup_secure.s: Initialization of Secure world
       |
    __main        ARM library initialization
       |
     main         main_secure.c: Enable caches and configure TZPC
       |
  monitorInit     monitor.s: initialize Monitor
       |
     main         main_secure.c: Print message and execute SMC
       |
     S -> NS
       |
  normalStart     startup_normal.s: Initialization of Normal world
       |
    __main        ARM library initialization
       |
     main         main_normal.c: Enable caches, print message and execute SMC
       |
    NS -> S
       |
  SMC_Handler     monitor.s: Perform context switch from NS to S
       |
     main         main_secure.c: Print message and execute SMC
       |
  SMC_Handler     monitor.s: Perform context switch from S to NS
       |
     S -> NS
       |
     main         main_normal.c: Print message and execute SMC
       |
    NS -> S
       |
  SMC_Handler     monitor.s: Perform context switch from NS to S
       |
     main         main_secure.c: Print message and execute SMC

Article ::http://file.scirp.org/Html/7-9301356_18574.htm



PASS V_1 : I2C bit banging

I2C CALL FLOW LK


I2C bit banging :: https://lwn.net/Articles/230571/

Bitbanging i2c bus driver using the GPIO API


This is a very simple bitbanging i2c bus driver utilizing the new
arch-neutral GPIO API. Useful for chips that don't have a built-in
i2c controller, additional i2c busses, or testing purposes.

To use, include something similar to the following in the
board-specific setup code:

  #include 

  static struct i2c_gpio_platform_data i2c_gpio_data = {
 .sda_pin = GPIO_PIN_FOO,
 .scl_pin = GPIO_PIN_BAR,
  };
  static struct platform_device i2c_gpio_device = {
 .name  = "i2c-gpio",
 .id  = 0,
 .dev  = {
  .platform_data = &i2c_gpio_data,
 },
  };

Register this platform_device, set up the i2c pins as GPIO if
required and you're ready to go. This will use default values for
udelay and timeout, and will work with GPIO hardware that does not
support open drain mode, but allows changing the direction of the SDA
and SCL lines on the fly.

Signed-off-by: Haavard Skinnemoen 
---
Sorry for the long delay. I didn't want to send out a patch just
before taking off for two weeks. This patch contains the following
changes compared to v2:

  o Make udelay and timeout parameterizable and fix comment
  o Default to a very low SCL frequency (6.6 kHz) if clock stretching
    isn't supported
  o Document that GPIO hardware must report actual state of sda and
    scl pins in open drain mode.
  o Add myself to MAINTAINERS
  o Add KERN_ERR to "probe failed" message
  o Use new gpio_direction_output() API

 MAINTAINERS                   |    5 +
 drivers/i2c/busses/Kconfig    |    8 ++
 drivers/i2c/busses/Makefile   |    1 +
 drivers/i2c/busses/i2c-gpio.c |  213 +++++++++++++++++++++++++++++++++++++++++
 include/linux/i2c-gpio.h      |   38 +++++++
 5 files changed, 265 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index ef84419..fdecda4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1429,6 +1429,11 @@ L: linux-scsi@vger.kernel.org
 W: http://www.icp-vortex.com/
 S: Supported
 
+GENERIC GPIO I2C DRIVER
+P: Haavard Skinnemoen
+M: hskinnemoen@atmel.com
+S: Supported
+
 GENERIC HDLC DRIVER, N2, C101, PCI200SYN and WANXL DRIVERS
 P: Krzysztof Halasa
 M: khc@pm.waw.pl
diff --git a/drivers/i2c/busses/Kconfig b/drivers/i2c/busses/Kconfig
index fb19dbb..52f79d1 100644
--- a/drivers/i2c/busses/Kconfig
+++ b/drivers/i2c/busses/Kconfig
@@ -102,6 +102,14 @@ config I2C_ELEKTOR
    This support is also available as a module.  If so, the module 
    will be called i2c-elektor.
 
+config I2C_GPIO
+ tristate "GPIO-based bitbanging i2c driver"
+ depends on I2C && GENERIC_GPIO
+ select I2C_ALGOBIT
+ help
+   This is a very simple bitbanging i2c driver utilizing the
+   arch-neutral GPIO API to control the SCL and SDA lines.
+
 config I2C_HYDRA
  tristate "CHRP Apple Hydra Mac I/O I2C interface"
  depends on I2C && PCI && PPC_CHRP && EXPERIMENTAL
diff --git a/drivers/i2c/busses/Makefile b/drivers/i2c/busses/Makefile
index 290b540..68f2b05 100644
--- a/drivers/i2c/busses/Makefile
+++ b/drivers/i2c/busses/Makefile
@@ -11,6 +11,7 @@ obj-$(CONFIG_I2C_AMD8111) += i2c-amd8111.o
 obj-$(CONFIG_I2C_AT91)  += i2c-at91.o
 obj-$(CONFIG_I2C_AU1550) += i2c-au1550.o
 obj-$(CONFIG_I2C_ELEKTOR) += i2c-elektor.o
+obj-$(CONFIG_I2C_GPIO)  += i2c-gpio.o
 obj-$(CONFIG_I2C_HYDRA)  += i2c-hydra.o
 obj-$(CONFIG_I2C_I801)  += i2c-i801.o
 obj-$(CONFIG_I2C_I810)  += i2c-i810.o
diff --git a/drivers/i2c/busses/i2c-gpio.c b/drivers/i2c/busses/i2c-gpio.c
new file mode 100644
index 0000000..895d150
--- /dev/null
+++ b/drivers/i2c/busses/i2c-gpio.c
@@ -0,0 +1,213 @@
+/*
+ * Bitbanging i2c bus driver using the GPIO API
+ *
+ * Copyright (C) 2007 Atmel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Toggle SDA by changing the direction of the pin */
+static void i2c_gpio_setsda_dir(void *data, int state)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ if (state)
+  gpio_direction_input(pdata->sda_pin);
+ else
+  gpio_direction_output(pdata->sda_pin, 0);
+}
+
+/*
+ * Toggle SDA by changing the output value of the pin. This is only
+ * valid for pins configured as open drain (i.e. setting the value
+ * high effectively turns off the output driver.)
+ */
+static void i2c_gpio_setsda_val(void *data, int state)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ gpio_set_value(pdata->sda_pin, state);
+}
+
+/* Toggle SCL by changing the direction of the pin. */
+static void i2c_gpio_setscl_dir(void *data, int state)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ if (state)
+  gpio_direction_input(pdata->scl_pin);
+ else
+  gpio_direction_output(pdata->scl_pin, 0);
+}
+
+/*
+ * Toggle SCL by changing the output value of the pin. This is used
+ * for pins that are configured as open drain and for output-only
+ * pins. The latter case will break the i2c protocol, but it will
+ * often work in practice.
+ */
+static void i2c_gpio_setscl_val(void *data, int state)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ gpio_set_value(pdata->scl_pin, state);
+}
+
+int i2c_gpio_getsda(void *data)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ return gpio_get_value(pdata->sda_pin);
+}
+
+int i2c_gpio_getscl(void *data)
+{
+ struct i2c_gpio_platform_data *pdata = data;
+
+ return gpio_get_value(pdata->scl_pin);
+}
+
+static int __init i2c_gpio_probe(struct platform_device *pdev)
+{
+ struct i2c_gpio_platform_data *pdata;
+ struct i2c_algo_bit_data *bit_data;
+ struct i2c_adapter *adap;
+ int ret;
+
+ pdata = pdev->dev.platform_data;
+ if (!pdata)
+  return -ENXIO;
+
+ ret = -ENOMEM;
+ adap = kzalloc(sizeof(struct i2c_adapter), GFP_KERNEL);
+ if (!adap)
+  goto err_alloc_adap;
+ bit_data = kzalloc(sizeof(struct i2c_algo_bit_data), GFP_KERNEL);
+ if (!bit_data)
+  goto err_alloc_bit_data;
+
+ ret = gpio_request(pdata->sda_pin, "sda");
+ if (ret)
+  goto err_request_sda;
+ ret = gpio_request(pdata->scl_pin, "scl");
+ if (ret)
+  goto err_request_scl;
+
+ if (pdata->sda_is_open_drain) {
+  gpio_direction_output(pdata->sda_pin, 1);
+  bit_data->setsda = i2c_gpio_setsda_val;
+ } else {
+  gpio_direction_input(pdata->sda_pin);
+  bit_data->setsda = i2c_gpio_setsda_dir;
+ }
+
+ if (pdata->scl_is_open_drain || pdata->scl_is_output_only) {
+  gpio_direction_output(pdata->scl_pin, 1);
+  bit_data->setscl = i2c_gpio_setscl_val;
+ } else {
+  gpio_direction_input(pdata->scl_pin);
+  bit_data->setscl = i2c_gpio_setscl_dir;
+ }
+
+ if (!pdata->scl_is_output_only)
+  bit_data->getscl = i2c_gpio_getscl;
+ bit_data->getsda = i2c_gpio_getsda;
+
+ if (pdata->udelay)
+  bit_data->udelay = pdata->udelay;
+ else
+  bit_data->udelay = 50;   /* 66 kHz */
+
+ if (pdata->timeout)
+  bit_data->timeout = pdata->timeout;
+ else
+  bit_data->timeout = HZ / 10;  /* 100 ms */
+
+ bit_data->data = pdata;
+
+ adap->owner = THIS_MODULE;
+ snprintf(adap->name, I2C_NAME_SIZE, "i2c-gpio%d", pdev->id);
+ adap->algo_data = bit_data;
+ adap->dev.parent = &pdev->dev;
+
+ ret = i2c_bit_add_bus(adap);
+ if (ret)
+  goto err_add_bus;
+
+ platform_set_drvdata(pdev, adap);
+
+ dev_info(&pdev->dev, "using pins %u (sda) and %u (scl%s)\n",
+   pdata->sda_pin, pdata->scl_pin,
+   pdata->scl_is_output_only
+   ? ", no clock stretching" : "");
+
+ return 0;
+
+err_add_bus:
+ gpio_free(pdata->scl_pin);
+err_request_scl:
+ gpio_free(pdata->sda_pin);
+err_request_sda:
+ kfree(bit_data);
+err_alloc_bit_data:
+ kfree(adap);
+err_alloc_adap:
+ return ret;
+}
+
+static int __exit i2c_gpio_remove(struct platform_device *pdev)
+{
+ struct i2c_gpio_platform_data *pdata;
+ struct i2c_adapter *adap;
+
+ adap = platform_get_drvdata(pdev);
+ pdata = pdev->dev.platform_data;
+
+ i2c_del_adapter(adap);
+ gpio_free(pdata->scl_pin);
+ gpio_free(pdata->sda_pin);
+ kfree(adap->algo_data);
+ kfree(adap);
+
+ return 0;
+}
+
+static struct platform_driver i2c_gpio_driver = {
+ .driver  = {
+  .name = "i2c-gpio",
+  .owner = THIS_MODULE,
+ },
+ .remove  = __exit_p(i2c_gpio_remove),
+};
+
+static int __init i2c_gpio_init(void)
+{
+ int ret;
+
+ ret = platform_driver_probe(&i2c_gpio_driver, i2c_gpio_probe);
+ if (ret)
+  printk(KERN_ERR "i2c-gpio: probe failed: %d\n", ret);
+
+ return ret;
+}
+module_init(i2c_gpio_init);
+
+static void __exit i2c_gpio_exit(void)
+{
+ platform_driver_unregister(&i2c_gpio_driver);
+}
+module_exit(i2c_gpio_exit);
+
+MODULE_AUTHOR("Haavard Skinnemoen ");
+MODULE_DESCRIPTION("Platform-independent bitbanging i2c driver");
+MODULE_LICENSE("GPL");
diff --git a/include/linux/i2c-gpio.h b/include/linux/i2c-gpio.h
new file mode 100644
index 0000000..7812407
--- /dev/null
+++ b/include/linux/i2c-gpio.h
@@ -0,0 +1,38 @@
+/*
+ * i2c-gpio interface to platform code
+ *
+ * Copyright (C) 2007 Atmel Corporation
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+#ifndef _LINUX_I2C_GPIO_H
+#define _LINUX_I2C_GPIO_H
+
+/**
+ * struct i2c_gpio_platform_data - Platform-dependent data for i2c-gpio
+ * @sda_pin: GPIO pin ID to use for SDA
+ * @scl_pin: GPIO pin ID to use for SCL
+ * @udelay: signal toggle delay. SCL frequency is (333 / udelay) kHz
+ * @timeout: clock stretching timeout in jiffies. If the slave keeps
+ * SCL low for longer than this, the transfer will time out.
+ * @sda_is_open_drain: SDA is configured as open drain, i.e. the pin
+ * isn't actively driven high when setting the output value high.
+ * gpio_get_value() must return the actual pin state even if the
+ * pin is configured as an output.
+ * @scl_is_open_drain: SCL is set up as open drain. Same requirements
+ * as for sda_is_open_drain apply.
+ * @scl_is_output_only: SCL output drivers cannot be turned off.
+ */
+struct i2c_gpio_platform_data {
+ unsigned int sda_pin;
+ unsigned int scl_pin;
+ int  udelay;
+ int  timeout;
+ unsigned int sda_is_open_drain:1;
+ unsigned int scl_is_open_drain:1;
+ unsigned int scl_is_output_only:1;
+};
+
+#endif /* _LINUX_I2C_GPIO_H */
-- 
1.4.4.4
http://thread.gmane.org/gmane.linux.kernel/1862094


http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=b1917578fd5d8efa67afa05a0d6d7e323f2802da

https://patchwork.ozlabs.org/patch/419034/
https://lkml.org/lkml/2014/11/12/249



This patch device tree binding documentation for rt5033 multifunction device.

Cc: Rob Herring 
Cc: Pawel Moll 
Cc: Mark Rutland 
Cc: Ian campbell 
Cc: Kumar Gala 
Signed-off-by: Beomho Seo 
Acked-by: Chanwoo Choi 
---
 Documentation/devicetree/bindings/mfd/rt5033.txt   |  115 ++++++++++++++++++++
 .../devicetree/bindings/vendor-prefixes.txt        |    1 +
 2 files changed, 116 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/mfd/rt5033.txt
diff --git a/Documentation/devicetree/bindings/mfd/rt5033.txt b/Documentation/devicetree/bindings/mfd/rt5033.txt
new file mode 100644
index 0000000..8b55a79
--- /dev/null
+++ b/Documentation/devicetree/bindings/mfd/rt5033.txt
@@ -0,0 +1,115 @@
+Richtek RT5033 Power management Integrated Circuit
+
+RT5033 is a Multifunction device which includes battery charger, fuel gauge,
+flash LED current source, LDO and synchronous Buck converter for portable
+applications. It is interfaced to host controller using i2c interface.
+
+Required properties:
+- compatible : Must be "richtek,rt5033"
+- reg : Specifies the i2c slave address of general part.
+- interrupts : This i2c devices has an IRQ line connected to the main SoC.
+- interrupt-parent : The parent interrupt controller.
+
+Optional node:
+Regulators: The regulators of RT5033 have to be instantiated under sub-node
+named "regulators" usinge the following format.
+
+Required properties:
+- compatible = Must be "richtek,rt5033-regulator"
+
+ regulators {
+  compatible = "richtek,rt5033-regulator";
+
+  regulator-name {
+   regulator-name = LDO/BUCK
+   standard regulator constraints...
+  };
+ };
+ refer Documentation/devicetree/bindings/regulator/regulator.txt
+
+
+Battery charger: There battery charger of RT5033 have to be instantiated under
+sub-node named "charger" using the following format.
+
+Required properties:
+- compatible : Must be "richtek,rt5033-charger".
+- richtek,pre-uamp : Current of pre-charge mode. The pre-charge current levels
+  are 350 mA to 650 mA programmed by I2C per 100 mA.
+- richtek,pre-threshold-uvolt : Voltage of threshold pre-charge mode. Battery
+  voltage is below pre-charge threshold voltage, the charger is in pre-charge
+  mode with pre-charge current. Its levels are 2.3 V  to 3.8 V programmed
+  by I2C per 0.1 V.
+- richtek,fast-uamp : Current of fast-charge mode. The fast-charge current
+  levels are 700 mA to 2000 mA programmed by I2C per 100 mA.
+- richtek,const-uvolt :  Battery regulation voltage of constant voltage mode.
+  This voltage level 3.65 V to 4.4 V bye I2C per 0.025 V.
+- richtek,eoc-uamp : This property is end of charge current. Its level 150 mA
+  to 200 mA.
+
+ charger {
+  compatible = "richtek,rt5033-charger";
+  richtek,pre-uamp = <350000>;
+  richtek,pre-threshold-uvolt = <3400000>;
+  richtek,fast-uamp = <2000000>;
+  richtek,const-uvolt = <4350000>;
+  richtek,eoc-uamp = <250000>;
+ };
+
+
+Fuelgauge: There fuelgauge of RT5033 to be instantiated node named "fuelgauge"
+using the following format.
+
+Required properties:
+- compatible = Must be "richtek,rt5033-battery".
+
+ i2c_fuel: i2c@1 {
+  compatible = "i2c-gpio";
+  standard i2c-gpio constraints...
+  fuelgauge {
+   compatible = "richtek,rt5033-battery".
+  };
+ };
+
+
+Example:
+
+  rt5033@34 {
+   compatible = "richtek,rt5033";
+   reg = <0x34>;
+   interrupt-parent = <&gpx1>;
+   interrupts = <5 0="">;
+
+   regulators {
+    compatible = "richtek,rt5033-regulator";
+
+    buck_reg: BUCK {
+     regulator-name = "BUCK";
+     regulator-min-microvolt = <1200000>;
+     regulator-max-microvolt = <1200000>;
+     regulator-always-on;
+    };
+   };
+
+   charger {
+    compatible = "richtek,rt5033-charger";
+    richtek,pre-uamp = <350000>;
+    richtek,pre-threshold-uvolt = <3400000>;
+    richtek,fast-uamp = <2000000>;
+    richtek,const-uvolt = <4350000>;
+    richtek,eoc-uamp = <250000>;
+   };
+
+  };
+
+  i2c_fuel: i2c@10 {
+   compatible = "i2c-gpio";
+   gpios = <&gpm3 1 0
+    &gpm3 0 0>;
+
+   fuel: rt5033-battery@35 {
+    compatible = "richtek,rt5033-battery";
+    interrupt-parent = <&gpx2>;
+    interrupts = <3 0="">;
+    reg = <0x35>;
+   };
+  };
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 723999d..611b543 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -124,6 +124,7 @@ ralink Mediatek/Ralink Technology Corp.
 ramtron Ramtron International
 realtek Realtek Semiconductor Corp.
 renesas Renesas Electronics Corporation
+richtek Richtek Technology Corporation
 ricoh Ricoh Co. Ltd.
 rockchip Fuzhou Rockchip Electronics Co., Ltd
 samsung Samsung Semiconductor

Thursday, October 8, 2015

What are LDOs and how are they used?

http://www.analog.com/library/analogDialogue/archives/41-05/ldo.html

Ask The Applications Engineer—37
Low-Dropout Regulators
By Jerome Patoux 
This article introduces the basic topologies and suggests good practical usage for ensuring stable operation of low-dropout voltage regulators (LDOs). We will also discuss design characteristics of Analog Devices families of LDOs, which offer a flexible approach to maintaining dynamic- and dc stability.
Q: What are LDOs and how are they used?
A: Voltage regulators are used to provide a stable power supply voltage independent of load impedance, input-voltage variations, temperature, and time. Low-dropout regulators are distinguished by their ability to maintain regulation with small differences between supply voltage and load voltage. For example, as a lithium-ion battery drops from 4.2 V (fully charged) to 2.7 V (almost discharged), an LDO can maintain a constant 2.5 V at the load.
The increasing number of portable applications has thus led designers to consider LDOs to maintain the required system voltage independently of the state of battery charge. But portable systems are not the only kind of application that might benefit from LDOs. Any equipment that needs constant and stable voltage, while minimizing the upstream supply (or working with wide fluctuations in upstream supply), is a candidate for LDOs. Typical examples include circuitry with digital and RF loads.
A “linear” series voltage regulator (Figure 1) typically consists of a reference voltage, a means of scaling the output voltage and comparing it to the reference, a feedback amplifier, and a series pass transistor (bipolar or FET), whose voltage drop is controlled by the amplifier to maintain the output at the required value. If, for example, the load current decreases, causing the output to rise incrementally, the error voltage will increase, the amplifier output will rise, the voltage across the pass transistor will increase, and the output will return to its original value.
Figure 1. Basic enhancement-mode PMOS LDO.
In Figure 1, the error amplifier and PMOS transistor form a voltage-controlled current source. The output voltage, VOUT, is scaled down by the voltage divider (R1R2) and compared to the reference voltage (VREF). The error amplifier's output controls an enhancement-mode PMOS transistor.
The dropout voltage is the difference between the output voltage and the input voltage at which the circuit quits regulation with further reductions in input voltage. It is usually considered to be reached when the output voltage has dropped to 100 mV below the nominal value. This key factor, which characterizes the regulator, depends on load current and junction temperature of the pass transistor.
Q: How are regulators distinguished by dropout voltage?
A: We can suggest three classes: standard regulators, quasi-LDOs, and low-dropout regulators (LDOs).
Standard regulators, which typically employ NPN pass transistors, usually drop out at about 2 V.
Quasi-LDO regulators usually use a Darlington structure (Figure 2) to implement a pass device made up of an NPN transistor and a PNP. The dropout voltage, VSAT (PNP) + VBE (NPN), is typically about 1 V—more than an LDO but less than a standard regulator.
Figure 2. Quasi-LDO circuit.
LDO regulators are usually the optimal choice based on dropout voltage, typically 100 mV to 200 mV. The disadvantage, however, is that the ground-pin current of a LDO is usually higher than that of a quasi-LDO or a standard regulator.
Standard regulators have a higher dropout voltage and dissipation, and lower efficiency, than the other types. They can be replaced by LDO regulators much of the time, but the maximum input voltage specification—which can be lower than that for standard regulators—should be considered. In addition, some LDOs will need specially chosen external capacitors to maintain stability. The three types differ somewhat in both bandwidth and dynamic stability considerations.
Q: How can I select the best regulator for my application?
A: To choose the right regulator for a specific application, the type and range of input voltage (e.g., the output voltage of the dc-to-dc converter or switching power supply ahead of the regulator), needs to be considered. Also important are: the required output voltage, maximum load current, minimum dropout voltage, quiescent current, and power dissipation. Often, additional features may be useful, such as a shutdown pin or an error flag to indicate loss of regulation.
The source of the input voltage needs to be considered in order to choose a suitable category of LDO. In battery-powered applications, LDOs must maintain the required system voltage as the battery discharges. If the dc input voltage is provided from a rectified ac source, the dropout voltage may not be critical, so a standard regulator—which may be cheaper and can provide more load current—could be a better choice. But an LDO could be the right choice if lower power dissipation or a more precise output voltage is necessary.
The regulator should, of course, be able to provide enough current to the load with specified accuracy under worst-case conditions.
LDO TopologiesIn Figure 1, the pass device is a PMOS transistor. However, a variety of pass devices are available, and LDOs can be classified depending on which type of pass device is used. Their differing structures and characteristics offer various advantages and drawbacks.
Examples of four types of pass devices are shown in Figure 3, including NPN and PNP bipolar transistors, Darlington circuits, and PMOS transistors.
Figure 3. Examples of pass devices.
For a given supply voltage, the bipolar pass devices can deliver the highest output current. A PNP is preferred to an NPN, because the base of the PNP can be pulled to ground, fully saturating the transistor if necessary. The base of the NPN can only be pulled as high as the supply voltage, limiting the minimum voltage drop to one VBE. Therefore, NPN and Darlington pass devices can’t provide dropout voltages below 1 V. They can be valuable, however, where wide bandwidth and immunity to capacitive loading are necessary (thanks to their characteristically low ZOUT).
PMOS and PNP transistors can be effectively saturated, minimizing the voltage loss and the power dissipated by the pass device, thus allowing low dropout, high-efficiency voltage regulators. PMOS pass devices can provide the lowest possible dropout voltage drop, approximately RDS(ON) × IL. They also allow the quiescent current flow to be minimized. The main drawback is that the MOS transistor is often an external component—especially for controlling high currents—thus making the IC acontroller, rather than a complete self-contained regulator.
The power loss in a complete regulator is
PD = (VIN – VOUT ) IL VIN IGND
The first part of this relationship is the dissipation of the pass device; the second part is the power consumption of the controller portion of the circuit. The ground current in some regulators, especially those using saturable bipolar transistors as pass devices, can peak during power-up.
Q: How can LDO dynamic stability be ensured?
A: Classical LDO circuit designs for general-purpose applications have problems with stability. The difficulties stem from the nature of their feedback circuits, the wide range of possible loads, the variability of elements within the loop, and the difficulty of obtaining precision compensation devices with consistent parameters. These considerations will be discussed below, followed by a description of the anyCAP® circuit topology, which has improved stability.
LDOs generally use a feedback loop to provide a constant voltage, independent of load, at the output. As is true for any high-gain feedback loop, the location of the poles and zeros in the loop-gain transfer function will determine the stability.
NPN-based regulators, with their low-impedance emitter-loaded output, tend to be relatively insensitive to output capacitive loading. PNP and PMOS regulators, however, have higher output impedance (collector loaded in the case of the PNP). In addition, the loop’s gain and phase characteristics strongly depend on the load impedance, thus requiring special consideration for stability.
The transfer function of PNP- and PMOS-based LDOs has several poles that impact stability:
  • The dominant pole (P0 in Figure 4) is set by the error amplifier; it is controlled and fixed, in conjunction with the gm of the amplifier, through an internal compensation capacitance CCOMP. This pole is common to all of the LDO topologies described above.
  • The second pole (P1) is set by the output elements (the combination of the output capacitance and the load capacitance and resistance). This makes the application problem more difficult to handle, as these elements affect both the loop gain and bandwidth.
  • A third pole (P2) is due to parasitic capacitance around the pass elements. PNP power transistors have a unity-gain frequency (fT) much lower than that of comparable NPN transistors, under the same conditions.
Figure 4. LDO frequency amplitude response.
As Figure 4 shows, each pole contributes 20 dB/decade of roll-off in gain, with up to 90° of phase shift. As the LDOs discussed here have multiple poles, the linear regulator will be unstable if the phase shift at the unity-gain frequency approaches –180°. Figure 4 also shows the effect of loading the regulator with a capacitor, whose effective series resistance(ESR) will add a zero (ZESR) into the transfer function. This zero will help to compensate for one of the poles and can help to stabilize the loop if it occurs below the unity-gain frequency and keeps the phase shift well below –180° at that frequency.
ESR can be critical for stability, especially for LDOs with vertical-PNP pass devices. As a parasitic property of a capacitor, however, the ESR is not always well-controlled. A circuit may require the ESR to fall within a certain window to ensure that the LDO operates in the stable region for all output currents (Figure 5).
Figure 5. Stability as a function of output current and load-capacitor ESR.
Even in principle, choosing the right capacitor with the right ESR (high enough to reduce the slope before the frequency response crosses through 0 dB, yet low enough to bring the gain below 0 dB before the associated pole, P2) can be challenging. Yet the practical considerations add further challenges: ESR varies, depending on the brand; and the minimum capacitance value to use in production will require bench tests, including extreme cases with minimum ambient temperature and maximum load. The choice of the type of capacitor is also important. Perhaps the most suitable are tantalum capacitors, despite their large size in the higher-capacitance ranges. Aluminum electrolytics are compact, but their ESR tends to deteriorate at low temperatures, and they don't work well below –30°C. Multilayer ceramic types do not have sufficient capacitance for conventional LDOs (but they are suitable for anyCAP designs, read on).
Analog Devices anyCAP family of LDOsLDO implementation is considerably easier now, thanks to improvements in both dc and ac performance associated with regulators employing the Analog Devices anyCAP LDO architecture. As the term implies, regulators embodying it are relatively insensitive to both the size of the capacitor and its ESR, thus allowing for a wider possible range of output capacitance. The approach has spread and is now more widely available in the marketplace, but it may be helpful to understand how this architecture (Figure 6) simplifies the stability issue.
Figure 6. Simplified schematic of anyCAP LDO.
The anyCAP family of LDOs, including the 100-mA ADP3307 and the 200-mA low-quiescent-current ADP3331, can remain stable with output capacitance as low as 0.47 µF, using good-quality capacitors of any type, including compact multilayer ceramic. ESR is essentially a nonissue.
The simplified schematic of Figure 6 shows how a single loop provides both regulation and reference functions. The output is sensed by the external R1-R2 voltage divider, and fed back to the input of a high-gain amplifier through diode D1 and the R3-R4 divider. At equilibrium, the amplifier produces a large, repeatable, well-controlled offset voltage that is proportional to absolute temperature (PTAT). This voltage combines with the complementary temperature-sensitive diode voltage drop to form the implicit reference, a temperature-independent virtual band-gap voltage.
The amplifier output connects to an unusual noninverting driver that controls the pass transistor, allowing the frequency compensation to include the load capacitor in a pole-splitting arrangement based on Miller compensation. This provides reduced sensitivity to value, type, and ESR of the load capacitor. Additional advantages of the pole-splitting scheme include superior line-noise rejection and very high regulator gain, thereby providing exceptional accuracy and excellent line and load regulation.
Q: Would you discuss the Analog Devices families of LDOs?
A: The choice of LDO depends, of course, on the supply voltage range, load voltage, and required maximum dropout voltage. The main differences between devices focus on power consumption, efficiency, price, ease of use, and the various specifications and packages available.
The popular ADP33xx anyCAP family of ADI LDOs has been on the market for several years. Based on a BiCMOS process and a PNP pass transistor, it allows good regulation and many of the advantages mentioned above, but tends to be somewhat more expensive than CMOS parts.
Some recent designs, such as the ADP17xx family, are entirely CMOS-based, with a PMOS pass transistor, which allows the fabrication of LDOs at lower cost, but with a trade-off on line-regulation performance. Devices in this family can handle a large range of output capacitance, but they still require at least 1 µF and 500-mohm ESR. For example, the 150-mA ADP1710 andADP1711 are optimized for stable operation with small 1-µF ceramic output capacitors, allowing for good transient performance while occupying minimal board space, and the 300-mA ADP1712ADP1713, and ADP1714 can use 2.2-µF capacitors.
Both of these families have 16 fixed-output-voltage options, from 0.75 V to 3.3 V, as well as an adjustable-output option in the 0.8-V to 5-V range. Accuracy is to within ±2% over line, load, and temperature. The ADP1711 and ADP1713 fixed-voltage versions allow for a reference-bypass capacitor to be connected; this reduces output voltage noise and improves power-supply rejection. The ADP1714 includes a tracking feature, which allows the output to follow an external voltage rail or reference. Dropout voltages at rated load are 150 mV for the ADP1710 and ADP1711; and 170 mV for the ADP1712, ADP1713, and ADP1714. Power-supply rejection (PSR) is high (69 dB and 72 dB at 1 kHz), and power consumption is low, with ground current of 40 µA and 75 µA with 100-µA load.
Typical transient responses of the ADP1710 and ADP1711 are compared in Figure 7 for a nearly full-load step, with 1-µF and 22-µF input- and output capacitors.
Figure 7. Transient response of ADP1710/ADP1711.
The operating junction temperature range is –40°C to +125°C. Both families are available in tiny 5-lead TSOT packages, a small-footprint solution to the variety of power needs.




Sunday, October 4, 2015

Just discovered Dont know wat dis

http://www.crazypirate.me/futex/chapter4.html
http://blog.nativeflow.com/the-futex-vulnerability


https://events.linuxfoundation.org/sites/events/files/slides/linuxcon-2014-locking-final.pdf

Revisited :atomic

 

Atomic operations
http://simpleopencl.blogspot.in/2013/05/atomic-operations-and-floats-in-opencl.html
Several different atomic operations are supported,
almost all only for integers:
addition (integers and 32-bit floats)
minimum / maximum
increment / decrement
exchange / compare-and-swap
bitwise AND OR XOR
These are
quite fast for data in local memory
slower for data in global memory
(better on new Kepler hardware)
Lecture 3 – p. 17
Atomic operations
Compare-and-swap:
int atomic_cmpxchg(volatile __local int
*p,
int cmp, int val);
if compare equals old value stored at address then
val is stored instead
in either case, routine returns the value of old
seems a bizarre routine at first sight, but can be very
useful for atomic locks
also can be used to implement 64-bit floating point
atomic addition
Lecture 3 – p. 18
Global atomic lock
// global variable: 0 unlocked, 1 locked
__global volatile int lock=0;
__kernel void kernel(...) {
...
if (get_local_id(0)==0) {
// set lock
do {} while(atomic_cmpxchg(&lock, 0, 1));
}
...
// free lock
lock = 0;
} Lecture 3 – p. 19Global atomic lock
Problem: when a work-item writes data to global memory
the order of completion is not guaranteed, so global writes
may not have completed by the time the lock is unlocked
__kernel void kernel(...) {
...
if (get_local_id(0)==0) {
do {} while(atomic_cmpxchg(&lock,0,1));
...
mem_fence(CLK_GLOBAL_MEM_FENCE); // order writes
// free lock
lock = 0;
}
} Lecture 3 – p. 20mem_fence
mem fence();
order all preceding global or local (or both) reads and
writes
means all loads/stores committed to memory before
any following loads/stores
mem fence write();
same as above, but only for stores
mem fence read();
same as above, but only for loads
Different to barrier() – non-blocking
Lecture 3 – p. 21Summary
lots of esoteric capabilities – don’t worry about most of
them
essential to understand work-item divergence – can
have a very big impact on performance
barrier() is vital – will see another use of it in next
lecture
the rest can be ignored until you have a critical need
– then read the documentation carefully and look for
examples in the SDK
http://blog.csdn.net/angle_birds/article/details/7891174
所谓原子操作,就是该操作绝不会在执行完毕前被任何其他任务或事件打断,也就说,它的最小的执行单位,不可能有比它更小的执行单位。因此这里的原子实际是使用了物理学里的物质微粒的概念。
原子操作需要硬件的支持,因此是架构相关的,其API和原子类型的定义都定义在内核源码树的include/asm/atomic.h文件中,它们都使用汇编语言实现,因为C语言并不能实现这样的操作。
原子操作主要用于实现资源计数,很多引用计数(refcnt)就是通过原子操作实现的。原子类型定义如下:
typedef struct 

     volatile int counter; 

atomic_t;
volatile修饰字段告诉gcc不要对该类型的数据做优化处理,对它的访问都是对内存的访问,而不是对寄存器的访问。

Linux中的基本原子操作
宏或者函数
说明
Atomic_read
返回原子变量的值
Atomic_set
设置原子变量的值。
Atomic_add
原子的递增计数的值。
Atomic_sub
原子的递减计数的值。
atomic_cmpxchg
原子比较并交换计数值。
atomic_clear_mask
原子的清除掩码。

除此以外,还有一组操作64位原子变量的变体,以及一些位操作宏及函数。这里不再罗列。
/**
 返回原子变量的值。
 这里强制将counter转换为volatile int并取其值。目的就是为了避免编译优化。
 */
#define atomic_read(v)   (*(volatile int *)&(v)->counter)
/**
 设置原子变量的值。
 */
#define atomic_set(v,i)    (((v)->counter) = (i))

原子递增的实现比较精妙,理解它的关键是需要明白ldrexstrex这一对指令的含义。
/**
 原子的递增计数的值。
 */
static inline void atomic_add(int i, atomic_t *v)
{
         unsigned long tmp;
         int result;

         /**
          * __volatile__是为了防止编译器乱序。与"#define atomic_read(v)          (*(volatile int *)&(v)->counter)"中的volatile类似。
          */
         __asm__ __volatile__("@ atomic_add\n"
         /**
          * ldrexarm为了支持多核引入的新指令,表示"排它性"加载。与mipsll指令一样的效果。
          它与"排它性"存储配对使用。
          */
"1:    ldrex         %0, [%3]\n"
         /**
          原子变量的值已经加载到寄存器中,这里对寄存器中的值减去指定的值。
          */
"       add  %0, %0, %4\n"
         /**
          * strex"排它性"的存储寄存器的值到内存中。类似于mipssc指令。
          */
"       strex         %1, %0, [%3]\n"
         /**
          关键代码是这里的判断。如果在ldrexstrex之间,其他核没有对原子变量变量进行加载存储操作,
          那么寄存器中值就是0,否则非0.
          */
"       teq   %1, #0\n"
         /**
          如果其他核与本核冲突,那么寄存器值为非0,这里跳转到标号1处,重新加载内存的值并递增其值。
          */
"       bne  1b"
         : "=&r" (result), "=&r" (tmp), "+Qo" (v->counter)
         : "r" (&v->counter), "Ir" (i)
         : "cc");
}

atomic_add_return递增原子变量的值,并返回它的新值。它与atomic_add的最大不同,在于在原子递增前后各增加了一句:smp_mb();
这是由linux原子操作函数的语义规定的:所有对原子变量的操作,如果需要向调用者返回结果,那么就需要增加多核内存屏障的语义。通俗的说,就是其他核看到本核对原子变量的操作结果时,本核在原子变量前的操作对其他核也是可见的。

理解了atomic_add,其他原子变量的实现也就容易理解了。这里不再详述。

atomic_cmpxchg()函数实现了一个比较+交换的原子操作(原子就是说cpu要不就不
做,要做就一定要做完某些操作才能干别的事情,对应这里就是比较和交换要一次过做完).
atomic_cmpxchg()比较kgdb_active->count的值是否等用-1,如果是则把cpu的值赋
给kgdb_active->count,否则不修改它的值,atomic_cmpxchg返回
kgdb_active->count赋值前的值.
kgdb_active是一个全局原子变量,定义在kernel/kgdb.c中,用来记录当前正在执行
kgdb代码的cpu号,它起到一个锁的作用,因为同一时间只能有一个cpu执行kgdb的代
码,这是可以想象得到的,如果两个cpu在两个不同断点被触发,那究竟是谁和远端gdb通
信呢?前一条命令被 cpu1拿了,后一条却去了cpu2那里,那还得了。
kgdb_active的初始值为-1,-1表示当前kgdb的处理函数并没有被触发,相反如果
kgdb已经在运行,那么kgdb_active就有它自己的值,这些处理都是针对多cpu的,如
果只有一个cpu,这个世界就简单多了。这里是防止多个kgdb的实例在不同cpu被触发
引起互相干扰。考虑这种情况,在cpu1上有一个断点让kgdb起来,这时,kgdb_active
还是-1,cpu1很顺利就给kgdb_active赋值然后进入后面的操作.这时cpu2中kgdb也被触发.
它也想进入后面的操作,但是这时候kgdb_active已经不再是-1,cpu2只能不断地比较
kgdb_active的值和执行cpu_relax(),宏cpu_relax()可以简化为一条pause汇编,通过引
入一个很短的延迟,加快了紧跟在锁后面的代码的执行并减少能源的消耗,实际上就是
让cpu2等。当cpu1在退出kgdb_handle_exception()前会把 kgdb_active赋回-1, 这样
cpu2就可以进行后面的操作了。kgdb使用大量的原子操作来完成锁的功能,后面还会
看到. atomic操作加上cpu_relax()跟一个自旋锁很相似。