What is the booting process for ARM?

After Power is ON The cpu will start executing exception mode 1st one is reset ,As Reset must run as superviser mode since CPU doesn't know the status of the register at this time of execution it cant go into the superviser mode .To achieve this small code need to be written (See at end). after this other exceptions can be handled by loading the address into PC .

.globl _start
 _start: b       reset
    ldr     pc, _undefined_instruction
    ldr     pc, _software_interrupt
    ldr     pc, _prefetch_abort
    ldr     pc, _data_abort
    ldr     pc, _not_used
    ldr     pc, _irq
    ldr     pc, _fiq

reset:
    mrs     r0,cpsr                 /* set the cpu to SVC32 mode        */
    bic     r0,r0,#0x1f             /* (superviser mode, M=10011)       */
    orr     r0,r0,#0x13
    msr     cpsr,r0

Currently, there are two exception models in the ARM architecture (reset is considered a kind of exception):

The classic model, used in pre-Cortex chip and current Cortex-A/R chips. In it, the memory at 0 contains several exception handlers:

 Offset  Handler
 ===============
 00      Reset 
 04      Undefined Instruction
 08      Supervisor Call (SVC)
 0C      Prefetch Abort
 10      Data Abort
 14      (Reserved)
 18      Interrupt (IRQ)
 1C      Fast Interrupt (FIQ)

When the exception happens, the processor just starts execution from a specific offset, so usually this table contains single-instruction branches to the complete handlers further in the code. A typical classic vector table looks like following:

00000000   LDR   PC, =Reset
00000004   LDR   PC, =Undef
00000008   LDR   PC, =SVC
0000000C   LDR   PC, =PrefAbort
00000010   LDR   PC, =DataAbort
00000014   NOP
00000018   LDR   PC, =IRQ
0000001C   LDR   PC, =FIQ

At runtime, the vector table can be relocated to 0xFFFF0000, which is often implemented as a tightly-coupled memory range for the fastest exception handling. However, the power-on reset usually begins at 0x00000000 (but in some chips can be set to 0xFFFF0000 by a processor pin).

The new microcontroller model is used in the Cortex-M line of chips. There, the vector table at 0 is actually a table of vectors (pointers), not instructions. The first entry contains the start-up value for the SP register, the second is the reset vector. This allows writing the reset handler directly in C, since the processor sets up the stack. Again, the table can be relocated at runtime. The typical vector table for Cortex-M begins like this:

__Vectors       DCD     __initial_sp              ; Top of Stack
                DCD     Reset_Handler             ; Reset Handler
                DCD     NMI_Handler               ; NMI Handler
                DCD     HardFault_Handler         ; Hard Fault Handler
                DCD     MemManage_Handler         ; MPU Fault Handler
                DCD     BusFault_Handler          ; Bus Fault Handler
                DCD     UsageFault_Handler        ; Usage Fault Handler
                [...more vectors...]

Note that in the modern complex chips such as OMAP3 or Apple's A4 the first piece of code which is executed is usually not user code but the on-chip Boot ROM. It might check various conditions to determine where to load the user code from and whether to load it at all (e.g. it could require a valid digital signature). In such cases, the user code might have to conform to different start-up conventions.