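On Linux systems running on a multi-core ARM chip, each CPU identifies its own ID in the Bootrom code. If the ID is 0, the CPU goes on to boot the bootloader and the Linux kernel. If the ID is not 0, the Bootrom typically puts the CPU into the WFI or WFE state at power-on, where it waits for CPU0 to wake it with an inter-processor interrupt or an event (usually issued with the SEV instruction). Figure 20.6 shows a typical multi-core Linux boot flow.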
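A CPUn that has been woken by CPU0 can be hot-plugged while the system is running. For example, the following command takes CPU1 offline and migrates all tasks on CPU1 to the other CPUs: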
# echo 0 > /sys/devices/system/cpu/cpu1/online
Likewise, the following command brings CPU1 back online:
# echo 1 > /sys/devices/system/cpu/cpu1/online
After that, CPU1 again takes an active part in balancing the load of runnable tasks across the CPUs in the system.
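The same sysfs file can be driven from a program. The following is a minimal user-space sketch, assuming the standard /sys/devices/system/cpu/cpuN/online layout used by the commands above; the helper names cpu_online_path() and set_cpu_online() are made up for this illustration and are not kernel or libc APIs.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build the sysfs path that controls hotplug for one CPU.
 * Returns 0 on success, -1 if the buffer is too small. */
int cpu_online_path(unsigned int cpu, char *buf, size_t len)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, len, "/sys/devices/system/cpu/cpu%u/online", cpu);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Write "0" (offline) or "1" (online) to the control file.
 * Needs root on a real system. Returns 0 on success, -1 on failure. */
int set_cpu_online(unsigned int cpu, int online)
{
    char path[64];
    FILE *f;
    int rc;

    if (cpu_online_path(cpu, path, sizeof(path)) != 0)
        return -1;
    f = fopen(path, "w");
    if (f == NULL)
        return -1;
    rc = (fputs(online ? "1" : "0", f) == EOF) ? -1 : 0;
    /* The buffered write reaches the kernel when the stream is closed. */
    if (fclose(f) == EOF)
        rc = -1;
    return rc;
}
```

Calling set_cpu_online(1, 0) followed by set_cpu_online(1, 1) corresponds to the two echo commands. CPU0 often has no online file, because many platforms do not allow the boot CPU to be unplugged.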
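The kernel encapsulates the actions CPU0 takes to wake the other CPUs in a structure named smp_operations. For ARM, it is defined in arch/arm/include/asm/smp.h. The listing below shows the structure's member functions.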
struct smp_operations {
#ifdef CONFIG_SMP
    /*
     * Setup the set of possible CPUs (via set_cpu_possible)
     */
    void (*smp_init_cpus)(void);
    /*
     * Initialize cpu_possible map, and enable coherency
     */
    void (*smp_prepare_cpus)(unsigned int max_cpus);
    /*
     * Perform platform specific initialisation of the specified CPU.
     */
    void (*smp_secondary_init)(unsigned int cpu);
    /*
     * Boot a secondary CPU, and assign it the specified idle task.
     * This also gives us the initial stack to use for this CPU.
     */
    int (*smp_boot_secondary)(unsigned int cpu, struct task_struct *idle);
#ifdef CONFIG_HOTPLUG_CPU
    int (*cpu_kill)(unsigned int cpu);
    void (*cpu_die)(unsigned int cpu);
    int (*cpu_disable)(unsigned int cpu);
#endif
#endif
};
Each board hooks its own smp_operations instance into its machine descriptor through the .smp field. The Versatile Express descriptor, for example, looks like this:
DT_MACHINE_START(VEXPRESS_DT, "ARM-Versatile Express")
    .dt_compat = v2m_dt_match,
    .smp = smp_ops(vexpress_smp_ops),
    .map_io = v2m_dt_map_io,
MACHINE_END
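The pattern behind smp_operations is an ordinary table of function pointers: generic code calls the hooks, and each board fills them in. The following user-space sketch illustrates that idea only. The demo_ names, the trimmed-down structure, and the bring-up loop are hypothetical and do not reproduce the kernel's actual call sequence.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for the kernel's smp_operations. */
struct demo_smp_ops {
    void (*prepare_cpus)(unsigned int max_cpus);
    int (*boot_secondary)(unsigned int cpu);
};

/* State recorded by the demo board hooks so the effect can be observed. */
unsigned int demo_prepared;
unsigned int demo_booted_mask;

void demo_prepare_cpus(unsigned int max_cpus)
{
    demo_prepared = max_cpus;
}

int demo_boot_secondary(unsigned int cpu)
{
    demo_booted_mask |= 1u << cpu;   /* assumes cpu < 32 in this demo */
    return 0;
}

/* The "board" supplies its hooks, much like .smp = smp_ops(...). */
const struct demo_smp_ops demo_board_ops = {
    .prepare_cpus = demo_prepare_cpus,
    .boot_secondary = demo_boot_secondary,
};

/* Generic code: knows nothing about the board, only calls the hooks.
 * Returns the number of secondary CPUs that reported success. */
int demo_bring_up(const struct demo_smp_ops *ops, unsigned int ncpus)
{
    unsigned int cpu;
    int booted = 0;

    if (ops == NULL || ops->boot_secondary == NULL)
        return 0;                    /* no SMP support registered */
    if (ops->prepare_cpus != NULL)
        ops->prepare_cpus(ncpus);
    for (cpu = 1; cpu < ncpus; cpu++)    /* CPU0 is already running */
        if (ops->boot_secondary(cpu) == 0)
            booted++;
    return booted;
}
```

With four CPUs, demo_bring_up(&demo_board_ops, 4) calls the prepare hook once and the boot hook for CPUs 1 to 3. Supporting another board means supplying another table, while the generic loop stays unchanged.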
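The implementation in arch/arm/mach-vexpress/platsmp.c shows that the smp_init_cpus() member of smp_operations, namely vexpress_smp_init_cpus(), calls ct_ca9x4_init_cpu_map(), which probes the number of CPU cores in the SoC and makes them visible to the kernel through set_cpu_possible().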
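The smp_prepare_cpus() member of smp_operations, namely vexpress_smp_prepare_cpus(), sets the start address of the other CPUs to versatile_secondary_startup by calling v2m_flags_set(virt_to_phys(versatile_secondary_startup)), as the listing shows. (The listing spells this helper vexpress_flags_set().)
Setting the start address of CPU1...n in smp_prepare_cpus():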
static void __init vexpress_smp_prepare_cpus(unsigned int max_cpus)
{
    /*
     * Initialise the present map, which describes the set of CPUs
     * actually populated at the present time.
     */
    if (ct_desc)
        ct_desc->smp_enable(max_cpus);
    else
        vexpress_dt_smp_prepare_cpus(max_cpus);
    /*
     * Write the address of secondary startup into the
     * system-wide flags register. The boot monitor waits
     * until it receives a soft interrupt, and then the
     * secondary CPU branches to this address.
     */
    vexpress_flags_set(virt_to_phys(versatile_secondary_startup));
}
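Note that this part of the implementation is SoC-specific: it is determined by the chip design and by the Bootrom inside the chip. For VEXPRESS, the address is set as follows: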
void __init v2m_flags_set(u32 data)
{
    writel(~0, v2m_sysreg_base + V2M_SYS_FLAGSCLR);
    writel(data, v2m_sysreg_base + V2M_SYS_FLAGSSET);
}
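That is, it writes 0xFFFFFFFF to the flag-clear register at v2m_sysreg_base + V2M_SYS_FLAGSCLR, then writes the address of the first instruction that CPU1...n will execute to the register at v2m_sysreg_base + V2M_SYS_FLAGSSET. Both register addresses are fixed by the Bootrom program inside the chip. The start address for CPU1...n is converted to a physical address with virt_to_phys(), because the MMU of CPU1...n is not yet enabled at this point.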
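The reason for clearing before setting becomes clearer with a small model. The sketch below assumes that a write to the set register ORs bits into the stored flags value and a write to the clear register removes them; that semantics is an assumption of this toy model, and the flags_model type and model_ function names are invented for the illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the flags register pair; not the real hardware interface. */
struct flags_model {
    uint32_t flags;   /* value the boot monitor would read back */
};

/* Assumed semantics: a write to FLAGSSET sets every bit that is 1. */
void model_write_flagsset(struct flags_model *m, uint32_t v)
{
    m->flags |= v;
}

/* Assumed semantics: a write to FLAGSCLR clears every bit that is 1. */
void model_write_flagsclr(struct flags_model *m, uint32_t v)
{
    m->flags &= ~v;
}

/* Same two-step sequence as v2m_flags_set(): clear all, then set. */
void model_flags_set(struct flags_model *m, uint32_t data)
{
    model_write_flagsclr(m, ~(uint32_t)0);
    model_write_flagsset(m, data);
}
```

If stale bits such as 0x0000ff00 are present, writing 0x60008000 to the set register alone would leave 0x6000ff00, which is not a valid entry address. Clearing all bits first guarantees that the value read back is exactly the address written.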
The key member of smp_operations is smp_boot_secondary(), which performs the final wake-up of a CPU. In this example it is versatile_boot_secondary().
CPU0 wakes the other CPUs with an interrupt:
/*
 * Write pen_release in a way that is guaranteed to be visible to all
 * observers, irrespective of whether they're taking part in coherency
 * or not. This is necessary for the hotplug code to work reliably.
 */
static void __cpuinit write_pen_release(int val)
{
    pen_release = val;
    smp_wmb();
    __cpuc_flush_dcache_area((void *)&pen_release, sizeof(pen_release));
    outer_clean_range(__pa(&pen_release), __pa(&pen_release + 1));
}

int __cpuinit versatile_boot_secondary(unsigned int cpu, struct task_struct *idle)
{
    unsigned long timeout;

    /*
     * Set synchronisation state between this boot processor
     * and the secondary one
     */
    spin_lock(&boot_lock);

    /*
     * This is really belt and braces; we hold unintended secondary
     * CPUs in the holding pen until we're ready for them. However,
     * since we haven't sent them a soft interrupt, they shouldn't
     * be there.
     */
    write_pen_release(cpu_logical_map(cpu));

    /*
     * Send the secondary CPU a soft interrupt, thereby causing
     * the boot monitor to read the system wide flags register,
     * and branch to the address found there.
     */
    arch_send_wakeup_ipi_mask(cpumask_of(cpu));

    timeout = jiffies + (1 * HZ);
    while (time_before(jiffies, timeout)) {
        smp_rmb();
        if (pen_release == -1)
            break;
        udelay(10);
    }

    /*
     * now the secondary core is starting up let it run its
     * calibrations, then wait for it to finish
     */
    spin_unlock(&boot_lock);

    return pen_release != -1 ? -ENOSYS : 0;
}
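versatile_boot_secondary() calls write_pen_release() to set the pen_release variable to cpu_logical_map(cpu), the CPU number of the core to wake, and then sends that CPU an IPI through arch_send_wakeup_ipi_mask(). The woken CPU leaves the WFI state and starts executing at versatile_secondary_startup, the start address that the smp_prepare_cpus() member of smp_operations, namely vexpress_smp_prepare_cpus(), set earlier through v2m_flags_set(). If all goes well, that CPU changes pen_release from its previous positive value to -1, so that CPU0 can leave the loop in which it waits for pen_release to become -1.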
versatile_secondary_startup is implemented in assembly in arch/arm/plat-versatile/headsmp.S, as shown below:
/*
 * Realview/Versatile Express specific entry point for secondary CPUs.
 * This provides a "holding pen" into which all secondary cores are held
 * until we're ready for them to initialise.
 */
ENTRY(versatile_secondary_startup)
        mrc     p15, 0, r0, c0, c0, 5
        bic     r0, #0xff000000
        adr     r4, 1f
        ldmia   r4, {r5, r6}
        sub     r4, r4, r5
        add     r6, r6, r4
pen:    ldr     r7, [r6]
        cmp     r7, r0
        bne     pen
        /*
         * we've been released from the holding pen: secondary_stack
         * should now contain the SVC stack for this core
         */
        b       secondary_startup
        .align
1:      .long   .
        .long   pen_release
ENDPROC(versatile_secondary_startup)
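The first six instructions of versatile_secondary_startup do two small calculations before the pen loop starts: they reduce the MPIDR value to a CPU identifier, and they work out where pen_release actually lives while the MMU is still off. The C sketch below restates that arithmetic. The function names mpidr_to_hwid() and runtime_addr() are invented for this illustration, and the sample addresses in the note after the code are made up.

```c
#include <assert.h>
#include <stdint.h>

/* bic r0, #0xff000000: drop the top 8 bits of MPIDR, keep the low 24. */
uint32_t mpidr_to_hwid(uint32_t mpidr)
{
    return mpidr & 0x00ffffffu;
}

/*
 * adr / ldmia / sub / add sequence:
 *   run_addr  - address of label 1 as seen by the running CPU (adr r4, 1f)
 *   link_addr - address of label 1 recorded at link time      (.long .)
 *   link_sym  - link-time address of the symbol               (.long pen_release)
 * Returns the address at which the symbol can be read right now.
 */
uint32_t runtime_addr(uint32_t run_addr, uint32_t link_addr, uint32_t link_sym)
{
    uint32_t delta = run_addr - link_addr;   /* sub r4, r4, r5 */

    return link_sym + delta;                 /* add r6, r6, r4 */
}
```

For example, if label 1 was linked at 0xC0008040 but is executing from 0x60008040, then a pen_release linked at 0xC0500000 is read from 0x60500000. The subtraction wraps around in 32-bit arithmetic, which is exactly what the sub and add instructions do in registers.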
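The pen loop in versatile_secondary_startup waits for the pen_release variable to become the cpu_logical_map(cpu) value set by CPU0, a condition that normally holds right away. The b secondary_startup instruction then branches to the kernel's generic secondary_startup() function. After a series of initialization steps (such as setting up the MMU), the newly woken CPU finally calls the smp_secondary_init() member of smp_operations, which in this example is versatile_secondary_init():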
void __cpuinit versatile_secondary_init(unsigned int cpu)
{
    /*
     * let the primary processor know we're out of the
     * pen, then head off into the C entry point
     */
    write_pen_release(-1);

    /*
     * Synchronise with the boot thread.
     */
    spin_lock(&boot_lock);
    spin_unlock(&boot_lock);
}
versatile_secondary_init() writes -1 to pen_release, so the following loop, which CPU0 is still executing in versatile_boot_secondary(), exits:
    timeout = jiffies + (1 * HZ);
    while (time_before(jiffies, timeout)) {
        smp_rmb();
        if (pen_release == -1)
            break;
        udelay(10);
    }
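CPU0 thus knows that the target CPU has been woken correctly, and from then on CPU0 and the newly woken CPUs each run on their own. While the system is running, real-time and normal processes are dynamically load-balanced across the CPUs.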
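The pen_release handshake can be tried out in user space, with a thread standing in for the secondary CPU. The sketch below is an analogy, not kernel code: it assumes POSIX threads and C11 atomics in place of the IPI, cache maintenance, and memory barriers, and the names pen_release_sim, secondary_sim(), and boot_secondary_sim() are invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <time.h>

/* -1 means "no CPU is being released", as in the kernel code. */
atomic_int pen_release_sim = -1;

/* Secondary side: spin in the pen until our id appears, then acknowledge. */
void *secondary_sim(void *arg)
{
    int my_id = *(int *)arg;

    while (atomic_load_explicit(&pen_release_sim,
                                memory_order_acquire) != my_id)
        ;   /* counterpart of the "pen:" loop */

    /* Counterpart of versatile_secondary_init(): report "I am out". */
    atomic_store_explicit(&pen_release_sim, -1, memory_order_release);
    return NULL;
}

/* Boot side: release one secondary and wait roughly one second for it.
 * Returns 0 if the secondary acknowledged in time, -1 otherwise. */
int boot_secondary_sim(int cpu)
{
    pthread_t t;
    struct timespec nap = { 0, 1000000 };   /* 1 ms between polls */
    int tries, ok = 0;

    assert(cpu >= 0);
    if (pthread_create(&t, NULL, secondary_sim, &cpu) != 0)
        return -1;

    /* Counterpart of write_pen_release(cpu_logical_map(cpu)). */
    atomic_store_explicit(&pen_release_sim, cpu, memory_order_release);

    for (tries = 0; tries < 1000; tries++) {
        if (atomic_load_explicit(&pen_release_sim,
                                 memory_order_acquire) == -1) {
            ok = 1;
            break;
        }
        nanosleep(&nap, NULL);
    }

    /* The thread always leaves the pen eventually, so joining is safe. */
    pthread_join(t, NULL);
    return ok ? 0 : -1;
}
```

Calling boot_secondary_sim(1) plays the role of CPU0 waking CPU1. The release and acquire orderings stand in for the smp_wmb() and smp_rmb() pair, and the bounded polling loop mirrors the one-second timeout. On older toolchains the program may need to be built with -pthread.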
The figure below summarizes the execution order of the functions discussed above: vexpress_smp_prepare_cpus(), versatile_boot_secondary(), write_pen_release(), versatile_secondary_startup(), and versatile_secondary_init().