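title: msp430 LED blinking experiment
date: 2023-04-15 15:31:25
description: LED blinking experiment based on the msp430f5529. 1. Experiment content. Development board: msp430f5529. LEDs used: the board's built-in User LEDs (LED1, LED2). Environment: CCS (Versi ...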
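1. Overview

1> This post describes a problem I ran into on Linux in real work: a process using CGroup sits in the R state but does not run, does not exit, and cannot be killed. After digging in, it turned out to be a kernel bug in CGroup.

2> After finding the bug, I reported it to RedHat as a vulnerability last year. Unfortunately it was not accepted, and I do not know why, so I am publishing it here on my blog.

3> The two previous posts, 《極簡cfs公平調度演算法》 (a minimal look at the CFS fair scheduling algorithm) and 《極簡組調度-CGroup如何限制cpu》 (a minimal look at group scheduling: how CGroup limits cpu), were written as background for this bug. You need the basics of Linux kernel process scheduling and CGroup control to follow how this kernel bug comes about.

4> The kernel debugging tool used in this post is crash. Its commands are documented on the official site, so I will not cover them here: https://crash-utility.github.io/help.html

2. The problem

2.1 Code that triggers the bug

2.1.1 Code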

    #include <iostream>
    #include <sys/types.h>
    #include <signal.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <errno.h>
    #include <sys/stat.h>
    #include <pthread.h>
    #include <sys/time.h>
    #include <string>
    using namespace std;

    std::string sub_cgroup_dir("/sys/fs/cgroup/cpu/test");

    // common lib
    bool is_dir(const std::string& path)
    {
        struct stat statbuf;
        if (stat(path.c_str(), &statbuf) == 0) {
            if (0 != S_ISDIR(statbuf.st_mode)) {
                return true;
            }
        }
        return false;
    }

    // Overwrite file_path with the decimal text of num.
    bool write_file(const std::string& file_path, int num)
    {
        FILE* fp = fopen(file_path.c_str(), "w");
        if (fp == NULL) {
            return false;
        }
        std::string write_data = to_string(num);
        fputs(write_data.c_str(), fp);
        fclose(fp);
        return true;
    }

    // ms
    long get_ms_timestamp()
    {
        timeval tv;
        gettimeofday(&tv, NULL);
        return (tv.tv_sec * 1000 + tv.tv_usec / 1000);
    }

    // cgroup
    // Create the test cgroup if needed and move this process into it.
    bool create_cgroup()
    {
        if (is_dir(sub_cgroup_dir) == false) {
            if (mkdir(sub_cgroup_dir.c_str(), S_IRWXU | S_IRGRP) != 0) {
                cout << "mkdir cgroup dir fail" << endl;
                return false;
            }
        }
        int pid = getpid();
        cout << "pid is " << pid << endl;
        std::string procs_path = sub_cgroup_dir + "/cgroup.procs";
        return write_file(procs_path, pid);
    }

    bool set_period(int period)
    {
        std::string period_path = sub_cgroup_dir + "/cpu.cfs_period_us";
        return write_file(period_path, period);
    }

    bool set_quota(int quota)
    {
        std::string quota_path = sub_cgroup_dir + "/cpu.cfs_quota_us";
        return write_file(quota_path, quota);
    }

    // thread
    // param: ms interval
    // Burn cpu, sleeping for 1 ms once every `interval` ms.
    void* thread_func(void* param)
    {
        unsigned int i = 0;  // unsigned so that wrap-around is well defined
        int interval = (long)param;
        long last = get_ms_timestamp();
        while (true) {
            i++;
            if (i % 1000 != 0) {
                continue;
            }
            long current = get_ms_timestamp();
            if ((current - last) >= interval) {
                usleep(1000);
                last = current;
            }
        }
        pthread_exit(NULL);
    }

    void test_thread()
    {
        const int k_thread_num = 10;
        pthread_t pthreads[k_thread_num];
        for (int i = 0; i < k_thread_num; i++) {
            if (pthread_create(&pthreads[i], NULL, thread_func, (void*)(long)(i + 1)) != 0) {
                cout << "create thread fail" << endl;
            } else {
                cout << "create thread success,tid is " << pthreads[i] << endl;
            }
        }
    }

    // argv[1] : period
    // argv[2] : quota
    int main(int argc, char* argv[])
    {
        if (argc < 3) {
            cout << "usage : ./inactive_timer $period $quota" << endl;
            return -1;
        }
        int period = stoi(argv[1]);
        int quota = stoi(argv[2]);
        cout << "period is " << period << endl;
        cout << "quota is " << quota << endl;

        test_thread();
        if (create_cgroup() == false) {
            cout << "create cgroup fail" << endl;
            return -1;
        }

        // Rewrite period and quota forever, waiting 1 to 21 ms between rounds.
        int i = 0;
        while (true) {
            if (i > 20) {
                i = 0;
            }
            i++;
            long current = get_ms_timestamp();
            long last = current;
            while ((current - last) < i) {
                usleep(1000);
                current = get_ms_timestamp();
            }
            set_period(period);
            set_quota(quota);
        }
        return 0;
    }

2.1.2 Compile
g++ -std=c++11 trigger_cgroup_timer_inactive.cpp -o inactive_timer -pthread
2.1.3 Run the program on a CentOS 7.0 to 7.5 system
./inactive_timer 100000 10000
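The two arguments are the values the program writes to cpu.cfs_period_us (100000) and cpu.cfs_quota_us (10000). As a rough illustration of what that pair means, here is a minimal sketch of the per-period accounting. `BandwidthModel`, `PeriodResult`, `run_one_period` and `max_cpu_percent` are hypothetical names made up for this example; this is a toy model, not kernel code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical model of one cgroup's CFS bandwidth settings (not kernel code).
struct BandwidthModel {
    std::int64_t period_us;  // value written to cpu.cfs_period_us
    std::int64_t quota_us;   // value written to cpu.cfs_quota_us
};

// Outcome of a single period for the whole group.
struct PeriodResult {
    std::int64_t ran_us;  // cpu time the group actually received
    bool throttled;       // true if the demand exceeded the quota
};

// Models one period: the group asks for demand_us of cpu time (summed over
// all of its threads) and is cut off once the quota is used up.
PeriodResult run_one_period(const BandwidthModel& bw, std::int64_t demand_us) {
    assert(bw.period_us > 0 && bw.quota_us > 0 && demand_us >= 0);
    PeriodResult r;
    r.ran_us = std::min(demand_us, bw.quota_us);
    r.throttled = demand_us > bw.quota_us;
    return r;
}

// Upper bound on cpu usage, as a percentage of one cpu.
std::int64_t max_cpu_percent(const BandwidthModel& bw) {
    assert(bw.period_us > 0);
    return bw.quota_us * 100 / bw.period_us;
}
```

With period 100000 and quota 10000 the group is capped at 10% of one cpu, so ten busy threads exhaust the quota in every period and the group is throttled over and over. That keeps the period timer path busy while the main loop keeps rewriting both files.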
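2.1.4 The code above does three main things

1> Puts its own process under CGroup cpu control

2> Repeatedly sets the CGroup's cpu.cfs_period_us and cpu.cfs_quota_us

3> Starts 10 threads that burn cpu

2.1.5 《極簡組調度-CGroup如何限制cpu》 already covered how CGroup limits cpu:

Within each time period given by cfs_period_us, the processes in the CGroup may use cfs_quota_us worth of cpu time. If the group uses more than cfs_quota_us within a period, it is throttled, that is, removed from the fair scheduling run queue. It then waits for the timer to fire at the next period and unthrottle it, after which it is put back on the run queue. This is how cpu usage is limited.

2.2 Symptoms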
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929251-1217747819.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929248-1384460606.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929346-389924953.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929347-1958393815.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929273-1926683778.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929311-288335838.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929373-1384389752.png)
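
    tg_set_cfs_quota()
        tg_set_cfs_bandwidth()
            /* restart the period timer (if active) to handle new period expiry */
            if (runtime_enabled && cfs_b->timer_active) {
                /* force a reprogram */
                cfs_b->timer_active = 0;
                __start_cfs_bandwidth(cfs_b);
            }

Look closely at the code above and consider the following scenario:

1> While thread A is setting the CGroup's quota or period, it sets cfs_b->timer_active to 0 and calls __start_cfs_bandwidth(). Before it reaches hrtimer_cancel() at line 580 of __start_cfs_bandwidth(), the cpu switches to thread B.

2> Thread B also calls __start_cfs_bandwidth(). By the time it finishes, it has set cfs_b->timer_active to 1 and called start_bandwidth_timer() to activate the timer. The cpu then switches back to thread A.

3> Thread A resumes and calls hrtimer_cancel(), which deactivates period_timer. It then reaches line 585 of __start_cfs_bandwidth(), sees that cfs_b->timer_active is 1, and returns directly without reactivating period_timer.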
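The three-step interleaving can be replayed deterministically in a toy model. The sketch below is only an illustration of the scenario as described above: `TimerModel`, `Step`, `apply_step` and `run_steps` are hypothetical names, and the two booleans stand in for cfs_b->timer_active and for whether period_timer is actually queued. It is not the kernel's implementation.

```cpp
#include <vector>

// Hypothetical stand-in for the relevant cfs_bandwidth state (not kernel code).
struct TimerModel {
    bool timer_active = true;   // models cfs_b->timer_active
    bool hrtimer_armed = true;  // models whether period_timer is really queued
};

// The three actions from the scenario, each treated as one atomic step.
enum class Step {
    A_ClearFlag,       // thread A: cfs_b->timer_active = 0
    B_StartTimer,      // thread B: sets timer_active = 1 and arms the timer
    A_CancelAndCheck   // thread A: hrtimer_cancel(), then checks timer_active
};

void apply_step(TimerModel& m, Step s) {
    switch (s) {
    case Step::A_ClearFlag:
        m.timer_active = false;
        break;
    case Step::B_StartTimer:
        m.timer_active = true;
        m.hrtimer_armed = true;
        break;
    case Step::A_CancelAndCheck:
        m.hrtimer_armed = false;  // the cancel also removes B's timer
        if (m.timer_active) {
            return;               // flag says "active", so A does not re-arm
        }
        m.timer_active = true;    // normal path: A re-arms the timer itself
        m.hrtimer_armed = true;
        break;
    }
}

// Applies the steps in the given order and returns the final state.
TimerModel run_steps(const std::vector<Step>& order) {
    TimerModel m;
    for (Step s : order) {
        apply_step(m, s);
    }
    return m;
}
```

Running the steps in the order A_ClearFlag, B_StartTimer, A_CancelAndCheck leaves the model with timer_active set but no armed timer, which is the inconsistent state the scenario ends in. If thread A runs without thread B cutting in, the flag and the timer stay in agreement.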
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929312-596333791.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415190752064-12523573.png)
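2> Once the timer has been deactivated, thread B's cfs_b->timer_active = 1 from 3.2 is still in place. So even when a later clock interrupt reaches assign_cfs_rq_runtime(), the timer is wrongly judged to be active and __start_cfs_bandwidth() is not called to reactivate it. As a result, the throttled group se is never unthrottled and put back on the rq for scheduling.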
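To see why the group never recovers, here is a small sketch of that consequence. It assumes, as described above, that the timer is restarted only when the flag says it is inactive, and that only an armed timer can unthrottle the group. `GroupModel`, `on_tick`, `on_period_end` and `still_throttled_after` are hypothetical names for this illustration, not kernel functions.

```cpp
// Hypothetical model of the state left behind by the race (not kernel code).
struct GroupModel {
    bool timer_active;   // models cfs_b->timer_active
    bool hrtimer_armed;  // whether period_timer is really queued
    bool throttled;      // whether the group se is off the run queue
};

// Models the check described for assign_cfs_rq_runtime(): the timer is
// restarted only when the flag says it is inactive.
void on_tick(GroupModel& g) {
    if (!g.timer_active) {
        g.timer_active = true;
        g.hrtimer_armed = true;
    }
}

// Models the end of a period: only an armed timer can fire and unthrottle.
void on_period_end(GroupModel& g) {
    if (g.hrtimer_armed) {
        g.throttled = false;
    }
}

// Runs `periods` periods and reports whether the group is still throttled.
bool still_throttled_after(GroupModel g, int periods) {
    for (int p = 0; p < periods; ++p) {
        on_tick(g);
        on_period_end(g);
    }
    return g.throttled;
}
```

Starting from the post-race state (flag set, timer not armed, group throttled), the model stays throttled no matter how many periods pass. If the flag had been left at 0 instead, the first tick would re-arm the timer and the group would recover within one period.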
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929275-317826516.png)
![](https://img2023.cnblogs.com/blog/818872/202304/818872-20230415173929292-1442914109.png)