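1. Preface
Hmm, we hit yet another problem: our business applications can only be released inside a fixed deployment window, which cannot keep up with normal iteration. Releasing outside the window tends to interrupt threads that are working with the database, MQ, JSF and so on, which in turn forces us to repair orders by hand. We therefore added graceful shutdown. When graceful shutdown is selected before a release, the application first cuts off new incoming traffic, then reserves some time to finish the existing connections, and only then goes fully offline. This effectively widens the usable release window and reduces thread interruptions during a release, and with them the manual order repairs. But what exactly is graceful shutdown? And why does the existing technology stack not provide it natively? This article summarizes what I found.
2. What is graceful shutdown?
• Graceful shutdown means notifying the application process when it is about to close, so that it can release the resources it holds.
• Thread pools: choose between shutdown (accept no new tasks and wait for submitted ones to finish) and shutdownNow (interrupt the workers via Thread.interrupt).
• Socket connections, for example netty, jmq, fmq. (These need particular attention.)
• Telling the registry to take the instance offline quickly, for example jsf. (This needs particular attention.)
• Cleaning up temporary files.
• Releasing on-heap and off-heap memory.
In short, forcibly terminating a process can lose data or leave the terminal unable to return to a normal state, and in a distributed environment it can leave data inconsistent.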
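The thread pool item above can be sketched in plain Java. This is a minimal sketch, not code from our system: the class name PoolShutdown and the timeout parameter are made up for illustration. It tries shutdown first and only falls back to shutdownNow when the pool does not drain in time.

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

final class PoolShutdown {
    private PoolShutdown() {
    }

    /**
     * Stops intake, waits up to timeoutMillis for queued and running tasks,
     * then falls back to shutdownNow(). Returns true if the pool drained in time.
     */
    static boolean shutdownGracefully(ExecutorService pool, long timeoutMillis) {
        pool.shutdown(); // no new tasks; already submitted tasks keep running
        try {
            if (pool.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            // Interrupts the workers and returns the tasks that never started.
            List<Runnable> dropped = pool.shutdownNow();
            System.err.println("Forced shutdown, tasks never started: " + dropped.size());
            return false;
        } catch (InterruptedException e) {
            pool.shutdownNow();
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}
```

Calling PoolShutdown.shutdownGracefully(pool, 10_000) from a shutdown hook gives running tasks up to ten seconds before they are interrupted.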
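3. One culprit behind ungraceful shutdowns: the kill command
• The kill command
◦ kill -15: the default for kill. It only sends a SIGTERM signal asking the process to terminate, and the process decides for itself what to do, so it will not necessarily exit. For that reason kill -15 on its own is usually not relied on to stop a process.
◦ kill -9: forcibly terminates the process at once. kill -9 is too brutal: it often interrupts transactions and business processing midway, leaving dirty data in the database and leftover files on the system. If you must use kill -9, try kill -15 first to give the process a chance to clean up. This command can be used to simulate extreme cases such as a system crash or a power failure.
◦ kill -2: similar to exiting with Ctrl + C. The sequence is: stop the running code immediately -> save data -> terminate the process. Data is saved before the process ends, but transactions and business processing are still interrupted, so this does not give graceful shutdown either. (For a JVM, INT and TERM are routed to the same handler, as the Terminator code in section 4 shows.)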
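4. Follow-up question: how does the JVM receive and handle Linux signals?
• When the JVM starts, it installs its own SignalHandler; when the JVM is asked to shut down, the corresponding handle method is triggered.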
public interface SignalHandler {
    SignalHandler SIG_DFL = new NativeSignalHandler(0L);
    SignalHandler SIG_IGN = new NativeSignalHandler(1L);
    void handle(Signal var1);
}
class Terminator {
    private static SignalHandler handler = null;
    Terminator() {
    }
    // The JVM installs the SignalHandler here; triggered from System.initializeSystemClass
    static void setup() {
        if (handler == null) {
            SignalHandler var0 = new SignalHandler() {
                public void handle(Signal var1) {
                    Shutdown.exit(var1.getNumber() + 128); // calls Shutdown.exit
                }
            };
            handler = var0;
            try {
                Signal.handle(new Signal("INT"), var0); // on interrupt (SIGINT)
            } catch (IllegalArgumentException var3) {
            }
            try {
                Signal.handle(new Signal("TERM"), var0); // on terminate (SIGTERM)
            } catch (IllegalArgumentException var2) {
            }
        }
    }
}
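The handler above exits with the signal number plus 128, which is why a JVM stopped by kill -15 reports exit status 143. A tiny sketch of that arithmetic follows; the class SignalExitCodes is made up for illustration, and the signal numbers are the usual Linux values.

```java
final class SignalExitCodes {
    static final int SIGINT = 2;   // kill -2, Ctrl + C
    static final int SIGTERM = 15; // kill -15, the default for kill

    private SignalExitCodes() {
    }

    /** Mirrors the handler above: exit status = signal number + 128. */
    static int exitStatusFor(int signalNumber) {
        return signalNumber + 128;
    }

    public static void main(String[] args) {
        System.out.println("SIGINT  -> " + exitStatusFor(SIGINT));
        System.out.println("SIGTERM -> " + exitStatusFor(SIGTERM));
    }
}
```

kill -9 never reaches this handler because SIGKILL cannot be caught; the shell applies the same 128 + 9 convention itself and reports 137.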
• Runtime.addShutdownHook. Before looking at Shutdown.exit, first look at Runtime.getRuntime().addShutdownHook(shutdownHook): it adds a shutdown hook to the JVM, and the hook is called when the JVM shuts down.
public class Runtime {
    public void addShutdownHook(Thread hook) {
        SecurityManager sm = System.getSecurityManager();
        if (sm != null) {
            sm.checkPermission(new RuntimePermission("shutdownHooks"));
        }
        ApplicationShutdownHooks.add(hook);
    }
}
class ApplicationShutdownHooks {
    /* The set of registered hooks */
    private static IdentityHashMap<Thread, Thread> hooks;
    static synchronized void add(Thread hook) {
        if(hooks == null)
            throw new IllegalStateException("Shutdown in progress");
        if (hook.isAlive())
            throw new IllegalArgumentException("Hook already running");
        if (hooks.containsKey(hook))
            throw new IllegalArgumentException("Hook previously registered");
        hooks.put(hook, hook);
    }
}
// Holds the data structures and logic that manage the virtual machine's shutdown sequence
class Shutdown {
    /* Shutdown states */
    private static final int RUNNING = 0;
    private static final int HOOKS = 1;
    private static final int FINALIZERS = 2;
    private static int state = RUNNING;
    /* Should all finalizers be run on exit? */
    private static boolean runFinalizersOnExit = false;
    // System shutdown hooks are registered in predefined slots.
    // The list of shutdown hooks is as follows:
    // (0) Console restore hook
    // (1) Application hooks
    // (2) DeleteOnExit hook
    private static final int MAX_SYSTEM_HOOKS = 10;
    private static final Runnable[] hooks = new Runnable[MAX_SYSTEM_HOOKS];
    // Index of the shutdown hook that is currently running
    private static int currentRunningHook = 0;
    /* The preceding static fields are protected by this lock */
    private static class Lock { };
    private static Object lock = new Lock();
    /* Lock object for the native halt method */
    private static Object haltLock = new Lock();
    static void add(int slot, boolean registerShutdownInProgress, Runnable hook) {
        synchronized (lock) {
            if (hooks[slot] != null)
                throw new InternalError("Shutdown hook at slot " + slot + " already registered");
            if (!registerShutdownInProgress) { // this hook may not be added during shutdown
                if (state > RUNNING) // shutdown has already started, so the hook is rejected
                    throw new IllegalStateException("Shutdown in progress");
            } else { // rejected once the hooks have finished, or while they run if the slot is at or before the one currently running
                if (state > HOOKS || (state == HOOKS && slot <= currentRunningHook))
                    throw new IllegalStateException("Shutdown in progress");
            }
            hooks[slot] = hook;
        }
    }
    /* Runs all registered hooks
     */
    private static void runHooks() {
        for (int i=0; i < MAX_SYSTEM_HOOKS; i++) {
            try {
                Runnable hook;
                synchronized (lock) {
                    // acquire the lock to make sure the hook registered during
                    // shutdown is visible here.
                    currentRunningHook = i;
                    hook = hooks[i];
                }
                if (hook != null) hook.run();
            } catch(Throwable t) {
                if (t instanceof ThreadDeath) {
                    ThreadDeath td = (ThreadDeath)t;
                    throw td;
                }
            }
        }
    }
    /* Halts the JVM
     */
    static void halt(int status) {
        synchronized (haltLock) {
            halt0(status);
        }
    }
    // JNI method
    static native void halt0(int status);
    // Shutdown order: runHooks > runFinalizersOnExit
    private static void sequence() {
        synchronized (lock) {
            /* Guard against the possibility of a daemon thread invoking exit
             * after DestroyJavaVM initiates the shutdown sequence
             */
            if (state != HOOKS) return;
        }
        runHooks();
        boolean rfoe;
        synchronized (lock) {
            state = FINALIZERS;
            rfoe = runFinalizersOnExit;
        }
        if (rfoe) runAllFinalizers();
    }
    // Runs on Runtime.exit: runHooks > runFinalizersOnExit > halt
    static void exit(int status) {
        boolean runMoreFinalizers = false;
        synchronized (lock) {
            if (status != 0) runFinalizersOnExit = false;
            switch (state) {
            case RUNNING: /* Initiate shutdown */
                state = HOOKS;
                break;
            case HOOKS: /* Stall and halt */
                break;
            case FINALIZERS:
                if (status != 0) {
                    /* Halt immediately on nonzero status */
                    halt(status);
                } else {
                    /* Compatibility with old behavior:
                     * Run more finalizers and then halt
                     */
                    runMoreFinalizers = runFinalizersOnExit;
                }
                break;
            }
        }
        if (runMoreFinalizers) {
            runAllFinalizers();
            halt(status);
        }
        synchronized (Shutdown.class) {
            /* Synchronize on the class object, causing any other thread
             * that attempts to initiate shutdown to stall indefinitely
             */
            sequence();
            halt(status);
        }
    }
    // shutdown: unlike exit, it does not call halt (which stops the JVM)
    static void shutdown() {
        synchronized (lock) {
            switch (state) {
            case RUNNING: /* Initiate shutdown */
                state = HOOKS;
                break;
            case HOOKS: /* Stall and then return */
            case FINALIZERS:
                break;
            }
        }
        synchronized (Shutdown.class) {
            sequence();
        }
    }
}
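Seen from the application side, the machinery above boils down to one call. Here is a minimal sketch (the class HookDemo and its method names are made up for illustration) that registers a hook and shows the "Hook previously registered" check from ApplicationShutdownHooks.add rejecting a second registration of the same thread.

```java
final class HookDemo {
    private HookDemo() {
    }

    /** Registers a hook that runs when the JVM begins an orderly shutdown. */
    static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "cleanup-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    /** Returns true if registering the same thread a second time is rejected. */
    static boolean rejectsDuplicate(Thread hook) {
        try {
            Runtime.getRuntime().addShutdownHook(hook);
            return false;
        } catch (IllegalArgumentException expected) {
            return true; // "Hook previously registered"
        }
    }

    public static void main(String[] args) {
        Thread hook = register(() -> System.out.println("releasing resources"));
        System.out.println("duplicate rejected: " + rejectsDuplicate(hook));
        // When main returns, the last non-daemon thread ends, the shutdown
        // sequence starts and the hook runs. kill -15 triggers the same path.
    }
}
```

A hook registered this way runs on normal exit, System.exit, kill -15 and kill -2, but not on kill -9, which is the reason section 3 warns against it.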
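5. How does Spring implement graceful shutdown?
• Taking Spring 3.2.12 as the example: Spring uses the ContextClosedEvent event to trigger certain actions, and relies mainly on LifecycleProcessor.onClose to run stopBeans. As the code shows, Spring is also an extension built on top of the JVM mechanism.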
public abstract class AbstractApplicationContext extends DefaultResourceLoader {
    public void registerShutdownHook() {
        if (this.shutdownHook == null) {
            // No shutdown hook registered yet.
            this.shutdownHook = new Thread() {
                @Override
                public void run() {
                    doClose();
                }
            };
            Runtime.getRuntime().addShutdownHook(this.shutdownHook);
        }
    }
    protected void doClose() {
        boolean actuallyClose;
        synchronized (this.activeMonitor) {
            actuallyClose = this.active && !this.closed;
            this.closed = true;
        }
        if (actuallyClose) {
            if (logger.isInfoEnabled()) {
                logger.info("Closing " + this);
            }
            LiveBeansView.unregisterApplicationContext(this);
            try {
                // Publish the in-application close event
                publishEvent(new ContextClosedEvent(this));
            }
            catch (Throwable ex) {
                logger.warn("Exception thrown from ApplicationListener handling ContextClosedEvent", ex);
            }
            // Stop all Lifecycle beans.
            try {
                getLifecycleProcessor().onClose();
            }
            catch (Throwable ex) {
                logger.warn("Exception thrown from LifecycleProcessor on context close", ex);
            }
            // Destroy the singleton beans cached in Spring's BeanFactory.
            destroyBeans();
            // Close the current application context (BeanFactory)
            closeBeanFactory();
            // Run the subclass's close logic
            onClose();
            synchronized (this.activeMonitor) {
                this.active = false;
            }
        }
    }
}
public interface LifecycleProcessor extends Lifecycle {
    /**
     * Notification of context refresh, e.g. for auto-starting components.
     */
    void onRefresh();
    /**
     * Notification of context close phase, e.g. for auto-stopping components.
     */
    void onClose();
}
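To make the order of doClose easier to see without Spring on the classpath, here is a toy model in plain Java. It is only a sketch: ToyContext and its methods are hypothetical names, not Spring classes, and as a simplification every phase isolates failures, whereas the real doClose only wraps the event and Lifecycle phases at this level.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the close sequence; hypothetical names, not Spring's classes. */
final class ToyContext {
    private final Object monitor = new Object();
    private boolean active = true;
    private boolean closed = false;
    private final List<Runnable> closedEventListeners = new ArrayList<>();
    private final List<Runnable> lifecycleStops = new ArrayList<>();
    private final List<Runnable> destroyCallbacks = new ArrayList<>();

    void onClosedEvent(Runnable listener) { closedEventListeners.add(listener); }
    void onLifecycleStop(Runnable stop) { lifecycleStops.add(stop); }
    void onDestroy(Runnable destroy) { destroyCallbacks.add(destroy); }

    void close() {
        boolean actuallyClose;
        synchronized (monitor) {
            actuallyClose = active && !closed; // only the first close() proceeds
            closed = true;
        }
        if (!actuallyClose) {
            return;
        }
        runAll(closedEventListeners); // 1. tell listeners the context is closing
        runAll(lifecycleStops);       // 2. stop Lifecycle-style components
        runAll(destroyCallbacks);     // 3. destroy beans
        synchronized (monitor) {
            active = false;
        }
    }

    // Simplification: a failing callback is logged and the rest still run.
    private static void runAll(List<Runnable> callbacks) {
        for (Runnable callback : callbacks) {
            try {
                callback.run();
            } catch (RuntimeException ex) {
                System.err.println("ignored during close: " + ex);
            }
        }
    }
}
```

The point to take away is the ordering: listeners and Lifecycle components get their turn while the beans they depend on are still alive, and a second close() call is a no-op.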
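6. How does Spring Boot achieve graceful shutdown?
• Graceful shutdown is one of Spring Boot's features: after receiving a termination signal it no longer accepts or processes new requests, but it reserves a short grace period before the process terminates so that in-flight requests can complete. Note: graceful shutdown is only supported with Tomcat 9.0.33 or later.
• Spring Boot's spring-boot-starter-actuator module provides a RESTful endpoint for shutting the application down. Send the request curl -X POST http://127.0.0.1:8088/shutdown, and a message is returned once the shutdown succeeds. Note: in production the URL must be access-controlled, for example with spring-security or by having nginx restrict it to the internal network.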
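# Enable the shutdown endpoint
endpoints.shutdown.enabled=true
# Disable password verification
endpoints.shutdown.sensitive=false
# Optionally set one common path for all endpoints
management.context-path=/manage
# Set the management port and IP
management.port=8088
management.address=127.0.0.1
# Alternatively, turn on security verification for shutdown (spring-security)
endpoints.shutdown.sensitive=true
# User name for verification
security.user.name=admin
# Password for verification
security.user.password=secret
# Role
management.security.role=SUPERUSER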
• Spring Boot's shutdown endpoint is implemented by calling AbstractApplicationContext.close.
@ConfigurationProperties(
    prefix = "endpoints.shutdown"
)
public class ShutdownMvcEndpoint extends EndpointMvcAdapter {
    public ShutdownMvcEndpoint(ShutdownEndpoint delegate) {
        super(delegate);
    }
    // POST request
    @PostMapping(
        produces = {"application/vnd.spring-boot.actuator.v1+json", "application/json"}
    )
    @ResponseBody
    public Object invoke() {
        return !this.getDelegate().isEnabled() ? new ResponseEntity(Collections.singletonMap("message", "This endpoint is disabled"), HttpStatus.NOT_FOUND) : super.invoke();
    }
}
@ConfigurationProperties(
    prefix = "endpoints.shutdown"
)
public class ShutdownEndpoint extends AbstractEndpoint<Map<String, Object>> implements ApplicationContextAware {
    private static final Map<String, Object> NO_CONTEXT_MESSAGE = Collections.unmodifiableMap(Collections.singletonMap("message", "No context to shutdown."));
    private static final Map<String, Object> SHUTDOWN_MESSAGE = Collections.unmodifiableMap(Collections.singletonMap("message", "Shutting down, bye..."));
    private ConfigurableApplicationContext context;
    public ShutdownEndpoint() {
        super("shutdown", true, false);
    }
    // Perform the shutdown (decompiled code, hence names such as var6 and NamelessClass_1)
    public Map<String, Object> invoke() {
        if (this.context == null) {
            return NO_CONTEXT_MESSAGE;
        } else {
            boolean var6 = false;
            Map var1;
            class NamelessClass_1 implements Runnable {
                NamelessClass_1() {
                }
                public void run() {
                    try {
                        Thread.sleep(500L);
                    } catch (InterruptedException var2) {
                        Thread.currentThread().interrupt();
                    }
                    // This is the call to AbstractApplicationContext.close
                    ShutdownEndpoint.this.context.close();
                }
            }
            try {
                var6 = true;
                var1 = SHUTDOWN_MESSAGE;
                var6 = false;
            } finally {
                if (var6) {
                    Thread thread = new Thread(new NamelessClass_1());
                    thread.setContextClassLoader(this.getClass().getClassLoader());
                    thread.start();
                }
            }
            Thread thread = new Thread(new NamelessClass_1());
            thread.setContextClassLoader(this.getClass().getClassLoader());
            thread.start();
            return var1;
        }
    }
}
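The endpoint settings in this section use the Spring Boot 1.x property names. On Spring Boot 2.3 or later (an assumption about the version you run), the built-in graceful shutdown mentioned at the start of this section is switched on with configuration along these lines. Treat it as a sketch and check it against your own version; the 30s value is just the default made explicit.

```properties
# Stop accepting new requests on shutdown and let in-flight ones finish
server.shutdown=graceful
# Upper bound on how long each shutdown phase may wait
spring.lifecycle.timeout-per-shutdown-phase=30s
# Actuator 2.x names for the shutdown endpoint (served at /actuator/shutdown)
management.endpoint.shutdown.enabled=true
management.endpoints.web.exposure.include=shutdown
```

With this in place a kill -15 is enough to trigger the graceful path; the actuator endpoint is only needed when you want to trigger shutdown over HTTP, and it still has to be protected as noted above.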
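7. Background: how are Tomcat and Spring related?
While taking part in the graceful shutdown refactoring of 雲工廠 (Cloud Factory), I found problems on both the Tomcat side and the Spring side, so I looked into how the two relate.
• Tomcat and Jetty are HTTP servers and Servlet containers. They provide the runtime environment for Servlet-based applications such as Spring. The boundary between the two roles is this: think of the HTTP server as the front-desk receptionist, responsible for network communication and for parsing requests, and of the Servlet container as the business department, responsible for handling business requests.
• Tomcat, as a web server combined with a Servlet container, accepts HTTP requests from the network and parses them into the request and response objects defined by the Servlet specification. For example, the HttpServletRequest object is provided by Tomcat. Servlet is the specification, Tomcat is a Servlet container that implements it, and SpringMVC is an application that handles Servlet requests: DispatcherServlet implements the Servlet interface, and Tomcat loads and invokes DispatcherServlet. DispatcherServlet also has its own container (the SpringMVC container), which manages the SpringMVC beans such as Controller and ViewResolver. Spring's other beans, such as Service and DAO, are managed by the global Spring IoC container. Spring therefore has two IoC containers.
• If you use only Spring (without SpringMVC), the Tomcat container parses the XML file, instantiates the corresponding classes through reflection, and, based on these Servlet implementation classes, triggers the corresponding code. In this case Tomcat is responsible both for parsing HTTP messages and for dispatching to servlets.
• If you use Spring MVC, Tomcat only parses the HTTP message and forwards it to DispatcherServlet. SpringMVC then instantiates the corresponding classes according to its configuration, runs the corresponding logic and returns the result to DispatcherServlet, which hands it back to Tomcat. Tomcat is responsible for building the HTTP response message.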
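The "two IoC containers" point is easier to picture with a toy model. The sketch below is plain Java with made-up names (ToyContainer is not a Spring class); it only illustrates the lookup rule that a child container falls back to its parent while the parent cannot see the child's beans.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Toy parent/child container; hypothetical, for illustration only. */
final class ToyContainer {
    private final Map<String, Object> beans = new HashMap<>();
    private final ToyContainer parent; // null for the root container

    ToyContainer(ToyContainer parent) {
        this.parent = parent;
    }

    void register(String name, Object bean) {
        beans.put(name, bean);
    }

    /** Looks in this container first, then walks up to the parent. */
    Optional<Object> lookup(String name) {
        Object local = beans.get(name);
        if (local != null) {
            return Optional.of(local);
        }
        return parent == null ? Optional.empty() : parent.lookup(name);
    }

    public static void main(String[] args) {
        ToyContainer root = new ToyContainer(null); // Service and DAO beans
        ToyContainer mvc = new ToyContainer(root);  // Controller beans
        root.register("orderService", new Object());
        mvc.register("orderController", new Object());
        System.out.println(mvc.lookup("orderService").isPresent());
        System.out.println(root.lookup("orderController").isPresent());
    }
}
```

This is why a Controller can have a Service injected but not the other way around, and it is also why the two containers are closed separately at shutdown.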
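8. Hands-on practice
• MQ (jmq, fmq): add a hook that calls pause at shutdown to stop this application's consumption first, so that threads in the MQ thread pool are not interrupted during a release.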
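import javax.annotation.PreDestroy;
import javax.annotation.Resource;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.stereotype.Component;

/**
 * @ClassName ShutDownHook
 * @Description
 * @Date 2022/10/28 17:47
 **/
@Component
@Slf4j
public class ShutDownHook {
    // Seconds to wait for in-flight messages; defaults to 10
    @Value("${shutdown.waitTime:10}")
    private int waitTime;
    @Resource
    com.jdjr.fmq.client.consumer.MessageConsumer fmqMessageConsumer;
    @Resource
    com.jd.jmq.client.consumer.MessageConsumer jmqMessageConsumer;
    @PreDestroy
    public void destroyHook() {
        try {
            log.info("ShutDownHook destroy");
            // Stop consuming first, then give in-flight messages time to finish
            jmqMessageConsumer.pause();
            fmqMessageConsumer.pause();
            int i = 0;
            while (i < waitTime) {
                try {
                    Thread.sleep(1000);
                    log.info("{} seconds until the service shuts down", waitTime - i++);
                } catch (Throwable e) {
                    log.error("Exception", e);
                }
            }
        } catch (Throwable e) {
            log.error("Exception", e);
        }
    }
}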
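One thing the loop above glosses over is interruption: catching Throwable around Thread.sleep swallows InterruptedException, so the countdown keeps going even if the container is trying to hurry the shutdown along. A possible variant is sketched below in plain Java. DrainWait and the Pausable interface are hypothetical stand-ins for the real jmq/fmq consumers, not their actual API.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

final class DrainWait {
    /** Hypothetical stand-in for an MQ consumer that can stop pulling messages. */
    interface Pausable {
        void pause();
    }

    private DrainWait() {
    }

    /**
     * Pauses every consumer, then waits waitSeconds for in-flight messages.
     * Returns false if the wait was cut short by an interrupt.
     */
    static boolean pauseAndWait(List<Pausable> consumers, int waitSeconds) {
        for (Pausable consumer : consumers) {
            try {
                consumer.pause();
            } catch (RuntimeException e) {
                // One failing consumer should not stop the others from pausing.
                System.err.println("pause failed, continuing: " + e);
            }
        }
        for (int remaining = waitSeconds; remaining > 0; remaining--) {
            System.out.println("seconds until shutdown: " + remaining);
            try {
                TimeUnit.SECONDS.sleep(1);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag and stop waiting
                return false;
            }
        }
        return true;
    }
}
```

Inside a @PreDestroy method this would be called as DrainWait.pauseAndWait(consumers, waitTime), with each real consumer wrapped as a Pausable via a method reference such as jmqMessageConsumer::pause.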
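• During graceful shutdown the jsf providers must be taken offline first, with some time reserved for calls already in progress to finish. 行雲 (Xingyun) deployment has a stop.sh script for this; in the project it is done by a method written in the shutdown class. For an analysis of jsf start and stop, see JD's internal cf documentation.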
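import javax.annotation.PostConstruct;
import javax.annotation.PreDestroy;
import javax.annotation.Resource;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.BeansException;
import org.springframework.beans.factory.BeanCreationNotAllowedException;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.ApplicationContext;
import org.springframework.context.ApplicationContextAware;
import org.springframework.context.annotation.Lazy;
import org.springframework.context.support.AbstractApplicationContext;
import org.springframework.stereotype.Component;
// ProviderBean comes from the internal jsf client library; its import is omitted here.

@Component
@Lazy(value = false)
public class ShutDown implements ApplicationContextAware {
    private static Logger logger = LoggerFactory.getLogger(ShutDown.class);
    // Seconds to wait for in-flight work; defaults to 60
    @Value("${shutdown.waitTime:60}")
    private int waitTime;
    @Resource
    com.jdjr.fmq.client.consumer.MessageConsumer fmqMessageConsumer;
    @PostConstruct
    public void init() {
        logger.info("ShutDownHook init");
    }
    private ApplicationContext applicationContext = null;
    @PreDestroy
    public void destroyHook() {
        try {
            logger.info("ShutDownHook destroy");
            // Take the jsf providers offline first, then stop MQ consumption, then wait
            destroyJsfProvider();
            fmqMessageConsumer.pause();
            int i = 0;
            while (i < waitTime) {
                try {
                    Thread.sleep(1000);
                    logger.info("{} seconds until the service shuts down", waitTime - i++);
                } catch (Throwable e) {
                    logger.error("Exception", e);
                }
            }
        } catch (Throwable e) {
            logger.error("Exception", e);
        }
    }
    private void destroyJsfProvider() {
        logger.info("Closing all JSF providers");
        if (null != applicationContext) {
            String[] providerBeanNames = applicationContext.getBeanNamesForType(ProviderBean.class);
            for (String name : providerBeanNames) {
                try {
                    logger.info("Trying to close JSF provider " + name);
                    ProviderBean bean=(ProviderBean)applicationContext.getBean(name);
                    bean.destroy();
                    logger.info("Closed JSF provider " + name + " successfully");
                } catch (BeanCreationNotAllowedException re){
                    logger.error("JSF provider " + name + " was not initialized, ignoring");
                } catch (Exception e) {
                    logger.error("Failed to close JSF provider", e);
                }
            }
        }
        logger.info("All JSF providers have been closed");
    }
    @Override
    public void setApplicationContext(ApplicationContext applicationContext) throws BeansException {
        this.applicationContext = applicationContext;
        // Make sure the context's own JVM shutdown hook is registered
        ((AbstractApplicationContext)applicationContext).registerShutdownHook();
    }
}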
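The class above always sleeps for the full waitTime, even when nothing is left to process. If the application can report how much work is still in flight, the wait can end early. The sketch below shows the idea in plain Java; EarlyExitWait and the in-flight supplier are assumptions for illustration, since neither jsf nor fmq is known here to expose such a counter.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.IntSupplier;

final class EarlyExitWait {
    private EarlyExitWait() {
    }

    /**
     * Polls inFlight every pollMillis until it reports 0 or timeoutMillis elapses.
     * Returns true if everything drained before the deadline.
     */
    static boolean awaitDrained(IntSupplier inFlight, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        while (inFlight.getAsInt() > 0) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // still busy when the grace period ran out
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the flag and give up waiting
                return false;
            }
        }
        return true;
    }
}
```

The configured waitTime then acts as an upper bound rather than a fixed delay, which shortens releases when traffic is low.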
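• The absfactory-base-custcenter application could not print logs during graceful shutdown. Local debugging showed the cause: graceful shutdown destroyed the logback logging thread first, so the countdown logs were never printed. The fix was the following context parameter: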
<!-- fix: logback being destroyed first when the application shuts down -->
<context-param>
    <param-name>logbackDisableServletContainerInitializer</param-name>
    <param-value>true</param-value>
</context-param>
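9. Summary
Spring Boot's embedded Tomcat can achieve graceful shutdown through configuration parameters alone. Our business systems, however, mix several technologies, so for applications built on plain Tomcat and SpringMVC it really does take time to study the underlying mechanisms and write the supporting classes before you get the same effect that Spring Boot manages through configuration.
Author: 宋慧超, 京東科技 (JD Technology)
Source: 京東雲開發者社區 (JD Cloud Developer Community)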