Zabbix "wait for 15 seconds": causes and tuning recommendations

When monitoring devices, the server log sometimes shows messages such as "another network error, wait for 15 seconds". This post looks at why the message appears and what can be done about it.

The message is produced in poller.c. Consider the following two pieces of code.

First, part of get_values:

    for (i = 0; i < num; i++)
    {
        switch (errcodes[i])
        {
            case SUCCEED:
            case NOTSUPPORTED:
            case AGENT_ERROR:
                if (HOST_AVAILABLE_TRUE != last_available)
                {
                    zbx_activate_item_host(&items[i], &timespec);
                    last_available = HOST_AVAILABLE_TRUE;
                }
                break;
            case NETWORK_ERROR:
            case GATEWAY_ERROR:
            case TIMEOUT_ERROR:
                if (HOST_AVAILABLE_FALSE != last_available)
                {
                    zbx_deactivate_item_host(&items[i], &timespec, results[i].msg);
                    last_available = HOST_AVAILABLE_FALSE;
                }
                break;
            case CONFIG_ERROR:
                /* nothing to do */
                break;
            default:
                zbx_error("unknown response code returned: %d", errcodes[i]);
                THIS_SHOULD_NEVER_HAPPEN;
        }

And here is zbx_deactivate_item_host:

void    zbx_deactivate_item_host(DC_ITEM *item, zbx_timespec_t *ts, const char *error)                       //   #0
{
    const char        *__function_name = "zbx_deactivate_item_host";
    zbx_host_availability_t    in, out;                                                                         //   #1
    unsigned char        agent_type;                                                                          //   #2

    zabbix_log(LOG_LEVEL_DEBUG, "In %s() hostid:" ZBX_FS_UI64 " itemid:" ZBX_FS_UI64 " type:%d",             //   #3
            __function_name, item->host.hostid, item->itemid, (int)item->type);

    zbx_host_availability_init(&in, item->host.hostid);                                                      //   #4
    zbx_host_availability_init(&out,item->host.hostid);                                                      //   #5

    if (ZBX_AGENT_UNKNOWN == (agent_type = host_availability_agent_by_item_type(item->type)))                //   #6
        goto out;

    if (FAIL == host_get_availability(&item->host, agent_type, &in))                                         //   #7
        goto out;

    if (FAIL == DChost_deactivate(item->host.hostid, agent_type, ts, &in.agents[agent_type],                 //   #8
            &out.agents[agent_type], error))
    {
        goto out;
    }

    if (FAIL == db_host_update_availability(&out))                                                           //   #9
        goto out;

    host_set_availability(&item->host, agent_type, &out);                                                    //   #10

    if (0 == in.agents[agent_type].errors_from)                                                              //   #11
    {
        zabbix_log(LOG_LEVEL_WARNING, "%s item \"%s\" on host \"%s\" failed:"                                //   #12
                " first network error, wait for %d seconds",
                zbx_agent_type_string(item->type), item->key_orig, item->host.host,
                out.agents[agent_type].disable_until - ts->sec);
    }
    else
    {
        if (HOST_AVAILABLE_FALSE != in.agents[agent_type].available)                                         //   #13
        {
            if (HOST_AVAILABLE_FALSE != out.agents[agent_type].available)                                    //   #14
            {
                zabbix_log(LOG_LEVEL_WARNING, "%s item \"%s\" on host \"%s\" failed:"                        //   #15
                        " another network error, wait for %d seconds",
                        zbx_agent_type_string(item->type), item->key_orig, item->host.host,
                        out.agents[agent_type].disable_until - ts->sec);
            }
            else
            {
                zabbix_log(LOG_LEVEL_WARNING, "temporarily disabling %s checks on host \"%s\":"              //   #16
                        " host unavailable",
                        zbx_agent_type_string(item->type), item->host.host);
            }
        }
    }

    zabbix_log(LOG_LEVEL_DEBUG, "%s() errors_from:%d available:%d", __function_name,
            out.agents[agent_type].errors_from, out.agents[agent_type].available);
out:
    zbx_host_availability_clean(&out);
    zbx_host_availability_clean(&in);

    zabbix_log(LOG_LEVEL_DEBUG, "End of %s()", __function_name);
}

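Let's walk through the logic of zbx_deactivate_item_host, following the #N markers in the code above:

#0 zbx_deactivate_item_host takes three arguments

        1 A pointer to a DC_ITEM struct, which describes the item together with its host and interface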
            //dbcache.h 
            typedef struct
            {
                DC_HOST            host;
                DC_INTERFACE        interface;
                zbx_uint64_t        itemid;
                zbx_uint64_t        lastlogsize;
                zbx_uint64_t        valuemapid;
                unsigned char        type;
                unsigned char        value_type;
                unsigned char        state;
                unsigned char        snmpv3_securitylevel;
                unsigned char        authtype;
                unsigned char        flags;
                unsigned char        snmpv3_authprotocol;
                unsigned char        snmpv3_privprotocol;
                unsigned char        inventory_link;
                unsigned char        status;
                unsigned char        history;
                unsigned char        trends;
                unsigned char        follow_redirects;
                unsigned char        post_type;
                unsigned char        retrieve_mode;
                unsigned char        request_method;
                unsigned char        output_format;
                unsigned char        verify_peer;
                unsigned char        verify_host;
                unsigned char        allow_traps;
                char            key_orig[ITEM_KEY_LEN * ZBX_MAX_BYTES_IN_UTF8_CHAR + 1], *key;
                char            *units;
                char            *delay;
                int            history_sec;
                int            nextcheck;
                int            lastclock;
                int            mtime;
                char            trapper_hosts[ITEM_TRAPPER_HOSTS_LEN_MAX];
                char            logtimefmt[ITEM_LOGTIMEFMT_LEN_MAX];
                char            snmp_community_orig[ITEM_SNMP_COMMUNITY_LEN_MAX], *snmp_community;
                char            snmp_oid_orig[ITEM_SNMP_OID_LEN_MAX], *snmp_oid;
                char            snmpv3_securityname_orig[ITEM_SNMPV3_SECURITYNAME_LEN_MAX], *snmpv3_securityname;
                char            snmpv3_authpassphrase_orig[ITEM_SNMPV3_AUTHPASSPHRASE_LEN_MAX], *snmpv3_authpassphrase;
                char            snmpv3_privpassphrase_orig[ITEM_SNMPV3_PRIVPASSPHRASE_LEN_MAX], *snmpv3_privpassphrase;
                char            ipmi_sensor[ITEM_IPMI_SENSOR_LEN_MAX];
                char            *params;
                char            username_orig[ITEM_USERNAME_LEN_MAX], *username;
                char            publickey_orig[ITEM_PUBLICKEY_LEN_MAX], *publickey;
                char            privatekey_orig[ITEM_PRIVATEKEY_LEN_MAX], *privatekey;
                char            password_orig[ITEM_PASSWORD_LEN_MAX], *password;
                char            snmpv3_contextname_orig[ITEM_SNMPV3_CONTEXTNAME_LEN_MAX], *snmpv3_contextname;
                char            jmx_endpoint_orig[ITEM_JMX_ENDPOINT_LEN_MAX], *jmx_endpoint;
                char            timeout_orig[ITEM_TIMEOUT_LEN_MAX], *timeout;
                char            url_orig[ITEM_URL_LEN_MAX], *url;
                char            query_fields_orig[ITEM_QUERY_FIELDS_LEN_MAX], *query_fields;
                char            *posts;
                char            status_codes_orig[ITEM_STATUS_CODES_LEN_MAX], *status_codes;
                char            http_proxy_orig[ITEM_HTTP_PROXY_LEN_MAX], *http_proxy;
                char            *headers;
                char            ssl_cert_file_orig[ITEM_SSL_CERT_FILE_LEN_MAX], *ssl_cert_file;
                char            ssl_key_file_orig[ITEM_SSL_KEY_FILE_LEN_MAX], *ssl_key_file;
                char            ssl_key_password_orig[ITEM_SSL_KEY_PASSWORD_LEN_MAX], *ssl_key_password;
                char            *error;
            }
            DC_ITEM;
        2 A pointer to a zbx_timespec_t struct, the timestamp of the check
            //common.h
            typedef struct
            {
                int    sec;    /* seconds */
                int    ns;    /* nanoseconds */
            }
            zbx_timespec_t;
            
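        3 The error message

#1 Declares two zbx_host_availability_t structs, in and out. Each contains an array with one availability record per agent type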

        //db.h
        typedef struct
        {
            /* flags specifying which fields are set, see ZBX_FLAGS_AGENT_STATUS_* defines */
            unsigned char    flags;

            /* agent availability fields */
            unsigned char    available;
            char        *error;
            int        errors_from;
            int        disable_until;
        }
        zbx_agent_availability_t;

        typedef struct
        {
            zbx_uint64_t            hostid;

            zbx_agent_availability_t    agents[ZBX_AGENT_MAX];         // ZBX_AGENT_MAX is 4: one entry each for ZABBIX, SNMP, IPMI and JMX
        }
        zbx_host_availability_t;

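#2 Declares unsigned char agent_type. A signed char covers -128 to 127 while an unsigned char covers 0 to 255 (whether plain char is signed depends on the compiler). The value 255 shows up later, which is why the 0 to 255 range is needed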
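
To make the point about the value range concrete, here is a minimal sketch. AGENT_UNKNOWN and the two helper functions are hypothetical names used only for illustration; the only thing taken from Zabbix is the value 255. Converting 255 to signed char is implementation-defined, and on the usual two's-complement platforms it becomes -1.

```c
/* Hypothetical stand-in for ZBX_AGENT_UNKNOWN; only the value 255 comes from Zabbix. */
#define AGENT_UNKNOWN 255

/* An unsigned char holds 0..255, so the marker survives being stored. */
int unknown_survives_unsigned(void)
{
    unsigned char agent_type = AGENT_UNKNOWN;

    return AGENT_UNKNOWN == agent_type;
}

/* A signed char holds -128..127, so 255 does not fit. The stored value
 * (typically -1) no longer compares equal to the int constant 255. */
int unknown_survives_signed(void)
{
    signed char agent_type = (signed char)AGENT_UNKNOWN;

    return AGENT_UNKNOWN == agent_type;
}
```

With an unsigned char the comparison against the marker still holds, which is what the check at #6 relies on.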

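#3 Writes a DEBUG-level log entry. To see it you would have to raise DebugLevel in the server configuration file to 4 (debugging) or higher, though I do not recommend doing that

#4 Initializes the in availability record for the host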

    //dbconfig.c
    void    zbx_host_availability_init(zbx_host_availability_t *availability, zbx_uint64_t hostid)
    {
        memset(availability, 0, sizeof(zbx_host_availability_t));
        availability->hostid = hostid;
    }

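#5 Same as #4, but for out

#6 Assigns agent_type. If the item type is not one of the four agent types listed in #1, jumps to out

    1. host_availability_agent_by_item_type lives in poller.c. It takes the item's type field and maps it to an agent type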
    //poller.c
        static unsigned char    host_availability_agent_by_item_type(unsigned char type)
        {
            switch (type)
            {
                case ITEM_TYPE_ZABBIX:
                    return ZBX_AGENT_ZABBIX;
                    break;
                case ITEM_TYPE_SNMPv1:
                case ITEM_TYPE_SNMPv2c:
                case ITEM_TYPE_SNMPv3:
                    return ZBX_AGENT_SNMP;
                    break;
                case ITEM_TYPE_IPMI:
                    return ZBX_AGENT_IPMI;
                    break;
                case ITEM_TYPE_JMX:
                    return ZBX_AGENT_JMX;
                    break;
                default:
                    return ZBX_AGENT_UNKNOWN;
            }
        }
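    2. The constant ZBX_AGENT_UNKNOWN is 255, which is the value mentioned in #2

#7 Reads the host's current availability for this agent_type into in. Network devices match ZBX_AGENT_SNMP; the four fields are annotated in that branch below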

    //poller.c
    static int    host_get_availability(const DC_HOST *dc_host, unsigned char agent, zbx_host_availability_t *ha)
    {
        zbx_agent_availability_t    *availability = &ha->agents[agent];

        availability->flags = ZBX_FLAGS_AGENT_STATUS;

        switch (agent)
        {
            case ZBX_AGENT_ZABBIX:
                availability->available = dc_host->available;
                availability->error = zbx_strdup(NULL, dc_host->error);
                availability->errors_from = dc_host->errors_from;
                availability->disable_until = dc_host->disable_until;
                break;
            case ZBX_AGENT_SNMP:
                availability->available = dc_host->snmp_available;  // SNMP availability state of the host
                availability->error = zbx_strdup(NULL, dc_host->snmp_error);  // error message
                availability->errors_from = dc_host->snmp_errors_from;      // time of the first error
                availability->disable_until = dc_host->snmp_disable_until;  // time until which checks are postponed
                break;
            case ZBX_AGENT_IPMI:
                availability->available = dc_host->ipmi_available;
                availability->error = zbx_strdup(NULL, dc_host->ipmi_error);
                availability->errors_from = dc_host->ipmi_errors_from;
                availability->disable_until = dc_host->ipmi_disable_until;
                break;
            case ZBX_AGENT_JMX:
                availability->available = dc_host->jmx_available;
                availability->error = zbx_strdup(NULL, dc_host->jmx_error);
                availability->disable_until = dc_host->jmx_disable_until;
                availability->errors_from = dc_host->jmx_errors_from;
                break;
            default:
                return FAIL;
        }

        ha->hostid = dc_host->hostid;

        return SUCCEED;
    }

    //dbcache.h
    typedef struct
    {
        zbx_uint64_t    hostid;
        zbx_uint64_t    proxy_hostid;
        char        host[HOST_HOST_LEN_MAX];
        char        name[HOST_NAME_LEN * ZBX_MAX_BYTES_IN_UTF8_CHAR + 1];
        unsigned char    maintenance_status;
        unsigned char    maintenance_type;
        int        maintenance_from;
        int        errors_from;
        unsigned char    available;
        int        disable_until;
        int        snmp_errors_from;
        unsigned char    snmp_available;
        int        snmp_disable_until;
        int        ipmi_errors_from;
        unsigned char    ipmi_available;
        int        ipmi_disable_until;
        signed char    ipmi_authtype;
        unsigned char    ipmi_privilege;
        char        ipmi_username[HOST_IPMI_USERNAME_LEN_MAX];
        char        ipmi_password[HOST_IPMI_PASSWORD_LEN_MAX];
        int        jmx_errors_from;
        unsigned char    jmx_available;
        int        jmx_disable_until;
        char        inventory_mode;
        unsigned char    status;
        unsigned char    tls_connect;
        unsigned char    tls_accept;
    #if defined(HAVE_POLARSSL) || defined(HAVE_GNUTLS) || defined(HAVE_OPENSSL)
        char        tls_issuer[HOST_TLS_ISSUER_LEN_MAX];
        char        tls_subject[HOST_TLS_SUBJECT_LEN_MAX];
        char        tls_psk_identity[HOST_TLS_PSK_IDENTITY_LEN_MAX];
        char        tls_psk[HOST_TLS_PSK_LEN_MAX];
    #endif
        char        error[HOST_ERROR_LEN_MAX];
        char        snmp_error[HOST_ERROR_LEN_MAX];
        char        ipmi_error[HOST_ERROR_LEN_MAX];
        char        jmx_error[HOST_ERROR_LEN_MAX];
    }
    DC_HOST;
    
    //db.h
    #define ZBX_FLAGS_AGENT_STATUS_AVAILABLE    0x00000001
    #define ZBX_FLAGS_AGENT_STATUS_ERROR        0x00000002
    #define ZBX_FLAGS_AGENT_STATUS_ERRORS_FROM    0x00000004
    #define ZBX_FLAGS_AGENT_STATUS_DISABLE_UNTIL    0x00000008
    #define ZBX_FLAGS_AGENT_STATUS        (ZBX_FLAGS_AGENT_STATUS_AVAILABLE |    \
                        ZBX_FLAGS_AGENT_STATUS_ERROR |        \
                        ZBX_FLAGS_AGENT_STATUS_ERRORS_FROM |    \
                        ZBX_FLAGS_AGENT_STATUS_DISABLE_UNTIL) 

     
    //common.h
    #define    FAIL        -1

#8 Updates the host state in the configuration cache according to agent_type

    //dbconfig.c
    int    DChost_deactivate(zbx_uint64_t hostid, unsigned char agent_type, const zbx_timespec_t *ts,
            zbx_agent_availability_t *in, zbx_agent_availability_t *out, const char *error_msg)
    {
        int        ret = FAIL, errors_from,disable_until;
        const char    *error;
        unsigned char    available;
        ZBX_DC_HOST    *dc_host;


        /* don't try deactivating host if the unreachable delay has not passed since the first error */
        if (CONFIG_UNREACHABLE_DELAY > ts->sec - in->errors_from) 
            goto out;

        WRLOCK_CACHE;

        if (NULL == (dc_host = (ZBX_DC_HOST *)zbx_hashset_search(&config->hosts, &hostid)))
            goto unlock;

        /* Don't try deactivating host if:                */
        /* - (server, proxy) it's not monitored any more; */
        /* - (server) it's monitored by proxy.            */
        if ((0 != (program_type & ZBX_PROGRAM_TYPE_SERVER) && 0 != dc_host->proxy_hostid) ||
                HOST_STATUS_MONITORED != dc_host->status)
        {
            goto unlock;
        }

        DChost_get_agent_availability(dc_host, agent_type, in);

        available = in->available;
        error = in->error;

        if (0 == in->errors_from)
        {
            /* first error, schedule next unreachable check */
            errors_from = ts->sec;
            disable_until = ts->sec + CONFIG_UNREACHABLE_DELAY;
        }
        else
        {
            errors_from = in->errors_from;
            disable_until = in->disable_until;

            /* Check if other pollers haven't already attempted deactivating host. */
            /* In that case should wait the initial unreachable delay before       */
            /* trying to make it unavailable.                                      */
            if (CONFIG_UNREACHABLE_DELAY <= ts->sec - errors_from)
            {
                /* repeating error */
                if (CONFIG_UNREACHABLE_PERIOD > ts->sec - errors_from)
                {
                    /* leave host available, schedule next unreachable check */
                    disable_until = ts->sec + CONFIG_UNREACHABLE_DELAY;
                }
                else
                {
                    /* make host unavailable, schedule next unavailable check */
                    disable_until = ts->sec + CONFIG_UNAVAILABLE_DELAY;
                    available = HOST_AVAILABLE_FALSE;
                    error = error_msg;
                }
            }
        }

        zbx_agent_availability_init(out, available, error, errors_from, disable_until);
        DChost_set_agent_availability(dc_host, ts->sec, agent_type, out);

        if (ZBX_FLAGS_AGENT_STATUS_NONE != out->flags)
            ret = SUCCEED;
    unlock:
        UNLOCK_CACHE;
    out:
        return ret;
    }

The part that matters most is this:

if (0 == in->errors_from)
        {
            /* first error, schedule next unreachable check */
            errors_from = ts->sec;
            disable_until = ts->sec + CONFIG_UNREACHABLE_DELAY;
        }
        else
        {
            errors_from = in->errors_from;
            disable_until = in->disable_until;

            /* Check if other pollers haven't already attempted deactivating host. */
            /* In that case should wait the initial unreachable delay before       */
            /* trying to make it unavailable.                                      */
            if (CONFIG_UNREACHABLE_DELAY <= ts->sec - errors_from)
            {
                /* repeating error */
                if (CONFIG_UNREACHABLE_PERIOD > ts->sec - errors_from)
                {
                    /* leave host available, schedule next unreachable check */
                    disable_until = ts->sec + CONFIG_UNREACHABLE_DELAY;
                }
                else
                {
                    /* make host unavailable, schedule next unavailable check */
                    disable_until = ts->sec + CONFIG_UNAVAILABLE_DELAY;
                    available = HOST_AVAILABLE_FALSE;
                    error = error_msg;
                }
            }
        }

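        If this is the first error:
            error start time = timestamp of the check
            next check time = timestamp + 15s (UnreachableDelay)
        Otherwise:
            error start time = in->errors_from
            next check time = in->disable_until

            if timestamp - error start time >= 15s (UnreachableDelay):
                if timestamp - error start time < 45s (UnreachablePeriod):
                    next check time = timestamp + 15s (UnreachableDelay)
                otherwise:
                    next check time = timestamp + 60s (UnavailableDelay)
                    host availability = unavailable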

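In terms of the configuration file: when an item cannot be polled because of a network problem or similar, the first retry is scheduled UnreachableDelay (15s) later. If that retry also fails and less than UnreachablePeriod (45s) has passed since the first failure, the host is retried again after another UnreachableDelay. Once the failures have lasted UnreachablePeriod or longer, the host is marked unavailable and is retried every UnavailableDelay (60s).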
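
The scheduling rule can be tried out in isolation with a small sketch. This is not the Zabbix code: avail_t and on_failed_check are hypothetical names, the struct is cut down to three fields, and the three constants are simply the default values of UnreachableDelay, UnreachablePeriod and UnavailableDelay.

```c
/* Default values of the zabbix_server.conf parameters, in seconds. */
#define UNREACHABLE_DELAY   15
#define UNREACHABLE_PERIOD  45
#define UNAVAILABLE_DELAY   60

/* Hypothetical, simplified availability record for one agent type. */
typedef struct
{
    int available;      /* 1 = still treated as available, 0 = unavailable */
    int errors_from;    /* time of the first error, 0 = no error so far */
    int disable_until;  /* the host is not polled before this time */
}
avail_t;

/* Apply one failed check at time 'now' and return the updated record. */
avail_t on_failed_check(avail_t in, int now)
{
    avail_t out = in;

    if (0 == in.errors_from)
    {
        /* first error: remember when it happened, retry after UnreachableDelay */
        out.errors_from = now;
        out.disable_until = now + UNREACHABLE_DELAY;
    }
    else if (UNREACHABLE_DELAY <= now - in.errors_from)
    {
        if (UNREACHABLE_PERIOD > now - in.errors_from)
        {
            /* repeated error inside UnreachablePeriod: keep the host available */
            out.disable_until = now + UNREACHABLE_DELAY;
        }
        else
        {
            /* errors lasted UnreachablePeriod or longer: mark the host unavailable */
            out.disable_until = now + UNAVAILABLE_DELAY;
            out.available = 0;
        }
    }
    /* otherwise another poller reported the first error less than
     * UnreachableDelay ago, so the record is left unchanged */

    return out;
}
```

Feeding it failures at 1000, 1015, 1030 and 1045 gives retry times of 1015, 1030 and 1045, and the fourth failure marks the host unavailable with the next check at 1105.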

#9 Updates the host availability in the database

    // poller.c
    static int    db_host_update_availability(const zbx_host_availability_t *ha)
    {
        char    *sql = NULL;
        size_t    sql_alloc = 0, sql_offset = 0;

        if (SUCCEED == zbx_sql_add_host_availability(&sql, &sql_alloc, &sql_offset, ha))
        {
            DBbegin();
            DBexecute("%s", sql);
            DBcommit();

            zbx_free(sql);

            return SUCCEED;
        }

        return FAIL;
    }

#10 Copies the new availability from out back into the item's host data, according to agent_type

    //poller.c
    static int    host_set_availability(DC_HOST *dc_host, unsigned char agent, const zbx_host_availability_t *ha)
    {
        const zbx_agent_availability_t    *availability = &ha->agents[agent];
        unsigned char            *pavailable;
        int                *perrors_from, *pdisable_until;
        char                *perror;

        switch (agent)
        {
            case ZBX_AGENT_ZABBIX:
                pavailable = &dc_host->available;
                perror = dc_host->error;
                perrors_from = &dc_host->errors_from;
                pdisable_until = &dc_host->disable_until;
                break;
            case ZBX_AGENT_SNMP:
                pavailable = &dc_host->snmp_available;
                perror = dc_host->snmp_error;
                perrors_from = &dc_host->snmp_errors_from;
                pdisable_until = &dc_host->snmp_disable_until;
                break;
            case ZBX_AGENT_IPMI:
                pavailable = &dc_host->ipmi_available;
                perror = dc_host->ipmi_error;
                perrors_from = &dc_host->ipmi_errors_from;
                pdisable_until = &dc_host->ipmi_disable_until;
                break;
            case ZBX_AGENT_JMX:
                pavailable = &dc_host->jmx_available;
                perror = dc_host->jmx_error;
                pdisable_until = &dc_host->jmx_disable_until;
                perrors_from = &dc_host->jmx_errors_from;
                break;
            default:
                return FAIL;
        }

        if (0 != (availability->flags & ZBX_FLAGS_AGENT_STATUS_AVAILABLE))
            *pavailable = availability->available;

        if (0 != (availability->flags & ZBX_FLAGS_AGENT_STATUS_ERROR))
            zbx_strlcpy(perror, availability->error, HOST_ERROR_LEN_MAX);

        if (0 != (availability->flags & ZBX_FLAGS_AGENT_STATUS_ERRORS_FROM))
            *perrors_from = availability->errors_from;

        if (0 != (availability->flags & ZBX_FLAGS_AGENT_STATUS_DISABLE_UNTIL))
            *pdisable_until = availability->disable_until;

        return SUCCEED;
    }
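
The flag checks at the end of host_set_availability mean that only the fields marked in flags are copied. A stripped-down sketch of that pattern follows; update_t, host_state_t and apply_update are hypothetical names, and the error string is left out to keep it short. The bit values match the ZBX_FLAGS_AGENT_STATUS_* defines shown in #7.

```c
/* Same bit values as the ZBX_FLAGS_AGENT_STATUS_* defines in db.h. */
#define FLAG_AVAILABLE      0x01
#define FLAG_ERRORS_FROM    0x04
#define FLAG_DISABLE_UNTIL  0x08

/* Hypothetical update record: 'flags' says which fields carry new values. */
typedef struct
{
    unsigned char flags;
    unsigned char available;
    int           errors_from;
    int           disable_until;
}
update_t;

/* Hypothetical host state that receives the update. */
typedef struct
{
    unsigned char available;
    int           errors_from;
    int           disable_until;
}
host_state_t;

/* Copy only the fields whose flag bit is set; leave the rest untouched. */
void apply_update(host_state_t *host, const update_t *upd)
{
    if (0 != (upd->flags & FLAG_AVAILABLE))
        host->available = upd->available;

    if (0 != (upd->flags & FLAG_ERRORS_FROM))
        host->errors_from = upd->errors_from;

    if (0 != (upd->flags & FLAG_DISABLE_UNTIL))
        host->disable_until = upd->disable_until;
}
```

An update that sets only FLAG_DISABLE_UNTIL changes the next check time and leaves available and errors_from as they were.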

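#11-16
    If this is the first error (in errors_from is 0):
        log "first network error, wait for 15 seconds"
    Otherwise, if the host was not already unavailable before this check:
        if the host is still not unavailable after #8:
            log "another network error, wait for 15 seconds"
        otherwise:
            log "temporarily disabling ... host unavailable" (this is when the green icon in the frontend turns red)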
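
The choice of log message can be written as one small function. This is a sketch with hypothetical names (avail_state_t, log_kind_t, pick_log_message); it mirrors the nested ifs at #11 to #16, including the case where nothing is logged because the host was already unavailable.

```c
/* Hypothetical availability states, standing in for HOST_AVAILABLE_*. */
typedef enum { AVAIL_UNKNOWN, AVAIL_TRUE, AVAIL_FALSE } avail_state_t;

/* Which warning, if any, gets written to the log. */
typedef enum
{
    LOG_NONE,           /* host was already unavailable, nothing is logged */
    LOG_FIRST_ERROR,    /* "first network error, wait for N seconds" */
    LOG_ANOTHER_ERROR,  /* "another network error, wait for N seconds" */
    LOG_TEMP_DISABLED   /* "temporarily disabling ... host unavailable" */
}
log_kind_t;

log_kind_t pick_log_message(int in_errors_from, avail_state_t in_available,
        avail_state_t out_available)
{
    if (0 == in_errors_from)
        return LOG_FIRST_ERROR;

    if (AVAIL_FALSE == in_available)
        return LOG_NONE;

    if (AVAIL_FALSE != out_available)
        return LOG_ANOTHER_ERROR;

    return LOG_TEMP_DISABLED;
}
```

The LOG_NONE case explains why a host that stays down does not keep repeating the "temporarily disabling" line on every check.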

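As the code above shows, the "network error, wait for 15 seconds" messages are logged in three situations, all of which occur while polling: a network error, a gateway error, or a timeout. In short, they appear when the Zabbix server cannot connect to or exchange data with zabbix agentd, or when fetching a value takes longer than the server's Timeout parameter.

The sequence from normal polling to failure handling looks like this: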

1        2                    3                    3                    4                    5
|        |--UnreachableDelay--|--UnreachableDelay--|--UnreachableDelay--|--UnavailableDelay--|
         |<--------------------- UnreachablePeriod -------------------->|

Log messages

1 Values are collected normally
2 An error occurs                 ------------>first network error
3 The error occurs again          ------------>another network error
4 Host is marked unavailable      ------------>temporarily disabling
5 Recovery                        ------------>resuming

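The 15s in the log corresponds to the UnreachableDelay setting in the configuration file, which defaults to 15s; in the source it is CONFIG_UNREACHABLE_DELAY in server.c.
Note that changing this setting does not fix any network error; it only provides the interval used to compute the next check time. You may also have noticed that by default UnreachablePeriod is a multiple of UnreachableDelay (45s = 3 x 15s). Keep that relationship in mind when tuning.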
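
To see what the relationship between the two parameters means in practice, here is a rough calculation. It assumes an idealized model in which every retry happens exactly UnreachableDelay seconds after the previous failure; real pollers will be a little later. The function name is hypothetical.

```c
/* Number of consecutive failed checks, counting the first one, after which
 * the host is marked unavailable. Idealized: retry k happens exactly
 * k * unreachable_delay seconds after the first error.
 * Returns 0 if the arguments are not usable. */
int failed_checks_until_unavailable(int unreachable_delay, int unreachable_period)
{
    int retries;

    if (unreachable_delay <= 0 || unreachable_period < 0)
        return 0;

    /* the host turns unavailable at the first retry whose offset
     * k * delay reaches the period, so k = ceil(period / delay) */
    retries = (unreachable_period + unreachable_delay - 1) / unreachable_delay;

    /* the first error never disables the host, so at least one retry is needed */
    if (0 == retries)
        retries = 1;

    return 1 + retries;
}
```

With the defaults (15 and 45) this gives 4 failed checks: one "first network error", two "another network error", then "temporarily disabling". If UnreachablePeriod were 50 instead, the host would only be disabled at the retry 60s after the first error, which is why it helps to keep the period a multiple of the delay.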

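I have used Zabbix since version 1.8, and in my experience these messages appear mostly for network devices and rarely for servers. That has to do with SNMP running over UDP, but the main causes fall into a few categories:

1. The network is unstable
2. Problems on the device side
3. Pollers are queuing
4. Timeout is exceeded

Of these four, Timeout and pollers are related to each other. How to size pollers on the server is a topic for a later post; for now, let's look at the four cases one by one.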

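Network instability mostly shows up in these situations:

1. The IDC is reached over the public internet, that is, the monitored devices and the server are not in the same IDC. In this case I recommend adding a proxy at the remote site so that the remote devices are all checked over the internal network.
2. A cloud network is used, with the cloud provider's interconnect linking the cloud devices and the IDC. To the user this network is a black box and is practically impossible to troubleshoot. With a major provider the errors will show up in the log occasionally but will not affect day-to-day use.

Problems on the network device side:

1. Device performance. How do you tell that the device is at fault? Enable SNMP debugging on the device and check whether every request gets a reply or an error. If this is the cause, increase the SNMP polling interval.
2. The port that connects the remote side to the server is saturated.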

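Poller queuing:
    The number of pollers is set by StartPollers in the zabbix_server configuration file. poller.c does three things: 1. takes items from the queue; 2. collects the monitoring data for those items; 3. puts the data into the cache.
    Pollers only handle passive checks:
        If the messages appear for servers: either increase the number of pollers or switch the items from passive to active mode.
        If the messages appear for network devices: collect the data with a script instead, or increase the number of pollers.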

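As for Timeout: some readers may suggest raising the server's Timeout to 30s. That is fine when only a few devices are checked, but I do not recommend it with a large number of devices. For checks that take longer than 2s, move them to a script on the agentd side instead.

That is my understanding of, and experience with, the "wait for 15 seconds" log messages in Zabbix. If you found the post helpful, please give it a like. If you spot any mistakes, please leave me a comment. Thanks!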