MySQL 8.0 Reference Manual(讀書筆記59節--Controlling the Query Optimizer(3))

来源:https://www.cnblogs.com/xuliuzai/p/18219320
-Advertisement-
Play Games

【標題】ODBC驅動類問題定位方法 【需求分類】故障分析 【關鍵字】ODBC 【需求描述】由於我們的ODBC介面目前尚不完善,經常會遇見ODBC介面能力不足導致應用功能無法運行的問題,需要定位手段確定底層是哪個介面報錯 【需求原因分析】方便一線資料庫管理員初步分析故障 【分析方法】 ● 由於我們的O ...


3.7 Statement Execution Time Optimizer Hints

The MAX_EXECUTION_TIME hint is permitted only for SELECT statements. It places a limit N (a timeout value in milliseconds) on how long a statement is permitted to execute before the server terminates it:

MAX_EXECUTION_TIME(N)

Example with a timeout of 1 second (1000 milliseconds):

SELECT /*+ MAX_EXECUTION_TIME(1000) */ * FROM t1 INNER JOIN t2 WHERE ...

The MAX_EXECUTION_TIME(N) hint sets a statement execution timeout of N milliseconds. If this option is absent【æbˈsent 缺席的;不存在;不在的;缺少;心不在焉的;出神的;】 or N is 0, the statement timeout established【ɪˈstæblɪʃt 建立(尤指正式關係);設立;確立;創立;使立足;使穩固;】 by the max_execution_time system variable applies.

The MAX_EXECUTION_TIME hint is applicable as follows:

• For statements with multiple SELECT keywords, such as unions or statements with subqueries, MAX_EXECUTION_TIME applies to the entire statement and must appear after the first SELECT.

• It applies to read-only SELECT statements. Statements that are not read only are those that invoke a stored function that modifies data as a side effect.

• It does not apply to SELECT statements in stored programs and is ignored.

3.8 Variable-Setting Hint Syntax

The SET_VAR hint sets the session value of a system variable temporarily【ˈtɛmpəˌrɛrəlɪ 暫時;】 (for the duration of a single statement). Examples:

SELECT /*+ SET_VAR(sort_buffer_size = 16M) */ name FROM people ORDER BY name;
INSERT /*+ SET_VAR(foreign_key_checks=OFF) */ INTO t2 VALUES(2);
SELECT /*+ SET_VAR(optimizer_switch = 'mrr_cost_based=off') */ 1;

Syntax of the SET_VAR hint:

SET_VAR(var_name = value)

var_name names a system variable that has a session value (although not all such variables can be named, as explained later). value is the value to assign to the variable; the value must be a scalar.

SET_VAR makes a temporary variable change, as demonstrated by these statements:

mysql> SELECT @@unique_checks;
+-----------------+
| @@unique_checks |
+-----------------+
| 1               |
+-----------------+
mysql> SELECT /*+ SET_VAR(unique_checks=OFF) */ @@unique_checks;
+-----------------+
| @@unique_checks |
+-----------------+
| 0               |
+-----------------+
mysql> SELECT @@unique_checks;
+-----------------+
| @@unique_checks |
+-----------------+
| 1               |
+-----------------+

 With SET_VAR, there is no need to save and restore the variable value. This enables you to replace multiple statements by a single statement. Consider this sequence of statements:

SET @saved_val = @@SESSION.var_name;
SET @@SESSION.var_name = value;
SELECT ...
SET @@SESSION.var_name = @saved_val;

The sequence can be replaced by this single statement:

SELECT /*+ SET_VAR(var_name = value) ...

Standalone【單獨的,獨立的;】 SET statements permit any of these syntaxes for naming session variables:

SET SESSION var_name = value;
SET @@SESSION.var_name = value;
SET @@.var_name = value;

Because the SET_VAR hint applies only to session variables, session scope is implicit, and SESSION, @@SESSION., and @@ are neither needed nor permitted. Including explicit session-indicator syntax results in the SET_VAR hint being ignored with a warning.

Not all session variables are permitted for use with SET_VAR. Individual system variable descriptions indicate whether each variable is hintable.descriptions indicate whether each variable is hintable; see Section 5.1.8, “Server System Variables”. You can also check a system variable at runtime by attempting to use it with SET_VAR. If the variable is not hintable, a warning occurs:

mysql> SELECT /*+ SET_VAR(collation_server = 'utf8mb4') */ 1;
+---+
| 1 |
+---+
| 1 |
+---+
1 row in set, 1 warning (0.00 sec)
mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
 Level: Warning
 Code: 4537
Message: Variable 'collation_server' cannot be set using SET_VAR hint.

SET_VAR syntax permits setting only a single variable, but multiple hints can be given to set multiple variables:

SELECT /*+ SET_VAR(optimizer_switch = 'mrr_cost_based=off')
 SET_VAR(max_heap_table_size = 1G) */ 1;

If several hints with the same variable name appear in the same statement, the first one is applied and the others are ignored with a warning:

SELECT /*+ SET_VAR(max_heap_table_size = 1G)
 SET_VAR(max_heap_table_size = 3G) */ 1;

In this case, the second hint is ignored with a warning that it is conflicting.

A SET_VAR hint is ignored with a warning if no system variable has the specified name or the variable value is incorrect:

SELECT /*+ SET_VAR(max_size = 1G) */ 1;
SELECT /*+ SET_VAR(optimizer_switch = 'mrr_cost_based=yes') */ 1;

For the first statement, there is no max_size variable. For the second statement, mrr_cost_based takes values of on or off, so attempting to set it to yes is incorrect. In each case, the hint is ignored with a warning.

The SET_VAR hint is permitted only at the statement level. If used in a subquery, the hint is ignored with a warning.

Replicas ignore SET_VAR hints in replicated statements to avoid the potential for security issues.

3.9 Resource Group Hint Syntax

The RESOURCE_GROUP optimizer hint is used for resource group management. This hint assigns【əˈsaɪnz 分配(某物);指定;指派;分派,佈置(工作、任務等);派遣;委派;】 the thread that executes a statement to the named resource group temporarily (for the duration of the statement). It requires the RESOURCE_GROUP_ADMIN or RESOURCE_GROUP_USER privilege.

 Examples:

SELECT /*+ RESOURCE_GROUP(USR_default) */ name FROM people ORDER BY name;
INSERT /*+ RESOURCE_GROUP(Batch) */ INTO t2 VALUES(2);

Syntax of the RESOURCE_GROUP hint:

RESOURCE_GROUP(group_name)

group_name indicates the resource group to which the thread should be assigned for the duration of statement execution. If the group is nonexistent, a warning occurs and the hint is ignored.

The RESOURCE_GROUP hint must appear after the initial statement keyword (SELECT, INSERT, REPLACE, UPDATE, or DELETE).

An alternative to RESOURCE_GROUP is the SET RESOURCE GROUP statement, which nontemporarily assigns threads to a resource group.

3.10 Optimizer Hints for Naming Query Blocks

Table-level, index-level, and subquery optimizer hints permit specific query blocks to be named as part of their argument syntax. To create these names, use the QB_NAME hint, which assigns a name to the query block in which it occurs:

QB_NAME(name)

QB_NAME hints can be used to make explicit【ɪkˈsplɪsɪt 明確的;詳述的;直言的, 坦率的;一目瞭然的;】 in a clear way which query blocks other hints apply to. They also permit all non-query block name hints to be specified within a single hint comment for easier understanding of complex statements. Consider the following statement:

SELECT ...
 FROM (SELECT ...
 FROM (SELECT ... FROM ...)) ...

QB_NAME hints assign names to query blocks in the statement:

SELECT /*+ QB_NAME(qb1) */ ...
 FROM (SELECT /*+ QB_NAME(qb2) */ ...
 FROM (SELECT /*+ QB_NAME(qb3) */ ... FROM ...)) ...

Then other hints can use those names to refer to the appropriate query blocks:

SELECT /*+ QB_NAME(qb1) MRR(@qb1 t1) BKA(@qb2) NO_MRR(@qb3t1 idx1, id2) */ ...
 FROM (SELECT /*+ QB_NAME(qb2) */ ...
 FROM (SELECT /*+ QB_NAME(qb3) */ ... FROM ...)) ...

The resulting effect is as follows:

• MRR(@qb1 t1) applies to table t1 in query block qb1.

• BKA(@qb2) applies to query block qb2.

• NO_MRR(@qb3 t1 idx1, id2) applies to indexes idx1 and idx2 in table t1 in query block qb3.

Query block names are identifiers and follow the usual rules about what names are valid and how to quote them. For example, a query block name that contains spaces must be quoted, which can be done using backticks:

SELECT /*+ BKA(@`my hint name`) */ ...
 FROM (SELECT /*+ QB_NAME(`my hint name`) */ ...) ...

If the ANSI_QUOTES SQL mode is enabled, it is also possible to quote query block names within double quotation marks:

SELECT /*+ BKA(@"my hint name") */ ...
 FROM (SELECT /*+ QB_NAME("my hint name") */ ...) ...

4 Index Hints

Index hints give the optimizer information about how to choose indexes during query processing. Index hints, described here, differ from optimizer hints, Index and optimizer hints may be used separately or together.

Index hints apply to SELECT and UPDATE statements. They also work with multi-table DELETE statements, but not with single-table DELETE, as shown later in this section.

Index hints are specified following a table name. (For the general syntax for specifying tables in a SELECT statement) The syntax for referring to an individual table, including index hints, looks like this:

tbl_name [[AS] alias] [index_hint_list]
index_hint_list:
 index_hint [index_hint] ...
index_hint:
 USE {INDEX|KEY}
 [FOR {JOIN|ORDER BY|GROUP BY}] ([index_list])
 | {IGNORE|FORCE} {INDEX|KEY}
 [FOR {JOIN|ORDER BY|GROUP BY}] (index_list)
index_list:
 index_name [, index_name] ...

The USE INDEX (index_list) hint tells MySQL to use only one of the named indexes to find rows in the table. The alternative syntax IGNORE INDEX (index_list) tells MySQL to not use some particular index or indexes. These hints are useful if EXPLAIN shows that MySQL is using the wrong index from the list of possible indexes.

The FORCE INDEX hint acts like USE INDEX (index_list), with the addition that a table scan is assumed to be very expensive. In other words, a table scan is used only if there is no way to use one of the named indexes to find rows in the table.

【As of MySQL 8.0.20, the server supports the index-level optimizer hints JOIN_INDEX, GROUP_INDEX, ORDER_INDEX, and INDEX, which are equivalent to and intended to supersede FORCE INDEX index hints, as well as the NO_JOIN_INDEX, NO_GROUP_INDEX, NO_ORDER_INDEX, and NO_INDEX optimizer hints, which are equivalent to and intended to supersede IGNORE INDEX index hints. Thus, you should expect USE INDEX, FORCE INDEX, and IGNORE INDEX to be deprecated in a future release of MySQL, and at some time thereafter to be removed altogether.

These index-level optimizer hints are supported with both single-table and multitable DELETE statements.】

Each hint requires index names, not column names. To refer to a primary key, use the name PRIMARY. To see the index names for a table, use the SHOW INDEX statement or the Information Schema STATISTICS table.

An index_name value need not be a full index name. It can be an unambiguous【ˌʌnæmˈbɪɡjuəs 明確的;毫不含糊的;無歧義的;意思清楚的;】 prefix of an index name. If a prefix is ambiguous, an error occurs.

Examples:

SELECT * FROM table1 USE INDEX (col1_index,col2_index)
 WHERE col1=1 AND col2=2 AND col3=3;
SELECT * FROM table1 IGNORE INDEX (col3_index)
 WHERE col1=1 AND col2=2 AND col3=3;

The syntax for index hints has the following characteristics:

• It is syntactically valid to omit index_list for USE INDEX, which means “use no indexes.” Omitting index_list for FORCE INDEX or IGNORE INDEX is a syntax error.

• You can specify the scope of an index hint by adding a FOR clause to the hint. This provides more fine-grained control over optimizer selection of an execution plan for various phases of query processing.

To affect only the indexes used when MySQL decides how to find rows in the table and how to process joins, use FOR JOIN. To influence index usage for sorting or grouping rows, use FOR ORDER BY or FOR GROUP BY.

• You can specify multiple index hints:

SELECT * FROM t1 USE INDEX (i1) IGNORE INDEX FOR ORDER BY (i2) ORDER BY a;

It is not an error to name the same index in several hints (even within the same hint):

SELECT * FROM t1 USE INDEX (i1) USE INDEX (i1,i1);

However, it is an error to mix USE INDEX and FORCE INDEX for the same table:

SELECT * FROM t1 USE INDEX FOR JOIN (i1) FORCE INDEX FOR JOIN (i2);

If an index hint includes no FOR clause, the scope of the hint is to apply to all parts of the statement. For example, this hint:

IGNORE INDEX (i1)

is equivalent to this combination of hints:

IGNORE INDEX FOR JOIN (i1)
IGNORE INDEX FOR ORDER BY (i1)
IGNORE INDEX FOR GROUP BY (i1)

In MySQL 5.0, hint scope with no FOR clause was to apply only to row retrieval. To cause the server to use this older behavior when no FOR clause is present, enable the old system variable at server startup. Take care about enabling this variable in a replication setup. With statement-based binary logging, having different modes for the source and replicas might lead to replication errors.

When index hints are processed, they are collected in a single list by type (USE, FORCE, IGNORE) and by scope (FOR JOIN, FOR ORDER BY, FOR GROUP BY). For example:

SELECT * FROM t1
 USE INDEX () IGNORE INDEX (i2) USE INDEX (i1) USE INDEX (i2);

is equivalent to:

SELECT * FROM t1
 USE INDEX (i1,i2) IGNORE INDEX (i2);

The index hints then are applied for each scope【skoʊp (題目、組織、活動等的)範圍;(做或實現某事的)機會,能力;…鏡(觀察儀器);】 in the following order:

1. {USE|FORCE} INDEX is applied if present. (If not, the optimizer-determined set of indexes is used.)

2. IGNORE INDEX is applied over the result of the previous step. For example, the following two queries are equivalent:

SELECT * FROM t1 USE INDEX (i1) IGNORE INDEX (i2) USE INDEX (i2);
SELECT * FROM t1 USE INDEX (i1);

For FULLTEXT searches, index hints work as follows:

• For natural language mode searches, index hints are silently ignored. For example, IGNORE INDEX(i1) is ignored with no warning and the index is still used.

• For boolean mode searches, index hints with FOR ORDER BY or FOR GROUP BY are silently ignored. Index hints with FOR JOIN or no FOR modifier are honored. In contrast to how hints apply for non-FULLTEXT searches, the hint is used for all phases of query execution (finding rows and retrieval, grouping, and ordering). This is true even if the hint is given for a non-FULLTEXT index.

For example, the following two queries are equivalent【ɪˈkwɪvələnt (價值、數量、意義、重要性等)相同的;相等的;】:

SELECT * FROM t
 USE INDEX (index1)
 IGNORE INDEX FOR ORDER BY (index1)
 IGNORE INDEX FOR GROUP BY (index1)
 WHERE ... IN BOOLEAN MODE ... ;
SELECT * FROM t
 USE INDEX (index1)
 WHERE ... IN BOOLEAN MODE ... ;

Index hints work with DELETE statements, but only if you use multi-table DELETE syntax, as shown here:

mysql> EXPLAIN DELETE FROM t1 USE INDEX(col2)
 -> WHERE col1 BETWEEN 1 AND 100 AND COL2 BETWEEN 1 AND 100\G
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that
corresponds to your MySQL server version for the right syntax to use near 'use
index(col2) where col1 between 1 and 100 and col2 between 1 and 100' at line 1
 
mysql> EXPLAIN DELETE t1.* FROM t1 USE INDEX(col2)
 -> WHERE col1 BETWEEN 1 AND 100 AND COL2 BETWEEN 1 AND 100\G
*************************** 1. row ***************************
 id: 1
 select_type: DELETE
 table: t1
 partitions: NULL
 type: range
possible_keys: col2
 key: col2
 key_len: 5
 ref: NULL
 rows: 72
 filtered: 11.11
 Extra: Using where
1 row in set, 1 warning (0.00 sec)

5 The Optimizer Cost Model

To generate execution plans, the optimizer uses a cost model that is based on estimates of the cost of various operations that occur during query execution. The optimizer has a set of compiled-in default “cost constants” available to it to make decisions regarding execution plans.

The optimizer also has a database of cost estimates to use during execution plan construction. These estimates are stored in the server_cost and engine_cost tables in the mysql system database and are configurable at any time. The intent of these tables is to make it possible to easily adjust the cost estimates that the optimizer uses when it attempts to arrive at query execution plans.

5.1 Cost Model General Operation

The configurable optimizer cost model works like this:

• The server reads the cost model tables into memory at startup and uses the in-memory values at runtime. Any non-NULL cost estimate specified in the tables takes precedence over the corresponding compiled-in default cost constant. Any NULL estimate indicates to the optimizer to use the compiled-in default.

• At runtime, the server may re-read the cost tables. This occurs when a storage engine is dynamically loaded or when a FLUSH OPTIMIZER_COSTS statement is executed.

• Cost tables enable server administrators to easily adjust cost estimates by changing entries in the tables. It is also easy to revert to a default by setting an entry's cost to NULL. The optimizer uses the in-memory cost values, so changes to the tables should be followed by FLUSH OPTIMIZER_COSTS to take effect.

• The in-memory cost estimates that are current when a client session begins apply throughout that session until it ends. In particular, if the server re-reads the cost tables, any changed estimates apply only to subsequently started sessions. Existing sessions are unaffected.

• Cost tables are specific to a given server instance. The server does not replicate cost table changes to replicas.

5.2 The Cost Model Database

The optimizer cost model database consists of two tables in the mysql system database that contain cost estimate information for operations that occur during query execution:

• server_cost: Optimizer cost estimates for general server operations

• engine_cost: Optimizer cost estimates for operations specific to particular storage engines

The server_cost table contains these columns:

• cost_name

The name of a cost estimate used in the cost model. The name is not case-sensitive. If the server does not recognize the cost name when it reads this table, it writes a warning to the error log.

• cost_value

The cost estimate value. If the value is non-NULL, the server uses it as the cost. Otherwise, it uses the default estimate (the compiled-in value). DBAs can change a cost estimate by updating this column. If the server finds that the cost value is invalid (nonpositive) when it reads this table, it writes a warning to the error log.

To override a default cost estimate (for an entry that specifies NULL), set the cost to a non-NULL value. To revert to the default, set the value to NULL. Then execute FLUSH OPTIMIZER_COSTS to tell the server to re-read the cost tables.

• last_update

The time of the last row update.

• comment

A descriptive【dɪˈskrɪptɪv 描寫的;敘述的;說明的;描寫性的(描述語言的實際應用而非使用規則);】 comment associated with the cost estimate. DBAs can use this column to provide information about why a cost estimate row stores a particular value.

• default_value

The default (compiled-in) value for the cost estimate. This column is a read-only generated column that retains its value even if the associated cost estimate is changed. For rows added to the table at runtime, the value of this column is NULL.

The primary key for the server_cost table is the cost_name column, so it is not possible to create multiple entries for any cost estimate.

The server recognizes these cost_name values for the server_cost table:

• disk_temptable_create_cost, disk_temptable_row_cost

The cost estimates for internally created temporary tables stored in a disk-based storage engine (either InnoDB or MyISAM). Increasing these values increases the cost estimate of using internal temporary tables and makes the optimizer prefer query plans with less use of them.

The larger default values for these disk parameters compared to the default values for the corresponding memory parameters (memory_temptable_create_cost, memory_temptable_row_cost) reflects the greater cost of processing disk-based tables.

• key_compare_cost

The cost of comparing record keys. Increasing this value causes a query plan that compares many keys to become more expensive. For example, a query plan that performs a filesort becomes relatively more expensive compared to a query plan that avoids sorting by using an index.

• memory_temptable_create_cost, memory_temptable_row_cost

The cost estimates for internally created temporary tables stored in the MEMORY storage engine. Increasing these values increases the cost estimate of using internal temporary tables and makes the optimizer prefer query plans with less use of them.

The smaller default values for these memory parameters compared to the default values for the corresponding disk parameters (disk_temptable_create_cost, disk_temptable_row_cost) reflects the lesser cost of processing memory-based tables.

• row_evaluate_cost

The cost of evaluating record conditions. Increasing this value causes a query plan that examines many rows to become more expensive compared to a query plan that examines fewer rows. For example, a table scan becomes relatively more expensive compared to a range scan that reads fewer rows.

The engine_cost table contains these columns:

• engine_name

The name of the storage engine to which this cost estimate applies. The name is not case-sensitive. If the value is default, it applies to all storage engines that have no named entry of their own. If the server does not recognize the engine name when it reads this table, it writes a warning to the error log.

• device_type

The device type to which this cost estimate applies. The column is intended for specifying different cost estimates for different storage device types, such as hard disk drives versus solid state drives. Currently, this information is not used and 0 is the only permitted value.

 • cost_name

Same as in the server_cost table.

• cost_value

Same as in the server_cost table.

• last_update

Same as in the server_cost table.

• comment

Same as in the server_cost table.

• default_value

The default (compiled-in) value for the cost estimate. This column is a read-only generated column that retains its value even if the associated cost estimate is changed. For rows added to the table at runtime, the value of this column is NULL, with the exception that if the row has the same cost_name value as one of the original rows, the default_value column has the same value as that row.

The primary key for the engine_cost table is a tuple comprising the (cost_name, engine_name, device_type) columns, so it is not possible to create multiple entries for any combination of values in those columns.

The server recognizes these cost_name values for the engine_cost table:

• io_block_read_cost

The cost of reading an index or data block from disk. Increasing this value causes a query plan that reads many disk blocks to become more expensive compared to a query plan that reads fewer disk blocks. For example, a table scan becomes relatively more expensive compared to a range scan that reads fewer blocks.

• memory_block_read_cost

Similar to io_block_read_cost, but represents the cost of reading an index or data block from an inmemory database buffer.

If the io_block_read_cost and memory_block_read_cost values differ, the execution plan may change between two runs of the same query. Suppose that the cost for memory access is less than the cost for disk access. In that case, at server startup before data has been read into the buffer pool, you may get a different plan than after the query has been run because then the data is in memory.

5.3 Making Changes to the Cost Model Database

For DBAs who wish to change the cost model parameters from their defaults, try doubling or halving the value and measuring the effect.

Changes to the io_block_read_cost and memory_block_read_cost parameters are most likely to yield worthwhile results. These parameter values enable cost models for data access methods to take into account the costs of reading information from different sources; that is, the cost of reading information from disk versus reading information already in a memory buffer. For example, all other things being equal, setting io_block_read_cost to a value larger than memory_block_read_cost causes the optimizer to prefer query plans that read information already held in memory to plans that must read from disk.

This example shows how to change the default value for io_block_read_cost:

UPDATE mysql.engine_cost
 SET cost_value = 2.0
 WHERE cost_name = 'io_block_read_cost';
FLUSH OPTIMIZER_COSTS;

This example shows how to change the value of io_block_read_cost only for the InnoDB storage engine:

INSERT INTO mysql.engine_cost
 VALUES ('InnoDB', 0, 'io_block_read_cost', 3.0,
 CURRENT_TIMESTAMP, 'Using a slower disk for InnoDB');
FLUSH OPTIMIZER_COSTS;

6 Optimizer Statistics

The column_statistics data dictionary table stores histogram statistics about column values, for use by the optimizer in constructing query execution plans. To perform histogram management, use the ANALYZE TABLE statement.

The column_statistics table has these characteristics:

• The table contains statistics for columns of all data types except geometry types (spatial data) and JSON.

• The table is persistent so that column statistics need not be created each time the server starts.

• The server performs updates to the table; users do not.

The column_statistics table is not directly accessible by users because it is part of the data dictionary. Histogram information is available using INFORMATION_SCHEMA.COLUMN_STATISTICS, which is implemented as a view on the data dictionary table. COLUMN_STATISTICS has these columns:

• SCHEMA_NAME, TABLE_NAME, COLUMN_NAME: The names of the schema, table, and column for which the statistics apply.

• HISTOGRAM: A JSON value describing the column statistics, stored as a histogram.

Column histograms contain buckets for parts of the range of values stored in the column. Histograms are JSON objects to permit flexibility in the representation of column statistics. Here is a sample histogram object:

{
 "buckets": [
 [
 1,
 0.3333333333333333
 ],
 [
 2,
 0.6666666666666666
 ],
 [
 3,
 1
 ]
 ],
 "null-values": 0,
 "last-updated": "2017-03-24 13:32:40.000000",
 "sampling-rate": 1,
 "histogram-type": "singleton",
 "number-of-buckets-specified": 128,
 "data-type": "int",
 "collation-id": 8
}

Histogram objects have these keys:

• buckets: The histogram buckets. Bucket structure depends on the histogram type.

For singleton histograms, buckets contain two values:

  • Value 1: The value for the bucket. The type depends on the column data type.
  • Value 2: A double representing the cumulative frequency for the value. For example, .25 and .75 indicate that 25% and 75% of the values in the column are less than or equal to the bucket value.

For equi-height histograms, buckets contain four values:

  • Values 1, 2: The lower and upper inclusive values for the bucket. The type depends on the column data type.
  • Value 3: A double representing the cumulative frequency for the value. For example, .25 and .75 indicate that 25% and 75% of the values in the column are less than or equal to the bucket upper value.
  • Value 4: The number of distinct values in the range from the bucket lower value to its upper value.

• null-values: A number between 0.0 and 1.0 indicating the fraction of column values that are SQL NULL values. If 0, the column contains no NULL values.

• last-updated: When the histogram was generated, as a UTC value in YYYY-MM-DD hh:mm:ss.uuuuuu format.

• sampling-rate: A number between 0.0 and 1.0 indicating the fraction of data that was sampled to create the histogram. A value of 1 means that all of the data was read (no sampling).

• histogram-type: The histogram type:

  • singleton: One bucket represents one single value in the column. This histogram type is created when the number of distinct values in the column is less than or equal to the number of buckets specified in the ANALYZE TABLE statement that generated the histogram.
  • equi-height: One bucket represents a range of values. This histogram type is created when the number of distinct values in the column is greater than the number of buckets specified in the ANALYZE TABLE statement that generated the histogram.

• number-of-buckets-specified: The number of buckets specified in the ANALYZE TABLE statement that generated the histogram.

• data-type: The type of data this histogram contains. This is needed when reading and parsing histograms from persistent storage into memory. The value is one of int, uint (unsigned integer), double, decimal, datetime, or string (includes character and binary strings).

• collation-id: The collation ID for the histogram data. It is mostly meaningful when the data-type value is string. Values correspond to ID column values in the Information Schema COLLATIONS table.

To extract particular values from the histogram objects, you can use JSON operations. For example:

mysql> SELECT
 TABLE_NAME, COLUMN_NAME,
 HISTOGRAM->>'$."data-type"' AS 'data-type',
 JSON_LENGTH(HISTOGRAM->>'$."buckets"') AS 'bucket-count'
 FROM INFORMATION_SCHEMA.COLUMN_STATISTICS;
+-----------------+-------------+-----------+--------------+
| TABLE_NAME      | COLUMN_NAME | data-type | bucket-count |
+-----------------+-------------+-----------+--------------+
| country         | Population  | int       | 226          |
| city            | Population  | int       | 1024         |
| countrylanguage | Language    | string    | 457          |
+-----------------+-------------+-----------+--------------+

The optimizer uses histogram statistics, if applicable, for columns of any data type for which statistics are collected. The optimizer applies histogram statistics to determine row estimates based on the selectivity (filtering effect) of column value comparisons against constant values. Predicates of these forms qualify for histogram use:

col_name = constant
col_name <> constant
col_name != constant
col_name > constant
col_name < constant
col_name >= constant

              
您的分享是我們最大的動力!

-Advertisement-
Play Games
更多相關文章
  • 很多APP都需要主動向用戶推送消息,這就需要用到長連接的服務,即我們通常提到的websocket,同樣也是使用socket服務,通信協議是基本類似的,在go中用的最多的、也是最簡單的socket服務就是gorilla/websocket,它有21.1K的star,足以說明它的受歡迎程度, 它的git ...
  • 本文測試環境為SQLserver2019 背景 某業務流水錶,會基於固定範圍內的業務編號做寫入以及查詢操作,熱數據的量級在億級別,一個典型的查詢是基於業務編碼查詢最新(時間戳)某種狀態的前N條數據 簡化後的表結構如下 create table TestTable01 ( id bigint iden ...
  • 資料庫引入LLVM之後,可以為具體的查詢生成定製化的機器碼,並儘可能地將數據存儲在CPU的寄存器中進一步加快計算的速度。 ...
  • Redis事務與MySQL事務 不一樣。 原子性:MySQL有Undo Log機制,支持強原子性,和回滾。Redis只能保證事務內指令可以不被干擾的在同一批次執行,且沒有機制保證全部成功則提交,部分失敗則回滾。 隔離性:MySQL的隔離性指多個事務可以併發執行,MySQL有MVCC機制。而Redis ...
  • MySQL死鎖接觸少,但面試又經常被問到怎麼辦? 最近有小伙伴在面試的時候,被問了MySQL死鎖,如何解決? 雖然也回答出來了,但是不夠全面體系化, 所以,小北給大家做一下系統化、體系化的梳理,幫助大家在面試過程中能夠脫穎而出,拿到自己心儀的Offer 插播一條:如果你近期準備面試跳槽,建議在htt ...
  • 隨著大數據技術和人工智慧的發展,企業逐漸意識到構建一個集中化的指標管理平臺的必要性。這樣的平臺旨在解決幾個核心問題:首先,確保所有部門都能通過統一的入口提交指標需求,實現需求的透明化管理;其次,建立完善的指標管理體系,涵蓋從需求定義、模型設計、數據集成、開發實施到應用監控的全鏈條;第三,通過自動化和 ...
  • 如何對MySQL中的資料庫進行優化,如何優化數據表中的數據類型、刪除重覆、冗餘索引;如何設計資料庫,資料庫的範式要求。如何對數據表進行分析、檢查、優化、拆分。 ...
  • 在當今的互聯網時代,隨著用戶數量和請求量的不斷增加,系統的性能和穩定性面臨著巨大的挑戰。限流演算法作為保障系統穩定性的重要手段之一,被廣泛應用於各種服務和應用中。限流的核心目的是對某一時間視窗內的請求數進行限制,保持系統的可用性和穩定性,防止因流量暴增而導致的系統運行緩慢或宕機。 常見限流演算法對比 常 ...
一周排行
    -Advertisement-
    Play Games
  • 移動開發(一):使用.NET MAUI開發第一個安卓APP 對於工作多年的C#程式員來說,近來想嘗試開發一款安卓APP,考慮了很久最終選擇使用.NET MAUI這個微軟官方的框架來嘗試體驗開發安卓APP,畢竟是使用Visual Studio開發工具,使用起來也比較的順手,結合微軟官方的教程進行了安卓 ...
  • 前言 QuestPDF 是一個開源 .NET 庫,用於生成 PDF 文檔。使用了C# Fluent API方式可簡化開發、減少錯誤並提高工作效率。利用它可以輕鬆生成 PDF 報告、發票、導出文件等。 項目介紹 QuestPDF 是一個革命性的開源 .NET 庫,它徹底改變了我們生成 PDF 文檔的方 ...
  • 項目地址 項目後端地址: https://github.com/ZyPLJ/ZYTteeHole 項目前端頁面地址: ZyPLJ/TreeHoleVue (github.com) https://github.com/ZyPLJ/TreeHoleVue 目前項目測試訪問地址: http://tree ...
  • 話不多說,直接開乾 一.下載 1.官方鏈接下載: https://www.microsoft.com/zh-cn/sql-server/sql-server-downloads 2.在下載目錄中找到下麵這個小的安裝包 SQL2022-SSEI-Dev.exe,運行開始下載SQL server; 二. ...
  • 前言 隨著物聯網(IoT)技術的迅猛發展,MQTT(消息隊列遙測傳輸)協議憑藉其輕量級和高效性,已成為眾多物聯網應用的首選通信標準。 MQTTnet 作為一個高性能的 .NET 開源庫,為 .NET 平臺上的 MQTT 客戶端與伺服器開發提供了強大的支持。 本文將全面介紹 MQTTnet 的核心功能 ...
  • Serilog支持多種接收器用於日誌存儲,增強器用於添加屬性,LogContext管理動態屬性,支持多種輸出格式包括純文本、JSON及ExpressionTemplate。還提供了自定義格式化選項,適用於不同需求。 ...
  • 目錄簡介獲取 HTML 文檔解析 HTML 文檔測試參考文章 簡介 動態內容網站使用 JavaScript 腳本動態檢索和渲染數據,爬取信息時需要模擬瀏覽器行為,否則獲取到的源碼基本是空的。 本文使用的爬取步驟如下: 使用 Selenium 獲取渲染後的 HTML 文檔 使用 HtmlAgility ...
  • 1.前言 什麼是熱更新 游戲或者軟體更新時,無需重新下載客戶端進行安裝,而是在應用程式啟動的情況下,在內部進行資源或者代碼更新 Unity目前常用熱更新解決方案 HybridCLR,Xlua,ILRuntime等 Unity目前常用資源管理解決方案 AssetBundles,Addressable, ...
  • 本文章主要是在C# ASP.NET Core Web API框架實現向手機發送驗證碼簡訊功能。這裡我選擇是一個互億無線簡訊驗證碼平臺,其實像阿裡雲,騰訊雲上面也可以。 首先我們先去 互億無線 https://www.ihuyi.com/api/sms.html 去註冊一個賬號 註冊完成賬號後,它會送 ...
  • 通過以下方式可以高效,並保證數據同步的可靠性 1.API設計 使用RESTful設計,確保API端點明確,並使用適當的HTTP方法(如POST用於創建,PUT用於更新)。 設計清晰的請求和響應模型,以確保客戶端能夠理解預期格式。 2.數據驗證 在伺服器端進行嚴格的數據驗證,確保接收到的數據符合預期格 ...