MySQL 8.0 Reference Manual（讀書筆記55節--Optimization and Indexes(2)）

Redis的刪除策略水一篇 Redis是一種記憶體級資料庫，所有數據均存放在記憶體中，記憶體中的數據可以通過TTL指令獲取其狀態 TTL返回的值有三種情況：正數，-1，-2 正數：代表該數據在記憶體中還能存活的時間 -1：永久有效的數據 -2 ：已經過期的數據或被刪除的數據或未定義的數據刪除策略就 ...

9 Comparison of B-Tree and Hash Indexes

Understanding the B-tree and hash data structures can help predict【prɪˈdɪkt 預測;預報;預言;預告;】 how different queries perform on different storage engines that use these data structures in their indexes, particularly for the MEMORY storage engine that lets you choose B-tree or hash indexes.

9.1 B-Tree Index Characteristics

A B-tree index can be used for column comparisons【kəmˈpɛrəsənz 比較;對比;相比;】 in expressions that use the =, >, >=, <, <=, or BETWEEN operators. The index also can be used for LIKE comparisons if the argument to LIKE is a constant string that does not start with a wildcard character. For example, the following SELECT statements use indexes:

SELECT * FROM tbl_name WHERE key_col LIKE 'Patrick%';
SELECT * FROM tbl_name WHERE key_col LIKE 'Pat%_ck%';

In the first statement, only rows with 'Patrick' <= key_col < 'Patricl' are considered. In the second statement, only rows with 'Pat' <= key_col < 'Pau' are considered.

The following SELECT statements do not use indexes:

SELECT * FROM tbl_name WHERE key_col LIKE '%Patrick%';
SELECT * FROM tbl_name WHERE key_col LIKE other_col;

In the first statement, the LIKE value begins with a wildcard character. In the second statement, the LIKE value is not a constant.

If you use ... LIKE '%string%' and string is longer than three characters, MySQL uses the Turbo Boyer-Moore algorithm to initialize the pattern for the string and then uses this pattern to perform the search more quickly.

A search using col_name IS NULL employs indexes if col_name is indexed.

Any index that does not span【spæn 跨度，範圍;跨距，寬度;】 all AND levels in the WHERE clause is not used to optimize the query. In other words, to be able to use an index, a prefix of the index must be used in every AND group.

The following WHERE clauses use indexes:

... WHERE index_part1=1 AND index_part2=2 AND other_column=3
 /* index = 1 OR index = 2 */
... WHERE index=1 OR A=10 AND index=2
 /* optimized like "index_part1='hello'" */
... WHERE index_part1='hello' AND index_part3=5
 /* Can use index on index1 but not on index2 or index3 */
... WHERE index1=1 AND index2=2 OR index1=3 AND index3=3;

These WHERE clauses do not use indexes:

 /* index_part1 is not used */
... WHERE index_part2=1 AND index_part3=2
 /* Index is not used in both parts of the WHERE clause */
... WHERE index=1 OR A=10
 /* No index spans all rows */
... WHERE index_part1=1 OR index_part2=10

Sometimes MySQL does not use an index, even if one is available. One circumstance under which this occurs is when the optimizer estimates that using the index would require MySQL to access a very large percentage of the rows in the table. (In this case, a table scan is likely to be much faster because it requires fewer seeks.) However, if such a query uses LIMIT to retrieve only some of the rows, MySQL uses an index anyway, because it can much more quickly find the few rows to return in the result.

9.2 Hash Index Characteristics

Hash indexes have somewhat different characteristics【ˌkɛrəktəˈrɪstɪks 特征;特點;品質；】 from those just discussed:

• They are used only for equality comparisons that use the = or <=> operators (but are very fast). They are not used for comparison operators such as < that find a range of values. Systems that rely on this type of single-value lookup are known as “key-value stores”; to use MySQL for such applications, use hash indexes wherever possible.

• The optimizer cannot use a hash index to speed up ORDER BY operations. (This type of index cannot be used to search for the next entry in order.)

• MySQL cannot determine approximately【əˈprɑːksɪmətli 大概;大約;約莫;】 how many rows there are between two values (this is used by the range optimizer to decide which index to use). This may affect some queries if you change a MyISAM or InnoDB table to a hash-indexed MEMORY table.

• Only whole keys can be used to search for a row. (With a B-tree index, any leftmost prefix of the key can be used to find rows.)

10 Use of Index Extensions

InnoDB automatically extends each secondary index by appending the primary key columns to it. Consider this table definition:

CREATE TABLE t1 (
 i1 INT NOT NULL DEFAULT 0,
 i2 INT NOT NULL DEFAULT 0,
 d DATE DEFAULT NULL,
 PRIMARY KEY (i1, i2),
 INDEX k_d (d)
) ENGINE = InnoDB;

This table defines the primary key on columns (i1, i2). It also defines a secondary index k_d on column (d), but internally InnoDB extends this index and treats it as columns (d, i1, i2).

The optimizer takes into account the primary key columns of the extended secondary index when determining how and whether to use that index. This can result in more efficient query execution plans and better performance.

The optimizer can use extended secondary indexes for ref, range, and index_merge index access, for Loose Index Scan access, for join and sorting optimization, and for MIN()/MAX() optimization.

The following example shows how execution plans are affected by whether the optimizer uses extended secondary indexes. Suppose that t1 is populated with these rows:

INSERT INTO t1 VALUES
(1, 1, '1998-01-01'), (1, 2, '1999-01-01'),
(1, 3, '2000-01-01'), (1, 4, '2001-01-01'),
(1, 5, '2002-01-01'), (2, 1, '1998-01-01'),
(2, 2, '1999-01-01'), (2, 3, '2000-01-01'),
(2, 4, '2001-01-01'), (2, 5, '2002-01-01'),
(3, 1, '1998-01-01'), (3, 2, '1999-01-01'),
(3, 3, '2000-01-01'), (3, 4, '2001-01-01'),
(3, 5, '2002-01-01'), (4, 1, '1998-01-01'),
(4, 2, '1999-01-01'), (4, 3, '2000-01-01'),
(4, 4, '2001-01-01'), (4, 5, '2002-01-01'),
(5, 1, '1998-01-01'), (5, 2, '1999-01-01'),
(5, 3, '2000-01-01'), (5, 4, '2001-01-01'),
(5, 5, '2002-01-01');

Now consider this query:

EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01'

The execution plan depends on whether the extended index is used.

When the optimizer does not consider index extensions, it treats the index k_d as only (d). EXPLAIN for the query produces this result:

mysql> EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01'\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 type: ref
possible_keys: PRIMARY,k_d
 key: k_d
 key_len: 4
 ref: const
 rows: 5
 Extra: Using where; Using index

When the optimizer takes index extensions into account, it treats k_d as (d, i1, i2). In this case, it can use the leftmost index prefix (d, i1) to produce a better execution plan:

mysql> EXPLAIN SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01'\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 type: ref
possible_keys: PRIMARY,k_d
 key: k_d
 key_len: 8
 ref: const,const
 rows: 1
 Extra: Using index

In both cases, key indicates that the optimizer uses secondary index k_d but the EXPLAIN output shows these improvements from using the extended index:

• key_len goes from 4 bytes to 8 bytes, indicating that key lookups use columns d and i1, not just d.

• The ref value changes from const to const,const because the key lookup uses two key parts, not one.

• The rows count decreases from 5 to 1, indicating that InnoDB should need to examine fewer rows to produce the result.

• The Extra value changes from Using where; Using index to Using index. This means that rows can be read using only the index, without consulting columns in the data row.

Differences in optimizer behavior for use of extended indexes can also be seen with SHOW STATUS:

FLUSH TABLE t1;
FLUSH STATUS;
SELECT COUNT(*) FROM t1 WHERE i1 = 3 AND d = '2000-01-01';
SHOW STATUS LIKE 'handler_read%'

The preceding statements include FLUSH TABLES and FLUSH STATUS to flush the table cache and clear the status counters.

Without index extensions, SHOW STATUS produces this result:

+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 5     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+

With index extensions, SHOW STATUS produces this result. The Handler_read_next value decreases from 5 to 1, indicating more efficient use of the index:

+-----------------------+-------+
| Variable_name         | Value |
+-----------------------+-------+
| Handler_read_first    | 0     |
| Handler_read_key      | 1     |
| Handler_read_last     | 0     |
| Handler_read_next     | 1     |
| Handler_read_prev     | 0     |
| Handler_read_rnd      | 0     |
| Handler_read_rnd_next | 0     |
+-----------------------+-------+

The use_index_extensions flag of the optimizer_switch system variable permits control over whether the optimizer takes the primary key columns into account when determining how to use an InnoDB table's secondary indexes. By default, use_index_extensions is enabled. To check whether disabling use of index extensions can improve performance, use this statement:

SET optimizer_switch = 'use_index_extensions=off';

Use of index extensions by the optimizer is subject to the usual limits on the number of key parts in an index (16) and the maximum key length (3072 bytes).

11 Optimizer Use of Generated Column Indexes

MySQL supports indexes on generated columns. For example:

CREATE TABLE t1 (f1 INT, gc INT AS (f1 + 1) STORED, INDEX (gc))

The generated column, gc, is defined as the expression f1 + 1. The column is also indexed and the optimizer can take that index into account during execution plan construction. In the following query, the WHERE clause refers to gc and the optimizer considers whether the index on that column yields a more efficient plan:

SELECT * FROM t1 WHERE gc > 9;

The optimizer can use indexes on generated columns to generate execution plans, even in the absence of direct references in queries to those columns by name. This occurs if the WHERE, ORDER BY, or GROUP BY clause refers to an expression that matches the definition of some indexed generated column. The following query does not refer directly to gc but does use an expression that matches the definition of gc:

SELECT * FROM t1 WHERE f1 + 1 > 9;

The optimizer recognizes that the expression f1 + 1 matches the definition of gc and that gc is indexed, so it considers that index during execution plan construction. You can see this using EXPLAIN:

mysql> EXPLAIN SELECT * FROM t1 WHERE f1 + 1 > 9\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 partitions: NULL
 type: range
possible_keys: gc
 key: gc
 key_len: 5
 ref: NULL
 rows: 1
 filtered: 100.00
 Extra: Using index condition

In effect, the optimizer has replaced the expression f1 + 1 with the name of the generated column that matches the expression. That is also apparent in the rewritten query available in the extended EXPLAIN information displayed by SHOW WARNINGS:

mysql> SHOW WARNINGS\G
*************************** 1. row ***************************
 Level: Note
 Code: 1003
Message: /* select#1 */ select `test`.`t1`.`f1` AS `f1`,`test`.`t1`.`gc`
 AS `gc` from `test`.`t1` where (`test`.`t1`.`gc` > 9)

The following restrictions and conditions apply to the optimizer's use of generated column indexes:

• For a query expression to match a generated column definition, the expression must be identical and it must have the same result type. For example, if the generated column expression is f1 + 1, the optimizer does not recognize a match if the query uses 1 + f1, or if f1 + 1 (an integer expression) is compared with a string.

• The optimization applies to these operators: =, <, <=, >, >=, BETWEEN, and IN(). For operators other than BETWEEN and IN(), either operand can be replaced by a matching generated column. For BETWEEN and IN(), only the first argument can be replaced by a matching generated column, and the other arguments must have the same result type. BETWEEN and IN() are not yet supported for comparisons involving JSON values.

• The generated column must be defined as an expression that contains at least a function call or one of the operators mentioned in the preceding item. The expression cannot consist of a simple reference to another column. For example, gc INT AS (f1) STORED consists only of a column reference, so indexes on gc are not considered.

• For comparisons of strings to indexed generated columns that compute a value from a JSON function that returns a quoted string, JSON_UNQUOTE() is needed in the column definition to remove the extra quotes from the function value. (For direct comparison of a string to the function result, the JSON comparator handles quote removal, but this does not occur for index lookups.) For example, instead of writing a column definition like this:

doc_name TEXT AS (JSON_EXTRACT(jdoc, '$.name')) STORED

Write it like this:

doc_name TEXT AS (JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name'))) STORED

With the latter definition, the optimizer can detect a match for both of these comparisons:

... WHERE JSON_EXTRACT(jdoc, '$.name') = 'some_string' ...
... WHERE JSON_UNQUOTE(JSON_EXTRACT(jdoc, '$.name')) = 'some_string' ...

Without JSON_UNQUOTE() in the column definition, the optimizer detects a match only for the first of those comparisons.

• If the optimizer picks the wrong index, an index hint can be used to disable it and force the optimizer to make a different choice.

12 Invisible Indexes

MySQL supports invisible【ɪnˈvɪzəbl 看不見的;無形的(與服務而非商品有關);隱形的;】 indexes; that is, indexes that are not used by the optimizer. The feature applies to indexes other than primary keys (either explicit or implicit).

Indexes are visible by default. To control visibility explicitly for a new index, use a VISIBLE or INVISIBLE keyword as part of the index definition for CREATE TABLE, CREATE INDEX, or ALTER TABLE:

CREATE TABLE t1 (
 i INT,
 j INT,
 k INT,
 INDEX i_idx (i) INVISIBLE
) ENGINE = InnoDB;
CREATE INDEX j_idx ON t1 (j) INVISIBLE;
ALTER TABLE t1 ADD INDEX k_idx (k) INVISIBLE;

To alter the visibility of an existing index, use a VISIBLE or INVISIBLE keyword with the ALTER TABLE ... ALTER INDEX operation:

ALTER TABLE t1 ALTER INDEX i_idx INVISIBLE;
ALTER TABLE t1 ALTER INDEX i_idx VISIBLE;

Information about whether an index is visible or invisible is available from the Information Schema STATISTICS table or SHOW INDEX output. For example:

mysql> SELECT INDEX_NAME, IS_VISIBLE
 FROM INFORMATION_SCHEMA.STATISTICS
 WHERE TABLE_SCHEMA = 'db1' AND TABLE_NAME = 't1';
+------------+------------+
| INDEX_NAME | IS_VISIBLE |
+------------+------------+
| i_idx      | YES        |  
| j_idx      | NO         |
| k_idx      | NO         |
+------------+------------+

Invisible indexes make it possible to test the effect of removing an index on query performance, without making a destructive change that must be undone should the index turn out to be required. Dropping and re-adding an index can be expensive for a large table, whereas making it invisible and visible are fast, inplace operations.

If an index made invisible actually is needed or used by the optimizer, there are several ways to notice the effect of its absence on queries for the table:

• Errors occur for queries that include index hints that refer to the invisible index.

• Performance Schema data shows an increase in workload for affected queries.

• Queries have different EXPLAIN execution plans.

• Queries appear in the slow query log that did not appear there previously

The use_invisible_indexes flag of the optimizer_switch system variable controls whether the optimizer uses invisible indexes for query execution plan construction. If the flag is off (the default), the optimizer ignores invisible indexes (the same behavior as prior to the introduction of this flag). If the flag is on, invisible indexes remain invisible but the optimizer takes them into account for execution plan construction.

Using the SET_VAR optimizer hint to update the value of optimizer_switch temporarily, you can enable invisible indexes for the duration of a single query only, like this:

mysql> EXPLAIN SELECT /*+ SET_VAR(optimizer_switch = 'use_invisible_indexes=on') */
 > i, j FROM t1 WHERE j >= 50\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 partitions: NULL
 type: range
possible_keys: j_idx
 key: j_idx
 key_len: 5
 ref: NULL
 rows: 2
 filtered: 100.00
 Extra: Using index condition
mysql> EXPLAIN SELECT i, j FROM t1 WHERE j >= 50\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 partitions: NULL
 type: ALL
possible_keys: NULL
 key: NULL
 key_len: NULL
 ref: NULL
 rows: 5
 filtered: 33.33
 Extra: Using where

Index visibility does not affect index maintenance. For example, an index continues to be updated per changes to table rows, and a unique index prevents insertion of duplicates into a column, regardless of whether the index is visible or invisible.

A table with no explicit primary key may still have an effective implicit primary key if it has any UNIQUE indexes on NOT NULL columns. In this case, the first such index places the same constraint on table rows as an explicit primary key and that index cannot be made invisible. Consider the following table definition:

CREATE TABLE t2 (
 i INT NOT NULL,
 j INT NOT NULL,
 UNIQUE j_idx (j)
) ENGINE = InnoDB;

The definition includes no explicit primary key, but the index on NOT NULL column j places the same constraint on rows as a primary key and cannot be made invisible:

mysql> ALTER TABLE t2 ALTER INDEX j_idx INVISIBLE;
ERROR 3522 (HY000): A primary key index cannot be invisible.

Now suppose that an explicit primary key is added to the table:

ALTER TABLE t2 ADD PRIMARY KEY (i);

The explicit primary key cannot be made invisible. In addition, the unique index on j no longer acts as an implicit primary key and as a result can be made invisible:

mysql> ALTER TABLE t2 ALTER INDEX j_idx INVISIBLE;
Query OK, 0 rows affected (0.03 sec)

13 Descending Indexes

MySQL supports descending【dɪˈsendɪŋ (次序)下降的，遞減的;】 indexes: DESC in an index definition is no longer ignored but causes storage of key values in descending order. Previously, indexes could be scanned in reverse order but at a performance penalty【ˈpenəlti 處罰;懲罰;刑罰;點球;(對犯規者的)判罰;不利;害處;】. A descending index can be scanned in forward order, which is more efficient. Descending indexes also make it possible for the optimizer to use multiple-column indexes when the most efficient scan order mixes ascending order for some columns and descending order for others.

Consider the following table definition, which contains two columns and four two-column index definitions for the various combinations of ascending and descending indexes on the columns:

CREATE TABLE t (
 c1 INT, c2 INT,
 INDEX idx1 (c1 ASC, c2 ASC),
 INDEX idx2 (c1 ASC, c2 DESC),
 INDEX idx3 (c1 DESC, c2 ASC),
 INDEX idx4 (c1 DESC, c2 DESC)
);

The table definition results in four distinct indexes. The optimizer can perform a forward index scan for each of the ORDER BY clauses and need not use a filesort operation:

ORDER BY c1 ASC, c2 ASC -- optimizer can use idx1
ORDER BY c1 DESC, c2 DESC -- optimizer can use idx4
ORDER BY c1 ASC, c2 DESC -- optimizer can use idx2
ORDER BY c1 DESC, c2 ASC -- optimizer can use idx3

Use of descending indexes is subject to【be subject to 受…影響;受…支配;降伏;給; 】 these conditions:

• Descending indexes are supported only for the InnoDB storage engine, with these limitations:

Change buffering is not supported for a secondary index if the index contains a descending index key column or if the primary key includes a descending index column.
The InnoDB SQL parser does not use descending indexes. For InnoDB full-text search, this means that the index required on the FTS_DOC_ID column of the indexed table cannot be defined as a descending index.

• Descending indexes are supported for all data types for which ascending indexes are available.

• Descending indexes are supported for ordinary (nongenerated) and generated columns (both VIRTUAL and STORED).

• DISTINCT can use any index containing matching columns, including descending key parts.

• Indexes that have descending key parts are not used for MIN()/MAX() optimization of queries that invoke aggregate functions but do not have a GROUP BY clause.

• Descending indexes are supported for BTREE but not HASH indexes. Descending indexes are not supported for FULLTEXT or SPATIAL indexes.

Explicitly【ɪkˈsplɪsətli 明確地;明白地;】 specified ASC and DESC designators for HASH, FULLTEXT, and SPATIAL indexes results in an error.

You can see in the Extra column of the output of EXPLAIN that the optimizer is able to use a descending index, as shown here:

mysql> CREATE TABLE t1 (
 -> a INT,
 -> b INT,
 -> INDEX a_desc_b_asc (a DESC, b ASC)
 -> );
mysql> EXPLAIN SELECT * FROM t1 ORDER BY a ASC\G
*************************** 1. row ***************************
 id: 1
 select_type: SIMPLE
 table: t1
 partitions: NULL
 type: index
possible_keys: NULL
 key: a_desc_b_asc
 key_len: 10
 ref: NULL
 rows: 1
 filtered: 100.00
 Extra: Backward index scan; Using index

In EXPLAIN FORMAT=TREE output, use of a descending index is indicated by the addition of (reverse) following the name of the index, like this:

mysql> EXPLAIN FORMAT=TREE SELECT * FROM t1 ORDER