第03期：列非空與自增技术頭條網

第03期：列非空與自增

2020-04-29 17:40:00 愛可生

MySQL 裡字段的屬性很多，對性能來說，影響也是可大可小，所以針對其屬性這一塊有必要進行一次探究。

一、NULL / NOT NULL

NULL 對外部程序來說，具體為不知道、不確切的、無法表述的值。所以在很多家公司的開發規範裡都明確規定了，必須為 NOT NULL。

其實用到 NULL 的場景都可以轉換為有意義的字符或者數值，一是有利用數據的易讀性以及後期的易維護性；二是降低 SQL 語句的編寫難度。

關於 NULL 的特性如下：

1. 參與 NULL 字段拼接的結果都為 NULL，預期的可能會有差異

預想把字段 r1 做個拼接，再插入到新的表 t3 裡，結果 t3 表的記錄全為 NULL，跟預期不符。

<code>mysql> show create table t1\\G*************************** 1. row ***************************      Table: t1Create Table: CREATE TABLE `t1` ( `r1` varchar(10) DEFAULT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)mysql> show create table t2\\G*************************** 1. row ***************************      Table: t2Create Table: CREATE TABLE `t2` ( `r1` varchar(10) NOT NULL) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci1 row in set (0.00 sec)mysql> create table t3 like t1;Query OK, 0 rows affected (0.04 sec)mysql> insert into t3 select concat(r1,'database') from t1 limit 2;Query OK, 2 rows affected (0.02 sec)Records: 2  Duplicates: 0  Warnings: 0mysql> select * from t3;+------+| r1   |+------+| NULL || NULL |+------+2 rows in set (0.00 sec)/<code>

那正確的方法如下，對 NULL 用函數 IFNULL 特殊處理。

<code>mysql> insert into t3 select concat(ifnull(r1,''),'database') from t1 limit 2;Query OK, 2 rows affected (0.01 sec)Records: 2  Duplicates: 0  Warnings: 0mysql> select * from t3;+----------+| r1       |+----------+| database || database |+----------+2 rows in set (0.00 sec)/<code>

2. 對於包含 NULL 列的求 COUNT 值也不準確

t1 和 t2 的記錄數是一樣的，但是字段 r1 包含了 NULL，這導致結果忽略了這些值。

<code>mysql> select count(r1) as rc from t1;+-------+| rc    |+-------+| 16384 |+-------+1 row in set (0.01 sec)mysql> select count(r1) as rc from t2;+-------+| rc    |+-------+| 32768 |+-------+1 row in set (0.03 sec)/<code>

這時候我們可能想到了，正確的方法是用 NULL 相關函數處理，

<code>mysql> select count(ifnull(r1,'')) as rc from t1;+-------+| rc    |+-------+| 32768 |+-------+1 row in set (0.03 sec)/<code>

或者是直接用 COUNT(*) 包含了所有可能的值

<code>mysql> select count(*) as rc from t1;+-------+| rc    |+-------+| 32768 |+-------+1 row in set (0.02 sec)/<code>

當然了不僅僅是 COUNT，除了 NULL 相關的函數，大部分對 NULL 處理都不友好。

所以其實從上面兩點來看，NULL 的處理都得特殊處理，額外增加了編寫 SQL 的難度。

3. 包含 NULL 的索引列

對包含 NULL 列建立索引，比不包含的 NULL 的字段，要多佔用一個 BIT 位來存儲。

示例

key_len 分別為 43 和 42，t1 比 t2 多了一個字節，那這裡可能有人要問了，不是說佔了一個 BIT 位嗎？那為什麼多了一個字節？可以關注我關於 BIT 的詳細描述。

<code>mysql> pager grep -i 'key_len'PAGER set to 'grep -i 'key_len''mysql> explain select * from t1 where r1 = ''\\G     key_len: 431 row in set, 1 warning (0.00 sec)mysql> explain select * from t2 where r1 = ''\\G     key_len: 421 row in set, 1 warning (0.00 sec)/<code>

4. 各存儲引擎相關的對 NULL 的處理

在 MySQL 8.0 發佈後，僅有 InnoDB、MyISAM 和 Memory 支持對包含 NULL 列的索引，其他引擎不支持。比如 NDB。

二、AUTO_INCREMENT

列的自增屬性，一般用來設置整數列根據一定步長逐步增長的值，類似於其他數據庫的序列。不過這裡的“序列”是基於特定一張表的。關於自增屬性的相關特性如下：

1. 控制自增屬性性能的變量：innodb_autoinc_lock_mode

innodb_autoinc_lock_mode=0

代表傳統模式，也就是說，在對有自增屬性的字段插入記錄時，會持續持有一個表級別的自增鎖，直到語句執行結束為止。比如說下面兩條語句，SQL 1 在執行期間，一直持有對錶 f1 的表級自增鎖，接下來 SQL 2 執行時鎖超時。

innodb_autoinc_lock_mode=1

代表連續模式，和傳統模式差不多，不同的點在於對於簡單的插入語句，比如 SQL 2，只在分配新的 ID 過程中持有一個輕量級的互斥鎖（線程級別，而不是事務級別），而不是直到語句結束才釋放的表鎖。

<code>-- SQL 1mysql> insert into f1(c2) select rpad(uuid(),100,uuid()) from t1;Query OK, 16777216 rows affected (3 min 35.92 sec)Records: 16777216  Duplicates: 0  Warnings: 0-- SQL 2mysql> insert into f1(c2) select 'database';ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction/<code>

innodb_autoinc_lock_mode=2

代表交錯模式。這個模式下放棄了自增表鎖，產生的值會不連續。不過這是性能最高的模式，多條插入語句可以併發執行。MySQL 8.0 默認就是交錯模式。

<code>-- SQL 1mysql> insert into f1(c2) select rpad(uuid(),100,uuid()) from t1;Query OK, 16777216 rows affected (3 min 35.92 sec)Records: 16777216  Duplicates: 0  Warnings: 0-- SQL 2mysql> insert into f1(c2) select 'sql2';Query OK, 1 row affected (0.02 sec)Records: 1  Duplicates: 0  Warnings: 0/<code>

那針對複製安全來說，以上三種模式，0 和 1 對語句級別安全，也就是產生的二進制日誌複製到任何其他機器都數據都一致；2 對於語句級別不安全；三種模式對二進制日誌格式為行的都安全。

2. 控制自增屬性的步長以及偏移量

一般用在主主複製架構或者多源複製架構裡，主動規避主鍵衝突。

auto_increment_increment 控制步長

auto_increment_offset 控制偏移量

3. 對於要立刻獲取插入值的需求

就是說要獲取一張表任何時候的最大值，應該時刻執行以下 SQL 3 ，而不是 SQL 2。SQL 2 裡的函數 last_insert_id() 只獲取上一條語句最開始的 ID，只適合簡單的 INSERT。

<code>-- SQL 1  mysql> insert into f1(c2) values ('xx1'),('xx2'),('xx3');  Query OK, 3 rows affected (0.01 sec)  Records: 3  Duplicates: 0  Warnings: 0-- SQL 2  mysql> select last_insert_id() as last_insert_id;  +----------------+  | last_insert_id |  +----------------+  |              1 |  +----------------+  1 row in set (0.00 sec)-- SQL 3  mysql> select max(c1) as last_insert_id from f1;  +----------------+  | last_insert_id |  +----------------+  |              3 |  +----------------+  1 row in set (0.00 sec)-- SQL 4mysql> select * from f1;+----+------+| c1 | c2   |+----+------+|  1 | xx1  ||  2 | xx2  ||  3 | xx3  |+----+------+3 rows in set (0.00 sec)/<code>

4. 自增列溢出現象

自增屬性的列如果到了此列數據類型的最大值，會發生值溢出。比如變更表 f1 的自增屬性列為 tinyint。

SQL 2 顯式插入最大值 127， SQL 3 就報錯了。所以這點上建議提前規劃好自增列的字段類型，提前瞭解上限值。

<code>mysql> drop table f1;Query OK, 0 rows affected (0.04 sec)mysql> create table f1(c1 tinyint auto_increment primary key);Query OK, 0 rows affected (0.05 sec)-- SQL 1mysql> insert into f1 values (127);Query OK, 1 row affected (0.01 sec)-- SQL 2mysql> select * from f1;+-----+| c1  |+-----+| 127 |+-----+1 row in set (0.00 sec)-- SQL 3mysql> insert into f1 select null;ERROR 1062 (23000): Duplicate entry '127' for key 'PRIMARY'/<code>

5. 自增列也可以顯式插入有符號的值

<code>mysql> insert into f1 values (-10),(-20),(-30);Query OK, 3 rows affected (0.01 sec)Records: 3  Duplicates: 0  Warnings: 0mysql> select * from f1;+-----+| c1  |+-----+| -30 || -20 || -10 || 127 |+-----+4 rows in set (0.00 sec)/<code>

那針對這樣的，建議在請求到達數據庫前規避掉。