05.12 A Mysterious Redis Memory Increase

Abstract: a sudden memory surge caused by a Redis rehash.

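I. Symptoms

  • Instance: r-bp1cxxxxxxxxxd04 (master-slave)

  • Time: 2017-11-16 12:26~12:27

  • Problem: memory usage rose by 2 GB within one minute, as shown in the figure below

  • Key count: about 60 million

[Figure: memory usage of the instance around 12:26~12:27]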

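II. Redis Memory Analysis

1. Memory composition

The memory plotted in the chart above is the used_memory property reported by the Redis info memory command, for example:

[Figure: sample output of info memory]

Each property in detail: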

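<table>
<thead><tr><th>Property</th><th>Description</th></tr></thead>
<tbody>
<tr><td>used_memory</td><td>Total memory allocated by the Redis allocator, i.e. the memory actually used to store data</td></tr>
<tr><td>used_memory_human</td><td>used_memory in human-readable format</td></tr>
<tr><td>used_memory_rss</td><td>Total physical memory occupied by the Redis process, as seen by the operating system</td></tr>
<tr><td>used_memory_peak</td><td>Largest amount of memory allocated by the allocator, i.e. the historical peak of used_memory</td></tr>
<tr><td>used_memory_peak_human</td><td>used_memory_peak in human-readable format</td></tr>
<tr><td>used_memory_lua</td><td>Memory consumed by the Lua engine</td></tr>
<tr><td>mem_fragmentation_ratio</td><td>used_memory_rss / used_memory, the memory fragmentation ratio</td></tr>
<tr><td>mem_allocator</td><td>Memory allocator used by Redis. Default: jemalloc</td></tr>
</tbody>
</table>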

The formulas are:

used_memory = self memory + object memory + buffer memory + Lua memory
used_memory_rss = used_memory + memory fragmentation
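The relationship between used_memory and used_memory_rss can be checked directly from the text that info memory returns. The following is a minimal sketch: the helper names parse_info and fragmentation are illustrative, and the sample values are made up rather than taken from the instance in this article.

```python
def parse_info(text):
    """Parse the 'key:value' lines of a Redis INFO section into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Memory".
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def fragmentation(fields):
    """Return (rss / used, rss - used) from parsed INFO memory fields."""
    used = int(fields["used_memory"])
    rss = int(fields["used_memory_rss"])
    return rss / used, rss - used


# Hypothetical sample values, NOT from the instance discussed here.
SAMPLE = """\
# Memory
used_memory:1073741824
used_memory_rss:1288490188
used_memory_lua:37888
mem_allocator:jemalloc-4.0.3
"""

info = parse_info(SAMPLE)
ratio, overhead = fragmentation(info)
print(f"fragmentation ratio={ratio:.2f}, rss - used = {overhead} bytes")
```

A ratio well above 1 points at fragmentation; a jump in used_memory itself, as in this incident, points elsewhere.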

As shown in the figure below:

[Figure: composition of Redis memory]

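2. Memory analysis:

(1) Self memory: an empty Redis instance uses very little memory, which can be ignored

(2) Key-value memory: key objects + value objects

(3) Buffers: client buffers (normal clients + slaves, which connect to the master as clients + pubsub) and the AOF buffer (fairly fixed in size, rarely a problem)

(4) Lua: memory consumed by the Lua engine

3. Common causes of sudden memory growth

(1) Key-value memory: bigkeys, heavy writes

(2) Client buffers: the usual suspects are normal client buffers (for example, the monitor command) or pubsub client buffers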

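III. Troubleshooting

(1) Bigkeys?

A scan found no bigkeys.

[Figure: bigkey scan result]

(2) Did the number of keys increase?

The key count showed no significant change.

[Figure: key count over time]

(3) Client buffers

Memory stayed high for a long time after the increase. If buffers were the cause, info clients would show clear signs of it.

After running it:

[Figure: info clients output]

Running client list also showed no client with omem greater than 0: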

id=80207 addr=10.xx.0.4:63920 fd=46 name= age=624 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80215 addr=10.xx.0.23:43489 fd=36 name= age=591 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80366 addr=10.xx.0.8:59785 fd=18 name= age=84 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=del read=0 write=0 type=user
id=80356 addr=10.xx.0.33:32117 fd=13 name= age=114 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80064 addr=10.xx.59.4:53446 fd=38 name= age=1070 idle=1070 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL read=0 write=0 type=admin
id=80276 addr=10.xx.0.23:48511 fd=8 name= age=387 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80188 addr=10.xx.0.33:16265 fd=42 name= age=681 idle=3 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80326 addr=10.xx.0.32:59779 fd=16 name= age=209 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80065 addr=10.xx.59.4:53447 fd=45 name= age=1070 idle=1070 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL read=0 write=0 type=admin
id=79936 addr=10.xx.0.22:10607 fd=30 name= age=1480 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80174 addr=10.xx.0.5:60914 fd=6 name= age=722 idle=2 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80300 addr=10.xx.0.22:22757 fd=48 name= age=298 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80037 addr=10.xx.0.5:55189 fd=15 name= age=1143 idle=2 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80330 addr=10.xx.0.8:48533 fd=17 name= age=199 idle=10 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=79896 addr=10.xx.0.30:26814 fd=11 name= age=1616 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80299 addr=10.xx.0.24:11227 fd=44 name= age=303 idle=3 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80086 addr=10.xx.0.32:52526 fd=40 name= age=1002 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80202 addr=10.xx.0.33:16658 fd=26 name= age=636 idle=3 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80256 addr=10.xx.0.24:60496 fd=19 name= age=448 idle=2 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=79908 addr=10.xx.0.29:18975 fd=12 name= age=1583 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80365 addr=10.xx.0.29:46429 fd=14 name= age=85 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=79869 addr=10.xx.27.4:48455 fd=35 name= age=1700 idle=1700 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL read=0 write=0 type=admin
id=80334 addr=10.xx.0.23:50012 fd=39 name= age=189 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80041 addr=10.xx.0.32:51107 fd=33 name= age=1132 idle=3 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=79992 addr=10.xx.0.22:12068 fd=28 name= age=1289 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80251 addr=10.xx.0.30:44213 fd=23 name= age=468 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80006 addr=10.xx.0.2:45895 fd=31 name= age=1242 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80321 addr=10.xx.0.30:48048 fd=5 name= age=224 idle=3 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80381 addr=10.xx.0.8:13360 fd=22 name= age=24 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=del read=0 write=0 type=user
id=80200 addr=10.xx.0.24:59183 fd=24 name= age=640 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80113 addr=10.xx.0.2:52492 fd=21 name= age=915 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=174 addr=11.216.117.242:53027 fd=9 name= age=281390 idle=0 flags=S db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=replconf read=0 write=0 type=admin
id=79991 addr=10.xx.0.4:48412 fd=25 name= age=1296 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80301 addr=127.0.0.1:47869 fd=49 name= age=291 idle=261 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=strlen read=0 write=0 type=admin
id=80047 addr=10.xx.59.4:53184 fd=41 name= age=1114 idle=1114 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL read=0 write=0 type=admin
id=80236 addr=10.xx.0.5:62546 fd=47 name= age=516 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80364 addr=10.xx.0.4:18794 fd=7 name= age=85 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80175 addr=10.xx.0.4:62245 fd=29 name= age=718 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80336 addr=10.xx.0.29:45701 fd=50 name= age=180 idle=1 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80050 addr=10.xx.59.4:53188 fd=43 name= age=1114 idle=1114 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=NULL read=0 write=0 type=admin
id=79765 addr=10.xx.0.2:33832 fd=37 name= age=2027 idle=177 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=info read=0 write=0 type=user
id=80170 addr=10.xx.0.2:57853 fd=20 name= age=728 idle=24 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r cmd=ping read=0 write=0 type=user
id=80390 addr=127.0.0.1:49449 fd=27 name= age=0 idle=0 flags=N db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 events=r cmd=client read=0 write=0 type=admin
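Scanning a long client list by eye for a non-zero omem is error-prone. A small filter like the one below can do it instead. This is only a sketch: the function names are illustrative, the first two sample lines are copied from the output above, and the third is a made-up line added to show a positive hit.

```python
def parse_client_line(line):
    """Split one CLIENT LIST line into a dict of its key=value fields."""
    fields = {}
    for token in line.split():
        key, sep, value = token.partition("=")
        if sep:  # ignore anything that is not key=value
            fields[key] = value
    return fields


def clients_with_output_memory(client_list_text):
    """Return the parsed clients whose output buffer memory (omem) is > 0."""
    hits = []
    for line in client_list_text.splitlines():
        if not line.strip():
            continue
        fields = parse_client_line(line)
        if int(fields.get("omem", "0")) > 0:
            hits.append(fields)
    return hits


# Two lines taken from the output above.
OBSERVED = (
    "id=80207 addr=10.xx.0.4:63920 fd=46 name= age=624 idle=1 flags=N db=0 "
    "sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=0 obl=0 oll=0 omem=0 events=r "
    "cmd=ping read=0 write=0 type=user\n"
    "id=174 addr=11.216.117.242:53027 fd=9 name= age=281390 idle=0 flags=S "
    "db=0 sub=0 psub=0 multi=-1 qbuf=0 qbuf-free=32768 obl=0 oll=0 omem=0 "
    "events=r cmd=replconf read=0 write=0 type=admin\n"
)
# Hypothetical line, NOT from the incident, showing what a hit looks like.
HYPOTHETICAL = "id=1 addr=127.0.0.1:1 fd=5 name= flags=N omem=1048576 cmd=monitor\n"

print(clients_with_output_memory(OBSERVED))
print([c["id"] for c in clients_with_output_memory(OBSERVED + HYPOTHETICAL)])
```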

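IV. Finding the Culprit

The usual techniques turned up nothing. A colleague, @徑遠, joined the analysis and suspected that Redis's key-value hash table had been rehashed.

1. Redis key-value storage structure

As shown in the figure below, all Redis key-value pairs are stored in a dict. Its ht field holds two hash tables, ht[0] and ht[1]. Normally one stores the data and the other sits idle; ht[1] is used only while a rehash is in progress.

[Figure: Redis dict structure with ht[0] and ht[1]]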

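2. Redis dict rehash

To keep the hash table's load factor in check, Redis triggers a rehash that expands the table when the number of elements equals the number of slots.

After expansion, the capacity of ht[1] is the first power of two (2^n) greater than or equal to ht[0].size*2. For example, the initial hash table capacity is 4, so the next expansion brings it to 8, and so on.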
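The sizing rule just described can be written out in a few lines. This is a Python sketch of the rule as stated in this article, not a copy of the Redis source. The names next_power and expanded_size are illustrative, and the initial size of 4 comes from the example above.

```python
INITIAL_SIZE = 4  # initial hash table capacity, as stated in the text


def next_power(size):
    """Return the first power of two >= size, starting from INITIAL_SIZE."""
    n = INITIAL_SIZE
    while n < size:
        n *= 2
    return n


def expanded_size(current_size):
    """Capacity of ht[1] when a table with current_size slots is expanded."""
    return next_power(current_size * 2)


for size in (4, 8, 2 ** 15, 2 ** 26):
    print(size, "->", expanded_size(size))
```

Because the capacity doubles each time, a single expansion of a large table allocates as many new slots as all previous expansions combined, which is why the jump becomes visible only at large key counts.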

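3. Test

(1) Test method

First bulk-write keys until the count is close to the rehash threshold, then write keys one at a time and observe how memory changes.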
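The original test script is not reproduced here, so what follows is only a sketch of the method just described. It assumes a client object exposing set(name, value, ex=...) and info("memory"), which is the shape of the redis-py client. The key names, TTL and margin are arbitrary choices, and InMemoryStub is a stand-in that lets the sketch run without a server. Its memory numbers mean nothing.

```python
def fill_and_watch(client, threshold, margin=3, ttl_seconds=3600):
    """Bulk-write keys to just below threshold, then write one key at a time,
    recording (key_count, used_memory) after every single write."""
    # Bulk phase: stop `margin` keys short of the rehash threshold.
    for i in range(threshold - margin):
        client.set(f"key:{i}", "v", ex=ttl_seconds)

    # Single-step phase: cross the threshold one key at a time.
    samples = []
    for i in range(threshold - margin, threshold + margin):
        client.set(f"key:{i}", "v", ex=ttl_seconds)
        used = client.info("memory")["used_memory"]
        samples.append((i + 1, used))
    return samples


class InMemoryStub:
    """Stand-in client so the sketch runs without Redis (hypothetical)."""

    def __init__(self):
        self.keys = {}

    def set(self, name, value, ex=None):
        self.keys[name] = value
        return True

    def info(self, section=None):
        # Fake figure: a flat 64 bytes per key, with no rehash behaviour.
        return {"used_memory": 64 * len(self.keys)}


for count, used in fill_and_watch(InMemoryStub(), threshold=16, margin=2):
    print(count, used)
```

Against a real server, the stub would be replaced by a redis-py client, and a large threshold would call for pipelining the bulk phase. Every key is written with an expiration time because the tests in this article did the same.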

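[Figure: test procedure]

(2) Running the test

(a) Threshold = 2^15 = 32768. As shown below, when the key count reaches 32769, memory rises slightly, but not noticeably.

[Figure: memory change at 32769 keys]

(b) Threshold = 2^20 = 1048576. As shown below, when the key count reaches 1048577, memory rises by 32 MB.

The rehash expands the table, so the new hash table has 2^21 slots. This counts twice, because every key has an expiration time set and therefore also has an entry in the expires table. Each slot is an 8-byte pointer, so 2^21 × 2 × 8 = 2^25 bytes = 32 MB.

[Figure: memory change at 1048577 keys]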

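(c) Threshold = 2^26 = 67108864. As shown below, when the key count reaches 67108865, memory rises by 2 GB.

The rehash expands the table, so the new hash table has 2^27 slots. This counts twice, because every key has an expiration time set and therefore also has an entry in the expires table. Each slot is an 8-byte pointer, so 2^27 × 2 × 8 = 2^31 bytes = 2 GB.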
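The arithmetic for the three test cases follows one formula: doubled slot count × number of tables × pointer size. The sketch below reproduces it. The function name is illustrative, and the 8-byte pointer and the two tables (main dict plus expires) are the assumptions stated in the text.

```python
POINTER_BYTES = 8  # slot size on a 64-bit build, as assumed in the text


def rehash_allocation_bytes(old_slots, tables=2):
    """Bytes allocated for the new bucket arrays when a table of old_slots
    slots doubles. tables=2 covers the main dict and the expires dict."""
    new_slots = old_slots * 2
    return new_slots * tables * POINTER_BYTES


for exponent in (15, 20, 26):
    extra = rehash_allocation_bytes(2 ** exponent)
    print(f"threshold 2^{exponent}: +{extra / 2 ** 20:g} MiB")
```

With tables=1 the same function gives the figure for keys that have no expiration time, which is half as much.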

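[Figure: memory change at 67108865 keys]

Looking back at the key count and memory charts for r-bp1c15fd9b142d04, the same rule holds:

[Figure: key count of the instance]

[Figure: memory usage of the instance]

4. Follow-up observation

At 17:00 the rehash finished, and memory dropped by half of the 2 GB increase. This is consistent with the old tables of 2^26 slots being freed: 2^26 × 2 × 8 = 2^30 bytes = 1 GB.

[Figure: memory usage after the rehash finished]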

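V. Summary

  • Because of how hash tables work, a large number of keys in Redis does not slow down reads and writes, but it can cause the problem described in this article. Some suggestions for keeping the key count under control:

  • Set expiration times on keys that are no longer needed, or delete them periodically.

  • Optimize the key design: for example, merge some string keys into ziplist-encoded hashes.

  • Future improvements: support in the Redis kernel for rehash audit logs, and faster rehashing.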
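One common way to apply the ziplist hash suggestion above is to spread the former string keys over a fixed number of hash buckets, so that the top-level key count stays small. The following is a sketch of the mapping only. The bucket count, the "bucket" prefix and the use of CRC32 are assumptions made for illustration, not something the article prescribes.

```python
import zlib


def bucket_location(key, bucket_count=1024, prefix="bucket"):
    """Map a former string key to (hash_key, field) for HSET/HGET."""
    digest = zlib.crc32(key.encode("utf-8"))
    return f"{prefix}:{digest % bucket_count}", key


for original in ("user:1001", "user:1002", "order:77"):
    hash_key, field = bucket_location(original)
    # A write that used to be SET original value becomes HSET hash_key field value.
    print(original, "->", hash_key, field)
```

The hash stays ziplist-encoded only while it remains under the hash-max-ziplist-entries and hash-max-ziplist-value limits, so the bucket count has to be sized to the data. Fields inside a hash also cannot carry their own expiration time in the Redis versions discussed here, so this suits keys that do not need a per-key TTL.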

