/* The ziplist is a specially encoded dually linked list that is designed
* to be very memory efficient. It stores both strings and integer values,
* where integers are encoded as actual integers instead of a series of
* characters. It allows push and pop operations on either side of the list
* in O(1) time. However, because every operation requires a reallocation of
* the memory used by the ziplist, the actual complexity is related to the
* amount of memory used by the ziplist.
*
* Annotation: a ziplist lives in one contiguous block of memory, which is
* why pushes and pops on either end may reallocate the whole structure and
* why their real cost depends on the size of the ziplist.
*
* Note: as of Redis 7.0, ziplist has been fully replaced by listpack, mainly
* because cascading updates hurt performance.
*
* ----------------------------------------------------------------------------
*
* ZIPLIST OVERALL LAYOUT
* ======================
*
* The general layout of the ziplist is as follows:
*
* <zlbytes> <zltail> <zllen> <entry> <entry> ... <entry> <zlend>
*
* NOTE: all fields are stored in little endian, if not specified otherwise.
*
* <uint32_t zlbytes> is an unsigned integer to hold the number of bytes that
* the ziplist occupies, including the four bytes of the zlbytes field itself.
* This value needs to be stored to be able to resize the entire structure
* without the need to traverse it first.
*
* zlbytes is 4 bytes wide and counts itself. Keeping it in the header makes
* the total size available in O(1) when resizing; without it the whole
* ziplist would have to be walked first.
*
* <uint32_t zltail> is the offset to the last entry in the list. This allows
* a pop operation on the far side of the list without the need for full
* traversal.
*
* zltail is 4 bytes wide. It locates the tail entry directly, for example
* for a pop at the tail, without walking the ziplist.
*
* <uint16_t zllen> is the number of entries. When there are more than
* 2^16-2 entries, this value is set to 2^16-1 and we need to traverse the
* entire list to know how many items it holds.
*
* zllen is 2 bytes wide, so the largest value it can hold is 2^16-1 (65535).
* While the stored value is below 65535 it is the exact entry count, read in
* O(1); that is what the 2^16-2 above refers to. The value 65535 is a flag
* meaning the count is unknown, and an O(N) walk of the whole ziplist is
* needed to obtain it. A ziplist is therefore a poor fit for large numbers
* of entries: it is designed on the assumption that element counts stay small.
*
* <uint8_t zlend> is a special entry representing the end of the ziplist.
* Is encoded as a single byte equal to 255. No other normal entry starts
* with a byte set to the value of 255.
*
* zlend is the single byte 0xFF (decimal 255). Since no normal entry starts
* with this byte, a traversal stops as soon as it reads it.
*
* ZIPLIST ENTRIES
* ===============
*
* Every entry in the ziplist is prefixed by metadata that contains two pieces
* of information. First, the length of the previous entry is stored to be
* able to traverse the list from back to front. Second, the entry encoding is
* provided. It represents the entry type, integer or string, and in the case
* of strings it also represents the length of the string payload.
* So a complete entry is stored like this:
*
* <prevlen> <encoding> <entry-data>
*
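* These two pieces of metadata are also called the entry header:
* 1. prevlen: the length in bytes of the previous entry, which lets us walk
*    backwards by subtracting it from the current entry pointer.
* 2. encoding: the type and encoding of <entry-data>, e.g. integer or
*    string; each type is further split into several encodings.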
*
*
* Sometimes the encoding represents the entry itself, like for small integers
* as we'll see later. In such a case the <entry-data> part is missing, and we
* could have just:
*
* <prevlen> <encoding>
*
* For example, some small integers are stored in a few bits of the encoding
* byte itself, so no <entry-data> is needed.
*
* The length of the previous entry, <prevlen>, is encoded in the following way:
* If this length is smaller than 254 bytes, it will only consume a single
* byte representing the length as an unsigned 8 bit integer. When the length
* is greater than or equal to 254, it will consume 5 bytes. The first byte is
* set to 254 (FE) to indicate a larger value is following. The remaining 4
* bytes take the length of the previous entry as value.
*
* The number of bytes used to store prevlen is called prevlensize below
* (1 or 5):
* 1. Previous entry shorter than 254 bytes: prevlen takes 1 byte, an
*    unsigned 8 bit integer.
* 2. Previous entry of 254 bytes or more: prevlen takes 5 bytes:
*    a. the first byte is set to 0xFE (decimal 254, ZIP_BIG_PREVLEN below)
*       to flag the 5-byte form;
*    b. the actual length is stored in the following 4 bytes.
*
* So practically an entry is encoded in the following way:
*
* <prevlen from 0 to 253> <encoding> <entry>
*
* Or alternatively if the previous entry length is greater than 253 bytes
* the following encoding is used:
*
* 0xFE <4 bytes unsigned little endian prevlen> <encoding> <entry>
*
* Field by field:
* 1. Previous entry shorter than 254 bytes:
*      <prevlen from 0 to 253>  length of the previous entry (1 byte)
*      <encoding>               type and length of this entry's data
*      <entry>                  this entry's data
* 2. Previous entry of 254 bytes or more:
*      0xFE                     marker byte, value 254 (1 byte)
*      <4 bytes prevlen>        length of the previous entry, unsigned
*                               little endian (4 bytes)
*      <encoding>               type and length of this entry's data
*      <entry>                  this entry's data
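*
* Illustration (annotation, not part of the original Redis source): a minimal
* standalone sketch of the prevlen rule above. The helper names
* sketch_prevlen_store and sketch_prevlen_load are hypothetical; the real
* code is zipStorePrevEntryLength and ZIP_DECODE_PREVLEN further down. The
* fenced block is left unprefixed so it can be copied out of this comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: write 'len' as a prevlen field at 'p' and return the
// number of bytes used (1 or 5). Bytes are written little endian by hand so
// the sketch does not depend on the host byte order.
static size_t sketch_prevlen_store(unsigned char *p, uint32_t len) {
    if (len < 254) {
        p[0] = (unsigned char)len;
        return 1;
    }
    p[0] = 0xFE; // marker: a 4-byte length follows
    p[1] = (unsigned char)(len & 0xff);
    p[2] = (unsigned char)((len >> 8) & 0xff);
    p[3] = (unsigned char)((len >> 16) & 0xff);
    p[4] = (unsigned char)((len >> 24) & 0xff);
    return 5;
}

// Hypothetical helper: read a prevlen field back into '*len' and return the
// number of bytes the field occupies.
static size_t sketch_prevlen_load(const unsigned char *p, uint32_t *len) {
    if (p[0] < 254) {
        *len = p[0];
        return 1;
    }
    *len = (uint32_t)p[1] | ((uint32_t)p[2] << 8) |
           ((uint32_t)p[3] << 16) | ((uint32_t)p[4] << 24);
    return 5;
}
```

* With these helpers a length of 253 occupies one byte, while 254 already
* needs the five-byte form, because the byte value 254 is reserved as the
* marker.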
*
* The encoding field of the entry depends on the content of the
* entry. When the entry is a string, the first 2 bits of the encoding first
* byte will hold the type of encoding used to store the length of the string,
* followed by the actual length of the string. When the entry is an integer
* the first 2 bits are both set to 1. The following 2 bits are used to specify
* what kind of integer will be stored after this header. An overview of the
* different types and encodings is as follows. The first byte is always enough
* to determine the kind of entry.
*
* |00pppppp| - 1 byte
*      String value with length less than or equal to 63 bytes (6 bits).
*      "pppppp" represents the unsigned 6 bit length.
* |01pppppp|qqqqqqqq| - 2 bytes
*      String value with length less than or equal to 16383 bytes (14 bits).
*      IMPORTANT: The 14 bit number is stored in big endian.
* |10000000|qqqqqqqq|rrrrrrrr|ssssssss|tttttttt| - 5 bytes
*      String value with length greater than or equal to 16384 bytes.
*      Only the 4 bytes following the first byte represents the length
*      up to 2^32-1. The 6 lower bits of the first byte are not used and
*      are set to zero.
*      IMPORTANT: The 32 bit number is stored in big endian.
* |11000000| - 3 bytes
*      Integer encoded as int16_t (2 bytes).
* |11010000| - 5 bytes
*      Integer encoded as int32_t (4 bytes).
* |11100000| - 9 bytes
*      Integer encoded as int64_t (8 bytes).
* |11110000| - 4 bytes
*      Integer encoded as 24 bit signed (3 bytes).
* |11111110| - 2 bytes
*      Integer encoded as 8 bit signed (1 byte).
* |1111xxxx| - (with xxxx between 0001 and 1101) immediate 4 bit integer.
*      Unsigned integer from 0 to 12. The encoded value is actually from
*      1 to 13 because 0000 and 1111 can not be used, so 1 should be
*      subtracted from the encoded 4 bit value to obtain the right value.
* |11111111| - End of ziplist special entry.
*
* Like for the ziplist header, all the integers are represented in little
* endian byte order, even when this code is compiled in big endian systems.
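*
* Illustration (annotation, not part of the original Redis source): a small
* sketch that classifies an encoding field following the table above. The
* enum sketch_kind and the function sketch_decode_encoding are hypothetical
* names; the real logic lives in ZIP_ENTRY_ENCODING and ZIP_DECODE_LENGTH.
* As in the real macros, the low 6 bits of a |10......| byte are ignored.

```c
#include <assert.h>
#include <stdint.h>

enum sketch_kind { SK_STR, SK_INT, SK_END, SK_BAD };

// Hypothetical helper. 'p' points at the first encoding byte. On return
// *hdr is the size of the encoding field and *len the payload size, both in
// bytes (len is 0 for immediate integers, the end marker and bad bytes).
static enum sketch_kind sketch_decode_encoding(const unsigned char *p,
                                               uint32_t *hdr, uint32_t *len) {
    unsigned char b = p[0];
    *hdr = 1;
    *len = 0;
    switch (b >> 6) {
    case 0: // |00pppppp|: 6 bit string length
        *len = b & 0x3f;
        return SK_STR;
    case 1: // |01pppppp|qqqqqqqq|: 14 bit length, big endian
        *hdr = 2;
        *len = ((uint32_t)(b & 0x3f) << 8) | p[1];
        return SK_STR;
    case 2: // |10000000| followed by a 32 bit length, big endian
        *hdr = 5;
        *len = ((uint32_t)p[1] << 24) | ((uint32_t)p[2] << 16) |
               ((uint32_t)p[3] << 8) | (uint32_t)p[4];
        return SK_STR;
    }
    // Top two bits are 11: an integer encoding or the end marker.
    if (b == 0xff) return SK_END;
    if (b == 0xc0) { *len = 2; return SK_INT; } // int16_t
    if (b == 0xd0) { *len = 4; return SK_INT; } // int32_t
    if (b == 0xe0) { *len = 8; return SK_INT; } // int64_t
    if (b == 0xf0) { *len = 3; return SK_INT; } // 24 bit signed
    if (b == 0xfe) { *len = 1; return SK_INT; } // 8 bit signed
    if (b >= 0xf1 && b <= 0xfd) return SK_INT;  // immediate: (b & 0x0f) - 1
    return SK_BAD;
}
```

* The first byte alone decides the kind of entry; only the two longer string
* forms need to read further bytes to learn the payload length.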
*
* EXAMPLES OF ACTUAL ZIPLISTS
* ===========================
*
* The following is a ziplist containing the two elements representing
* the strings "2" and "5". It is composed of 15 bytes, that we visually
* split into sections:
*
*  [0f 00 00 00] [0c 00 00 00] [02 00] [00 f3] [02 f6] [ff]
*        |             |          |       |       |     |
*     zlbytes        zltail    entries   "2"     "5"   end
*
* The first 4 bytes represent the number 15, that is the number of bytes
* the whole ziplist is composed of. The second 4 bytes are the offset
* at which the last ziplist entry is found, that is 12, in fact the
* last entry, that is "5", is at offset 12 inside the ziplist.
* The next 16 bit integer represents the number of elements inside the
* ziplist, its value is 2 since there are just two elements inside.
* Finally "00 f3" is the first entry representing the number 2. It is
* composed of the previous entry length, which is zero because this is
* our first entry, and the byte F3 which corresponds to the encoding
* |1111xxxx| with xxxx between 0001 and 1101. We need to remove the "F"
* higher order bits 1111, and subtract 1 from the "3", so the entry value
* is "2". The next entry has a prevlen of 02, since the first entry is
* composed of exactly two bytes. The entry itself, F6, is encoded exactly
* like the first entry, and 6-1 = 5, so the value of the entry is 5.
* Finally the special entry FF signals the end of the ziplist.
*
* Adding another element to the above ziplist with the value "Hello World"
* allows us to show how the ziplist encodes small strings. We'll just show
* the hex dump of the entry itself. Imagine the bytes as following the
* entry that stores "5" in the ziplist above:
*
* [02] [0b] [48 65 6c 6c 6f 20 57 6f 72 6c 64]
*
* The first byte, 02, is the length of the previous entry. The next
* byte represents the encoding in the pattern |00pppppp| that means
* that the entry is a string of length <pppppp>, so 0B means that
* an 11 bytes string follows. From the third byte (48) to the last (64)
* there are just the ASCII characters for "Hello World".
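*
* Illustration (annotation, not part of the original Redis source): a sketch
* that walks the example ziplists above byte by byte. sketch_le32 and
* sketch_walk are hypothetical names. The walker is deliberately limited: it
* assumes trusted, well-formed input and only understands 1-byte prevlen
* fields, immediate integers and |00pppppp| strings, which is all the
* examples use.

```c
#include <assert.h>
#include <stdint.h>

// Read a little endian 32 bit value byte by byte (host-order independent).
static uint32_t sketch_le32(const unsigned char *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

// Hypothetical walker: returns the number of entries seen, or -1 when it
// meets an entry form it does not handle. The values of all immediate
// integer entries are added up in '*sum'.
static int sketch_walk(const unsigned char *zl, long *sum) {
    uint32_t zlbytes = sketch_le32(zl);
    const unsigned char *p = zl + 10;            // skip the 10-byte header
    const unsigned char *end = zl + zlbytes - 1; // position of the FF byte
    int count = 0;
    *sum = 0;
    while (p < end) {
        if (p[0] >= 254) return -1;              // 5-byte prevlen: not handled
        unsigned char enc = p[1];
        if (enc >= 0xf1 && enc <= 0xfd) {        // immediate integer
            *sum += (enc & 0x0f) - 1;
            p += 2;
        } else if ((enc & 0xc0) == 0x00) {       // |00pppppp| string
            p += 2 + (enc & 0x3f);
        } else {
            return -1;
        }
        count++;
    }
    return count;
}
```

* Fed the 15 bytes of the first example, the walker sees two entries whose
* values add up to 7. Appending the "Hello World" entry gives a 28 byte
* ziplist with three entries; its zlbytes, zltail and zllen fields then
* become 28, 14 and 3.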
*
* ----------------------------------------------------------------------------
*
* Copyright (c) 2009-2012, Pieter Noordhuis <pcnoordhuis at gmail dot com>
* Copyright (c) 2009-2017, Salvatore Sanfilippo <antirez at gmail dot com>
* Copyright (c) 2020, Redis Labs, Inc
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are met:
*
* * Redistributions of source code must retain the above copyright notice,
* this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above copyright
* notice, this list of conditions and the following disclaimer in the
* documentation and/or other materials provided with the distribution.
* * Neither the name of Redis nor the names of its contributors may be used
* to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
* AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
* ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE
* LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
* CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
* SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS
* INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN
* CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE)
* ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
* POSSIBILITY OF SUCH DAMAGE.
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <limits.h>
#include "zmalloc.h"
#include "util.h"
#include "ziplist.h"
#include "config.h"
#include "endianconv.h"
#include "redisassert.h"
/* ZIP_END marks the end of the ziplist and occupies its last byte. */
#define ZIP_END 255         /* Special "end of ziplist" entry. */
/* A one-byte prevlen can hold at most 254 - 1 = 253; the value 254 flags a
 * prevlen that is stored in 5 bytes. */
#define ZIP_BIG_PREVLEN 254 /* ZIP_BIG_PREVLEN - 1 is the max number of bytes of
                               the previous entry, for the "prevlen" field prefixing
                               each entry, to be represented with just a single byte.
                               Otherwise it is represented as FE AA BB CC DD, where
                               AA BB CC DD are a 4 bytes unsigned integer
                               representing the previous entry len. */
/* Different encoding/length possibilities */
#define ZIP_STR_MASK 0xc0
#define ZIP_INT_MASK 0x30
#define ZIP_STR_06B (0 << 6)
#define ZIP_STR_14B (1 << 6)
#define ZIP_STR_32B (2 << 6)
#define ZIP_INT_16B (0xc0 | 0<<4)
#define ZIP_INT_32B (0xc0 | 1<<4)
#define ZIP_INT_64B (0xc0 | 2<<4)
#define ZIP_INT_24B (0xc0 | 3<<4)
#define ZIP_INT_8B 0xfe
/* 4 bit integer immediate encoding |1111xxxx| with xxxx between
 * 0001 and 1101. */
#define ZIP_INT_IMM_MASK 0x0f   /* Mask to extract the 4 bits value. Subtract
                                   one from the masked value to reconstruct
                                   the integer. */
#define ZIP_INT_IMM_MIN 0xf1    /* 11110001 */
#define ZIP_INT_IMM_MAX 0xfd    /* 11111101 */
#define INT24_MAX 0x7fffff
#define INT24_MIN (-INT24_MAX - 1)
/* Macro to determine if the entry is a string. String entries never start
* with "11" as most significant bits of the first byte. */
#define ZIP_IS_STR(enc) (((enc) & ZIP_STR_MASK) < ZIP_STR_MASK)
/* Utility macros.*/
/* Return total bytes a ziplist is composed of (the zlbytes field). */
#define ZIPLIST_BYTES(zl)       (*((uint32_t*)(zl)))
/* Return the offset of the last item inside the ziplist (the zltail field,
 * stored right after the 4 bytes of zlbytes). */
#define ZIPLIST_TAIL_OFFSET(zl) (*((uint32_t*)((zl)+sizeof(uint32_t))))
/* Return the length of a ziplist, or UINT16_MAX if the length cannot be
 * determined without scanning the whole ziplist. This is the zllen field,
 * stored after the 4 bytes of zlbytes and the 4 bytes of zltail. */
#define ZIPLIST_LENGTH(zl)      (*((uint16_t*)((zl)+sizeof(uint32_t)*2)))
/* The size of a ziplist header: two 32 bit integers for the total
 * bytes count and last item offset. One 16 bit integer for the number
 * of items field. That is 4 + 4 + 2 = 10 bytes. */
#define ZIPLIST_HEADER_SIZE     (sizeof(uint32_t)*2+sizeof(uint16_t))
/* Size of the "end of ziplist" entry. Just one byte (ZIP_END). */
#define ZIPLIST_END_SIZE        (sizeof(uint8_t))
/* Return the pointer to the first entry of a ziplist, i.e. zl advanced past
 * the header. */
#define ZIPLIST_ENTRY_HEAD(zl)  ((zl)+ZIPLIST_HEADER_SIZE)
/* Return the pointer to the last entry of a ziplist, using the
 * last entry offset inside the ziplist header. */
#define ZIPLIST_ENTRY_TAIL(zl)  ((zl)+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl)))
/* Return the pointer to the last byte of a ziplist, which is, the
 * end of ziplist FF entry, located at zl + zlbytes - 1. */
#define ZIPLIST_ENTRY_END(zl)   ((zl)+intrev32ifbe(ZIPLIST_BYTES(zl))-ZIPLIST_END_SIZE)
/* Diagrams copied from Huang Jianhong's annotated Redis 3.0 source.

Empty ziplist (the values 1011 and 1010 are binary for 11 and 10):

area        |<---- ziplist header ---->|<-- end -->|

size          4 bytes   4 bytes 2 bytes  1 byte
            +---------+--------+-------+-----------+
component   | zlbytes | zltail | zllen | zlend     |
            |         |        |       |           |
value       |  1011   |  1010  |   0   | 1111 1111 |
            +---------+--------+-------+-----------+
                                       ^
                                       |
                               ZIPLIST_ENTRY_HEAD
                                       &
address                        ZIPLIST_ENTRY_TAIL
                                       &
                               ZIPLIST_ENTRY_END

Non-empty ziplist:

area        |<---- ziplist header ---->|<----------- entries ------------->|<-end->|

size          4 bytes   4 bytes 2 bytes    ?        ?        ?        ?     1 byte
            +---------+--------+-------+--------+--------+--------+--------+-------+
component   | zlbytes | zltail | zllen | entry1 | entry2 |  ...   | entryN | zlend |
            +---------+--------+-------+--------+--------+--------+--------+-------+
                                       ^                          ^        ^
address                                |                          |        |
                                ZIPLIST_ENTRY_HEAD                |   ZIPLIST_ENTRY_END
                                                                  |
                                                         ZIPLIST_ENTRY_TAIL
*/
/* Increment the number of items field in the ziplist header. Note that this
 * macro should never overflow the unsigned 16 bit integer, since entries are
 * always pushed one at a time. When UINT16_MAX is reached we want the count
 * to stay there to signal that a full scan is needed to get the number of
 * items inside the ziplist.
 *
 * Depending on the sign of 'incr' this grows or shrinks the entry count:
 * - Adding: incr is always 1, because entries are added one at a time. Once
 *   the count has reached UINT16_MAX it is no longer incremented, since
 *   UINT16_MAX flags that the ziplist must be fully scanned to count it.
 * - Removing: incr is negative. If zllen is already saturated at UINT16_MAX
 *   it is not updated here, even when the real count drops below
 *   UINT16_MAX; ziplistLen refreshes it when it recounts the entries. */
#define ZIPLIST_INCR_LENGTH(zl,incr) { \
    if (intrev16ifbe(ZIPLIST_LENGTH(zl)) < UINT16_MAX) \
        ZIPLIST_LENGTH(zl) = intrev16ifbe(intrev16ifbe(ZIPLIST_LENGTH(zl))+incr); \
}
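/* Illustration (annotation, not part of the original source): the saturating
 * behaviour of ZIPLIST_INCR_LENGTH, sketched as a plain function on a
 * host-order counter. sketch_incr_length is a hypothetical name, and the
 * endianness conversion done by the real macro is left out on purpose. The
 * block is kept inside this comment so it does not affect compilation.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: apply 'incr' unless the counter is already saturated.
static uint16_t sketch_incr_length(uint16_t zllen, int incr) {
    if (zllen < UINT16_MAX) return (uint16_t)(zllen + incr);
    return zllen; // stays at UINT16_MAX until a full recount
}
```

 * Once the counter reaches UINT16_MAX it stays there, even for a negative
 * 'incr'; only a full scan can bring it back to an exact value. */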
/* Don't let ziplists grow over 1GB in any case, don't wanna risk overflow in
 * zlbytes */
#define ZIPLIST_MAX_SAFETY_SIZE (1<<30)
/* Return 1 if 'add' more bytes can safely be added to 'zl' (which may be
 * NULL) without exceeding ZIPLIST_MAX_SAFETY_SIZE, 0 otherwise. */
int ziplistSafeToAdd(unsigned char* zl, size_t add) {
    size_t len = zl? ziplistBlobLen(zl): 0;
    if (len + add > ZIPLIST_MAX_SAFETY_SIZE)
        return 0;
    return 1;
}
/* We use this struct to receive information about a ziplist entry.
 * Note that this is not how the data is actually encoded, it is just what a
 * function fills in for us in order to operate more easily. It is a key
 * structure for the functions that follow. */
typedef struct zlentry {
    unsigned int prevrawlensize; /* Bytes used to encode the previous entry len*/
    unsigned int prevrawlen;     /* Previous entry len. */
    unsigned int lensize;        /* Bytes used to encode this entry type/len.
                                    For example strings have a 1, 2 or 5 bytes
                                    header. Integers always use a single byte.*/
    unsigned int len;            /* Bytes used to represent the actual entry.
                                    For strings this is just the string length
                                    while for integers it is 1, 2, 3, 4, 8 or
                                    0 (for 4 bit immediate) depending on the
                                    number range. */
    unsigned int headersize;     /* prevrawlensize + lensize, i.e. the bytes
                                    taken by <prevlen> plus <encoding>. */
    unsigned char encoding;      /* Set to ZIP_STR_* or ZIP_INT_* depending on
                                    the entry encoding. However for 4 bits
                                    immediate integers this can assume a range
                                    of values and must be range-checked. */
    unsigned char *p;            /* Pointer to the very start of the entry, that
                                    is, this points to prev-entry-len field. */
} zlentry;
#define ZIPLIST_ENTRY_ZERO(zle) { \
    (zle)->prevrawlensize = (zle)->prevrawlen = 0; \
    (zle)->lensize = (zle)->len = (zle)->headersize = 0; \
    (zle)->encoding = 0; \
    (zle)->p = NULL; \
}
/* Extract the encoding from the byte pointed by 'ptr' and set it into
 * 'encoding' field of the zlentry structure. For string entries only the two
 * type bits are kept. This is an O(1) operation. */
#define ZIP_ENTRY_ENCODING(ptr, encoding) do { \
    (encoding) = ((ptr)[0]); \
    if ((encoding) < ZIP_STR_MASK) (encoding) &= ZIP_STR_MASK; \
} while(0)
#define ZIP_ENCODING_SIZE_INVALID 0xff
/* Return the number of bytes required to encode the entry type + length.
 * On error, return ZIP_ENCODING_SIZE_INVALID */
static inline unsigned int zipEncodingLenSize(unsigned char encoding) {
    if (encoding == ZIP_INT_16B || encoding == ZIP_INT_32B ||
        encoding == ZIP_INT_24B || encoding == ZIP_INT_64B ||
        encoding == ZIP_INT_8B)
        return 1;
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX)
        return 1;
    if (encoding == ZIP_STR_06B)
        return 1;
    if (encoding == ZIP_STR_14B)
        return 2;
    if (encoding == ZIP_STR_32B)
        return 5;
    return ZIP_ENCODING_SIZE_INVALID;
}
#define ZIP_ASSERT_ENCODING(encoding) do { \
    assert(zipEncodingLenSize(encoding) != ZIP_ENCODING_SIZE_INVALID); \
} while (0)
/* Return bytes needed to store integer encoded by 'encoding' */
static inline unsigned int zipIntSize(unsigned char encoding) {
    switch(encoding) {
    case ZIP_INT_8B:  return 1;
    case ZIP_INT_16B: return 2;
    case ZIP_INT_24B: return 3;
    case ZIP_INT_32B: return 4;
    case ZIP_INT_64B: return 8;
    }
    if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX)
        return 0; /* 4 bit immediate */
    /* bad encoding, covered by a previous call to ZIP_ASSERT_ENCODING */
    redis_unreachable();
    return 0;
}
/* Write the encoding header of the entry in 'p'. If p is NULL it just returns
 * the amount of bytes required to encode such a length. Arguments:
 *
 * 'encoding' is the encoding we are using for the entry. It could be
 * ZIP_INT_* or ZIP_STR_* or between ZIP_INT_IMM_MIN and ZIP_INT_IMM_MAX
 * for single-byte small immediate integers.
 *
 * 'rawlen' is only used for ZIP_STR_* encodings and is the length of the
 * string that this entry represents.
 *
 * The function returns the number of bytes used by the encoding/length
 * header stored in 'p'. */
unsigned int zipStoreEntryEncoding(unsigned char *p, unsigned char encoding, unsigned int rawlen) {
    unsigned char len = 1, buf[5];
    if (ZIP_IS_STR(encoding)) {
        /* Although encoding is given it may not be set for strings,
         * so we determine it here using the raw length. */
        if (rawlen <= 0x3f) {
            if (!p) return len;
            buf[0] = ZIP_STR_06B | rawlen;
        } else if (rawlen <= 0x3fff) {
            len += 1;
            if (!p) return len;
            buf[0] = ZIP_STR_14B | ((rawlen >> 8) & 0x3f);
            buf[1] = rawlen & 0xff;
        } else {
            len += 4;
            if (!p) return len;
            buf[0] = ZIP_STR_32B;
            buf[1] = (rawlen >> 24) & 0xff;
            buf[2] = (rawlen >> 16) & 0xff;
            buf[3] = (rawlen >> 8) & 0xff;
            buf[4] = rawlen & 0xff;
        }
    } else {
        /* Implies integer encoding, so length is always 1. */
        if (!p) return len;
        buf[0] = encoding;
    }
    /* Store this length at p. */
    memcpy(p,buf,len);
    return len;
}
/* Decode the entry encoding type and data length (string length for strings,
 * number of bytes used for the integer for integer entries) encoded in 'ptr'.
 * The 'encoding' variable is input, extracted by the caller, the 'lensize'
 * variable will hold the number of bytes required to encode the entry
 * length, and the 'len' variable will hold the entry length.
 * On invalid encoding error, lensize is set to 0. */
#define ZIP_DECODE_LENGTH(ptr, encoding, lensize, len) do { \
    if ((encoding) < ZIP_STR_MASK) { \
        if ((encoding) == ZIP_STR_06B) { \
            (lensize) = 1; \
            (len) = (ptr)[0] & 0x3f; \
        } else if ((encoding) == ZIP_STR_14B) { \
            (lensize) = 2; \
            (len) = (((ptr)[0] & 0x3f) << 8) | (ptr)[1]; \
        } else if ((encoding) == ZIP_STR_32B) { \
            (lensize) = 5; \
            (len) = ((uint32_t)(ptr)[1] << 24) | \
                    ((uint32_t)(ptr)[2] << 16) | \
                    ((uint32_t)(ptr)[3] <<  8) | \
                    ((uint32_t)(ptr)[4]); \
        } else { \
            (lensize) = 0; /* bad encoding, should be covered by a previous */ \
            (len) = 0;     /* ZIP_ASSERT_ENCODING / zipEncodingLenSize, or */ \
                           /* match the lensize after this macro with 0. */ \
        } \
    } else { \
        (lensize) = 1; \
        if ((encoding) == ZIP_INT_8B)  (len) = 1; \
        else if ((encoding) == ZIP_INT_16B) (len) = 2; \
        else if ((encoding) == ZIP_INT_24B) (len) = 3; \
        else if ((encoding) == ZIP_INT_32B) (len) = 4; \
        else if ((encoding) == ZIP_INT_64B) (len) = 8; \
        else if ((encoding) >= ZIP_INT_IMM_MIN && (encoding) <= ZIP_INT_IMM_MAX) \
            (len) = 0; /* 4 bit immediate */ \
        else \
            (lensize) = (len) = 0; /* bad encoding */ \
    } \
} while(0)
/* Encode the length of the previous entry and write it to "p". This only
 * uses the larger encoding (required in __ziplistCascadeUpdate).
 *
 * Call this when prevlen must be stored in 5 bytes. That happens in two
 * situations:
 * 1. The length really needs 5 bytes, i.e. len > ZIP_BIG_PREVLEN - 1.
 * 2. One byte would be enough, but 5 bytes are kept anyway (see the cascade
 *    update code later on). This occurs when an entry already stores its
 *    prevlen in 5 bytes and the previous entry then shrinks because of an
 *    update or a delete. The 4 spare bytes could in theory be reclaimed,
 *    but the field is not shrunk in order to avoid further cascading
 *    updates. */
int zipStorePrevEntryLengthLarge(unsigned char *p, unsigned int len) {
    uint32_t u32;
    if (p != NULL) {
        /* Flag the 5-byte form with ZIP_BIG_PREVLEN (254) in the first byte. */
        p[0] = ZIP_BIG_PREVLEN;
        /* Write len into the following four bytes. */
        u32 = len;
        memcpy(p+1,&u32,sizeof(u32));
        memrev32ifbe(p+1);
    }
    /* Return the number of bytes used to encode len, which is always 5. */
    return 1 + sizeof(uint32_t);
}
/* Encode the length of the previous entry and write it to "p". Return the
 * number of bytes needed to encode this length if "p" is NULL (1 or 5). */
unsigned int zipStorePrevEntryLength(unsigned char *p, unsigned int len) {
    if (p == NULL) {
        /* Only report the size: 1 byte when len < ZIP_BIG_PREVLEN (254),
         * 5 bytes otherwise. */
        return (len < ZIP_BIG_PREVLEN) ? 1 : sizeof(uint32_t) + 1;
    } else {
        if (len < ZIP_BIG_PREVLEN) {
            /* len < ZIP_BIG_PREVLEN (254): store it in a single byte. */
            p[0] = len;
            return 1;
        } else {
            /* len >= ZIP_BIG_PREVLEN (254): store it in 5 bytes. */
            return zipStorePrevEntryLengthLarge(p,len);
        }
    }
}
/* Return the number of bytes used to encode the length of the previous
 * entry. The length is returned by setting the var 'prevlensize'.
 * The decision only looks at (ptr)[0] < ZIP_BIG_PREVLEN, so it is O(1). */
#define ZIP_DECODE_PREVLENSIZE(ptr, prevlensize) do { \
    if ((ptr)[0] < ZIP_BIG_PREVLEN) { \
        (prevlensize) = 1; \
    } else { \
        (prevlensize) = 5; \
    } \
} while(0)
/* Return the length of the previous element, and the number of bytes that
 * are used in order to encode the previous element length.
 * 'ptr' must point to the prevlen prefix of an entry (that encodes the
 * length of the previous entry in order to navigate the elements backward).
 * The length of the previous entry is stored in 'prevlen', the number of
 * bytes needed to encode the previous entry length are stored in
 * 'prevlensize'.
 *
 * With the 1-byte form, (ptr)[0] is prevlen itself. With the 5-byte form,
 * prevlen is rebuilt from the four bytes that follow the marker byte. */
#define ZIP_DECODE_PREVLEN(ptr, prevlensize, prevlen) do { \
    ZIP_DECODE_PREVLENSIZE(ptr, prevlensize); \
    if ((prevlensize) == 1) { \
        (prevlen) = (ptr)[0]; \
    } else { /* prevlensize == 5 */ \
        (prevlen) = ((ptr)[4] << 24) | \
                    ((ptr)[3] << 16) | \
                    ((ptr)[2] <<  8) | \
                    ((ptr)[1]); \
    } \
} while(0)
/* Given a pointer 'p' to the prevlen info that prefixes an entry, this
 * function returns the difference in number of bytes needed to encode
 * the prevlen if the previous entry changes of size.
 *
 * So if A is the number of bytes used right now to encode the 'prevlen'
 * field.
 *
 * And B is the number of bytes that are needed in order to encode the
 * 'prevlen' if the previous element will be updated to one of size 'len'.
 *
 * Then the function returns B - A
 *
 * So the function returns a positive number if more space is needed,
 * a negative number if less space is needed, or zero if the same space
 * is needed.
 *
 * With the current design there are only three possible results:
 * 1.  0: the space fits exactly (both sizes are 1, or both are 5).
 * 2. +4: the field currently uses 1 byte and the new len needs 5, so the
 *        entry must grow or prevlen cannot be stored.
 * 3. -4: the field currently uses 5 bytes and the new len fits in 1, so
 *        4 bytes could be reclaimed if the entry were shrunk. */
int zipPrevLenByteDiff(unsigned char *p, unsigned int len) {
    unsigned int prevlensize;
    ZIP_DECODE_PREVLENSIZE(p, prevlensize);
    return zipStorePrevEntryLength(NULL, len) - prevlensize;
}
/* Check if string pointed to by 'entry' can be encoded as an integer.
 * Stores the integer value in 'v' and its encoding in 'encoding'. */
int zipTryEncoding(unsigned char *entry, unsigned int entrylen, long long *v, unsigned char *encoding) {
    long long value;
    if (entrylen >= 32 || entrylen == 0) return 0;
    if (string2ll((char*)entry,entrylen,&value)) {
        /* Great, the string can be encoded. Check what's the smallest
         * of our encoding types that can hold this value. */
        if (value >= 0 && value <= 12) {
            *encoding = ZIP_INT_IMM_MIN+value;
        } else if (value >= INT8_MIN && value <= INT8_MAX) {
            *encoding = ZIP_INT_8B;
        } else if (value >= INT16_MIN && value <= INT16_MAX) {
            *encoding = ZIP_INT_16B;
        } else if (value >= INT24_MIN && value <= INT24_MAX) {
            *encoding = ZIP_INT_24B;
        } else if (value >= INT32_MIN && value <= INT32_MAX) {
            *encoding = ZIP_INT_32B;
        } else {
            *encoding = ZIP_INT_64B;
        }
        *v = value;
        return 1;
    }
    return 0;
}
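/* Illustration (annotation, not part of the original source): the "smallest
 * encoding that fits" ladder of zipTryEncoding, sketched on an already parsed
 * integer. sketch_pick_int_encoding is a hypothetical name; the returned
 * bytes are the ZIP_INT_* values defined above, written as literals so the
 * sketch stands alone. The block is kept inside this comment so it does not
 * affect compilation.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper: return the encoding byte for 'v', trying the
// narrowest representation first.
static unsigned char sketch_pick_int_encoding(int64_t v) {
    if (v >= 0 && v <= 12) return (unsigned char)(0xf1 + v); // immediate
    if (v >= INT8_MIN && v <= INT8_MAX) return 0xfe;         // ZIP_INT_8B
    if (v >= INT16_MIN && v <= INT16_MAX) return 0xc0;       // ZIP_INT_16B
    if (v >= -8388608 && v <= 8388607) return 0xf0;          // ZIP_INT_24B
    if (v >= INT32_MIN && v <= INT32_MAX) return 0xd0;       // ZIP_INT_32B
    return 0xe0;                                             // ZIP_INT_64B
}
```

 * Values 0 to 12 cost no payload bytes at all; 13 is the first value that
 * needs a one-byte payload. */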
/* Store integer 'value' at 'p', encoded as 'encoding' */
void zipSaveInteger(unsigned char *p, int64_t value, unsigned char encoding) {
    int16_t i16;
    int32_t i32;
    int64_t i64;
    if (encoding == ZIP_INT_8B) {
        ((int8_t*)p)[0] = (int8_t)value;
    } else if (encoding == ZIP_INT_16B) {
        i16 = value;
        memcpy(p,&i16,sizeof(i16));
        memrev16ifbe(p);
    } else if (encoding == ZIP_INT_24B) {
        i32 = ((uint64_t)value)<<8;
        memrev32ifbe(&i32);
        memcpy(p,((uint8_t*)&i32)+1,sizeof(i32)-sizeof(uint8_t));
    } else if (encoding == ZIP_INT_32B) {
        i32 = value;
        memcpy(p,&i32,sizeof(i32));
        memrev32ifbe(p);
    } else if (encoding == ZIP_INT_64B) {
        i64 = value;
        memcpy(p,&i64,sizeof(i64));
        memrev64ifbe(p);
    } else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
        /* Nothing to do, the value is stored in the encoding itself. */
    } else {
        assert(NULL);
    }
}
/* Read integer encoded as 'encoding' from 'p' */
int64_t zipLoadInteger(unsigned char *p, unsigned char encoding) {
int16_t i16;
int32_t i32;
int64_t i64, ret = 0;
if (encoding == ZIP_INT_8B) {
ret = ((int8_t*)p)[0];
} else if (encoding == ZIP_INT_16B) {
memcpy(&i16,p,sizeof(i16));
memrev16ifbe(&i16);
ret = i16;
} else if (encoding == ZIP_INT_32B) {
memcpy(&i32,p,sizeof(i32));
memrev32ifbe(&i32);
ret = i32;
} else if (encoding == ZIP_INT_24B) {
i32 = 0;
memcpy(((uint8_t*)&i32)+1,p,sizeof(i32)-sizeof(uint8_t));
memrev32ifbe(&i32);
ret = i32>>8;
} else if (encoding == ZIP_INT_64B) {
memcpy(&i64,p,sizeof(i64));
memrev64ifbe(&i64);
ret = i64;
} else if (encoding >= ZIP_INT_IMM_MIN && encoding <= ZIP_INT_IMM_MAX) {
ret = (encoding & ZIP_INT_IMM_MASK)-1;
} else {
assert(NULL);
}
return ret;
}
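/* Worked example of the 24 bit encoding (a sketch, assuming a little endian
* host where memrev32ifbe is a no-op, and a compiler that performs an
* arithmetic right shift on negative values):
*
*   zipSaveInteger(p, -2, ZIP_INT_24B):
*     i32 = ((uint64_t)-2)<<8          -> 0xfffffe00, in memory: 00 fe ff ff
*     memcpy of bytes 1..3 into p      -> p holds: fe ff ff
*
*   zipLoadInteger(p, ZIP_INT_24B):
*     i32 = 0, memcpy into bytes 1..3  -> in memory: 00 fe ff ff (0xfffffe00)
*     ret = i32>>8                     -> -2, the shift restores the sign
*/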
/* Fills a struct with all information about an entry.
* This function is the "unsafe" alternative to the one below.
* Generally, all functions that return a pointer to an element in the ziplist
* will assert that this element is valid, so it can be freely used.
* Functions such as ziplistGet assume the input pointer is already
* validated (since it's the return value of another function). */
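/* Decode the raw bytes of a ziplist entry into a zlentry struct. The struct is not the storage layout of an entry; it is only a convenient representation for inspecting and manipulating it. */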
static inline void zipEntry(unsigned char *p, zlentry *e) {
/* Decode the prevlen fields of the entry starting at p. */
ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
/* The first byte at p+prevrawlensize holds the entry's encoding. */
ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
/* Decode lensize and len from p+prevrawlensize according to the encoding. */
ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
assert(e->lensize != 0); /* check that encoding was valid. */
/* headersize = prevrawlensize + lensize, i.e. the number of bytes taken by the entry's prevlen and encoding fields. */
e->headersize = e->prevrawlensize + e->lensize;
/* Remember the start of the entry in zlentry->p. */
e->p = p;
}
/* Fills a struct with all information about an entry.
* This function is safe to use on untrusted pointers, it'll make sure not to
* try to access memory outside the ziplist payload.
* Returns 1 if the entry is valid, and 0 otherwise. */
/* Safe version of zipEntry above: it validates that no memory outside the ziplist is accessed. */
static inline int zipEntrySafe(unsigned char* zl, size_t zlbytes, unsigned char *p, zlentry *e, int validate_prevlen) {
unsigned char *zlfirst = zl + ZIPLIST_HEADER_SIZE;
unsigned char *zllast = zl + zlbytes - ZIPLIST_END_SIZE;
#define OUT_OF_RANGE(p) (unlikely((p) < zlfirst || (p) > zllast))
/* If there's no possibility for the header to reach outside the ziplist,
* take the fast path. (max lensize and prevrawlensize are both 5 bytes) */
if (p >= zlfirst && p + 10 < zllast) {
ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
e->headersize = e->prevrawlensize + e->lensize;
e->p = p;
/* We didn't call ZIP_ASSERT_ENCODING, so we check whether lensize was set to 0. */
if (unlikely(e->lensize == 0))
return 0;
/* Make sure the entry doesn't reach outside the edge of the ziplist */
if (OUT_OF_RANGE(p + e->headersize + e->len))
return 0;
/* Make sure prevlen doesn't reach outside the edge of the ziplist */
if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen))
return 0;
return 1;
}
/* Make sure the pointer doesn't reach outside the edge of the ziplist */
if (OUT_OF_RANGE(p))
return 0;
/* Make sure the encoded prevlen header doesn't reach outside the allocation */
ZIP_DECODE_PREVLENSIZE(p, e->prevrawlensize);
if (OUT_OF_RANGE(p + e->prevrawlensize))
return 0;
/* Make sure encoded entry header is valid. */
ZIP_ENTRY_ENCODING(p + e->prevrawlensize, e->encoding);
e->lensize = zipEncodingLenSize(e->encoding);
if (unlikely(e->lensize == ZIP_ENCODING_SIZE_INVALID))
return 0;
/* Make sure the encoded entry header doesn't reach outside the allocation */
if (OUT_OF_RANGE(p + e->prevrawlensize + e->lensize))
return 0;
/* Decode the prevlen and entry len headers. */
ZIP_DECODE_PREVLEN(p, e->prevrawlensize, e->prevrawlen);
ZIP_DECODE_LENGTH(p + e->prevrawlensize, e->encoding, e->lensize, e->len);
e->headersize = e->prevrawlensize + e->lensize;
/* Make sure the entry doesn't reach outside the edge of the ziplist */
if (OUT_OF_RANGE(p + e->headersize + e->len))
return 0;
/* Make sure prevlen doesn't reach outside the edge of the ziplist */
if (validate_prevlen && OUT_OF_RANGE(p - e->prevrawlen))
return 0;
e->p = p;
return 1;
#undef OUT_OF_RANGE
}
/* Return the total number of bytes used by the entry pointed to by 'p'. */
static inline unsigned int zipRawEntryLengthSafe(unsigned char* zl, size_t zlbytes, unsigned char *p) {
zlentry e;
assert(zipEntrySafe(zl, zlbytes, p, &e, 0));
return e.headersize + e.len;
}
/* Return the total number of bytes used by the entry pointed to by 'p'. */
/* Unlike zipRawEntryLengthSafe, this performs no bounds checks: 'p' must already point to a valid entry. */
static inline unsigned int zipRawEntryLength(unsigned char *p) {
zlentry e;
zipEntry(p, &e);
return e.headersize + e.len;
}
/* Validate that the entry doesn't reach outside the ziplist allocation. */
static inline void zipAssertValidEntry(unsigned char* zl, size_t zlbytes, unsigned char *p) {
zlentry e;
assert(zipEntrySafe(zl, zlbytes, p, &e, 1));
}
/* Create a new empty ziplist. */
/* The empty ziplist contains only <zlbytes><zltail><zllen><zlend>. */
unsigned char *ziplistNew(void) {
/* Header: two uint32_t plus one uint16_t, i.e. zlbytes(4) + zltail(4) + zllen(2) = 10 bytes.
* End marker: one uint8_t (zlend), 1 byte.
* An empty ziplist is therefore 11 bytes in total. */
unsigned int bytes = ZIPLIST_HEADER_SIZE+ZIPLIST_END_SIZE;
/* Allocate memory for the ziplist. */
unsigned char *zl = zmalloc(bytes);
/* zlbytes: store the total number of bytes of the ziplist.
* The field sits at the very start of the ziplist and is a fixed 4 bytes wide, so a ziplist can be at most (2^32)-1 bytes long. */
ZIPLIST_BYTES(zl) = intrev32ifbe(bytes);
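/* zltail: store the offset of the tail entry. For a freshly created ziplist
* this equals ZIPLIST_HEADER_SIZE, so it points at zlend. Keeping this offset is what lets push and pop at the tail locate the last entry in O(1). */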
ZIPLIST_TAIL_OFFSET(zl) = intrev32ifbe(ZIPLIST_HEADER_SIZE);
/* zllen: store the number of entries, initially 0. */
ZIPLIST_LENGTH(zl) = 0;
/* zlend: set the last byte to ZIP_END to mark the end of the ziplist. */
zl[bytes-1] = ZIP_END;
return zl;
}
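/* Byte layout of the ziplist returned above (a sketch, assuming
* ZIP_END == 255 as defined earlier in this file; the multi-byte fields are
* written little endian on every host because of intrev32ifbe):
*
*   [0b 00 00 00] [0a 00 00 00] [00 00] [ff]
*      zlbytes       zltail      zllen  zlend
*
* zlbytes is 11, zltail is 10 (the offset of zlend, since there are no
* entries yet) and zllen is 0. */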
/* Resize the ziplist. */
/* Resize the ziplist allocation to 'len' bytes. */
unsigned char *ziplistResize(unsigned char *zl, size_t len) {
assert(len < UINT32_MAX);
/* Reallocate zl. If len is larger than the old size, the existing contents are preserved. */
zl = zrealloc(zl,len);
/* Update zlbytes. */
ZIPLIST_BYTES(zl) = intrev32ifbe(len);
/* Write ZIP_END into the new last byte. */
zl[len-1] = ZIP_END;
return zl;
}
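/* This is the cascade update function as it was before version 6.2; the 6.2 version
* (still current as of 7.0) follows right after it. Comparing the two shows how the function was optimized. */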
/* When an entry is inserted, we need to set the prevlen field of the next
* entry to equal the length of the inserted entry. It can occur that this
* length cannot be encoded in 1 byte and the next entry needs to grow
* a bit larger to hold the 5-byte encoded prevlen. This can be done for free,
* because this only happens when an entry is already being inserted (which
* causes a realloc and memmove). However, encoding the prevlen may require
* that this entry is grown as well. This effect may cascade throughout
* the ziplist when there are consecutive entries with a size close to
* ZIP_BIG_PREVLEN, so we need to check that the prevlen can be encoded in
* every consecutive entry.
*
* Note that this effect can also happen in reverse, where the bytes required
* to encode the prevlen field can shrink. This effect is deliberately ignored,
* because it can cause a "flapping" effect where a chain of prevlen fields is
* first grown and then shrunk again after consecutive inserts. Rather, the
* field is allowed to stay larger than necessary, because a large prevlen
* field implies the ziplist is holding large entries anyway.
*
* The pointer "p" points to the first entry that does NOT need to be
* updated, i.e. consecutive fields MAY need an update. */
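/* When a new entry is inserted before an existing entry, the header of the existing
* entry (the successor of the new entry) may be too small to store the new entry's
* length in its prevlen field, so the successor has to grow.
* Growing the successor may in turn force its own successor to grow.
* This happens when several consecutive entries have lengths close to
* ZIP_BIG_PREVLEN == 254, see #9218.
*
* The reverse is also possible: when an entry shrinks, the prevlen field of the
* entry after it could shrink from 5 bytes to 1.
* To avoid repeated grow -> shrink -> grow -> shrink cycles and the cascade updates
* they would cause, shrinking is not performed: prevlen stays 5 bytes wide even
* when the value would fit in 1 byte.
*
* This function checks and fixes the entries that follow p. The entry at p itself
* is not touched, because it was already updated before being passed in. */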
unsigned char *__ziplistCascadeUpdate_before_62(unsigned char *zl, unsigned char *p) {
/* curlen is the current total number of bytes of the ziplist. */
size_t curlen = intrev32ifbe(ZIPLIST_BYTES(zl)), rawlen, rawlensize;
size_t offset, noffset, extra;
unsigned char *np;
zlentry cur, next;
/* Loop until the end of the ziplist is reached. */
while (p[0] != ZIP_END) {
/* Decode the entry at p into cur. */
zipEntry(p, &cur);
/* Bytes used by the current entry: rawlen = prevrawlensize + lensize + len (payload). */
rawlen = cur.headersize + cur.len;
/* Number of bytes needed to encode rawlen as a prevlen field. */
rawlensize = zipStorePrevEntryLength(NULL,rawlen);
/* Abort if there is no next entry. */
/* First stop condition of the cascade update:
* there is no next entry. */
if (p[rawlen] == ZIP_END) break;
/* Decode the entry that follows, at p+rawlen, into next. */
zipEntry(p+rawlen, &next);
/* Abort when "prevlen" has not changed. */
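/* Second stop condition of the cascade update:
* next.prevrawlen already equals rawlen, i.e. next already records
* the exact length of the entry at p. Nothing needs to change here,
* so the entries after it need no update either. */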
if (next.prevrawlen == rawlen) break;
if (next.prevrawlensize < rawlensize) {
/* The "prevlen" field of "next" needs more bytes to hold
* the raw length of "cur". */
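/* The prevlen field of next is narrower than the bytes needed to encode the length of the entry at p, so the header of next must grow.
* Save the offset of p so it can be located again after the reallocation. */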
offset = p-zl;
/* Number of extra bytes needed; this is the same quantity as nextdiff in the insert code. */
extra = rawlensize-next.prevrawlensize;
/* Grow the ziplist allocation by extra bytes. */
zl = ziplistResize(zl,curlen+extra);
/* Restore p from the saved offset. */
p = zl+offset;
/* Current pointer and offset for next element. */
/* np points to next at its address in the reallocated ziplist. */
np = p+rawlen;
/* noffset is the offset of next from the start of the ziplist. */
noffset = np-zl;
/* Update tail offset when next element is not the tail element. */
/* If next is itself the tail entry, its start address does not move, so the tail offset stays the same; otherwise it shifts by extra. */
if ((zl+intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))) != np) {
ZIPLIST_TAIL_OFFSET(zl) =
intrev32ifbe(intrev32ifbe(ZIPLIST_TAIL_OFFSET(zl))+extra);
}
/* Move the tail to the back. */
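/* Move curlen-noffset-next.prevrawlensize-1 bytes from np+next.prevrawlensize to np+rawlensize.
* This shifts everything after next's old prevlen field (except the final ZIP_END byte) backwards, making room for the wider prevlen field. */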
memmove(np+rawlensize,
np+next.prevrawlensize,
curlen-noffset-next.prevrawlensize-1);
/* Re-encode the prevlen field of next with rawlen, which updates both its prevrawlen and prevrawlensize. */
zipStorePrevEntryLength(np,rawlen);
/* Advance the cursor */
/* Advance p to next; the following iteration handles the entry after next. */