@@ -396,42 +396,23 @@ for a file size:
Unfortunately, we're not quite done. The popcount function is non-injective,
so we can only find the file size from the block index, not the other way
- around. However, we can guess and correct. Consider an n' block index that
- is greater than n, we can find one pretty easily:
+ around. However, we can solve for an n' block index that is greater than n
+ with an error bounded by the range of the popcount function. We can then
+ repeatedly substitute this n' into the original equation until the error
+ is smaller than the integer division. As it turns out, we only need to
+ perform this substitution once. Now we directly calculate our block index:

- ![summation3step1](https://latex.codecogs.com/svg.latex?n%27%20%3D%20%5Cleft%5Clfloor%5Cfrac%7BN%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%5Crfloor)
+ ![formulaforn](https://latex.codecogs.com/svg.latex?n%20%3D%20%5Cleft%5Clfloor%5Cfrac%7BN-%5Cfrac%7Bw%7D%7B8%7D%5Cleft%28%5Ctext%7Bpopcount%7D%5Cleft%28%5Cfrac%7BN%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D-1%5Cright%29&plus;2%5Cright%29%7D%7BB-2%5Cfrac%7Bw%7D%7B8%7D%7D%5Cright%5Crfloor)

- where:
- n' >= n
-
- We can plug n' back into our popcount equation to find an N' file size that
- is greater than N. However, we need to rearrange our terms a bit to avoid
- integer overflow:
-
- ![summation3step2](https://latex.codecogs.com/svg.latex?N%27%20%3D%20%28B-2%5Cfrac%7Bw%7D%7B8%7D%29n%27&plus;%5Cfrac%7Bw%7D%7B8%7D%5Ctext%7Bpopcount%7D%28n%27%29)
-
- where:
- N' >= N
-
- Now that we have N', we can find our block offset:
-
- ![summation3step3](https://latex.codecogs.com/svg.latex?%5Cmathit%7Boff%7D%27%20%3D%20N%20-%20N%27)
-
- where:
- off' >= off, our byte offset in the block
-
- Now we're getting somewhere. N' is greater than or equal to N, and as long as
- the number of pointers per block is bounded by the block size, it can only be
- different by at most one block. So we have two cases that can be determined by
- the sign of off'. If off' is negative, we correct n' and add a block to off'.
- Note that we also need to incorporate the overhead of the last block to get
- the right offset.
+ Now that we have our block index n, we can just plug it back into the above
+ equation to find the offset. However, we do need to rearrange the equation
+ a bit to avoid integer overflow:

- ![summation3step4](https://latex.codecogs.com/svg.latex?n%2C%20%5Cmathit%7Boff%7D%20%3D%20%5Cbegin%7Bcases%7D%20n%27-1%2C%20%5Cmathit%7Boff%7D%27&plus;B%20%26%20%5Cmathit%7Boff%7D%27%20%3C%200%20%5C%5C%20n%27%2C%20%5Cmathit%7Boff%7D%27&plus;%5Cfrac%7Bw%7D%7B8%7D%5Cleft%5B%5Ctext%7Bctz%7D%28n%27%29&plus;1%5Cright%5D%20%26%20%5Cmathit%7Boff%7D%27%20%5Cgeq%200%20%5Cend%7Bcases%7D)
+ ![formulaforoff](https://latex.codecogs.com/svg.latex?%5Cmathit%7Boff%7D%20%3D%20N%20-%20%5Cleft%28B-2%5Cfrac%7Bw%7D%7B8%7D%5Cright%29n%20-%20%5Cfrac%7Bw%7D%7B8%7D%5Ctext%7Bpopcount%7D%28n%29)

- It's a lot of math, but computers are very good at math. With these equations
- we can solve for the block index + offset while only needed to store the file
- size in O(1).
+ The solution involves quite a bit of math, but computers are very good at math.
+ We can now solve for the block index + offset while only needing to store the
+ file size in O(1).
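
As an illustration of the two added formulas, here is a minimal sketch in C of how the block index and offset might be computed from a file position. It is not taken from this commit: the names `popcount32` and `ctz_index` are hypothetical, the word width is assumed to be w = 32 (so w/8 = 4 bytes per pointer), and the block size is assumed to be large enough that the pointers never fill a block.

```c
#include <assert.h>
#include <stdint.h>

/* Portable popcount; a real build might use a compiler builtin instead. */
static uint32_t popcount32(uint32_t x) {
    uint32_t count = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        count++;
    }
    return count;
}

/*
 * Hypothetical helper: map a file position `pos` (N in the text) to a block
 * index n and a byte offset inside that block. Assumes w = 32.
 */
static uint32_t ctz_index(uint32_t block_size, uint32_t pos, uint32_t *off) {
    const uint32_t ptr = 4;            /* w/8 */
    assert(block_size > 2 * ptr);
    uint32_t b = block_size - 2 * ptr; /* B - 2w/8 */

    /* First estimate n' = floor(N / (B - 2w/8)); block 0 holds no pointers. */
    uint32_t n = pos / b;
    if (n == 0) {
        *off = pos;
        return 0;
    }

    /* Substitute n' once to get the block index n. */
    n = (pos - ptr * (popcount32(n - 1) + 2)) / b;

    /* Rearranged so that no intermediate term exceeds pos. */
    *off = pos - b * n - ptr * popcount32(n);
    return n;
}
```

With a 512-byte block, position 512 maps to block 1 at offset 4: block 0 holds the first 512 bytes, and the data in block 1 starts right after its single 4-byte pointer. The early return for n' = 0 is an assumption of this sketch; it avoids evaluating popcount(n' - 1) on an unsigned value that would wrap around.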
Here is what it might look like to update a file stored with a CTZ skip-list:
```