Commit b9affae

Fix Some Typos in Teddy Documentation
1 parent a1cda64 commit b9affae

1 file changed

+42
-38
lines changed

src/simd_accel/teddy128.rs

@@ -48,7 +48,7 @@ number of short patterns is far more likely.
4848
Faro and Kulekci published another paper [4b] that is conceptually very similar
4949
to [4a]. The key difference is that it uses the CRC32 instruction (introduced
5050
as part of SSE 4.2) to compute fingerprint values. This also enables the
51-
algorithm to work effectively on substrings as short at 7 bytes with 4 byte
51+
algorithm to work effectively on substrings as short as 7 bytes with 4 byte
5252
windows. 7 bytes is unfortunately still too long. The window could be
5353
technically shrunk to 2 bytes, thereby reducing minimum length to 3, but the
5454
small window size ends up negating most performance benefits—and it's likely
@@ -66,6 +66,7 @@ because it is behind a paywall.
6666
6767
Teddy
6868
-----
69+
6970
Finally, we get to Teddy. If the above literature review is complete, then it
7071
appears that Teddy is a novel algorithm. More than that, in my experience, it
7172
completely blows away the competition for short substrings, which is exactly
@@ -242,20 +243,21 @@ haystack.
242243
243244
Implementation notes
244245
--------------------
246+
245247
The problem with the algorithm as described above is that it uses a single byte
246-
for a fingerprint. This will work will if the fingerprints are rare in the
247-
haystack (e.g., capital letters or special characters in normal English text),
248-
but if the fingerprints are common, you'll wind up spending too much time in
249-
the verification step, which effectively negates the performance benefits of
250-
scanning 16 bytes at a time. Remember, the key to the performance of this
251-
algorithm is to do as little work as possible per 16 bytes.
248+
for a fingerprint. This will work well if the fingerprints are rare in the haystack (e.g.,
249+
capital letters or special characters in normal English text), but if the
250+
fingerprints are common, you'll wind up spending too much time in the
251+
verification step, which effectively negates the performance benefits of scanning
252+
16 bytes at a time. Remember, the key to the performance of this algorithm is to
253+
do as little work as possible per 16 bytes.
252254
253255
This algorithm can be extrapolated in a relatively straight-forward way to use
254256
larger fingerprints. That is, instead of a single byte prefix, we might use a
255-
three byte prefix. The implementation below implements N = {1, 2, 3} and always
256-
picks the largest N possible. The rationale is that the bigger the fingerprint,
257-
the fewer verification steps we'll do. Of course, if N is too large, then we'll
258-
end up doing too much on each step.
257+
three byte prefix. The implementation below implements `N = {1, 2, 3}` and
258+
always picks the largest N possible. The rationale is that the bigger the
259+
fingerprint, the fewer verification steps we'll do. Of course, if N is too
260+
large, then we'll end up doing too much on each step.
259261
260262
The way to extend it is:
261263
@@ -344,7 +346,7 @@ pub struct Match {
344346
pub pat: usize,
345347
/// The start byte offset of the match.
346348
pub start: usize,
347-
/// The end byte offset of the match. This is always start + pat.len().
349+
/// The end byte offset of the match. This is always `start + pat.len()`.
348350
pub end: usize,
349351
}
350352

@@ -363,7 +365,7 @@ pub struct Teddy {
363365
}
364366

365367
/// A list of masks. This has length equal to the length of the fingerprint.
366-
/// The length of the fingerprint is always `max(3, len(smallest substring))`.
368+
/// The length of the fingerprint is always `min(3, len(smallest_substring))`.
367369
#[derive(Debug, Clone)]
368370
struct Masks(Vec<Mask>);
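
As a rough illustration of the fingerprint-length rule (N is drawn from {1, 2, 3} and the largest possible N is used), the sketch below takes the length of the shortest pattern and caps it at 3. `fingerprint_len` is a hypothetical helper, not a function in this file, and rejecting empty input is an assumption modelled on `Teddy::new` returning `None`.

```rust
/// Hypothetical helper (not part of teddy128.rs): choose the fingerprint
/// length N in {1, 2, 3} for a set of patterns.
///
/// Returns `None` if there are no patterns or if any pattern is empty,
/// since a fingerprint needs at least one byte from every pattern.
fn fingerprint_len(pats: &[&[u8]]) -> Option<usize> {
    // Length of the shortest pattern; `None` when `pats` is empty.
    let shortest = pats.iter().map(|p| p.len()).min()?;
    if shortest == 0 {
        return None;
    }
    // The fingerprint cannot be longer than the shortest pattern,
    // and this sketch never uses more than 3 bytes.
    Some(shortest.min(3))
}
```

With patterns `foo` and `quux` this yields 3; adding the pattern `ab` lowers it to 2, because every pattern must supply a full fingerprint.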
369371

@@ -377,9 +379,9 @@ struct Mask {
377379
}
378380

379381
impl Teddy {
380-
/// Create a new Teddy multi substring matcher.
382+
/// Create a new `Teddy` multi substring matcher.
381383
///
382-
/// If a Teddy matcher could not be created (e.g., `pats` is empty or has
384+
/// If a `Teddy` matcher could not be created (e.g., `pats` is empty or has
383385
/// an empty substring), then `None` is returned.
384386
pub fn new(pats: &syntax::Literals) -> Option<Teddy> {
385387
let pats: Vec<_> = pats.literals().iter().map(|p|p.to_vec()).collect();
@@ -407,7 +409,7 @@ impl Teddy {
407409
})
408410
}
409411

410-
/// Returns all of the substrings matched by this Teddy.
412+
/// Returns all of the substrings matched by this `Teddy`.
411413
pub fn patterns(&self) -> &[Vec<u8>] {
412414
&self.pats
413415
}
@@ -422,7 +424,7 @@ impl Teddy {
422424
self.pats.iter().fold(0, |a, b| a + b.len())
423425
}
424426

425-
/// Searches `haystack` for the substrings in this Teddy. If a match was
427+
/// Searches `haystack` for the substrings in this `Teddy`. If a match was
426428
/// found, then it is returned. Otherwise, `None` is returned.
427429
pub fn find(&self, haystack: &[u8]) -> Option<Match> {
428430
// If our haystack is smaller than the block size, then fall back to
@@ -441,7 +443,7 @@ impl Teddy {
441443
}
442444
}
443445

444-
/// find1 is used when there is only 1 mask. This is the easy case and is
446+
/// `find1` is used when there is only 1 mask. This is the easy case and is
445447
/// pretty much as described in the module documentation.
446448
#[inline(always)]
447449
fn find1(&self, haystack: &[u8]) -> Option<Match> {
@@ -451,7 +453,7 @@ impl Teddy {
451453
debug_assert!(len >= BLOCK_SIZE);
452454
while pos <= len - BLOCK_SIZE {
453455
let h = unsafe { u8x16::load_unchecked(haystack, pos) };
454-
// N.B. res0 is our `C` in the module documentation.
456+
// N.B. `res0` is our `C` in the module documentation.
455457
let res0 = self.masks.members1(h);
456458
// Only do expensive verification if there are any non-zero bits.
457459
if res0.ne(zero).any() {
@@ -464,7 +466,7 @@ impl Teddy {
464466
self.slow(haystack, pos)
465467
}
466468

467-
/// find2 is used when there are 2 masks, e.g., the fingerprint is 2 bytes
469+
/// `find2` is used when there are 2 masks, e.g., the fingerprint is 2 bytes
468470
/// long.
469471
#[inline(always)]
470472
fn find2(&self, haystack: &[u8]) -> Option<Match> {
@@ -478,12 +480,12 @@ impl Teddy {
478480
);
479481
let zero = u8x16::splat(0);
480482
let len = haystack.len();
481-
// The previous value of C (from the module documentation) for the
483+
// The previous value of `C` (from the module documentation) for the
482484
// *first* byte in the fingerprint. On subsequent iterations, we take
483-
// the last bitset from the previous C and insert it into the first
485+
// the last bitset from the previous `C` and insert it into the first
484486
// position of the current C, shifting all other bitsets to the right
485-
// one lane. This causes C for the first byte to line up with C for the
486-
// second byte, so that they can be AND'd together.
487+
// one lane. This causes `C` for the first byte to line up with `C` for
488+
// the second byte, so that they can be `AND`'d together.
487489
let mut prev0 = u8x16::splat(0);
488490
let mut pos = 0;
489491
debug_assert!(len >= BLOCK_SIZE);
@@ -493,17 +495,19 @@ impl Teddy {
493495

494496
// The next three lines are essentially equivalent to
495497
//
496-
// (prev0 << 15) | (res0 >> 1)
498+
// ```rust,ignore
499+
// (prev0 << 15) | (res0 >> 1)
500+
// ```
497501
//
498502
// ... if SIMD vectors could shift across lanes. There is the
499-
// PALIGNR instruction, but apparently LLVM doesn't expose it as
503+
// `PALIGNR` instruction, but apparently LLVM doesn't expose it as
500504
// a proper intrinsic. Thankfully, it appears the following
501-
// sequence does indeed compile down to a PALIGNR.
505+
// sequence does indeed compile down to a `PALIGNR`.
502506
let prev0byte0 = prev0.extract(15);
503507
let res0shiftr8 = res0.shuffle_bytes(res0shuffle);
504508
let res0prev0 = res0shiftr8.replace(0, prev0byte0);
505509

506-
// AND's our C values together.
510+
// `AND`'s our `C` values together.
507511
let res = res0prev0 & res1;
508512
prev0 = res0;
509513
if res.ne(zero).any() {
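
The cross-lane shift discussed in the comments above can be modelled without SIMD. The following is a minimal scalar sketch, assuming a 16-lane vector is represented as a plain `[u8; 16]`; `shift_in_prev` is a hypothetical name and is not how the file itself performs the operation (it uses `extract`, `shuffle_bytes` and `replace` on `u8x16`).

```rust
/// Scalar model of `(prev0 << 15) | (res0 >> 1)` with lane-wise shifts:
/// every lane of `res0` moves one position to the right, and the last
/// lane of `prev0` is inserted into lane 0.
fn shift_in_prev(prev0: [u8; 16], res0: [u8; 16]) -> [u8; 16] {
    let mut out = [0u8; 16];
    // Lane 0 comes from the final lane of the previous block.
    out[0] = prev0[15];
    // Lanes 1..16 are lanes 0..15 of the current block.
    out[1..].copy_from_slice(&res0[..15]);
    out
}
```

After this shift, lane `i` of the result holds the first-byte bitset for haystack position `i - 1`, which is the alignment needed before it is `AND`ed with the second-byte bitset at position `i`.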
@@ -519,12 +523,12 @@ impl Teddy {
519523
self.slow(haystack, pos - 1)
520524
}
521525

522-
/// find3 is used when there are 3 masks, e.g., the fingerprint is 3 bytes
526+
/// `find3` is used when there are 3 masks, e.g., the fingerprint is 3 bytes
523527
/// long.
524528
///
525-
/// N.B. This is a straight-forward extrapolation of find2. The only
526-
/// difference is that we need to keep track of two previous values of
527-
/// C, since we now need to align for three bytes.
529+
/// N.B. This is a straight-forward extrapolation of `find2`. The only
530+
/// difference is that we need to keep track of two previous values of `C`,
531+
/// since we now need to align for three bytes.
528532
#[inline(always)]
529533
fn find3(&self, haystack: &[u8]) -> Option<Match> {
530534
let zero = u8x16::splat(0);
@@ -606,7 +610,7 @@ impl Teddy {
606610
///
607611
/// If a match exists, it returns the first one.
608612
///
609-
/// offset is an additional byte offset to add to the position before
613+
/// `offset` is an additional byte offset to add to the position before
610614
/// substring match verification.
611615
#[inline(always)]
612616
fn verify_64(
@@ -708,17 +712,17 @@ impl Masks {
708712
}
709713

710714
/// Adds the given pattern to the given bucket. The bucket should be a
711-
/// power of 2 <= 2^7.
715+
/// power of `2 <= 2^7`.
712716
fn add(&mut self, bucket: u8, pat: &[u8]) {
713717
for (i, mask) in self.0.iter_mut().enumerate() {
714718
mask.add(bucket, pat[i]);
715719
}
716720
}
717721

718722
/// Finds the fingerprints that are in the given haystack block. i.e., this
719-
/// returns C as described in the module documentation.
723+
/// returns `C` as described in the module documentation.
720724
///
721-
/// More specifically, for i in 0..16 and j in 0..8, C[i][j] == 1 if and
725+
/// More specifically, for `i in 0..16` and `j in 0..8`, `C[i][j] == 1` if and
722726
/// only if `haystack_block[i]` corresponds to a fingerprint that is part
723727
/// of a pattern in bucket `j`.
724728
#[inline(always)]
@@ -745,8 +749,8 @@ impl Masks {
745749
(res0, res1)
746750
}
747751

748-
/// Like members1, but computes C for the first, second and third bytes in
749-
/// the fingerprint.
752+
/// Like `members1`, but computes `C` for the first, second and third bytes
753+
/// in the fingerprint.
750754
#[inline(always)]
751755
fn members3(&self, haystack_block: u8x16) -> (u8x16, u8x16, u8x16) {
752756
let masklo = u8x16::splat(0xF);
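
To make the meaning of `C` more concrete, here is a scalar sketch of a single mask for a one-byte fingerprint. It assumes the mask keeps one 16-entry table for low nybbles and one for high nybbles, which is suggested by the `0xF` constant above but is not shown in this diff; `ScalarMask` and its methods are hypothetical names.

```rust
/// Hypothetical scalar stand-in for one mask. Each table entry is a
/// bitset of buckets, with bucket `j` represented by the bit `1 << j`.
struct ScalarMask {
    lo: [u8; 16], // indexed by the low nybble of a byte
    hi: [u8; 16], // indexed by the high nybble of a byte
}

impl ScalarMask {
    fn new() -> ScalarMask {
        ScalarMask { lo: [0; 16], hi: [0; 16] }
    }

    /// Record that `byte` is a fingerprint byte for `bucket`
    /// (a power of 2 no greater than 2^7).
    fn add(&mut self, bucket: u8, byte: u8) {
        self.lo[(byte & 0xF) as usize] |= bucket;
        self.hi[(byte >> 4) as usize] |= bucket;
    }

    /// Compute `C` for a 16-byte block: bit `j` of `c[i]` is set when
    /// both nybbles of `block[i]` were recorded for bucket `j`.
    fn members(&self, block: [u8; 16]) -> [u8; 16] {
        let mut c = [0u8; 16];
        for (i, &b) in block.iter().enumerate() {
            c[i] = self.lo[(b & 0xF) as usize] & self.hi[(b >> 4) as usize];
        }
        c
    }
}
```

Because the two nybbles are looked up independently, a bucket that holds several bytes can report a byte it never contained, which is one reason a separate verification step is still required.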
