Fast search

This chapters details the Bigloo's API for fast string search algorithms.

Knuth, Morris, and Pratt

Bigloo supports an implementation of the Knuth, Morris, and Pratt algorithm on strings and memory mapped area, Memory mapped area.

kmp-table patternbigloo procedure

This function creates a kmp-table used for fast search.
.keep

kmp-mmap kmp-table mmap offsetbigloo procedure

kmp-string kmp-table string offsetbigloo procedure

This function searches the pattern described by kmp-table in the memory mapped area mmap (respec. in the string). The search starts at offset. If an occurrence is found, its position in the mmap is returned. Otherwise -1 is returned.

For the sake of the example, here is a prototypal implementation of the Usenix command grep:

(define (main args)
   (cond
      ((null? (cdr args))
       (fprintf (current-error-port) "Usage: grep STRING [FILE]...")
       (exit 0))
      (else
       (let ((t (kmp-table (cadr args))))
          (for-each (lambda (f) (grep-file t f)) (cddr args))))))

(define (grep-file t file)
   (let* ((mm (open-mmap file read: #t write: #f))
          (ls (mmap-length mm)))
      (let loop ((o 0))
         (unless (>=fx o ls)
            (let ((n (kmp-mmap t mm o)))
               (when (>fx n 0)
                  (print file ":" (mmap-line mm ls n))
                  (loop (+fx n 1))))))
      (close-mmap mm)))

(define (mmap-line mm ls n)
   (let ((b 0)
         (e (elong->fixnum ls)))
      ;; beginning
      (let loop ((i n))
         (when (>fx i 0)
            (if (char=? (mmap-ref mm i) #\Newline)
                (set! b (+fx i 1))
                (loop (-fx i 1)))))
      ;; end
      (let loop ((i n))
         (when (<fx i ls)
            (if (char=? (mmap-ref mm i) #\Newline)
                (set! e i)
                (loop (+fx i 1)))))
      (mmap-substring mm b (- e b))))
.keep

Boyer - Moore

Bigloo supports an implementation of the Boyer, Moore algorithm on strings and memory mapped area, Memory mapped area.

bm-table patternbigloo procedure

This function creates a bm-table used for fast search.
.keep

bm-mmap bm-table mmap offsetbigloo procedure

bm-string bm-table string offsetbigloo procedure

This function searches the pattern described by bm-table in the memory mapped area mmap (respec. in the string). The search starts at offset. If an occurrence is found, its position in the mmap is returned. Otherwise -1 is returned.

For the sake of the example, here is a prototypal implementation of the Usenix command grep:

(define (main args)
   (cond
      ((null? (cdr args))
       (fprintf (current-error-port) "Usage: grep STRING [FILE]...")
       (exit 0))
      (else
       (let ((t (bm-table (cadr args))))
          (for-each (lambda (f) (grep-file t f)) (cddr args))))))

(define (grep-file t file)
   (let* ((mm (open-mmap file read: #t write: #f))
          (ls (mmap-length mm)))
      (let loop ((o 0))
         (unless (>=fx o ls)
            (let ((n (bm-mmap t mm o)))
               (when (>fx n 0)
                  (print file ":" (mmap-line mm ls n))
                  (loop (+fx n 1))))))
      (close-mmap mm)))

(define (mmap-line mm ls n)
   (let ((b 0)
         (e (elong->fixnum ls)))
      ;; beginning
      (let loop ((i n))
         (when (>fx i 0)
            (if (char=? (mmap-ref mm i) #\Newline)
                (set! b (+fx i 1))
                (loop (-fx i 1)))))
      ;; end
      (let loop ((i n))
         (when (<fx i ls)
            (if (char=? (mmap-ref mm i) #\Newline)
                (set! e i)
                (loop (+fx i 1)))))
      (mmap-substring mm b (- e b))))
.keep

Boyer - Moore - Horspool

Bigloo supports an implementation of the Boyer, Moore, Horspool algorithm on strings and memory mapped area, Memory mapped area.

bmh-table patternbigloo procedure

This function creates a bmh-table used for fast search.
.keep

bmh-mmap bm-table mmap offsetbigloo procedure

bmh-string bm-table string offsetbigloo procedure

This function searches the pattern described by bmh-table in the memory mapped area mmap (respec. in the string). The search starts at offset. If an occurrence is found, its position in the mmap is returned. Otherwise -1 is returned.
.keep