kbrsh / wade

  • Monday, November 13, 2017, 03:14:10
https://github.com/kbrsh/wade


🌊 Blazing fast, 1kb search for JavaScript



Wade

Blazing fast 1kb search

Installation

NPM

npm install wade

CDN

<script src="https://unpkg.com/wade"></script>

Usage

Initialize Wade with an array of strings.

const search = Wade(["Apple", "Orange", "Lemon", "Tomato"]);

Now you can search the array for a query. Wade returns an array of matches, each containing the index of the matching item and a score.

search("App");
/*
[{
  index: 0,
  score: 1
}]
*/
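Since each result carries only an index, the matching items have to be looked up in the original array. The following is a minimal sketch of that step, assuming the result shape shown above; `resolveResults` is a hypothetical helper, not part of Wade.

```javascript
// Hypothetical helper (not part of Wade): turn [{ index, score }, ...]
// into the matching items, highest score first.
function resolveResults(data, results) {
  return results
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .map(({ index, score }) => ({ item: data[index], score }));
}

const data = ["Apple", "Orange", "Lemon", "Tomato"];
// Sample value copied from the search("App") output shown above.
const results = [{ index: 0, score: 1 }];

console.log(resolveResults(data, results));
// [ { item: 'Apple', score: 1 } ]
```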

Combined with libraries like Moon, you can create a real-time search.

Loading/Saving Data

To save the search data as an object, call Wade.save on your search function. You can pass that object to Wade later instead of the original array.

For example:

// Create the initial search function
const search = Wade(["Apple", "Orange", "Lemon", "Tomato"]);
const instance = Wade.save(search);

// Save `instance`

Later, you can restore the same search function without making Wade rebuild the index:

// Retrieve `instance`, then
const search = Wade(instance);
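How `instance` is stored is left open above. One option, sketched here, is a JSON round trip. This assumes the saved object is plain, JSON-serializable data, which this README does not state; the `instance` value below is a made-up stand-in, not the real shape of what Wade.save returns.

```javascript
// Hypothetical stand-in for the object returned by Wade.save(search).
// Its real shape is not documented here; the sketch only assumes it is
// plain data that survives JSON serialization.
const instance = { data: ["apple", "orange"], lookup: { a: [0], o: [1] } };

// Save: turn the object into a string you can write to a file,
// localStorage, or a server response.
const serialized = JSON.stringify(instance);

// Load: parse the string back into an object, which would then be
// passed to Wade(...) as shown above.
const restored = JSON.parse(serialized);

console.log(JSON.stringify(restored) === serialized); // true
```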

Processors

Wade uses a set of processors to preprocess data and search queries. By default, these will:

  • Make everything lowercase
  • Remove punctuation
  • Remove stop words

Each processor is a function that takes a string, modifies it in some way, and returns the transformed string.

The processors are stored in Wade.config.processors, so you can replace or extend them. For example:

// Don't preprocess at all
Wade.config.processors = [];

// Add custom processor to remove periods
Wade.config.processors.push(function(str) {
  return str.replace(/\./g, "");
});

The functions run in array order (index 0 first), and the whole sequence is applied to each document in the data as well as to each search query.
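Because each processor receives the output of the previous one, the order of the array can change the result. The sketch below illustrates this with a hypothetical `runProcessors` helper and two made-up processors; it is not Wade's internal code.

```javascript
// Hypothetical helper: feed a string through each processor in array order.
function runProcessors(str, processors) {
  return processors.reduce((current, processor) => processor(current), str);
}

// Two illustrative processors (not Wade's defaults).
const removePeriods = (str) => str.replace(/\./g, "");
const collapseSpaces = (str) => str.replace(/\s+/g, " ");

// Removing periods first leaves a double space that the second step collapses.
console.log(JSON.stringify(runProcessors("a . b", [removePeriods, collapseSpaces])));
// "a b"

// In the reverse order the double space appears after collapsing, so it stays.
console.log(JSON.stringify(runProcessors("a . b", [collapseSpaces, removePeriods])));
// "a  b"
```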

The list of stop words can be replaced with any words you like. It is available as an array:

Wade.config.stopWords = [/* array of stop words */];

The regular expression used to remove punctuation can be configured with:

Wade.config.punctuationRE = /[.!]/g; // should contain punctuation to remove
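To make the effect of these two settings concrete, here is a rough sketch of what stop word and punctuation removal could look like. The word list, the regular expression, and both functions are illustrative assumptions, not Wade's defaults or its implementation.

```javascript
// Illustrative values only; these are not Wade's defaults.
const punctuationRE = /[.!]/g;
const stopWords = ["the", "a", "of"];

// Strip every character matched by the punctuation expression.
const removePunctuation = (str) => str.replace(punctuationRE, "");

// Drop stop words (and empty fragments) from a space-separated string.
const removeStopWords = (str) =>
  str
    .split(" ")
    .filter((word) => word !== "" && !stopWords.includes(word))
    .join(" ");

console.log(removeStopWords(removePunctuation("the taste of a lemon!")));
// taste lemon
```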

Algorithm

The algorithm behind the search is fairly simple. First, a trie data structure is built from the data. When performing a search, the following happens:

  • The search query is processed through the pipeline
  • The search query is then tokenized into keywords
  • Each keyword except the last is looked up exactly, and the score of each item in the data is updated according to the number of keywords that appear in it.
  • The last keyword is treated as a prefix. Wade performs a depth-first search and updates the score of every item containing a word that starts with this prefix. The amount added depends on how much of the word the prefix covers and on how relevant the word is to the data. This allows searching as the user types.
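The steps above can be sketched with a small trie. This is a simplified illustration, not Wade's actual implementation: all names are hypothetical, preprocessing is reduced to lowercasing and splitting on whitespace, and every matched keyword adds the same weight (1 divided by the number of keywords) instead of Wade's relevance-based scoring.

```javascript
// Simplified sketch of trie-based search; NOT Wade's actual implementation.
// Each node maps characters to children; `docs` holds the indices of the
// documents containing the word that ends at that node.
function createNode() {
  return { children: new Map(), docs: new Set() };
}

// Stand-in for the processing pipeline and tokenizer.
function tokenize(str) {
  return str.toLowerCase().split(/\s+/).filter(Boolean);
}

function buildTrie(data) {
  const root = createNode();
  data.forEach((doc, index) => {
    for (const word of tokenize(doc)) {
      let node = root;
      for (const ch of word) {
        if (!node.children.has(ch)) node.children.set(ch, createNode());
        node = node.children.get(ch);
      }
      node.docs.add(index);
    }
  });
  return root;
}

// Walk the trie along `word`; returns null if the path does not exist.
function findNode(root, word) {
  let node = root;
  for (const ch of word) {
    node = node.children.get(ch);
    if (node === undefined) return null;
  }
  return node;
}

// Depth-first collection of all document indices at or below `node`.
function collectDocs(node, out = new Set()) {
  for (const index of node.docs) out.add(index);
  for (const child of node.children.values()) collectDocs(child, out);
  return out;
}

function sketchSearch(root, query) {
  const keywords = tokenize(query);
  if (keywords.length === 0) return [];
  const weight = 1 / keywords.length; // assumption: equal weight per keyword
  const scores = new Map();
  const bump = (index) => scores.set(index, (scores.get(index) || 0) + weight);

  keywords.forEach((keyword, i) => {
    const node = findNode(root, keyword);
    if (node === null) return;
    const isLast = i === keywords.length - 1;
    // Exact match for earlier keywords, prefix match for the last one.
    const matches = isLast ? collectDocs(node) : node.docs;
    matches.forEach((index) => bump(index));
  });

  return [...scores.entries()]
    .map(([index, score]) => ({ index, score }))
    .sort((a, b) => b.score - a.score || a.index - b.index);
}

const trie = buildTrie(["Apple", "Orange", "Lemon", "Tomato"]);
console.log(sketchSearch(trie, "App"));
// [ { index: 0, score: 1 } ]
```

Treating only the last keyword as a prefix is what makes search-as-you-type work: the earlier words are assumed to be complete, while the word still being typed matches anything it could grow into.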

A more in-depth explanation of the algorithm is available here.

License

Licensed under the MIT License by Kabir Shah