PTS - Partial Text Search
Description: Partial Text Search means searching for part of a word and still matching the text. This differs from what most search engines do: they use Full Text Search, meaning they store only whole words in their databases.

This means that if you search for 'fan', you will never match words like 'fantastic', 'fanatic', 'fana', etc.
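
A minimal sketch of that limitation, assuming a toy whole-word index (the function name build_word_index and the sample documents are made up for illustration):

```python
from collections import defaultdict

def build_word_index(docs):
    """Map each whole lowercase word to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index

# Hypothetical sample documents.
docs = {1: "a fantastic day", 2: "a fanatic crowd", 3: "a ceiling fan"}
index = build_word_index(docs)

# Only the document holding the whole word 'fan' is found;
# 'fantastic' and 'fanatic' are stored under different keys.
exact_hits = index.get("fan", set())
```

Here exact_hits contains only document 3, even though documents 1 and 2 hold words that start with 'fan'.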

There are several approaches to improving this kind of indexed search, and we know some of them. We are coding them to learn each one's drawbacks. The ideas we have so far fall into three classes: Search, Quality and Load.

The Search-related ideas are about how to make the search work within words, i.e. they are the core ideas:

  • N-GRAMs: Split the word 'bubble' into overlapping blocks (bub, ubb, bbl, ble).
  • Soundex: Use the Soundex algorithm to improve relevance.
  • Stemming: Use a stemming algorithm to get root terms and search for them too.
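
The N-GRAM idea can be sketched roughly as follows. This is only an illustration, not the project's code: the names ngrams, build_ngram_index and partial_search are hypothetical, and the block size of 3 is an assumption taken from the 'bubble' example.

```python
from collections import defaultdict

def ngrams(word, n=3):
    """Return the overlapping n-letter blocks of a word, in order."""
    word = word.lower()
    if len(word) < n:
        # Words shorter than n are kept as a single block.
        return [word] if word else []
    return [word[i:i + n] for i in range(len(word) - n + 1)]

def build_ngram_index(words, n=3):
    """Map each block to the set of words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in ngrams(word, n):
            index[gram].add(word)
    return index

def partial_search(index, query, n=3):
    """Return the indexed words that contain the query as a substring."""
    grams = ngrams(query, n)
    if not grams:
        return set()
    # A word must contain every block of the query to be a candidate.
    candidates = set.intersection(*(index.get(g, set()) for g in grams))
    # Sharing all blocks does not guarantee they are contiguous, so confirm.
    return {w for w in candidates if query.lower() in w}

words = ["fantastic", "fanatic", "fana", "bubble"]
index = build_ngram_index(words)
matches = partial_search(index, "fan")
```

With this index, a search for 'fan' finds 'fantastic', 'fanatic' and 'fana'. One drawback of the sketch: a query shorter than the block size only matches words that are equally short.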

The Quality-related ideas are about deciding, among all the results found, which one is what the user really wants, i.e. raising relevancy:

  • Block Count: Quality is related to the number of blocks (letters, syllables, words, etc.) from your search that match the real terms.
  • Match Class: If you match a block before changing the keyword (with Soundex or stemming), the match should have higher quality than one found after changing it.
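
One possible way to combine the two Quality ideas is sketched below. Everything here is an assumption: the weights in CLASS_WEIGHT are arbitrary, the blocks are 3-letter n-grams, and toy_stem is a stand-in for a real stemmer that only strips a trailing 's'.

```python
# Hypothetical weights: a match on the raw keyword counts twice as much
# as a match found only after transforming the keyword.
CLASS_WEIGHT = {"raw": 1.0, "transformed": 0.5}

def blocks(word, n=3):
    """Set of overlapping n-letter blocks (empty for words shorter than n)."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)}

def block_count(query, term, n=3):
    """Block Count: how many blocks of the query also occur in the term."""
    return len(blocks(query, n) & blocks(term, n))

def quality(query, term, match_class, n=3):
    """Block Count scaled by the weight of the Match Class."""
    return block_count(query, term, n) * CLASS_WEIGHT[match_class]

def toy_stem(word):
    """Stand-in for a real stemmer: strips a single trailing 's'."""
    return word[:-1] if word.endswith("s") else word

query = "bubbles"
exact_score = quality(query, "bubbles", "raw")                    # 5 shared blocks
partial_score = quality(query, "bubble", "raw")                   # 4 shared blocks
stemmed_score = quality(toy_stem(query), "bubble", "transformed") # 4 blocks, half weight
```

In this sketch the exact raw match scores highest, the partial raw match comes next, and the match that needed stemming comes last, which is the ordering the Match Class idea asks for.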

The Load ideas are important because the work is always too much for a single machine:

  • DNS Index: Use a different host name for each group of results, so you don't need to know beforehand which machine holds the block 'foo'; you just point the query at f.yourdomain and you're at the right place.
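
A small sketch of how a client might pick the host name, assuming results are grouped by the first character of the block. The function name host_for_block and the fallback label 'other' are hypothetical; 'yourdomain' is the placeholder used above.

```python
def host_for_block(block, domain="yourdomain"):
    """Build the host name that should hold the index for a block.

    Assumes one host name per first character; anything that is not an
    ASCII letter or digit goes to a hypothetical catch-all label 'other'.
    """
    if not block:
        raise ValueError("block must not be empty")
    first = block[0].lower()
    label = first if first.isascii() and first.isalnum() else "other"
    return f"{label}.{domain}"

host = host_for_block("foo")
```

For the block 'foo' this gives f.yourdomain. The DNS records then decide which machine answers for each name, so machines can be added or moved without changing the client.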


  • Download: Download TGZ file: Here


    Rengolin