Which Full-Text Search Index Do I Use?

A summary of my Advanced Firestore Function Search Indexes with Fuzzy Search!

If you don't want to pay for Algolia Search or one of the forks of Elasticsearch, and you don't trust yourself to deploy a whole new database like MeiliSearch, you can always use just plain old Firestore for your searching needs.

Say what!?

Yes, anything can be indexed if you just create new indexes! Almost anything is possible. While some of my search options are faster than others, Firestore has built-in caching that makes things easier and better.

So which one? Get on with it already!

To be brief, it depends on your needs for that particular collection or project.

Full Text Search

This was my first idea, and it uses a simple UTF hack with the ~ symbol to find matches based on what something starts with. I took this idea a step further and indexed every sentence by groups of 6 words (customizable) so that you can search for the beginning of any word. The number of index documents it creates grows with the number of words to be indexed.
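
Here is a minimal sketch of that idea, assuming each index entry is a lowercase string of up to 6 consecutive words and that a "starts with" match is a range between the search term and the term followed by ~. The names `wordGroups`, `prefixBounds` and `matchesPrefix` are hypothetical, and the in-memory string comparison only stands in for the range filter a Firestore query would apply.

```typescript
// Hypothetical helper: one index entry per word position, each holding
// that word plus the words that follow it (up to groupSize in total).
function wordGroups(text: string, groupSize: number = 6): string[] {
  const words = text
    .toLowerCase()
    .split(/\s+/)
    .filter((w) => w.length > 0);
  return words.map((_, i) => words.slice(i, i + groupSize).join(" "));
}

// Hypothetical helper: the two bounds of a "starts with" range.
// "~" is a high ASCII character, so it sorts after letters and digits.
function prefixBounds(term: string): { start: string; end: string } {
  const t = term.toLowerCase();
  return { start: t, end: t + "~" };
}

// Stand-in for the range filter (entry >= start AND entry <= end).
function matchesPrefix(entry: string, term: string): boolean {
  const { start, end } = prefixBounds(term);
  return entry >= start && entry <= end;
}

const entries = wordGroups("The quick brown fox jumps over the lazy dog");
const hits = entries.filter((e) => matchesPrefix(e, "brown fo"));
console.log(hits); // [ 'brown fox jumps over the lazy' ]
```

Because every word starts its own entry, the prefix range can land on the beginning of any word in the sentence, not just the first one. The ~ bound only behaves this way for plain ASCII text, since characters outside ASCII sort after it.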

While this seems like overkill, it actually uses the least space of all my options. As long as you use a debounce timer in your observables, it won't read too much either. It is also pretty quick.

Most people will get bogged down by the number of documents it creates. There are not really many other options to it. I do not recommend this search except for specific cases.

Relevant Search

This idea came to me when I realized I could put all the data in an array, and still count the number of hits for each letter and word in a document. I was turned off by this idea in case someone had a lot of data, until I saw what 1 MB of data (the Firestore document size limit) really looks like. Also, it is repeated data, and I get rid of all the fluff like HTML tags.
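
To make the counting idea concrete, here is a small sketch that strips HTML tags, counts how often each word appears, and ranks documents by the number of hits for the query terms. Everything here is an assumption for illustration: the names `termCounts`, `relevanceScore` and `rankByRelevance` are hypothetical, the tag stripping is a naive regular expression, and the ranking runs in memory rather than against a Firestore index.

```typescript
interface Doc {
  id: string;
  html: string;
}

// Strip tags (naively) and count how often each word occurs.
function termCounts(html: string): Record<string, number> {
  const text = html.replace(/<[^>]*>/g, " ").toLowerCase();
  const words: string[] = text.match(/[a-z0-9]+/g) ?? [];
  // Object.create(null) avoids clashes with inherited keys like "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const word of words) {
    counts[word] = (counts[word] ?? 0) + 1;
  }
  return counts;
}

// Score = total number of hits for every term in the query.
function relevanceScore(counts: Record<string, number>, query: string): number {
  const terms: string[] = query.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  let score = 0;
  for (const term of terms) {
    score += counts[term] ?? 0;
  }
  return score;
}

// Return ids of matching documents, most relevant first.
function rankByRelevance(docs: Doc[], query: string): string[] {
  return docs
    .map((d) => ({ id: d.id, score: relevanceScore(termCounts(d.html), query) }))
    .filter((d) => d.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((d) => d.id);
}

const posts: Doc[] = [
  { id: "a", html: "<p>Search with <b>Firestore</b> search indexes</p>" },
  { id: "b", html: "<p>Firestore triggers</p>" },
  { id: "c", html: "<p>Search search search</p>" },
];
console.log(rankByRelevance(posts, "search")); // [ 'c', 'a' ]
```

Since the tags are dropped before counting, only the words themselves take up room in the stored counts.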

I expanded upon this idea by trying to add fuzzy search capabilities. I learned about how PostgreSQL and some versions of MySQL do it on the backend... indexes and algorithms that use phonetics and string comparisons.

So, I added soundex to the mix. You can use whatever function you like, but you cannot search a modified version of what is stored like you can in SQL. I would be very interested if someone has any ideas about this. You can plug in your own function for any language.
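
For reference, here is a sketch of the classic American Soundex encoding. It is a generic implementation for illustration, not necessarily the exact one my functions ship with, and the name `soundexCode` is hypothetical. Since the stored value cannot be transformed at query time, the assumption is that the code is computed when the document is written, and the search term is encoded the same way before querying.

```typescript
// Consonant groups of American Soundex. Vowels and y are not coded.
const SOUNDEX_DIGITS: Record<string, string> = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundexCode(word: string): string {
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  const first = letters.charAt(0);
  let result = first.toUpperCase();
  let prev = SOUNDEX_DIGITS[first] ?? "";

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters.charAt(i);
    // h and w are skipped and do not separate repeated codes.
    if (ch === "h" || ch === "w") continue;
    const digit = SOUNDEX_DIGITS[ch] ?? "";
    // Only add a digit when it differs from the previous letter's code.
    if (digit !== "" && digit !== prev) result += digit;
    // A vowel resets prev to "", so a repeated code after it counts again.
    prev = digit;
  }
  // Pad with zeros to the fixed length of four characters.
  return (result + "000").slice(0, 4);
}

console.log(soundexCode("Robert")); // R163
console.log(soundexCode("Rupert")); // R163
```

Two words that sound alike end up with the same code, so an equality match on the stored code behaves like a rough phonetic search.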

It is called Relevant Search because it sorts on the frontend and backend by, well, relevance. I added a paging option (startAt), and you can index separately by fields. This is my most unique and well-rounded search.

Trigram Search

This is the best search when it comes to fuzzy search. It is literally the same kind of index used in other search engines. I basically use groups of 3 letters to get a close fuzzy search match. See my page for more details on this process. This is the slowest, but the most powerful. I believe you could come up with many techniques and ideas with trigrams. I posted several questions on Stack Overflow, and no one seems to know what trigrams are. Good thing I wrote something using it.

Post any ideas, improvements, or bugs to these functions on the GitHub issue page.

If you appreciate these functions, shoot me some upvotes here, as I use points to help me debug future projects!
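
If trigrams are new to you, here is a minimal sketch of the concept: split a string into overlapping groups of 3 characters and compare two strings by how many groups they share. The names `trigramsOf` and `trigramSimilarity` are hypothetical, the shared-over-total score is one common choice among several, and real trigram indexes often pad word boundaries with spaces, which this sketch skips.

```typescript
// Unique overlapping 3-character groups of a normalized string.
function trigramsOf(text: string): string[] {
  const clean = text.toLowerCase().replace(/[^a-z0-9]+/g, " ").trim();
  const grams: string[] = [];
  for (let i = 0; i + 3 <= clean.length; i++) {
    const gram = clean.substring(i, i + 3);
    if (grams.indexOf(gram) < 0) grams.push(gram);
  }
  return grams;
}

// Shared trigrams divided by all distinct trigrams (0 = nothing alike, 1 = same set).
function trigramSimilarity(a: string, b: string): number {
  const ta = trigramsOf(a);
  const tb = trigramsOf(b);
  if (ta.length === 0 || tb.length === 0) return 0;
  const shared = ta.filter((g) => tb.indexOf(g) >= 0).length;
  return shared / (ta.length + tb.length - shared);
}

console.log(trigramsOf("hello")); // [ 'hel', 'ell', 'llo' ]
console.log(trigramSimilarity("firestore", "firestor") > 0.8); // true
console.log(trigramSimilarity("firestore", "algolia")); // 0
```

A misspelled term still shares most of its trigrams with the correct word, which is what makes the match fuzzy.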

Spread the word.

J