05 November 2023
Recently, I migrated this very blog to Astro, and I took the opportunity to modernize the search system.
At the time I was using lunr, which seemed largely unmaintained and did not play well with modern build tools.
Introducing fuse.js
Looking for alternatives, I found fuse.js: a popular, modern fuzzy-search library with zero dependencies that is TypeScript-friendly and very easy to use.
Integrating Fuse turned out to be very simple. You provide an array of “searchable things” (usually the objects representing your blog posts), and then tell it which properties you want to search in.
Fuse will then create an index for the provided list of objects, and allow you to search on them.
When calling search, Fuse will return the list of objects that match the search term, ordered by relevance.
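As a minimal sketch of the steps above: the Post shape and the sample data are made up for illustration, while new Fuse(list, { keys }) and search() are the library calls described in the text.

```typescript
import Fuse from 'fuse.js';

// Hypothetical post shape; use whatever fields your blog exposes.
interface Post {
  slug: string;
  title: string;
  body: string;
}

// Sample data, invented for this example.
const posts: Post[] = [
  {
    slug: 'astro-search',
    title: 'Astro search with fuse.js',
    body: 'Astro islands let fuse.js run in the browser.',
  },
  {
    slug: 'docker-tips',
    title: 'Docker tips',
    body: 'Notes on containers and images.',
  },
];

// `keys` lists the properties Fuse should search in.
const fuse = new Fuse(posts, { keys: ['title', 'body'] });

// Each result wraps the original object in `item`; the best matches come first.
const results = fuse.search('astro');
const matchedSlugs = results.map((result) => result.item.slug);
```

Note that the search is fuzzy, so weaker matches may also show up further down the list.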
Then, it’s up to you to decide how to use the result. In my case I iterate over them and build a page listing the post titles and an excerpt. Feel free to take a look.
Fuse accepts many other options. Take a look at their documentation to fine-tune it.
Creating your search island
As the title of this post says, this article assumes your blog does not have a backend, and Fuse will have to run the search in the browser.
For that, you will need an island, a component with logic that runs in the browser.
We’ll be using React to write that component. It will receive a list of posts and pass them to Fuse.
Content collections vs markdown pages
Let’s now build the search page. Assuming you write your posts in markdown/MDX, there are a couple of ways in which you can get the list of posts to pass to fuse.
In this very blog I use Astro’s content collections to handle the posts. That allows me to take advantage of the helpers exposed by astro:content.
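A rough sketch of how collection entries could be reduced to what the search needs. The BlogEntry type below only mirrors the entry fields used here (slug, body and data), and both the title and the draft frontmatter fields are assumptions about your schema. In an Astro page, the entries would come from getCollection, exposed by astro:content.

```typescript
// Minimal structural type mirroring the fields used from a collection entry.
// `title` and `draft` are assumed frontmatter fields, not something Astro mandates.
interface BlogEntry {
  slug: string;
  body: string;
  data: { title: string; draft?: boolean };
}

interface SearchablePost {
  slug: string;
  title: string;
  body: string;
}

// Keep only what the search needs, so the list sent to the browser stays small.
function toSearchablePosts(entries: BlogEntry[]): SearchablePost[] {
  return entries
    .filter((entry) => !entry.data.draft) // skip unpublished posts
    .map((entry) => ({
      slug: entry.slug,
      title: entry.data.title,
      body: entry.body,
    }));
}
```

The resulting array is what would be passed as a prop to the search island.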
If you write your posts as regular pages, it’s slightly harder to get the bodies of your posts, but you can use Astro.glob() to list your posts and metadata, and then manually read each post to get its body.
As you can see, the second option is a bit less convenient and requires reading files from the filesystem yourself, so I would recommend using content collections if possible.
Embedded search vs endpoint
The next thing you need to decide is how the Search component will get the list of posts.
You need to take into consideration that you are exposing the whole body of every post so that fuse can search on them. That means the list can be relatively big depending on how many posts you have.
In the examples above we have embedded the list of posts in the search page itself. When you build it you’ll see all posts are part of the generated search/index.html file, which can make it relatively big.
This approach is very convenient and simple, but will block the page rendering until everything is downloaded.
Another approach would be to expose the list of posts via a separate JSON file (AKA Astro endpoint), and let the Search component load it asynchronously.
With that, you can define an individual caching strategy for this JSON file itself, load the search page faster with a loading indicator, etc.
First, we would need to create the endpoint that will serve the list of posts.
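A sketch of what the endpoint could return, written as a plain function so it does not depend on Astro itself. postsResponse is a hypothetical helper name, and the wiring shown in the comment assumes a recent Astro version where endpoints export an uppercase GET handler.

```typescript
interface SearchablePost {
  slug: string;
  title: string;
  body: string;
}

// Builds the JSON response served by the endpoint.
function postsResponse(posts: SearchablePost[]): Response {
  return new Response(JSON.stringify(posts), {
    headers: { 'Content-Type': 'application/json' },
  });
}

// In an Astro project this could live in src/pages/posts.json.ts, roughly as:
//
//   export const GET = async () => postsResponse(await loadAllPosts());
//
// where `loadAllPosts` is a hypothetical helper that reads your posts
// (for example from a content collection).
```

Since the site is static, the endpoint is evaluated at build time and ends up as a regular JSON file next to your pages.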
Then we need to edit the Search component so that it fetches the list of posts itself.
Take into consideration this is a simplified example. You should probably handle potential errors when loading the posts, abort the request when the component unmounts, etc.
You might even want to use something other than useEffect + fetch, like React Query/TanStack Query and similar.
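The hardening mentioned above could look roughly like this. It is only a sketch: loadPosts, its injectable fetchImpl parameter, the /posts.json URL and the setPosts/setError state setters in the comment are all assumptions made for illustration.

```typescript
interface SearchablePost {
  slug: string;
  title: string;
  body: string;
}

// Fetches the list of posts and fails loudly on HTTP errors.
// `fetchImpl` is injectable so the function can be exercised without a network.
async function loadPosts(
  url: string,
  signal?: AbortSignal,
  fetchImpl: typeof fetch = fetch,
): Promise<SearchablePost[]> {
  const response = await fetchImpl(url, { signal });
  if (!response.ok) {
    throw new Error(`Could not load posts: HTTP ${response.status}`);
  }
  return (await response.json()) as SearchablePost[];
}

// Inside the React component, the effect could then be shaped like this
// (`setPosts` and `setError` are hypothetical state setters):
//
//   useEffect(() => {
//     const controller = new AbortController();
//     loadPosts('/posts.json', controller.signal)
//       .then(setPosts)
//       .catch((error) => {
//         if (error.name !== 'AbortError') setError(error);
//       });
//     return () => controller.abort(); // cancel when the component unmounts
//   }, []);
```

Keeping the loading logic in a plain function also makes it easy to swap the effect for a data-fetching library later.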
And finally, we can update the search page so that it does not load the posts itself.
Pre-generated index
The last thing you might want to consider is pre-generating fuse’s index.
In all examples above we have always been exposing the list of posts, and letting fuse generate the index on the fly.
This is fine as long as you don’t have hundreds of blog posts. This blog has more than 60 posts, and this approach is working fine.
However, if you decide to pre-generate the index, you should not embed it in the search page, and instead expose it via an endpoint to avoid a very big search/index.html file.
Let’s evolve the posts.json.ts endpoint to expose a pre-generated index.
And then, the Search component needs to take the index into consideration.
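Both halves can be sketched together in one snippet. Fuse.createIndex, toJSON, Fuse.parseIndex and the third constructor argument are part of Fuse's documented indexing API; the payload shape ({ posts, index }) and the sample data are assumptions made for this example.

```typescript
import Fuse from 'fuse.js';

interface SearchablePost {
  slug: string;
  title: string;
  body: string;
}

const keys = ['title', 'body'];

// Sample data, invented for this example.
const posts: SearchablePost[] = [
  {
    slug: 'astro-search',
    title: 'Astro search with fuse.js',
    body: 'Astro islands let fuse.js run in the browser.',
  },
  {
    slug: 'docker-tips',
    title: 'Docker tips',
    body: 'Notes on containers and images.',
  },
];

// Build time (e.g. inside the endpoint): create the index once
// and serialize it together with the posts.
const payload = JSON.stringify({
  posts,
  index: Fuse.createIndex(keys, posts).toJSON(),
});

// Browser: parse the payload and hand the ready-made index to Fuse,
// so it does not need to index the posts again.
const parsed = JSON.parse(payload);
const fuse = new Fuse<SearchablePost>(
  parsed.posts,
  { keys },
  Fuse.parseIndex(parsed.index),
);

const results = fuse.search('astro');
```

The keys used to create the index must match the ones passed to the Fuse constructor, otherwise the pre-generated index does not describe the same fields.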
More information on how to pre-generate fuse’s index can be found in their docs.
Conclusion
This article highlights different ways to use fuse.js to index the posts of a static Astro blog.
The tool is very flexible. It lets you start simple, and then evolve from there as your needs change.
Search results are not perfect, but as long as you don’t need high accuracy, fuse.js is a very useful tool.