Build a search page for your astro static blog with fuse.js

05 November 2023 Comments

Recently, I migrated this very blog to Astro, and I took the opportunity to modernize the search system.

I was then using lunr, which was a bit unmaintained, and not playing well with modern build tools.

Introducing fuse.js

Looking for alternatives I found fuse.js, a fuzzy-search library, with zero dependencies, very popular, modern, TypeScript-friendly and very easy to use.

Integrating Fuse turned out to be very simple. You need to provide an array of “searchable things” (usually the objects representing your blog posts), and then determine which are the properties in which you want to search.

import Fuse from 'fuse.js';

const posts = [
  // ...
];
const fuse = new Fuse(posts, {
  keys: ['body', 'title']
});

Fuse will then create an index for the provided list of objects, and allow you to search on them.

const results = fuse.search('astro');

When calling search, fuse will return the list of objects that match the search term, ordered by relevance.

Then, it’s up to you to decide how to use the result. In my case I iterate over them and build a page listing the post titles and an excerpt. Feel free to take a look.

Fuse accepts many other options. Take a look to their documentation in order to fine tune it.

Creating your search island

As the title of this post says, this article assumes your blog does not have a backend, and Fuse will have to run the search in the browser.

For that, you will need an island, a component with logic that runs in the browser.

We’ll be using react to write that component. It will receive a list of posts and pass them to Fuse.

import Fuse from 'fuse.js';
import { useState, useMemo } from 'react';

export function Search({ posts }) {
  const [searchValue, setSearchValue] = useState('');
  const fuse = useMemo(() => new Fuse(posts, { keys: ['body', 'title'] }), [posts])
  const results = useMemo(() => fuse.search(searchValue), [fuse, searchValue]);

  return (
    <>
      <input
        type="search"
        placeholder="Search…"
        aria-label="Search"
        value={searchValue}
        onChange={(e) => setSearchValue(e.target.value)}
      />

      {results.length === 0 && (
        <>
          {searchValue === '' && <p>Enter a search term</p>}
          {searchValue !== '' && <p>No results found</p>}
        </>
      )}
      {results.map(({ item: post }, index) => (
        <p key={index}>
          <a href={post.url}>{post.title}</a>
        </p>
      ))}
    </>
  );
}

Content collections vs markdown pages

Let’s now build the search page. Assuming you write your posts in markdown/MDX, there are a couple of ways in which you can get the list of posts to pass to fuse.

In this very blog I use Astro’s content collections to handle the posts. That allows me to take advantage of the helpers exposed by astro:content.

---
// pages/search.astro
import { getCollection } from 'astro:content';
import { Search } from '../components/Search';

const rawPosts = await getCollection('posts');
const posts = rawPosts.map((post) => ({
  body: post.body,
  title: post.data.title,
  // Anything else you need...
}));
---

<div>
  <Search posts={posts} client:only="react" />
</div>

If you write your posts as regular pages, it’s slightly harder to get the bodies of your posts, but you can use Astro.glob() to list your posts and metadata, and then manually read each post to get its body.

---
// pages/search.astro
import fs from 'node:fs';
import { Search } from '../components/Search';

const rawPosts = await Astro.glob('./posts/*.mdx');
const posts =  rawPosts.map((post) => ({
  body: fs.readFileSync(post.file).toString(),
  title: post.frontmatter.title,
  // Anything else you need...
}));
---

<div>
  <Search posts={posts} client:only="react" />
</div>

As you can see, the second option is a bit less convenient, and requires to manually manipulate the filesystem, so I would recommend using content collections if possible.

Embedded search vs endpoint

The next thing you need to decide is how the Search component will get the list of posts.

You need to take into consideration that you are exposing the whole body of every post so that fuse can search on them. That means the list can be relatively big depending on how many posts you have.

In the examples above we have embedded the list of posts in the search page itself. When you build it you’ll see all posts are part of the generated search/index.html file, which can make it relatively big.

This approach is very convenient and simple, but will block the page rendering until everything is downloaded.

Another approach would be to expose the list of posts via a separated JSON file (AKA Astro endpoint), and let the Search component load it asynchronously.

With that, you can define an individual caching strategy for this JSON file itself, load the search page faster with a loading indicator, etc.

First, we would need to create the endpoint that will serve the list of posts.

// pages/posts.json.ts
export async function GET() {
  const posts = [
    // Get posts...
  ];

  return new Response(
    JSON.stringify({ posts }),
    {
      status: 200,
      headers: {
        "Content-Type": "application/json"
      }
    }
  );
}

Then we need to edit the Search component to fetch this itself:

import Fuse from 'fuse.js';
import { useState, useMemo } from 'react';

-export function Search({ posts }) {
+export function Search() {
+ const [posts, setPosts] = useState<Post[] | null>(null);
  const [searchValue, setSearchValue] = useState('');
- const fuse = useMemo(() => new Fuse(posts { keys: ['body', 'title'] }), [posts]);
+ const fuse = useMemo(() => new Fuse(posts ?? [], { keys: ['body', 'title'] }), [posts]);
  const results = useMemo(() => fuse.search(searchValue), [fuse, searchValue]);

+ useEffect(() => {
+   fetch('/posts.json')
+     .then((resp) => resp.json())
+     .then(({ posts }) => setPosts(posts));
+ }, []);

+ if (!posts) {
+   return <p>Loading posts...</p>;
+ }

  return <>...</>;
}

Take into consideration this is a simplified example. You should probably handle potential errors when loading the posts, abort the request when the component unmounts, etc.

You might even want to use something other than useEffect + fetch, like React Query/TanStack Query and similar.

And finally, we can update the search page so that it does not load the posts itself.

---
// pages/search.astro
import { Search } from '../components/Search';
---

<div>
  <Search client:only="react" />
</div>

Pre-generated index

The last thing you might want to consider is pre-generating fuse’s index.

In all examples above we have always been exposing the list of posts, and letting fuse generate the index on the fly.

This is fine as long as you don’t have hundreds of blog posts. This blog has more than 60 posts, and this approach is working fine.

However, if you decide to pre-generate the index, you should not embed it in the search page, and instead expose it via an endpoint to avoid a very big search/index.html file.

Let’s evolve the posts.json.ts endpoint to expose a pre-generated index.

// pages/posts.json.ts
import Fuse from 'fuse.js';

export async function GET() {
  const posts = [
    // Get posts...
  ];
  const postsIndex = Fuse.createIndex(['body', 'title'], posts);

  return new Response(
    JSON.stringify({
      posts,
      index: postsIndex.toJSON(),
    }),
    {
      status: 200,
      headers: {
        "Content-Type": "application/json"
      }
    }
  );
}

And then, the Search component needs to take the index into consideration.

import Fuse from 'fuse.js';
import { useState, useMemo } from 'react';

export function Search() {
  const [posts, setPosts] = useState<Post[] | null>(null);
+ const [index, setIndex] = useState(null);
  const [searchValue, setSearchValue] = useState('');
- const fuse = useMemo(() => new Fuse(posts ?? [], { keys: ['body', 'title'] }), [posts]);
+ const fuse = useMemo(() => {
+   const parsedIndex = index ? Fuse.parseIndex(index) : undefined;
+   return new Fuse(posts ?? [], { keys: ['body', 'title'] }, parsedIndex);
+ }, [posts, index]);
  const results = useMemo(() => fuse.search(searchValue), [fuse, searchValue]);

  useEffect(() => {
    fetch('/posts.json')
      .then((resp) => resp.json())
-     .then(({ posts }) => setPosts(posts));
+     .then(({ posts, index }) => {
+       setPosts(posts);
+       setIndex(index);
+     });
  }, []);

  if (!posts) {
    return <p>Loading posts...</p>;
  }

  return <>...</>;
}

More information on how to pre-generate fuse’s index can be found in their docs.

Conclusion

This article highlights different ways to use fuse.js to index the posts of a static Astro blog.

The tool is very flexible. It lets you start simple, and then evolve from there as your needs change.

Search results are not perfect, but as long as you don’t need high accuracy, fuse.js is a very useful tool.