pagefind

Static low-bandwidth search at scale (by CloudCannon)

Pagefind Alternatives

Similar projects and alternatives to pagefind

NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a better pagefind alternative or higher similarity.

pagefind reviews and mentions

Posts with mentions or reviews of pagefind. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-15.
  • Lightweight, portable and secure Wasm runtimes and their use cases.
    2 projects | dev.to | 15 Dec 2023
    In theory, if we ran lower-level code, we would be using fewer resources. That's more than a theory. Go to this video where I demonstrate Pagefind, written in Rust and compiled to Wasm, as a static app that ingests and indexes HTML documents and runs super efficient search queries, all client-side.
  • Free Open-Source Blog Template for Developers ✏️📃
    2 projects | dev.to | 26 Aug 2023
    ✅ Pagefind static search library integration
  • How to Start Your Blog in 2023
    8 projects | news.ycombinator.com | 20 Feb 2023
    I use Astro SSG and Cloudflare Pages. I use https://github.com/cloudcannon/pagefind for search on my Astro setup. You can test the search functionality here https://tinyrocket.pages.dev/.

    From its repo: "Pagefind runs after any static site generator and automatically indexes the built static files. Pagefind then outputs a static search bundle to your website, and exposes a JavaScript search API that can be used anywhere on your site."

    Pagefind is cool!

  • We’re the Meilisearch team! To celebrate v1.0 of our open-source search engine, Ask us Anything!
    14 projects | /r/rust | 8 Feb 2023
    An option there is https://pagefind.app/ — not as fast as a persistent server but solves some of the deployment and bandwidth issues.
  • The technology behind GitHub’s new code search
    17 projects | news.ycombinator.com | 6 Feb 2023
    Search is a fascinating topic because it's such a fundamental problem and every search engine is based around the same extremely simple data structure (Posting list/inverted index). Despite that, search isn't easy and every search engine seems to be quite unique. It also seems to get exponentially harder with scale.

    You can write your own search engine that will perform very well on a surprisingly large amount of data, even doing naive full-text search. A search tool I came across a while back is a great example of something at that scale: https://pagefind.app/.

    For anyone who doesn't know anything about search I highly recommend reading this (It's mentioned in the blog post as well): https://swtch.com/~rsc/regexp/regexp4.html.

    Algolia also has a series of blog posts describing how their search engine works: https://www.algolia.com/blog/engineering/inside-the-algolia-....

    ---

    It's interesting that GitHub seems to have quite a few shards. Algolia basically has a monolithic architecture with 3 different hosts which replicate data and they embed their search engine in Nginx:

    "Our search engine is a C++ module which is directly embedded inside Nginx. So when the query enters Nginx, we directly run it through the search engine and send it back to the client."

    I'm guessing GitHub probably doesn't store repos in a custom binary format like Algolia does though:

    "Each index is a binary file in our own format. We put the information in a specific order so that it is very fast to perform queries on it."

    "Our Nginx C++ module will directly open the index file in memory-mapped mode in order to share memory between the different Nginx processes and will apply the query on the memory-mapped data structure."

    https://stackshare.io/posts/how-algolia-built-their-realtime...

    100ms p99 seems pretty good, but I'm curious what the p50 is and how much time is spent searching vs ranking. I've seen Dan Luu say that the majority of time should be spent ranking rather than searching, and when I've snooped on https://hn.algolia.com I've seen single-digit millisecond search times in the responses, which seems to corroborate this.

    I'm curious why they chose to optimize ingestion when it only took 36hrs to re-index the entire corpus without optimizations. A 50% speedup is nice, but 36hrs and 18hrs are the same order of magnitude and it sounds like there was a fair amount of engineering effort put into this. An index 1/5 of the size is pretty sweet though; I have to assume that's a bigger win than 50% faster ingestion.

    Since they're indexing by language I wonder if they have custom indexing/searching for each language, or if their ngram strategy is generic over all languages. Perhaps their "sparse grams" naturally tokenize differently for every language. Hard to tell when they leave out the juiciest part of the strategy though: "Assume you have some function that given a bigram gives a weight".

    Search is so cool. I could talk about it all day.

  • The Top Five Static Site Generators (SSGs) for 2023 — and when to use them!
    7 projects | dev.to | 16 Jan 2023
    Pagefind is a fully static search library that runs after Hugo, Eleventy, Next.js, Astro, SvelteKit, or any other SSG. It aims to perform well on large sites, while using as little of your users’ bandwidth as possible, and without hosting any infrastructure. It’s a fantastic alternative to a paid search solution, and its bandwidth requirements are nothing short of incredible.
  • Sweeter searches with Pagefind
    7 projects | dev.to | 8 Dec 2022
    Fortunately, while there are limits to how much you’ll be able to improve your experience with online search in general, you can optimize your own website’s search capabilities. That’s assuming, of course, that your website is built with a static site generator (SSG), as I’ve recommended on my own website over the years, and has search capabilities in the first place. If it lacks search, you can fix that readily enough with the free Pagefind tool about which I wrote earlier this year.
  • Does My Website Look Big in This? Six Tips to Lower your Page Weight
    3 projects | dev.to | 20 Oct 2022
    Pagefind for site search — Static search that scales incredibly well, for sites up to 100k+ pages. Pagefind is built with low network requests as a primary constraint, and still manages to deliver an extremely polished performance.
  • Artisanal Web Development
    2 projects | dev.to | 11 Oct 2022
    The open-source search tool Pagefind is a prime example of development under constraint, while also solving a distinct problem. In its case the constraint was network traffic — why ship a large search index file to every user, when indexes can be ‘chunked’ and served atomically, on demand?
  • The Best Marketing Tools and Integrations for your Static Site
    3 projects | dev.to | 2 Oct 2022
    We recommend: Pagefind for static sites up to 100K pages. It’s hard to beat CloudCannon’s own Pagefind — a free, open-source tool that sips as little of your users’ bandwidth as possible, while serving lightning-fast and finely configurable results. Even better, Pagefind’s search index is rebuilt automatically when your site builds, meaning it’s always up to date. For larger sites or more complex needs, we’d recommend using a paid, hosted search service like Algolia, or a customized instance of Google Programmable Search.
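Two ideas recur in the posts above: search engines are built on an inverted index (posting lists), and a static index can be split into chunks that are fetched only when a query needs them. The following is a minimal Python sketch of both ideas together. It is not Pagefind's implementation; the chunking scheme (one chunk per first character), the `ChunkedSearch` class, and the sample pages are all assumptions made for illustration.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(pages):
    """Build an inverted index: word -> sorted list of page URLs (a posting list)."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in tokenize(text):
            index[word].add(url)
    return {word: sorted(urls) for word, urls in index.items()}


def chunk_index(index):
    """Split the index into chunks keyed by each word's first character.

    Hypothetical scheme for illustration; a real tool would balance chunk sizes.
    """
    chunks = defaultdict(dict)
    for word, postings in index.items():
        chunks[word[0]][word] = postings
    return dict(chunks)


class ChunkedSearch:
    """Answers queries while loading only the chunks a query needs."""

    def __init__(self, chunks):
        self._remote = chunks  # stands in for chunk files on a static host
        self._loaded = {}      # chunks "downloaded" so far
        self.fetches = 0       # number of simulated downloads

    def _chunk_for(self, word):
        key = word[0]
        if key not in self._loaded:
            self._loaded[key] = self._remote.get(key, {})
            self.fetches += 1
        return self._loaded[key]

    def search(self, query):
        """Return URLs of pages containing every word in the query."""
        words = tokenize(query)
        if not words:
            return []
        postings = [set(self._chunk_for(w).get(w, [])) for w in words]
        return sorted(set.intersection(*postings))


# Made-up sample pages, standing in for a built static site.
pages = {
    "/rust": "Rust compiles to Wasm",
    "/search": "Static search with Wasm",
    "/blog": "Start a static blog",
}
searcher = ChunkedSearch(chunk_index(build_index(pages)))
print(searcher.search("static wasm"))  # ['/search']
print(searcher.fetches)                # 2 (only the "s" and "w" chunks were loaded)
```

The point of the design is that a visitor who searches once pays only for the chunks their query touches, not for the whole index.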

Stats

Basic pagefind repo stats
Mentions: 23
Stars: 2,774
Activity: 9.3
Last commit: 15 days ago

CloudCannon/pagefind is an open-source project licensed under the MIT License, which is an OSI-approved license.

The primary programming language of pagefind is Rust.
