I can pretty easily add any StackExchange sites I left out, or anything that comes as a Zim file (e.g. from https://farm.openzim.org/recipes or the like). If it’d be appropriate for https://devdocs.io (official docs with suitable licensing), you can contribute a crawler to them and it’ll flow downstream to me.
I also have plans to do proper web crawling, though it’ll take me a while to get there: https://search.feep.dev/blog/post/2022-08-10-crawling-roadma...
-
Congratulations, I always enjoy new search engines.
W.r.t. "and updated intermittently," I wanted to draw your attention to the HN realtime API: https://github.com/HackerNews/API#live-data and also that S.O. offers Atom Feeds: https://stackoverflow.com/feeds/ (I'd guess the rest do, too, but I didn't verify)
I am a huge proponent of taking advantage of any update features that a site offers, because otherwise the "how about now?" of re-crawling is wasteful to both parties.
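To make that concrete, here is a rough sketch of what update-driven refreshing could look like, as opposed to blind re-crawling. It assumes the documented HN endpoints `/v0/updates.json` and `/v0/item/<id>.json`, and a standard Atom feed with `<entry>`, `<id>`, and `<updated>` elements. The function names (`changed_hn_items`, `atom_entries_since`, `parse_ts`) and the injectable `fetch` parameter are my own, purely for illustration, not anything the search engine actually uses.

```python
import json
import urllib.request
import xml.etree.ElementTree as ET
from datetime import datetime

HN_BASE = "https://hacker-news.firebaseio.com/v0"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def fetch_json(url):
    """Fetch a URL and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def changed_hn_items(fetch=fetch_json):
    """Return the records of HN items listed as recently changed.

    `fetch` is injectable (hypothetical design choice) so the logic can be
    exercised without touching the network.
    """
    updates = fetch(f"{HN_BASE}/updates.json") or {}
    items = []
    for item_id in updates.get("items", []):
        item = fetch(f"{HN_BASE}/item/{item_id}.json")
        if item is not None:  # a missing item may come back as null
            items.append(item)
    return items


def parse_ts(ts):
    """Parse an RFC 3339 timestamp such as 2022-08-10T00:00:00Z.

    The "Z" rewrite is needed because datetime.fromisoformat only accepts
    a trailing "Z" from Python 3.11 onward.
    """
    return datetime.fromisoformat(ts.strip().replace("Z", "+00:00"))


def atom_entries_since(feed_xml, last_seen):
    """Return the <id> of each Atom entry updated after `last_seen`."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for entry in root.findall("atom:entry", ATOM_NS):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        updated = entry.findtext("atom:updated", default="", namespaces=ATOM_NS)
        if updated and parse_ts(updated) > last_seen:
            fresh.append(entry_id)
    return fresh
```

The idea is that an indexer would call `changed_hn_items()` on a timer and re-index only what comes back, and would keep the newest `<updated>` timestamp it has seen per feed to pass as `last_seen` on the next poll.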
-
- Brave (recently started its own index but often falls back on Google's)
Love to see projects like Marginalia and now this. These projects also make meta search engines like Searx[0] that much more powerful.
Anyway, since I'm in the business of listing out relevant projects, other code-centered search engines you might want to check out are searchcode.com[1], codesearch.ai[2], symbolhound[3], and publicwww.com[4] (some of these are often down, but might still be good to learn from).
-
- https://pldb.com/ (might be a good way to automatically get all the docs of each programming language as well as books/videos/publications that mention a certain language)