Lmgrep: Lucene-based grep-like utility

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • lucene-grep

    Grep-like utility based on Lucene Monitor compiled with GraalVM native-image

    Here goes: https://github.com/dainiusjocas/lucene-grep/issues/84

    I realize that a relatively obscure Finnish stemmer combined with Lucene and GraalVM isn't exactly a common use case. I did some testing and described my use case. I certainly have plenty of English-language content to search using lucene-grep. So, thank you for making it!

  • cs

    command line codespelunker or code search

    Neat. This is similar to a tool I have been working on (but need to finish off) as I saw the same issue.

    Except rather than build an index, I brute-forced the search each time. For most repositories it’s fast enough, even with ranking.

    https://github.com/boyter/cs For those interested, it’s still very much a WIP, with noticeable issues in TUI mode.

  • dxr

    DEPRECATED - Powerful search for large codebases

    There is DXR from Mozilla but I'm not sure how generalised it is.

    https://github.com/mozilla/dxr

    There is also Sourcegraph.

  • ArchiveBox

    🗃 Open source self-hosted web archiving. Takes URLs/browser history/bookmarks/Pocket/Pinboard/etc., saves HTML, JS, PDFs, media, and more...

    Not OP, so I can't speak for them. There are a bunch of ways to do this, ranging from turnkey solutions to collections of scripts and extensions you can use. On the turnkey side, there are programs like ArchiveBox [1], which take links and store them as WARC files. You can import your browsing history into ArchiveBox and set up a script to do it automatically. If you'd like to set something up yourself, you can extract your browsing history (e.g., Firefox stores its history in a SQLite database) and manually wget those URLs. For a reference to the more "bootstrapped" version, I'll link to Gwern's post on their archiving setup [2]. It's fairly long, so I advise skipping to the parts you're interested in first.

    1: https://github.com/ArchiveBox/ArchiveBox

    2: https://www.gwern.net/Archiving-URLs
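
The do-it-yourself route mentioned in the ArchiveBox comment (pull URLs out of Firefox's SQLite history database, then fetch them with wget) might look roughly like the sketch below. It is an illustration under stated assumptions, not ArchiveBox's or Gwern's actual setup: the function names `read_history_urls`, `wget_command`, and `archive` are hypothetical, the history file is assumed to be a `places.sqlite` with a `moz_places` table, and wget is assumed to be installed and on the PATH.

```python
import shutil
import sqlite3
import subprocess
import tempfile
from pathlib import Path


def read_history_urls(places_db, limit=100):
    """Return up to `limit` most recently visited http(s) URLs.

    Assumes `places_db` is a Firefox places.sqlite file containing a
    moz_places table with `url` and `last_visit_date` columns.
    """
    # Query a copy, because Firefox locks the live database while running.
    # Visits still sitting in the write-ahead log may be missing from the copy.
    with tempfile.TemporaryDirectory() as tmp:
        copy = Path(tmp) / "places.sqlite"
        shutil.copy2(places_db, copy)
        conn = sqlite3.connect(copy)
        try:
            rows = conn.execute(
                "SELECT url FROM moz_places "
                "WHERE last_visit_date IS NOT NULL AND url LIKE 'http%' "
                "ORDER BY last_visit_date DESC LIMIT ?",
                (limit,),
            ).fetchall()
        finally:
            conn.close()
    return [row[0] for row in rows]


def wget_command(url, out_dir):
    """Build a wget invocation that saves a page plus its images, CSS, and JS."""
    return [
        "wget",
        "--page-requisites",   # also download assets needed to render the page
        "--convert-links",     # rewrite links so the local copy is browsable
        "--directory-prefix", str(out_dir),
        url,
    ]


def archive(urls, out_dir):
    """Fetch each URL in turn; one failed download does not stop the rest."""
    for url in urls:
        subprocess.run(wget_command(url, out_dir), check=False)
```

A run would then be something like `archive(read_history_urls("/path/to/profile/places.sqlite"), "archive")`, where the profile path is a placeholder you would replace with your own. Scheduling that call (cron or similar) gives the "do it automatically" behaviour the comment describes.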

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • To preserve their work journalists take archiving into their own hands

    3 projects | news.ycombinator.com | 4 Aug 2024
  • Search Multi-language Documents in ast-grep

    7 projects | dev.to | 23 Jul 2024
  • Why are so many books listed as "Borrow Unavailable" at the Internet Archive?

    1 project | news.ycombinator.com | 14 Jun 2024
  • Ask HN: What do you use for reading papers?

    1 project | news.ycombinator.com | 14 Jun 2024
  • amber, a code search & replace tool

    11 projects | news.ycombinator.com | 23 May 2024
