mwmbl
Metric | apache-iceberg-data-exploration | mwmbl
---|---|---
Mentions | 1 | 27
Stars | 8 | 1,388
Growth | - | 1.9%
Activity | 7.0 | 9.3
Last commit | 4 months ago | 8 days ago
Language | Jupyter Notebook | Python
License | - | GNU Affero General Public License v3.0
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
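As a rough illustration of how to read these numbers, the sketch below converts an activity score into a "top X%" figure and estimates how many stars a month-over-month growth rate corresponds to. Both formulas are assumptions extrapolated from the single example above (9.0 means top 10%) and from the phrase "month over month growth"; the site's actual computation is not published here, and the function names are hypothetical.

```python
def top_percent(activity: float) -> float:
    """Assumed linear mapping: a score s out of 10 means top (10 - s) * 10 percent."""
    if not 0.0 <= activity <= 10.0:
        raise ValueError("activity is expected to lie in [0, 10]")
    return round((10.0 - activity) * 10.0, 1)


def stars_gained_last_month(stars: int, growth_percent: float) -> float:
    """Assumed definition: growth = (stars_now - stars_before) / stars_before."""
    stars_before = stars / (1.0 + growth_percent / 100.0)
    return stars - stars_before


# Values taken from the comparison table.
for name, score in [("apache-iceberg-data-exploration", 7.0), ("mwmbl", 9.3)]:
    print(f"{name}: activity {score} -> roughly top {top_percent(score):g}%")

print(f"mwmbl: about {stars_gained_last_month(1388, 1.9):.0f} stars gained in the last month")
```

Under these assumptions, mwmbl's 9.3 would place it in roughly the top 7% of tracked projects, and its 1.9% growth on 1,388 stars would amount to about 26 new stars.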
apache-iceberg-data-exploration
mwmbl
- FLaNK Stack Weekly 19 Feb 2024
- Text Processing Practice Expt: 27 SERP Types to SQLite (Yy084)
  echo "https://mwmbl.org/?q=$x"|client 185.34.32.175
- How bad are search results? Compare Google, Bing, Marginalia, Kagi, and ChatGPT
  Ironically I had to use a search engine to discover what "Mwmbl" was. It's apparently a search engine. But, visiting the front page, I see something akin to a git commit log?! I'm not sure I'd have guessed that this was a SE if Brave Search did not tell me it was (even then I'm not convinced yet).
  https://mwmbl.org/
- Indexing a Billion Pages
  I believe this is closer to the thing you were asking about, and the simple answer appears to be "a home grown one in python" https://github.com/mwmbl/mwmbl/blob/e544d45c374c13cdc1a5048d...
- Welcome to mwmbl, the free, open-source and non-profit search engine
- Marginalia.nu API
- Show HN: Ichido, search engine that tags sites using Google and Cloudflare
- Introduction!
- Mwmbl, the free, open-source and non-profit search engine
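One of the mentions above queries mwmbl through a URL of the form `https://mwmbl.org/?q=...`. A minimal sketch of building such a URL safely is shown here; it assumes only that `q` parameter pattern, makes no network request, and uses a hypothetical helper name. It does not describe any official mwmbl API.

```python
from urllib.parse import urlencode


def mwmbl_search_url(query: str) -> str:
    """Build a search URL using the ?q= pattern seen in the mention above.

    urlencode percent-escapes reserved characters and encodes spaces as '+',
    so the query survives being embedded in a URL.
    """
    return "https://mwmbl.org/?" + urlencode({"q": query})


print(mwmbl_search_url("apache iceberg"))
```

Encoding the query this way avoids the breakage that raw shell interpolation of `$x` would cause when the search text contains spaces or `&`.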
What are some alternatives?
nifi - Apache NiFi
Lobsters - Computing-focused community centered around link aggregation and discussion
xonsh - Python-powered, cross-platform, Unix-gazing shell.
PiTheremin
FLiPStackWeekly - FLaNK AI Weekly covering Apache NiFi, Apache Flink, Apache Kafka, Apache Spark, Apache Iceberg, Apache Ozone, Apache Pulsar, and more...
whoogle-search - A self-hosted, ad-free, privacy-respecting metasearch engine
code-search-blocklist - A list of domains to block that host scraped code snippets and pollute search results.
ublock-origin-shitty-copies-filter - Filter for DuckDuckGo and Google to remove those spam-websites that just blatantly copy and paste content from well known websites.
ublacklist - Blocks specific sites from appearing in Google search results
searx - Privacy-respecting metasearch engine [Moved to: https://github.com/searx/searx]
bertsearch - Elasticsearch with BERT for advanced document search.
MarginaliaSearch - Internet search engine for text-oriented websites. Indexing the small, old and weird web.