mwmbl vs nifi
| | mwmbl | nifi |
|---|---|---|
| Mentions | 27 | 35 |
| Stars | 1,362 | 4,410 |
| Growth | 1.0% | 1.8% |
| Activity | 9.4 | 9.9 |
| Last commit | 10 days ago | 7 days ago |
| Language | Python | Java |
| License | GNU Affero General Public License v3.0 | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
mwmbl
- FLaNK Stack Weekly 19 Feb 2024
- Text Processing Practice Expt: 27 SERP Types to SQLite (Yy084)
  echo "https://mwmbl.org/?q=$x"|client 185.34.32.175
- How bad are search results? Compare Google, Bing, Marginalia, Kagi, and ChatGPT
  Ironically I had to use a search engine to discover what "Mwmbl" was. It's apparently a search engine. But, visiting the front page, I see something akin to a git commit log?! I'm not sure I'd have guessed that this was a SE if Brave Search did not tell me it was (even then I'm not convinced yet).
  https://mwmbl.org/
- Indexing a Billion Pages
  I believe this is closer to the thing you were asking about, and the simple answer appears to be "a home grown one in python" https://github.com/mwmbl/mwmbl/blob/e544d45c374c13cdc1a5048d...
- Welcome to mwmbl, the free, open-source and non-profit search engine
- Marginalia.nu API
- Show HN: Ichido, search engine that tags sites using Google and Cloudflare
- Introduction!
- Mwmbl, the free, open-source and non-profit search engine
nifi
- FLaNK Stack Weekly 19 Feb 2024
- Ask HN: What are some unpopular technologies you wish people knew more about?
- FLaNK Stack Weekly for 13 November 2023
- Ask HN: What low code platforms are worth using?
  Apache NiFi (https://nifi.apache.org/).
  It uses the concept of flow-based programming. It is underacknowledged, but this tool is very flexible. I have used it as an event bus for all the third-party integrations.
- Apache Nifi: easy to use, powerful, reliable system to process, distribute data
- Tool decision - What architecture would you choose and why?
- Help with choosing techstack for a new DE team
  Presently setting up Apache NiFi + Apache MiNiFi for the ETL portion of my work. NiFi was easy enough to figure out, but the docs for MiNiFi have been a pain due to differences between the Java and C++ versions. I then configured it entirely with the Java version so that it was easier to search for answers about the MiNiFi YAML syntax.
- MS SQL Change Data Capture
  Found it
- Is there something like airflow but written in Scala/Java?
  Apache Camel, Apache NiFi, Spring Cloud
- Json splitting and Rerouting (new to nifi)
  NiFi, like most Apache projects, does most of its discussion on its mailing lists, but also has a Slack.
What are some alternatives?
Lobsters - Computing-focused community centered around link aggregation and discussion
Logstash - Logstash - transport and process your logs, events, or other data
whoogle-search - A self-hosted, ad-free, privacy-respecting metasearch engine
superset - Apache Superset is a Data Visualization and Data Exploration Platform
PiTheremin
code-search-blocklist - A list of domains hosting scraped code snippets and polluting search results to block.
meltano - Meltano: the declarative code-first data integration engine that powers your wildest data and ML-powered product ideas. Say goodbye to writing, maintaining, and scaling your own API integrations.
ublock-origin-shitty-copies-filter - Filter for DuckDuckGo and Google to remove those spam-websites that just blatantly copy and paste content from well known websites.
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
ublacklist - Blocks specific sites from appearing in Google search results
Metabase - The simplest, fastest way to get business intelligence and analytics to everyone in your company