https://hn.algolia.com/?dateRange=all&page=0&prefix=false&qu...
Outside of legacy systems, Hadoop isn't widely used anymore.
Another tip is to use wget2 instead of wget if you're mirroring a site (though this is more of an I/O tip than a computational one): https://gitlab.com/gnuwget/wget2/-/wikis/home
Sadly, wget2 didn't support WARC the last time I checked, but it comes with a `--max-threads` parameter that, together with `--mirror` and `--tries`, makes it trivial to mirror even the slowest websites out there.
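For illustration, here is a minimal sketch of how those three flags might be combined from a script. The helper names (`build_wget2_mirror_cmd`, `mirror`), the example URL, and the default values for threads and retries are all assumptions made for this example, not recommendations from the wget2 project; tune them for the site you are mirroring.

```python
import shutil
import subprocess


def build_wget2_mirror_cmd(url, max_threads=8, tries=3):
    """Return the argv list for mirroring `url` with wget2.

    The defaults for max_threads and tries are illustrative assumptions.
    """
    return [
        "wget2",
        "--mirror",                       # recursive download with timestamping
        f"--max-threads={max_threads}",   # number of parallel download threads
        f"--tries={tries}",               # retries per URL before giving up
        url,
    ]


def mirror(url, **kwargs):
    """Run wget2 if it is installed and return its exit code."""
    if shutil.which("wget2") is None:
        raise FileNotFoundError("wget2 not found on PATH")
    # check=False: wget2 can exit non-zero when only some URLs fail,
    # so the caller decides how to treat the return code.
    return subprocess.run(build_wget2_mirror_cmd(url, **kwargs), check=False).returncode


if __name__ == "__main__":
    # Hypothetical target; replace with the site you want to mirror.
    print(" ".join(build_wget2_mirror_cmd("https://example.com/")))
```

Building the argument list separately from running it makes it easy to inspect the exact command before pointing it at a slow site.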
NOTE:
The number of mentions on this list counts mentions in common posts plus user-suggested alternatives.
Hence, a higher number means a more popular project.