GlusterFS
Apache Solr
| | GlusterFS | Apache Solr |
|---|---|---|
| Mentions | 19 | 31 |
| Stars | 4,498 | 4,366 |
| Growth | 1.0% | 0.0% |
| Activity | 6.4 | 0.0 |
| Last commit | 5 days ago | 2 months ago |
| Language | C | Java |
| License | GNU General Public License v3.0 only | Apache License 2.0 |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
GlusterFS
-
Tell HN: ZFS silent data corruption bugfix – my research results
https://github.com/gluster/glusterfs/issues/894
And apparently, apart from modern coreutils using that, it is mostly Gentoo users hitting the bugs in lseek.
-
Linux deserves a better class of friends
This Product Appendix does not apply to online service offerings managed by Red Hat or generally available open source projects such as www.wildfly.org, www.fedoraproject.org, www.openstack.redhat.com, www.gluster.org, www.centos.org, okd.io, Ansible Project Software or other community projects.
-
Which distributed filesystem to use on a 4 node cluster?
Just because Red Hat will stop selling commercial support for their product does not mean GlusterFS itself is dying. It's an open source project like any other - https://github.com/gluster/glusterfs
-
Setting up a 2 node distributed network share
https://www.gluster.org/ is the way to do this across nodes.
-
System Design: Netflix
This allows us to fetch the desired quality of the video as per the user's request, and once the media file finishes processing, it will be uploaded to a distributed file storage such as HDFS, GlusterFS, or an object storage such as Amazon S3 for later retrieval during streaming.
-
What's the best way to periodically sync two remote servers?
GlusterFS
-
System Design: The complete course
But where can we store files at scale? Well, object storage is what we're looking for. Object stores break data files up into pieces called objects. They then store those objects in a single repository, which can be spread out across multiple networked systems. We can also use distributed file storage such as HDFS or GlusterFS.
-
First Apartment and First Homelab
GlusterFS - same as above (https://www.gluster.org/)
-
Multiple DS units acting as one?
What you are looking for is a clustered file system, like https://www.gluster.org/. As long as all units are close by with low latency, there are a couple of solutions that allow you to create distributed storage of various kinds: key-value stores aplenty, clustered file systems that pretend to be one file system, etc. If you have geographically distributed units with high latencies, it becomes harder. Most open source systems don't work really well in this scenario. There were a couple of attempts like Hydrabase, but they didn't get far. It is normally solved by running two clusters and then replicating between them.
-
Upload pdf file to mongodb atlas
I'd imagine most managed service providers are going to require a credit card, though most of them have a free tier. If you want to take an unmanaged approach, maybe look into Gluster. I've used it before and never had an issue with it, but I also had an infrastructure team that set it up, so I'm not familiar with the challenges of going that way: https://www.gluster.org/
Apache Solr
- Getting started with Elasticsearch: Basic concepts
-
YaCy, a distributed Web Search Engine, based on a peer-to-peer network
There are already many projects about search:
- https://www.marginalia.nu/
- https://searchmysite.net/
- https://lucene.apache.org/
- Elasticsearch
- https://presearch.com/
- https://stract.com/
- https://wiby.me/
I think that all of these projects are fun. I would like to see one succeed at reaching a mainstream level of attention.
I have also been gathering link metadata for some time. Maybe I will use it to feed an eventual self-hosted search engine, or language model, if I decide to experiment with that.
- domains for seed https://github.com/rumca-js/Internet-Places-Database
- bookmarks seed https://github.com/rumca-js/RSS-Link-Database
- links for year https://github.com/rumca-js/RSS-Link-Database-2024
-
Getting started with Elasticsearch + Python
Elasticsearch is based on Lucene and is used by various companies and developers across the world to build custom search solutions.
-
Tools to use to query and index data?
Elasticsearch is kinda heavyweight infra for a small project. It's built on top of Apache Lucene (https://lucene.apache.org), which you can use directly.
-
Top metrics for Elasticsearch monitoring with Prometheus
Elasticsearch is based on Lucene, which is built in Java. This means that monitoring the Java Virtual Machine (JVM) memory is crucial to understand the current usage of the whole system.
-
Cross data type search that wasn’t supported well using Elasticsearch
Apache Lucene which seems to have a lot more features than Elasticsearch
-
How to find closest keyphrase match in text?
Generally with term vectors and a tf-idf index. Lucene is a good starting place to help.
-
Java Library to perform string search
Try Elasticsearch or Solr. Behind the scenes they both use https://lucene.apache.org/, if you don't want basically a full NoSQL database service, but I'd just slap Solr up and call it a day.
-
Top 8 Open-Source Observability & Testing Tools
OpenSearch is an open-source database to ingest, search, visualize, and analyze data. It’s built on top of Apache Lucene, a FOSS library for indexing and search, which OpenSearch leverages for more advanced analytics capabilities, like anomaly detection, machine learning, full-text search, and more.
-
grep like search with preprocessing
Lucene is the thing you think you need. Elasticsearch is a nice wrapper for it. But these are Java, so maybe you want Sphinx Search (C++) or MeiliSearch (Rust).
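One of the replies above suggests term vectors and a tf-idf index for finding the closest keyphrase match. A minimal sketch of that idea in plain Python might look like the following. It is not how Lucene, Solr, or Elasticsearch score documents (Lucene's default similarity is BM25), and the function names `tokenize`, `build_index`, `cosine`, and `closest`, as well as the sample phrases, are hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(phrases):
    """Return (idf, vectors): smoothed idf weights and one tf-idf vector per phrase."""
    docs = [Counter(tokenize(p)) for p in phrases]
    n = len(docs)
    # Document frequency: in how many phrases each term occurs.
    df = Counter(term for doc in docs for term in doc)
    # Smoothed idf, so a term present in every phrase still gets a positive weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in doc.items()} for doc in docs]
    return idf, vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest(query, phrases):
    """Return (phrase, score) for the candidate most similar to the query."""
    idf, vectors = build_index(phrases)
    # Query terms that occur in no candidate phrase are dropped.
    q = {t: tf * idf[t] for t, tf in Counter(tokenize(query)).items() if t in idf}
    scores = [cosine(q, v) for v in vectors]
    # On ties (including all-zero scores) the earliest phrase wins.
    best = max(range(len(phrases)), key=scores.__getitem__)
    return phrases[best], scores[best]


if __name__ == "__main__":
    phrases = ["distributed file system", "full text search engine", "object storage"]
    print(closest("search engine for text", phrases))
```

The index is rebuilt on every call to keep the sketch short. For more than a handful of phrases, building it once and reusing it, or handing the job to Lucene or Solr as the replies recommend, is likely the better choice.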
What are some alternatives?
minio - The Object Store for AI Data Infrastructure
OpenSearch - 🔎 Open source distributed and RESTful search engine.
lizardfs - LizardFS is an Open Source Distributed File System licensed under GPLv3.
Typesense - Open Source alternative to Algolia + Pinecone and an Easier-to-Use alternative to ElasticSearch ⚡ 🔍 ✨ Fast, typo tolerant, in-memory fuzzy Search Engine for building delightful search experiences
Tahoe-LAFS - The Tahoe-LAFS decentralized secure filesystem.
MeiliSearch - A lightning-fast search API that fits effortlessly into your apps, websites, and workflow
Go IPFS - IPFS implementation in Go [Moved to: https://github.com/ipfs/kubo]
Elasticsearch - Free and Open, Distributed, RESTful Search Engine
btrfs - Haskell bindings to the btrfs API
loki - Like Prometheus, but for logs.
MooseFS - MooseFS – Open Source, Petabyte, Fault-Tolerant, Highly Performing, Scalable Network Distributed File System (Software-Defined Storage)
Apache Lucene - Apache Lucene.NET