spark-solr
mmlspark
| | spark-solr | mmlspark |
|---|---|---|
| Mentions | 2 | 2 |
| Stars | 445 | 2,489 |
| Growth | 0.2% | - |
| Activity | 2.8 | 9.3 |
| Last commit | 5 months ago | over 2 years ago |
| Language | Scala | Scala |
| License | Apache License 2.0 | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
spark-solr
- How to store 175 million rows and query them
- Alternatives to update by query
  You could use Spark
mmlspark
What are some alternatives?
spark-nlp - State of the Art Natural Language Processing
SynapseML - Simple and Distributed Machine Learning
isolation-forest - A Spark/Scala implementation of the isolation forest unsupervised outlier detection algorithm.
polyaxon - MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
Activeloop Hub - Data Lake for Deep Learning. Build, manage, query, version, & visualize datasets. Stream data real-time to PyTorch/TensorFlow. https://activeloop.ai [Moved to: https://github.com/activeloopai/deeplake]
Spark Utils - Basic framework utilities to quickly start writing production ready Apache Spark applications
metarank - A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagement. A friendly Learn-to-Rank engine
dblink - Distributed Bayesian Entity Resolution in Apache Spark
Spark Tools - Executable Apache Spark Tools: Format Converter & SQL Processor
qbeast-spark - Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
netron - Visualizer for neural network, deep learning and machine learning models
Clustering4Ever - C4E, a JVM friendly library written in Scala for both local and distributed (Spark) Clustering.