data-retrieval
Metric | data-retrieval | Spooq
---|---|---
Mentions | 1 | 1
Stars | 1 | 8
Growth | - | -
Activity | 10.0 | 7.4
Last commit | 9 months ago | about 1 month ago
Language | Clojure | Python
License | GNU General Public License v3.0 or later | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
data-retrieval
[OC] What the major news agencies reported on from November 6 to 12, 2022
The source code for the analysis, the visualization, and the data are open:
* These scripts extract and transform the data.
* You can find the data here.
* This repository contains the resources for the visualization.
Spooq
Using Spooq to load a large scale of data
Link to the project: https://github.com/Breaka84/Spooq/blob/master/spooq/loader/hive_loader.py
What are some alternatives?
graph-vis
Proxmox-load-balancer - Designed to constantly maintain the Proxmox cluster in balance
scrollreveal - Animate elements as they scroll into view. [Moved to: https://github.com/jlmakes/scrollreveal]
pyspark-example-project - Implementing best practices for PySpark ETL jobs and applications.
historical-data
workshop-realtime-data-pipelines - You will inspect and run a sample architecture making use of Apache Pulsar™ and Pulsar Functions for real-time, event-streaming-based data ingestion, cleaning and processing.
scrollreveal - Animate elements as they scroll into view.
data-science-ipython-notebooks - Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
omniparser - A native Golang ETL streaming parser and transform library for CSV, JSON, XML, EDI, text, etc.
hamilton - Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage and metadata. Runs and scales everywhere Python does.
dlt - data load tool (dlt) is an open source Python library that makes data loading easy 🛠️