pybaseball
mara-pipelines
Metric | pybaseball | mara-pipelines
---|---|---
Mentions | 33 | 3
Stars | 1,113 | 2,054
Growth | - | 0.4%
Activity | 5.0 | 6.0
Last commit | 23 days ago | 4 months ago
Language | Python | Python
License | MIT License | MIT License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
pybaseball
- pybaseball help
- Best baseball data source for data scraping
- Baseball Savant API
  pybaseball and MLB-StatsAPI are the go-to Python wrappers for the official MLB API.
- Data sources for MLB ABs?
- Looking for Spray Angle Data from this season (preferably)
  I think pybaseball includes spray angle in their statcast() data frame! And if not, there definitely is the (x,y) location of where the ball landed, so you could probably do some trig to calculate that too.
- Anyone willing to help create a webscrape for first pitch data?
  Check out the docs here: https://github.com/jldbc/pybaseball/blob/master/docs/playerid_reverse_lookup.md
- Python Code help
  I think a lot of what you're looking to do can be accomplished more easily with pre-existing Python libraries like pybaseball.
- Help running pybaseball commands in Python
- MLB Stats API Application time?
  Most folks without direct access to MLB's API scrape Baseball Savant's data API. Packages like baseballr or pybaseball can help with this. Remember, this is in the open on a trust model: no commercial use, and don't hammer the API.
- Where to get started analyzing basic baseball metrics
  And if you’re using Python, it’s pybaseball I believe. https://github.com/jldbc/pybaseball
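The spray-angle comment above suggests doing "some trig" on the (x, y) landing location. A minimal sketch of that idea follows, not pybaseball's own method. The function name, the argument names, and the home-plate coordinates are all hypothetical. It assumes image-style coordinates in which y grows downward, so center field has a smaller y than home plate. Check the coordinate system of the data you actually load before relying on it.

```python
import math


def spray_angle(hit_x, hit_y, home_x, home_y):
    """Horizontal angle of a batted ball in degrees.

    0 means straight up the middle (home plate toward center field),
    positive values lie toward the +x side, negative toward the -x side.

    Assumption: image-style coordinates where y increases downward,
    so the outfield has SMALLER y values than home plate.
    """
    dx = hit_x - home_x   # sideways offset from home plate
    dy = home_y - hit_y   # flipped so that "toward center field" is positive
    # atan2(dx, dy) measures the angle away from the center-field axis
    return math.degrees(math.atan2(dx, dy))


# Hypothetical home plate at (100, 200); none of these numbers come from real data.
print(round(spray_angle(100, 100, 100, 200), 1))  # ball hit straight up the middle
print(round(spray_angle(200, 100, 100, 200), 1))  # ball hit toward the +x side
```

Passing home plate in as a parameter keeps the sketch independent of any particular data source. Which side of the field counts as positive depends on how the source data orients its x axis.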
mara-pipelines
- How to keep track of the different Transformations done in an ETL pipeline?
  The closest I've found is Mara but not what I'm after.
- Using PostgreSQL as a Data Warehouse
  The tooling behind the approach has been built as a set of Python packages named Mara. It is available on GitHub:
  https://github.com/mara/mara-pipelines
  Additional packages can be found at the Mara org:
  https://github.com/mara
- Build your own “data lake” for reporting purposes
  MinIO and NiFi each require machines of their own. You are better off with pure Python, and if one wants something lightweight and visually pleasing, Mara [0] or Dagster with Dagit [1] will do the job.
  [0] https://github.com/mara/mara-pipelines
  [1] https://docs.dagster.io/tutorial/execute
What are some alternatives?
MLB-StatsAPI - Python wrapper for MLB Stats API
abcd-hcp-pipeline - BIDS application for processing functional MRI data, robust to scanner, acquisition, and age variability.
baseballr - A package written for R focused on baseball analysis. Currently in development.
kuwala - Kuwala is the no-code data platform for BI analysts and engineers enabling you to build powerful analytics workflows. We are set out to bring state-of-the-art data engineering tools you love, such as Airbyte, dbt, or Great Expectations together in one intuitive interface built with React Flow. In addition we provide third-party data into data science models and products with a focus on geospatial data. Currently, the following data connectors are available worldwide: a) High-resolution demographics data b) Point of Interests from Open Street Map c) Google Popular Times
boxball - Prebuilt Docker images with Retrosheet's complete baseball history data for many analytical frameworks. Includes Postgres, cstore_fdw, MySQL, SQLite, Clickhouse, Drill, Parquet, and CSV.
dbt-core - dbt enables data analysts and engineers to transform their data using the same practices that software engineers use to build applications.
sports.py - A simple Python package to gather live sports scores
etl-markup-toolkit - ETL Markup Toolkit is a spark-native tool for expressing ETL transformations as configuration
strat-o-rama - Generating plausible Strat-O-Matic cards from MLB data
dremio-oss - Dremio - the missing link in modern data
baseball-pi - Get the live box score, plays, and batter stats of your favorite MLB team right on your desktop.
airbyte - The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.