worldfootballR
dplyr
| | worldfootballR | dplyr |
|---|---|---|
| Mentions | 7 | 40 |
| Stars | 392 | 4,634 |
| Growth | - | 0.5% |
| Activity | 9.0 | 7.4 |
| Latest commit | 2 months ago | 16 days ago |
| Language | R | R |
| License | - | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
worldfootballR
-
[OC] Liverpool Substitutions Using worldfootballR and GT
Data extracted using worldfootballR
dplyr
-
Show HN: Open-source, browser-local data exploration using DuckDB-WASM and PRQL
That's great feedback, thanks!
This tool definitely comes from a place of personal need. Beyond just handling large files, I've never really gelled with the Excel/Google Sheets model of changing data in place as if you were editing text. I'm a data scientist and have always preferred the chained data transforms you see in tools like dplyr (https://dplyr.tidyverse.org/) or Polars (https://pola.rs/), and I feel this tool maps very closely to that chained model.
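For readers unfamiliar with the chained-transform style the commenter describes, here is a minimal sketch using pandas method chaining (the data and column names are made up for illustration; dplyr and Polars pipelines read very similarly):

```python
import pandas as pd

# Hypothetical sales data
df = pd.DataFrame({
    "region": ["north", "south", "north", "south"],
    "units":  [10, 3, 7, 12],
    "price":  [2.0, 5.0, 2.0, 5.0],
})

# Each step returns a new frame; the original is never edited in place
result = (
    df
    .assign(revenue=lambda d: d["units"] * d["price"])   # derive a column
    .query("revenue > 10")                               # filter rows
    .groupby("region", as_index=False)["revenue"].sum()  # summarise per group
    .sort_values("revenue", ascending=False)
)
print(result)
```

The point of the style is that the pipeline documents itself: each verb names one transformation, and the source data stays untouched.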
Also, thank you for the feature requests! Those would all be very useful - we'll put them on the roadmap.
-
PSA: You don't need fancy stuff to do good work.
Before diving into advanced machine learning algorithms or statistical models, we need to start with the basics: collecting and organizing data. Fortunately, both Python and R offer a wealth of libraries that make it easy to collect data from a variety of sources, including web scraping, APIs, and reading from files. Key libraries in Python include requests, BeautifulSoup, and pandas, while R has httr, rvest, and dplyr.
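As a tiny illustration of the "reading from files" end of that toolkit, here is a pandas sketch with made-up inline data standing in for a downloaded file (no network needed):

```python
import io
import pandas as pd

# Made-up CSV, standing in for a scraped or downloaded file
raw = io.StringIO(
    "player,team,goals\n"
    "Salah,Liverpool,19\n"
    "Haaland,Man City,27\n"
    "Watkins,Aston Villa,19\n"
)

df = pd.read_csv(raw)
top = df.sort_values("goals", ascending=False).head(1)
print(top["player"].iloc[0])  # prints Haaland
```

The same collect-then-organize step in R would typically use readr or rvest to fetch the data and dplyr to arrange it.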
- osdc-2023-assignment1
-
Modern Polars: an extensive side-by-side comparison of Polars and Pandas
It really can't be said enough how much of a mess pandas is. It has way too much surface area and no common thread pulling it all together. This becomes obvious when you work with better dataframe libraries like dplyr [1] or DataFramesMeta [2]. I've worked on production systems with all of these libraries; this is not gratuitous bashing.
-
How do I find R code for R functions?
There are generally two ways to see the source code for packages. The simplest is to look for the GitHub repository for the package (assuming it exists); for dplyr, it's here. The easiest way to find these is to Google "r github" plus the name of the package; it will usually be one of the first results. The GitHub repo is also usually linked on the package's CRAN page. However, be aware that this may be a development version of the package and not the same version that is currently released on CRAN (e.g. dplyr on CRAN is version 1.0.10, but on GitHub it is listed as version 1.0.99.9000, which will probably become version 1.1.0 when it is released onto CRAN).
-
People who live near other people vote for Democrats
Tools used: various packages in R (tidycensus, dplyr, ggplot2, sf)
-
Used Cars Data Scraping - R & Github Actions & AWS
The idea came from wanting to combine data engineering with cloud and automation. Since it would be an automated pipeline, I needed a dynamic data source, and I also wanted a site where retrieving data wouldn't be a problem so I could practice with both rvest and dplyr. After my experiments with Carvago went smoothly, I added the necessary data cleaning steps. Another goal of the project was to keep the data in different forms in different environments: raw (daily CSV) and processed data are written to the GitHub repo, the processed data is also written to PostgreSQL on AWS RDS, and both raw and processed data are synced to S3 so they can be queried with Athena. I also separated some stages into distinct GitHub Actions steps as good practice: for example, scraping, cleaning, and printing basic analysis to a simple log file run in the first stage, with synchronization to AWS S3 as a separate action. If everything succeeds after all this, a report built with RMarkdown is published to github.io. The result is an end-to-end data pipeline that takes data from the source, applies simple processing, and offers basic reporting.
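The shape of that pipeline (scrape, clean, write a raw daily CSV and a processed copy, append a log line) is language-agnostic. A toy stdlib-Python sketch, with made-up rows standing in for the real Carvago scrape and local files standing in for the repo/RDS/S3 destinations:

```python
import csv
import datetime
import pathlib

def scrape() -> list[dict]:
    # Stand-in for the real rvest scrape; returns made-up listings
    return [
        {"model": "Golf", "price": "12 500", "year": "2018"},
        {"model": "Octavia", "price": "9 900", "year": "2016"},
    ]

def clean(rows: list[dict]) -> list[dict]:
    # Basic cleaning step: normalise prices to integers
    return [{**r, "price": int(r["price"].replace(" ", ""))} for r in rows]

def run(outdir: pathlib.Path) -> None:
    today = datetime.date.today().isoformat()
    raw = scrape()
    processed = clean(raw)
    # Keep raw (dated daily CSV) and processed data separately,
    # mirroring the repo / RDS / S3 split described above
    for name, rows in [(f"raw_{today}.csv", raw), ("processed.csv", processed)]:
        with open(outdir / name, "w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=rows[0].keys())
            writer.writeheader()
            writer.writerows(rows)
    # Simple log line, like the basic-analysis log stage
    with open(outdir / "pipeline.log", "a") as f:
        f.write(f"{today}: scraped {len(raw)} rows, wrote processed\n")

run(pathlib.Path("."))
```

In the real project each of these steps runs as a scheduled GitHub Actions stage, with S3 sync and report publishing as follow-on jobs.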
-
Quick candlestick summaries with Elixir's Explorer
The API is heavily influenced by Tidy Data and borrows much of its design from dplyr. The philosophy is captured by this passage from dplyr's documentation:
-
tidytable v0.8.1 is on CRAN - it also comes with a new logo! Need data.table speed with tidyverse syntax? Check out tidytable.
Also - I might have been the one who put in the request for .by in dplyr 😅
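For readers coming from Python: dplyr's `.by` argument groups the data for a single verb call, with no persistent grouped state to undo afterwards. A rough pandas analogue, using a one-off groupby on hypothetical data:

```python
import pandas as pd

df = pd.DataFrame({"team": ["a", "a", "b"], "goals": [1, 2, 4]})

# dplyr:  mutate(df, total = sum(goals), .by = team)
# pandas: group only for this one operation; df itself stays ungrouped
df["total"] = df.groupby("team")["goals"].transform("sum")
print(df)
```

The appeal of `.by` is exactly this: no group_by()/ungroup() pair to keep balanced around a single operation.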
-
ibis-datasette: Query datasette servers without writing a line of SQL
For my day job I work on ibis. ibis lets users write queries using a familiar dataframe-like API, and then execute those queries on a number of SQL (and non-SQL) backends. Think of it like dplyr but for Python.
What are some alternatives?
Rustler - Safe Rust bridge for creating Erlang NIF functions
ggplot2 - An implementation of the Grammar of Graphics in R
nx - Multi-dimensional arrays (tensors) and numerical definitions for Elixir
explorer - Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
blogdown - Create Blogs and Websites with R Markdown
wesanderson - A Wes Anderson color palette for R
rmarkdown - Dynamic Documents for R
Pandas - Flexible and powerful data analysis / manipulation library for Python, providing labeled data structures similar to R data.frame objects, statistical functions, and much more
polars - Dataframes powered by a multithreaded, vectorized query engine, written in Rust
regression-js - Curve Fitting in JavaScript.
axon - Nx-powered Neural Networks