tidyquery vs janitor

Metric | tidyquery | janitor
---|---|---
Mentions | 2 | 2
Stars | 167 | 1,400
Growth | - | 0.1%
Activity | 0.0 | 5.5
Last commit | about 2 years ago | about 2 months ago
Language | R | R
License | Apache License 2.0 | GNU General Public License v3.0 or later
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tidyquery
- Can "dplyr" code automatically be converted to SQL code?
tidyquery
- ClickHouse as an alternative to Elasticsearch for log storage and analysis
> SQL is a perfect language for analytics.
Slightly off topic, but I strongly agree with this statement and wonder why the languages used for a lot of data science work (R, Python) don't have such a strong focus on SQL.
It might just be my brain, but SQL makes so much logical sense as a query language and, with small variances, is used to directly query so many databases.
In R, why learn the data.table (OK, speed) or dplyr paradigms, when SQL can be easily applied directly to data frames? There are libraries to support this like sqldf[1], tidyquery[2] and duckdf[3] (author). And I'm sure the situation is similar in Python.
This is not a post against great libraries like data.table and dplyr, which I do use from time to time. It's more of a question about why SQL is not more popular as the query language du jour for data science.
[1] https://cran.r-project.org/web/packages/sqldf/index.html
[2] https://github.com/ianmcook/tidyquery
[3] https://github.com/phillc73/duckdf
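To make the comment above concrete, here is a minimal sketch of running the same SQL against an in-memory data frame with two of the packages it names. It assumes sqldf and tidyquery are installed, and it uses the built-in `mtcars` data set purely for illustration; the query itself is not from the original post.

```r
# Assumption: both packages are installed, e.g.
# install.packages(c("sqldf", "tidyquery"))
library(sqldf)
library(tidyquery)

sql <- "
  SELECT cyl, AVG(mpg) AS avg_mpg
  FROM mtcars
  GROUP BY cyl
  ORDER BY cyl
"

# sqldf looks up the data frame by the name used in the FROM clause
# and runs the statement in a temporary SQLite database.
by_sqldf <- sqldf(sql)

# tidyquery translates the SQL into dplyr verbs and evaluates them
# against the same data frame.
by_tidyquery <- query(sql)

print(by_sqldf)
print(by_tidyquery)

# show_dplyr() prints the dplyr code that tidyquery generates for a query.
show_dplyr(sql)
```

Both calls return one row per distinct `cyl` value. The two packages differ in approach: sqldf hands the data to a real SQL engine, while tidyquery stays inside dplyr, so SQL features outside tidyquery's supported subset may only work with sqldf.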
janitor
- Working with columns names that are numbers (in this case, years)
I would just clean the names and work with those. Then there is no need to use backticks. Read about the function clean_names in the janitor vignette: https://github.com/sfirke/janitor
- R Libraries Every Data Scientist Should Know - Pyoflife
I just stumbled across janitor, which can help you clean column names easily.
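As a rough illustration of the `clean_names` advice in the mentions above, the sketch below builds a small, hypothetical table whose column names are years and then cleans them. The data values and the `wide` and `cleaned` names are made up for the example; only `clean_names()` comes from janitor.

```r
# Assumption: janitor is installed, e.g. install.packages("janitor")
library(janitor)

# Hypothetical wide table with years as column names.
# check.names = FALSE keeps the non-syntactic names as they are.
wide <- data.frame(
  country = c("A", "B"),
  `2019` = c(1.2, 3.4),
  `2020` = c(1.5, 3.1),
  check.names = FALSE
)

# Without cleaning, the year columns need backticks: wide$`2019`
names(wide)

# clean_names() makes names syntactic; names that start with a digit
# get an "x" prefix, so "2019" becomes "x2019".
cleaned <- clean_names(wide)
names(cleaned)

# The cleaned columns can now be referenced without backticks.
cleaned$x2020 - cleaned$x2019
```

After cleaning, the year columns behave like any other syntactic name in formulas, `dplyr` verbs, and `$` access.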
What are some alternatives?
duckdf - 🦆 SQL for R dataframes, with ducks
parquetize - R package that converts databases of different formats to Parquet format
tidylog - Tidylog provides feedback about dplyr and tidyr operations. It provides wrapper functions for the most common functions, such as filter, mutate, select, and group_by, and provides detailed output for joins.
IntRo - Introduction to R for health data
tidyverse - Easily install and load packages from the tidyverse
Practical-Applications-in-R-for-Psychologists - Lesson files for Practical Applications in R for Psychologists.
cloki-go-legacy - Clickhouse Loki API in GO (WIP)
desctable - An R package to produce descriptive and comparative tables
tidyquant - Bringing financial analysis to the tidyverse
tidytext - Text mining using tidy tools
clickhousedb_fdw - PostgreSQL's Foreign Data Wrapper For ClickHouse
datapasta - On top of spaghetti, all covered in cheese....
