dplyr
Pandas
Metric | dplyr | Pandas
---|---|---
Mentions | 40 | 393
Stars | 4,652 | 41,923
Growth | 0.7% | 1.4%
Activity | 7.4 | 10.0
Last commit | 21 days ago | 3 days ago
Language | R | Python
License | GNU General Public License v3.0 or later | BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
dplyr
-
Show HN: Open-source, browser-local data exploration using DuckDB-WASM and PRQL
That's great feedback, thanks!
This tool definitely comes from a place of personal need - beyond just handling large files, I've also never really gelled well with the Excel/Google Sheet model of changing data in place as if you were editing text. I'm a Data Scientist and always preferred the chained data transforms you see in things like dplyr (https://dplyr.tidyverse.org/) or Polars (https://pola.rs/) and I feel this tool maps very closely to the chained model.
Also, thank you for the feature requests! Those would all be very useful - we'll put them on the roadmap.
-
Is it possible for an R package to set an R option that only affects that package?
There's an example of how to use zzz.R with an .onLoad() function to set options in the dplyr code base: https://github.com/tidyverse/dplyr/blob/bbcfe99e29fe737d456b0d7adc33d3c445a32d9d/R/zzz.r
-
Calculation within a data table by calling on specific values in two columns
Look at the tidyverse, especially the case_when or mutate functions.
-
PSA: You don't need fancy stuff to do good work.
Before diving into advanced machine learning algorithms or statistical models, we need to start with the basics: collecting and organizing data. Fortunately, both Python and R offer a wealth of libraries that make it easy to collect data from a variety of sources, including web scraping, APIs, and reading from files. Key libraries in Python include requests, BeautifulSoup, and pandas, while R has httr, rvest, and dplyr.
-
Creating data frame
It looks like your syntax is wrong. I think you’re trying to calculate a new variable in your data frame, or alter an existing column in it. Have a look at the select() function in this reference for the proper syntax to use: https://dplyr.tidyverse.org/ Does that help?
-
I'm designing a shirt for a friend, it has 4 embroidered images of things they like/do. One thing is coding, they use R... I'm wondering two things. 1) What's a good image or piece of code or something that I should use? and 2) should I even add it to the shirt design?
A lot of popular libraries have their own logos. Maybe one of them would be good. Check out dplyr for example: https://dplyr.tidyverse.org/
-
Anyone use Python for statistics, particularly DOE or QA/QC? What are your thoughts?
I hope you give it a try when you get a chance: https://dplyr.tidyverse.org/
-
Rstudio tidyverse help!
You can read up on the dplyr-verbs here, which I strongly suggest for your exam! In the code examples, you can simply click on any function you don't understand and it will take you directly to the documentation. Good Luck!
- Beginner question
- osdc-2023-assignment1
Pandas
-
Deploying a Serverless Dash App with AWS SAM and Lambda
Dash is a Python framework that enables you to build interactive frontend applications without writing a single line of JavaScript. Internally and in projects, we like to use it to build quick proofs of concept for data-driven applications because of its nice integration with Plotly and pandas. For this post, I'm going to assume that you're already familiar with Dash and won't explain that part in detail. Instead, we'll focus on what's necessary to make it run serverless.
-
Help Us Build Our Roadmap – Pydantic
There is a pull request to integrate in both pydantic extra types and the pandas code [1]
[1]: https://github.com/pandas-dev/pandas/issues/53999
-
Stuff I Learned during Hanukkah of Data 2023
Last year I worked through the challenges using VisiData, Datasette, and Pandas. I walked through my thought process and solutions in a series of posts.
-
Introducing Flama for Robust Machine Learning APIs
pandas: A library for data analysis in Python
-
Exploring Open-Source Alternatives to Landing AI for Robust MLOps
Data analysis involves scrutinizing datasets for class imbalances or protected features and understanding their correlations and representations. A classical tool like pandas would be my obvious choice for most of the analysis, and I would use OpenCV or Scikit-Image for image-related tasks.
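As a rough illustration of the kind of check described above, here is a minimal pandas sketch. The table, its column names (`label`, `width`, `height`), and the values are all hypothetical, made up only to show class shares and feature correlations side by side.

```python
import pandas as pd

# Hypothetical annotation table: one row per image, with a class label
# and two numeric features. None of this comes from a real dataset.
df = pd.DataFrame(
    {
        "label": ["cat", "cat", "cat", "dog", "cat", "dog", "cat", "cat"],
        "width": [10, 12, 11, 30, 9, 28, 13, 10],
        "height": [5, 6, 5, 15, 4, 14, 7, 5],
    }
)

# Share of each class; a heavily skewed split hints at class imbalance.
class_share = df["label"].value_counts(normalize=True)

# Pairwise Pearson correlation between the numeric features.
corr = df[["width", "height"]].corr()

print(class_share)
print(corr)
```

Here `value_counts(normalize=True)` returns fractions rather than raw counts, which makes an imbalance easier to spot at a glance.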
-
Mastering Pandas read_csv() with Examples - A Tutorial by Codes With Pankaj
Pandas, a powerful data manipulation library in Python, has become an essential tool for data scientists and analysts. One of its key functions is read_csv(), which allows users to read data from CSV (Comma-Separated Values) files into a Pandas DataFrame. In this tutorial, brought to you by CodesWithPankaj.com, we will explore the intricacies of read_csv() with clear examples to help you harness its full potential.
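For readers who have not used it, a minimal sketch of a `read_csv()` call might look like the following. The CSV content and its column names are hypothetical, and an in-memory string stands in for a file path so the snippet runs on its own.

```python
import io

import pandas as pd

# Hypothetical CSV content; in practice this would usually be a file path.
csv_text = """name,joined,score
Ada,2023-01-15,91.5
Grace,2023-02-01,
Linus,2023-03-20,78.0
"""

df = pd.read_csv(
    io.StringIO(csv_text),     # any file path or file-like object works here
    parse_dates=["joined"],    # parse this column as datetimes
    dtype={"name": "string"},  # use pandas' string dtype for names
)

print(df.dtypes)
print(df)
```

The empty `score` field in the second row is read as a missing value (`NaN`), which is the default behaviour for blank numeric cells.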
-
What Would Go in Your Dream Documentation Solution?
So, what I'd like to do is write a documentation package in Python to recreate what I've lost. I plan to build upon the fantastic python-docx and docxtpl packages, and I'll probably rely on pandas for much of the tabular stuff. Here are the features I intend to include:
-
How do people know when to use what programming language?
Weirdly most of my time spent with data analysis was in the C layers in pandas.
- Read files from s3 using Pandas/s3fs or AWS Data Wrangler?
-
10 Github repositories to achieve Python mastery
Explore here.
What are some alternatives?
worldfootballR - A wrapper for extracting world football (soccer) data from FBref, Transfermarkt, Understat and fotmob
Cubes - [NOT MAINTAINED] Light-weight Python OLAP framework for multi-dimensional data analysis
Rustler - Safe Rust bridge for creating Erlang NIF functions
tensorflow - An Open Source Machine Learning Framework for Everyone
ggplot2 - An implementation of the Grammar of Graphics in R
orange - 🍊 📊 💡 Orange: Interactive data analysis
nx - Multi-dimensional arrays (tensors) and numerical definitions for Elixir
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
explorer - Series (one-dimensional) and dataframes (two-dimensional) for fast and elegant data exploration in Elixir
Keras - Deep Learning for humans
rmarkdown - Dynamic Documents for R
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration