BSD 3-clause "New" or "Revised" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
How to query pandas DataFrames with SQL
5 projects | dev.to | 1 Feb 2023
Pandas is a go-to tool for tabular data management, processing, and analysis in Python, but sometimes you may want to go from pandas to SQL.
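One possible way to do this is sketched below, using the standard-library sqlite3 module with an in-memory database. The table name `sales` and the sample columns are illustrative assumptions, not taken from the linked post, which may use a different approach.

```python
import sqlite3

import pandas as pd

# Illustrative data; the column names are made up for this sketch.
df = pd.DataFrame(
    {
        "region": ["north", "south", "north", "east"],
        "amount": [120, 80, 45, 60],
    }
)

# Copy the DataFrame into an in-memory SQLite table, then query it with SQL.
conn = sqlite3.connect(":memory:")
df.to_sql("sales", conn, index=False)

totals = pd.read_sql_query(
    "SELECT region, SUM(amount) AS total "
    "FROM sales GROUP BY region ORDER BY region",
    conn,
)
conn.close()

print(totals)
```

The query result comes back as a regular DataFrame, so it can be passed on to any further pandas processing.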
What are the best Python libraries to learn for beginners?
4 projects | reddit.com/r/Python | 31 Jan 2023
Replacing Pandas with Polars. A Practical Guide
4 projects | news.ycombinator.com | 22 Jan 2023
> The big thing pandas has going for it is that it's already been through this field testing. All the bugs have been ironed out by the hundreds of thousands of users.
At this very moment, the pandas GitHub repo has 1563 open issues labeled with a bug tag. So much for "all the bugs have been ironed out".
Joining the Open Source Development Course
4 projects | dev.to | 20 Jan 2023
Python is the main programming language I use nowadays. In particular, numpy and pandas are of course extremely useful. I also use the Biopython package, a collection of software tools for biological computation written in Python by an international group of researchers and developers.
Pandas VS Rath - a user suggested alternative
2 projects | 12 Jan 2023
Twitter Data Pipeline with Apache Airflow + MinIO (S3 compatible Object Storage)
5 projects | dev.to | 6 Jan 2023
Below is the Python task that transforms the list of tweets into a pandas DataFrame, then dumps it into our MinIO object storage as a CSV file:
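The original task code is not reproduced in this excerpt. As a rough sketch of the idea only, the snippet below turns a list of tweet dicts into CSV bytes and hands them to a storage client. The function names, the tweet fields, and the bucket and object names are hypothetical, and `client` is assumed to expose a `put_object()` method shaped like the one in the MinIO Python SDK. The Airflow task wiring is left out.

```python
import io

import pandas as pd


def tweets_to_csv_bytes(tweets):
    """Turn a list of tweet dicts into UTF-8 encoded CSV bytes."""
    df = pd.DataFrame(tweets)
    return df.to_csv(index=False).encode("utf-8")


def upload_csv(client, bucket, object_name, payload):
    """Upload CSV bytes via a client with a MinIO-style put_object() (assumed)."""
    client.put_object(
        bucket,
        object_name,
        io.BytesIO(payload),
        length=len(payload),
        content_type="text/csv",
    )


# Hypothetical sample data standing in for the fetched tweets.
tweets = [
    {"id": 1, "text": "hello"},
    {"id": 2, "text": "world"},
]
payload = tweets_to_csv_bytes(tweets)
```

Keeping the conversion separate from the upload makes the CSV step easy to check without a running object store.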
Hanukkah of Data 2022 - Puzzle 2
2 projects | dev.to | 30 Dec 2022
It was rewarding to dig into SQLite a bit while solving this puzzle, so I figured this would be a good opportunity to learn a bit more about pandas too! So how would I adapt this working SQL solution to pandas?
ETL using pandas
2 projects | reddit.com/r/dataengineering | 20 Dec 2022
Tor Network Statistics & Performance [OC]
2 projects | reddit.com/r/dataisbeautiful | 19 Dec 2022
All the data has been extracted from the official Tor Metrics website, and using Python with the Pandas library I've cleaned the data. Finally, the visualizations have been made with Tableau.
How to take inputs from an ascii file in Python
2 projects | reddit.com/r/learnpython | 16 Dec 2022
If you did that you could use a built-in library like csv to read and parse the file or you could use a 3rd party library like Pandas. Alternatively, you could store your file as json:
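The code from the original reply is not included in this excerpt. A minimal sketch of both suggestions, using only the standard library, could look like the following. The sample columns are made up, and an in-memory string stands in for the file on disk.

```python
import csv
import io
import json

# Stand-in for a file on disk; with a real file, use open(path, newline="").
text = "name,value\nalpha,1\nbeta,2\n"

# csv.DictReader maps each data row to a dict keyed by the header line.
rows = list(csv.DictReader(io.StringIO(text)))

# The same records stored as JSON text and loaded back again.
as_json = json.dumps(rows)
restored = json.loads(as_json)

print(rows)
```

Note that the csv module returns every field as a string, so numeric columns need an explicit conversion such as `int(row["value"])`.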
Show HN: Open-Source No-Code Platform for Machine Learning and Data Science
2 projects | news.ycombinator.com | 1 Jan 2023
Honestly, I think ML should always involve at least a little bit of coding, which would be more practical. That said, this looks reasonable and is a good playground for experimenting.
A good similar product is Orange: https://orangedatamining.com/
Resources for data visualization (free & paid) for scientific publications
2 projects | reddit.com/r/datascience | 17 Nov 2022
Actually... I thought of an interesting free option. Check out orange3. https://orangedatamining.com/
2 projects | reddit.com/r/opensource | 27 Jun 2022
I love to play with my dataset using OrangeDatamining. Very easy to use. Docs and examples are available. It’s like Data Modeler from IBM but better because it is an open project :-)
[D] Why Hasn't FOSS Drag-and-Drop ML tools taken off yet?
2 projects | reddit.com/r/MachineLearning | 8 Sep 2021
Currently, I am looking around for modules for Knime and Orange. I looked at some of the modules and realized that they do not have enough tools in their toolkits (e.g. text data analysis, network analysis, image classification).
What are some alternatives?
Cubes - Light-weight Python OLAP framework for multi-dimensional data analysis
glue - Linked Data Visualizations Across Multiple Files
tensorflow - An Open Source Machine Learning Framework for Everyone
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
pyexcel - Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
Keras - Deep Learning for humans
SymPy - A computer algebra system written in pure Python
Dask - Parallel computing with task scheduling
NumPy - The fundamental package for scientific computing with Python.
Pytorch - Tensors and Dynamic neural networks in Python with strong GPU acceleration
blaze - NumPy and Pandas interface to Big Data
Scrapy - Scrapy, a fast high-level web crawling & scraping framework for Python.