|3 days ago||5 days ago|
|BSD 3-clause "New" or "Revised" License||BSD 3-clause "New" or "Revised" License|
Stars: the number of stars that a project has on GitHub. Growth: month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Level up your Python today with open-source contributions
11 projects | dev.to | 25 May 2022
You could also visit the project's GitHub repository and add "/contribute" to the end of the URL. For example, visiting https://github.com/scrapy/scrapy/contribute will show you relatively approachable tasks for first-time contributors to the Scrapy project.
Python Projects to Improve?
3 projects | reddit.com/r/learnpython | 17 May 2022
I used Scrapy for those, but you can also learn how to scrape and parse HTML with requests and something like Beautiful Soup.
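As a rough illustration of the "parse HTML yourself" route mentioned above, here is a minimal sketch. It deliberately swaps in the standard library's `html.parser` for Beautiful Soup so it runs without extra installs, and it parses a hard-coded sample string instead of fetching a page with requests. The `LinkCollector` class, `extract_links` function, and `SAMPLE_HTML` string are hypothetical names made up for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Stand-in for a downloaded page (assumption: a real script would fetch this).
SAMPLE_HTML = """
<html><body>
  <a href="https://scrapy.org/">Scrapy</a>
  <p>No link here</p>
  <a href="/docs">Docs</a>
</body></html>
"""

print(extract_links(SAMPLE_HTML))
```

With requests and Beautiful Soup the overall shape would likely be similar: download the page text, hand it to a parser, and pull out the elements of interest.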
Django custom management command running Scrapy: How to include Scrapy's options?
1 project | reddit.com/r/codehunter | 1 May 2022
I want to be able to run the Scrapy web crawling framework from within Django. Scrapy itself only provides a command line tool scrapy to execute its commands, i.e. the tool was not intentionally written to be called from an external program.
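One possible way to approach the question above is to have the management command assemble a `scrapy crawl` command line and hand it to a subprocess. The sketch below covers only the argument-building step. `build_scrapy_argv` and the spider name `quotes` are hypothetical; `-a`, `-s`, and `-o` are the `scrapy crawl` options for spider arguments, settings overrides, and output files.

```python
import shlex


def build_scrapy_argv(spider_name, spider_args=None, settings=None, output=None):
    """Build an argv list for `scrapy crawl`, forwarding Scrapy's own options."""
    argv = ["scrapy", "crawl", spider_name]
    # -a NAME=VALUE passes an argument to the spider.
    for key, value in (spider_args or {}).items():
        argv += ["-a", f"{key}={value}"]
    # -s NAME=VALUE overrides a Scrapy setting for this run.
    for key, value in (settings or {}).items():
        argv += ["-s", f"{key}={value}"]
    # -o FILE writes scraped items to a feed file.
    if output:
        argv += ["-o", output]
    return argv


argv = build_scrapy_argv(
    "quotes",  # hypothetical spider name
    spider_args={"tag": "humor"},
    settings={"LOG_LEVEL": "INFO"},
    output="quotes.json",
)
print(shlex.join(argv))
```

A Django command could then pass the resulting list to `subprocess.run(argv, check=True)`. Whether a subprocess or an in-process runner is the better fit depends on the project, so treat this as a starting point, not as the recommended integration.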
Is it possible on Python?
4 projects | reddit.com/r/Python | 28 Apr 2022
Yeah my hunch is that a combination of nltk, python-Levenshtein, numpy for language processing, pandas for gathering results and scrapy for web scraping should make it possible. Sadly such a project probably requires at least a month or two worth of training in Python to prototype. Good luck OP.
Web-scraping without much Python knowledge
1 project | reddit.com/r/learnpython | 21 Apr 2022
The Scrapy framework would be a good fit for what you want to do (especially if you plan on hosting the crawler). However, the learning curve is quite steep, especially if you don't have much experience using Python.
Seeking a modern, streamlined framework for my one-page web app.
2 projects | reddit.com/r/webdev | 15 Apr 2022
I had a cool idea recently and coded up a scraper with Scrapy. I decided it was a cool dataset and scraped it all into Postgres with SQLAlchemy. I built out a nice Streamlit project so other users could interact with it as well. A core component of the project related to using Spotify's API. Long story short, Streamlit isn't totally mature, and something like capturing the auth token after a redirect is a PITA. Plus, you are sort of locked into the visual design of Streamlit to begin with.
Looking for a Selenium alternative
1 project | reddit.com/r/selenium | 2 Apr 2022
Look into Scrapy
Bulk Image Downloading (and Renaming)
1 project | reddit.com/r/sysadmin | 31 Mar 2022
I don't know of any non-scripting way, but I would use https://scrapy.org/ for this.
Intermediate & Advanced Python Projects
2 projects | dev.to | 31 Mar 2022
Script to pull data from websites structured data?
1 project | reddit.com/r/Python | 28 Mar 2022
Check out Scrapy - https://scrapy.org/
3 Things To Know Before Building with PyScript
2 projects | dev.to | 26 May 2022
Going from Excel -> Python for analysis?
2 projects | reddit.com/r/biology | 23 May 2022
It is worthwhile to learn how to use Pandas for handling data - https://pandas.pydata.org/
Send Multiple Email in Excel using Python
1 project | dev.to | 23 May 2022
I use pandas to read the file
A smart way to print :)
6 projects | reddit.com/r/Python | 22 May 2022
“...” in python?
1 project | reddit.com/r/learnpython | 18 May 2022
For example, on lines 424 to 436 of mask.py in pandas, there's an @overload decorator, then a method astype with some type hints, then ... inside. This is followed by a few other methods just like it with the same name, and then by the actual implementation.
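The pattern described above can be reproduced in a few lines. This is a minimal sketch using a made-up `double` function, not the pandas code itself: each `@overload` stub has `...` as its body because it exists only to tell type checkers which argument types map to which return types, and the final undecorated definition is the one that actually runs.

```python
from typing import Union, overload


@overload
def double(value: int) -> int: ...
@overload
def double(value: str) -> str: ...


def double(value: Union[int, str]) -> Union[int, str]:
    """The real implementation; the stubs above only inform type checkers."""
    return value * 2


print(double(3))
print(double("ab"))
```

At runtime the last definition replaces the stubs, so the `...` bodies are never executed.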
Weekly Entering & Transitioning Thread | 15 May 2022 - 22 May 2022
2 projects | reddit.com/r/datascience | 18 May 2022
If you are using Python, try pandas for the analysis (the describe method seems to be what you are looking for) and Seaborn for visualisation and a quick overview of your data. Good luck!
Opinion - Literally the only thing holding back Linux from going "mainstream" is MS Office
4 projects | reddit.com/r/linux | 17 May 2022
Pandas seems popular to try to turn Python into something more like R, which is certainly useful for many types of use-cases when you have tables of data you want to do some operations on to spit out some results and plot nice graphs. But I am sure there are many use-cases that would require other libraries, or a combination of libraries.
Filter CSV data in Python
1 project | reddit.com/r/Python | 12 May 2022
Pandas can do all that.
Exciting PR for Pandas: Will we get rid of SettingWithCopyWarning?
1 project | reddit.com/r/Python | 11 May 2022
How to use Spark and Pandas to prepare big data
3 projects | dev.to | 10 May 2022
We’ve learned a lot while setting up Spark on AWS EMR. While this post will focus on how to use PySpark with Pandas, let us know in the comments if you’re interested in a future article on how we set up Spark on AWS EMR.
What are some alternatives?
requests-html - Pythonic HTML Parsing for Humans™
Cubes - Light-weight Python OLAP framework for multi-dimensional data analysis
pyspider - A Powerful Spider(Web Crawler) System in Python.
orange - 🍊 Orange: Interactive data analysis
MechanicalSoup - A Python library for automating interaction with websites.
Airflow - Apache Airflow - A platform to programmatically author, schedule, and monitor workflows
colly - Elegant Scraper and Crawler Framework for Golang
Dask - Parallel computing with task scheduling
pyexcel - Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files
SymPy - A computer algebra system written in pure Python
Grab - Web Scraping Framework