ebayScraper
tidytuesday
| | ebayScraper | tidytuesday |
|---|---|---|
| Mentions | 7 | 79 |
| Stars | 178 | 6,432 |
| Growth | - | 1.5% |
| Activity | 0.0 | 8.4 |
| Last commit | over 1 year ago | 8 days ago |
| Language | Python | HTML |
| License | GNU General Public License v3.0 only | Creative Commons Zero v1.0 Universal |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
ebayScraper
-
I wrote a Python program for scraping eBay to find cheap used espresso machines under $200.
If you ever want to expand on this project, you might enjoy looking at the eBay scraper I built last year: https://github.com/driscoll42/ebayMarketAnalyzer You can see the code I used to define a search to scrape on eBay, rather than needing to paste in the specific search URL; it also filters by price. The main issue you'll run into sooner or later is the CAPTCHAs eBay added earlier this year.
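The idea in the comment above (describe a search by keywords and a price cap instead of pasting a full URL, then filter results by price) can be sketched roughly as follows. This is a minimal illustration, not code from either repository: the helper names `build_search_url` and `filter_by_price` are hypothetical, and the endpoint and the `_nkw`, `_udlo`, and `_udhi` query parameters are assumptions about eBay's search URL format that may change.

```python
from urllib.parse import urlencode

# Assumed search endpoint; not taken from the linked repositories.
EBAY_SEARCH_URL = "https://www.ebay.com/sch/i.html"


def build_search_url(query, min_price=None, max_price=None):
    """Build a search URL from keywords and optional price bounds."""
    params = {"_nkw": query}  # assumed keyword parameter
    if min_price is not None:
        params["_udlo"] = min_price  # assumed lower price bound
    if max_price is not None:
        params["_udhi"] = max_price  # assumed upper price bound
    return f"{EBAY_SEARCH_URL}?{urlencode(params)}"


def filter_by_price(listings, max_price):
    """Keep only listings priced at or below max_price.

    Each listing is assumed to be a dict with a numeric 'price' key.
    """
    return [item for item in listings if item["price"] <= max_price]


url = build_search_url("used espresso machine", max_price=200)
```

Filtering again on the client side is a deliberate redundancy here: even if the site ignores or changes the price parameters, the parsed results are still checked against the cap.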
-
I am trying to create an ML model to automatically detect and solve these CAPTCHAs. I have 500 of them. Can somebody guide me with this?
Except they're not; I speak from personal experience. I built a scraper for eBay to analyze sales data, and a few months ago eBay added CAPTCHAs to the site, which prevented my tool from working. They were more complex than the one OP is working on, but still CAPTCHAs. Further, I got several emails from other eBay scrapers asking me if I was working on a solution around it. CAPTCHAs aren't perfect, but they do work to prevent a large segment of people from scraping a site. If eBay had had CAPTCHAs from the beginning, my project never would have started at all.
-
[Tom’s Hardware] The GPU Sadness Index: Tracking eBay Pricing
Here's the repo! https://github.com/driscoll42/ebayScraper I'd love any suggestions for improvement. The points about the background and thicker lines make sense, though the image size is a Tom's Hardware thing; by default the images are much larger.
- An analysis of the UK £54 million PS5/Xbox and computer hardware Scalping Market
-
NVIDIA Ampere/RTX 30 Series Scalping Market Analysis
Source Code for Data Scraping: https://github.com/driscoll42/ebayScraper
-
What are the best datasets for building a data visualisation portfolio?
No, but I'll check that out. I just wrote a Python script that primarily calls a URL with requests and then uses BeautifulSoup to parse the data. Here's a link if you want to look at it: https://github.com/driscoll42/ebayScraper
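The fetch-then-parse pattern described above can be sketched as below. To keep the example runnable without third-party packages, it swaps in the standard library's `html.parser` for BeautifulSoup and parses an inline HTML string instead of a live response; in a real run the HTML would come from something like `requests.get(url).text`. The markup, class names, listing titles, and prices are all invented for illustration and do not reflect eBay's actual page structure or the linked repository's code.

```python
from html.parser import HTMLParser

# Hypothetical markup: class names and values are made up for this sketch.
SAMPLE_HTML = """
<ul>
  <li class="item"><span class="title">RTX 3080</span><span class="price">$1,299.00</span></li>
  <li class="item"><span class="title">PS5 Console</span><span class="price">$749.99</span></li>
</ul>
"""


class ListingParser(HTMLParser):
    """Collect the title and price text of each <li class="item">."""

    def __init__(self):
        super().__init__()
        self.listings = []
        self._field = None  # name of the field currently being read

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        if tag == "li" and css_class == "item":
            self.listings.append({})
        elif tag == "span" and css_class in ("title", "price"):
            self._field = css_class

    def handle_data(self, data):
        # Only record text that sits inside a title or price span.
        if self._field and self.listings:
            self.listings[-1][self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None


def parse_listings(html):
    """Return a list of {'title': str, 'price': float} dicts."""
    parser = ListingParser()
    parser.feed(html)
    for item in parser.listings:
        # Turn a string like "$1,299.00" into a float.
        item["price"] = float(item["price"].lstrip("$").replace(",", ""))
    return parser.listings


listings = parse_listings(SAMPLE_HTML)
```

With BeautifulSoup the parser class would typically collapse into a few selector calls, which is the main reason scrapers like the one described tend to prefer it.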
-
An analysis of the $82 million eBay Scalping Market for Xbox, PS5, AMD, and NVIDIA
Source Code: https://github.com/driscoll42/ebayScraper
tidytuesday
-
Recommendation for interesting datasets to work with?
TidyTuesday is a weekly data cleaning project where a new, interesting data source is linked to each week: https://github.com/rfordatascience/tidytuesday
- Rfordatascience/tidytuesday: Official repo for the tidytuesday project
- [OC] Tornados in the U.S. are becoming more frequent in off-peak months
-
Too old to continue my education? I'm lost.
For R, I don't have specific resources, but I remember I started out by doing the TidyTuesday challenges (https://github.com/rfordatascience/tidytuesday).
-
First Project
Tidy Tuesday has data and links to more data. The nice thing about those data sets is that you can search for what other people did with the data on social media (e.g. Twitter).
-
[OC] Popularity of Horror Movie Poster Color Schemes from 1970
Dataset: https://github.com/rfordatascience/tidytuesday/tree/master/data/2022/2022-11-01
-
Tips on getting experience in R on GitHub
What you're describing is contributing to open source. Some things I'd suggest doing: learn some git first; create a GitHub account and at least a practice repo; look at learning-community repos, like Tidy Tuesday; and follow R "power" users, people associated with RStudio, and similar folks on social media. Those folks will sometimes mention projects aimed at beginners.
-
[OC] 2021-22 EPL Home/Away Goal Differential
Data: TidyTuesday April 4
-
Publicly available datasets?
The Tidy Tuesday git repo has a lot of example datasets to work with.
-
[OC] Kyle Feldt and his Chevalier Sheriffs: An Infographic of Feldt's NRL Tries
I mostly use ggplot2 in R for visualisations, which means that The R Graph Gallery is my starting point for inspiration. The best thing to do is start with a simple idea that tells a story, and one of the best people out there at doing this is Cedric Scherer. He is involved a bit with the TidyTuesday project, which I wish I had more time to play around with and which is a great starting point for developing a library of vis techniques.
What are some alternatives?
cloudflare-scrape - A Python module to bypass Cloudflare's anti-bot page.
data - Data and code behind the articles and graphics at FiveThirtyEight
zippyshare-scraper - A module to get direct downloadable links from zippyshare download page.
gganimate - A Grammar of Animated Graphics
MarktplaatsScraper - Scrapes Marktplaats based on a search query and notifies the user.
cheatsheets - Posit Cheat Sheets - Can also be found at https://posit.co/resources/cheatsheets/.
Zillow-Telegram-Notifications - Receive notifications through Telegram about new homes posted on Zillow.
r4ds - R for data science: a book
Slowly_Letter_Downloader - Automates the process of downloading letters from slowly in PDF form.
awesome-public-datasets - A topic-centric list of HQ open datasets.
shopscraper - Scrape Shopify webshops for product information
big-mac-data - Data and methodology for the Big Mac index