free-stock-tickers
scaling-to-distributed-crawling
| | free-stock-tickers | scaling-to-distributed-crawling |
|---|---|---|
| Mentions | 4 | 5 |
| Stars | 7 | 36 |
| Growth | - | - |
| Activity | 6.5 | 0.0 |
| Last commit | 11 months ago | over 2 years ago |
| Language | HTML | HTML |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
free-stock-tickers
- GitHub - le-quentin/free-stock-tickers: Freely fetch live stock data by scraping web pages
- Free backend app to integrate live stock prices in your spreadsheets
  Find it here: https://github.com/le-quentin/free-stock-tickers
- For developers: free open source backend app to fetch stock prices directly in your spreadsheet
- For devs: a free tool to read stock prices from Excel
scaling-to-distributed-crawling
- DOs and DON'Ts of Web Scraping
  We published a repository and blog post about distributed crawling in Python. It is a bit more complicated than what we've seen so far. It uses external software (Celery for the asynchronous task queue and Redis as the database).
- Mastering Web Scraping in Python: Scaling to Distributed Crawling - ZenRows
- Mastering Web Scraping in Python: Scaling to Distributed Crawling – ZenRows
- Mastering Web Scraping in Python: Scaling to Distributed Crawling
  We will start to separate concepts before the project grows. We already have two files: tasks.py and main.py. We will create another two to host crawler-related functions (crawler.py) and database access (repo.py). Please look at the snippet below for the repo file; it is not complete, but you get the idea. There is a GitHub repository with the final content in case you want to check it.
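The excerpt above describes moving database access into its own repo.py module. The following is a minimal sketch of what such a module might look like, not the code from the linked repository. The key names, the `Repo` class, and the `InMemoryStore` stand-in are all hypothetical; the stand-in only mimics the four Redis-style commands used here (`rpush`, `lpop`, `sadd`, `sismember`) so the example runs without a Redis server.

```python
# repo.py (illustrative sketch): queue and visited-set bookkeeping for a crawler.
from collections import deque

# Hypothetical key names, chosen for this example only.
TO_VISIT_KEY = "crawling:to_visit"
QUEUED_KEY = "crawling:queued"
VISITED_KEY = "crawling:visited"


class InMemoryStore:
    """Stand-in that mimics a small subset of Redis list and set commands."""

    def __init__(self):
        self._lists = {}
        self._sets = {}

    def rpush(self, key, *values):
        queue = self._lists.setdefault(key, deque())
        queue.extend(values)
        return len(queue)

    def lpop(self, key):
        queue = self._lists.get(key)
        return queue.popleft() if queue else None

    def sadd(self, key, *values):
        members = self._sets.setdefault(key, set())
        before = len(members)
        members.update(values)
        return len(members) - before  # number of newly added members

    def sismember(self, key, value):
        return value in self._sets.get(key, set())


class Repo:
    """Wraps the store so crawler code never touches keys directly."""

    def __init__(self, connection):
        self.connection = connection

    def add_to_visit(self, url):
        """Queue a URL once; return False if it was already queued."""
        if self.connection.sadd(QUEUED_KEY, url) == 0:
            return False
        self.connection.rpush(TO_VISIT_KEY, url)
        return True

    def pop_to_visit(self):
        """Return the oldest queued URL, or None when the queue is empty."""
        return self.connection.lpop(TO_VISIT_KEY)

    def mark_visited(self, url):
        self.connection.sadd(VISITED_KEY, url)

    def is_visited(self, url):
        return bool(self.connection.sismember(VISITED_KEY, url))


if __name__ == "__main__":
    repo = Repo(InMemoryStore())
    repo.add_to_visit("https://example.com/a")
    repo.add_to_visit("https://example.com/a")  # ignored, already queued
    url = repo.pop_to_visit()
    repo.mark_visited(url)
    print(url, repo.is_visited(url))
```

Assuming a Redis server is available and the redis-py package is installed, passing `redis.Redis(decode_responses=True)` in place of `InMemoryStore()` should work with the same four commands; `decode_responses=True` makes `lpop` return strings rather than bytes.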
What are some alternatives?
Stock-listing - This app shows a list of all the available stocks from Sensibull API and their detailed quotes.
celery - Distributed Task Queue (development branch)
awesome-systematic-trading - A curated list of insanely awesome libraries, packages and resources for systematic trading. Crypto, Stock, Futures, Options, CFDs, FX, and more | Quantitative trading | Quantitative investing
colly - Elegant Scraper and Crawler Framework for Golang
node-yahoo-finance2 - Unofficial API for Yahoo Finance
Scrapy - Scrapy, a fast high-level web crawling & scraping framework for Python.
blog-article-protection-scraping-headless-browser - Repository related to a post on my blog. Visit it for more details.
Redis - Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
newspaper - newspaper3k provides news, full-text, and article metadata extraction in Python 3.
PeARS-orchard - This is the development version of PeARS, the people's search engine. More compact but less robust than PeARS-federated. If you just want to use PeARS in real life, use PeARS-federated instead.
storm-crawler - A scalable, mature and versatile web crawler based on Apache Storm
Crawly - Crawly, a high-level web crawling & scraping framework for Elixir.