Top 23 Python Beautifulsoup Projects
-
crawlee-python
Crawlee: a web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Supports both headful and headless modes, with proxy rotation.
Which hashtags are trending now? What is an influencer's engagement rate? What topics are important for a content creator? You can find answers to these and many other questions by analyzing TikTok data. However, for analysis, you need to extract the data in a convenient format. In this blog, we'll explore how to scrape TikTok using Crawlee for Python.
-
JobFunnel
Project mention: Show HN: Scraper for job listings directly from company websites | news.ycombinator.com | 2024-12-07
JobFunnel is FOSS and accepting contributions: https://github.com/PaulMcInnis/JobFunnel
It currently supports Indeed; in the past it supported Glassdoor and others.
-
soupsieve
Project mention: Release 0.44.0 of Spellcheck (GitHub) Action - baby-steps maintenance | dev.to | 2024-10-25
soupsieve was bumped from version 2.5 to 2.6; see the release notes.
-
languagepod101-scraper
Python scraper for Language Pods such as Japanesepod101.com. Compatible with Japanese, Chinese, French, German, Italian, Korean, Portuguese, Russian, Spanish and many more! ✨
-
web_to_obsidian
A Python 3 script that scrapes an HTML/XML page to extract text, then creates Markdown files for Obsidian and the Dataview plugin
-
cf-ai-lora-news-summarizer
Python webapp that summarizes news with Cloudflare Workers AI LoRA, Mistral, Beautifulsoup, and Streamlit
-
tweet-transcriber
A Reddit bot that transcribes tweets linked in comments and submissions, mirrors their images, and replies with a formatted Markdown message.
-
Amazon-Product-Information-Scraper
This Python web-scraping project retrieves product names, prices, review stars, and review counts for a specific product category.
-
DDD
🎧 CLI Python tool for bulk downloading Darknet Diaries podcast. Hate being online? This is the way. (by Psyhackological)
-
python-web-scraping-primjeri
Web scraping of the sites posta.hr, konzum.hr, index.hr, njuskalo.hr, neostar.com, DasWeltAuto.hr, ...
-
web-scraping-with-python
Demonstration of Web Scraping using Selenium Python (Pytest & Pyunit) and Beautiful Soup
-
python_portfolio_web_scraper-spotrac
Python solution to scrape contract data from https://www.spotrac.com
Python Beautifulsoup related posts
-
How to scrape Google Maps data using Python and Crawlee
-
How to scrape Google search results with Python
-
How to scrape infinite scrolling webpages with Python
-
How to scrape a website with Python (Beginner tutorial)
-
Continued analysis of the used car market: new cars are too expensive, and have used car prices risen too?
-
A short analysis of electric cars on Njuškalo - details in the comment
-
flannelfy.net update - LastFM, All Scores
-
Index
What are some of the best open-source Beautifulsoup projects in Python? This list will help you:
| # | Project | Stars |
|---|---|---|
| 1 | requests-html | 13,806 |
| 2 | crawlee-python | 5,638 |
| 3 | MechanicalSoup | 4,752 |
| 4 | JobFunnel | 2,010 |
| 5 | tiktok-downloader | 321 |
| 6 | Senpwai | 246 |
| 7 | soupsieve | 236 |
| 8 | languagepod101-scraper | 160 |
| 9 | WhatSoup | 141 |
| 10 | web_to_obsidian | 54 |
| 11 | cf-ai-lora-news-summarizer | 27 |
| 12 | reddit-bots | 25 |
| 13 | tweet-transcriber | 18 |
| 14 | Amazon-Product-Information-Scraper | 15 |
| 15 | DDD | 13 |
| 16 | audioflow | 13 |
| 17 | ScoreCast | 11 |
| 18 | tabroom-API | 11 |
| 19 | python-web-scraping-primjeri | 6 |
| 20 | web-scraping-with-python | 5 |
| 21 | statum | 5 |
| 22 | python_portfolio_web_scraper-spotrac | 5 |
| 23 | flannelfynet | 4 |