Top 23 Python selenium-webdriver Projects
-
Scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers, user info, images, and more.
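A minimal sketch of how Scweet is typically driven, based on its documented `scrape()` entry point; exact parameter names can vary between Scweet versions, so treat this as illustrative:

```python
# Illustrative Scweet usage; parameter names may differ across versions.
from Scweet.scweet import scrape

# Collect tweets matching a keyword over a date range, headlessly.
data = scrape(
    words=["selenium"],   # keyword(s) to search for
    since="2023-05-01",   # start date (YYYY-MM-DD)
    until="2023-05-07",   # end date (YYYY-MM-DD)
    interval=1,           # crawl the range one day at a time
    headless=True,        # no visible browser window
    display_type="Top",   # "Top" or "Latest" search results
    save_images=False,
)
print(data.head())        # scrape() returns a pandas DataFrame
```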
-
Python-Selenium-Action
Run Selenium with Python via GitHub Actions using headless or non-headless browsers!
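The Action itself just provisions the runner; the script it executes is ordinary Selenium. A minimal headless script of the kind you would drop into this template (plain Selenium 4, nothing template-specific):

```python
# Plain Selenium 4 headless-Chrome script, suitable for a CI runner.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")          # no display on CI runners
options.add_argument("--no-sandbox")            # commonly needed in containers
options.add_argument("--disable-dev-shm-usage")

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://example.com")
    print(driver.title)
finally:
    driver.quit()
```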
-
linkedin-comments-scraper
Script to scrape comments (including name, profile link, profile picture, designation, email if present, and comment text) from a LinkedIn post, given the URL of the post.
-
Instagram-Like-Comment-Bot
📷 An Instagram bot written in Python using Selenium on Google Chrome. It goes through posts under the given hashtag(s) and likes and comments on them.
-
Reddit-Bot-Account-Maker
Python code that creates Reddit accounts, complete with email verification.
-
proxy_web_crawler
Automates the process of repeatedly searching for a website via scraped proxy IPs and search keywords.
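proxy_web_crawler wraps this pattern; shown here is the underlying plain-Selenium technique of routing Chrome through a proxy before searching (the proxy address and search term are placeholders):

```python
# Route Chrome through a (scraped) proxy, then run a search.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By

proxy = "203.0.113.10:8080"  # placeholder proxy from a scraped list

options = Options()
options.add_argument(f"--proxy-server=http://{proxy}")

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://duckduckgo.com/")
    box = driver.find_element(By.NAME, "q")
    box.send_keys("example.com")
    box.submit()
    print(driver.current_url)
finally:
    driver.quit()
```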
-
selenium-python-pytest-bdd
Example code for a simple Selenium Python Pytest-BDD project using the Page Object Model. Dependency management is handled by pip. Supports Chrome & Firefox
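For a sense of the pattern the repo demonstrates, here is a minimal pytest-bdd + Page Object sketch; the `search.feature` file, `SearchPage` class, and `driver` fixture are illustrative stand-ins, not the repo's actual code:

```python
# Minimal pytest-bdd + Page Object Model sketch (illustrative).
# Assumes a `driver` fixture and a search.feature file containing
# a scenario named "Basic search".
from pytest_bdd import scenario, given, when, then, parsers
from selenium.webdriver.common.by import By

class SearchPage:
    """Page Object: keeps locators and actions out of the step definitions."""
    URL = "https://duckduckgo.com/"
    SEARCH_BOX = (By.NAME, "q")

    def __init__(self, driver):
        self.driver = driver

    def load(self):
        self.driver.get(self.URL)

    def search(self, phrase):
        box = self.driver.find_element(*self.SEARCH_BOX)
        box.send_keys(phrase)
        box.submit()

@scenario("search.feature", "Basic search")
def test_basic_search():
    pass

@given("the search page is open", target_fixture="page")
def search_page(driver):
    page = SearchPage(driver)
    page.load()
    return page

@when(parsers.parse('the user searches for "{phrase}"'))
def do_search(page, phrase):
    page.search(phrase)

@then("results are shown")
def results_shown(page):
    assert "q=" in page.driver.current_url
```

The Page Object keeps selectors in one place, so the Gherkin steps stay readable and only the class changes when the page's markup does.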
-
selenium_driver_updater
Download or update your Selenium driver binaries and their browsers automatically with this package
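A sketch of the package in use, following the `DriverUpdater.install` entry point from its README; verify the exact signature against the version you install:

```python
# Fetch (or update) a chromedriver matching the installed Chrome, then use it.
# DriverUpdater.install is the README's entry point; check your version's docs.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium_driver_updater import DriverUpdater

driver_path = DriverUpdater.install(driver_name=DriverUpdater.chromedriver)

driver = webdriver.Chrome(service=Service(driver_path))
driver.get("https://example.com")
print(driver.title)
driver.quit()
```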
-
phantomime
An embeddable headless browser package for Python that provides a simplified interface for interacting with web pages using Selenium and Selenium Hub.
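phantomime's simplified interface isn't reproduced here; instead, this is the plain-Selenium equivalent of what such a package wraps, a headless session against a Selenium Hub (assumes a hub, e.g. a `selenium/standalone-chrome` container, listening on port 4444):

```python
# Plain Selenium equivalent: headless Chrome session against a Selenium Hub.
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")

driver = webdriver.Remote(
    command_executor="http://localhost:4444/wd/hub",  # assumed hub address
    options=options,
)
try:
    driver.get("https://example.com")
    print(driver.page_source[:200])
finally:
    driver.quit()
```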
-
web-scraping-with-python
Demonstration of Web Scraping using Selenium Python (Pytest & Pyunit) and Beautiful Soup
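The usual division of labor in such demos: Selenium renders the page (including JavaScript), Beautiful Soup parses the resulting HTML. A minimal sketch against the same practice site used in the Pyppeteer excerpt below:

```python
# Selenium renders the JS-driven page; Beautiful Soup parses the HTML.
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument("--headless=new")

driver = webdriver.Chrome(options=options)
try:
    driver.get("https://scrapingclub.com/exercise/list_infinite_scroll/")
    soup = BeautifulSoup(driver.page_source, "html.parser")
    for link in soup.select(".p-4 a")[:5]:  # first few product cards
        print(link.get_text(strip=True), link.get("href"))
finally:
    driver.quit()
```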
-
jobsearch-python-webcrawler
Automatically scrapes all job titles and links, then organizes them into an Excel file, using Selenium and openpyxl.
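An illustrative version of that Selenium + openpyxl flow; the job-board URL and CSS selector are placeholders, not the repo's values:

```python
# Scrape job titles/links with Selenium, write them to an .xlsx with openpyxl.
from openpyxl import Workbook
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/jobs")  # placeholder job board

wb = Workbook()
ws = wb.active
ws.append(["Job title", "Link"])        # header row

# One row per posting: visible text plus the href it points to.
for a in driver.find_elements(By.CSS_SELECTOR, "a.job-title"):
    ws.append([a.text, a.get_attribute("href")])

wb.save("jobs.xlsx")
driver.quit()
```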
Project mention: Twitter API reaching rate limit. 5 calls per 15 mins just to get user likes. | /r/learnprogramming | 2023-05-22
Hmm, do you know any good one? I found this one, but it doesn't scrape a single tweet's likes and followers: https://github.com/Altimis/Scweet
Project mention: Saturday Daily Thread: Resource Request and Sharing! Daily Thread | /r/Python | 2023-06-24
Shared this multiple times, but it always seems to be helpful to someone: a GitHub Action / template to run your Selenium-based scripts on GitHub with ease. https://github.com/MarketingPipeline/Python-Selenium-Action
Project mention: website-to-gif: A GH Action to turn a webpage into a GIF | /r/opensource | 2023-10-23
An old project I did: https://github.com/WilliamHYZhang/Reddit-Bot-Account-Maker
Project mention: Pyppeteer Tutorial: The Ultimate Guide to Using Puppeteer with Python | dev.to | 2024-02-05

```python
import asyncio
import pytest
from pyppeteer.errors import PageError
from urllib.parse import quote
import json
import os
import sys
from os import environ
from pyppeteer import connect, launch

exec_platform = os.getenv('EXEC_PLATFORM')

# Get username and access key of the LambdaTest Platform
username = environ.get('LT_USERNAME', None)
access_key = environ.get('LT_ACCESS_KEY', None)

test1_url = 'https://ecommerce-playground.lambdatest.io/'
test2_url = 'https://scrapingclub.com/exercise/list_infinite_scroll/'

# Usecase - 1
# loc_ecomm_1 = ".order-1.col-lg-6 div:nth-of-type(1) > div:nth-of-type(1) > div:nth-of-type(1) > div:nth-of-type(1) > div:nth-of-type(1) div:nth-of-type(1) > img:nth-of-type(1)"
loc_ecomm_1 = "[aria-label='1 / 2'] div:nth-of-type(1) > [alt='Nikon D300']"
target_url_1 = "https://ecommerce-playground.lambdatest.io/index.php?route=product/product&product_id=63"

# Usecase - 2 (Click on e-commerce sliding banner)
loc_ecomm_2 = "[alt='Canon DSLR camera']"
target_url_2 = "https://ecommerce-playground.lambdatest.io/index.php?route=product/product&product_id=30"

# Usecase - 3 Automating interactions on https://scrapingclub.com/exercise/list_infinite_scroll/
loc_infinite_src_prod1 = ".grid .p-4 [href='/exercise/list_basic_detail/93926-C/']"
target_url_3 = "https://scrapingclub.com/exercise/list_basic_detail/93926-C/"

# Usecase - 4 Automating interactions on https://scrapingclub.com/exercise/list_infinite_scroll/
# when the images are lazy loaded
loc_infinite_src_prod2 = "div:nth-of-type(31) > .p-4 [href='/exercise/list_basic_detail/94967-A/']"
target_url_4 = "https://scrapingclub.com/exercise/list_basic_detail/94967-A/"

# Set timeout in ms
timeOut = 60000


async def scroll_to_element(page, selector):
    # Scroll until the element is detected
    await page.evaluateHandle(
        '''async (selector) => {
            const element = document.querySelector(selector);
            if (element) {
                element.scrollIntoView();
            }
        }''',
        selector
    )
    return selector


async def scroll_carousel(page, scr_count):
    for scr in range(1, scr_count):
        elem_next_button = "#mz-carousel-213240 > ul li:nth-child(" + str(scr) + ")"
        await asyncio.sleep(1)
        elem_next_button = await page.querySelector(elem_next_button)
        await elem_next_button.click()


# Replica of https://github.com/hjsblogger/web-scraping-with-python/blob/
# main/tests/beautiful-soup/test_infinite_scraping.py#L67C5-L80C18
async def scroll_end_of_page(page):
    start_height = await page.evaluate('document.documentElement.scrollHeight')
    while True:
        # Scroll to the bottom of the page
        await page.evaluate(f'window.scrollTo(0, {start_height})')
        # Wait for the content to load
        await asyncio.sleep(1)
        # Get the new scroll height
        scroll_height = await page.evaluate('document.documentElement.scrollHeight')
        if scroll_height == start_height:
            # If heights are the same, we reached the end of the page
            break
        # Add an additional wait
        await asyncio.sleep(2)
        start_height = scroll_height
    # Additional wait after scrolling
    await asyncio.sleep(2)


@pytest.mark.asyncio
@pytest.mark.order(1)
async def test_lazy_load_ecomm_1(page):
    # The time out can be set using the setDefaultNavigationTimeout
    # It is primarily used for overriding the default page timeout of 30 seconds
    page.setDefaultNavigationTimeout(timeOut)
    await page.goto(test1_url, {'waitUntil': 'load', 'timeout': timeOut})
    # Set the viewport - Apple MacBook Air 13-inch
    # Reference - https://codekbyte.com/devices-viewport-sizes/
    # await page.setViewport({'width': 1440, 'height': 770})
    await asyncio.sleep(2)
    if exec_platform == 'local':
        # Scroll until the element is detected
        elem_button = await scroll_to_element(page, loc_ecomm_1)
        # await page.click(elem_button)
        # Wait until the page is loaded
        # https://miyakogi.github.io/pyppeteer/reference.html#pyppeteer.page.Page.waitForNavigation
        navigationPromise = asyncio.ensure_future(page.waitForNavigation())
        await page.click(elem_button)
        await navigationPromise
    elif exec_platform == 'cloud':
        elem_button = await page.waitForSelector(loc_ecomm_1, {'visible': True})
        await asyncio.gather(
            elem_button.click(),
            page.waitForNavigation({'waitUntil': 'networkidle2', 'timeout': 30000}),
        )
    # Assert if required, since the test is a simple one; we leave as is :D
    current_url = page.url
    print('Current URL is: ' + current_url)
    try:
        assert current_url == target_url_1
        print("Test Success: Product checkout successful")
    except PageError as e:
        print("Test Failure: Could not checkout Product")
        print("Error Code" + str(e))


@pytest.mark.asyncio
@pytest.mark.order(2)
async def test_lazy_load_ecomm_2(page):
    carousel_len = 4
    # The time out can be set using the setDefaultNavigationTimeout
    # It is primarily used for overriding the default page timeout of 30 seconds
    page.setDefaultNavigationTimeout(timeOut)
    await page.goto(test1_url, {'waitUntil': 'load', 'timeout': timeOut})
    # Set the viewport - Apple MacBook Air 13-inch
    # Reference - https://codekbyte.com/devices-viewport-sizes/
    # await page.setViewport({'width': 1440, 'height': 770})
    await asyncio.sleep(2)
    # Approach 1: Directly click on the third button on the carousel
    # elem_carousel_banner = await page.querySelector("#mz-carousel-213240 > ul li:nth-child(3)")
    # await asyncio.sleep(1)
    # await elem_carousel_banner.click()
    # Approach 2 (Only for demo): Serially click on every button on carousel
    await scroll_carousel(page, carousel_len)
    await asyncio.sleep(1)
    # elem_prod_1 = await page.querySelector(loc_ecomm_2)
    elem_prod_1 = await page.waitForSelector(loc_ecomm_2, {'visible': True})
    await asyncio.gather(
        elem_prod_1.click(),
        page.waitForNavigation({'waitUntil': 'networkidle2', 'timeout': 60000}),
    )
    # Assert if required, since the test is a simple one; we leave as is :D
    current_url = page.url
    print('Current URL is: ' + current_url)
    try:
        assert current_url == target_url_2
        print("Test Success: Product checkout successful")
    except PageError as e:
        print("Test Failure: Could not checkout Product")
        print("Error Code" + str(e))


@pytest.mark.asyncio
@pytest.mark.order(3)
async def test_lazy_load_infinite_scroll_1(page):
    # The time out can be set using the setDefaultNavigationTimeout
    # It is primarily used for overriding the default page timeout of 30 seconds
    page.setDefaultNavigationTimeout(timeOut)
    await page.goto(test2_url, {'waitUntil': 'load', 'timeout': timeOut})
    # Set the viewport - Apple MacBook Air 13-inch
    # Reference - https://codekbyte.com/devices-viewport-sizes/
    # await page.setViewport({'width': 1440, 'height': 770})
    await asyncio.sleep(1)
    elem_prod1 = await page.querySelector(loc_infinite_src_prod1)
    await asyncio.gather(
        elem_prod1.click(),
        page.waitForNavigation({'waitUntil': 'networkidle2', 'timeout': 60000}),
    )
    # await asyncio.sleep(1)
    # await elem_carousel_banner.click()
    # elem_button = scroll_to_element(page, loc_infinite_src_prod1)
    # print(elem_button)
    # await asyncio.sleep(2)
    # await elem_button.click()
    # Assert if required, since the test is a simple one; we leave as is :D
    current_url = page.url
    print('Current URL is: ' + current_url)
    try:
        assert current_url == target_url_3
        print("Test Success: Product checkout successful")
    except PageError as e:
        print("Test Failure: Could not checkout Product")
        print("Error Code" + str(e))


@pytest.mark.asyncio
@pytest.mark.order(4)
async def test_lazy_load_infinite_scroll_2(page):
    # The time out can be set using the setDefaultNavigationTimeout
    # It is primarily used for overriding the default page timeout of 30 seconds
    page.setDefaultNavigationTimeout(timeOut)
    # Tested navigation using LambdaTest YouTube channel
    # await page.goto("https://www.youtube.com/@LambdaTest/videos",
    await page.goto(test2_url, {'waitUntil': 'load', 'timeout': timeOut})
    # Set the viewport - Apple MacBook Air 13-inch
    # Reference - https://codekbyte.com/devices-viewport-sizes/
    # await page.setViewport({'width': 1440, 'height': 770})
    await asyncio.sleep(1)
    await scroll_end_of_page(page)
    await page.evaluate('window.scrollTo(0, 0)')
    await asyncio.sleep(1)
    # elem_prod = await page.querySelector(loc_infinite_src_prod2)
    # asyncio.sleep(1)
    # await asyncio.gather(
    #     elem_prod.click(),
    #     page.waitForNavigation({'waitUntil': 'load', 'timeout': 60000}),
    # )
    elem_button = await scroll_to_element(page, loc_infinite_src_prod2)
    await asyncio.sleep(1)
    # await page.click(elem_button)
    await asyncio.gather(
        page.click(elem_button),
        page.waitForNavigation({'waitUntil': 'networkidle2', 'timeout': 60000}),
    )
    # Assert if required, since the test is a simple one; we leave as is :D
    current_url = page.url
    print('Current URL is: ' + current_url)
    try:
        assert current_url == target_url_4
        print("Test Success: Product checkout successful")
    except PageError as e:
        print("Test Failure: Could not checkout Product")
        print("Error Code" + str(e))
```
Python selenium-webdriver related posts
- Pyppeteer Tutorial: The Ultimate Guide to Using Puppeteer with Python
- A reddit account creator for r/place
- Twitter API reaching rate limit. 5 calls per 15 mins just to get user likes.
- It’s just dawned on me how insanely effective the gamification strategy of Duolingo is. I thought it was very kid-like at first until I realised how much I’ve learned using their approach.
- How do I scrape Twitter media along with text of all the accounts that I'm following without using API?
- Linkedin Comments Scraper - Script to scrape comments (including name, profile picture, designation, email(if present), and comment) from a LinkedIn post from the URL of the post.
- GufoskiGram - Multi social tool
Index
What are some of the best open-source selenium-webdriver projects in Python? This list will help you:
| # | Project | Stars |
|---|---------|-------|
| 1 | Scweet | 966 |
| 2 | common-intern | 637 |
| 3 | facebook-post-scraper | 295 |
| 4 | pyleniumio | 253 |
| 5 | Dalle3 | 172 |
| 6 | Python-Selenium-Action | 150 |
| 7 | website-to-gif | 104 |
| 8 | linkedin-comments-scraper | 70 |
| 9 | Instagram-Like-Comment-Bot | 62 |
| 10 | Reddit-Bot-Account-Maker | 50 |
| 11 | proxy_web_crawler | 41 |
| 12 | Sparx-bwk | 30 |
| 13 | SnapStreak-Recovery | 26 |
| 14 | selenium-python-pytest-bdd | 23 |
| 15 | s-tool | 10 |
| 16 | selenium_driver_updater | 10 |
| 17 | phantomime | 8 |
| 18 | Voov-Automation | 5 |
| 19 | web-scraping-with-python | 4 |
| 20 | jobsearch-python-webcrawler | 3 |
| 21 | duolingo-bot | 3 |
| 22 | citronella | 3 |
| 23 | GSOC_org_analysis | 0 |