Scrape images from a search engine with JavaScript and Puppeteer

This page summarizes the projects mentioned and recommended in the original post on dev.to.

  • sharp

    High performance Node.js image processing, the fastest module to resize JPEG, PNG, WebP, AVIF and TIFF images. Uses the libvips library.

  • We now have the links to all the images, but some of them are quite heavy (>1 MB). Fortunately, we can use another Node.js library to reduce their size with minimal loss of quality: sharp

  • duckduckgo-locales

    Translation files for https://duckduckgo.com

  • Script from the original post:

    const puppeteer = require("puppeteer")
    const data = require("./dog-breeds.json")

    const script = async () => {
      // Open a visible Chromium window; useful to see what is going on
      // and to test things before finalizing the script.
      const browser = await puppeteer.launch({ headless: false, slowMo: 100 })
      const page = await browser.newPage()

      // Loop over every breed.
      for (let dogBreed of data) {
        console.log("Start for breed:", dogBreed)
        const url = `https://duckduckgo.com/?q=${dogBreed.replaceAll(" ", "+")}&va=b&t=hc&iar=images&iax=images&ia=images`

        // In case we encounter a page without images or an error.
        try {
          await page.goto(url)
          // Make sure the page is loaded and contains our targeted element.
          await page.waitForNavigation()
          await page.waitForSelector(".tile--img__media")
          await page.evaluate(
            () => {
              const firstImage = document.querySelector(".tile--img__media")
              // Open the panel that contains the image info.
              firstImage.click()
            },
            { delay: 400 }
          )

          // Get the link of the image from the panel.
          await page.waitForSelector(".detail__pane a")
          const link = await page.evaluate(
            () => {
              const links = document.querySelectorAll(".detail__pane a")
              // "fichier" is French for "file"; presumably the label of the
              // image-file link on a French-locale results page.
              const linkImage = Array.from(links).find((item) =>
                item.innerText.includes("fichier")
              )
              return linkImage?.getAttribute("href")
            },
            { delay: 250 }
          )
          console.log("link successfully retrieved:", link)
          console.log("=====")
        } catch (e) {
          console.log(e)
        }
      }

      // Close the browser so the Node.js process can exit.
      await browser.close()
    }

    script()
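The script above builds the search URL by replacing spaces with "+", which would break on a breed name containing characters such as "&" or accents. A minimal sketch of one alternative, assuming the same query parameters are wanted: let `URLSearchParams` do the encoding. The helper name `buildImageSearchUrl` is hypothetical and does not come from the original post.

```javascript
// Sketch only: buildImageSearchUrl is a hypothetical helper name.
const buildImageSearchUrl = (query) => {
  // Same query parameters as the script above. URLSearchParams encodes
  // spaces as "+" and percent-escapes characters such as "&".
  const params = new URLSearchParams({
    q: query,
    va: "b",
    t: "hc",
    iar: "images",
    iax: "images",
    ia: "images",
  })
  return `https://duckduckgo.com/?${params}`
}

console.log(buildImageSearchUrl("golden retriever"))
// https://duckduckgo.com/?q=golden+retriever&va=b&t=hc&iar=images&iax=images&ia=images
```

Inside the loop, the template string could then be swapped for `buildImageSearchUrl(dogBreed)` without changing the URL produced for plain names.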

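The compression step mentioned in the sharp entry above could look roughly like the following. This is a sketch under stated assumptions, not the code from the original post: the helper names `compressedPath` and `compressImage`, the 1200 px width cap, and the JPEG quality of 80 are all hypothetical choices, and the global `fetch` assumes Node.js 18 or later.

```javascript
// Sketch only: the constants and both helper names are hypothetical.
const MAX_WIDTH = 1200
const JPEG_QUALITY = 80

// Derive an output file name such as "images/akita-min.jpg" from a breed name.
const compressedPath = (outDir, breed) =>
  `${outDir}/${breed.toLowerCase().replaceAll(" ", "-")}-min.jpg`

// Download one image and write a resized, recompressed JPEG copy.
const compressImage = async (imageUrl, outFile) => {
  // Loaded lazily so the pure helper above works without the native module.
  const sharp = require("sharp")
  const response = await fetch(imageUrl) // global fetch, Node.js 18+
  if (!response.ok) throw new Error(`Download failed: ${response.status}`)
  const input = Buffer.from(await response.arrayBuffer())
  return sharp(input)
    .resize({ width: MAX_WIDTH, withoutEnlargement: true }) // never upscale
    .jpeg({ quality: JPEG_QUALITY })
    .toFile(outFile)
}
```

With a link retrieved by the scraping script, a call such as `await compressImage(link, compressedPath("images", dogBreed))` would write the smaller copy, provided the `images` directory already exists. How much size is saved depends on the source image and the chosen quality.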
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
