scrapeghost
exiftool
scrapeghost | exiftool | |
---|---|---|
10 | 249 | |
1,396 | 2,860 | |
- | 2.8% | |
8.2 | 7.0 | |
5 months ago | 9 days ago | |
Python | Perl | |
GNU General Public License v3.0 or later | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
scrapeghost
-
Those of you who have developed product features using GPT4 API (or failed to do so), how did it go?
Not my project but an ex-colleague has been having some success in this direction: https://jamesturk.github.io/scrapeghost/
-
What are the best tools for web scraping and analysis of natural language to populate a dataset?
Yes, there is something like that available - ScrapeGhost.
- FLaNK Stack Weekly 3 April 2023
- Scraping Websites Using GPT
-
@TwitterDev Announces New Twitter API Tiers
With AI scraping, tools can be far more resilient than soon enough to minor dom changes. See - https://jamesturk.github.io/scrapeghost/.
-
Experimental library for scraping websites using OpenAI's GPT API
Their ToS mentions scraping but it pertains to scraping their frontend instead of using their API, which they don't want you to do.
Also - this library requests the HTML by itself [0] and ships it as a prompt but with preset system messages as the instruction [1].
[0] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...
[1] - https://github.com/jamesturk/scrapeghost/blob/main/src/scrap...
- scrapeghost. Web scrape using gpt-4 (experimental)
exiftool
-
Ask HN: Best to store, index and categorize audio recordings
If you're doing a pipelined bulk processing pass to add metadata tags after extracting them via Speech to text, or have delimited notes in a text file, or ... etc.
You might find ExifTool useful.
It's pure commandline (with a few third party GUI's IIRC) multiplatform and purpose built to display, edit, add media tags to all sorts of AV files.
https://exiftool.org/
-
Cleaning up my 200GB iCloud with some JavaScript
> Any method that I've found to clean them up (exporting the originals, deleting them from the library, and then re-importing the JPEGs only seems easiest) will lose all of the years of metadata that I've built up in the library.
The open source tool osxphotos (https://github.com/RhetTbull/osxphotos) can help with this. You can export the JPEG images while preserving metadata using the thrid-party exiftool utility:
`osxphotos export /path/to/export --has-raw --skip-raw --exiftool`
This exports all images that have a raw pair but skips the raw component then uses exiftool (https://exiftool.org/) to write the metadata (keywords, etc.) to the exported JPEG files. You can then re-import these into photos either by dragging them or by running `osxphotos import /path/to/export/*`
Both the export and import commands have many other options for controlling export directory, etc. `osxphotos help export` or `osxphotos docs` to open docs in browser. (Disclaimer: I'm the author of osxphotos)
-
Is there a way to remove metadata from an image file?
Check out exiftool.org
- EXIF Data from Cloud Stock Photo Used for Production of Satellite Video
-
Locationator: Access Apple's Reverse Geocoding service from the command line, Services menu
Locationator also comes with an optional CLI that can be used to perform reverse geocoding on images from the command line or perform the reverse geocoding and then write the location data to the file's XMP metadata using exiftool. It also comes with two services for doing the same from the Finder or other apps using the Services menu.
-
Modifying "Media Creation Date" metadata in .m4v files?
Edit: Nevermind, I got it. I used PyExifTool and installed exiftool from exiftool.org.
- Exploring EXIF
-
Canon PowerShot S95
May not work as not all camera store the serial number in the EXIF, but if you've got exiftool installed you can try running:
-
JPEG XL: How It Started, How It’s Going
I think TIFF has some unique features that makes it more prone to certain security issues[1] compared to other formats, such as storing absolute file offsets instead of relative offsets. So I am not sure TIFF is a good container format, but many camera raws are TIFF-based for some reason.[2]
[1] https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=libtiff
[2] https://exiftool.org/#supported (search for "TIFF-based")
-
How to keep file creation dates intact when importing to DSM?
I have struggled with this in the past, and I found the utility called exiftool quite useful.
What are some alternatives?
autoscraper - A Smart, Automatic, Fast and Lightweight Web Scraper for Python
exiv2 - Image metadata library and tools
tmx-solver - ThreatMetrix (anti-bot/fraud-detection) solver, deobfuscator & data harvester
jExifToolGUI - jExifToolGUI is a multi-platform java/Swing graphical frontend for the excellent command-line ExifTool application by Phil Harvey
wikipedia_ql - Query language for efficient data extraction from Wikipedia
exifcleaner - Cross-platform desktop GUI app to clean image metadata
Bandwhich - Terminal bandwidth utilization tool
HomeBrew - 🍺 The missing package manager for macOS (or Linux)
bpytop - Linux/OSX/FreeBSD resource monitor
FFmpeg - Mirror of https://git.ffmpeg.org/ffmpeg.git
duckling - Language, engine, and tooling for expressing, testing, and evaluating composable language rules on input strings.
DiffusionToolkit - Metadata-indexer and Viewer for AI-generated images