Pushshift API
Sketchpad
Metric | Pushshift API | Sketchpad
---|---|---
Mentions | 122 | 42
Stars | 1,255 | 112
Growth | - | -
Activity | 0.0 | 3.0
Last commit | about 1 year ago | 7 months ago
Language | Python | Python
License | - | -
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Pushshift API
-
[POLL] Should r/seventeen join the blackout on June 12-14th in protest of Reddit’s changes to their API pricing?
As an average user, you may not notice much difference in your browsing experience. Those using external moderation or accessibility tools may: although Reddit has announced that tools/apps focusing on accessibility and moderation will not be negatively affected by the pricing changes, many of these tools are provided by third-party apps that will be affected. Sites such as Unddit (for tracking post/comment edit history), which run on APIs like Pushshift and are not quite covered under the vague criteria of being “legal, non-commercial, and helpful to mods,” are likely to no longer work as intended going forward.
- Reddit API commercialization
- Advancing Community-Led Moderation: An Update on How NCRI/Pushshift and Reddit, Inc. are Working Together
-
Delusion
Pushshift also supports other query parameters, such as specifying a time range using 'before' and 'after' or sorting the results by 'score', 'num_comments', etc. You can find the full list of query parameters in the Pushshift documentation: https://github.com/pushshift/api
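As a rough illustration of the parameters mentioned above, the sketch below only builds a search URL and never sends a request. The `submission` path and the `sort` and `sort_type` parameter names are assumptions based on the historical public documentation and may not match the service's current behaviour; `build_search_url` is a hypothetical helper, not part of any library.

```python
from urllib.parse import urlencode

# Host and path follow the comment-search URL quoted elsewhere on this page.
# The "submission" variant is an assumption from the historical docs.
BASE_URL = "https://api.pushshift.io/reddit/search"


def build_search_url(kind, **params):
    """Return a search URL for `kind` ("comment" or "submission").

    Parameters whose value is None are dropped. Nothing is sent over the network.
    """
    if kind not in ("comment", "submission"):
        raise ValueError(f"unsupported kind: {kind!r}")
    cleaned = {key: value for key, value in params.items() if value is not None}
    return f"{BASE_URL}/{kind}?{urlencode(cleaned)}"


url = build_search_url(
    "submission",
    q="api pricing",
    after=1685577600,          # epoch seconds: 2023-06-01 00:00:00 UTC
    before=1686787200,         # epoch seconds: 2023-06-15 00:00:00 UTC
    sort_type="num_comments",  # assumed parameter name
    sort="desc",               # assumed parameter name
)
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect the query before spending any of a rate-limited API budget on it.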
-
I am scared of Unddit, Reveddit, and other clones. They violate all rights to privacy and save people's content without any consent process at all. There is no recourse to delete your data.
The thing to look at is Pushshift: https://github.com/pushshift/api
-
Happy cake day
Rule 3 says I can't link directly, but I can link to this. Pushshift is absolutely amazing for things like this. Just enter a url like http://api.pushshift.io/reddit/search/comment?q="[START OF COMMENT TEXT]" and you should be able to find it quite easily.
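A URL like the one in the comment above contains quotes and spaces, which need percent-encoding if the URL is produced by a script rather than pasted into a browser. A minimal sketch, assuming the quoted form is treated as a phrase search as the comment suggests (not verified here); `comment_phrase_url` is a hypothetical helper name.

```python
from urllib.parse import quote_plus


def comment_phrase_url(phrase):
    """Build a comment-search URL whose q parameter is `phrase` wrapped in double quotes."""
    quoted = f'"{phrase}"'
    # quote_plus turns '"' into %22 and spaces into '+', so the URL is safe to send.
    return "http://api.pushshift.io/reddit/search/comment?q=" + quote_plus(quoted)


print(comment_phrase_url("start of the comment text"))
```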
- Random Daily Discussion Thread - March 08, 2023 at 09:00PM
-
Discussion Thread
In the meantime, the third party tool Pushshift can accomplish pretty much the same thing, just with some effort. It's a database of (almost) every comment and submission on Reddit, and you can filter the results based on things like the author and keywords in the body. You can write the queries yourself, but it's much easier to just use a webpage designed to make it easy. For example, if you wanted to find old KITTY pings, you could just enter "groupbot" in the author field, and "KITTY" in the "search term" of https://camas.unddit.com/. The result is this:
-
Occasional Updates?
It has; the code behind the api server is here: https://github.com/pushshift/api
- An app or website to look at banned or private subreddits' comments
Sketchpad
-
I'm scared of losing this safe space (and other trans subreddits) in the face of API changes and possible hate flood that will come after
Also I'm gonna try and download all of the trans subreddits using this script: https://github.com/Watchful1/Sketchpad/blob/master/postDownloader.py Hopefully I can get it working tomorrow.
-
Reddit’s plan to kill third-party apps sparks widespread protests
Looks like there are also some unofficial, faster ways. But I don't know if they work: https://github.com/Watchful1/Sketchpad/blob/master/postDownloader.py
-
Script to find overlapping users between subreddits from dump files
A while back I wrote a fairly popular script that used the pushshift api to find overlapping users between subreddits. This doesn't work anymore since the api is down, so I threw together an updated script that does the same thing using the subreddit dump files.
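The idea of finding overlapping users from dump data can be sketched as follows. This is not the script described above; it is a minimal illustration that assumes the dump has already been decompressed into newline-delimited JSON, with one object per line carrying `author` and `subreddit` fields. The function names are hypothetical.

```python
import json
from collections import defaultdict


def authors_by_subreddit(lines, subreddits):
    """Map each wanted subreddit (lowercased) to the set of authors seen in it."""
    wanted = {name.lower() for name in subreddits}
    seen = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        sub = str(obj.get("subreddit", "")).lower()
        author = obj.get("author")
        # Skip removed accounts, which would otherwise look like one shared user.
        if sub in wanted and author and author != "[deleted]":
            seen[sub].add(author)
    return seen


def overlapping_users(lines, sub_a, sub_b):
    """Return the authors that appear in both subreddits."""
    seen = authors_by_subreddit(lines, [sub_a, sub_b])
    return seen[sub_a.lower()] & seen[sub_b.lower()]


sample = [
    '{"author": "alice", "subreddit": "python"}',
    '{"author": "bob", "subreddit": "python"}',
    '{"author": "alice", "subreddit": "learnpython"}',
    '{"author": "[deleted]", "subreddit": "python"}',
    '{"author": "[deleted]", "subreddit": "learnpython"}',
    '{"author": "carol", "subreddit": "askreddit"}',
]
print(sorted(overlapping_users(sample, "python", "learnpython")))
```

Because the function accepts any iterable of lines, the same code can be fed from an open file handle or a streaming decompressor without loading a whole dump into memory.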
-
PRAW - getting ONLY top comments of a single specific thread efficiently
If you actually just want top level comments I have an example here https://github.com/Watchful1/Sketchpad/blob/master/load_top_level.py
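One way to recognise top-level comments, independent of the linked script, is by their parent: in Reddit's data a comment's `parent_id` starts with `t3_` when the parent is the submission and with `t1_` when the parent is another comment. The sketch below uses plain dicts as stand-ins for comment objects; `top_level_only` is a hypothetical helper.

```python
def top_level_only(comments):
    """Keep comments whose parent is the submission itself (fullname prefix "t3_")."""
    return [c for c in comments if str(c.get("parent_id", "")).startswith("t3_")]


thread = [
    {"id": "c1", "parent_id": "t3_abc", "body": "top-level"},
    {"id": "c2", "parent_id": "t1_c1", "body": "reply to c1"},
    {"id": "c3", "parent_id": "t3_abc", "body": "another top-level"},
]
print([c["id"] for c in top_level_only(thread)])
```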
- Late Night Random Discussion Thread - 04 April, 2023
-
Help with search and count results script of reddit API
I have a script here that lets you download a specific subreddit's or user's entire history using pushshift. It's a good example of how the url works and how to iterate through results based on timestamp. You can add a q=keyword parameter to filter to only submissions/comments matching a specific keyword. And you could remove the subreddit parameter if you want data from all of reddit.
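Iterating through results based on timestamp can be sketched as below. This is not the script referred to above: `iterate_by_timestamp` and `fake_fetch` are hypothetical names, and the fetch function is injected so the loop can be exercised without network access. It assumes each result carries a `created_utc` field and that the fetch function returns items strictly older than `before`.

```python
def iterate_by_timestamp(fetch_page, start_before=None, max_pages=1000):
    """Yield items page by page, moving the `before` cursor back in time.

    `fetch_page(before)` must return a list of dicts with a "created_utc" key,
    all older than `before` (or the newest items when `before` is None).
    """
    before = start_before
    for _ in range(max_pages):
        page = fetch_page(before)
        if not page:
            return
        yield from page
        # The oldest timestamp on this page becomes the next cursor.
        before = min(item["created_utc"] for item in page)


# Stand-in data source used instead of a real API call.
DATA = [{"id": i, "created_utc": 1000 + i} for i in range(5)]


def fake_fetch(before, size=2):
    older = [d for d in DATA if before is None or d["created_utc"] < before]
    older.sort(key=lambda d: d["created_utc"], reverse=True)
    return older[:size]


ids = [item["id"] for item in iterate_by_timestamp(fake_fetch)]
print(ids)
```

One caveat of a strict "older than" cursor is that items sharing the exact timestamp of a page boundary can be skipped; a real downloader may want to overlap by a second and deduplicate by id.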
-
Are the more comments objects directive, or random?
I have an old script I wrote a long time ago to fetch only the top level comments in a thread here, which isn't quite what you're trying to do but should be a good example.
- Getting more than 1000 threads.
-
Separate dump files for the top 20k subreddits
In addition to the dump files, pushshift offers an API with powerful filtering options. The main limitation is that it takes quite some time to download a substantial amount of data. If you have a use case that doesn't cleanly align to specific subreddits, take a look at my api download script here. Again I'm happy to work with you to build something for a specific use case.
-
Looking for advice on how to identify users based on unique combinations of subreddit activity
There was a script posted at https://github.com/Watchful1/Sketchpad/blob/master/overlapCounter.py that does exactly what I need by using the pushshift api, but it seems too slow to work as a web app, and I have no idea where to begin in converting the script to a web app.
What are some alternatives?
Removeddit - View deleted stuff from reddit
PushshiftDumps - Example scripts for the pushshift dump files
PRAW - PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
qBittorrent - qBittorrent BitTorrent client
psaw - Python Pushshift.io API Wrapper (for comment/submission search)
7-Zip-zstd - 7-Zip with support for Brotli, Fast-LZMA2, Lizard, LZ4, LZ5 and Zstandard
pmaw - A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
snoowrap - A JavaScript wrapper for the reddit API
RedditExtractoR - Reddit Data Extraction Toolkit (read-only mirror of the CRAN R package repository)
snowflake - Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.
RedditExtractor - A minimalistic R wrapper for the Reddit API
timesearch - The subreddit archiver