Pushshift API
psaw
Metric | Pushshift API | psaw
---|---|---
Mentions | 122 | 20
Stars | 1,255 | 311
Growth | - | -
Activity | 0.0 | 0.0
Latest commit | about 1 year ago | over 2 years ago
Language | Python | Python
License | - | BSD 2-clause "Simplified" License
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Pushshift API
- [POLL] Should r/seventeen join the blackout on June 12-14th in protest of Reddit’s changes to their API pricing?
As an average user, you may not notice much difference in your browsing experience. Reddit has announced that tools and apps focusing on accessibility and moderation will not be negatively affected by the pricing changes, but many of these tools are provided by third-party apps that will be affected. Sites such as Unddit (for tracking post/comment edit history), which run on APIs like Pushshift and are not clearly covered by the vague criteria of being “legal, non-commercial, and helpful to mods,” are likely to stop working as intended going forward.
- Reddit API Commercialization
- Advancing Community-Led Moderation: An Update on How NCRI/Pushshift and Reddit, Inc. are Working Together
- Delusion
Pushshift also supports other query parameters, such as specifying a time range using 'before' and 'after' or sorting the results by 'score', 'num_comments', etc. You can find the full list of query parameters in the Pushshift documentation: https://github.com/pushshift/api
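As a rough illustration of the parameters mentioned above, the sketch below only builds a search URL and does not send a request. The host and path follow a URL quoted elsewhere on this page; the `sort_type` parameter name, the search term, and the epoch timestamps are assumptions, so check them against the Pushshift documentation before relying on them.

```python
from urllib.parse import urlencode

# Comment search endpoint as quoted on this page; its availability is not guaranteed.
BASE_URL = "https://api.pushshift.io/reddit/search/comment"


def build_search_url(query, after=None, before=None, sort_type=None):
    """Return a search URL for the given filters. No request is sent."""
    params = {"q": query}
    if after is not None:
        params["after"] = after      # lower bound, epoch seconds
    if before is not None:
        params["before"] = before    # upper bound, epoch seconds
    if sort_type is not None:
        params["sort_type"] = sort_type  # assumed parameter name for the sort field
    return f"{BASE_URL}?{urlencode(params)}"


# 1686528000 is 2023-06-12 00:00 UTC; 1686787200 is 2023-06-15 00:00 UTC.
url = build_search_url("blackout", after=1686528000, before=1686787200, sort_type="score")
print(url)
```

Keeping URL construction separate from the HTTP call makes it easy to inspect the query before sending it with whichever client you prefer.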
- I am scared of Unddit, Reveddit, and other clones. They violate all rights to privacy and save people's content without any consent process at all. There is no recourse to delete your data.
The thing to look at is Pushshift: https://github.com/pushshift/api
- Happy cake day
Rule 3 says I can't link directly, but I can link to this. Pushshift is absolutely amazing for things like this. Just enter a URL like http://api.pushshift.io/reddit/search/comment?q="[START OF COMMENT TEXT]" and you should be able to find it quite easily.
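The URL pattern in this comment can be generated programmatically. The sketch below wraps the start of a comment in double quotes and percent-encodes it so the result is a valid URL; the helper name `comment_search_url` and the sample text are made up for illustration, and no request is sent.

```python
from urllib.parse import quote

# Endpoint copied from the comment above.
SEARCH_ENDPOINT = "http://api.pushshift.io/reddit/search/comment"


def comment_search_url(comment_start):
    """Build a quoted-phrase comment search URL (hypothetical helper)."""
    phrase = f'"{comment_start}"'  # the double quotes are part of the query value
    return f"{SEARCH_ENDPOINT}?q={quote(phrase)}"


print(comment_search_url("Happy cake day"))
```

Encoding the quotes and spaces avoids relying on the browser to fix up the URL.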
- Random Daily Discussion Thread - March 08, 2023 at 09:00PM
- Discussion Thread
In the meantime, the third party tool Pushshift can accomplish pretty much the same thing, just with some effort. It's a database of (almost) every comment and submission on Reddit, and you can filter the results based on things like the author and keywords in the body. You can write the queries yourself, but it's much easier to just use a webpage designed to make it easy. For example, if you wanted to find old KITTY pings, you could just enter "groupbot" in the author field, and "KITTY" in the "search term" of https://camas.unddit.com/. The result is this:
- Occasional Updates?
It has. The code behind the API server is here: https://github.com/pushshift/api
- An app or website to look at banned or private subreddits' comments
psaw
- "Unable to connect to pushshift.io."
PSAW is deprecated (https://github.com/dmarx/psaw); try PMAW. Be aware, though, of the well-documented, ongoing issues with Pushshift itself; some of the wrappers aren't working as expected. I suspect that once the API itself is functioning normally there may be further updates to the wrappers.
- I've been getting Response status code 404 since Monday morning. Is this due to the system update? Should I be changing my script in some way to access the updated API?
This information is contained in the README on GitHub but is not in the readthedocs page for some reason.
- How to make the bot respond based on invocation and not subreddit
You could use https://github.com/dmarx/psaw to monitor keywords. I haven't personally used it but it is a popular method.
- How to collect top submissions per day of a specific subreddit?
You can do the first part easily with PSAW if you use Python. It lets you get submissions from Pushshift and then updates them with the current data from the Reddit API. Then you would have to sort them, which is also fairly easy with Python.
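The sorting step described in this answer can be sketched in plain Python. The example assumes each submission is a dict with `created_utc` (epoch seconds) and `score` fields, as Reddit data usually provides; the sample records and the helper name `top_per_day` are invented for illustration, and fetching the data is left out.

```python
from collections import defaultdict
from datetime import datetime, timezone


def top_per_day(submissions, n=1):
    """Group submissions by UTC calendar day and keep the n highest-scoring per day."""
    by_day = defaultdict(list)
    for sub in submissions:
        day = datetime.fromtimestamp(sub["created_utc"], tz=timezone.utc).date()
        by_day[day].append(sub)
    # Days are returned in chronological order, each list sorted by score descending.
    return {
        day: sorted(items, key=lambda s: s["score"], reverse=True)[:n]
        for day, items in sorted(by_day.items())
    }


# Invented sample records: two posts on one day, one on the next.
sample = [
    {"id": "a", "created_utc": 1686528000, "score": 10},
    {"id": "b", "created_utc": 1686531600, "score": 42},
    {"id": "c", "created_utc": 1686614400, "score": 7},
]
top = top_per_day(sample)
for day, items in top.items():
    print(day, [s["id"] for s in items])
```

Grouping on the UTC date keeps day boundaries consistent regardless of the machine's local time zone.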
- PRAW - How do I get more responses
Most of PRAW's methods have a limit argument, usually with a default of 100. Set it to None, which actually sets the limit to 1000. If you want more than 1000 items, you'd have to resort to other APIs, such as https://github.com/dmarx/psaw.
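A minimal sketch of the `limit` argument follows. It uses a stand-in class (`FakeSubreddit`, invented here) so it runs without credentials or network access; with a real `praw.Reddit` instance the equivalent call would be along the lines of `reddit.subreddit("name").new(limit=None)`, and the roughly 1000-item ceiling mentioned above would still apply.

```python
def collect_new(subreddit, limit=None):
    """Collect items from a PRAW-style listing.

    limit=None asks for as many items as the listing will return.
    """
    return list(subreddit.new(limit=limit))


class FakeSubreddit:
    """Offline stand-in that mimics the shape of a PRAW subreddit listing."""

    def __init__(self, items):
        self._items = items

    def new(self, limit=100):
        # Mirror the described behaviour: default of 100, None means "no explicit cap".
        return iter(self._items if limit is None else self._items[:limit])


fake = FakeSubreddit([f"post{i}" for i in range(250)])
print(len(collect_new(fake)))             # all 250 stand-in items
print(len(collect_new(fake, limit=100)))  # first 100 only
```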
- Most posts in search results are only showing a score of 1.
To get live scores or other metadata, you should incorporate accessing the reddit API into your workflow. One easy way to do this is using the 3rd party Pushshift wrapper called PSAW. See the note about setting r = praw.Reddit(...) and api = PushshiftAPI(r).
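The idea of refreshing archived records with live values can be illustrated without either library. This is not PSAW's implementation; it is a sketch in which `fetch_live` is a hypothetical callback standing in for a Reddit API lookup, and the records are invented.

```python
def refresh_scores(archived, fetch_live):
    """Return copies of archived records, overwritten with live fields where available."""
    refreshed = []
    for record in archived:
        live = fetch_live(record["id"])  # hypothetical lookup; returns a dict or None
        # Live fields win over archived ones; records with no live data are kept as-is.
        refreshed.append({**record, **live} if live else dict(record))
    return refreshed


# Archived data often shows the score captured at ingest time.
archived = [{"id": "abc", "score": 1}, {"id": "def", "score": 1}]
live_data = {"abc": {"score": 57}}  # "def" may have been deleted since

result = refresh_scores(archived, live_data.get)
print(result)
```

Passing the lookup in as a callback keeps the merge logic testable and independent of whichever client performs the live request.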
- [OC] Modelling /r/CryptoCurrency's Time Variant Subconscious Using Deep Learning!
PSAW: https://github.com/dmarx/psaw
- Removing deleted/archived posts
Try using psaw. That will query Pushshift first and copy over updated data from Reddit.
- PSAW user question
Try without the `asc` sort parameter. From the source code we can see that it can cause issues https://github.com/dmarx/psaw/blob/master/psaw/PushshiftAPI.py#L162-L164
- Question/Help - Getting data about user flairs on r/Hololive
I again changed the way I collect data. Adapted from (https://deepnote.com/@deepnote/Mining-and-Exploring-Reddit-Data-using-Python-rfZ7TRRAT2unpCqU6egaKA) and using PSAW (Python Pushshift.io API Wrapper) (https://github.com/dmarx/psaw).
What are some alternatives?
Removeddit - View deleted stuff from reddit
PRAW - PRAW, an acronym for "Python Reddit API Wrapper", is a python package that allows for simple access to Reddit's API.
huggingface_hub - The official Python client for the Huggingface Hub.
pmaw - A multithread Pushshift.io API Wrapper for reddit.com comment and submission searches.
SubredditDownloader - This Python tool allows downloading of all submissions from a subreddit using Pushshift (API/Files) and the Reddit API
snoowrap - A JavaScript wrapper for the reddit API
reddit-flair-popularity
RedditExtractoR - This is a read-only mirror of the CRAN R package repository. RedditExtractoR: Reddit Data Extraction Toolkit
RemindMeBot - u/RemindMeBot on reddit
snowflake - Snowflake is a network service for generating unique ID numbers at high scale with some simple guarantees.
PrawWrapper - A wrapper around PRAW for easier unit testing