RedditExtractor vs disk.frame
| | RedditExtractor | disk.frame |
|---|---|---|
| Mentions | 5 | 5 |
| Stars | 82 | 592 |
| Growth | - | 0.5% |
| Activity | 3.3 | 0.0 |
| Latest commit | 8 months ago | 3 months ago |
| Language | R | R |
| License | GNU General Public License v3.0 only | GNU General Public License v3.0 or later |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
RedditExtractor
- Will RedditExtractoR be impacted by API changes?
  IIRC, RedditExtractoR doesn't use OAuth2, so I think the 10 requests/minute rate limit will be applied to the library/client.
- bulk subreddit datasets?
  Sounds like this package might help you reach your objective. The timeframe you can capture will depend on the amount of activity within the subreddit of interest.
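To make the suggestion above concrete, here is a minimal sketch of pulling subreddit threads with RedditExtractoR. It assumes the v3 interface (`find_thread_urls()` / `get_thread_content()`); the subreddit name, period, and the number of threads fetched are placeholders, not recommendations.

```r
# Sketch: collect recent threads from a subreddit with RedditExtractoR.
# Assumes the v3 API; "rstats" and period = "month" are placeholders.
library(RedditExtractoR)

# Find thread URLs posted in r/rstats over the past month
threads <- find_thread_urls(subreddit = "rstats", period = "month")

# Fetch full content (thread metadata plus comments) for a few threads
content <- get_thread_content(threads$url[1:5])

str(content$threads)   # thread-level metadata
str(content$comments)  # comment-level data
```

How far back you can reach is bounded by what the Reddit listing endpoints return, which is why the capturable timeframe depends on subreddit activity.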
- Has anyone here used the Reddit API before (in R)?
- Using RedditExtractoR to scrape flairs?
  My apologies if the Reddit API flair is inappropriate here - RedditExtractoR does use the Reddit API, but it's technically distinct as a simplified package for R (see: https://github.com/ivan-rivera/RedditExtractor)
- H3 Podcast YouTube Views Analysis
  Great idea - yeah, Reddit has an API too, and it looks like there are R and Python packages to access it: https://github.com/ivan-rivera/RedditExtractor
disk.frame
- Do you code from memory? Or do you reference things?
  Say hello to disk.frame.
- How can I read in only two columns from a massive 10+ GB tab file?
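For the question above, one common answer is to parse only the needed columns rather than the whole file. A minimal sketch using `data.table::fread()`'s `select` argument (the file path and column names are placeholders):

```r
# Sketch: read just two columns of a large tab-delimited file.
# fread() skips parsing the unselected columns, which keeps memory
# usage proportional to the two columns you actually need.
# "big_file.tsv", "id" and "value" are placeholder names.
library(data.table)

dt <- fread("big_file.tsv", sep = "\t", select = c("id", "value"))
```

If even two columns exceed RAM, chunking the file onto disk with disk.frame and selecting columns per chunk is an alternative.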
- Data cleaning/analysis of 100-200 million rows of data. Is this doable in R, or is there another program I should try instead?
  It depends on your hardware, but it should not be a problem. You might look into disk.frame (https://diskframe.com) or similar packages.
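A rough sketch of what a disk.frame workflow for data of that size looks like, assuming the package's `csv_to_disk.frame()` ingester and its chunk-wise dplyr-verb support; the file path and column names (`value`, `group`) are placeholders:

```r
# Sketch: process a larger-than-RAM CSV with disk.frame.
# The data is chunked onto disk, and dplyr verbs run chunk by chunk.
library(disk.frame)
library(dplyr)

setup_disk.frame()  # start background workers for parallel chunks

# Ingest the CSV into an on-disk, chunked disk.frame
df <- csv_to_disk.frame("big_data.csv", outdir = "big_data.df")

# Lazy dplyr verbs are applied per chunk; collect() materialises
# the (much smaller) aggregated result in memory
result <- df %>%
  filter(!is.na(value)) %>%
  group_by(group) %>%
  summarise(total = sum(value)) %>%
  collect()
```

Because only the aggregated result is pulled into RAM, the workable dataset size is bounded by disk space and chunk size rather than by memory.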
- Is it possible to have my environment objects and work with them on my local drive instead of RAM?
  If that doesn't work, the disk.frame package might help. It is relatively new and not widely used, but it does work with data on disk rather than in memory.
- We Test PCIe 4.0 Storage: The AnandTech 2021 SSD Benchmark Suite
  > The speeds were just stunning to say the least at 15GB/s.
  That is amazing: around DDR4-1866 speeds, and not far from DDR4-2666 (~21 GB/s). At those speeds I would happily work with data frames sitting on the disk rather than in memory [1, 2]. Did you benchmark RAID 0 with fewer than four disks?
[1] R: https://github.com/xiaodaigh/disk.frame
What are some alternatives?
Pushshift API - A searchable archive API for Reddit submissions and comments
db-benchmark - reproducible benchmark of database-like ops
police-settlements - A FiveThirtyEight/The Marshall Project effort to collect comprehensive data on police misconduct settlements from 2010-19.
drake - An R-focused pipeline toolkit for reproducibility and high-performance computing
reddit-awards-data - Dataset and visualizations of the most popular Reddit Awards, using the PRAW API.
Rcrawler - An R web crawler and scraper
r4ds - R for data science: a book
tuber - :sweet_potato: Access YouTube from R
awesome-R - A curated list of awesome R packages, frameworks and software.
polite - Be nice on the web
opentripplanner - An R package to set up and use OpenTripPlanner (OTP) as a local or remote multimodal trip planner.