cstore_fdw VS site

Compare cstore_fdw and site and see how they differ.

cstore_fdw

Columnar storage extension for Postgres built as a foreign data wrapper. Check out https://github.com/citusdata/citus for a modernized columnar storage implementation built as a table access method. (by citusdata)

site

The new frontend/backend code for https://xeiaso.net (by Xe)

                 cstore_fdw            site
Mentions         6                     12
Stars            1,738                 601
Growth           0.4%                  -
Activity         2.6                   9.5
Last commit      about 3 years ago     8 days ago
Language         C                     MDX
License          Apache License 2.0    zlib License

The number of mentions indicates the total number of mentions that we've tracked plus the number of user suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.

cstore_fdw

Posts with mentions or reviews of cstore_fdw. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2024-02-21.
  • Moving a Billion Postgres Rows on a $100 Budget
    2 projects | news.ycombinator.com | 21 Feb 2024
    Columnar store PostgreSQL extensions exist; here are two, but I think I’m missing at least one more:

    https://github.com/citusdata/cstore_fdw

    https://github.com/hydradatabase/hydra

    You can also connect other stores using the foreign data wrappers, like parquet files stored on an object store, duckdb, clickhouse… though the joins aren’t optimised, as PostgreSQL would do a full scan on the external table when joining.

  • Anything can be a message queue if you use it wrongly enough
    6 projects | news.ycombinator.com | 4 Jun 2023
    I'm definitely not from Citus data -- just a pg zealot fighting the culture war.

    If you want to reach people who can actually help, you probably want to check this link:

    https://github.com/citusdata/cstore_fdw/issues

  • Pg_squeeze: An extension to fix table bloat
    3 projects | news.ycombinator.com | 4 Oct 2022
    That appears to be the case:

    https://github.com/citusdata/cstore_fdw

    >Important notice: Columnar storage is now part of Citus

  • Ingesting an S3 file into an RDS PostgreSQL table
    3 projects | dev.to | 10 Jun 2022
    either we go for RDS, but we stick to the AWS handpicked extensions (exit timescale, citus or their columnar storage, ... ),
  • Postgres and Parquet in the Data Lake
    7 projects | news.ycombinator.com | 3 May 2022
    Re: performance overhead, with FDWs we have to re-munge the data into PostgreSQL's internal row-oriented TupleSlot format again. Postgres also doesn't run aggregations that can take advantage of the columnar format (e.g. CPU vectorization). Citus had some experimental code to get that working [2], but that was before FDWs supported aggregation pushdown. Nowadays it might be possible to basically have an FDW that hooks into the GROUP BY execution and runs a faster version of the aggregation that's optimized for columnar storage. We have a blog post series [3] about how we added agg pushdown support to Multicorn -- similar idea.

    There's also DuckDB which obliterates both of these options when it comes to performance. In my (again limited, not very scientific) benchmarking on a customer's 3M row table [4] (278MB in cstore_fdw, 140MB in Parquet), I see a 10-20x (1/2s -> 0.1/0.2s) speedup on some basic aggregation queries when querying a Parquet file with DuckDB as opposed to using cstore_fdw/parquet_fdw.

    I think the dream is being able to use DuckDB from within a FDW as an OLAP query engine for PostgreSQL. duckdb_fdw [5] exists, but it basically took sqlite_fdw and connected it to DuckDB's SQLite interface, which means that a lot of operations get lost in translation and aren't pushed down to DuckDB, so it's not much better than plain parquet_fdw.

    This comment is already getting too long, but FDWs can indeed participate in partitions! There's this blog post that I keep meaning to implement where the setup is, a "coordinator" PG instance has a partitioned table, where each partition is a postgres_fdw foreign table that proxies to a "data" PG instance. The "coordinator" node doesn't store any data and only gathers execution results from the "data" nodes. In the article, the "data" nodes store plain old PG tables, but I don't think there's anything preventing them from being parquet_fdw/cstore_fdw tables instead.

    [0] https://github.com/citusdata/cstore_fdw

  • Creating a simple data pipeline
    1 project | /r/dataengineering | 20 May 2021
    The citus extension for postgresql. https://github.com/citusdata/cstore_fdw
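
The "Postgres and Parquet in the Data Lake" excerpt above contrasts PostgreSQL's row-oriented tuple format with columnar storage for aggregation queries. The following is only a toy sketch of that difference in plain Python, not how cstore_fdw, parquet_fdw, or DuckDB are implemented; the sample data and all function names are hypothetical.

```python
from array import array

# Hypothetical toy data: three rows of (id, region, amount).
rows = [
    (1, "eu", 10.0),
    (2, "us", 25.5),
    (3, "eu", 4.5),
]


def sum_amount_row_store(rows):
    """Row layout: every whole row is visited even though one field is needed."""
    total = 0.0
    for row in rows:
        total += row[2]  # pick the 'amount' field out of each full tuple
    return total


def to_columns(rows):
    """Pivot the rows into one contiguous sequence per column."""
    ids, regions, amounts = zip(*rows)
    return {
        "id": array("q", ids),          # 64-bit signed integers
        "region": list(regions),
        "amount": array("d", amounts),  # 64-bit floats stored contiguously
    }


def sum_amount_column_store(columns):
    """Column layout: the aggregate reads only the 'amount' column."""
    return sum(columns["amount"])


columns = to_columns(rows)
print(sum_amount_row_store(rows), sum_amount_column_store(columns))
```

Both functions return the same total. The column version touches only the values it aggregates, which is the property that columnar engines exploit with techniques such as CPU vectorization.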

site

Posts with mentions or reviews of site. We have used some of these posts to build our list of alternatives and similar projects. The last one was on 2023-12-03.
  • Ask HN: What side projects landed you a job?
    62 projects | news.ycombinator.com | 3 Dec 2023
    My blog https://xeiaso.net (source code: https://github.com/Xe/site) and the stuff I've written for it ended up doing several things to help me get employed over the years:

    1. Letting me have a place to write to get better at writing, which makes it easier to do my job in DevRel.

    2. Lets me talk about all of the interesting projects I work on (eg: an AI novel writing experiment https://xeiaso.net/videos/2023/ai-hackathon/) that people regularly find interesting. This gets people interested in wanting to employ me, which ends up working out well for me in the long run.

    Do side projects, but write about what you did and what you learned.

  • My First Impressions of Nix
    33 projects | news.ycombinator.com | 19 Jun 2023
  • Hacker News evading criticism by selectively adding noreferrer to certain links
    6 projects | news.ycombinator.com | 7 Jun 2023
    As someone who is regularly falling victim to the rightward lurch (for having committed the dastardly crime of the wrong hormone activating in-utero), the only reason I don't actively block Hacker News readers is that I make ad money off of them. That is the only reason it's worth the abuse vector to me.

    dang, if you are reading this, please take a moment to seriously consider the actions you have taken today. I understand your desire for the community that Hacker News could be, but that is so far away from what it is today that it's almost laughable. Yes, this is a no-win situation but that's basically how it is globally when trying to be centrist about any issue. I use Hacker News referers to change the page slightly (mostly to add a deserved "hey, can you please not be an asshole, thanks" via this code: https://github.com/Xe/site/blob/686cc58fb6fc8f2e3bf0197e9b38...) and I would be very frustrated if that went away. Maybe even to the point of having a worker process figure out if my articles are posted to hacker news and making them go dark if they are on the front page. I know you value the articles I post (as our email threads have contained), but really it's an abuse vector that I need to keep metrics of.

    Website administrators should be allowed to block Hacker News referers. Yes this is a thing that is not desirable for you as an administrator, but at some level something's got to give. The enshittening of Hacker News is something that is very undesirable for me too. I've gone over this in our emails. This was going to be another one of those emails, but I really would prefer this one to be out in the open.

  • Anything can be a message queue if you use it wrongly enough
    6 projects | news.ycombinator.com | 4 Jun 2023
    My read time estimate code is here: https://github.com/Xe/site/blob/aa3608afa6c62695ca0ab139f823...

    I've been trying to play with the constants over the years to make the read time estimate more "accurate", but it's a tough nut to crack in general. So I can go over my numbers more accurately, how long did it take you to read it?

  • Ask HN: Those with money-making side projects, how did you come up with the idea?
    6 projects | news.ycombinator.com | 11 Dec 2022
    I originally started putting ads on my blog after people started being an asshole about my articles on Hacker News, originally scoped to only readers from Hacker News. That combined with Patreon pays for all my hosting costs (even the CDN on fly.io and my random AWS infrastructure) and all the video games I play (about $280 US per month of income). It's gotten to the point where it's a tax burden, but I think it's worth it. I've never had a side project make an actual profit before and I'm excited to keep writing as a way to hone my skills and get experience with even more fun technology.

    My recent post on embedding Rust into Go programs with WebAssembly (https://news.ycombinator.com/item?id=33713717) made me about $20 of ad impressions on the day of its release, pretty impressive given how many of you people must run ad blockers!

    It'd be cool to make my blog generate more income and eventually take over as my full time job, but I'm pretty happy with the fact that it's a side project that I can peck at when I want to. A lot of energy that would be spent doing various random Discord/IRC bots that go nowhere ends up being thrust into the blog instead. I also love being able to integrate various cursed things (like a Dhall script that takes my salary history data to spit out LaTeX for my resume: https://github.com/Xe/site/blob/main/dhall/latex/resume.dhal...) and then write up how I did it and why. It makes coming up with ideas for the blog a lot easier!

    I have plans to make a "Why I think WASI is cool" style post with interactive terminals that run WebAssembly programs in the browser, but I'm still trying to figure out how to graft xterm.js into my custom build setup with Deno. I have an untested but should theoretically work implementation here though in case anyone has any tips: https://github.com/Xe/site/blob/main/src/frontend/wasiterm.t...

    Filing my taxes is a huge pain now lol.

  • The carcinization of Go programs (via WASM)
    1 project | /r/rust | 24 Nov 2022
    Hi! I was going to ask about your site template but I see you already answered my questions :D
  • Salary Transparency
    1 project | news.ycombinator.com | 24 Oct 2022
    Patches are welcome: https://github.com/Xe/site/blob/main/templates/salary_transp...
  • Ask HN: Is having a Personal blog/brand worth it for you?
    7 projects | news.ycombinator.com | 18 Jul 2022
    I've found it worth doing. My blog (xeiaso.net, formerly christine.website) is the main way that I get employed at this point. It also helps that people link it here a lot. After 100 articles or so writing got a lot easier and now people rely on my blog for a lot of things. I think it's worth it, but I've also been exclusively self-hosting it. I currently have the code (and writing) open source on GitHub (https://github.com/Xe/site) but I'm considering moving the writing to either a private repo or a SQLite database because people keep copying it, slathering it in ads and rehosting it.
  • I Miss Heroku's DevEx
    3 projects | news.ycombinator.com | 11 May 2022
  • Crimes with Go Generics
    2 projects | news.ycombinator.com | 24 Apr 2022
    Oh dear. I pushed an addendum to the article: https://github.com/Xe/site/commit/05135edcbe5e474131c15c2476...

    Thanks for pointing that out!
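
The "Anything can be a message queue" excerpt above mentions tuning the constants of a read time estimate. A minimal sketch of the common approach (word count divided by an assumed reading speed) might look like the following. This is not the code linked in that post; the function name and the 200 words-per-minute constant are assumptions chosen for illustration.

```python
import math
import re

# Assumed reading speed. This is the kind of constant that gets tuned over time.
WORDS_PER_MINUTE = 200


def estimate_read_minutes(text, words_per_minute=WORDS_PER_MINUTE):
    """Estimate reading time in whole minutes, rounding up."""
    words = re.findall(r"\S+", text)  # treat whitespace-separated tokens as words
    if not words:
        return 0
    return max(1, math.ceil(len(words) / words_per_minute))


print(estimate_read_minutes("word " * 450))
```

A single constant cannot account for code blocks, images, or how dense the prose is, which may be part of why such estimates are hard to make accurate.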

What are some alternatives?

When comparing cstore_fdw and site you can also consider the following projects:

ZLib - A massively spiffy yet delicately unobtrusive compression library.

tumblelog - A static tumblelog generator available as both a Perl and Python version

odbc2parquet - A command line tool to query an ODBC data source and write the result into a parquet file.

markwhen - Make a cascading timeline from markdown-like text. Supports simple American/European date styles, ISO8601, images, links, locations, and more.

zstd - Zstandard - Fast real-time compression algorithm

recco - Gain information about applications to inform deployments

cute_headers - Collection of cross-platform one-file C/C++ libraries with no dependencies, primarily used for games

type-safe-builder-experiment - Experimenting with the type safe builder pattern in different languages.

delta - An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs

pgBackRest - Reliable PostgreSQL Backup & Restore

parquet_fdw - Parquet foreign data wrapper for PostgreSQL

Bailo - Managing the lifecycle of machine learning to support scalability, impact, collaboration, compliance and sharing.