Ask HN: Codebases with great, easy to read code?

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • tigerbeetle

    Discontinued. A distributed financial accounting database designed for mission critical safety and performance. [Moved to: https://github.com/tigerbeetledb/tigerbeetle] (by coilhq)

  • Are we allowed to share repos that we've written? :)

    If so, then here's distributed consensus written in Zig:

    https://github.com/coilhq/tigerbeetle/blob/main/src/vsr/repl...

    Something that differentiates this from many other consensus implementations is that no networking or multithreading code leaks through. It's all message passing, so it can be deterministically fuzz tested.

    I learned so much, and had so much fun writing this, that I hope it's an enjoyable read—or please let me know what can be improved!
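To make the message-passing point concrete, here is a minimal Python sketch of deterministic simulation testing. It is not TigerBeetle's code or design: `Replica`, `simulate`, the toy "highest value seen" protocol, and the 10% drop rate are all hypothetical, chosen only to show how keeping sockets and threads out of the node logic lets a seeded simulator control delivery order and replay any run exactly.

```python
import random


class Replica:
    """A toy node (hypothetical): it never touches sockets or threads, only messages."""

    def __init__(self, replica_id, replica_count):
        self.id = replica_id
        self.replica_count = replica_count
        self.highest_seen = 0

    def on_message(self, value):
        """Handle one message and return the (destination, value) pairs to send."""
        if value <= self.highest_seen:
            return []  # stale or duplicate: nothing new to broadcast
        self.highest_seen = value
        return [(peer, value) for peer in range(self.replica_count) if peer != self.id]


def simulate(seed, replica_count=3, max_steps=1000):
    """Run the replicas over a seeded in-memory 'network'; return (trace, final state)."""
    rng = random.Random(seed)  # the only source of nondeterminism in the whole run
    replicas = [Replica(i, replica_count) for i in range(replica_count)]
    in_flight = [(i, i + 1) for i in range(replica_count)]  # initial messages
    trace = []
    for _ in range(max_steps):
        if not in_flight:
            break
        # The simulator, not the OS, picks which message is handled next.
        destination, value = in_flight.pop(rng.randrange(len(in_flight)))
        if rng.random() < 0.1:  # assumed fault rate: drop roughly 10% of messages
            trace.append(("drop", destination, value))
            continue
        trace.append(("deliver", destination, value))
        in_flight.extend(replicas[destination].on_message(value))
    return trace, [replica.highest_seen for replica in replicas]


# The same seed reproduces the same run, so a failing seed can be replayed.
trace_a, state_a = simulate(seed=42)
trace_b, state_b = simulate(seed=42)
assert trace_a == trace_b and state_a == state_b
```

A fuzzer built on this idea would loop over many seeds, check an invariant after each run, and report the seed of any run that violates it.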

  • requests

    A simple, yet elegant, HTTP library.

  • This is a very interesting question.

    Are you interested in any particular languages?

    For Python, take a look at: https://github.com/psf/requests

  • OkHttp

    Square’s meticulous HTTP client for the JVM, Android, and GraalVM.

  • I’ve learned a TON from the [okhttp3](https://square.github.io/okhttp/) codebase, highly recommend studying it.

  • Prefect

    The easiest way to build, run, and monitor data pipelines at scale.

  • grbl

    An open source, embedded, high performance g-code-parser and CNC milling controller written in optimized C that will run on a straight Arduino

  • GRBL, the CNC firmware for Arduinos:

    https://github.com/grbl/grbl/

    It feels like it has more comments than code. The comments are written in very nice, understandable language that even actively teaches concepts that are only adjacent to the code at hand.

  • CPython

    The Python programming language

  • One recent example: I wanted to know if the SQLite package in Python took any steps to avoid calling "interrupt" on a closed connection, which the SQLite C documentation warns against.

    A couple of searches against https://github.com/python/cpython led me to this code here: https://github.com/python/cpython/blob/4674fd4e938eb4a29ccd5...
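The question can also be checked empirically from Python. The sketch below uses only the standard-library `sqlite3` module. It assumes, based on CPython releases we are aware of, that the module checks the connection state before reaching the C-level interrupt call and raises `sqlite3.ProgrammingError` on a closed connection; treat the printed outcome as something to confirm on your own interpreter rather than a guarantee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.interrupt()  # no query is running on this open connection, so nothing is aborted
conn.close()

try:
    # If the module guards against closed connections, this raises in Python
    # before the underlying C-level interrupt is ever reached.
    conn.interrupt()
    outcome = "no error raised"
except sqlite3.ProgrammingError as exc:
    outcome = f"ProgrammingError: {exc}"

print(outcome)
```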

  • Stockfish

    A free and strong UCI chess engine

  • Stockfish is well written, commented, and documented C++ code:

    https://github.com/official-stockfish/Stockfish

  • NodaTime

    A better date and time API for .NET

  • Noda Time is very clean and well written IMO: https://github.com/nodatime/nodatime

  • smart_open

    Utils for streaming large files (S3, HDFS, gzip, bz2...)

  • I see that you're primarily looking into Python work, so I'd recommend `smart_open` as a nice, compact way to get started.

    https://github.com/RaRe-Technologies/smart_open

  • DOOM-3-BFG

    Doom 3 BFG Edition

  • Doom 3 is a perennial favorite for "most beautiful C++ codebase" lists [0]

    [0] https://github.com/id-Software/DOOM-3-BFG

  • Sidekiq

    Simple, efficient background processing for Ruby

  • minitest

    minitest provides a complete suite of testing facilities supporting TDD, BDD, mocking, and benchmarking.

  • https://github.com/seattlerb/minitest really removed the FUD for me when I started learning Ruby and Rails. It's full of metaprogramming and fancy tricks but is also quite small, practical and informal in its style.

    e.g. "assert_equal" is really just "expected == actual" at its core, but it both uses a block param (a kind of closure) for composing a default message and calls "diff", which is a dumb wrapper around the system "diff" utility (horrors!). There is even some evolved nastiness in there for an API change that uses the existing assert/refute logic to raise an informative message. This is handled with a simple if and not some sort of complex hard-to-follow factory pattern or dependency injection misuse.

    https://github.com/seattlerb/minitest/blob/master/lib/minite...
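The shape described in that comment (a plain equality check, a lazily built failure message, and a diff for the report) can be sketched in a few lines. This is a Python illustration of the pattern, not minitest's Ruby implementation: the helper names are hypothetical, and `difflib` from the standard library stands in for shelling out to the system `diff` utility.

```python
import difflib


def diff(expected, actual):
    """Return a unified diff of the two values' string forms."""
    lines = difflib.unified_diff(
        str(expected).splitlines(),
        str(actual).splitlines(),
        fromfile="expected",
        tofile="actual",
        lineterm="",
    )
    return "\n".join(lines)


def assert_equal(expected, actual, message=None):
    """Raise AssertionError with a diff unless expected == actual.

    `message` may be a string or a zero-argument callable. A callable is only
    invoked on failure, so passing tests never pay for building the message.
    """
    if expected == actual:
        return True
    prefix = message() if callable(message) else message
    detail = diff(expected, actual)
    raise AssertionError(f"{prefix}\n{detail}" if prefix else detail)


try:
    assert_equal("a\nb", "a\nc", lambda: "config files differ")
except AssertionError as exc:
    print(exc)
```

Deferring the message behind a callable plays the role the block parameter plays in the Ruby version: the work of composing it happens only when there is a failure to report.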

  • serenity

    The Serenity Operating System 🐞

  • SerenityOS, especially the userland, has always seemed very elegant to me:

    https://github.com/SerenityOS/serenity

  • Pi-hole

    A black hole for Internet advertisements

  • Pi-hole [1] is mostly written in bash, which reads rather well, as far as I am concerned.

    [1] https://github.com/pi-hole/pi-hole

  • Box2D

    Box2D is a 2D physics engine for games

  • LevelDB

    LevelDB is a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values.

  • Ghost

    Independent technology for modern publishing, memberships, subscriptions and newsletters.

  • WordPress

    WordPress, Git-ified. This repository is just a mirror of the WordPress subversion repository. Please do not send pull requests. Submit pull requests to https://github.com/WordPress/wordpress-develop and patches to https://core.trac.wordpress.org/ instead.

  • scalachess

    Chess API written in scala. Immutable and free of side effects.

  • https://github.com/lichess-org/scalachess/tree/master/src/ma...

    It's funny because I remember comparing it to mine that I had tried to write during college, and appreciating how much better it is.

    Pay attention to how there's a bunch of different types of chess in there too, and how that's factored.

  • ILSpy

    .NET Decompiler with support for PDB generation, ReadyToRun, Metadata (&more) - cross-platform!

  • For anyone looking for a (nontrivial) C# project, I can only recommend going through the ILSpy decompiler: https://github.com/icsharpcode/ilspy

  • gitlab

  • tarsnap

    Command-line client code for Tarsnap.

  • In past threads, people have mentioned enjoying my Tarsnap (https://github.com/Tarsnap/tarsnap) code. I personally think that the spiped (https://github.com/Tarsnap/spiped) code is even better.

  • spiped

    Spiped is a utility for creating symmetrically encrypted and authenticated pipes between socket addresses.

  • Django

    The Web framework for perfectionists with deadlines.

  • Every time that I can't figure out how to do something with Django, I just read the code [1] and then everything is easy and clear.

    [1]: https://github.com/django/django

  • Nginx

    An official read-only mirror of http://hg.nginx.org/nginx/ which is updated hourly. Pull requests on GitHub cannot be accepted and will be automatically closed. The proper way to submit changes to nginx is via the nginx development mailing list, see http://nginx.org/en/docs/contributing_changes.html

  • It's been years since I've looked but I remember being impressed by the NGINX codebase. https://github.com/nginx/nginx

  • reshade

    A generic post-processing injector for games and video software.

  • I think my favourite open source project to poke around in recently is [Reshade](https://github.com/crosire/reshade). The code is pretty readable and is doing a lot of interesting stuff. Every time I've taken a look at it I've learned something new. Definitely super light on boilerplate, given that it's solving a bit of a unique problem.

    In terms of tips and tricks, I often start looking at new code by trying to write out, in plain English prose, a bit of a story of how the code works. Almost like I'm writing a blog post explaining how things work to someone else. Often this process uncovers rabbit holes that I need to go down to understand isolated bits of logic before I can return to building this big picture view, which is sort of the point.

  • Chef

    Chef Infra, a powerful automation platform that transforms infrastructure into code automating how infrastructure is configured, deployed and managed across any environment, at any scale

  • I've found the Chef project (https://github.com/chef/chef) to be high quality and easily readable, but I've been working with Chef for like 8 years at this point, which might be influencing how I view it.

    HashiCorp projects also seem very well done, especially given how extensible they are.

  • AsyncAwaitBestPractices

    Extensions for System.Threading.Tasks.Task and System.Threading.Tasks.ValueTask

  • This guy made an HN mobile reader and put all the code on GitHub for his NDC Oslo presentation. It was good and shows off very readable asynchronous code in C#:

    https://github.com/brminnick/AsyncAwaitBestPractices

  • FFmpeg

    Mirror of https://git.ffmpeg.org/ffmpeg.git

  • I had to modify FFmpeg for a job and I found it surprisingly accessible and easy to read/modify: https://github.com/FFmpeg/FFmpeg

  • ramda

    :ram: Practical functional Javascript

  • I find Ramda very easy to read! It's a functional Javascript library based on currying and composition. https://github.com/ramda/ramda/

    I find a lot of code fairly alienating to read. Lots of codebases require you to get into the "mindset" of the person who wrote the code: their idioms, assumptions, patterns they lean on, etc. So unless you've got the time to get deep into it, the insights you can draw from reading it are minimal.

    Ramda, by comparison, is just a library of utility functions, and all of those utilities perform very simple operations: merging, plucking, appending, equality checking, etc.

    There's a lot of intention in the Ramda API as well. All functions are "data last," meaning that the actual piece of data you're operating on is the final argument to every function. This enables you to write Ramda code that is very structurally consistent: function parameters first, data last, every time.

    It gives me a sense of empowerment, reading the code. It's like "This doesn't have to be rocket science. If you just start from these basic operations, and write those basic operations with a simple but strict ideology of 'data last' every time, and stick them together like lego blocks using compose, then you can achieve some very cool stuff with very little code."
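The "data last" convention is easy to try outside JavaScript. The sketch below is a Python analogy, not Ramda's source: `pluck`, `reject`, `append`, and `compose` borrow Ramda's names and argument order, but the bodies are illustrative, and `functools.partial` stands in for Ramda's automatic currying.

```python
from functools import partial, reduce


def compose(*functions):
    """Compose right to left: compose(f, g)(x) == f(g(x))."""
    return lambda value: reduce(lambda acc, fn: fn(acc), reversed(functions), value)


# Data-last helpers: configuration arguments first, the data itself last.
def pluck(key, rows):
    return [row[key] for row in rows]


def reject(predicate, items):
    return [item for item in items if not predicate(item)]


def append(item, items):
    return [*items, item]


# Fixing the leading arguments leaves one-argument functions that take only
# the data, which is what makes them snap together under compose.
adult_names = compose(
    partial(append, "(end)"),
    partial(pluck, "name"),
    partial(reject, lambda person: person["age"] < 18),
)

people = [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 9}]
result = adult_names(people)
print(result)  # ['Ada', '(end)']
```

Because every helper takes its data as the final argument, each `partial(...)` call yields a function of exactly one argument, so the pipeline reads bottom to top as reject, then pluck, then append.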

  • JDK

    JDK main-line development https://openjdk.org/projects/jdk

  • A lot of the Java concurrency primitives written by Doug Lea and co. are great reads, and very well commented. See the source of `ConcurrentHashMap` for example: https://github.com/openjdk/jdk/blob/master/src/java.base/sha...

  • reactos

    A free Windows-compatible Operating System

  • yui3

    A library for building richly interactive web applications.

  • Reading and using YUI3 (https://github.com/yui/yui3) took my JavaScript to the next level. It's no longer relevant because of improvements to the language, but it's the best model of readable JavaScript I've ever seen.

  • GitTrends

    An iOS and Android app to monitor the Views, Clones and Star history of your GitHub repos

  • Thanks for the kind words!

    I’ve also published an open-source iOS + Android app to the App Stores, called GitTrends, that leverages my AsyncAwaitBestPractices library, if anyone wants to see how to use it in a real/live production app!

    The source code for GitTrends is available here: https://gittrends.com
