Speeding up LXC container pull by up to 3x

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • htcat

    Parallel and Pipelined HTTP GET Utility

  • https://github.com/htcat/htcat) to assist with Heroku's efforts to speed up moving the tar-formatted application releases around. It's pretty old, doesn't integrate tar archival itself, and could probably stand improvement (or, given its small size, even a rewrite).

    My favorite hack in there (which also made it work with pre-signed S3 URLs) was not using the HEAD method, as is customary, to determine object sizes. Instead it does a regular GET that, for small files, simply runs to completion on its own. For larger files, htcat abruptly closes that request once it reaches the bytes that have since been fetched in parallel by a range-based request sent immediately afterwards. The goal was for htcat to impose no penalty on small files, so it could be used on blended workloads without thinking.

    It also found a bug in S3's range implementation. We had a problem with some object or other, I wrote in about it, and was told that upon investigation a bug had been fixed. No more problem.
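The GET-instead-of-HEAD idea described above can be sketched roughly as follows. This is a minimal illustration in Python, not htcat's actual code (htcat's real logic is not shown in the comment): the names `plan_fetch`, `range_header`, and `part_size` are hypothetical, and the fixed cut-off at `part_size` is a simplifying assumption. The sketch only computes which byte ranges would be requested once the first plain GET response reveals the object size through its Content-Length header; it performs no network I/O.

```python
def plan_fetch(content_length, part_size):
    """Plan a download after the initial plain GET has revealed the size.

    Returns (first_get_end, ranges):
      first_get_end -- exclusive offset at which the initial GET stops
                       (the full length for small objects)
      ranges        -- inclusive (start, end) pairs for follow-up
                       HTTP Range requests, empty for small objects

    Assumption for illustration: the initial GET is always cut off
    after exactly part_size bytes when the object is larger than that.
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")

    # Small object: the plain GET simply runs to completion,
    # so no extra round trip was spent on a HEAD request.
    if content_length <= part_size:
        return content_length, []

    # Larger object: everything past the first part is fetched
    # by parallel range requests; the initial GET is closed early.
    ranges = []
    start = part_size
    while start < content_length:
        end = min(start + part_size, content_length) - 1
        ranges.append((start, end))
        start = end + 1
    return part_size, ranges


def range_header(start, end):
    """Format an inclusive byte range as an HTTP Range header value."""
    return f"bytes={start}-{end}"
```

For example, `plan_fetch(2500, 1000)` lets the first GET deliver bytes 0 to 999 and schedules two range requests for the rest, while `plan_fetch(100, 1000)` schedules none, which is the "no penalty on small files" property the comment describes.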

  • stargz-snapshotter

    Fast container image distribution plugin with lazy pulling

  • This is interesting and seems general purpose, not merely for container images.

    There's this option for OCI containers, which I don't pretend to understand: https://github.com/containerd/stargz-snapshotter

    It is used by containerd and nerdctl. You do have to build the image with it. The images work in OCI-compatible registries. By fetching the most-used files first, a container can be started before loading is finished. Or so I gather.

