Apple's custom NVMes are amazingly fast – if you don't care about data integrity

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • zfs

    OpenZFS on Linux and FreeBSD

  • I just looked into this, since what you say and what Apple’s documentation says are two different things.

    Here is Apple’s documentation:

    https://devstreaming-cdn.apple.com/videos/wwdc/2019/419ef9ip...

    F_BARRIERFSYNC: fsync() with a barrier

    F_FULLFSYNC: Drive flush its cache to disk

    This sounds like the Linux fsync() and Linux syncfs() respectively. What you say is that F_FULLFSYNC is the same as Linux fsync() and your performance numbers back that up. Unfortunately, you would only see a difference between Linux fsync() and Linux syncfs() if you have files being asynchronously written at the same time as the files that are subject to fsync()/syncfs(). fsync() would only touch the chosen files while syncfs() would touch both. If you did not have heavy background file writes and F_FULLFSYNC really is equivalent to syncfs(), you would not be able to tell the difference in your tests.

    That said, let’s look at how this actually works on macOS. Unfortunately, the APFS driver does not appear to be open source, but the HFS+ driver is. Here are the relevant pieces of code in HFS+:

    https://github.com/apple-oss-distributions/hfs/blob/hfs-556....

    https://github.com/apple-oss-distributions/hfs/blob/5e3008b6...

    First, let me start by saying this merits a facepalm. The fsync() operation is operating at the level of the mount point, not the individual file. F_FULLFSYNC and F_BARRIERFSYNC are different, but they both might as well be variants of the Linux syncfs().

    For good measure, let us look at how this is done in the macOS ZFS driver:

    https://github.com/openzfsonosx/zfs/blob/master/module/zfs/z...

    The file is properly synced independently of the mountpoint, such that other files being modified in the filesystem are not immediately required to be written out to disk. That said, both F_FULLFSYNC and F_BARRIERFSYNC on macOS are mapped by the ZFS driver to the same function that implements fsync() on Linux:

    https://github.com/openzfs/zfs/blob/master/module/os/linux/z...

    For good measure, let us look at how syncfs() is implemented by ZFS on Linux:

    https://github.com/openzfs/zfs/blob/master/module/os/linux/z...

    It operates on the superblock, which is what the macOS HFS+ driver does.

    From this, I can conclude:

    Linux syncfs() == macOS F_FULLFSYNC on HFS+

    Linux fsync() == macOS fsync()/F_FULLFSYNC/F_BARRIERFSYNC on ZFS

    Also, macOS F_BARRIERFSYNC is a weakened Linux syncfs() and Apple’s documentation is very misleading (although maybe not technically wrong). POSIX does allow fsync to be implemented via syncfs (sync in POSIX, but I am saying syncfs from Linux to be less confusing). However, not issuing an IO barrier on fsync and waiting for it to complete is broken behavior, as you claim.

    I am not sure how APFS behaves on macOS. I imagine that additional testing that takes into account the nuances in semantics would be able to clarify that. If it behaves like HFS+, it is broken.
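
The kind of test the comment asks for could be sketched roughly as follows. This is only an illustration, not anything from the original thread: the helper names `sync_fd` and `time_sync`, the file names, and the sizes are all made up for the example. It assumes Python's `fcntl` module exposes `F_FULLFSYNC` and `F_BARRIERFSYNC` on the platform in use; where a constant is missing (always the case on Linux, and possibly on older Python builds on macOS) the helper quietly falls back to plain `os.fsync()`, so the three levels collapse into one there.

```python
import fcntl
import os
import tempfile
import time

# fcntl commands for the two macOS-specific sync levels, if this Python
# build exposes them. Missing constants map to None (e.g. on Linux).
_FCNTL_LEVELS = {
    "barrier": getattr(fcntl, "F_BARRIERFSYNC", None),
    "full": getattr(fcntl, "F_FULLFSYNC", None),
}


def sync_fd(fd, level="fsync"):
    """Sync fd at the requested level and return the name of the call used."""
    if level not in ("fsync", "barrier", "full"):
        raise ValueError(f"unknown sync level: {level!r}")
    cmd = _FCNTL_LEVELS.get(level)
    if cmd is not None:
        fcntl.fcntl(fd, cmd)
        return level
    # Plain fsync(), also used as the fallback when a constant is missing.
    os.fsync(fd)
    return "fsync"


def time_sync(directory, level="fsync", background_bytes=0):
    """Time one sync of a small file while another file holds unsynced data."""
    target = os.path.join(directory, "target.dat")  # hypothetical file names
    noise = os.path.join(directory, "noise.dat")
    with open(noise, "wb") as bg, open(target, "wb") as fh:
        # Dirty data in a *different* file on the same filesystem. It is
        # handed to the OS but deliberately never synced by this script.
        bg.write(b"\0" * background_bytes)
        bg.flush()
        fh.write(b"x" * 4096)
        fh.flush()
        start = time.perf_counter()
        used = sync_fd(fh.fileno(), level)
        elapsed = time.perf_counter() - start
    return used, elapsed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        for level in ("fsync", "barrier", "full"):
            for noise_bytes in (0, 32 * 1024 * 1024):
                used, secs = time_sync(d, level, noise_bytes)
                print(f"{level:8s} via {used:8s} noise={noise_bytes:>9d}: "
                      f"{secs * 1e3:.2f} ms")
```

If a given level gets much slower once the background file holds dirty data, that would hint that the sync is flushing more than the target file, which is the mount-level behaviour described above for HFS+. Timings like these are only suggestive: caching, the temp directory's filesystem, and other system activity all affect them, so they cannot settle how APFS behaves on their own.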

  • MySQL

    MySQL Server, the world's most popular open source database, and MySQL Cluster, a real-time, open source transactional database.

  • hfs

  • zfs

    OpenZFS on OS X (by openzfsonosx)

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Related posts

  • Ubuntu 24.04 LTS is so buggy you can't install the OS [video]

    1 project | news.ycombinator.com | 1 May 2024
  • Radxa's SATA HAT makes compact Pi 5 NAS

    1 project | news.ycombinator.com | 4 Apr 2024
  • OpenZFS: Fix corruption caused by MMAP flushing problems

    1 project | news.ycombinator.com | 26 Mar 2024
  • ZFS: Some copied files are still corrupted (chunks replaced by zeros)

    1 project | news.ycombinator.com | 27 Feb 2024
  • DiskClick: Ever wanted to hear Old Hard drive sounds

    1 project | news.ycombinator.com | 19 Feb 2024