Schema on write is better to live by

datasette.io

6 81 8.0 HTML

The official project website for Datasette

I've come around to almost the opposite approach.
I pull all of the data I can get my hands on (from Twitter, GitHub, Swarm, Apple Health, Pocket, Apple Photos and more) into SQLite database tables that match the schema of the system that they are imported from.
For my own personal Dogsheep (https://simonwillison.net/2020/Nov/14/personal-data-warehous...) that's 119 tables right now.
Then I use SQL queries against those tables to extract and combine data in ways that are useful to me.
If the schema of the systems I am importing from changes, I can update my queries to compensate for the change.
This saves me from having to design a standard schema up front: I take whatever those systems give me. It still lets me combine and search across all of the data from these disparate systems, essentially at runtime.
I even have a search engine for this, which is populated by SQL queries against the different source tables. You can see an example of how that works at https://github.com/simonw/datasette.io/blob/main/templates/d... - which powers the search interface at https://datasette.io/-/beta
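As a rough illustration of the approach described above (not the actual Dogsheep code), the sketch below keeps two source tables in their original shapes and defines the common shape only in a query. The table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical source tables: each keeps the column names of the system
# it was imported from. No shared schema is designed up front.
conn.executescript("""
CREATE TABLE tweets (id INTEGER PRIMARY KEY, full_text TEXT, created_at TEXT);
CREATE TABLE github_commits (sha TEXT PRIMARY KEY, message TEXT, committer_date TEXT);
""")
conn.executemany(
    "INSERT INTO tweets VALUES (?, ?, ?)",
    [
        (1, "Reading about the Diataxis documentation framework", "2021-04-25T17:00:00"),
        (2, "Lunch was good", "2021-04-26T12:00:00"),
    ],
)
conn.executemany(
    "INSERT INTO github_commits VALUES (?, ?, ?)",
    [("abc123", "Restructure docs along Diataxis lines", "2021-05-01T09:30:00")],
)

# The unified shape (source, item_id, body, ts) exists only in this query.
# If a source system renames a column, only this SQL needs to change.
UNIFIED_SQL = """
SELECT 'tweet' AS source, CAST(id AS TEXT) AS item_id, full_text AS body, created_at AS ts
FROM tweets
UNION ALL
SELECT 'commit', sha, message, committer_date
FROM github_commits
"""


def search(conn, term):
    """Search across every source at query time, newest first."""
    sql = f"""
        SELECT source, item_id, body, ts
        FROM ({UNIFIED_SQL})
        WHERE body LIKE ?
        ORDER BY ts DESC
    """
    return conn.execute(sql, (f"%{term}%",)).fetchall()


results = search(conn, "diataxis")
for row in results:
    print(row)
```

SQLite's `LIKE` is case-insensitive for ASCII by default, so the lowercase search term still matches both rows. A real setup would more likely use full-text search, which this sketch leaves out to stay self-contained.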

datasette

187 8,934 9.3 Python

An open source multi-tool for exploring and publishing data

To be honest it's mostly for fun, and to help me dogfood https://datasette.io/ and come up with new features for it.
But just a moment ago I was trying to remember the name of the Diataxis documentation framework - I was sure I'd either tweeted about it or blogged it, so I ran a personal search in Dogsheep Beta and turned up this tweet: https://twitter.com/simonw/status/1386370167395942401
Someone asked me the other day who they should follow on Twitter for Python news, so I searched all 40,000 tweets I have favourited for "Python", faceted the results by user, and told them the top four users.
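Faceting search results by user comes down to a `GROUP BY` with a count. The sketch below is a minimal version of that idea, assuming a hypothetical `favorited_tweets` table with made-up rows; it is not the query Dogsheep Beta actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of favourited tweets; names and rows are illustrative only.
conn.execute(
    "CREATE TABLE favorited_tweets (id INTEGER PRIMARY KEY, screen_name TEXT, full_text TEXT)"
)
conn.executemany(
    "INSERT INTO favorited_tweets VALUES (?, ?, ?)",
    [
        (1, "alice", "New Python release notes"),
        (2, "alice", "Python packaging tips"),
        (3, "alice", "python typing tricks"),
        (4, "bob", "Python async deep dive"),
        (5, "bob", "Python and SQLite"),
        (6, "carol", "Why I like Python"),
        (7, "carol", "Gardening thread"),
        (8, "dave", "Rust news"),
    ],
)


def top_users(conn, term, limit=4):
    """Count matching tweets per user and return the most frequent users."""
    sql = """
        SELECT screen_name, COUNT(*) AS n
        FROM favorited_tweets
        WHERE full_text LIKE ?
        GROUP BY screen_name
        ORDER BY n DESC, screen_name
        LIMIT ?
    """
    return conn.execute(sql, (f"%{term}%", limit)).fetchall()


top = top_users(conn, "Python")
print(top)
```

Users with no matching tweets never appear in the result, so fewer than `limit` rows can come back.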

logseq

544 29,797 9.9 Clojure

A local-first, non-linear, outliner notebook for organizing and sharing your personal knowledge base. Use it to organize your todo list, to write your journals, or to record your unique life.

This way, the schema grows organically over time, from the bottom up. Instead of having to think up a system of classification before I start writing, I just write, and then classify later as information accrues.
Logseq has tags and block embeds and many other features too, but the core nested list model is what has really attracted me to it. I'm sure it's not the only note app that works along these lines (I'm always open to suggestions), but it's open-source and it works quite nicely.
[1]: https://logseq.com/

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.

Related posts

Ask HN: High quality Python scripts or small libraries to learn from
12 projects | news.ycombinator.com | 19 Apr 2024

GitHub – GSA/code-gov: An informative repo for all Code.gov repos
12 projects | news.ycombinator.com | 9 Sep 2023

Welcome to Datasette Cloud
6 projects | news.ycombinator.com | 20 Aug 2023

SQLite Functions for Working with JSON
10 projects | news.ycombinator.com | 10 Aug 2023

I'm sure I'm being stupid.. Copying data from an API and making a database
2 projects | /r/Database | 19 Jan 2023

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com
datasette Sqlite knowledge-management Python pkm
Post date: 21 Aug 2021
