Show HN: I made a SQL game to help people learn / challenge their skills

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • postgres-wasm

    A PostgreSQL server running in your browser

  • > forcing SQLite

    It might be the case that it's running SQLite via wasm. If so, then other database engines would need to be runnable in a browser too.

    PostgreSQL has been shown to work in the browser (e.g. https://www.crunchydata.com/blog/learn-postgres-at-the-playg... and https://github.com/snaplet/postgres-wasm), so that might be an option.

    Not sure about others.

  • pev2

    Postgres Explain Visualizer 2

  • I'd say it depends a bit on your access pattern, the query frequency, the time budget you have for each query, and beyond that on your insert performance requirements. I'm mostly talking about PostgreSQL here, since that's what we use at work. There, I have about 2-3 rules of thumb by now:

    One notable query pattern treats tables like hashmaps. You have a query, you expect 0 or 1 results, and you want those results as fast as possible, even more so than usual. This is usually accompanied by a low to very low write frequency relative to reads. Examples are - like you say - a login: give me the account for this email - generally, you'll have many more logins than registrations. Or: give me the current chat room for a user - they enter a chat room once every few hours and then look into it a lot. This usually warrants an index without much analysis. Foreign-key relationships often fall into this pattern too, but imo those would need evidence from query plans before looking further into it, and normally the primary key should handle them.
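
    A minimal sketch of that hashmap-style lookup (the table and column names here are illustrative, not from the original comment):

        -- Hypothetical accounts table; logins vastly outnumber registrations,
        -- so the index pays for its write overhead many times over.
        CREATE UNIQUE INDEX accounts_email_idx ON accounts (email);

        -- The login lookup expects 0 or 1 rows and becomes a cheap index scan:
        SELECT id, password_hash FROM accounts WHERE email = 'user@example.com';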

    After that, it becomes a bit weird until you think about it, because you get into index sizes and selectivity. The interesting part there: Postgres on a decently quick core can iterate linearly through a few tens of thousands up to a few hundred thousand rows in a few milliseconds, per parallel worker. That is weighed against the cost of traversing a potentially large index - and an index that cuts the dataset into too many very tiny pieces will be large. This can cause Postgres to stop using an index even if it matches conditions in the query, because just chugging through the dataset brute force in parallel is faster. And yes, I was apprehensive at first, but the optimizer tends to be right there.
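
    You can watch the planner make that trade-off; a hypothetical example (plan output abbreviated, actual shape varies by version and data):

        -- Suppose events has an index on status, but 'done' matches half the table.
        EXPLAIN SELECT count(*) FROM events WHERE status = 'done';
        -- Roughly:
        --   Finalize Aggregate
        --     ->  Gather  (Workers Planned: 2)
        --           ->  Partial Aggregate
        --                 ->  Parallel Seq Scan on events
        --                       Filter: (status = 'done'::text)
        -- i.e. brute-force parallel scanning beats the low-selectivity index.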

    So yeah, your approach sounds right. Unless you know for sure that an index is really good for the DB and you can make specific points about rows inspected per query, just don't bother. Once a query - or a class of queries - starts running slow, throw the whole bus of EXPLAIN ANALYZE and something like a query plan visualizer[1] at it to identify the slow part. That will generally point to a slow sequential scan, a slow parallel sequential scan, or a crazy-sized join. Then you can start thinking about creating an index for the conditions used in the query, to chop the dataset into blocks of roughly c * 10k rows, with c being somewhere around 5-25 usually.
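
    A sketch of that workflow (table and column names are made up):

        -- 1. Let the database tell you where the time actually goes:
        EXPLAIN (ANALYZE, BUFFERS)
        SELECT * FROM orders
        WHERE tenant_id = 42 AND created_at >= now() - interval '7 days';

        -- 2. If the slow node is a (parallel) seq scan on those conditions,
        --    index them so each lookup lands in a block of ~c * 10k rows:
        CREATE INDEX CONCURRENTLY orders_tenant_created_idx
            ON orders (tenant_id, created_at);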

    The value of that c is more about your performance requirements and the nature of those queries, as well as the write load. If this is a web app with loads of reads and few inserts, c should probably be on the smaller side. Tenant ID tends to be a good column, created date for some sequential data stream like an audit log or tracking/reporting data, or the IDs that other transactions tend to reference this thing by. If you're looking at a write-heavy reporting setup, it might be better to be a bit coarser for quicker inserts, because who cares if a reporting query chugs for 5 seconds? Though 5 seconds is in that ugly uncanny valley - I can't in good faith go get a coffee while the query runs, but it's too slow to be a good experience, lol.

    Something to consider for the latter is also the overall query pattern of the whole application against the schema. This, again, suggests that delaying index creation until well into the maturity of a codebase is a good idea. I've had quite a few applications and reporting-like solutions that ended up with 6-8 different filter patterns across their queries. Sure, we could have created 6-8 different index structures to support each of those individually and perfectly, but then we'd have ended up with a crazy number of indexes, which in turn wouldn't be great for insert performance. Instead, since we waited for actual issues, we could identify 2 indexes that supported each individual query to only maybe 60-80% of the effect of a tailored index structure, but those 2 indexes supported all queries across the board and - as we later saw - supported 90%+ of all queries written further down the line to a similar degree.
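
    For instance, instead of one tailored index per filter pattern, two broad composites along these lines (hypothetical schema):

        -- Covers every "per tenant, most recent first" style of filter well enough:
        CREATE INDEX tx_tenant_created_idx ON transactions (tenant_id, created_at);
        -- Covers lookups by the ID other records reference this table by:
        CREATE INDEX tx_account_idx ON transactions (account_id);
        -- Good-enough support for all query patterns beats 6-8 perfect,
        -- write-amplifying single-purpose indexes.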

    So, waiting until we had an understanding of the dataset and actual problems with actual data meant we could solve a lot of issues with a morning of analysis and a small touch to the system. And then performance issues were gone for a very long time.

    1: https://explain.dalibo.com/

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
