hypopg
pev2
| | hypopg | pev2 |
|---|---|---|
| Mentions | 5 | 39 |
| Stars | 1,091 | 2,315 |
| Growth | 2.6% | 4.3% |
| Activity | 5.6 | 7.7 |
| Latest commit | 4 months ago | 7 days ago |
| Language | C | TypeScript |
| License | GNU General Public License v3.0 or later | PostgreSQL License |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
hypopg
- The SQLite Index Suggester
-
Why Can't Database Tables Index Themselves?
There's a lot of good ecosystem stuff around this:
https://github.com/HypoPG/hypopg
HypoPG is a PostgreSQL extension adding support for hypothetical indexes.
A hypothetical -- or virtual -- index is an index that doesn't really exist, and thus costs no CPU, disk, or other resources to create. Hypothetical indexes are useful for finding out whether a specific index could improve performance for a problematic query, since you can see whether PostgreSQL would use it without having to spend the resources to actually create it.
With one approach to using it here: https://www.percona.com/blog/2019/07/22/automatic-index-reco...
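For reference, the basic HypoPG workflow looks roughly like this (the table and column names here are invented for illustration; `hypopg_create_index` and `hypopg_reset` are the extension's actual functions):

```sql
-- Load the extension (it must be installed on the server first)
CREATE EXTENSION hypopg;

-- Create a hypothetical index; nothing is written to disk
SELECT * FROM hypopg_create_index('CREATE INDEX ON orders (customer_id)');

-- Plain EXPLAIN (without ANALYZE) will consider the hypothetical index;
-- EXPLAIN ANALYZE actually executes the query, so it cannot use one
EXPLAIN SELECT * FROM orders WHERE customer_id = 42;

-- Drop all hypothetical indexes for this session
SELECT hypopg_reset();
```

If the EXPLAIN output shows an Index Scan on the hypothetical index, creating the real index is likely worthwhile.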
- PostgreSQL Explain Output Explained
pev2
-
Pg_hint_plan: Force PostgreSQL to execute query plans how you want
PEV2 is open source and gives you a good visualization. I've never used pgmustard, so I can't compare.
-
Efficient Database Queries in Rails: A Practical Approach
Visualize Your Plan: Visit explain.dalibo.com and paste the generated plan text and query. Then, hit Submit. The tool will generate a visualization of your query plan. Here's an example of the visualization for the fifth attempt version of the query from this post. It shows the different types of scans that were used and how the data gets combined. The duration of each operation is also shown:
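To get a plan that explain.dalibo.com (PEV2) can visualize with timing and buffer details, you'd typically run something like the following; the query itself is just a placeholder:

```sql
-- ANALYZE executes the query for real timings; BUFFERS adds I/O stats.
-- FORMAT JSON produces the structured output the visualizer parses best
-- (plain text plans are also accepted).
EXPLAIN (ANALYZE, COSTS, VERBOSE, BUFFERS, FORMAT JSON)
SELECT * FROM orders WHERE created_at > now() - interval '1 day';
```

Paste the resulting JSON (and optionally the query) into the tool.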
-
What's new in the Postgres 16 query planner (a.k.a. optimizer)
I use this tool to visualize my queries: https://explain.dalibo.com/ (there's also https://www.pgexplain.dev/, last time the output was less nice, but now both look the same)
You can download the whole analyzer as a simple html file and use it this way. No need to obfuscate or sanitize anything at all.
-
200 Web-Based, Must-Try Web Design and Development Tools
PostgreSQL Query Plan Analyzer and Visualizer
-
Do you use pgAdmin? Why?
I didn’t know about pev2, interesting, checking it now. Did you integrate the component yourself or are you using this hosted page by them: https://explain.dalibo.com/?
-
Locally deployable dashboard to aggregate and visualize auto_explain log entries?
I'm quite happy with pev2. Single HTML file to download, runs locally in browser, and has great visualizations.
It powers explain.dalibo.com, and you can also self-host it as a service for your team to share execution plans.
-
Show HN: I made a SQL game to help people learn / challenge their skills
I'd say it depends a bit on your access pattern, the query frequency and the time you have for each query, and beyond that on your insert performance requirements. I'm kinda referring to PostgreSQL here since that's what we use at work. In there, I have like 2-3 rules of thumb by now:
One notable query pattern treats tables like hashmaps. You have a query, you expect 0 or 1 results, and you want those results as fast as possible, even more so than usual. This is usually accompanied by a low to very low write frequency relative to reads. Examples are - like you say - a login: give me the account for this email - generally, you'll have many more logins than registrations. Or: give me the current chat room for a user - they enter a chat room once every few hours and then look into it a lot. This usually warrants an index without much analysis. Foreign-key relationships often also fit this pattern, but imo those would require evidence from query plans to look into further, and normally the primary key should handle them.
After that, it becomes a bit weird until you think about it, because you get into things like index size and selectivity. The interesting part there is: Postgres on a decently quick core can iterate linearly through a few tens of thousands up to low hundreds of thousands of rows in a few milliseconds, per parallel worker. This is weighed against the cost of traversing a potentially large index - and an index cutting the dataset into too many very tiny pieces will be large. This can cause Postgres to stop using an index even if it matches conditions in the query, because just chugging through the dataset brute force in parallel is faster. And yes, I was apprehensive at first, but the optimizer tends to be right there.
So yeah, your approach sounds right there. Unless you really know this index is really good for the DB and you can make specific points about rows inspected per query, just don't bother. Once a query - or a class of queries - starts running slow, throw the whole bus of EXPLAIN ANALYZE and something like a query plan visualizer[1] at it to identify the slow part. This will generally point to a slow sequential scan, a slow parallel sequential scan, or a crazy-sized join. And then you can start thinking about creating an index on the conditions used in the query to chop the dataset into blocks of roughly c * 10k rows, with c being like 5-25ish usually.
The nature of that c is more about your performance requirements and the nature of those queries, as well as the write load. If this is a web app with loads of reads and few inserts, c should probably be on the smaller side. Tenant ID tends to be a good column for this, or CreatedDate for some sequential data stream like an audit log or some tracking/reporting stuff, or the IDs that transactions tend to reference this thing by. If you're looking at a write-heavy reporting setup, it might be better to be a bit coarser for quicker inserts, because who cares if a reporting query chugs for 5 seconds? Though 5 seconds would be in that ugly uncanny valley - I can't in good faith just go and get coffee while the query runs, but it's too slow to be a good experience, lol.
And something to consider for the latter is also the general query patterns of the whole application against the database schema. This, again, indicates that delaying index creation until well into the maturity of a codebase is a good idea. I've had quite a few applications and reporting-like solutions that ended up with like 6-8 different filter patterns throughout their queries. And sure, we could have created 6 different index structures to support each of those individually perfectly, but then we'd have ended up with a crazy number of indexes, which in turn wouldn't be great for insert performance. Instead, since we waited for issues, we could identify 2 indexes that would support each individual query at only like 60%-80% of the effect of a tailored index structure, but those 2 indexes would support all queries across the board and - as we later saw - would support 90%+ of all queries written further down the line to a similar degree.
So, waiting until we had an understanding of the dataset and actual problems with actual data meant we could solve a lot of issues with a morning of analysis and a small touch to the system. And then performance issues were gone for a very long time.
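A sketch of what that workflow might look like in practice; the `events` table, its columns, and the index name are all invented for illustration:

```sql
-- Step 1: once the query class is slow, inspect the plan
EXPLAIN ANALYZE
SELECT * FROM events
WHERE tenant_id = 7
  AND created_at >= now() - interval '30 days';

-- Step 2: if the plan shows a slow (parallel) sequential scan,
-- a composite index that chops the data into tenant/time blocks
-- often covers most of the application's filter patterns at once:
CREATE INDEX events_tenant_created_idx
    ON events (tenant_id, created_at);
```

A composite index like this matches both queries that filter on tenant alone and queries that filter on tenant plus a time range, which is how a couple of well-chosen indexes can cover many query shapes.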
-
Tracking down high CPU Utilization on Amazon Aurora PostgreSQL
Analyzing the plan can be overwhelming sometimes; tools such as dalibo and depesz help visualize your explain plans. (Make sure to read the data retention policies of these tools, and ideally anonymize your queries for security reasons before you upload your plans!)
What are some alternatives?
awesome-db-tools - Everything that makes working with databases easier
TypeORM - ORM for TypeScript and JavaScript. Supports MySQL, PostgreSQL, MariaDB, SQLite, MS SQL Server, Oracle, SAP Hana, WebSQL databases. Works in NodeJS, Browser, Ionic, Cordova and Electron platforms.
pev - Postgres Explain Visualizer
ruby-pg - A PostgreSQL client library for Ruby
sqlite-wf - Simple visual ETL tool
sysbench - Scriptable database and system performance benchmark
yugabyte-db - YugabyteDB - the cloud native distributed SQL database for mission-critical applications.
octosql - OctoSQL is a query tool that allows you to join, analyse and transform data from multiple databases and file formats using SQL.
cloudbeaver - Cloud Database Manager
orafce - The "orafce" project implements in Postgres some of the functions from the Oracle database that are missing (or behave differently). Those functions were verified on Oracle 10g, and the module is useful for production work.
testcontainers-node - Testcontainers is a NodeJS library that supports tests, providing lightweight, throwaway instances of common databases, Selenium web browsers, or anything else that can run in a Docker container.
supabase - The open source Firebase alternative.