pg_hint_plan and single-table cardinality correction

This page summarizes the projects mentioned and recommended in the original post on dev.to

Our great sponsors
  • InfluxDB - Power Real-Time Data Analytics at Scale
  • WorkOS - The modern identity platform for B2B SaaS
  • SaaSHub - Software Alternatives and Reviews
  • pev2

    Postgres Explain Visualizer 2

  • Here the execution is much faster, avoiding the nested loop from "order_details". You may be surprised to see the whole "order_details" scanned and hashed, but remember this is a distributed database where this table is sharded to many tablets. With an even lower selectivity like Rows(dummy o #5) which optimizes for 5 rows coming from "orders" you can see a nested loop from there. Here it is as displayed by the Dalibo plan visualizer to add some colors to this post: This is the correct access and all I had to do, in addition to adding this dummy CTE, is mentioning the unaccounted selectivity of 20%. I also mentioned that the selectivity when starting with "order_details" is 100% with Rows(dummy d *1). This doesn't change anything here, but it is important to think about all possible join order the query planner may consider, and be sure your hint match all of them. I could also have added Rows(dummy o d *3) to mention that there are on average 3 "order_details" for each "order". I could also have decided to go with exact numbers like Rows(dummy o #18) Rows(dummy d #2155) Rows(dummy o d *2.5) which, even if the table change, keeps the right balance for the join method decision. Finally, I didn't hint the join paths that does not start with my dummy CTE as I expect the query planner to always start there (it knows there is only one row thanks to the VALUE or LIMIT construct).

  • yugabyte-db

    YugabyteDB - the cloud native distributed SQL database for mission-critical applications.

  • In PostgreSQL you have to install the pg_hint_plan extension, and enable it. The PostgreSQL community fears that it is incorrectly used, and that people do not report, to the community that maintains the query planner, the issues encountered with bad optimizer estimations. And the risk is that people just workaround the issue with hints without addressing the root cause. Even if also Open Source, Yugabyte is in a different position. The users know that this database is new, in very active development, and interact with us though Slack or GitHub. Then, it is acceptable to test a different plan with hints, report the issue, leave the hints for a short term workaround until the issue is fixed. In addition to that, the fixed version will be easy to deploy without downtime as YugabyteDB is a distributed database which supports easy rolling upgrade.

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts