Replacing a SQL analyst with 26 recursive GPT prompts

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • lambdaprompt

    λprompt - A functional programming interface for building AI systems

  • This is great! There's been some really rapid progress on Text2SQL in the last 6 months, and I really think this will have a real impact on the modern data stack ecosystem!

    I had similar success with lambdaprompt for solving Text2SQL (https://github.com/approximatelabs/lambdaprompt/)

  • sketch

    AI code-writing assistant that understands data content

  • (3) Asking for re-writes of failed queries (happens occasionally) also helps

    The main challenge, I think, with a lot of these "look, it works" tools for data applications is how to get an interface that will actually be easy to adopt. The chat-bot style shown here (Discord and Slack integration) I can see being really valuable, as I believe integrations of this style with data catalog systems have gained some traction recently. People like to ask other people data questions in Slack; adding a bot that tries to answer might short-circuit a lot of this!

    We built a prototype where we applied similar techniques to the pandas-code-writing part of the stack, trying to help keep data scientists / data analysts "in flow" by integrating the code answers in notebooks (similar to how Copilot puts suggestions in-line) -- and released https://github.com/approximatelabs/sketch a little while ago.
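
The "re-writes of failed queries" idea in the comment above can be sketched as a small retry loop. This is only an illustration under stated assumptions: `run_with_rewrites` and `fake_model` are hypothetical names, `fake_model` stands in for a real model call, and SQLite is used purely so the example is self-contained. It is not how lambdaprompt or sketch implement this.

```python
import sqlite3


def run_with_rewrites(conn, question, generate_sql, max_attempts=3):
    """Run generated SQL; on a database error, pass the error back and ask for a rewrite."""
    sql, error = None, None
    for _ in range(max_attempts):
        # generate_sql sees the previous attempt and its error (both None on the first try).
        sql = generate_sql(question, sql, error)
        try:
            return sql, conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            error = str(exc)
    raise RuntimeError(f"no working query after {max_attempts} attempts: {error}")


def fake_model(question, previous_sql, error):
    """Hypothetical stand-in for an LLM call: the first answer is wrong, the rewrite is right."""
    if error is None:
        return "SELECT COUNT(*) FROM user"  # wrong table name, fails in SQLite
    return "SELECT COUNT(*) FROM users"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])

sql, rows = run_with_rewrites(conn, "How many users are there?", fake_model)
print(sql, rows)
```

The loop only catches queries that fail outright; a query that runs but answers the wrong question passes through unnoticed, so a bounded number of attempts and a human check still matter.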

  • genetic-programming

    Genetic programming in Common Lisp

  • A couple of thoughts jumped out after reading this: transforms and meta-learning.

    An old trick in AI is to transform the medium to Lisp because it can be represented as a syntax-free tree that always runs. In this case, working with SQL directly led to syntax errors which returned no results. It would probably be more fruitful to work with relational algebra and tuple relational calculus (I had to look that up hah) represented as Lisp and convert the final answer back to SQL. But I'm honestly impressed that ChatGPT's SQL answers mostly worked anyway!

    https://en.wikipedia.org/wiki/Genetic_programming

    http://www.cis.umassd.edu/~ivalova/Spring08/cis412/Ectures/G...

    https://www.gene-expression-programming.com/GepBook/Chapter1...

    https://github.com/gdobbins/genetic-programming

    I actually don't know how far things have come with meta-learning as far as AIs tuning their own hyperparameters. Well, a quick google search turned up this:

    https://cloud.google.com/ai-platform/training/docs/hyperpara...

    So I would guess that this is the secret sauce that's boosted AI to such better performance in the last year or two. It's always been obvious to do that, but it requires a certain level of computing power to be able to run trainings thousands of times to pick the best learners.

    Anyway, my point is that the author is doing the above steps semi-manually, but AIs are beginning to self-manage. Recursion sounds like a handy term to convey that. ChatGPT is so complex compared to what he is doing that I don't see any reason why it couldn't take his place too! And with so many eyeballs on this stuff, we probably only have a year or two before AI can do it all.

    I'm regurgitating 20-year-old knowledge here as an armchair warrior. Insiders are so far beyond this. But see, everything I mentioned is so much easier to understand than neural networks that there's no reason why NNs can't use these techniques themselves. The hard work has already been done; now it's just plug n chug.
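
The comment's suggestion of working with a relational-algebra tree and converting to SQL only at the end might look roughly like the sketch below. It is written in Python with nested tuples standing in for Lisp S-expressions. The node shapes (`"project"`, `"select"`), the helper names `to_sql` and `literal`, and the `people` table are all assumptions made for illustration. The sketch only ensures that every supported tree renders to syntactically well-formed SQL; it does not check that tables or columns exist, and it does not validate identifiers.

```python
import sqlite3


def literal(value):
    """Render a Python value as a SQL literal (strings are single-quoted and escaped)."""
    if isinstance(value, str):
        return "'" + value.replace("'", "''") + "'"
    return str(value)


def to_sql(expr):
    """Compile a tiny relational-algebra tree (nested tuples) into a SQL string."""
    if isinstance(expr, str):  # leaf: a bare table name
        return f"SELECT * FROM {expr}"
    op = expr[0]
    if op == "project":  # ("project", [columns], child)
        _, columns, child = expr
        return f"SELECT {', '.join(columns)} FROM ({to_sql(child)}) AS t"
    if op == "select":  # ("select", (comparison, column, value), child)
        _, (comparison, column, value), child = expr
        return f"SELECT * FROM ({to_sql(child)}) AS t WHERE {column} {comparison} {literal(value)}"
    raise ValueError(f"unknown operator: {op!r}")


# Names of people older than 30, expressed as a tree instead of SQL text.
tree = ("project", ["name"], ("select", (">", "age", 30), "people"))
query = to_sql(tree)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (name TEXT, age INTEGER)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [("ada", 36), ("bob", 25), ("grace", 45)])
result = sorted(conn.execute(query).fetchall())
print(query)
print(result)
```

A model (or a genetic-programming search) that emits trees like `tree` can only produce shapes the compiler accepts, which is the appeal of the approach; the cost is that anything outside the supported operators cannot be expressed at all.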

  • zillion

    Make sense of it all. Semantic data modeling and analytics with a sprinkle of AI. https://totalhack.github.io/zillion/

  • This seems fun, but certainly unnecessary. All of those questions could be answered in seconds using a warehouse tool like Looker or Metabase or https://github.com/totalhack/zillion (disclaimer: I'm the author and this is alpha-level stuff, though I use it regularly).

  • olympe

    Query your database in plain English

  • It only supports Postgres for now (and it's far from perfect...)

    https://github.com/BenderV/olympe

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.
