-
wasm-client
Examples for the MotherDuck WASM Client library, enabling MotherDuck integration for WebAssembly-powered DuckDB
-
vanna
Chat with your SQL database. Accurate Text-to-SQL Generation via LLMs using RAG.
-
spider
scripts and baselines for Spider: Yale complex and cross-domain semantic parsing and text-to-SQL challenge
Love these! We do want to deliver more features like FixIt! [0]
What's really exciting is what you can do with DuckDB, MotherDuck, and WASM. A powerful in-browser storage and execution engine tethered to a central serverless data warehouse using hybrid mode [1] opens the door to unprecedented experiences. Imagine the possibilities if you have metadata, data, query logic, or even LLMs in the client, 0ms away from the user and running on the user's own hardware.
So we're doing this in our UI of course, but we also released a WASM SDK so that developers can take advantage of this new architecture in their own apps! [2]
[0]https://motherduck.com/blog/introducing-fixit-ai-sql-error-f...
[1]https://motherduck.com/docs/architecture-and-capabilities
[2]https://github.com/motherduckdb/wasm-client
1. First of all, thanks for outlining how you trained the model here in the repo: https://github.com/NumbersStationAI/DuckDB-NSQL?tab=readme-o...! I did not know about `sqlglot`, that's a pretty cool lib. Which part of the project was the most challenging or time-consuming: generating the training data, the actual training, or testing? How did you iterate, improve, and test the model?
2. How would you suggest using this model effectively if we have custom data in our DBs? For example, we might have a column called `purpose` that's a custom-defined enum (i.e. not a very well-known concept outside of our business). Currently, we've fed it in as context by defining all the possible values it can have. Do you have any other recs on how to tune our prompts so that this model is just as effective with our own custom data?
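To make the question above concrete, here is a minimal sketch of one way to feed custom enum values into the prompt: render the schema as a CREATE TABLE statement with inline comments listing the legal values for each enum-like column, so the model sees both the type and the business-specific vocabulary. All table, column, and enum names here are made up for illustration.

```python
# Hypothetical sketch: embed custom enum values into the schema context
# handed to a text-to-SQL model. Nothing here is specific to any one model;
# it just builds the context string.

def build_schema_context(table: str, columns: dict[str, str],
                         enums: dict[str, list[str]]) -> str:
    """Render a CREATE TABLE statement, annotating enum-like columns
    with their allowed values so the model knows the legal literals."""
    col_lines = []
    for name, sql_type in columns.items():
        line = f"  {name} {sql_type}"
        if name in enums:
            values = ", ".join(repr(v) for v in enums[name])
            line += f"  -- one of: {values}"
        col_lines.append(line)
    return f"CREATE TABLE {table} (\n" + ",\n".join(col_lines) + "\n);"

# Example with an invented `purpose` enum:
context = build_schema_context(
    "loans",
    {"id": "INTEGER", "amount": "DECIMAL(10,2)", "purpose": "TEXT"},
    {"purpose": ["debt_consolidation", "home_improvement", "small_business"]},
)
print(context)
```

The point is simply that the allowed values travel with the schema in every prompt, rather than relying on the model to guess them.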
3. Similar to the above, do you know if the same model can work effectively across tens or even hundreds of tables? I've used multiple question-SQL example pairs as context, but I've found that I need 15-20 for it to be effective for even one table, let alone tens of tables.
I'm trying to solve this with my project and (at least based on what people say in Discord), it's working really well for them:
https://github.com/vanna-ai/vanna