-
laion-aesthetic-datasette
Use Datasette to explore LAION improved_aesthetics_6plus training data used by Stable Diffusion
If anyone is interested in the technical details, the database itself is a 4GB SQLite file which we are hosting with Datasette running on Fly.
More details in our repo: https://github.com/simonw/laion-aesthetic-datasette
Search is provided by SQLite FTS5.
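For anyone who hasn't used FTS5 before, here is a minimal sketch of what that kind of search looks like from Python's built-in `sqlite3` module. The table name `images_fts`, its columns, and the sample rows are made up for illustration and are not the schema of the hosted database. It also assumes your Python's SQLite build has FTS5 compiled in, which most do.

```python
import sqlite3

def build_index(rows):
    """Create an in-memory FTS5 table and load (url, text) rows into it."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the real database's tables and columns may differ.
    # UNINDEXED keeps the URL stored but out of the full-text index.
    conn.execute("CREATE VIRTUAL TABLE images_fts USING fts5(url UNINDEXED, text)")
    conn.executemany("INSERT INTO images_fts (url, text) VALUES (?, ?)", rows)
    return conn

def fts_search(conn, query, limit=10):
    """Return matching (url, text) rows, best match first."""
    sql = (
        "SELECT url, text FROM images_fts "
        "WHERE images_fts MATCH ? ORDER BY rank LIMIT ?"
    )
    return conn.execute(sql, (query, limit)).fetchall()

# Made-up captions standing in for the real dataset.
SAMPLE_ROWS = [
    ("https://example.com/1.jpg", "A castle on a hill at sunset"),
    ("https://example.com/2.jpg", "Portrait of a cat, oil painting"),
    ("https://example.com/3.jpg", "Castle ruins in the fog"),
]

if __name__ == "__main__":
    conn = build_index(SAMPLE_ROWS)
    for url, text in fts_search(conn, "castle"):
        print(url, text)
```

Multiple terms in the query are ANDed by default, so `fts_search(conn, "castle sunset")` only matches captions containing both words.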
I recommend looking into "transfer learning".
That's where you start with an existing large model, and train a new model on top of it by feeding in new images.
What's fascinating about transfer learning is that you don't need many new images at all. Just a few hundred extra images can produce a model that's frighteningly accurate for tasks like image labeling.
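To make the idea concrete, here is a toy sketch in plain Python of the "frozen backbone plus new head" flavor of transfer learning. Everything in it is an assumption for illustration: `frozen_features` is a hand-written stand-in for a pretrained network, the "images" are lists of 16 pixel values, and the task (bright vs. dark) is trivial. In practice you would load a real pretrained model in a deep learning framework and train a new output layer the same way.

```python
import math
import random

def frozen_features(image):
    """Stand-in for a pretrained backbone (hypothetical, never updated)."""
    mean = sum(image) / len(image)
    spread = max(image) - min(image)
    return [mean, spread]

def train_head(examples, epochs=200, lr=0.5):
    """Fit a logistic-regression head on the frozen features with plain SGD."""
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for image, label in examples:
            feats = frozen_features(image)  # backbone output, treated as fixed
            z = sum(w * f for w, f in zip(weights, feats)) + bias
            pred = 1.0 / (1.0 + math.exp(-z))
            err = pred - label
            # Only the head's parameters change during training.
            weights = [w - lr * err * f for w, f in zip(weights, feats)]
            bias -= lr * err
    return weights, bias

def predict(head, image):
    """Return 1 (bright) or 0 (dark) for one image."""
    weights, bias = head
    z = sum(w * f for w, f in zip(weights, frozen_features(image))) + bias
    return 1 if z > 0 else 0

def make_image(bright, rng):
    """Generate a made-up 16-pixel image that is mostly bright or mostly dark."""
    base = 0.8 if bright else 0.2
    return [base + rng.uniform(-0.1, 0.1) for _ in range(16)]

if __name__ == "__main__":
    rng = random.Random(0)
    # A small labeled set is enough because the features already exist.
    train = [(make_image(i % 2 == 0, rng), int(i % 2 == 0)) for i in range(20)]
    head = train_head(train)
    print(predict(head, make_image(True, rng)), predict(head, make_image(False, rng)))
```

The point of the sketch is the split: the feature extractor stays untouched, and only a tiny set of new parameters is learned from a handful of examples.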
This is pretty much how all AI models work today. Take a look at the Stable Diffusion model card: https://github.com/CompVis/stable-diffusion/blob/main/Stable...
They ran multiple training sessions on progressively smaller (and higher-quality) sets of images to get the final result.
Done https://github.com/rom1504/clip-retrieval/commit/53e3383f58b...
Using CLIP for search is better than direct text indexing for a variety of reasons; here, for example, because it better matches what Stable Diffusion sees.
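Roughly, embedding-based search ranks images by how close their vectors are to the query's vector instead of matching words in the caption. A minimal sketch of that ranking step follows. The 3-dimensional vectors are made up and are not real CLIP outputs, and `nearest_images` is a hypothetical helper, not part of clip-retrieval.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest_images(query_vec, index, k=2):
    """Return the k image names whose embeddings are closest to the query."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in scored[:k]]

# Made-up vectors standing in for image embeddings.
IMAGE_INDEX = {
    "castle.jpg": [0.9, 0.1, 0.0],
    "cat.jpg": [0.0, 0.2, 0.9],
    "fortress.jpg": [0.8, 0.3, 0.1],
}

# Pretend this is the embedding of the text "a medieval castle".
QUERY_CASTLE = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    print(nearest_images(QUERY_CASTLE, IMAGE_INDEX))
```

Note that "fortress.jpg" ranks highly even though the query never says "fortress", which is the kind of match a plain text index would miss.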