Refinery Alternatives
Similar projects and alternatives to refinery
-
azuredatastudio
Azure Data Studio is a data management and development tool with connectivity to popular cloud and on-premises databases. Azure Data Studio supports Windows, macOS, and Linux, with immediate capability to connect to Azure SQL and SQL Server. Browse the extension library for more database support options including MySQL, PostgreSQL, and MongoDB.
-
dbs-tools
Perl tools to transform account / transaction data from DBS Bank into proper CSV
-
sqlx
🧰 The Rust SQL Toolkit. An async, pure Rust SQL crate featuring compile-time checked queries without a DSL. Supports PostgreSQL, MySQL, SQLite, and MSSQL. (by launchbadge)
-
fiftyone
The open-source tool for building high-quality datasets and computer vision models
-
cleanlab
The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
-
refinery-sample-projects
Contains example projects you can use to test refinery. Select the use case from the branches.
-
naphtha
Universal database connection layer for your application in Rust. Implements the most common functions insert, update and remove for database connections. Change the database without having to adjust your code. Specific models can be stored in different databases. Query models by property. Migrations in pure Rust and available during runtime.
-
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
-
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
-
dgl
Python package built to ease deep learning on graph, on top of existing DL frameworks.
-
BotLibre
An open platform for artificial intelligence, chat bots, virtual agents, social media automation, and live chat automation.
-
grape
🍇 GRAPE is a Rust/Python Graph Representation Learning library for Predictions and Evaluations (by AnacletoLAB)
-
dcai-lab
Lab assignments for Introduction to Data-Centric AI, MIT IAP 2023 👩🏽‍💻
refinery reviews and mentions
-
[P] We are building a curated list of open source tooling for data-centric AI workflows, looking for contributions.
You definitely forgot https://www.kern.ai/ :)
-
GPT and BERT: A Comparison of Transformer Architectures
Get it for free here: https://github.com/code-kern-ai/refinery
-
Drastically decrease the size of your Docker application
Containers are amazing for building applications because they allow you to pack up a program together with all its dependencies and execute it wherever you like. That is why our application consists of 20+ individual containers, forming our data-centric IDE for NLP, which you can check out here: https://github.com/code-kern-ai/refinery.
-
Introducing bricks, an open-source content-library for NLP
Today we launched bricks, an open-source library which provides enrichments for your natural language processing projects. Our main goal with bricks is to shorten the amount of time that you need from idea to implementation. Bricks also seamlessly integrates into our main tool, the Kern AI refinery.
-
How to fine-tune your embeddings for better similarity search
This blog post will share our experience with fine-tuning sentence embeddings on a commonly available dataset using similarity learning. We additionally explore how this could benefit the labeling workflow in the Kern AI refinery. To understand this post, you should know what embeddings are and how they are generated. A rough idea of what fine-tuning is also helps. All the code and data referenced in this post is available on GitHub.
-
Vector Databases for Data-Centric AI (Part 2)
Shout out to both Kern.AI (an excellent open-source NLP labelling tool) https://github.com/code-kern-ai/refinery and Voxel51 (an excellent open-source Computer Vision analysis tool) https://github.com/voxel51/fiftyone for being early adopters of the technology in their platforms, but I don't believe either have yet made use of all of the value it can provide.
-
Hacker News top posts: Jul 18, 2022
Show HN: If VS Code had a data-centric IDE sibling, what would that look like? (23 comments)
-
Show HN: If VS Code had a data-centric IDE sibling, what would that look like?
Hi Ruben,
you can take a look at our architecture overview here: https://github.com/code-kern-ai/refinery#-architecture
A bit below it, you find a table with the links to all repositories. All of them are open-source. But thanks for the feedback, I'll try to make it a bit easier to understand! I appreciate that! :)
Hi Tom! Thanks, happy to hear that :)
We've focused on JSON as the user-specified data model. So you can upload anything fitting into a JSON. We're using pandas to process the uploaded data, so spreadsheets or CSV-ish also work.
We've got a public roadmap (https://github.com/code-kern-ai/refinery/projects/1), and we're looking forward to also integrating e.g. native PDF labeling sometime soon.
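The comment above describes JSON as the data model, with spreadsheet or CSV-like uploads converted into it. As a rough illustration of that idea only, here is a minimal sketch that turns CSV text into JSON-style records. It is not refinery's code: the comment says refinery uses pandas, while this sketch swaps in Python's standard `csv` and `json` modules to stay dependency-free, and the function name `csv_to_records` and the sample columns are hypothetical.

```python
import csv
import io
import json


def csv_to_records(csv_text: str) -> list[dict]:
    """Turn CSV text into a list of JSON-style records (one dict per row)."""
    # DictReader uses the first line as the header and yields one mapping per row.
    reader = csv.DictReader(io.StringIO(csv_text))
    return [dict(row) for row in reader]


# Hypothetical sample data; the column names are made up for illustration.
sample_csv = "headline,label\nStocks rally,finance\nNew GPU released,tech\n"

records = csv_to_records(sample_csv)

# Each row is now a plain dict, so the whole upload serializes to JSON directly.
print(json.dumps(records, indent=2))
```

Note that every value comes back as a string here; a real pipeline would likely also infer column types, which is one reason a library such as pandas is convenient for this step.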
Stats
code-kern-ai/refinery is an open source project licensed under Apache License 2.0, which is an OSI-approved license.
The primary programming language of refinery is Python.