datasetGPT
A command-line interface to generate textual and conversational datasets with LLMs. (by radi-cho)
typer
Typer, build great CLIs. Easy to code. Based on Python type hints. (by tiangolo)
Metric | datasetGPT | typer |
---|---|---|
Mentions | 6 | 86 |
Stars | 272 | 14,293 |
Growth | - | - |
Activity | 6.0 | 6.7 |
Last commit | 8 months ago | 1 day ago |
Language | Python | Python |
License | - | MIT License |
Mentions - the total number of mentions that we've tracked plus the number of user-suggested alternatives.
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
datasetGPT
Posts with mentions or reviews of datasetGPT.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-04-01.
- datasetGPT is a command-line interface and a Python library for inferencing Large Language Models to generate textual datasets. (Regenerative feedback loops)
-
[R] [P] I generated a 30K-utterance dataset by making GPT-4 prompt two ChatGPT instances to converse.
A dataset consisting of dialogues between two instances of ChatGPT (gpt-3.5-turbo). The CLI commands and dialogue prompts themselves have been written by GPT-4. The dataset covers a wide range of contexts (questions and answers, arguing and reasoning, task-oriented dialogues) and downstream tasks (e.g., hotel reservations, medical advice). Texts have been generated with datasetGPT and the OpenAI API as a backend. Approximate cost for generation: $35.
-
[P] two copies of gpt-3.5 (one playing as the oracle, and another as the guesser) performs poorly on the game of 20 Questions (68/1823).
Last week I released a CLI that can do this at scale: https://github.com/radi-cho/datasetGPT. I will use personal funds to generate a somewhat big task-oriented dataset later today with gpt-3.5 or gpt-4. I will open source it along with a way for people to contribute their own datasets so we can collect bigger ones. It would be helpful both for analysis of how LLMs work and for fine-tuning downstream models (Alpaca-like).
- DatasetGPT - A command-line interface to generate textual and conversational datasets with LLMs.
- DatasetGPT – an open-source command line tool for generating datasets with LLMs
-
[P] [D] datasetGPT - A command-line tool to generate datasets by inferencing LLMs. Supports OpenAI, Cohere, and Petals.
GitHub: https://github.com/radi-cho/datasetGPT
typer
Posts with mentions or reviews of typer.
We have used some of these posts to build our list of alternatives
and similar projects. The last one was on 2023-11-20.
- Copilot for your GitHub stars
-
Things I've learned about building CLI tools in Python
I have been using Typer, which uses Click under the hood, on every one of my CLI projects. The documentation is fantastic, the CLI app it produces looks great, and it lets you create things quickly. I highly recommend it.
-
Things to do with standalone script
Adding CLI capabilities. My preferred library here is typer.
-
Where to start for managing a Python code base for public distribution
I just heard about this but it seems to be pretty much the type of thing you want and want fast.
-
Help on Docstrings
Docstrings are for documenting how a function/ class/ method/ module works. Often you don't need to add a docstring to your main function because no one will be importing it to use elsewhere. And if you want it to run as a CLI, then there are better ways to document the available options. For example, typer does most of it for you, or in click you add the help text to the decorator.
-
Which best practices do you follow to build robust & extensible ETL jobs?
Most computing tasks in Airflow DAGs are KubernetesPodOperator tasks containing a CLI (Python Typer). It allows us to pass arguments easily to run a DAG manually if needed (the new UI to pass arguments to a DAG in Airflow 2.6 is really nice). Arguments allow us to replay a DAG easily (change start / end dates, for instance).
-
Devs on teams that deploy anytime you want, what does your SDLC workflow look like?
So it's basically the main .gitlab-ci.yml file plus a separate Python CI app using Typer for the AWS instrumentation.
-
The different uses of Python type hints
Similarly for Typer, which is literally "the FastAPI of CLIs"[1]. Handy to type your `main` parameters and have CLI argument parsing. For more complicated cases, it's a wrapper around Click.
-
Command line parser library, which one do you like the most, regardless of language?
Interesting that you hate Python, but love Click. Did you try Typer, which uses Click underneath?
- Typer: Build great CLIs. Easy to code. Based on Python type hints
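Several of the posts above describe the same pattern: type the parameters of a function and let Typer derive the argument parsing and help text, for example to pass start and end dates to a job. A minimal sketch of that idea follows. The `replay` command, its option names, and its messages are hypothetical and are not taken from any of the quoted projects.

```python
from datetime import datetime

import typer

app = typer.Typer()


@app.command()
def replay(
    start: datetime = typer.Option(..., help="First date to process."),
    end: datetime = typer.Option(..., help="Date to stop at (exclusive)."),
    dry_run: bool = typer.Option(False, help="Only print the plan."),
) -> None:
    """Replay a hypothetical job over a date range."""
    # The type hints drive parsing: --start and --end accept ISO dates
    # such as 2023-01-01, and dry_run becomes a --dry-run flag.
    if end <= start:
        typer.echo("end must be after start", err=True)
        raise typer.Exit(code=1)
    days = (end - start).days
    mode = "Would process" if dry_run else "Processing"
    typer.echo(f"{mode} {days} day(s) from {start.date()} to {end.date()}")


# To use this as a script, call app() under an
# `if __name__ == "__main__":` guard at the bottom of the file.
```

Because the app has a single command, Typer runs it directly without a subcommand name, so an invocation would look like `python job.py --start 2023-01-01 --end 2023-01-03` (the file name `job.py` is an assumption). The docstring and the `help` strings are what Typer shows under `--help`, which is the point made in the docstrings post above.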