Prompt-Engineering-Guide vs examples

| | Prompt-Engineering-Guide | examples |
|---|---|---|
| Mentions | 83 | 12 |
| Stars | 43,924 | 99 |
| Growth | 2.3% | - |
| Activity | 9.7 | 7.8 |
| Latest commit | 8 days ago | about 2 months ago |
| Language | MDX | Jupyter Notebook |
| License | MIT License | GNU Affero General Public License v3.0 |
- Stars - the number of stars that a project has on GitHub.
- Growth - month over month growth in stars.
- Activity - a relative number indicating how actively a project is being developed; recent commits have higher weight than older ones. For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
Prompt-Engineering-Guide
- Top Open Source Prompt Engineering Guides & Tools🔧🏗️🚀
Prompt Engineering Guide is the holy grail of all guides, aiming to make it easier to stay up-to-date with prompt engineering guides, techniques, applications, and papers. If you are getting started, this is an excellent place to start.
- FLaNK AI - 15 April 2024
- Prompt Engineering Guide
- 24 GitHub repos with 372M views that you can't miss out as a software engineer
Guides, papers, lecture, notebooks and resources for prompt engineering: https://github.com/dair-ai/Prompt-Engineering-Guide
- Resources to deepen LLMs understanding for software engineers
this has been a great resource. approachable and great for practitioners. it's frequently updated with new papers and techniques https://www.promptingguide.ai/
- Step-by-Step Guide to building an Anomaly Detector using a LLM
The idea behind prompt engineering is to construct the queries given to language models so as to optimise their performance, guiding them to generate the desired output by refining the prompt rather than the model. There is a plethora of research papers on different forms of prompt engineering. DAIR.AI published a guide on prompt engineering that you might find useful to get started.
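The "constructing queries" idea can be made concrete. Below is a minimal, hypothetical sketch that builds a zero-shot prompt and a few-shot prompt for the same sentiment task; the task, examples, and function names are illustrative inventions, and the resulting strings would be sent to whichever LLM API you use.

```python
# Two prompt variants for a sentiment-classification task (illustrative only).

def zero_shot_prompt(text):
    # Rely on the instruction alone -- no examples shown to the model.
    return (
        "Classify the sentiment of this review as Positive or Negative.\n"
        f"Review: {text}\nSentiment:"
    )

def few_shot_prompt(text, examples):
    # Prepend labeled examples to steer the model toward the desired
    # output format and decision boundary.
    shots = "\n".join(f"Review: {r}\nSentiment: {s}" for r, s in examples)
    return (
        "Classify the sentiment of each review as Positive or Negative.\n"
        f"{shots}\nReview: {text}\nSentiment:"
    )

examples = [("Great battery life!", "Positive"), ("Arrived broken.", "Negative")]
prompt = few_shot_prompt("The screen is too dim.", examples)
```

Ending both prompts with `Sentiment:` is a small but typical trick: it constrains the model to complete with just the label rather than free-form prose.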
- The Essential Guide to Prompt Engineering for Creators and Innovators
Prompt Engineering Guide
- Getting Started with Prompt Engineering
Let's try to understand what Prompt Engineering is all about. Here's a quote from the Prompt Engineering Guide by DAIR.AI.
- Microsoft/promptbase: All things prompt engineering
I found this resource [0] handy for getting a grasp on all the different terms people use (zero/one-shot, tree of thoughts, RAG, etc). It's not super detailed, but was enough for me (a professional developer) to get started on some side projects with Mistral.
[0] https://www.promptingguide.ai/
- OpenAI: Prompt Engineering
There are better guides out there too
- https://www.promptingguide.ai/readings
- https://github.com/dair-ai/Prompt-Engineering-Guide/tree/mai...
- https://github.com/microsoft/promptbase (this one is less of a guide, but is likely the current SoTA)
examples
- FLaNK AI - 15 April 2024
- [R] Detecting Dataset Drift and Non-IID Sampling: A k-Nearest Neighbors approach that works for Image/Text/Audio/Numeric Data
I just published a paper detailing this non-IID check and open-sourced its code in the cleanlab package — just one line of code will check for this and many other types of issues in your dataset.
- Datalab: A Linter for ML Datasets
I recently published a blog introducing Datalab and an open-source Python implementation that is easy to use for all data types (image, text, tabular, audio, etc). For data scientists, I’ve made a quick Jupyter tutorial to run Datalab on your own data.
- Finetuning Large Language Models -- An introduction to the core ideas and approaches
Cool read! I just finished up a notebook where I show how noisy labels can drastically impact the performance of OpenAI LLMs. I first fine-tune the well-known Davinci model (the backbone of ChatGPT) on the original data and report an accuracy of 63%. I then use the open-source package cleanlab to find examples that are incorrectly labeled and drop them from the training data. This step increases the fine-tuning accuracy to 66% (better accuracy with less data). Finally, I correct the mislabeled examples, and fine-tuning accuracy jumps to 77%!
- What are some active research areas in Machine Learning Systems?
The entire field of data-centric AI is an active field that is pretty new --- it focuses on the data side of ML as opposed to just model optimization. Our company is building an open-source package cleanlab that is becoming the DCAI standard.
- [Research] ActiveLab: Active Learning with Data Re-Labeling
I recently published a paper introducing this novel method and an open-source Python implementation that is easy to use for all data types (image, text, tabular, audio, etc). For data scientists, I’ve made a quick Jupyter tutorial to run ActiveLab on your own data. For ML researchers, I’ve made all of our benchmarking code available for reproducibility so you can see for yourself how effective ActiveLab is in practice.
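ActiveLab's actual scoring combines model predictions with information from multiple annotators (see the paper for details). As a hedged baseline sketch of the underlying "which data is most informative to label next" idea, classic uncertainty sampling ranks unlabeled examples by the entropy of the model's predicted class probabilities; the function names and numbers below are illustrative, not ActiveLab's implementation.

```python
import math

def entropy(probs):
    # Shannon entropy of one predicted class distribution:
    # high entropy means the model is unsure about this example.
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_to_label(pred_probs, k):
    # Rank unlabeled examples by model uncertainty (highest entropy first)
    # and return the indices of the k most informative ones.
    scored = sorted(range(len(pred_probs)),
                    key=lambda i: entropy(pred_probs[i]),
                    reverse=True)
    return scored[:k]

pred_probs = [
    [0.98, 0.02],  # confident prediction -> low labeling priority
    [0.55, 0.45],  # near coin-flip -> high labeling priority
    [0.80, 0.20],
]
print(select_to_label(pred_probs, 2))  # -> [1, 2]
```

Methods like ActiveLab improve on this baseline precisely where it fails: pure uncertainty sampling cannot tell whether an example needs a first label or a re-label after a noisy annotation.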
- cleanlab open-source --- expanded support for Active Learning and other data-centric AI tasks
- suggest which data is most informative to (re)label next (active learning)
- Strategies for selecting what data to annotate?
- [D] Can someone point to research on determining usefulness of samples/datasets for training ML models?
- cleanlab: an open-source python framework for data-centric AI
In one line of Python, cleanlab can automatically: 1) find mislabeled data and train robust models, 2) detect outliers, 3) estimate consensus and annotator quality for datasets labeled by multiple annotators, and 4) suggest which data is best to label or re-label next (active learning).
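The "find mislabeled data" capability is worth unpacking. cleanlab's real algorithm (confident learning, exposed via `cleanlab.filter.find_label_issues`) estimates per-class probability thresholds from the data itself; the sketch below is only a simplified illustration of the intuition, using a fixed threshold and made-up numbers rather than the library's implementation.

```python
def find_likely_label_issues(labels, pred_probs, threshold=0.5):
    # Flag examples where the model's (ideally out-of-sample) predicted
    # probability for the *given* label is low and a different class is
    # the model's top choice. Confident learning refines this by learning
    # per-class thresholds instead of a single fixed cutoff.
    issues = []
    for i, (y, probs) in enumerate(zip(labels, pred_probs)):
        best = max(range(len(probs)), key=probs.__getitem__)
        if best != y and probs[y] < threshold:
            issues.append(i)
    return issues

labels = [0, 1, 0]
pred_probs = [
    [0.90, 0.10],  # labeled 0, model agrees
    [0.85, 0.15],  # labeled 1, model strongly disagrees -> likely issue
    [0.60, 0.40],  # labeled 0, model agrees
]
print(find_likely_label_issues(labels, pred_probs))  # -> [1]
```

Using out-of-sample predicted probabilities (e.g. from cross-validation) matters here: in-sample predictions tend to agree with the labels the model was trained on, hiding exactly the errors you want to find.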
What are some alternatives?
langchain - ⚡ Building applications with LLMs through composability ⚡ [Moved to: https://github.com/langchain-ai/langchain]
token-label-error-benchmarks - Benchmarking methods for label error detection in token classification tasks
openai-cookbook - Examples and guides for using the OpenAI API
awesome-active-learning - A curated list of awesome Active Learning
BetterChatGPT - An amazing UI for OpenAI's ChatGPT (Website + Windows + MacOS + Linux)
deep-active-learning - Deep Active Learning
prompt-engineering - Tips and tricks for working with Large Language Models like OpenAI's GPT-4.
notebooks - Repo for various jupyter notebooks.
Learn_Prompting - Prompt Engineering, Generative AI, and LLM Guide by Learn Prompting | Join our discord for the largest Prompt Engineering learning community
cleanlab - The standard data-centric AI package for data quality and machine learning with messy, real-world data and labels.
awesome-chatgpt-prompts - This repo includes ChatGPT prompt curation to use ChatGPT better.
multiannotator-benchmarks - Benchmarking algorithms for assessing quality of data labeled by multiple annotators