ChainForge
| | gpt-prompt-engineer | ChainForge |
|---|---|---|
| Mentions | 7 | 14 |
| Stars | 8,182 | 2,089 |
| Growth | - | - |
| Activity | 6.9 | 8.7 |
| Latest commit | 2 months ago | 17 days ago |
| Language | Jupyter Notebook | TypeScript |
| License | MIT License | MIT License |
Stars - the number of stars that a project has on GitHub. Growth - month-over-month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
gpt-prompt-engineer
- AI & Machine Learning on July 17th 2023 Recap: AI-powered brain implants can spy on our thoughts; Amazon created a new generative AI org; Objaverse-XL's 10M+ dataset set to revolutionize AI in 3D; Stable Doodle: next chapter in AI art; gpt-prompt-engineer takes AI to new heights; AI Unraveled Podcast
The article from The Guardian discusses the rising issue of fake reviews generated by artificial intelligence tools, such as ChatGPT. These AI-generated reviews are becoming increasingly difficult to distinguish from genuine ones, posing new challenges for platforms like TripAdvisor, which identified 1.3 million fake reviews in 2022. AI tools are capable of producing highly plausible reviews for hotels, restaurants, and products in a variety of styles and languages. https://github.com/mshumer/gpt-prompt-engineer
- GPT-Prompt-Engineer
It seems like a `ranking_system_prompt` is used to rank the outputs of other prompts, which is pretty cool!
> Your job is to rank the quality of two outputs generated by different prompts. The prompts are used to generate a response for a given task. You will be provided with the task description, the test prompt, and two generations - one for each system prompt. Rank the generations in order of quality. If Generation A is better, respond with 'A'. If Generation B is better, respond with 'B'. Remember, to be considered 'better', a generation must not just be good, it must be noticeably superior to the other. Also, keep in mind that you are a very harsh critic. Only rank a generation as better if it truly impresses you more than the other. Respond with your ranking, and nothing else. Be fair and unbiased in your judgement.
Source: https://github.com/mshumer/gpt-prompt-engineer/blob/main/gpt...
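For a concrete picture of the mechanic, here is a minimal sketch (not the repo's actual code) of how a pairwise ranking prompt like the one quoted above could be invoked, assuming the modern `openai` Python SDK; `rank_pair` and the abbreviated system prompt are illustrative:

```python
# Minimal sketch of pairwise ranking with an LLM judge. Illustrative only:
# this is not gpt-prompt-engineer's actual code, and the system prompt is
# abbreviated from the one quoted above. Assumes the `openai` Python SDK v1+.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

RANKING_SYSTEM_PROMPT = (
    "Your job is to rank the quality of two outputs generated by different "
    "prompts. [...] If Generation A is better, respond with 'A'. If "
    "Generation B is better, respond with 'B'. Respond with your ranking, "
    "and nothing else."
)

def rank_pair(task: str, test_case: str, gen_a: str, gen_b: str) -> str:
    """Ask the judge model which generation is better; returns 'A' or 'B'."""
    response = client.chat.completions.create(
        model="gpt-4",
        temperature=0,   # deterministic judging
        max_tokens=1,    # we only want a single letter back
        messages=[
            {"role": "system", "content": RANKING_SYSTEM_PROMPT},
            {"role": "user", "content": (
                f"Task: {task}\nPrompt: {test_case}\n\n"
                f"Generation A: {gen_a}\n\nGeneration B: {gen_b}"
            )},
        ],
    )
    return response.choices[0].message.content.strip()
```

Repeated pairwise judgments like this can then feed a tournament-style scoring loop over candidate prompts, which, as I understand it, is roughly how gpt-prompt-engineer picks a winner.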
- Using a GPT-4 AI agent to create optimal prompts [Open-source]
ChainForge
- ChainForge is an open-source visual programming environment for prompt engineering
- AI for ChainForge Beta
- Anthropic Claude for Google Sheets
This seems like a Sheets implementation of something like ChainForge (https://github.com/ianarawjo/ChainForge). Curious that Anthropic is entering the LLMOps tooling space; this definitely comes as a surprise to me, as both OpenAI and Hugging Face seem to avoid building prompt engineering tooling themselves. Is this a business strategy of Anthropic's? An experiment? Regardless, it's very cool to see a company like them throw their hat into the LLMOps ring beyond being a model provider. Interested to see what comes next.
- ChainForge, a visual programming environment for prompt engineering
- I asked 60 LLMs a set of 20 questions
ChainForge has similar functionality for comparing models: https://github.com/ianarawjo/ChainForge
LocalAI creates a GPT-compatible HTTP API for local LLMs: https://github.com/go-skynet/LocalAI
Is it necessary to have an HTTP API for each model in a comparative study?
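To the question above: the point of a GPT-compatible API is that you need one integration per wire format, not one per model. A rough sketch, assuming a LocalAI server on localhost:8080 (the port and model name are illustrative):

```python
# Sketch: because LocalAI speaks the OpenAI wire format, the same client code
# can hit a hosted model or a local one just by switching the base URL.
# The localhost endpoint and model name below are illustrative assumptions.
from openai import OpenAI

backends = {
    "openai": OpenAI(),  # hosted; uses OPENAI_API_KEY
    "localai": OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed"),
}

question = "Name three uses of a paperclip."
for name, client in backends.items():
    resp = client.chat.completions.create(
        model="gpt-3.5-turbo",  # LocalAI maps model names to local weights
        messages=[{"role": "user", "content": question}],
    )
    print(f"{name}: {resp.choices[0].message.content}")
```

So any backend that speaks the format slots into the same comparison harness unchanged.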
- Show HN: Knit – A Better LLM Playground
- Show HN: ChainForge, a visual tool for prompt engineering and LLM evaluation
I think you should probably mention that its source is available! [0]
I don't personally have a need for this right now, but I can really see the use for the parameterised queries, as well as comparisons across models (see the sketch below).
Thanks for your efforts!
0: https://github.com/ianarawjo/ChainForge
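For anyone wondering what the parameterised queries buy you, here is a plain-Python sketch of the loop a visual tool like ChainForge automates; no ChainForge APIs are involved, and the template, variables, and model names are made up:

```python
# Sketch of parameterised queries: expand a prompt template over every
# combination of variable values, then fan each prompt out to every model.
# Plain Python; template, variables, and model names are illustrative.
from itertools import product

template = "Write a {tone} one-sentence summary of {topic}."
variables = {
    "tone": ["formal", "sarcastic"],
    "topic": ["quantum computing", "sourdough baking"],
}
models = ["gpt-4", "claude-2"]  # whatever providers you want to compare

keys = list(variables)
for combo in product(*(variables[k] for k in keys)):
    prompt = template.format(**dict(zip(keys, combo)))
    for model in models:
        # a real harness would call query(model, prompt) here; print the plan
        print(f"[{model}] {prompt}")
```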
- Continue multiple conversations simultaneously across multiple LLMs
- ChainForge now supports chat evaluation
- GPT-Prompt-Engineer
No problem! I guess I will make a plug myself: we've been working on a similar 'prompt engineering' tool, ChainForge (https://github.com/ianarawjo/ChainForge). It's targeted towards slightly different users and use cases than promptfoo: probably more geared towards early-stage, 'quick-and-dirty' explorations of differences between prompts and models by less experienced programmers, versus the kind of continuous benchmarking and verification testing that promptfoo offers.
I particularly like promptfoo's support for CI, which I haven't seen anywhere else and which is very important for developers pushing prompts into production (especially since OpenAI keeps updating their models every few months...).
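As a rough illustration of what CI for prompts means in practice (promptfoo expresses this declaratively in a config file; the test below is a hand-rolled pytest-style equivalent with a stubbed model call):

```python
# Hand-rolled sketch of a prompt regression test for CI: assert cheap,
# deterministic properties of the output so a silent model update that
# changes behaviour fails the build. `call_model` is a hypothetical stub;
# swap in a real provider call.
def call_model(prompt: str) -> str:
    """Stand-in for an actual LLM API call."""
    return "Canberra is the capital of Australia."

def test_capital_prompt():
    out = call_model("Answer in one sentence: what is the capital of Australia?")
    assert "canberra" in out.lower()  # factual anchor
    assert out.count(".") <= 1        # stays a single sentence
```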
What are some alternatives?
gpt-engineer - Specify what you want it to build, the AI asks for clarification, and then builds it.
langflow - ⛓️ Langflow is a visual framework for building multi-agent and RAG applications. It's open-source, Python-powered, fully customizable, model and vector store agnostic.
agenta - The all-in-one LLM developer platform: prompt management, evaluation, human feedback, and deployment all in one place.
promptfoo - Test your prompts, agents, and RAGs. Use LLM evals to improve your app's quality and catch problems. Compare performance of GPT, Claude, Gemini, Llama, and more. Simple declarative configs with command line and CI/CD integration.
DocGPT - 💻📚💡 DoctorGPT provides advanced LLM prompting for PDFs and webpages. [Moved to: https://github.com/FeatureBaseDB/DoctorGPT]
litellm - Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)
fiddler-auditor - Fiddler Auditor is a tool to evaluate language models.
GodMode - AI Chat Browser: Fast, Full webapp access to ChatGPT / Claude / Bard / Bing / Llama2! I use this 20 times a day.
bench - A tool for evaluating LLMs
Flowise - Drag & drop UI to build your customized LLM flow