[D] - Are there any AI benchmarks that involve successful long-term problem solving when running as autonomous agents (like in AutoGPT)? How do we compare the effectiveness of models as agents?

This page summarizes the projects mentioned and recommended in the original post on /r/MachineLearning

  • Voyager

    An Open-Ended Embodied Agent with Large Language Models (by MineDojo)

  • Does this beat Voyager? I read about it and wondered: what if we added a skill library to LangChain/LlamaIndex agents? It could use the same vector store that holds static data, but after each task is performed, the agent would evaluate and archive the recipe of steps it used to perform that new task. The next time the agent is asked to perform a task, it can look in the library and retrieve a matching recipe. Unlike traditional fine-tuning, you don't update the model parameters, and these recipes are much more interpretable and can be manually edited or inserted by humans. There may also be an automatic way to convert wikiHow articles or YouTube tutorials into recipes.
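The skill-library idea in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions, not Voyager's implementation and not a LangChain or LlamaIndex API: `SkillLibrary`, `Recipe`, `archive`, `retrieve`, and the `min_score` threshold are hypothetical names, and a bag-of-words cosine similarity stands in for the embedding lookup a real vector store would provide.

```python
import math
import re
from collections import Counter
from dataclasses import dataclass, field


def _bag_of_words(text):
    # Stand-in for an embedding: lowercase token counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def _cosine(a, b):
    # Cosine similarity between two token-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class Recipe:
    # Hypothetical record: a task description plus human-readable steps.
    task: str
    steps: list


@dataclass
class SkillLibrary:
    # Hypothetical store; a real agent would likely back this with a vector DB.
    recipes: list = field(default_factory=list)

    def archive(self, task, steps):
        # Called after a task succeeds: keep the steps for later reuse.
        self.recipes.append(Recipe(task, list(steps)))

    def retrieve(self, task, min_score=0.3):
        # Return the most similar archived recipe, or None if nothing is close.
        query = _bag_of_words(task)
        best, best_score = None, 0.0
        for recipe in self.recipes:
            score = _cosine(query, _bag_of_words(recipe.task))
            if score > best_score:
                best, best_score = recipe, score
        return best if best_score >= min_score else None


library = SkillLibrary()
library.archive(
    "craft a wooden pickaxe",
    ["collect logs", "craft planks", "craft sticks", "craft pickaxe"],
)
library.archive("cook raw beef", ["build furnace", "add fuel", "smelt beef"])

match = library.retrieve("craft wooden pickaxe")
unknown = library.retrieve("mine diamonds")
```

Because recipes are plain data rather than model weights, a person can inspect or edit `match.steps` directly, which is the interpretability advantage the comment points to. When `retrieve` returns `None`, the agent would fall back to planning the task from scratch.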


Related posts

  • Is there any game that allow us to interact with it by python?

    2 projects | /r/reinforcementlearning | 1 Dec 2023
  • A Coder Considers the Waning Days of the Craft

    2 projects | news.ycombinator.com | 13 Nov 2023
  • Open/Local LLM support for MineDojo/Voyager

    4 projects | /r/LocalLLaMA | 11 Oct 2023
  • Voyager – Minecraft Embodied Agent with Large Language Models

    1 project | news.ycombinator.com | 17 Sep 2023
  • GPT-4 was set free in Minecraft, here's what happened next...

    1 project | /r/ArtificialInteligence | 6 Jul 2023