Pushing ChatGPT's Structured Data Support to Its Limits

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

InfluxDB - Power Real-Time Data Analytics at Scale
Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.
www.influxdata.com
featured
SaaSHub - Software Alternatives and Reviews
SaaSHub helps you find the best software and product alternatives
www.saashub.com
featured
  • outlines

    Structured Text Generation

  • > ery few open-source LLMs explicitly claim they intentionally support structured data, but they’re smart enough and they have logically seen enough examples of JSON Schema that with enough system prompt tweaking they should behave.

    Open source models are actually _better_ at structured outputs because you can adapt them using tools like JSONFormer et al (https://www.reddit.com/r/LocalLLaMA/comments/17a4zlf/reliabl...). The structured outputs can be arbitrary grammars, for example, not just JSON (https://github.com/outlines-dev/outlines#using-context-free-...).

  • hackweek-2023-12

    Hack Week Projects

  • FWIW, I've seen stronger performance from gpt-4-1106-preview when I use "response_type: JSON" (and providing a typescript schema in context), rather than using the "tools" API.

    More flexible, and (evaluating non-scientifically!) qualitatively better answers & instruction following.

    Example from a hack week project earlier this month: https://github.com/microsoft-healthcare-madison/hackweek-202...

  • InfluxDB

    Power Real-Time Data Analytics at Scale. Get real-time insights from all types of time series data with InfluxDB. Ingest, query, and analyze billions of data points in real-time with unbounded cardinality.

    InfluxDB logo
  • FWIW, I've seen stronger performance from gpt-4-1106-preview when I use "response_type: JSON" (and providing a typescript schema in context), rather than using the "tools" API.

    More flexible, and (evaluating non-scientifically!) qualitatively better answers & instruction following.

    Example from a hack week project earlier this month: https://github.com/microsoft-healthcare-madison/hackweek-202...

  • NexusRaven-V2

  • gorilla

    Gorilla: An API store for LLMs

  • * Gorilla [https://github.com/ShishirPatil/gorilla]

    Could be interesting to try some of these exercises with these models.

  • instructor

    structured outputs for llms

  • I've been using the instructor[1] library recently and have found the abstractions simple and extremely helpful for getting great structured outputs from LLMs with pydantic.

    1 https://github.com/jxnl/instructor/tree/main

  • langroid

    Harness LLMs with Multi-Agent Programming

  • we (like simpleaichat from OP) leverage Pydantic to specify the desired structured output, and under the hood Langroid translates it to either the OpenAI function-calling params or (for LLMs that don’t natively support fn-calling), auto-insert appropriate instructions into tje system-prompt. We call this mechanism a ToolMessage:

    https://github.com/langroid/langroid/blob/main/langroid/agen...

    We take this idea much further — you can define a method in a ChatAgent to “handle” the tool and attach the tool to the agent. For stateless tools you can define a “handle” method in the tool itself and it gets patched into the ChatAgent as the handler for the tool.

  • SaaSHub

    SaaSHub - Software Alternatives and Reviews. SaaSHub helps you find the best software and product alternatives

    SaaSHub logo
NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

Suggest a related project

Related posts