Questions about memory, tree-of-thought, planning

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • tree-of-thought-prompting

    Using Tree-of-Thought Prompting to boost ChatGPT's reasoning

  • Probably too early in testing and development for there to be a 'standard'. A quick Google search will find you some things to read, like https://github.com/dave1010/tree-of-thought-prompting, but your best bet is to read through what other people are doing and try things for yourself. You might end up discovering something new that nobody has thought of yet. Kaio Ken literally just changed the game overnight and figured out how to expand context to 8k for llama-based models with 2 lines of code. Things are evolving fast, and the community desperately needs people willing to spend time reading papers on arXiv, digging through GitHub repos, and testing things out.
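
The single-prompt approach in the repo above boils down to one template. Below is a minimal sketch in that spirit; the wording of the prompt is illustrative rather than the repo's verbatim text, and the example question is made up.

```python
# Single-prompt Tree-of-Thought variant in the spirit of
# https://github.com/dave1010/tree-of-thought-prompting.
# The exact wording is illustrative, not the repo's verbatim prompt.

TREE_OF_THOUGHT_PROMPT = """\
Imagine three different experts are answering this question.
Each expert writes down one step of their reasoning and shares it with the group.
All experts then move on to the next step, and so on.
If any expert realises they are wrong at any point, they drop out.
The question is: {question}
"""

def build_prompt(question: str) -> str:
    """Fill the template before sending it to any chat model."""
    return TREE_OF_THOUGHT_PROMPT.format(question=question)

if __name__ == "__main__":
    print(build_prompt("I have 3 apples, eat one, then buy two more. How many do I have?"))
```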

  • Weaviate

    Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering and the fault tolerance and scalability of a cloud-native database.

  • I tried ChromaDB but had terrible performance and could not pin down the cause (likely a problem on my end). Weaviate was easy to set up and had excellent performance; this is probably what I will use in the future. Next on my list is txtinstruct: fine-tuning a model on data that does not change and using a vector DB for everything else seems promising.
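
For the Weaviate route described above, a minimal sketch could look like the following, assuming the v3 Python client (weaviate-client < 4) and a local instance on port 8080; the "Document" class and the embed() placeholder are made-up names for illustration.

```python
# Minimal Weaviate-as-vector-store sketch (v3 Python client, local instance).
import weaviate

client = weaviate.Client("http://localhost:8080")

# A class that stores our own vectors (no server-side vectorizer module).
client.schema.create_class({
    "class": "Document",
    "vectorizer": "none",
    "properties": [{"name": "text", "dataType": ["text"]}],
})

def embed(text: str) -> list[float]:
    # Placeholder embedding; swap in a real model (e.g. sentence-transformers).
    return [float(len(text) % 7), float(len(text) % 11), float(len(text) % 13)]

for doc in ["LLaMA context tricks", "Vector databases compared", "Prompting strategies"]:
    client.data_object.create(
        data_object={"text": doc},
        class_name="Document",
        vector=embed(doc),
    )

# Nearest-neighbour search with a query vector.
result = (
    client.query.get("Document", ["text"])
    .with_near_vector({"vector": embed("how do I extend context length?")})
    .with_limit(2)
    .do()
)
print(result)
```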

  • tree-of-thought-llm

    [NeurIPS 2023] Tree of Thoughts: Deliberate Problem Solving with Large Language Models

  • I have not tried this myself, but you can find the original paper here, and here is the implementation. I have also read that you can simply give 8-10 examples to an instruction-following LLM in the prompt. As a side note, most inference engines have an option to emulate the OpenAI API; there is also LocalAI, built explicitly for that purpose.
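
On the "emulate the OpenAI API" point: with the openai Python package (v1+) you can point the client at a local server such as LocalAI. The base URL, port, and model name below are assumptions that depend on your setup.

```python
# Talking to a local OpenAI-compatible endpoint (e.g. LocalAI or another
# inference engine with API emulation) via the openai package >= 1.0.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local server emulating the OpenAI API
    api_key="not-needed",                 # most local servers ignore the key
)

response = client.chat.completions.create(
    model="llama-2-7b-chat",  # whatever model name your server exposes
    messages=[
        {"role": "system", "content": "Reason step by step before answering."},
        {"role": "user", "content": "What is 17 * 24?"},
    ],
)
print(response.choices[0].message.content)
```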

  • txtinstruct

    📚 Datasets and models for instruction-tuning

  • LocalAI

    🤖 The free, open-source OpenAI alternative. Self-hosted, community-driven and local-first. A drop-in replacement for OpenAI that runs on consumer-grade hardware; no GPU required. Runs gguf, transformers, diffusers and many other model architectures, and can generate text, audio, video, and images, with voice-cloning capabilities.

  • ChatRWKV

    ChatRWKV is like ChatGPT but powered by the RWKV (100% RNN) language model, and it is open source.

  • Most LLMs actually do a decent job out of the box if you ask them for step-by-step instructions. Tree of Thought is one way to improve the results; Reflexion is another that can be used separately or in addition. The downside is that most models quickly run into their token limit (around 2k for most). However, the new SuperHOT models can handle up to 8k, and then there are the RWKV-Raven models: they are RNNs rather than transformers like the other LLMs and can theoretically handle infinite context lengths (but they lose "focus" after a while).
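
As a rough illustration of the Reflexion idea (a generate/critique/revise loop, not the paper's exact algorithm), something like the sketch below works with any text-in/text-out callable; the llm parameter and prompt wording are assumptions.

```python
# Reflexion-style refine loop: draft, self-critique, revise. `llm` is any
# text-in/text-out callable (e.g. a wrapper around a local chat endpoint).
from typing import Callable

def reflexion_answer(llm: Callable[[str], str], question: str, rounds: int = 2) -> str:
    answer = llm(f"Answer step by step:\n{question}")
    for _ in range(rounds):
        critique = llm(
            f"Question:\n{question}\n\nDraft answer:\n{answer}\n\n"
            "List any mistakes or missing steps in the draft. "
            "If it is already correct, reply with 'OK'."
        )
        if critique.strip().upper().startswith("OK"):
            break
        answer = llm(
            f"Question:\n{question}\n\nPrevious draft:\n{answer}\n\n"
            f"Reviewer notes:\n{critique}\n\nWrite an improved answer."
        )
    return answer
```

Note that every critique/revision round adds to the prompt, which is why the roughly 2k-token limits mentioned above bite quickly.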

  • guidance

    (Discontinued) A guidance language for controlling large language models. Moved to: https://github.com/guidance-ai/guidance (by Microsoft)

  • Took a quick look: jsonformer uses the Hugging Face transformers library, which is focused on research rather than out-of-the-box performance. I would take a look at guidance; it is inference-engine agnostic and can generate any structured output you want, not only JSON. This has the added benefit that you don't need additional software (or plugins) for every API you might want to use in the future. Here is how that would look for JSON.
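
A hedged sketch of the JSON case with the older handlebars-style API of microsoft/guidance (the project has since moved and the API has changed); the model choice, tool list, and field names are assumptions.

```python
# Constrained JSON generation with the old microsoft/guidance template syntax.
import guidance

guidance.llm = guidance.llms.Transformers("gpt2")  # any local HF model works here

program = guidance("""Generate a tool call as JSON for the user request.
Request: {{request}}
{
    "tool": "{{select 'tool' options=tools}}",
    "query": "{{gen 'query' stop='"'}}"
}""")

out = program(request="find recent papers on tree of thought",
              tools=["search", "calculator"])
print(out["tool"], out["query"])
```

Because the template only lets the model fill the marked slots, the braces, keys, and quoting come out as valid JSON regardless of which inference backend sits behind it.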

  • llama-retrieval-plugin

    LLaMa retrieval plugin script using OpenAI's retrieval plugin

  • Hmm, a viable alternative might be to set up a locally hosted search engine like SearX and then use an LLM in conjunction with the llama-retrieval-plugin; that also gives you a database that is human-readable, and the LLM can give direct links to the source.
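
A rough sketch of that search-engine-plus-LLM idea, assuming a self-hosted SearXNG instance with the JSON output format enabled in its settings; the instance URL and the llm() callable are placeholders.

```python
# Query a local SearXNG instance and stuff the top hits (with URLs) into a
# prompt so the model can cite its sources.
import requests

SEARX_URL = "http://localhost:8888/search"  # your self-hosted instance

def web_context(query: str, top_k: int = 3) -> str:
    resp = requests.get(SEARX_URL, params={"q": query, "format": "json"}, timeout=10)
    resp.raise_for_status()
    results = resp.json().get("results", [])[:top_k]
    # Keep title, snippet, and URL so the model can link back to its sources.
    return "\n\n".join(
        f"[{i+1}] {r.get('title', '')}\n{r.get('content', '')}\nSource: {r.get('url', '')}"
        for i, r in enumerate(results)
    )

def answer_with_sources(llm, question: str) -> str:
    context = web_context(question)
    prompt = (
        f"Use only the sources below and cite them by URL.\n\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
    return llm(prompt)
```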

NOTE: The number of mentions on this list indicates mentions on common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
