Is there a way to force output length smaller than x number of tokens w/o cut-off?

This page summarizes the projects mentioned and recommended in the original post on /r/LocalLLaMA

  • SillyTavern

    LLM Frontend for Power Users.

  • Oh, and for SillyTavern users, the latest version brought a useful feature called "Auto-Continue": generation automatically resumes after hitting the max new token limit, which prevents cut-off sentences. I like to use that to ensure I get the full response the model intended without having to set an extremely large max new token limit. An older SillyTavern feature, "Trim Incomplete Sentences", removes any trailing partial sentence, so at least you don't get half-finished responses that way.
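    The trimming approach is easy to replicate outside SillyTavern as a post-processing step. A minimal sketch (the function name and the punctuation set are my own assumptions, not SillyTavern's actual code) that drops a trailing partial sentence from generated text:

    ```python
    def trim_incomplete_sentences(text: str) -> str:
        """Drop any trailing partial sentence, keeping everything up to
        the last sentence-ending punctuation mark (. ! or ?)."""
        last_end = max(text.rfind(ch) for ch in ".!?")
        if last_end == -1:  # no complete sentence at all; return unchanged
            return text
        return text[:last_end + 1]

    # A response cut off mid-sentence loses only the dangling fragment:
    print(trim_incomplete_sentences("It was sunny. The cat sat on the"))
    ```

    Run after generation finishes (or is cut off by the token limit), this guarantees the visible output always ends on a complete sentence, at the cost of discarding a few trailing tokens.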

NOTE: The number of mentions on this list indicates mentions on common posts plus user suggested alternatives. Hence, a higher number means a more popular project.

