[D] Very long sequence data (books) understanding?

storium-backend

4 8 0.0 Python

Source code for the web backend for hosting story generation models in the EMNLP 2020 paper "STORIUM: A Dataset and Evaluation Platform for Human-in-the-Loop Story Generation"

I released a dataset of stories that average 19K tokens, with the longest running over a million. Our human evaluations show that relevance is the biggest factor in whether authors decide to use model-generated text in their story, making this a good platform for assessing long-document understanding and generation.
