Learning how dictionaries work

This page summarizes the projects mentioned and recommended in the original post on news.ycombinator.com

  • this-word-does-not-exist

    This Word Does Not Exist

  • There are a couple of starting points you could take. I spent a weekend hacking out a program that generates fake word/definition pairs with a transformer model trained on a dictionary: https://youtu.be/XnJ2TKAn-Vk?t=1547. If you substitute fake words for real words and have a sufficiently accurate model, you could quickly generate reasonable and novel definitions.

    There are more complete versions of this publicly available: https://github.com/turtlesoupy/this-word-does-not-exist

    > This would be amazing, for example, to run on a large corpus, generate the dictionary, and then run it again to find words that are used but not defined - not just in the original corpus but in the definitions too.

    I think this is how you would gauge the success of the model. That is, you would evaluate model accuracy on a set of held-out words whose definitions never appeared in your dictionary training set but which do appear in context in your corpus. You would have to manually annotate whether the generated definition for each held-out word was acceptable.

  • dictpress

    A stand-alone web server application for building and publishing full fledged dictionary websites and APIs for any language.
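
The held-out evaluation suggested in the this-word-does-not-exist comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from that project: the function names `split_held_out` and `acceptance_rate` are hypothetical, the tokenizer is a naive regex, and the accept/reject labels are assumed to come from a human annotator.

```python
import random
import re


def split_held_out(dictionary, corpus_text, held_out_fraction=0.1, seed=0):
    """Split a word -> definition mapping into training and held-out parts.

    Only words that occur in the corpus are eligible to be held out,
    because the model needs usage context to define an unseen word.
    (Hypothetical helper; the tokenization here is deliberately naive.)
    """
    corpus_words = set(re.findall(r"[a-z']+", corpus_text.lower()))
    eligible = sorted(w for w in dictionary if w.lower() in corpus_words)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(eligible)

    n_held_out = max(1, int(len(eligible) * held_out_fraction)) if eligible else 0
    held_out_words = set(eligible[:n_held_out])

    train = {w: d for w, d in dictionary.items() if w not in held_out_words}
    held_out = {w: dictionary[w] for w in held_out_words}
    return train, held_out


def acceptance_rate(annotations):
    """Fraction of generated definitions that a human marked acceptable.

    `annotations` maps each held-out word to True (acceptable) or False.
    """
    if not annotations:
        raise ValueError("no annotations to score")
    return sum(1 for ok in annotations.values() if ok) / len(annotations)


# Toy data, invented for illustration only.
dictionary = {
    "cat": "a small domesticated feline",
    "dog": "a domesticated canine",
    "run": "to move quickly on foot",
    "zyx": "a word that never appears in the corpus",
}
corpus = "The cat chased the dog. I run every morning."

train, held_out = split_held_out(dictionary, corpus, held_out_fraction=0.5)

# After generating definitions for the held-out words, a person labels each one.
manual_labels = {"cat": True, "dog": False}
score = acceptance_rate(manual_labels)
```

The model would be trained on `train` only, asked to define each word in `held_out` from its corpus context, and scored by the share of definitions the annotator accepts.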

