2) BERT learns a lot in its embeddings: the BERTology paper (https://arxiv.org/abs/2002.12327) provides a great in-depth look at the broader linguistic traits that BERT learns. Different layers often capture different patterns, so the embeddings aren't directly interpretable, but you can use a tool like bertviz (https://github.com/jessevig/bertviz) to explore the attention weights across layers for predetermined examples.
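The per-layer, per-head weights that bertviz visualizes are just the softmax-normalized scores from scaled dot-product attention (in Hugging Face transformers you can get them directly by loading a model with `output_attentions=True`). As a minimal, self-contained sketch of what those weights look like, here is the computation for a single head on toy data; the shapes and random inputs are illustrative, not from BERT:

```python
import numpy as np

def attention_weights(Q, K):
    """Scaled dot-product attention weights: softmax(Q K^T / sqrt(d))."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scores)
    return exp / exp.sum(axis=-1, keepdims=True)

# Toy example: 3 tokens, head dimension 4 (hypothetical sizes)
rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))  # queries, one per token
K = rng.normal(size=(3, 4))  # keys, one per token
W = attention_weights(Q, K)

print(W.shape)         # (3, 3): one row of weights per query token
print(W.sum(axis=-1))  # each row sums to 1 (a distribution over tokens)
```

Each row of `W` is what a tool like bertviz draws as lines from one token to the others; a full model produces one such matrix per head per layer.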
Related posts
- StreamingLLM: tiny tweak to KV LRU improves long conversations
- [D] Is there a tool that indicates which parts of the input prompt impact the LLM's output the most?
- Show HN: Fully client-side GPT2 prediction visualizer
- How to visualise LLMs?
- Ask HN: Can someone ELI5 Transformers and the “Attention is all we need” paper