heinsen_routing
Reference implementation of "An Algorithm for Routing Vectors in Sequences" (Heinsen, 2022) and "An Algorithm for Routing Capsules in All Domains" (Heinsen, 2019), for composing deep neural networks.
Simple and, in hindsight, obvious:
1. Run the text through a document embedding model and save the embedding.
2. Remove one token at a time, and compute the cosine similarity of the new document embedding to the original one.
3. Compute importance as a function of the change in cosine similarity.
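Here is a rough sketch of those three steps in plain Python. Note that `embed` is just a toy hashed bag-of-tokens stand-in I made up so the snippet runs on its own; in practice you would swap in a real document embedding model. The importance function (1 minus cosine similarity) is also only one possible choice, and the names `embed`, `cosine`, and `token_importance` are hypothetical.

```python
import hashlib
import math


def embed(tokens, dim=64):
    """Toy stand-in for a document embedding model (an assumption, not a
    real model): a hashed bag-of-tokens count vector of length `dim`."""
    vec = [0.0] * dim
    for tok in tokens:
        digest = hashlib.md5(tok.lower().encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "big") % dim
        vec[idx] += 1.0
    return vec


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def token_importance(tokens, embed_fn=embed):
    """Leave-one-out importance: one score per token."""
    # Step 1: embed the full text and keep the embedding.
    base = embed_fn(tokens)
    scores = []
    for i in range(len(tokens)):
        # Step 2: remove one token and compare the new embedding to the original.
        ablated = tokens[:i] + tokens[i + 1:]
        similarity = cosine(base, embed_fn(ablated))
        # Step 3: importance as a function of the change in cosine similarity
        # (here simply 1 - similarity; other functions would work too).
        scores.append(1.0 - similarity)
    return scores


tokens = "the cat sat on the mat".split()
scores = token_importance(tokens)
for tok, score in zip(tokens, scores):
    print(f"{tok:>5s}  {score:.4f}")
```

This calls the embedding function once per token plus once for the full text, so the cost grows linearly with the number of tokens, on top of whatever each embedding call costs.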
Nice.
Also check out https://github.com/glassroom/heinsen_routing. It takes n embeddings and outputs m embeddings, and also gives you an n×m matrix of credit assignments, so you don't have to remove tokens one by one, which can be prohibitively slow for long texts.