- wit
WIT (Wikipedia-based Image Text) Dataset is a large multimodal multilingual dataset comprising 37M+ image-text sets with 11M+ unique images across 100+ languages. (by google-research-datasets)
There's the Wikipedia Image Text dataset, which covers many languages (including English and Simple English) and also has a TF Datasets wrapper: https://github.com/google-research-datasets/wit
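If you only want one language out of the set, something like the following might be a starting point. It is a rough sketch, not the project's own loader: it assumes the data is available as tab-separated text with a `language` column, and the column names and the two sample rows are made up for illustration. Check the repository for the real schema before relying on any of them.

```python
import csv
import io

# Hypothetical two-row sample in a WIT-style TSV layout. The column names and
# values are assumptions for illustration, not copied from the actual release.
SAMPLE_TSV = (
    "language\tpage_title\timage_url\tcaption_reference_description\n"
    "en\tExample page\thttps://example.org/a.jpg\tAn example caption\n"
    "fr\tPage d'exemple\thttps://example.org/a.jpg\tUne légende d'exemple\n"
)


def rows_for_language(tsv_text, language):
    """Return the rows (as dicts) whose 'language' column equals `language`."""
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters as ordinary text
    )
    return [row for row in reader if row["language"] == language]


english_rows = rows_for_language(SAMPLE_TSV, "en")
print(len(english_rows), english_rows[0]["page_title"])
```

For the real files you would pass an open file handle to `csv.DictReader` instead of an in-memory string and iterate lazily, since the full dataset is far too large to hold in a list.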
NOTE: The number of mentions on this list counts mentions in common posts plus user-suggested alternatives. Hence, a higher number means a more popular project.
Related posts
- AI enthusiasm #9 - A multilingual chatbot📣🈸
- What contributing to Open-source is, and what it isn't
- Show HN: Next-token prediction in JavaScript – build fast LLMs from scratch
- PullRequestBenchmark Challenge: Can AI Replace Your Dev Team?
- PRBenchmark – Expert PR Review Capabilities Equals Expert PR Creation Capability