tatoeba2
Frontpage
Metric | tatoeba2 | Frontpage |
---|---|---|
Mentions | 47 | 455 |
Stars | 669 | 48 |
Growth | 1.5% | - |
Activity | 0.0 | 4.3 |
Last commit | 1 day ago | 5 months ago |
Language | PHP | PHP |
License | GNU Affero General Public License v3.0 | GNU General Public License v3.0 only |
Stars - the number of stars that a project has on GitHub. Growth - month over month growth in stars.
Activity is a relative number indicating how actively a project is being developed. Recent commits have higher weight than older ones.
For example, an activity of 9.0 indicates that a project is amongst the top 10% of the most actively developed projects that we are tracking.
tatoeba2
-
The AI Revolution Is Crushing Thousands of Languages
Alternate take: it can also help people learn niche languages if native speakers contribute to data sets. For example, I've been using Clozemaster for the past few months as a way to work on vocabulary in some languages, and they pull their dataset from Tatoeba [1]. I was very surprised to see that my father's native language, Kabyle, which is admittedly a somewhat niche language, is one of the top languages by sentence contribution in the dataset (over 700k entries, more than French or Spanish or German). I showed him the sentences once and he confirmed that yes, they all seem like what a native speaker would say. Not all of them have translations into other languages of course, and a lot of them are slight variations on each other, but some native speakers are there contributing. It's not currently an option to use in Clozemaster -- I'm guessing the TTS isn't really there -- but I totally could see these as gaps that are easily filled.
Same with my wife's native language (Bengali). There are surprisingly few language learning resources for Bangla, even though it's the 7th most spoken language in the world. But there it is in the data set with TTS and the ability for Clozemaster to have ChatGPT "explain" what's going on in the sentence (a very useful feature for new speakers).
Anyway, I don't view AI as good or bad, just another tool that we should be intentional about when we cultivate the data sets underlying the tool.
[1] https://tatoeba.org
-
Where can I find reliable example sentences?
Maybe on tatoeba.org with filters
-
Best vocab (not writing) app
I use both. I make a lot of my own cards so I get to focus on the vocab I want. Generally I find a word I want to learn, use https://forvo.com/ to find native audio for it, then use https://tatoeba.org/ to find sentences that use that word. Once you get a bit of practise it's pretty quick to make a word note, then make 2 or 3 sentence notes for it. However I do use some pre-made decks like this set of sentence decks for each HSK level with native audio: https://ankiweb.net/shared/byauthor/933449107
-
Anyone else spend heaps of time searching for sentences for Anki?
You can try Tatoeba https://tatoeba.org but I don't know if it's good with Arabic ...
-
GPT-4's toki pona capabilities
Tatoeba, if anything, because that has sentences, so at least a modicum of context.
-
Is there an app or website where I can paste a word/phrase and get examples of how it’s used in a sentence?
I use Tatoeba https://tatoeba.org : it's a collection of phrases, sometimes with translations and audio recordings. You could use Forvo but it's only audio recordings.
- Ways to say "no pasa nada / it's okay/all right" in BR-PT?
- How do I get audio data from native speakers for Anki?
-
anyone know a site like Reverso but for simpler sentences?
As someone else suggested, Tatoeba is also a good option. Nowadays, I use it less and less because I prefer the more didactic sentences found on online dictionaries. Nonetheless, it's still very good, especially due to the sheer quantity of sentences you can find there.
-
Nihongo Lessons has launched on the App Store
Appearances in the Tatoeba example sentence database.
Frontpage
-
Open source at Fastly is getting opener
Through the Fast Forward program, we give free services and support to open source projects and the nonprofits that support them. We support many of the world’s top programming languages (like Python, Rust, Ruby, and the wonderful Scratch), foundational technologies (cURL, the Linux kernel, Kubernetes, OpenStreetMap), and projects that make the internet better and more fun for everyone (Inkscape, Mastodon, Electronic Frontier Foundation, Terms of Service; Didn’t Read).
-
Dear writers: Delete your Findaway Voices account NOW
Terms of service are generally pretty shitty, yes. But this is egregiously shitty.
https://tosdr.org/ is a good site to compare. Any service over Grade E (Spotify, Facebook, the usual suspects) is (very likely to be) less bad. DeviantArt for example is a D, and doesn't include waiving your moral rights among some of the other overreach.
Some service terms are actually quite good (DuckDuckGo, Mullvad, off the top of my head). Though these aren't content sharing platforms so it's not really as fair of a comparison.
- Meta’s new AI image generator was trained on 1.1 billion Instagram and Facebook photos
-
what is something humans were never meant to see?
This is super useful https://tosdr.org/
- I created a free tool that explains privacy policies to users.
-
State of Online Privacy Reaches 'Creepy' Level
> Meaningful consent is becoming increasingly difficult for consumers; for instance ...
https://tosdr.org is good for that. Why doesn't Mozilla just contribute to an existing project?
-
[READ BODY TEXT BEFORE VOTING] Thoughts regarding online tracking?
I can't give you a complete guide here, but I recommend you go to privacy subreddits or watch relevant YouTube videos for more info. I also recommend sites like privacytools.io and privacyguides.org. They contain lists of alternatives and tools. Also check out tosdr.org, which contains summaries of the TOS of a ton of sites. Also try email aliases like simplelogin or anonaddy. Use burner emails for throwaways if possible, e.g. emailnator.com or tempail.com. Try to use as many open-source applications as possible. You can even self-host certain things.
-
Unity Silently Deletes GitHub Repo That Tracks Terms of Service Changes
I think what you're looking for is TOSDR (Terms of Service, Didn't Read): https://tosdr.org
It's been going for several years and has very thorough analysis of various ToS, done by volunteers who are often legal professionals.
- Ask HN: Why did Microsoft, Meta, and PayPal update their ToS today?
- Ask HN: What is behind the recent wave of Terms of Service changes?
What are some alternatives?
gutensearch - Search engine for Project Gutenberg books
privacyguides.org - Protect your data against global mass surveillance programs.
river-runner - Uses USGS/MERIT Basin data to visualize the path of a rain droplet to its endpoint.
Windows11DragAndDropToTaskbarFix - "Windows 11 Drag & Drop to the Taskbar (Fix)" fixes the missing "Drag & Drop to the Taskbar" support in Windows 11. It works with the new Windows 11 taskbar and does not require nasty changes like UndockingDisabled or restoration of the classic taskbar.
FrequencyWords - Repository for Frequency Word List Generator and processed files
duckduckgo-locales - Translation files for https://duckduckgo.com
rum - Simple, decomplected, isomorphic HTML UI library for Clojure and ClojureScript
Hacker-Typer - Hacker Typer is a fun joke for every person who wants to look like a cool hacker!
savepagenow - A simple Python wrapper and command-line interface for archive.org’s "Save Page Now" capturing service
TermuxBlack - Termux repository for hacking tools and packages
stylegan2-pytorch - Simplest working implementation of Stylegan2, state of the art generative adversarial network, in Pytorch. Enabling everyone to experience disentanglement
CyberChef - The Cyber Swiss Army Knife - a web app for encryption, encoding, compression and data analysis