-
hiertext
The HierText dataset contains ~12k images from the Open Images dataset v6 with a large number of text entities. We provide word-, line-, and paragraph-level annotations.
-
common-voice
Common Voice is part of Mozilla's initiative to help teach machines how real people speak.
I was expecting to be pessimistic, but Google actually releases the datasets under a permissive license (CC BY 4.0). Awesome!
https://research.google/tools/datasets/open-images-extended-...
https://github.com/google-research-datasets/hiertext
One of the tasks in this app is audio validation:
> Audio validation: Listen to a short audio clip and determine if the pronunciation sounds natural in your language.
If this is something you're interested in doing, I recommend contributing to Mozilla's Common Voice instead. Common Voice builds freely licensed (CC0) voice datasets that can be used by everyone, not just Google.
https://commonvoice.mozilla.org