stylegan2-pytorch
Simplest working implementation of Stylegan2, state of the art generative adversarial network, in Pytorch. Enabling everyone to experience disentanglement
I'm a software engineer by profession and occasionally have reason to play with so-called Machine Learning (ML). I think the best showcase of what's possible nowadays is the "This X Does Not Exist" fashion for generating permutations of arbitrary categories, say human faces or even English words. Imagine a word that seemingly possesses all the natural characteristics of a word but doesn't actually exist, for example: trichurid. ML can produce an unlimited number of these.
It would seem ML is well placed to handle the world of phonemes (fundamental categories) and phones (permutations on those categories). Indeed, Facebook has a mature project and a set of pre-trained ML models for something similar, if not identical: wav2vec (v2.0). If it's not identical, then I think the difference would be trivial to bridge. Wav2vec is trained to map the spoken word of a language to that language's particular writing system; see here for a specific example. However, we already have plenty of software that can convert writing systems to IPA. Whilst all that does connect a lot of dots, it's not exactly what I think the goal should be.
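To make the "connect the dots" idea concrete, here is a minimal sketch of the second step only: turning a written transcript into IPA by dictionary lookup. Everything in it is an assumption for illustration: the `TOY_LEXICON` entries, the `transcript_to_ipa` helper, and the hard-coded transcript standing in for whatever a speech-to-text model such as wav2vec would output. Real converters use far larger lexicons and rules.

```python
# Hypothetical toy lexicon: a few English words with broad IPA transcriptions.
# These entries are illustrative assumptions, not taken from any real tool.
TOY_LEXICON = {
    "the": "ðə",
    "cat": "kæt",
    "sat": "sæt",
}


def transcript_to_ipa(transcript, lexicon):
    """Convert a written transcript to IPA word by word.

    Words missing from the lexicon are kept and wrapped in <...>
    so that gaps stay visible instead of being silently dropped.
    """
    converted = []
    for word in transcript.lower().split():
        converted.append(lexicon.get(word, "<" + word + ">"))
    return " ".join(converted)


# Stand-in for the text a speech-to-text model would produce from audio.
transcript = "The cat sat"
print(transcript_to_ipa(transcript, TOY_LEXICON))
```

Note that a lookup like this only recovers a dictionary pronunciation for each written word, not the sounds the speaker actually produced, which is one way the chained approach can fall short of a direct speech-to-IPA mapping.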