- pixel2style2pixel: Official implementation of "Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation" (CVPR 2021), presenting the pixel2style2pixel (pSp) framework.
I did something like this. Many image GAN papers have implementations on GitHub, so just pick the model you want. State-of-the-art image translation is probably something like pixel2style2pixel (https://github.com/eladrich/pixel2style2pixel). Note that there are also GANs that generate raw waveforms directly (e.g. WaveGAN), and they may have slightly better audio on average. With image models, people typically feed in mel spectrograms, which discard the phase information (you could also feed in 2-channel images holding the real and imaginary parts of the STFT, but I haven't seen any projects that do that). `librosa` has functions for the short-time Fourier transform and for inverting a magnitude spectrogram by estimating the missing phase (the Griffin-Lim algorithm), but if you want high-quality reconstructions, try a neural network solution like WaveGlow for the inverse conversion (if you're training a GAN, you can fine-tune WaveGlow). The biggest bottleneck is data: get as much of it as you can. Also check out /r/machinelearning.
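To make the phase point concrete, here's a rough NumPy-only sketch. Everything in it is my own assumption for illustration: the toy `stft`/`istft` helpers (not librosa's implementation), the frame size and hop, and the two test tones. It builds the 2-channel real/imaginary "image" mentioned above and compares inverting it against inverting the magnitude alone with the phase set to zero.

```python
import numpy as np

N_FFT = 256  # frame length in samples (arbitrary choice for this sketch)
HOP = 64     # hop between frames, i.e. 75% overlap


def stft(signal, n_fft=N_FFT, hop=HOP):
    """Hann-windowed STFT, returned with shape (frequency bins, frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1).T


def istft(spec, n_fft=N_FFT, hop=HOP):
    """Inverse STFT: windowed overlap-add, normalised by the summed squared window."""
    window = np.hanning(n_fft)
    frames = np.fft.irfft(spec.T, n=n_fft, axis=1)
    length = n_fft + hop * (len(frames) - 1)
    out = np.zeros(length)
    norm = np.zeros(length)
    for i, frame in enumerate(frames):
        start = i * hop
        out[start : start + n_fft] += frame * window
        norm[start : start + n_fft] += window ** 2
    return out / np.maximum(norm, 1e-8)  # guard the zero-weight edge samples


# One second of two test tones at an assumed 8 kHz sample rate.
sr = 8000
t = np.arange(sr) / sr
y = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

spec = stft(y)

# 2-channel "image": real part in channel 0, imaginary part in channel 1.
image = np.stack([spec.real, spec.imag])

# Magnitude-only "image": what a (mel) spectrogram is built from; phase is gone.
magnitude = np.abs(spec)

y_two_channel = istft(image[0] + 1j * image[1])  # phase kept
y_zero_phase = istft(magnitude)                  # phase naively set to zero

# Compare away from the signal edges, where the window weights are tiny.
interior = slice(N_FFT, -N_FFT)
err_two_channel = np.max(np.abs(y[interior] - y_two_channel[interior]))
err_magnitude_only = np.max(np.abs(y[interior] - y_zero_phase[interior]))

print("2-channel image shape:", image.shape)
print("max error, real+imag channels:", err_two_channel)
print("max error, magnitude with zero phase:", err_magnitude_only)
```

The zero-phase inversion is only a strawman to show what gets lost. In practice you'd estimate the phase with something like `librosa.griffinlim` or a neural vocoder, as described above. Also note this uses a linear-frequency magnitude; a mel spectrogram additionally loses frequency resolution on top of the phase.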