Video is already possible, as shown by e.g. SALT_VERSE [1] (which may or may not use SD/DALL-E/..., but that is not that interesting). What is not there yet is a --txt2vid script option. Implementing this would not be that hard, but the processing time needed is quite substantial.
Something else that would be possible is using a model like SD in combination with a frame interpolation model like FILM [2] as a video generator: use SD to generate key frames, feed these to FILM, let it generate the intermediate frames, and you should get video.
[1] https://twitter.com/SALT_VERSE
[2] https://film-net.github.io/
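The key-frame-plus-interpolation idea above could be sketched roughly as follows. This is only an illustration of the pipeline shape, not working SD or FILM code: frames are plain lists of floats, the names `Frame`, `crossfade` and `build_video` are made up for this sketch, and the linear cross-fade is a placeholder for where a learned interpolator such as FILM would be plugged in.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in for an image: a flat list of pixel values.
Frame = List[float]


def crossfade(a: Frame, b: Frame, t: float) -> Frame:
    """Linear blend between two frames.

    Placeholder only: a real setup would call a frame interpolation
    model (e.g. FILM) here instead of blending pixels.
    """
    return [(1.0 - t) * x + t * y for x, y in zip(a, b)]


def build_video(
    key_frames: Sequence[Frame],
    n_between: int,
    interpolate: Callable[[Frame, Frame, float], Frame] = crossfade,
) -> List[Frame]:
    """Insert n_between interpolated frames between consecutive key frames."""
    if not key_frames:
        return []
    video: List[Frame] = []
    for a, b in zip(key_frames, key_frames[1:]):
        video.append(list(a))
        for i in range(1, n_between + 1):
            t = i / (n_between + 1)  # evenly spaced positions strictly between a and b
            video.append(interpolate(a, b, t))
    video.append(list(key_frames[-1]))
    return video


# In the real pipeline these key frames would come from SD, one per prompt or seed.
keys = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
video = build_video(keys, n_between=3)
print(len(video))
```

Passing the interpolator in as a function keeps the loop independent of the model, so the placeholder could be swapped for a FILM wrapper without touching the rest. Note that the expensive part, generating each key frame with SD, is outside this sketch.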
The pieces are coming into place: https://github.com/microsoft/VideoX/tree/master/X-CLIP