Riffusion

Developer(s)	Seth Forsgren; Hayk Martiros
Initial release	December 15, 2022
Repository	github.com/hmartiro/riffusion-inference
Written in	Python
Type	Text-to-image model
License	MIT License
Website	riffusion.com

Riffusion is a neural network, designed by Seth Forsgren and Hayk Martiros, that generates music by producing images of sound rather than audio directly.^[1] It was created by fine-tuning Stable Diffusion, an existing open-source model for generating images from text prompts, on spectrograms.^[1] The resulting model takes a text prompt and generates an image file, which can be put through an inverse Fourier transform and converted into an audio file.^[2] Although these clips are only several seconds long, the model can also interpolate between outputs in latent space to blend different clips together.^[1]^[3] This is accomplished using a feature of Stable Diffusion known as img2img.^[4]
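
The interpolation step can be illustrated with a minimal Python sketch. It assumes spherical linear interpolation (slerp), a common choice for blending diffusion latents; the cited sources do not specify which scheme Riffusion uses, and the function name `slerp` and the plain-list "latents" are illustrative only.

```python
import math


def slerp(t, v0, v1, eps=1e-7):
    """Spherically interpolate between two equal-length vectors.

    t = 0 returns v0 and t = 1 returns v1. This is an assumed
    interpolation scheme, not one confirmed by the article's sources.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 == 0.0 or norm1 == 0.0:
        # The angle is undefined for a zero vector: blend linearly.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Angle between the two vectors, clamped against rounding error.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    theta = math.acos(max(-1.0, min(1.0, dot)))

    if abs(math.sin(theta)) < eps:
        # Nearly parallel or antiparallel: fall back to linear blending.
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


# Blend two toy "latents" halfway: approximately [0.707, 0.707].
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
```

Unlike a straight linear blend, the halfway point here keeps the same length as the two endpoints, which is the usual motivation for preferring slerp when moving between latent vectors.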

Generated spectrogram from the prompt "bossa nova with electric guitar" (top), and the resulting audio after conversion (bottom)
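
The conversion from image to sound can be sketched for a single spectrogram column. The example below is a simplified illustration, not Riffusion's actual pipeline: it uses only the Python standard library, the function name `column_to_samples` is hypothetical, and it assumes zero phase for every frequency bin. A spectrogram image stores only magnitudes, so a practical converter must also estimate phase (for example with an iterative method such as Griffin-Lim) and overlap-add many columns.

```python
import cmath
import math


def column_to_samples(magnitudes, phases=None):
    """Inverse DFT of one spectrogram column.

    `magnitudes` holds bins 0..N/2 of a real signal's spectrum and the
    result has N samples. Phases default to zero, which is an assumption
    made only to keep the sketch short.
    """
    half = len(magnitudes) - 1
    n = 2 * half
    if phases is None:
        phases = [0.0] * len(magnitudes)

    # Rebuild the full spectrum: a real signal has a conjugate-symmetric one.
    positive = [cmath.rect(m, p) for m, p in zip(magnitudes, phases)]
    mirrored = [positive[k].conjugate() for k in range(half - 1, 0, -1)]
    spectrum = positive + mirrored

    # Naive O(N^2) inverse discrete Fourier transform.
    samples = []
    for t in range(n):
        acc = sum(
            spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
            for k in range(n)
        )
        samples.append(acc.real / n)
    return samples


# A column with energy only in bin 2 becomes a cosine that completes
# two cycles over the eight output samples.
tone = column_to_samples([0.0, 0.0, 4.0, 0.0, 0.0])
```

The naive transform is quadratic in the number of samples; a real implementation would use a fast Fourier transform from a numerical library instead.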

The resulting music has been described as "de otro mundo" (otherworldly), although it is considered unlikely to replace human-made music.^[5] The model was made available on December 15, 2022, with the code also freely available on GitHub.^[2] It is one of many models derived from Stable Diffusion.^[4]

Riffusion belongs to a class of AI text-to-music generators. In December 2022, Mubert^[6] similarly used Stable Diffusion to turn descriptive text into music loops. In January 2023, Google published a paper on its own text-to-music generator, MusicLM.^[7]^[8]

References

1. Coldewey, Devin (December 15, 2022). "Try 'Riffusion,' an AI model that composes music by visualizing it". TechCrunch.
2. Nasi, Michele (December 15, 2022). "Riffusion: creare tracce audio con l'intelligenza artificiale". IlSoftware.it.
3. "Essayez "Riffusion", un modèle d'IA qui compose de la musique en la visualisant". December 15, 2022.
4. "文章に沿った楽曲を自動生成してくれるAI「Riffusion」登場、画像生成AI「Stable Diffusion」ベースで誰でも自由に利用可能". GIGAZINE.
5. Llano, Eutropio (December 15, 2022). "El generador de imágenes AI también puede producir música (con resultados de otro mundo)".
6. "Mubert launches Text-to-Music interface – a completely new way to generate music from a single text prompt". December 21, 2022.
7. "MusicLM: Generating Music From Text". January 26, 2023.
8. "5 Reasons Google's MusicLM AI Text-to-Music App is Different". January 27, 2023.



