January 22, 2024

How to: Audio Visualisation with FFmpeg

Learn how to create audio visualizations for your music videos in a few simple steps, courtesy of one of Splash's software engineers, Mark Daunt.

Many of you have probably seen some cool examples of audio visualization on YouTube.

There are services that can produce really nice visualizations for you at a cost, for example https://wavve.co/. Another popular approach is to use Adobe After Effects: https://blog.motionisland.com/create-an-after-effects-audio-spectrum-visualizer/

There are also plenty of code examples if you are just looking to show a visualization on a web page or in an app (https://github.com/willianjusten/awesome-audio-visualization).

The audio visualizations on the videos we generate for Splash Pro use FFmpeg and a few other techniques. In this article I’ll show you how to create a cool audio visualization video from any audio file with just a few FFmpeg commands.

Setting up FFmpeg

FFmpeg is an extremely versatile and powerful command-line multimedia framework for audio and video conversion and manipulation tasks.

1. Installing FFmpeg:

The first step is to download and install FFmpeg on your computer. FFmpeg is available for various operating systems including Windows, macOS, and Linux. Visit the official FFmpeg website (https://ffmpeg.org) and follow the instructions for your specific platform to download the latest release. Depending on your platform, this means running an installer, installing through a package manager, or unpacking an archive of prebuilt binaries.

2. Configuring FFmpeg:

Once installed, FFmpeg requires some additional configuration. On Windows, you need to add the FFmpeg directory to your system's PATH variable. To do this, right-click on "This PC" or "My Computer," select "Properties," click on "Advanced system settings," and then press the "Environment Variables" button. Under the "System variables" section, locate the "Path" variable, click "Edit," and add the path to the FFmpeg bin directory.
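
A quick way to confirm the PATH setup worked is to ask the shell where it finds each tool. The sketch below assumes a POSIX-style shell (on Windows, something like Git Bash or WSL); check_tool is a helper name made up for this example, not part of FFmpeg.

```shell
# Report whether a command-line tool can be found on the PATH.
# check_tool is a helper name made up for this example.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: missing (check your PATH)"
  fi
}

check_tool ffmpeg
check_tool ffprobe
```

If both tools are found, running ffmpeg -version should print details of the installed build.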

The Simplest Music Video

1. Find a song and background image

The simplest way to get a video from an audio file is to use a static background image and use FFmpeg to combine the audio with the image.

If you want some audio to start with, you can get Splash Pro’s AI to generate a song for you with a simple text prompt: https://pro.splashmusic.com/

Alternatively, you can simply use any audio file you might have, or use this one that I've prepared for you:

As for a background image, you can grab one of your choice or try the one below (save it as background.png):

2. Find out the duration of your audio file

Now you need to know the duration of the audio file. You can use ffprobe for this; it should have been installed alongside FFmpeg:

ffprobe -v error -show_entries format=duration song.mp3

This should give you something like this (ffprobe wraps the line in [FORMAT] and [/FORMAT] markers):

duration=16.039184

Now you can use this duration to produce the video:

ffmpeg -loop 1 -i background.png -i song.mp3 -c:v libx264 -t 16.039184 -pix_fmt yuv420p basic_video.mp4
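
The two steps above can also be chained so the duration never has to be copied by hand. This is a minimal sketch, assuming song.mp3 and background.png sit in the current directory; the extra -of default=noprint_wrappers=1:nokey=1 option asks ffprobe to print only the bare number, and get_duration is a helper name made up for this example.

```shell
# Print only the duration (in seconds) of a media file.
# get_duration is a helper name made up for this example.
get_duration() {
  ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1"
}

# Only run when ffmpeg and the assumed input files are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f song.mp3 ] && [ -f background.png ]; then
  duration=$(get_duration song.mp3)
  ffmpeg -loop 1 -i background.png -i song.mp3 -c:v libx264 \
    -t "$duration" -pix_fmt yuv420p basic_video.mp4
else
  echo "ffmpeg, song.mp3 or background.png not found; skipping"
fi
```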

Here’s an example of what you should now have:

Simple Audio Visualizer with FFmpeg:

Given an audio file, there is a pretty simple command that you can use to get started on a basic visualiser video.

FFmpeg has a built-in filter, showwaves, that creates a waveform from a given audio track. Download the sample song and put it in a known location on your computer, or copy your own file there, and then try this command:

ffmpeg -i song.mp3 -filter_complex "aformat=channel_layouts=mono,showwaves=mode=cline:s=904X904:colors=White[v]" -map "[v]" -pix_fmt yuv420p song_viz.mp4 

Then open the resulting file, song_viz.mp4 and you should see something like this:

A few notes about the command we just used: 

-i song.mp3: Specifies the path to the input audio file.

-filter_complex "aformat=channel_layouts=mono,showwaves=mode=cline:s=904X904:colors=White[v]": This defines a complex filter that processes the audio and generates a visual representation of the waveform. Here, aformat=channel_layouts=mono sets the audio to mono and showwaves=mode=cline:s=904X904:colors=White creates a waveform visualization with specific size and color settings.

-map "[v]": Selects the output from the complex filter defined in the previous argument and maps it as the video stream for the output file.

-pix_fmt yuv420p: Sets the pixel format of the output video to yuv420p. This format is commonly used for compatibility with various devices and players.

Finally, song_viz.mp4 is the output path. You can read a full discussion of the various options for generating waveforms with FFmpeg here: https://trac.ffmpeg.org/wiki/Waveform
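
To try other looks, it can help to build the filter string from a few variables. The sketch below is one way to do that; waves_filter is a helper name made up for this example, and the p2p mode, 1280x720 size, Cyan color and song_viz_p2p.mp4 output name are arbitrary choices for illustration. The showwaves filter also accepts the point and line modes.

```shell
# Build a showwaves filter string from a mode, a frame size and a color.
# waves_filter is a helper name made up for this example.
waves_filter() {
  mode=${1:-cline}
  size=${2:-904x904}
  color=${3:-White}
  echo "aformat=channel_layouts=mono,showwaves=mode=${mode}:s=${size}:colors=${color}[v]"
}

# Try a different look: connected points in cyan on a 1280x720 frame.
filter=$(waves_filter p2p 1280x720 Cyan)
echo "$filter"

# Only run when ffmpeg and the assumed input file are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f song.mp3 ]; then
  ffmpeg -i song.mp3 -filter_complex "$filter" -map "[v]" \
    -pix_fmt yuv420p song_viz_p2p.mp4
fi
```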

Now, you might have noticed that the audio from the original file has not been included in the mp4 video. You could modify the FFmpeg command to include it, but we already have our basic video with the background image and audio combined.
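
For reference, carrying the audio into the waveform video directly could look roughly like the sketch below. It labels the input audio explicitly as [0:a] and adds a second -map for the audio stream; the output name song_viz_audio.mp4 is an assumption made for this example. The rest of the article sticks with the overlay approach.

```shell
# Same waveform filter as before, with the input audio labelled explicitly.
filter="[0:a]aformat=channel_layouts=mono,showwaves=mode=cline:s=904X904:colors=White[v]"
# Output name chosen for this example only.
out="song_viz_audio.mp4"

# Only run when ffmpeg and the assumed input file are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f song.mp3 ]; then
  # -map "[v]" takes the generated video, -map 0:a takes the original audio.
  ffmpeg -i song.mp3 -filter_complex "$filter" \
    -map "[v]" -map 0:a -pix_fmt yuv420p "$out"
else
  echo "ffmpeg or song.mp3 not found; skipping"
fi
```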

So let’s just take it one step further and overlay the visualization video on top of the basic video we created earlier. The following command does that and keeps the original audio from the basic video:

ffmpeg -i basic_video.mp4 -i song_viz.mp4 -filter_complex "[1:v]colorkey=Black:0.4:0.5[ckout];[0:v][ckout]overlay[outv]" -map "[outv]" -map "0:a" -pix_fmt yuv420p basic_viz_video.mp4

In this command:

-i basic_video.mp4: Specifies the video file generated in the previous step as the primary input.

-i song_viz.mp4: Specifies the overlay video.

-filter_complex - In this context, it applies the colorkey filter to the second input and then overlays the result onto the first input. The colorkey filter takes three values:

- Black: This is the color to make transparent. You can also specify colors in hex format like 0xFFFFFF.

- 0.4: This is the similarity (0 to 1.0). A lower value means only colors very close to the specified color will be made transparent.

- 0.5: This is the blend (0 to 1.0), a factor for smoothing the edges. Play around with the similarity and blend values to fine-tune the result.

[ckout] - This outputs the result to a link named `[ckout]` for use in the next filterchain.

[0:v][ckout]overlay[outv] - This is the second filterchain inside the `filter_complex` option (filterchains are separated by semicolons):

[0:v][ckout] - This specifies the first input's video stream and the output of the previous filterchain as inputs.

overlay - This applies the overlay filter, which overlays the second input on the first input.

[outv] - This outputs the result to a link named [outv].

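-map "[outv]" - This tells ffmpeg to include the video stream linked to [outv] in the output file.

-map "0:a" - This includes the audio stream from the first input (basic_video.mp4) in the output file.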
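
When experimenting with the keying values, it can help to build the filter string from variables. The sketch below does that and also centers the overlay using the overlay filter's W, H, w and h variables (main and overlay width and height). The centering is an addition that the command above does not use. key_overlay_filter and the output name basic_viz_centered.mp4 are made up for this example.

```shell
# Build a colorkey + overlay filter string from a key color, similarity and blend.
# key_overlay_filter is a helper name made up for this example.
key_overlay_filter() {
  color=${1:-Black}
  similarity=${2:-0.4}
  blend=${3:-0.5}
  echo "[1:v]colorkey=${color}:${similarity}:${blend}[ckout];[0:v][ckout]overlay=(W-w)/2:(H-h)/2[outv]"
}

# Try a tighter key with less edge smoothing.
filter=$(key_overlay_filter Black 0.3 0.2)
echo "$filter"

# Only run when ffmpeg and the videos from the earlier steps are present.
if command -v ffmpeg >/dev/null 2>&1 && [ -f basic_video.mp4 ] && [ -f song_viz.mp4 ]; then
  ffmpeg -i basic_video.mp4 -i song_viz.mp4 -filter_complex "$filter" \
    -map "[outv]" -map "0:a" -pix_fmt yuv420p basic_viz_centered.mp4
fi
```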

The result should look something like this:

Circular Audio Visualizer with FFmpeg:

Now that’s a good start. But let's take it up a notch and try something a bit more clever. Let’s wrap that video in a circular pattern, like you see in so many examples. Using the output from the previous command, it only takes one more command to wrap the previous visualisation video in a circle:

ffmpeg -i song_viz.mp4 -filter_complex "format=rgba,geq='p(mod((2*W/(2*PI))*(PI+atan2(0.5*H-Y,X-W/2)),W), H-2*hypot(0.5*H-Y,X-W/2))'" -pix_fmt yuv420p song_viz_circle.mp4

The following “geq” filter does all the magic:

-filter_complex "format=rgba,geq='p(mod((2*W/(2*PI))*(PI+atan2(0.5*H-Y,X-W/2)),W), H-2*hypot(0.5*H-Y,X-W/2))'"

This effectively wraps the original video in a semicircle and repeats the pattern to form a complete circle. The geq filter lets you compute each output pixel from an expression, so you can map pixels from the original video to new locations in the new video. In this case we use trigonometric calculations to map the points. The result is a circular video on a black background. Because the output is encoded as yuv420p, it has no alpha (transparency) channel, so the black background is keyed out with colorkey when this video is overlaid onto another, letting the background video show through.
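
Written out as a formula, the geq expression reads roughly as follows. For each output pixel (X, Y) in a W by H frame, it looks up the source pixel at (x_src, y_src). The symbols θ, r, x_src and y_src are notation introduced here for illustration; they do not appear in the command itself.

```latex
\begin{aligned}
\theta &= \pi + \operatorname{atan2}\!\left(\tfrac{H}{2} - Y,\; X - \tfrac{W}{2}\right),
  \qquad \theta \in (0, 2\pi] \\
r &= \sqrt{\left(\tfrac{H}{2} - Y\right)^{2} + \left(X - \tfrac{W}{2}\right)^{2}} \\
x_{\text{src}} &= \left(\frac{2W}{2\pi}\,\theta\right) \bmod W
  = \left(\frac{W}{\pi}\,\theta\right) \bmod W \\
y_{\text{src}} &= H - 2r
\end{aligned}
```

As θ sweeps one full turn, (W/π)θ runs from 0 to 2W, so the mod W makes the source width play out twice, once per semicircle. For the radius, r = 0 (the center of the frame) gives y_src = H, the bottom of the source, and r = H/2 gives y_src = 0, the top of the source. The bottom edge of the waveform video therefore collapses to the center and the top edge lands on a circle of radius H/2.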

Opening the output in a video app, you should see something like this:

Official documentation on the geq filter can be found here: https://ffmpeg.org/ffmpeg-filters.html#geq

Almost there...

Now the last thing we need to do is overlay the circular visualisation onto our original basic video in the same way as before, but using song_viz_circle.mp4 as the overlay:

ffmpeg -i basic_video.mp4 -i song_viz_circle.mp4 -filter_complex "[1:v]colorkey=Black:0.4:0.5[ckout];[0:v][ckout]overlay[outv]" -map "[outv]" -map "0:a" -pix_fmt yuv420p final_viz_video.mp4
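
As an optional sanity check, ffprobe can list the stream types in the finished file to confirm the audio made it through. This is a minimal sketch, assuming final_viz_video.mp4 is in the current directory; stream_types and has_audio are helper names made up for this example.

```shell
# List the stream types in a media file, one per line (e.g. "video", "audio").
# stream_types and has_audio are helper names made up for this example.
stream_types() {
  ffprobe -v error -show_entries stream=codec_type -of csv=p=0 "$1"
}

# Print "yes" if the file contains at least one audio stream, otherwise "no".
has_audio() {
  if stream_types "$1" | grep -q '^audio'; then
    echo yes
  else
    echo no
  fi
}

# Only run when ffprobe and the final video are present.
if command -v ffprobe >/dev/null 2>&1 && [ -f final_viz_video.mp4 ]; then
  echo "audio present: $(has_audio final_viz_video.mp4)"
fi
```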

So that’s it for now. We use these simple techniques to get a starting point for our visualizations in Splash Pro. Stay tuned for more tips - coming soon!

============

About Mark Daunt

I'm a software engineer at Splash. My speciality is sound and music computing, and I have a master's degree in that area from Queen Mary University of London. In my spare time I foster cats, record music, and play guitar and pickleball.