January 22, 2024

Splash Pro "Fast" & "AI" Explained

You can make music and songs with Splash Pro using two different types of generative AI. At Splash, we call these Gen-1 and Gen-2. But what does this really mean? Splash CTO Lex Toumbourou explains the difference.

Splash Pro is a fully generative AI music product by Splash. You might have noticed that we offer two generative options on Splash Pro, called Gen-1 and Gen-2. You can toggle between them on our home page, right above the text prompt box.

In this article, I’ll explain the differences between Gen-2 and Gen-1.

Before we begin, I want to note that all the models that power Gen-1 and Gen-2 are trained on an extensive music dataset 100% created and owned by Splash. Since we own all the music the models have been trained on, we can offer users unlimited commercial licenses on all the outputs. You can read more about how our licensing works here.

Gen-1 or "Fast mode"

Gen-1 is a text2music service that uses a loop-based algorithmic song generator with generative outpainting vocal capability. 

When you enter a text prompt in Gen-1, a search model finds the closest-matching sample in our loop catalog. Once a sample is selected, our algorithmic song generator builds a full song around it.
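As a rough illustration of the matching step (a minimal sketch, not Splash's actual search model), a prompt and each loop can be represented as vectors, with the loop most similar to the prompt being selected. The catalog entries, the vector values and the function names below are all made up for the example; a real system would get its vectors from a trained model.

```python
import math

# Hypothetical loop catalog: loop name -> toy embedding vector.
# These names and numbers are invented for illustration only.
LOOP_CATALOG = {
    "orchestral_strings_01": [0.9, 0.1, 0.0],
    "lofi_keys_07": [0.1, 0.8, 0.2],
    "dnb_break_03": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def closest_loop(query_embedding, catalog):
    """Return the name of the catalog loop most similar to the query."""
    return max(
        catalog,
        key=lambda name: cosine_similarity(query_embedding, catalog[name]),
    )


# Pretend this vector is the embedding of a prompt such as
# "heavy orchestral piece for an epic scene" (assumed, not computed).
query = [0.8, 0.2, 0.1]
print(closest_loop(query, LOOP_CATALOG))
```

In this toy setup the orchestral loop wins because its vector points in nearly the same direction as the query. The selected loop would then be handed to the song generator.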

The music output is structured like a complete song, with an intro, verses and choruses or, in the case of EDM, an intro, builds, drops, breaks, etc.

All songs maintain a consistent key, scale, BPM and even chord progression, which makes them easy to combine with existing musical or creative projects.

Our vocal outpainting technology allows users to add multiple vocalists to a song. We support singing across various styles and genres, as well as rapping.

Every song created by Gen-1 is unique in composition. However, because it draws from a limited pool of loops, your song may contain some of the same sounds as songs made by other users.

If you require a unique track or song for your project, we recommend using Gen-2.

Here’s a track I created with Gen-1. The prompt was: heavy orchestral piece for an epic scene

Gen-2 or "AI mode"

Gen-2 is the next generation of Splash's AI music tool, Splash Pro. It generates high-quality (44.1 kHz sample rate) stereo audio using neural networks. Every song created by Gen-2 is unique.

Like Gen-1, all songs generated by Gen-2 maintain a consistent BPM, key and scale, so pitch and tempo hold steady throughout. That makes the music usable within a DAW. Our vocal models work as outpainting tools for Gen-2; you can layer multiple vocals on the generated song. Note that some songs come straight out of the model with existing vocals rendered as instruments. We are working on giving users more control over this.
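To see why a steady tempo matters in a DAW: at a constant BPM every bar has the same length, so the audio lines up with the project grid. Here is a small worked example of that arithmetic. It assumes four beats per bar, which is my assumption for the example and not something Gen-2 guarantees, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 44_100  # the sample rate quoted for Gen-2 output


def bar_seconds(bpm, beats_per_bar=4):
    """Length of one bar in seconds (4 beats per bar is an assumption)."""
    return 60.0 / bpm * beats_per_bar


def samples_per_bar(bpm, beats_per_bar=4, sample_rate=SAMPLE_RATE_HZ):
    """Number of audio samples in one bar at the given tempo."""
    return round(bar_seconds(bpm, beats_per_bar) * sample_rate)


print(bar_seconds(120))      # 2.0
print(samples_per_bar(120))  # 88200
```

At 120 BPM a bar lasts exactly two seconds, or 88,200 samples per channel at 44.1 kHz, so an eight-bar section is always 16 seconds long and can be looped or cut on bar boundaries.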

There are a few aspects of Gen-2 that my team and I are actively working on resolving:

Our training data only covers some genres. We're strong in modern, urban genres like EDM, hip-hop, hyperpop, lo-fi, etc, but we have some gaps in genres like acoustic and classical. Our in-house music team is working on addressing this.

In some cases the model overfits, so the results for different text prompts may sound similar, even though they are generative and technically unique.

Gen-2 cannot generate song structures in the same way that Gen-1 does. We are prioritizing this work!

Here's a track I created with Gen-2. The prompt was: Liquid drum and bass, jazzy roller

In conclusion, Gen-1 is an algorithmic music tool with vocal outpainting capability, and Gen-2 is a fully generative AI music tool that features strict BPM, key and scale conditioning with the same vocal outpainting ability.

Experiment with both and see which one you like best!

============

About Lex Toumbourou

I'm the CTO of Splash. I'm a self-taught product engineer with nearly 20 years of experience across Ops, web dev, data eng, ML and gaming! I also play guitar, make beats and spent many years pursuing a music career, to no avail. I live in Brisbane with my wife and dog, and I love learning and creating stuff.