Derry’s Top Tips
Get the best results when removing vocals and extracting STEMS

Derry explains how to get the best vocal removal and stem separation from a music track.

Vocal removal and stem extraction software has come a long way. Starting from a low-quality baseline, recent advances in artificial intelligence have improved the quality of separations for DJs, audio engineers, and musicians who want to split pre-existing mixes into vocal, drum, bass, and other instrumental stems.

Most separation software offers single-click options for vocal removal or for separating other instruments, but there are still some fundamentals you can follow to improve your results.

Here, we’ll look at some of the important points to keep in mind to get a useable vocal or instrument separation with minimal leftover artifacts, one that needs as little post-production as possible so you can get straight to using it in your project.

Tip #1: File Quality 
The quality of your output files depends directly on the quality and format of the file you put in. The golden rule in separation and extraction is “Bad input produces even worse output”.

Uploading a 128 kbps MP3 and expecting a clean separation is a recipe for disappointment. Lossy audio codecs work by figuring out the parts of the track you cannot really hear and discarding much of the information related to those parts. When performing a separation, these low-volume parts can often become audible, and if they have been removed you end up hearing artifacts instead. So, running a lossy track through a vocal separation process compounds the loss of quality.

For the best output quality, always use the highest-quality input file available, such as a WAV file or a file in a lossless compressed format (such as FLAC). This gives newer, more advanced machine learning algorithms a better chance of locking onto the vocal track cleanly.
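As a quick pre-flight check, you could flag likely lossy files before sending them for separation. The sketch below is only an illustration: the function name and the extension lists are assumptions, and it looks at the file extension only. A WAV that was converted from an MP3 will still pass as "lossless" here even though the discarded information is gone for good.

```python
from pathlib import Path

# Assumed, non-exhaustive extension lists for illustration only.
LOSSLESS_EXTENSIONS = {".wav", ".flac", ".aiff", ".aif"}
LOSSY_EXTENSIONS = {".mp3", ".aac", ".wma", ".opus"}


def classify_input(path: str) -> str:
    """Return 'lossless', 'lossy', or 'unknown' from the file extension alone.

    Container formats such as .m4a or .ogg can hold either lossy or
    lossless audio, so they are deliberately reported as 'unknown'.
    """
    suffix = Path(path).suffix.lower()
    if suffix in LOSSLESS_EXTENSIONS:
        return "lossless"
    if suffix in LOSSY_EXTENSIONS:
        return "lossy"
    return "unknown"


if __name__ == "__main__":
    for name in ("master.wav", "rip.mp3", "album_track.m4a"):
        print(name, "->", classify_input(name))
```

Anything reported as "lossy" or "unknown" is worth replacing with the original master or a lossless copy if you can get hold of one.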

Tip #2: Shorter snippets are not always better
Say you just want to extract a single vocal phrase or a single drum loop from a full song. To get quick results, it is tempting to edit the file down to just the snippet you want and process only that section. However, this can lead to sub-optimal separations, because the algorithms often need some context to get good results. It is better to leave in a few seconds of audio before and after the excerpt you want, and then trim the result afterward.
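The padding idea can be worked out with a little arithmetic. The following is a minimal sketch, not part of any particular tool: the function name is hypothetical and the 5-second default pad is just an assumed value for "a few seconds". It returns the region to send for separation, plus the offsets you would use to trim the separated result back to the excerpt you wanted.

```python
def padded_range(start, end, duration, pad=5.0):
    """Work out a padded region to separate and where to trim it afterward.

    All values are in seconds. `pad` is an assumed amount of context;
    tune it to taste.

    Returns (padded_start, padded_end, trim_start, trim_end), where the
    trim values are offsets inside the padded, separated audio.
    """
    if not 0 <= start < end <= duration:
        raise ValueError("expected 0 <= start < end <= duration")

    # Extend the excerpt on both sides, clamped to the song boundaries.
    padded_start = max(0.0, start - pad)
    padded_end = min(duration, end + pad)

    # Position of the wanted excerpt inside the padded region.
    trim_start = start - padded_start
    trim_end = trim_start + (end - start)
    return padded_start, padded_end, trim_start, trim_end


if __name__ == "__main__":
    # A phrase from 62 s to 70 s in a 200 s song.
    print(padded_range(62.0, 70.0, 200.0))
```

In this example you would export 57 s to 75 s of the song, run the separation on that, and then keep only 5 s to 13 s of the separated file.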

When extracting drums, the loop you are looking for will often be repeated at several points throughout a song, and the result you get will depend on which instruments and vocals are playing at the same time. While a good rule of thumb is to choose sections where as few instruments as possible are playing at once, your results will also depend on how well those instruments are captured by the deep learning model.

I would recommend running either the full song or at least a couple of minutes of audio through the algorithm. This generally increases your chances of getting a good drum loop.

Tip #3: A little (clean-up) goes a long way
While many online vocal removers offer one-click separations, depending on the underlying technology there can often be interference or artifacts remaining in the separated vocal. Some of these, such as bleed or artifacts in regions where no vocals are active, can be removed easily by silencing those regions. Often, however, this is not enough. To get really clean results you may need to use spectral editing or filtering. While spectral editing can take time to get used to and master, it can really transform your final result.
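The simplest clean-up step, silencing regions where the vocal is not active, can be approximated with a basic level gate. This is a rough sketch under stated assumptions and not how any specific product does it: samples are taken to be floats where 1.0 is full scale, and the function name, frame size, and -50 dB threshold are all illustrative. It is a hard gate with no fades, so on real audio you would normally add short fades at the frame edges to avoid clicks.

```python
import math


def silence_inactive_frames(samples, frame_size=1024, threshold_db=-50.0):
    """Zero out frames whose RMS level falls below threshold_db.

    `samples` is a sequence of floats (1.0 = full scale, an assumption).
    Returns a new list; the input is left unchanged.
    """
    # Convert the dB threshold to a linear amplitude.
    threshold = 10 ** (threshold_db / 20)
    out = list(samples)

    for i in range(0, len(out), frame_size):
        frame = out[i:i + frame_size]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < threshold:
            # Quiet frame: treat it as bleed and silence it.
            out[i:i + len(frame)] = [0.0] * len(frame)
    return out


if __name__ == "__main__":
    # One "active" frame followed by one frame of faint bleed.
    vocal = [0.5] * 4 + [0.001] * 4
    print(silence_inactive_frames(vocal, frame_size=4, threshold_db=-40.0))
```

A gate like this only handles the gaps between phrases. Artifacts that overlap the vocal itself are where spectral editing or filtering comes in.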

Tip #4: Let masking be your friend
Even after spectral editing there can still be audible low-volume artifacts. At this stage, you may be inclined to give up and conclude that the separation is unusable. However, once you place the separated vocal or drum loop in a new mix you might be pleasantly surprised. Many users of such software don’t realise that those remaining artifacts will often be masked by the other instruments in a new mix. Be brave with the separated stems and throw them into your new mix. You will be amazed how useable they are in the right context!

At AudioSourceRE, we have created some of the most advanced AI vocal removal software on the market. Demix Pro is available on a fully featured 7-day trial, offering all our separation technologies as well as advanced spectral editing to create the most polished results. Demix Essentials offers our main separation algorithms in an easy-to-use 4 STEM package but doesn’t include some of the advanced features, such as spectral editing.

Give them a go and see how good a result you can obtain! We would love to hear about your STEM separation challenges at [email protected]