Anyone Can Sing Now: AI Voice Cloning

By Will Harken

Before any pro singers get upset, let me clarify: It’s still crucial to be a skilled singer. Your unique tone and style are more important than ever. No one else has your voice. Its uniqueness could help you stand out in the music world if that’s your goal. Singing ability also helps you get better results with AI.


Techniques to Improve Your Voice

AI can only do so much with a bad vocal.

Practice singing scales. Record yourself and listen back on your phone. It’s easier to hear pitch issues when clear chords back the vocal.

Singing along with your favorite songs doesn’t count as practice. The original vocal usually covers your singing, and you’d probably be horrified to hear your isolated vocals.

A better approach is to sing to a karaoke version and compare your vocal to the original. Don’t get discouraged: released vocals sound good because of pitch correction and mixing, not just the performance.

Record your singing and run it through Melodyne, Waves Tune, or another tuning plugin. Seeing how far off your notes are is helpful, and music production experience can benefit your singing.
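To make "how far off" concrete: tuning plugins typically show the gap between a sung pitch and the nearest note in cents (hundredths of a semitone). The sketch below is only an illustration of that arithmetic, assuming equal temperament and A4 = 440 Hz. The function name `cents_off` is hypothetical and is not taken from any plugin.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def cents_off(freq_hz, a4_hz=440.0):
    """Return (nearest note name, deviation in cents) for a sung frequency.

    Assumes equal temperament; a positive deviation means the note is sharp.
    """
    # Distance from A4 in fractional semitones.
    semitones_from_a4 = 12 * math.log2(freq_hz / a4_hz)
    nearest = round(semitones_from_a4)
    cents = 100 * (semitones_from_a4 - nearest)
    # A4 is MIDI note 69; octave numbers follow the C4 = MIDI 60 convention.
    midi = 69 + nearest
    name = f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"
    return name, cents

# A note sung at 445 Hz is closest to A4 and lands a little sharp.
note, deviation = cents_off(445.0)
print(note, round(deviation, 1))
```

A deviation of a few cents is hard to hear, while 20 or more is usually noticeable on a sustained note.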

Why Create Your Own AI Model?

As AI vocals become more common, you’ll start hearing artists who sound suspiciously like Taylor Swift. That’s an option if you don’t have moral qualms about it. But remember: your track might end up sounding like everyone else’s.

Most models on Jammable (formerly Voicify) are low quality, but there are some diamonds in the rough. You can also train a model on your own vocals with such tools, or locally if you have a GPU, which I’ll cover in my upcoming AI vocal engineering course.

If you use your own voice, record at least 10 minutes of your best pitch-corrected singing, mixed to your desired sound without reverb or delay. Then train your model using a tool like Jammable or on your own computer. You can then convert your scratch vocals into a final vocal with minimal effort.
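Before training, it can help to confirm that your recordings actually add up to the 10 minutes suggested above. The following is a minimal sketch using Python's standard `wave` module, which reads uncompressed PCM WAV files; other formats would need a different library. The folder name `training_vocals` and the function names are hypothetical.

```python
import wave
from pathlib import Path

MIN_MINUTES = 10.0  # the minimum suggested in this article

def wav_seconds(path):
    """Length of one PCM WAV file in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()

def dataset_minutes(folder):
    """Total length, in minutes, of all .wav files directly inside folder."""
    files = sorted(Path(folder).glob("*.wav"))
    return sum(wav_seconds(p) for p in files) / 60.0

def has_enough_audio(folder, min_minutes=MIN_MINUTES):
    """True if the folder holds at least min_minutes of audio."""
    return dataset_minutes(folder) >= min_minutes

if __name__ == "__main__":
    folder = "training_vocals"  # hypothetical folder of exported takes
    print(f"{dataset_minutes(folder):.1f} minutes recorded")
    print("Enough to train" if has_enough_audio(folder) else "Record more")
```

This only counts duration. It says nothing about quality, so the advice about tuned, dry, well-mixed takes still applies.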

Mixing Process Simplified

In my article "The AI Vocal Mixing Technique No One’s Talking About," I argue that most of the mixing process can be skipped when compression, EQ, and optionally saturation are baked into an AI vocal model.

For best results, I sometimes recommend using multiple vocal models throughout a song. If there’s a quiet, intimate section, create a model just for that. Then have a separate model for powerful vocals later. You could create one model to capture all dynamics, but separate models give more control.

Even the best singer won’t fix bad songwriting, though. If creating good music is your goal, there’s more to worry about than just having good vocals.

With AI song generators, you don’t even have to sing at all. They can generate full song ideas with lyrics for you, though the vocals often sound rough with artifacting and feathering. I see them as tools for arranging instrumentals and songwriting, not for final vocals.

Changing Keys with AI

If a song sits outside your comfortable range, one technique is to transpose it, sing it in the new key, then use AI to shift the vocal back to the original key. Tools like Jammable (formerly Voicify) or GitHub libraries often allow you to change the output vocal’s pitch.

If you’re singing over a backing track, you’d need to pitch that track up or down to your desired key. Record, then pitch the track back to the original key. When using your AI conversion tool, adjust the pitch accordingly. So if you lowered the backing track by five semitones when you sang, you’d increase pitch by five semitones in the output AI vocal.
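The bookkeeping above is simple but easy to get backwards. Here is a small sketch of the arithmetic, assuming equal temperament, where each semitone multiplies frequency by the twelfth root of two. The function names are hypothetical and do not correspond to a setting in any particular tool.

```python
def semitone_ratio(semitones):
    """Frequency ratio for a shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)

def restore_shift(recording_shift):
    """Semitones to apply to the AI output to return to the original key.

    It is simply the opposite of the shift applied to the backing track.
    """
    return -recording_shift

# Backing track lowered five semitones for the recording session.
recording_shift = -5
output_shift = restore_shift(recording_shift)

print(f"Recorded at {recording_shift:+d} semitones "
      f"(frequency x{semitone_ratio(recording_shift):.3f})")
print(f"Set the AI output pitch to {output_shift:+d} semitones "
      f"(frequency x{semitone_ratio(output_shift):.3f})")
```

The two ratios multiply to 1, which is the check that the converted vocal ends up back in the song's original key.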

AI Vocals Work Best...

...in genres where the final vocal is heavily compressed, saturated, and has reverb and delay. These effects hide imperfections in AI models. Lush instrumentation can also hide problems. I’m not saying you couldn’t do an AI acapella group; it would just be significantly harder to pull off convincingly.

If your goal is a hit song, one of the hardest things to nail is vocal layering. That falls a little outside this article’s scope, but there are vocal layering tutorials on YouTube.

Modern-day AI vocals don’t quite capture the quality and finesse of professionals (singer, mixing engineer, etc.). They often feel flat or miss the emotion of a true vocalist. You can try to overcome this by making your input vocal more emotional and powerful. Real singing still matters.

Pitch Correction Tips

I generally recommend doing pitch correction on your recorded vocals before converting them with AI. This is because AI outputs are lower quality, and you’ll hear artifacting from pitch correction more if you apply it to those AI files.

You can apply aggressive pitch correction to your input vocals, since the AI model will usually smooth out the tuning artifacts. This can be risky and may take several attempts.

One technique I use is giving the AI one version of my vocals that is tuned and one that is not tuned. That way, you can comp out any places where the tuning sounds weird.

Too Long Didn’t Read

Musical skill is still required to create good finished music, but people who struggle with singing now have a much better shot at producing a convincing vocal.

