Suno + Veo 3.1 Is an INSANE Combo for AI Music Videos

By Isa does AI

Summary

## Key takeaways

- **Suno + Veo 3.1 Hollywood Transformation**: A basic song transforms into a Hollywood-quality music video using nothing but AI, combining Suno AI with OpenArt's Veo 3.1, which creates the most cinematic results I've ever seen. [00:00], [00:20]
- **Custom Mode for Maximum Song Control**: We're going with custom mode for maximum control, since your prompt might be too large for simple mode. The more specific you are, the better your results. [00:49], [01:16]
- **Upscale Crucial for Character Quality**: Upscale using the 2x plus face option. Do not skip this step. It's crucial for quality. [01:49], [01:52]
- **Lip-Sync Hack with Audio Mute**: Veo 3.1 has audio capabilities that let you make your character actually lip-sync to specific lyrics. Generate videos with audio enabled so the character lip-syncs, then mute that audio in editing and replace it with the Suno track. [02:40], [02:54]
- **Consistent Character Across Scenes**: Add "maintain the exact same facial features, hairstyle, and hair color as the reference image" to your prompts. The result is a set of scene images with the same character maintained throughout the entire sequence. [02:01], [03:56]
- **Match Shots to Song Structure**: Match intimate verses with close-ups and explosive choruses with wide dynamic shots. The lip movements from Veo won't match your Suno lyrics perfectly, but they'll look natural enough to make it seem like your character is actually performing. [05:05], [05:22]

Topics Covered

Specific Prompts Maximize AI Results
Upscale Crucial for Character Quality
Veo 3.1 Enables Lip-Sync Trick
Match Shots to Song Dynamics

Full Transcript

You're about to see how a basic song transforms into a Hollywood quality music video using nothing but AI.

[music] >> What you're looking at right now took me less than 10 minutes to create, and I'm about to walk you through the exact process. We're combining Suno AI with OpenArt's Veo 3.1, Google's newest video model, which creates the most cinematic results I've ever seen. I'm showing you the advanced method that gives you complete creative control over every single frame, plus a game-changing technique for making your character actually lip-sync to your lyrics.

Links for both Suno and OpenArt are in the description below. First, let's create our music with Suno AI. Head to Suno's homepage and click Create. We're going with custom mode for maximum control, since your prompt might be too large for simple mode. For the style description, I'm typing: "Modern upbeat electro pop with glossy synths, punchy beats, and vibrant modern production. Bright, expressive female vocals full of charisma and playful swagger drive catchy hooks and a powerful, empowering chorus built on self-confidence and energy. A bold, cinematic pop sound with polished edge, glamour, and high-energy star power." The more specific you are, the better your results. Hit Create.

Listen to your options. Pick the one that feels right and download it. Now we have our soundtrack. Head over to OpenArt. Click on Image in the left sidebar. Then select Create Image.

Switch to the OpenArt photorealistic model. Type something detailed like: "Confident young woman with sleek dark hair in a high ponytail, striking features, wearing a leather jacket, bold expression, cinematic lighting, professional photography style." Turn on auto-enhance. Click Create. Then upscale using the 2x plus face option. Do not skip this step. It's crucial for quality.

Next, switch to Seedream 4.0 with the Omni feature. Use your upscaled image as reference and generate multiple angles. Add "maintain the exact same facial features, hairstyle, and hair color as the reference image" to your prompts. Create at least four different angles. "Maya looking over her shoulder, 3/4 view, same features, dramatic lighting." "Maya facing camera, close-up portrait, same styling, confident expression." "Maya in profile, side angle, same character, cinematic mood." "Maya laughing, candid moment, same features, natural lighting."

Click Character in the side panel. Select "start with four plus images." Name your character. I'm calling mine Maya. Upload your images with the upscaled one first and click Create Character.

Here's where things get really interesting. Veo 3.1 has audio capabilities that let you make your character actually lip-sync to specific lyrics. Here's the workflow. First, we'll generate videos with audio enabled so the character lip-syncs. Then, we'll mute that audio in editing and replace it with our Suno track. This way, it looks like your character is singing your actual song.

Now, in the Character section, select Maya and click Prompt and Reference. Change the aspect ratio to Cinema 16:9. Set prompt adherence to two. Let's create our scene images. I'm going for an energetic pop vibe. "Maya in a neon-lit urban alley at night, wearing a leather jacket and ripped jeans. Confident pose. Vibrant pink and blue neon signs. Cinematic lighting. Edgy atmosphere." "Maya on a rooftop at golden hour. City skyline behind her. Wind in her hair, wearing a crop top and high-waisted pants. Energetic vibe. Warm lighting." "Maya in a modern loft with floor-to-ceiling windows. Natural light streaming in. Dancing pose, contemporary setting, bright and airy." "Maya in an underground parking garage, dramatic lighting from overhead lights, leather jacket, confident stance, urban aesthetic, moody atmosphere." Perfect.
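The consistency instruction used for the angle images comes down to appending the same clause to every prompt. A minimal Python sketch of that idea follows. The helper name and the list are made up for illustration, and nothing here talks to OpenArt; it only assembles the text you would paste into the prompt box.

```python
# Hypothetical helper: builds prompt text only, it does not call OpenArt.
CONSISTENCY_CLAUSE = (
    "maintain the exact same facial features, hairstyle, "
    "and hair color as the reference image"
)


def build_prompt(description: str, clause: str = CONSISTENCY_CLAUSE) -> str:
    """Append the consistency clause to one angle or scene description."""
    # Strip trailing spaces and punctuation so the clause joins cleanly.
    return f"{description.rstrip(' .,')}, {clause}."


# The four angle descriptions from the walkthrough.
ANGLES = [
    "Maya looking over her shoulder, 3/4 view, same features, dramatic lighting",
    "Maya facing camera, close-up portrait, same styling, confident expression",
    "Maya in profile, side angle, same character, cinematic mood",
    "Maya laughing, candid moment, same features, natural lighting",
]

prompts = [build_prompt(angle) for angle in ANGLES]
for prompt in prompts:
    print(prompt)
```

Keeping the clause in one constant means every prompt carries identical wording, which is the point of the trick.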

Now we have our scene images with the same character maintained throughout the entire sequence. Click Videos in the left panel. Go to Image to Video and select Google Veo 3.1 as your model. Upload your first scene image, the neon alley, as your start frame. For your video prompt, you can describe the camera movement and include a line from your song, like I do right here. Hit Create and Veo 3.1 will generate your character with mouth movements that match the audio. Now, in your editing software, you'll mute this audio track and replace it with your Suno song. When we sync it properly to the beat, it'll look like Maya is singing our actual track.

For the rooftop scene, upload that image as your start frame and prompt: "Camera circles around Maya as she sings at golden hour. Hair moving in wind. Energetic performance. City skyline bokeh in background. Dynamic movement." For the loft scene: "Wide shot pulling back as Maya dances and sings in the bright loft. Natural light streaming through windows. Energetic choreography. Contemporary vibe." For the parking garage: "Tracking shot moving with Maya as she walks and sings through the garage. Dramatic overhead lighting. Confident performance. Urban atmosphere." Create as many clips as you need for your song.
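If you would rather script the mute-and-replace step than do it in a video editor, it can be sketched with ffmpeg. This is not the workflow shown in the video, only one possible equivalent. It assumes ffmpeg is installed, and the file names and the `song_offset` parameter are placeholders. The helper only builds the command, so nothing runs until you pass the result to `subprocess.run`.

```python
import shlex
from typing import List


def replace_audio_cmd(clip: str, song: str, out: str,
                      song_offset: float = 0.0) -> List[str]:
    """Build an ffmpeg command that drops the clip's own audio and
    lays the song over it, starting song_offset seconds into the song."""
    return [
        "ffmpeg", "-y",                            # overwrite output if present
        "-i", clip,                                # input 0: generated clip
        "-ss", f"{song_offset:.3f}", "-i", song,   # input 1: song, seeked
        "-map", "0:v:0",                           # keep only the clip's video
        "-map", "1:a:0",                           # take audio from the song
        "-c:v", "copy",                            # do not re-encode the video
        "-c:a", "aac",                             # encode audio for MP4
        "-shortest",                               # stop when the clip ends
        out,
    ]


# Placeholder file names, for illustration only.
cmd = replace_audio_cmd("neon_alley.mp4", "suno_track.mp3",
                        "neon_alley_synced.mp4", song_offset=12.5)
print(shlex.join(cmd))
# To execute: subprocess.run(cmd, check=True)
```

Because the clip's audio stream is never mapped, the generated vocals are discarded rather than just turned down.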

Match intimate verses with close-ups and explosive choruses with wide dynamic shots. Import all your clips into any editing software. Here's the key. Mute the original Veo audio from each clip. Then add your Suno track as the main audio. Align your video cuts to the beat of your music. The lip movements from Veo won't match your Suno lyrics perfectly, but they'll look natural enough to make it seem like your character is actually performing. Quick cuts on the beat make this even more convincing.
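Cutting on the beat is easy to plan numerically once you know the song's tempo. The sketch below assumes a constant tempo and a known time for the first beat. Both values are placeholders here, since the walkthrough does not give them. It snaps rough cut times to the nearest beat.

```python
def snap_to_beat(t: float, bpm: float, first_beat: float = 0.0) -> float:
    """Return the beat time closest to t, assuming a constant tempo."""
    beat_len = 60.0 / bpm                       # seconds per beat
    n = round((t - first_beat) / beat_len)      # index of the nearest beat
    return first_beat + max(n, 0) * beat_len    # never before the first beat


# Rough cut points picked by eye, in seconds (illustrative values).
rough_cuts = [3.9, 8.2, 12.4, 15.8]
snapped = [round(snap_to_beat(t, bpm=120), 3) for t in rough_cuts]
print(snapped)  # [4.0, 8.0, 12.5, 16.0]
```

At 120 BPM a beat lasts half a second, so every snapped cut lands on a multiple of 0.5 s.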

>> Electric glowing. [music and singing] >> Look at that. The character stays perfectly consistent. The lip movements look realistic. The camera work is cinematic. The energy matches the beat.

This is professional-quality content created in minutes. If you want to create professional music videos, hit the links in the description for Suno and OpenArt.
