Creating a quality video with AI – Make the character talk


In the previous post, we started working on the second scene, where Tania Lemaire begins to give her report. We now have the start frame, but a lot is still missing before she can talk.

First of all, we need the text. I took inspiration from the previous reports that she sent and kept the same structure. Once I had the text together, I worked out which part of it would fit each scene that we defined in the story.

From text to speech

For our second scene, where we want to see the character talking into the camera, we need to transform the text into real speech. For that we are also going to use AI. There are many different solutions available. For this purpose I used a site called Speechma (https://speechma.com/) that is totally free to use. I simply paste the text of each scene into the website, fill in a captcha and then let the AI do its job. I listened to the speech and regenerated it if I wasn't happy with the result.

[Image, source: Speechma]

Speechma offers quite an impressive service considering that it's free to use. You can choose from a large number of voices and languages. You can adjust the pitch and speed of the voice and then download the generated speech as an MP3 file. The only issue that I ran into was that the voice wasn't able to say the word Liotes the way I wanted it to. Since it's an unknown word, I could work around this by writing Liotis to get the sound that I expected.
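If you have several scenes, the respelling trick can be applied automatically before pasting the text into the speech tool. The snippet below is only a minimal sketch of that idea: the function name, the table and the sample sentence are made up for illustration, and only the Liotes to Liotis substitution comes from my own experience.

```python
import re

# Hypothetical respelling table: written form -> spelling that the voice
# pronounces the way we want. Only "Liotes" -> "Liotis" comes from the post.
RESPELLINGS = {"Liotes": "Liotis"}


def respell_for_tts(text: str, respellings: dict[str, str] = RESPELLINGS) -> str:
    """Replace whole-word occurrences of each written form with its respelling."""
    for written, spoken in respellings.items():
        # \b limits the match to whole words, so longer words that merely
        # contain "Liotes" are left untouched.
        text = re.sub(rf"\b{re.escape(written)}\b", spoken, text)
    return text


# Placeholder sentence, not taken from the actual report.
scene_text = "Welcome to the weekly Liotes report."
print(respell_for_tts(scene_text))
```

The original text stays untouched for subtitles or the post itself, and only the copy sent to the speech tool uses the phonetic spelling.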

A speaking character

To continue our work on the second scene, I will now turn the image into a video of the person talking.

In the first step, I generate a short video of about 10 seconds where I set the image of Tania as the first frame and simply tell the AI that she is talking. For that I use the model Kling 2.1. I now have a video with a talking head.

For the second step, we take that video and use the 'Add Audio to Video' feature from OpenArt. For each scene where Tania talks to the camera, I generate a new video from the video produced in step one plus the audio file that we created before. I just switch to Lip Sync and the AI takes care of the rest.

[Image, source: OpenArt]

I have done all the scenes where Tania talks to the camera with this same process. The only issue that I have is that the videos and the speeches are not always the same length, so I will need to be creative in editing.
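To see how big the mismatch is for each scene, the two durations can be compared before editing. This is just a rough sketch and not part of my actual workflow: the function name, the tolerance and the suggested actions are assumptions, and the durations (in seconds) would have to be read from your editor or a tool such as ffprobe.

```python
def plan_edit(video_s: float, audio_s: float, tolerance_s: float = 0.2) -> str:
    """Suggest an editing action from the clip and speech durations in seconds."""
    if video_s <= 0 or audio_s <= 0:
        raise ValueError("durations must be positive")
    gap = video_s - audio_s
    if abs(gap) <= tolerance_s:
        # Close enough that no correction is needed (tolerance is an assumption).
        return "keep as is"
    if gap > 0:
        # The video runs longer than the speech.
        return f"trim {gap:.1f}s from the video"
    # The speech runs longer than the video.
    return f"cover {-gap:.1f}s of extra speech with another shot"


# Example values only: a 10-second clip with an 8.5-second speech.
print(plan_edit(10.0, 8.5))
```

A list of these suggestions per scene makes it easier to decide where a cutaway or an extra shot is needed.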

To give you an idea of what this will look like, here is a scene that I will probably not use because the AI changed Tania's personality. However, this is roughly how it will look when Tania talks to the camera.

https://youtube.com/shorts/BjFv46CZ1N4?feature=share

Check the previous posts of this series:


With @ph1102, I'm running the @liotes project.

Please consider supporting our Witness nodes:



11 comments

This technology has made the work of YouTubers a lot easier. They are making very interesting videos with its help, and this technology is proving to be very helpful in other fields as well.


I agree with you, but YouTubers still need to come up with a good story first :-)


The tools are quite amazing. I am a bit surprised to hear that you had to play around with the spelling of the word to get the pronunciation correct, but I guess that is how these tools are right now.


I believe the reason is that Liotes isn't really a word from the English vocabulary, so the AI didn't recognize it and didn't know how to say it.


So interesting to follow the process. I think the lip sync feature is cool; it makes it much easier to match audio and video.


It's not perfect yet but it's a huge step forward I believe.


Sometimes, it's disconcerting what we can do now with AI.


And it's getting better every week :-)
