Episode Transcript
[00:00:04] Speaker A: The following is our conversation with Fabio Minazzi, Director of Audiovisual Services at Translated. Before this conversation, we did not know that he was going to win the Product innovation challenge at LocWorld52. Congratulations, Fabio. And I hope everyone listening to this conversation gets as inspired as we were during LocWorld52 in Monterey, California. My name is Eddie Arrieta, CEO at MultiLingual magazine. Enjoy.
[00:00:39] Speaker B: Thanks for having me here. So I'm Fabio Minazzi. I am the director for Audiovisual at Translated, and my background is actually from dubbing and voiceover and localization for software and video games.
Today I'm going to talk about this Voice for Purpose project, which is our latest endeavor to open up language to everyone.
So a few years ago, we were developing our AI dubbing platform at Translated, Matedub, and we were approached by voice actor Pino Insegno, who is pretty famous in Italy and was actually quite skeptical about AI voices, as you can imagine. And he actually challenged us, saying, hey, why don't you do something with a purpose with AI? Like, for example, I want to donate my voice to people who lost theirs. Is it possible?
So, you know, we picked up the challenge and we partnered with him, and we designed an experiment together. We brought people to recording studios, made them record their voice, you know, in different languages, and made them listen to and judge their recordings. You know what? Typically people said, listen, my voice is not good, it's not charming, it's unsuitable, it sounds funny. But actually, when we made AI models of their voices and gave them to people who cannot speak any longer due to a pathology, like ALS, which removes your capacity to articulate speech, well, they realized how important the voice is, how impactful their voice could be to others.
And that's when we invented Voice for Purpose, which is a platform that enables anyone to donate their voice to people who don't have one any longer. It's an infrastructure that delivers voices to any connected device through the cloud, synthesizing voice in real time. We are talking about latest-generation voices, expressive voices, which we deliver to patients with the support of a clinical research center that is specialized in neuromotor diseases.
And we measured the impact of those voices in their judgment.
So what came out concerns these voices, donated or even self-donated, because if you can still speak, you can donate your voice to yourself. Just as you donate stem cells to yourself to heal some diseases or pathologies, you can use your own voice as well. So we measured that impact, and the scores, the judgments, showed that they are much better than the standard impersonal voices that are typically delivered to people with a voice pathology.
So in the end, we have discovered that we can help people with different diseases, like cancer of the larynx, where your whole voice box has to be, you know, taken away to heal, or people with ALS, or even some forms of autism as well.
So this is basically the project. It's kind of a social project, but it's also a medical project at the same time.
Okay, so there are two sides. One is the donation side, which is pretty simple. It's enough to go to the voiceforpurpose.com website and register. And with about one minute of recording, your voice becomes part of the library, which is made available to people who might request a voice.
The other side of it is the people who have the pathology. They register and they get contacted by us to get an assessment, because we only want to deliver voices to people who really have a pathology. So we vet the people the voices are assigned to, and then typically we go through the clinical centers that already assist them. With their help we make sure that once we create the voice model, based either on the donated voice or the self-donated voice, these voices are delivered to the devices that they use. That could be, for example, a mobile phone. So if you can type, you can definitely use your voice by typing and then having your voice generated in real time. Or it could be an AAC, which are these specific devices that people with voice disabilities use. These are augmentative and alternative communication devices, like tablets where you can point with your finger, select and type, or even use your eyes if you're no longer able to use your fingers. Then you compose the sentence and the voice is delivered. So we connect with the entities that typically support the people with pathologies, making sure that the voice is installed in their environment.
Right now we support five languages, the typical EFIGS.
Well, ALS is the most glaring example because it's when people really cannot use their hands in order to communicate. Oftentimes even their lips are, you know, limited in their movements.
So that is the case we started from. But we are also seeing other pathologies, like Rett syndrome, and some forms of autism that prevent people from speaking.
Those are the ones that we see most. And of course people with cancer of the larynx, or, post-trauma, people with paralysis of the vocal cords. They are all cases where we can apply this.
We selected one reference center called the Nemo Lab. It stands for neuromotor laboratory, and it is specialized in neuromotor diseases.
And they have occupational therapists that interface with patients every day and take them through all the steps, through, you know, the evolution of their pathologies, trying to make the most out of their remaining, you know, functions.
Yeah, absolutely. In fact, this is one of the things that we suggest to people: if anyone is diagnosed with a pathology that might impact their voice, their capability to speak, please do save your voice. We provide, you know, an interface with a script that can be spoken, and we collect all the voice recordings and bank them on the side. So one day, in case this becomes necessary, we can create the voice model and deliver it to you. We actually have some patients who recorded their voice a year ago and never used it. And now they are asking us to turn on their voice because they now need it.
Okay, let's start with the patients, okay? Because this project is geared towards patients, okay?
It's important for the patients because they can use better voices on any device and the same voice will be carried over throughout the development of their pathology.
This means that if at one point you use a mobile phone to communicate and tomorrow you use an AAC, the same voice can be carried over. Right now that's not the case, because there are different operating systems and the voices that you use might actually change when you change your device. This is a technical, say, hiccup that impacts your identity, because voice is very connected to it. So this is important for them, for patients. It's also important for society overall, because we are talking about a few million people in the world who cannot speak. So we are talking about significant numbers.
Just to give you an idea, Rett syndrome affects one girl in every 9,000 below the age of 12.
So we're not talking about inconsequential numbers, not to mention autism. And therefore we are talking about a minority, luckily, but a significant portion of society overall.
And why is it important for our industry? Because opening up language to everyone is a general statement. Language is not only the words that you say, but also your possibility to communicate with other people. So by extension, if we care about you being able to use language, we should also care about you being able to deliver the language. Okay? And therefore this project, I think, is very close to the localization industry, because I think every one of us is a locale, every one of us is an island, which in this case we can say features a specific voice. So voice is identity. It is the way we get recognized, the way we express our thoughts, our sentiments. So I think that if we care so much about, for example, terminology, about how we express ourselves, we should also care about the way we deliver that terminology, the way of expressing.
Basically there are AI researchers, as we develop voice models, and therefore the selection of the algorithms, the training of the algorithms, fine-tuning, data cleaning, all that stuff connected specifically with the voice.
Then there is a software engineering component to build all the infrastructure, you know, the cloud infrastructure: managing the space, loading the models, keeping the portal up and running, and all the development of the UI. So back end and front end as well. And then there is the medical team that takes care of the relationship with the patients and people in need. And of course there is management, pure management, like keeping track of who got which voice and when, sending all the passwords and all that stuff, which is running the project itself.
Okay, well, it's a great question. It also allows me to cover an important point, which is how we support the project, because Translated made this effort and completely financed the campaign for getting the donors, the relationship with all the patients, and running the product on the servers and so on. And that, as you can imagine, is 24/7/365, because you can talk any time of the day or night. Okay, so we set ourselves the goal of supporting 100 people, kind of for free, to prove the concept. And now we are in the process of looking for support, financial support, which could come from private institutions, companies, organizations, whoever would like to join us in this process. And of course for recognition, because it's a social project. And on the other side, we want to see to what extent the health insurance system supports this in different countries. There are countries where the national health system pays for synthetic voices for patients, like Italy, for example. There are other countries where that is less the case. And therefore you can ask your insurance, for example, and maybe your insurance covers the cost for this. Overall, there's a lot of focus on making this affordable, because that's a very important point. You cannot spend $100,000 per year to speak. Okay?
We have to put it down to $200 to make it viable for people to accept it and scale and support many of them.
So I don't know if I answered your question, but this is where we are now. We are at the turning point. We are at a point where we can reach out to many people, but we also need to get support to scale.
Okay, well, my advice would be, if you hear about this project from this podcast and you have any of the pathologies that may affect your voice, go to your medical doctor, your practitioner, or your medical support center. Ask to get in touch with us at info@voiceforpurpose.com. Send us an email. We'll be happy to discuss and see how to deliver a personalized voice to you.
Well, I was very happy to see the reception of this project within this industry, because we realized that it might be considered a medical project rather than a localization project. But instead I saw that people really felt that localization is very close to accessibility, and accessibility is very close to medical care. And therefore we are giving localization a whole new dimension, which is far beyond language as text. Okay? It implies media, if we want to use traditional terminology. But it also involves people a lot; it is very people-centric. And having such a warm reception from people made me think that this is really an amazing industry, made of people who really want to reach out to people and help others to connect.
I will try to see if that's possible.
As I said, it's a kind of a medical project. So we were able to show some pictures in my presentation yesterday, some of people who were recording their voice or were using their voice. We are very careful about the fact that for them, it's really their life. So I will check this out and see if I can get any of that.
[00:17:39] Speaker A: And this was our conversation with Fabio Minazzi, Director of Audiovisual Services at Translated. Let us know what you thought, and perhaps you have an innovation that you want to share with us. You can send it over. You'll be the next person interviewed in Localization Today. My name is Eddie Arrieta, CEO at MultiLingual magazine. Thanks for listening.