AI in the Booth: The Rise of Machine Interpretation

[00:00:03] Speaker A: Hello and welcome to Localization Today, where we explore how language, technology and community converge to unlock ideas for everyone everywhere. I'm Eddie Arrieta, CEO here at Multilingual Media. Today's episode focuses on the world of interpreting technology, a field where automation, real-time assistance and human expertise meet under intense pressure. Our guest is Claudio Fantinuoli, professor at the University of Mainz and founder of InterpretBank, one of the leading computer-assisted interpreting tools used by professional interpreters around the world. Claudio has worked at the intersection of research and industry, including serving as Head of Innovation and later CTO at KUDO, where he helped introduce one of the first production-ready simultaneous interpreting tools used by Fortune 500 companies. Claudio, welcome again to the show.
[00:01:05] Speaker B: Welcome Eddie, and thank you for having me here today.
[00:01:09] Speaker A: Of course, Claudio, and I'm really happy because you've recently also launched your column at Multilingual Media. How has that experience been going for you?
[00:01:17] Speaker B: That's great. That's great. I have to thank you, thank Multilingual, for this opportunity. I like to write, and this is a great opportunity to reach a broader audience, and I'm happy to be on board. Yes.
[00:01:32] Speaker A: You know, I should confess publicly that I was always looking at your LinkedIn articles and I was like, this is so good. This is so good. This should be in Multilingual. And then we had a few conversations and finally we made it happen, Claudio. So thank you so much. I've also been happily surprised to see the launch of InterpretBank, and I was like, wow, this is so amazing. I wonder who is behind it. And then here I find Claudio Fantinuoli. So Claudio, let's get to it. What originally motivated you to build InterpretBank, and what problem are you looking to solve for interpreters?
[00:02:11] Speaker B: Yeah, thank you for the question. First of all, InterpretBank has been around for a while, but it's interesting that you noticed it lately, because there is a change going on in the interpreting industry, and that is maybe bringing more visibility to these kinds of tools. InterpretBank is a computer-assisted interpreting tool, also called a CAI tool. A CAI tool is designed to support professional interpreters in performing several tasks that they normally need to do, like collecting glossaries for a specific event, managing them, and many other things. A CAI tool gives the interpreter the opportunity to do all these activities with the support of a computer, with databases and, lately, with AI. This is what we call a supportive tool for professionals to streamline some of these tasks.
[00:03:29] Speaker A: And there's been a lot of fear around the introduction of these tools. So for those that might not be familiar with the concept, how can an interpreter understand what this computer-assisted interpreting does? Does it take my job as an interpreter, or, like you mentioned, does it assist you? And what specifically would this assistance be?
[00:03:53] Speaker B: So, interestingly enough, a CAI tool uses many of the technologies that are also used for automating interpreting, but they are used for the interpreter. So it doesn't substitute the interpreter. The interpreter is at the center of the interpreting process, but the tool helps the interpreter. Conference interpreters, for example, or that sort of interpreter, need to work on very specialized subjects. So with a CAI tool, you can manage all the glossaries that you collect over time in a very user-friendly, interpreter-oriented database. You can use AI to help you compile these glossaries, for example.
It is designed so that you can look up these glossaries very fast. But nowadays there is also, for example, the integration of speech recognition, which is quite an interesting feature. So you can imagine this: the interpreter, let's say a simultaneous interpreter at a conference, in a booth, is interpreting. The CAI tool is listening at the same time to what the speaker is saying and is transcribing it. And the interpreter can decide, for example, to look at the transcription. In moments when there are some particularly complicated passages where he or she has missed some points, he or she can look at the transcription. But the tool can do much more. It can, for example, suggest specific words to the interpreter, the translation of specific words. Imagine we are not dealing with a very general speech, but a very technical speech on a very specific subject where terminology matters. Interpreters prepare the terminology, they learn it. But having it pop up in real time on a screen can help interpreters provide a more precise translation. Or think about numbers in simultaneous interpretation. If you look at the literature, the amount of errors that happen in simultaneous interpretation is very high. It's quite scary if you look at the figures, how many numbers are translated wrongly. And if you think that numbers can have a very important meaning, depending on the subject, having these numbers pop up can help you be more precise. So all in all, to make this short, it is a whole set of features, from managing terminology to creating terminology to looking up terminology with AI, that are designed to give interpreters a little bit of a boost, if you want. A good interpreter is not good because he or she has a CAI tool, but can be a little bit better by using these tools. And by the way, it can also, at least this is my theory, reduce the preparation time a little bit, so that you can work with a little bit less preparation.
But all this while maintaining quality in interpretation, and probably even improving interpreters in the areas where we humans are not particularly good. We interpreters are very good at making sense of what's happening, at situational understanding and so on. But numbers and terminology, for example, which are very mechanical things to memorize, if you want, we are not that good at. And this kind of tool can help improve quality in these areas.
[00:07:48] Speaker A: And you are completely right. I have been on the receiving end of immediate interpreting, and in some cases, you know, it's coming from Spanish into English. So I play along and put on the headset, and then I recognize the difficulty of simultaneous interpreting. I know I wouldn't be able to do this myself. But I can notice the moments where a slip in terminology translates into a huge slip in concept, and I am not entirely sure whether those receiving the interpretation know enough about the subject matter to realize it's a mistake. And then, you know, you move along. Now, in certain contexts, this might not be okay. So it's really interesting when you mention this. It certainly feels that a lot of progress is needed in the logistics and also the precision, and that a tool like this will give interpreters some levers at their disposal to move in that direction. So when we're thinking about the role of interpreters and how it's evolving, given these tools, and you've mentioned a few of these already in this conversation, what are the moments where human interpreters are critical, the abilities that human interpreters have that machines will not have, that speed in the moment to be able to resolve things?
And you have mentioned terminology and being able to store vast amounts of data as the places where machines would excel. But if you could make this comparison a bit more plain for those that might not know enough about the subject, it'd be great, Claudio.
[00:09:36] Speaker B: Yeah, maybe think about this. I believe machines are making huge progress in interpreting. We see this in written machine translation, but the same is happening very fast also in spoken translation. But still, at the moment, if you look at the models that are used or the applications that are put in production to do interpretation, the machines are still not very flexible. They are a little bit mechanical in the way they approach the translation and so on. So they don't make a lot of sense of what has been said between the lines, what is happening in a room, what the roles of the people speaking are, all things that are very important for meaning creation. And a human interpreter, on the contrary, is very capable of this, because we are human: we can make sense not only using what we are hearing, but also what we are seeing, what we know about the speakers, about many things. We are much, much, much better than machines in this area. But as I said before, in some areas machines are already much better than humans. So the idea is, okay, machines are making big progress in interpreting and they will find their place in the market for many scenarios. But there are still areas where high quality is needed, or not only high quality, where the situation is just so complex that humans have an edge over a machine. But probably, and this is my point, humans supported by technology or by AI will have even more of an edge over machines alone. So at some point there will be a competition, at least in some markets, between machines and professionals. That's just a fact. Okay, there will be markets that will be served only by machines. There will be markets or segments served only by professionals.
But at the intersection there will probably be a growing area where the two will compete. And the best way to position interpreters, human interpreters, is for them to be even better than they are now. And the technology is there, with all its limitations, because technology to support interpreters also has limitations. But we see in research, and this is pretty clear, for example, with the suggestions that I introduced before, where the machine suggests numbers, terminology or other things to the interpreter. There are quite a lot of studies from the last three or four years around the world, in Europe and now in Asia, that prove that the quality of interpretation for these aspects improves. So these kinds of tools can help enhance professional interpreters. This is important now, and it will be more important in the future, when professionals will have to compete also with machines. So let's give the humans in their work this strength that machines have. This is the theory, if you want, behind CAI tools.
[00:13:14] Speaker A: And of course, I think, while I have this unique conversation, I'd like to indulge in it. You are on the research side of things, and usually what happens is the academics think about it, then you start doing applied science, and eventually the things that are really valuable to our society get to the market. What can you tell us about interpreting technology? Where do you see things going? You are working on the research side and in the industry, and you are executing as you're investigating and researching. What can you tell us about what you see and what you expect about the future of technology?
[00:13:55] Speaker B: First of all, it's a great question, and it's a very difficult question to answer, because predicting the future is, as we know, very difficult. But one thing I think is sure: we have to expect machines to be able to interpret at a somehow similar level to humans.
Okay, this "somehow" is a word that means many things, because it's very difficult to compare human-produced translation with machine-produced translation. In my opinion, machine-produced interpretation and translation will also be very good. And in some instances it will be very difficult even to tell the one apart from the other. So this is something that is coming quite fast, by the way. My prediction, if you want a number, though numbers don't count so much, is, let's say, 2030. This is my horizon. Let's say we will have machines that will interpret quite well, both in the simultaneous modality and also, which is quite interesting because we very often think about conference interpreting, in the dialogic modality, at a very high level. This is coming. But at the same time, as a person actually coming from the linguistic side, from the interpreting side and so on, I would not be scared if I were a professional, because even if machines will be able to interpret, let's say, very well, there will be many reasons besides pure quality why people will still want to have a human interpreter in certain situations. The fact is that the profession, and the industry, I would say, need to be prepared for the fact that at some point we will have quite comparable output coming from humans and machines. This is a very general statement. It will depend very much on the situation, on many things, on the language combination. We know that interpreting is not all the same. It may be interpreting between very large language families or ones that are very small in numbers of people speaking a language. And this makes the landscape very variable. So when I say that machines will reach a very good level, maybe this will be in a couple of language pairs, or the bigger language pairs that we have, and not in all the other languages. But for sure this threshold will come.
So the thinking that professionals, and the industry too, have to do is: okay, what will be the positioning of interpreting as an ecosystem, meaning the interpreters and the people receiving interpretation, organizing it, selling it? What will their positioning be when we have these kinds of machines? And I'm pretty sure, if I look at what is happening in the labs around the world, that this is coming, and quite fast. And by the way, not from companies or models that are designed to do interpreting, but with high probability from very general models: the big multimodal large language models coming from the big labs, generalist models able to translate speech in real time as well.
[00:18:05] Speaker A: Thank you very much, Claudio. That helps me think a little bit about the future. And next time we talk I'll probably ask you the same question and see where we are at. Claudio, besides InterpretBank, what other things are you working on right now?
[00:18:20] Speaker B: Well, I'm doing two things, if you want. I'm doing quite a lot of consultancy for governments and institutions about interpreting, especially about technology in interpreting. And the reason why this is happening: we just published, by the way, a very interesting piece of research, which is open source, so it's publicly available, done with SINTEF, which is a research center in Norway, in Oslo, for the Norwegian government, about technology. They have a very special situation. They have a so-called Interpreting Act. So they have quite a binding law in Norway, a very rigorous law about when and how public institutions have to use interpreters, but they absolutely lack knowledge about technology, about the use of technology, the limitations of technology and so on. And everybody is speaking now about how to improve the work of interpreters, how to help interpreters, how to automate many tasks.
So this kind of consultancy is something that takes a lot of time and effort, and in the last couple of years I see a lot of governments and stakeholders in need of information about this, because policymakers need to make policies, and it's not easy to make policies without knowing interpreting very well on the one side, but also technology on the other side, and specifically technology in interpreting, which is such a niche, if you want, and the knowledge about it is not so widespread. So this is something that takes a lot of time, and it's actually quite a joy to try to help these organizations a little bit. And on the other side, I'm working quite a lot behind the scenes also on machine interpretation, because I think, as I said before, it has a future next to computer-assisted interpreting. A few months ago I did a study about multimodality in machine interpretation. So I created a prototype, and now it's also going to go into production for a company, where we added eyes to the interpreter. To an AI interpreter, that is, because fortunately this is not needed for most human interpreters. One of the shortcomings of AI interpreters is that they work just by translating the words that they hear, and they cannot make sense of what they see. And what we see, as I said at the very beginning, is so important. So what we did is to add multimodality, not with the modern models that try to do all these processes in one big model, but by merging, so to say, different models. So we have the speech model that does the translation, and then the visual model that captures what's happening in real time and, so to say, gives hints to the language model about what's happening, you know, like prompting the speech translation model with what is happening in the room. And this obviously improves the quality of the translation.
These are all interesting elements to work on, which are taking a lot of my time and my joy in this area.
[00:22:26] Speaker A: Thank you, Claudio. Because these are the places where eventually economic value is perceived. And that is something that has become really evident in the artificial intelligence conversation, similar to how the Internet and websites became places where people could replace some of their brick-and-mortar infrastructure investments. Here you're starting to notice how artificial intelligence in some cases can be included to add that economic value. Claudio, from the conversations that I've had and the companies that I've worked with, it's always stood out that interpreting and simultaneous interpreting come in at the places where language access is needed, and they get to solve those issues. And what we're discovering with artificial intelligence is that we've done it so much for certain pairs, just like you said, that it seems we're way ahead of our time. But what it just highlights is that for those pairs, language access is okay. Now we start seeing all the other language pairs, and regulation is trying to figure out: okay, what do we do now with all these possibilities? Because you can't, as a policymaker, say, okay, we're going to give language access to 100 languages. You're going to be bankrupt. You're going to have a public budget deficit the size of the moon. So there is an economic reality there. There are some economic constraints. What do you see in your consulting with these institutions that helps you understand the level of demand that could come because of the regulations and the policy changes that artificial intelligence is going to facilitate?
[00:24:04] Speaker B: Yeah, it's difficult to say, because as far as my experience goes, there is not a unique problematic.
A single problem, if you want, across the globe in the different countries. In every country where I had to do some consultation or advisory work, the situation is very different, and especially it's very different between continents. So it's so different between the USA and Europe. In Europe we are very much more conservative, for example, than in the US, as far as I can say. But the idea is that of course there is a lot of economic value in adding accessibility, and also moral value, which is also important. It's not only economic value. And at the moment many of the administrations and so on struggle between, on the one side, the need to provide access to people who don't speak the language of the country and, on the other side, the real difficulty of the logistics, which is very complicated for many countries. With logistics, which is probably not the right word, I mean having the right interpreters at the right time for the right task and so on. It's absolutely not easy. And they also have budget constraints. So they have to try to find the balance between the budget that they have or have allocated and the demands, which are normally higher and keep getting higher. As far as I know from the countries where I have worked, there is more and more demand for accessibility, either because otherwise there is no communication, and there are some risks if an entity doesn't offer the right support for communicating with the patient or whoever, or because we are living more and more, at least in Europe, in multilingual societies inside the single countries, through immigration and so on. The budgets are constrained. We are probably in a little bit of a recession in many countries. So everyone is concerned about: okay, how can we have the best multilinguality, if you want, with the budget that we have, which is normally not increasing?
And here I think the added value, and the value also for many companies that started to move in this space, is to combine human expertise, first, when human expertise is available and is needed. Because in many situations human expertise cannot be sold out just for the sake of it. There are situations where you need human expertise. But at the same time, it is to help these stakeholders implement policies that help them use tools that are available, or that can be developed in some cases specifically for some use cases, for some customers, to apply artificial intelligence or whatever technology to solve these problems. So the problem is not a one-or-zero, binary solution: let's use this technology or not use this technology, black and white, zeros and ones. It is to find the right balance: when it's appropriate to apply something, when it's appropriate not to use something, and so on. And this is not easy, especially because everyone who has to spend the money wants to automate everything, and the people who use it want to have the best service possible, and finding the right balance is very difficult. And it's very difficult for two reasons, by the way. One, because these technologies are new. So you can imagine there is not so much knowledge, especially in interpreting, because it's such a niche area, and people need time to get accustomed to what's there and how well or how badly it works. And, this is the second point, because these technologies are changing very, very fast. When I see an evaluation, for example, done by somebody about a product or a specific technology maybe six months ago, they use this as: okay, look, this is bad for this or this is good for this. But six months later, what we have in the labs, or what you hear when you talk with the companies that are bringing new stuff out, is completely different again. So you have to start again. So it makes this so dynamic and very difficult to navigate.
But the important point is that there are, in my opinion and in the opinion of every stakeholder I speak with, a lot of opportunities here to do business on the one side, but also to offer a good service on the other side. So this is the balance we have to find.
[00:29:50] Speaker A: This is great, because you're talking about, and this is the first time I hear about it, the moral value of it. And surely, as policymakers, we shouldn't be looking at technology now as a place where we're going to save some money to, what, build another park, but rather to give more access to more people and to really showcase where the value is. From the conversations that you are having with companies and institutions, Claudio, what are the expectations that you're hearing regarding the future of interpreting in those terms that we've been talking about?
[00:30:27] Speaker B: Yeah, this also changes quite a lot depending on whom you speak with, the countries, the continents, and probably also a little bit the mindset and obviously the interests of those people. From what you hear, there are more or less two expectations. On the one side there is quite a conservative attitude. If you think about the European institutions, for example, and you talk with the people, the managers there, they are very, very conservative. And they keep on saying there is almost no space for automation. This is one kind of manager attitude that I have to deal with every day. The other one is probably the overoptimistic view, where people say, okay, now let's build this because it will solve all of our problems. Also in interpreting, we will automate everything and everything will be automated. And I find myself in the position of speaking to people in similar or different roles.
But in any case, they are decision makers who have completely opposite views and, in my opinion, very often wrong views, because the truth sometimes is just in the middle: we will have AI and we will have humans doing the same job in different situations for different purposes. Yeah. I don't know if this answers your question a little bit.
[00:32:11] Speaker A: It does. Thank you. Thank you so much, Claudio. And you got into something in there that relates to clear misconceptions about interpreting technology. And I think it probably relates to all the misconceptions about technology in the history of technologies for human beings, where, at least from what I've seen, we absolutely underestimate the moral and economic value that it can bring. And we also overestimate risks, and in some cases we say, you know, it's going to destroy everything. But of course, these misconceptions get handled over time. What do you see in there that perhaps is hurtful to organizations?
[00:32:55] Speaker B: Yeah. On the one side, probably thinking in extremes. It's hurtful because thinking that nothing is going to change, or thinking that machines will not be able to interpret well, is a narrative that we hear from people outside AI but very much inside interpreting, so they know how difficult interpreting is, and interpreting is very difficult. From this idea, they assume that machines will not do it well. Thinking this way is wrong because you do not have the time to prepare for a scenario which is important and, in my opinion, is coming. But even if you don't think this is a plausible scenario, and this is the game I play all the time with my counterparts, do a what-if scenario. Think about how you would position yourself, how you would do things differently if this scenario came true. You don't believe it? Okay, I do believe in this.
Do not worry too much about it, but be prepared if this scenario comes true. If you don't do it, you lose a lot of time and you will always be reactive in your strategies, no matter if you are a professional, if you are a professional association, or if you are a company. You need to look at the future and have different scenarios in front of you and be prepared for different scenarios. The same happens on the other side. If companies think they can do everything, and very fast, without guardrails, with AI, they risk a lot, because they might have a brilliant technology that looks good on paper, but when it's put into reality, reality is very different from the lab. The paper is very different from reality. Then you have humans that use the technology or buy the technology, and with these dynamics, accepting new technology, even if it's good, takes a lot of time. You can have a very bad awakening: wow, this technology is marvelous, but the market is not ready to use it or doesn't want to use it, for whatever reason. So a reality check, and having a few more scenarios in front of you, even the scenarios that you don't think are the most plausible because of your personal attitude, is a very good exercise that helps you avoid failures in the future.
[00:35:58] Speaker A: Claudio, thank you very much once again. You've always been very generous with your time, sharing your knowledge with us, and we hope to have you again in the magazine and, of course, again on the podcast. Before we go, is there anything else you'd like to add, any messages to send out there?
[00:36:13] Speaker B: No particular message. I think that these are very interesting times for interpreting.
It's a very interesting time for any profession, for any activity out there, and obviously also for interpreting, not only for translation. People outside of interpreting maybe think about localization, written translation, maybe dubbing and so on, and not so much about the role of technology in interpretation, in real-time translation of spoken content. I think it's an interesting time. So much has to be done, to be discovered, to be put into practice, into business, into everyday life. So it's an interesting time to work in this space.
[00:37:05] Speaker A: Claudio, absolutely. It is an amazing time to be in this space, and I'm really glad to get to see professionals like you and companies like yours. I want to thank everyone who has been listening or who listens to Localization Today. Like I just mentioned, a big thank you to you, Claudio Fantinuoli, for sharing your perspective on the evolution of computer-assisted interpreting, the realities of AI in real-time language work, and the future of the profession. Catch new episodes of Localization Today on Spotify, Apple Podcasts and YouTube. Subscribe, rate and share so others can find these conversations. I'm Eddie Arrieta with Multilingual Media. Thanks for listening and we'll see you next time. Goodbye.
