Episode Transcript
[00:00:03] Speaker A: Hello and welcome to Localization Today, where we explore how language, technology and community converge to unlock ideas for everyone everywhere. I'm Eddie Arrieta, the CEO here at MultiLingual Media. Today we're diving into one of the most transformative shifts in localization: the rise of AI to check and fix AI, or trigger human intervention, to scale multilingual content without sacrificing quality. Our guest, Adam Bittlimayr, is the CEO and co-founder of ModelFront, a company leading this new frontier. ModelFront's AI predicts which machine translations can safely skip human editing, enabling companies to translate more content at the same human quality. Their mission is bold: to make more content available in more languages for the billions of people who can't yet access it on their own. Adam, welcome and thank you for joining us.
Speaker B: Thank you for having me.
Speaker A: You know, Adam, when we were preparing, I thought, okay, the questions are great, but I cannot forget that last sentence, where we're considering the part of your mission that is really bold, right? Which is to make content available in more languages for the billions of people who can't yet access it. How did you come across that problem of the billions of people not accessing content? How do you see it, or how did it come about that you paid attention to that specific gap?
[00:01:39] Speaker B: So if you go back to sort of old-school Silicon Valley, like 2000s era, right? Google: make the world's information accessible and useful. Remember when there never used to be any geofencing on any website? It didn't matter where in the world you were from, right? That was kind of the early dream of the Internet, right? We're all equals. Nobody knows you're a cat, right?
And then, you know, time went on, we got a little bit more locked down, right?
And I was, you know, working in Silicon Valley, and there the perception is that basically everybody can understand English, machine translation is working fine, et cetera, et cetera. I even kind of thought that, right? Even though I was working on Google Translate, working at Google.
But then it turns out, when you actually do the math, and you just think of your own family, you think more people know English than they did before, et cetera, et cetera. Then you actually go look at the statistics.
About a billion and a half people understand English: half native, half non-native speakers. And that was basically the first, you know, billion and a half people who got the Internet. And sort of all the growth beyond that has been people who don't understand English, right? And these are just statistics on how many, you know, there are whatever, around 8 billion people on Earth now, and a billion and a half understand English. And it's surprisingly constant, because, this is another topic, but English is catastrophic for birth rates. Once you start speaking English, you stop having kids. So it just kind of stays constant. Even though people keep learning English, there are just more people who don't understand English who are being born.
[00:03:18] Speaker A: That is incredible: the scale of the challenge and how it multiplies over time. And we see it, we see it. We are paying close attention to indigenous languages and also young languages. We've come across languages like Bamum, a 120-year-old language, and it's got phonology and orthography that are changing over time. And I've mentioned this story in many places, but yeah, it's crazy how it's evolving so quickly. You cannot even get to catch it, it seems. But it's a beautiful...
[00:03:50] Speaker B: These are moving targets.
[00:03:52] Speaker A: Yes, yes, yes. So this inspires ModelFront. So what's the story behind ModelFront? How did the idea come about, and the problem, of course, that we're talking about here?
[00:04:02] Speaker B: Yeah, so I was working in Google Translate, which had, you know, been part of Google Research and Google Search. And as you know well, and a lot of folks know, basically gen AI started in translation, and started in those circles, actually, inside of Google. The transformer model architecture, the T in GPT, was invented there around 2017, and in fact neural machine translation a few years prior, right? And it wasn't just a research thing. It rolled out very quickly into production, and the translation industry sort of shifted from translating from scratch to basically editing the output of AI, right? Basically the same thing that's happening to me as a programmer and you as a journalist now, or both of us CEOs, right, chief email officers. Now we're finally shifting to editing AI output. But for translation and translators, that happened a long time ago; the earliest I've heard was like 2001, 2002 at Sun in Silicon Valley. And so what you would expect, right, what I expected, even as a guy who worked on this stuff, was like, wow, generation quality just got radically better, right? I mean, we all agree, even the biggest critics of AI, of machine translation, agree that it was worse, you know, 20 years ago. So they agree that it got a lot better.
So if you're, let's say, a large buyer, you're one of some Fortune 500, you spend whatever, $10 million a year, on translation, you'd think, like, wow, I'm going to get way more for my money for this high-quality content.
And that didn't happen. So one thing that did happen: at the low-quality end, you know, you go to these people around the world, and they didn't have any translation before. Now they can get some kind of translation on their cheap Android phone, and it's better than nothing. And so they're very happy. And that was sort of what I played a small part in previously, right? So I was at Android, then at Google Translate, and at Google Translate I was also maintaining that Google Chrome Google Translate integration that fires every time somebody around the world who's not on an iPhone, which is most of them, opens a web page that's not in their language, which is most web pages, right? Super cool, great for humanity. I make no apologies for that, by the way.
But if you actually go to a high-quality document, say a technical doc or a patent or a product description, right, or even, let's say, the articles that you guys put out, right, which are more high-quality, unique content, it doesn't actually enable very much. Like, it didn't shift to the point that even MultiLingual Media says, okay, you know what, we're going to have everything in Spanish. I'm talking Spanish here, I'm not talking Bamum, right? It didn't get to that point, because it's still really, really, really slow and expensive, right? And so that's the point where, you know, us AI guys got to look ourselves in the mirror and say, all right, why not, right? There's something here that isn't as we thought it was, right? Because we all would have assumed that if you just keep making generation quality better, eventually you're going to crack this thing, or at least you're going to really, really, really accelerate how people produce this content.
[00:07:13] Speaker A: That is really an interesting take. And you were saying earlier, of course, that hybrid translation, at least in that part of the conversation, we're talking about hybrid translation. And in that conversation I mentioned how, according to the Future of Jobs Survey 2024 by the World Economic Forum, this combination, this hybrid, the amount of work or jobs done by hybrid, is expected to increase from 30 to 33% of the entire size of the pie, if we could put it that way.
What role is ModelFront playing in this whole hybrid conversation?
[00:07:56] Speaker B: I guess you can say we're right in the thick of it, right? So from my view, basically, yeah, what you always had was tons and tons of stuff done by machine translation, right? Even 10, 20 years ago, that volume just exceeded, by orders and orders of magnitude, whatever could go through this slow, expensive human translation pipeline. But what you didn't have was the ability to fully automate really any of the human-quality content, maybe with one exception: the translation memory, right? So they say, okay, if something's already been translated once, we're not going to pay for that. That's just going to come from the cache. But that came in the '80s and has sort of stayed still. For most new high-quality content, it's sort of hybrid in the sense that, okay, maybe AI generates it and then someone looks at it. It's hybrid in the sense that there's AI involved. But nothing ever actually skipped humans. Every single word, if it's content that matters, still had to get a human look at it. And so what we're doing is saying, hey, these machine translations are good, right? Or these ones are bad, and maybe we fix some of them and now they're good, and these ones are bad, still bad, and need to go to a human, right? And so that means that it's hybrid in the sense that some of the content gets looked at and some of the content doesn't. But for the content that doesn't, it really is fully automated. It really actually skipped humans and got published to this Fortune 500 website without any human looking at it before that happened, right? And so that's a quite different kind of hybrid. And I would just really stress that the pie is radically growing. There is no shortage of content. There is no shortage of people who can't understand English. And they are growing in the real world like crazy. And they are growing on the Internet even faster, because they're all the ones coming on the Internet now, right? And they're growing by GDP as well, I should say.
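What Adam describes amounts to a per-segment gate. Here is a minimal sketch, assuming a scored quality-estimation step; all the names, thresholds, and the toy scoring logic below are illustrative assumptions, not ModelFront's actual API:

```python
from dataclasses import dataclass, replace

@dataclass
class Segment:
    source: str
    mt: str  # raw machine-translation output for this segment

def qe_score(seg: Segment) -> float:
    """Stand-in for a trained quality-estimation model.

    A real model would be trained on the buyer's own edit history; this toy
    version treats context-poor short segments and straight apostrophes as risky.
    """
    score = 0.9
    if len(seg.source.split()) < 3:  # too little context to disambiguate
        score -= 0.4
    if "'" in seg.mt:                # typography the style guide forbids
        score -= 0.1
    return score

def auto_fix(seg: Segment) -> Segment:
    """Stand-in for an automatic post-edit: straight to curly apostrophe."""
    return replace(seg, mt=seg.mt.replace("'", "\u2019"))

def route(seg: Segment, approve_at: float = 0.85) -> str:
    """Publish, fix then publish, or reject to a human; never guess when unsure."""
    if qe_score(seg) >= approve_at:
        return "publish"               # fully automated: no human ever looks
    fixed = auto_fix(seg)
    if qe_score(fixed) >= approve_at:  # re-score after the automatic fix
        return "publish"
    return "human_review"              # still uncertain, so a person decides

print(route(Segment("It is John's car.", "C'est la voiture de John.")))  # publish
print(route(Segment("Home", "Inicio")))                                  # human_review
```

The three outcomes match the hybrid he describes: approved segments really do skip humans entirely, and only the uncertain remainder lands in the human queue.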
[00:09:49] Speaker A: Right.
So we have growing frontiers, right? And you had these vast amounts of volume that couldn't get checked for quality, to be able to access, let's say, markets, to put it that way, to access people.
Now vast amounts of that content gets accessed by people, which in turn is gonna generate many more interactions, very likely, and exchanges in disparate languages.
What are the bottlenecks that you foresee coming, to continue solving this? Right, because I presume then, you know, those translation memories need to be updated, need to continue growing. You need linguists to really be hand in hand with this. I was just seeing a video about how OpenAI hired all these therapists, hundreds of them, just to train the model, because people are accessing ChatGPT in distress, and the system was basically hallucinating all the time and sending people to, like, their deaths. And then they saw an opportunity to train the model to make it a lot better, right? So it feels to me, of course, that this must be self-reinforcing, that it must have some cycles for reinforcement, cycles for growth. And I'm trying to think what the bottlenecks would be in this path.
[00:11:09] Speaker B: So that's very true, right? So all of the labs spend more on data labeling, basically from experts. They don't want trash data, right?
So they spend on high-quality experts. They want high-quality data, which is then policed, double-checked, et cetera, and still often has problems, right? But they spend money on that data. It's called reinforcement learning with human feedback, right? And those budgets are probably bigger than a lot of the translation budgets and, you know, translation company revenues, right? So that's how you build good models, right? You have human feedback and, you know, continued fine-tuning on that. However, there is a big question about who's going to be controlling that, right? So essentially, if you're a translation buyer, if you just go, okay, I'm going to trust everything to GPT, then they're the ones sort of deciding what those terms should be, how the translation output should be, et cetera. Whereas what I think is probably a little bit more what a buyer wants is that the buyer is the one doing that labeling, editing, et cetera, according to their style guide. Basically, they're the ones giving the feedback and doing the monitoring of what's output, because, you know, nobody can know what your product should be called other than you, right? In Spanish, let's say. So now, when you talk about the future: well, there are the translation departments who are going to embrace that and do that and be the ones who kind of control how their models behave, right? And be the ones who provide the data to their models and evaluate which vendor they should get the models from and all of that.
And there are, let's be real, translation departments where the people cannot just make this jump, right? And they're going to be replaced by new people who can. You can see it happening, right? So I can't predict the future. I'm just talking about what's already happening now.
[00:12:51] Speaker A: Yeah. In fact, I was telling someone at a meeting today, at another panel earlier, that I was replaced by AI Eddie.
I'm AI Eddie. This one knows how to use AI. I replaced the one from three years ago that no longer exists.
So it was already replaced. By the way, I think a lot of people are rising to the challenge, and maybe some can't. And I have many questions, and I'm not sure I want to get into the future of work and, you know, how people should be looking at their careers; that, I think, people should discover very quickly. These are moments where we're very fortunate to have many tools to be able to understand how you effect change and growth in your life. So getting into that seems to me almost disrespectful to certain people. But I'd certainly say one thing that's also interesting to me, in this whole transition, with people changing functions and what they do and transforming their careers: it's trying to understand what's evolving in the elements that allowed for translation to happen like this, for translation to be automated. When it was human, of course, there were so many steps and so many manual clicks that you had to do to get a project going from, you know, input to output. And now it's simplified.
It almost feels to me like that image you see when they compare the different rockets from SpaceX: you have all these cords all messed up, and then you see the latest model and it's just much more simplified, much more efficient. But the elements are there somehow. So from the perspective of ModelFront, whatever you can tell us: what are those constituent elements of this job that have been simplified and allow for this level of efficiency to even be possible, or exist?
[00:14:47] Speaker B: So, yeah, I would say that there's a ton of stuff that goes into the process, for example, getting the data, you know, getting the content out of the CMS, and, you know, assigning projects to people, and all of this stuff. And I can't stress enough how important that stuff is, right? And if you don't have that working, you probably shouldn't be talking about AI yet, and you probably shouldn't be on stage at industry conferences telling other people about AI. But what ModelFront does is then specifically only that actual core workflow part, right? So you have, okay, I have a machine translation already generated inside of this project, and we want to know which segments need a human to look at them and which don't. So we're very myopically focused on what you can say is the most expensive part, right? And going to what you're talking about: a lot of times you find that the things around that, sort of project management, are also not very automated, right? And they shouldn't really be done with AI; they should be done with good old software. But when the part that we focus on was the bottleneck, you didn't have that much incentive to go fix the things before and after, right? It's like, I don't know, booking airfare is not a very pleasant experience. But the reality is that the biggest cost is sitting for 10 hours on a plane, right? So you don't have that much incentive to fix these things. But if you made the plane rides, you know, twice as fast, or maybe 100 times as fast, all of a sudden you'd think a lot more about, you know, what a pain it is to get through security, right? And that's not what we focus on. But I think that's why, all of a sudden, there's a little bit more focus on those things as well.
[00:16:19] Speaker A: How complex is it getting, the content that you're deciding on? Because I assume it's an ever-growing list of things that don't need human intervention. As you've progressed, I presume new things are coming that become easier to predict are good to go, right? So I'm trying to think about those use cases, or let's say successes, successful publishes that you have seen, the ones that you can talk about, to get a sense of what are the things that don't need humans but are still complex in perhaps how they're conceived to be executed at high stakes, is I guess what I'm looking for.
[00:17:04] Speaker B: Right. So probably you've seen content that was actually approved by ModelFront and you just didn't know it, right? So technical docs for some of the Fortune 500. A lot of patents: you probably haven't seen those, but a lot of patents in all fields, biotech, machine learning itself, even AI itself, patents, all chemistry, et cetera, et cetera. Right? All that kind of stuff.
The hotel descriptions on the largest booking platform, right? A travel booking platform, right? So all that kind of content is going through ModelFront.
But I would stress that... well, yeah, Farfetch, you know, they've spoken at LocWorld, right? So the top luxury fashion marketplace, right? I would stress that it's used for all sorts of content, right? I mean, patents to luxury fashion is a pretty wide span. However, the percentages within those content types are very different, right? So what you're typically going to see is that for some kind of e-commerce content, you can automate a very high rate of that content. That content is easier. For some kind of very heavy technical, legal content, the rate of automation is much lower, but the value of automating a small percentage of that may be very high.
So, you know, there are different ways to look at this and I think both of them are basically good for humanity.
But one way I like to look at it is that some of the stuff that's more of a luxury, figuratively speaking, it's much easier to innovate with that. And then that kind of funds the innovation for the stuff where there's a...
[00:18:42] Speaker A: ...little bit more bureaucracy and more complexity.
You've got to manage so many moving elements. In many cases, I can only think about how policies change, and those change with the political climate. And then some of those things could really affect a lot of this really high-value, high-stakes content; it's the institutionalization of the world. And I definitely believe something like this is needed. It's probably a whole layer that needs to be covered.
[00:19:11] Speaker B: And there I would say one interesting thing, one big difference between doing generation and what we do. So if I go back to doing unverified generation, right? I know, me and just a ton of people, sort of AI engineers out there: you spend your life, you get a slightly higher BLEU score, generate slightly better output. And you see some inputs and you're just like, I don't know what we could do with this input. It's so ambiguous and hard. And the law might change, as you said, right? And you still gotta spit out something. And the great thing about what we do is we're saying we don't have to say what it should be, we don't have to go check what the law currently is. We just need to detect that this is something that depends on, let's say, the law changing, right? And reject it. That's incredibly freeing, right? And that's sort of the whole trick here: if you don't know, don't just guess. Say that you don't know and send it to a human, right? And so in that sense, even though, yes, it's very complex, with our approach you have a way to deal with that complexity. You just say: this is complex, this sentence is super hairy, we reject it. You detect the complexity. You don't have to generate the right thing given that complexity. So it's really a fundamental change in how we build AI.
[00:20:24] Speaker A: And it gives me a kind of image of a type of dashboard, where that dashboard is specific to every, you know, activity, or every, I don't want to say client, but every challenge. And then, almost like in that movie where you have the five feelings, you have all these things that start appearing as the kid grows. Companies probably add complexity to certain things. Immediately, and I'm just playing with it right now, I can imagine the first decision for MultiLingual would be: anything that's social media, human-check everything; internal communications...
[00:21:00] Speaker B: You don't even have to set that check, because that complexity is in the data, right? Because the humans have been making those decisions, and on average they are editing those things more, for example.
[00:21:13] Speaker A: It's also very freeing, the model that you're suggesting, because it puts some of the responsibility on the holder, on the one that decides to use ModelFront. Because it's almost like the people that defer responsibility to their large language model.
It's almost similar here, where you're saying you can defer that responsibility to the layer, but you have to think about it as well.
[00:21:39] Speaker B: I would counter there. We definitely have a lot of responsibility put on us because in some sense we are the last defender.
So in most previous sort of AI rollouts, either they said, okay, this content doesn't matter that much, so we can just roll out without any human checks, or, this content matters so much that we're still going to have humans look at every word, in which case there's not that much responsibility on the AI. Now in our case, we're telling them which ones need to have a human look at them and which ones don't, for the relatively high-value, high-quality content.
So there is a ton of responsibility on us. And I think that's why, you know, basically my old team at Google, a lot of other tech companies, translation agencies, TMSs, have sort of failed to make this work in the real world. Because there is this higher bar: you have to have people carrying pagers, you have to have a different kind of guardrails, you have to have the whole system set up around it and take that responsibility, which they kind of don't want to do, right? If I'm Google Translate or a TMS, I kind of just want to add this feature onto my existing system, it's just one more little checkbox in the roadmap, and get on with life, right? And so, to be fair, let's say there is a lot more responsibility on us, and we say that explicitly.
[00:22:50] Speaker A: No, absolutely. So would you say, then, that you get involved in the definition, or is the definition predetermined? And how does that exchange work with the one interested in engaging in this core responsibility, if we put it that way?
[00:23:08] Speaker B: Great question. Right. So, yeah, it is primarily defined by them, via their data and their style guides and so on. And we try to find any ambiguities or conflicts there, because a lot of times the data is not in sync, and then ask them: okay, we see sometimes it's going this way, sometimes it's going that way. Maybe there's some really complicated new nuance.
[00:23:30] Speaker A: Right.
[00:23:30] Speaker B: It depends what day of the week it is, or whatever. In that case, we're just going to reject everything, right? Everything should go to humans. Or maybe they actually want it this way, and the other way was, you know, the old way, or people who weren't following instructions. And then we say, okay, great, thanks. We'll do our best to make things work that way. Yeah, so that's kind of, you know, they primarily guide it, but we always have to go in for feedback and get more input.
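Here is a sketch of that kind of conflict check, under the assumption that the customer's data boils down to source/target pairs (the function name and data shape are hypothetical): when the same source has been translated two different ways, surface it and ask, rather than guess.

```python
from collections import defaultdict

def find_conflicts(pairs: list[tuple[str, str]]) -> dict[str, set[str]]:
    """Return only the sources whose data shows more than one target."""
    targets: defaultdict[str, set[str]] = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    return {src: tgts for src, tgts in targets.items() if len(tgts) > 1}

data = [("Home", "Inicio"), ("Home", "Casa"), ("Save", "Guardar")]
print(find_conflicts(data))
# {'Home': {'Inicio', 'Casa'}}: ask the customer which one is current,
# and reject affected segments to humans until there's an answer
```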
[00:23:52] Speaker A: And that's really interesting, that interaction of what works and what doesn't work, and how you understand what works. And I'm very optimistic about that approach, where you are proactively, almost, learning from everything that's going on within the ModelFront ecosystem, right? So I imagine, then, that there is some way in which you are bringing these learnings, to reject things that perhaps were not rejected before, to a different conversation. You say, hey, this was rejected in these different instances.
Then, is your system able to suggest over here: hey, we're rejecting it, check it, not because of something that you did, but because of something that we've seen anonymously, anonymized, I guess? Is that something that happens?
[00:24:41] Speaker B: Right. So, yeah, we have to be very careful: we can't transfer any data across customers. Each one gets their own private custom models. But there are a lot of learnings, of course, that are across customers. So an obvious example would be that short segments are very ambiguous, right? If you just have something without context. You know, we got burnt by that a few times, and then it became the default across customers. A favorite one of mine is the time of day, right? Does 9 o'clock mean, you know, sometimes it means 9 o'clock, sometimes it means 21:00, et cetera, right? So you see this kind of stuff: currencies, all the usual places where you get burnt. And I would say this goes back far before ModelFront, right? I had a kind of rabid fascination with translation fails when I was at Google Translate, and sort of my favorite hobby was predicting, only seeing the source input, and saying, oh, it's going to mess up on this one, right? And getting a very, very good sense for that. And that stuff hasn't really changed, right? It's evolved a little bit. But I think that having that really, really good instinct is just based on seeing a lot, a lot of examples of what can go wrong. And, I would say, knowing a lot of languages. I think it's very different when you take a cross-lingual view and you're able to at least kind of understand 10 or 20 languages. Yeah. And it is kind of the same things that go wrong. I'd say translation errors are, way more than people think, about the source. Basically, let's say English is the source: there's an ambiguity in the English, and that ambiguity that's in the English is a problem for most target languages, because they don't have that ambiguity.
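Those cross-customer defaults can be pictured as simple source-side detectors. A toy sketch for the trouble spots he names (context-poor short segments, times of day, currencies); the patterns and the three-word cutoff are illustrative guesses, not ModelFront's rules:

```python
import re

# Source segments that trip these heuristics get rejected to a human by default.
RISK_PATTERNS = {
    "time_of_day": re.compile(r"\b\d{1,2}\s*o'?clock\b|\b\d{1,2}:\d{2}\b", re.I),
    "currency": re.compile(r"[$€£¥]\s?\d|\b\d+(?:\.\d+)?\s?(?:USD|EUR|GBP)\b"),
}

def risk_flags(source: str) -> list[str]:
    """Name the known ambiguity traps present in an English source segment."""
    flags = [name for name, pat in RISK_PATTERNS.items() if pat.search(source)]
    if len(source.split()) < 3:  # short segments carry almost no context
        flags.append("short_segment")
    return flags

print(risk_flags("Doors open at 9 o'clock."))  # ['time_of_day']
print(risk_flags("Add to cart"))               # []
print(risk_flags("OK"))                        # ['short_segment']
```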
[00:26:21] Speaker A: Adam, I thought that the elephant in the room, definitely not AI for us, was the whole human-quality element.
That is a bold statement. But as we are discussing it, right, when we start talking about different types of outputs that are very well established and well set, very institutionalized, to put it that way, it seems that it is possible to get to a good number of these use cases. Could you give us a sense of the use cases where you're like: this is 100% going to happen, human quality? And then there's this other end, translating Hamlet, that's like: sure, maybe, we'll see, type thing. Because there already is an institutionalization of types of translation.
I see where it could go, but could you give us a sense? You are in it. How does that work?
[00:27:16] Speaker B: Actually, yeah. So I think a very good rule of thumb for this is how many translation memory matches there are. So basically, if there are a lot of repeated sentences in the content, or even fuzzy matches, right? They say history doesn't repeat, but it rhymes, right? So if the content kind of rhymes with what came before, and it's just sort of different remixes of the same thing, like a product catalog, or the instruction manual in a car, right? Year after year after year, where a lot of it's the same, and even the parts that aren't the same tend to be very similar to what came before. And so automation works better. So if the old automation worked well, the new automation is also going to work well. If you've got, let's say, a totally new poem, or a novel, as you mentioned, or boutique marketing strings, right? Just some slogan that's going to go on a billboard in 100 countries, but it's three words long in English, that kind of stuff. Even humans don't agree on it. Old-school automation didn't work well. I'm the first guy to say, you know, like, why do we talk to each other in different languages? You speak one language, you know, even you and I, or you and your colleague, right? You speak English sometimes and Spanish sometimes, because you actually can't translate this stuff, and you're like a completely different person. I'm the first guy to say there's stuff that's just untranslatable, right? And so that is obviously, you know, not touchable by these approaches. You're not going to get much efficiency there.
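That rule of thumb can be made concrete: measure how much of the new content "rhymes" with the translation memory. A rough sketch, assuming the TM is just a list of previously translated source sentences; the 0.75 cutoff is an arbitrary stand-in for a real fuzzy-match threshold:

```python
from difflib import SequenceMatcher

def fuzzy_match(sentence: str, memory: list[str]) -> float:
    """Best similarity ratio between a new sentence and any TM entry."""
    return max((SequenceMatcher(None, sentence, old).ratio() for old in memory),
               default=0.0)

def match_rate(content: list[str], memory: list[str], cutoff: float = 0.75) -> float:
    """Share of new content that rhymes with what was translated before."""
    if not content:
        return 0.0
    return sum(fuzzy_match(s, memory) >= cutoff for s in content) / len(content)

tm = ["Press the brake pedal firmly.", "Check the oil level weekly."]
docs = ["Press the brake pedal gently.",              # a remix of an old sentence
        "To be or not to be, that is the question."]  # brand-new content
print(match_rate(docs, tm))  # 0.5: the manual sentence matches, the poem doesn't
```

By this measure, a yearly instruction manual would score near 1.0 and automate well, while a fresh poem or a three-word slogan would score near 0.0.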
[00:28:40] Speaker A: Yeah, I really like that, because it really helps scope what we have in front of us in a very clear way. I of course have to also ask you about where you see ModelFront, or how you see ModelFront growing. How does it even position itself in the whole globalization spectrum, and in what companies are looking to do, what organizations are looking to do?
Are there any particular verticals that are more interesting or not? Because it seems like for you the sky is the limit. You could really go in a lot of different directions. What do you see right now as that direction?
[00:29:23] Speaker B: So there are definitely limits.
I'm keenly aware of the limits. The thing that bothers me a lot, firstly, is that in a lot of cases, the percentage of translations in a given content type that we can actually safely approve is too low, right? It's not where it is in the other areas, right?
So, as I mentioned, in the e-commerce vertical, things work super, super well, right? But when you get to more technical, high-value stuff, the percentages are lower, even though the value may be high. The other thing is, I mean, we turn away buyers weekly because they don't have enough volume, right? So we're not able to provide that level of quality, you know, in small-data scenarios, or medium-data scenarios, let's say. And therefore we don't provide anything, because there's no point if you don't provide human quality, right? So those are some of the big limitations here. What I can say in terms of our position is that we're very independent, right? So we have put more resources into this problem than any other company on Earth. And I'm including, you know, Google, Microsoft, Amazon, et cetera, the large translation agencies, et cetera. I don't think that's the only secret here, but that's part of it. And the reason is because we don't provide machine translation, we don't provide a TMS, we don't provide human language services, and I promise you we never will, right? Nobody needs another human language service. Nobody needs another machine translation engine. Nobody, right? People have those things already. They don't need that. And so it's better if we just focus on the one part that we do better than anybody else. And so that's been our position since the start, that's remained our position, and that's going to remain our position. Yeah. And I think one big limitation here is that a lot of the legacy tools, contracts, people, et cetera, block companies. They do want to use AI, they do come to us and they want to use this, and they're actually blocked. When we say, okay, let's go look at your scenario: first you've got to fix this, this, and this, because you're not even in control right now. So I think when you talk about limitations, there are a lot of limitations right now. Like, people just can't get their data out of some, you know, dinosaur system, or they can't get out of a contract that's like this, right? So there are a lot of those kinds of limitations too, if we're honest.
[00:31:43] Speaker A: Very interesting to know. And then: ¿hablas español? [Do you speak Spanish?]
So you know languages.
It's not like you hate languages.
You're not evil AI. I never said that. You said it.
[00:32:01] Speaker B: But I am an evil AI guy.
But I'm not only an evil AI guy.
I speak eight languages, actually.
[00:32:13] Speaker A: Which eight languages do you speak?
[00:32:16] Speaker B: English, Spanish, German, Italian, French, Serbo, Croatian, Russian, and Armenian.
[00:32:23] Speaker A: Oh, wow. How did you make that connection? How did you, yeah, learn all of them?
[00:32:28] Speaker B: Just the school of life. It was not a plan.
[00:32:30] Speaker A: Oh, wow.
[00:32:31] Speaker B: So, but yeah, I obviously like it, and I live it, right? And my kids live it, right? And I mean, this is kind of what bugs me. You're in the same situation, right? I mean, there's just content... I don't know, your kids probably started learning English by now.
[00:32:43] Speaker A: Yes, they know English by now.
[00:32:46] Speaker B: They do, but they didn't when they were little, right? And same thing here. You know, my oldest one, she's started, you know, learning it. The younger ones haven't, right? And then you want to share something with them and they don't have access to it. Well, you know, they're in a good position, because their dad is, you know, CEO of some company in the translation business. So they're going to be fine, right? But you realize what it's like for... there are a lot of little girls who don't have that, you know. If you want to learn to program, or you want to watch this podcast right here, and they don't have access to it, right?
Yeah, it's very personal for me like that. And then, yeah, I remember when my daughter was learning to code. We started when she was like six, right? And I don't want to say the name of the tool, right, because, you know, it's free, it's a good tool, it helps a lot of kids, right? But I went in and looked at it in German, and the translations were just terrible. Like, you couldn't understand it. The only way I could help her is because her dad is actually, like, a programmer who knows what the English was supposed to be that this thing was translated from, right? It was so bad, right? And we're not talking a long time ago here, right? And so, yeah, this is the kind of thing that motivates the hell out of me, I gotta be honest.
[00:33:59] Speaker A: And that's great, because that also speaks to the linguists, and, you know, to where we originally were. We were discussing how languages are growing in the numbers of people that speak them, and in their complexity, and in their nuance, and in their culture. So there's this ever-growing complexity problem that seems to be growing exponentially, and still there are these things that are making it very bottlenecky. It feels very bottlenecky.
What role would linguists play in it? And what role do the large language models play in it, like this automation? We already talked plenty about the automation. But the linguists, the people that actually understand the culture, like speakers. You know this, and you probably feel this when you speak the languages: you get to know the culture and jokes and things where you're like...
[00:34:50] Speaker B: And in your head you're not translating. Right.
[00:34:52] Speaker A: It's very different to understand the language than to have, like, the translation of what they are talking about. It's very different. So what role do you see for linguists, and the ones that enjoy translation and seeing if the words match or not, or where they didn't match, and the taggers, I guess, of the future?
[00:35:11] Speaker B: Right. So I would actually say that I think the linguists were right, basically, about a lot of stuff, and about maybe the most important thing, which is that post-editing AI output is not actually faster if you still have to check every single word. So what happened, right, as I said in the beginning: translation generation got way better, and we all expected these savings, right? And you had the industry, the supply chain, this middleman to the middleman to the middleman to the middleman. They all promised these savings. And people inside, you know, companies promised savings to their bosses, et cetera, et cetera. But that didn't materialize, because it turns out that if you still have to look at every single word, you're not actually that much faster. So the linguists were right, right?
And what was done then was: okay, let's just squeeze them a little bit. Certain TMSs even have the default that they pay you less if you don't edit that translation, even though you spent time looking at it, right? Just kind of whip these people, squeeze them a little bit, et cetera, and, you know, introduce this light post-editing. And by that we mean: look at the same amount of content, but just, can you go fast?
[00:36:19] Speaker A: Right?
[00:36:20] Speaker B: And so then you look at this data. My co-founder, our CTO: his wife was actually a freelance translator for one of our customers now. And given what I'm about to say, I sure as hell can't say which one. And he was looking at this, and it's like, wow, she leaves a lot of these unedited, and then has to edit. You know, you look: confirm, confirm, confirm, then edit, confirm, confirm, confirm, right? And this becomes, you know, your whole day, where 80% of your time is spent on stuff that you don't actually end up editing, right? Which is crazy. And a lot of them were pretty quick to notice this. It was the supply chain that didn't really notice it, because they're just middlemen.
And, yeah, so you have this pressure, and their job became just rushing over stuff and then making very few edits and not really doing a very good job. So you look at the data, and it's full of mistakes. You know, the people who really actually enjoyed the beauty of the language, they quit, or they shifted to those boutique things that weren't under this kind of pressure. And you had a lot of people like, yeah, I'm a translator, who couldn't really translate, and they just, you know, accept these conditions, right? And just undercut each other, et cetera. That's kind of what happened. And what I think ModelFront is doing is actually bringing things back to maybe a little bit more like how they were before, where, okay, we're saying: look, the easy, automatable stuff is going to be off your plate, and what you look at is even more important, because it's going to be used to retrain models, to monitor models, police them, right? So please take your time. I'm telling our customers: please tell them, take their time, pay them well for the part that they do look at, right?
And for the translators, they're still fully employed the whole day. It's just that the company can do more content.
So the part that they do is going back to these interesting parts, like: how do you translate that word? That's the first time we've ever encountered this word. Rather than fixing a straight apostrophe to a curly apostrophe for the millionth time this year. I think we're going to shift back to the translators doing that interesting, high-value work, and on average the wages and so on are going to reflect that, right? But it's not always a smooth shift, because you have a lot of middlemen. There are going to be people who get shafted, who lose because of the way the system works, which is, you know, not my mission to change, but, you know, it has some problems.
[00:38:46] Speaker A: Yeah, yeah. And I understand this is also the type of convergence where, you know, teams also take the opportunity to reassess, to restructure, to rethink, to realign ambitions, to redo their missions. And, understandably, it's hard sometimes to communicate those. And I see challenges in certain companies communicating how they are navigating the moment that we're going through.
[00:39:14] Speaker B: And it's definitely not always fair. I mean, again, I'm not predicting the future. I'm talking about what's already happened and what's happening. I see it. It's not always fair who gets laid off or who loses a job because of these things, right?
There may be somebody else who's more political, let's say.
[00:39:28] Speaker A: Yeah. And I think we understand that part of the conversation. I think this has been a great conversation, Adam. I think we're going to continue it because I'm going to be seeing you in Saudi or India.
[00:39:42] Speaker B: India.
[00:39:43] Speaker A: India. We'll see you in India.
[00:39:44] Speaker B: I'm really looking forward to this, by the way.
[00:39:46] Speaker A: This is fantastic. And I think there will be some great conversations happening there. Adam, before we go, any final thoughts? And of course, if you're listening to this, you should look for the second part of this conversation. Well, the continuation of this conversation. I'm pretty sure I'll talk to Adam in India, and we'll dig a little deeper into other things that we see over there. Adam, any final thoughts or comments?
[00:40:06] Speaker B: Yeah, I can't stress enough how important it is to be in control of your data, your workflows, et cetera, right? If you're a translation buyer. So, yeah, I think if you're excited about AI, if you want to have the freedom of different moves on the board, now or hopefully at least in six months, then you really want to make sure that you're in control of everything that you're paying for.
[00:40:30] Speaker A: Adam, thank you so much. It's been great.
Thank you, everyone else, for listening to Localization Today. Again, a big thank you to Adam Bittlimayr. Did I butcher it? How was that, Adam? Good.
We'll train a bit more.
[00:40:51] Speaker B: Come on, Espanol.
[00:40:53] Speaker A: Oh, sí [yes]. Bittlimayr. Thank you for joining us. Thank you, Adam.
Yes, mucho gusto. [A pleasure.]
Thank you for sharing how ModelFront is making human-quality translation scalable and accessible, empowering organizations to reach more people in more languages than ever before. For those listening, remember: you can catch new episodes of Localization Today on Spotify, Apple Podcasts, and YouTube. Subscribe, rate, and share so others can find it. I'm Eddie Arrieta, CEO here at MultiLingual Media. Thank you for joining us. See you next time. Goodbye.