Turning Data Into Direction

Episode 281 May 08, 2025 00:34:39
Localization Today

Hosted By

Eddie Arrieta

Show Notes

In this episode, we speak with Veronique Ozkaya, Co-CEO of Datamundi AI, about the company’s transformation from Summa Linguae Technologies into a data-focused, AI-driven service provider. Drawing on her 25+ years in the language industry, Veronique explores how multilingual data is becoming central to enterprise AI strategies—and why localization teams are well positioned to lead this shift.

We discuss the rebranding decision, the evolution of data services, real client use cases, and the internal cultural changes required to support such a pivot. Veronique also reflects on the strategic role of language in AI adoption, the challenges of bias and data quality, and the core values—openness, resilience, and transparency—that drive Datamundi’s vision for the future.

Episode Transcript

[00:00:03] Speaker A: The following is our conversation with Veronique Ozkaya. She's the co-CEO at Datamundi AI. She's got over 25 years of experience in our industry. I'll tell you more about it as I introduce her. This was very exciting because we were able to see the transformation of a company in our industry, the language services industry, from Summa Linguae Technologies, very well known, into Datamundi AI as a way to adapt to the current demands of the market. A very interesting conversation with Veronique. We're very grateful to have her at Multilingual. I hope you enjoy the conversation.
[00:00:48] Speaker B: Thank you, Eddie, for having me. What an intro.
[00:00:51] Speaker A: Thank you, thank you. I'm a good reader, an executor of great profiles, and yours is a lot easier to work with. Veronique, thank you so much for doing this. We were talking off the recording about the fact that you are located in Izmir, and you gave me some fun facts. What is your fun fact about Izmir in Turkey?
[00:01:17] Speaker B: The fun fact about Izmir is that Aristotle Onassis was born here. So you might know him as this very successful shipping magnate from Greece. He was actually born here, and then in the 1920s he moved to Argentina, and the rest is history. And then another thing when you come and visit, Eddie: one of the marvels of the world is here, Ephesus, which is an ancient city. So there's a lot to do here and, as I was mentioning, great tech talent. So that's also one of the reasons why we are based here.
[00:01:48] Speaker A: Fantastic. And we are excited to get to know more about Turkey, the language and everything, through there. Who knows, we might do a whole issue on Turkey soon. That could be an option. And of course the big news is Summa Linguae Technologies is now Datamundi AI. Why don't you tell us in your own words, Veronique, what does it mean to you?
[00:02:15] Speaker B: So we as an organization, our strategy and our focus is very much on data services. And the history, just to give you a little bit of background on how we came to that, is that Summa Linguae grew very much through a series of acquisitions, and a couple of those brought a new level of services to the portfolio. The first one was a company called Globalme, and that was bought in 2019, and it already had audio collection and this kind of services. And then after this, interestingly enough, at the end of 2020 we bought a company called Datamundi, and this company, which was headquartered in Belgium at the time, was specialized in data services and had a platform to provide these services. And the services were typically around enhancing data sets, cleaning data sets. So very much once the data set is created or supplied, making it better or making it fit for a certain purpose. And that was actually, I think, a very smart acquisition because it really showed where businesses were going. And this was prior to ChatGPT. So if you look really back in the language industry, in the 90s you had CAT tools. In the 2000s, it was TMSs. In the 2010s it was neural MT, you know, the era of MT. And then with AI, it kind of came out really with ChatGPT in 2022, where there was already a lot of work being done in the background to actually enable these systems to come to life. And this is what Datamundi was doing. So a very smart acquisition on the part of my co-founder and co-CEO Christoph. And then it's actually why I joined. It was one of the reasons I joined, because I thought this is really where the future lies. And I thought that having data services but also having this multilingual angle, because when you have the background of being a global language service provider, you have this understanding of culture, of nuances, of linguistic aspects. And when you think about it, AI is in English mostly, right?
And at some point an organization says, okay, we need you to improve the English for the AI, because English is a language, but then how do we go on to actually serve all markets? And that's where your language skills can be very, very useful. Even though it might be different profiles, the understanding of how language works is a big asset. So at the end of last year, when we were making our strategy plans, our CMO Kathleen said, you know, our name doesn't reflect what we do. So we thought about how we would rebrand and we thought, why don't we use Datamundi? This is a brand that actually, you know, we've bought, and instead of trying to come up with a new name, we decided to go with Datamundi. And she was very, very smart: she joined in September, and in October she made sure she bought the domain and everything that was related to acquiring the brand.
[00:05:39] Speaker A: Very quickly, and a very quick decision, of course. Datamundi: data mundi, or data of the world. Could you tell us a little bit more about how this concept captures that expanded focus? What does it mean for your clients and the broader AI ecosystem in language as well?
[00:06:02] Speaker B: So for our clients, if you look at, I would call them traditional clients in localization, so traditional clients in language services, what they do is that they have data in a certain language, usually it's English, can be German, whatever, a source language, and they transform it, right? What I think clients maybe don't always see is that they're sitting on a gold mine. All of this multilingual data that they've been curating and enhancing and so on and so forth, it's seen sometimes as mundane, as translation. We look at it as data that can be reused for other purposes. It can be reused for creating AI systems, it can be reused for getting insights, it can be used for your technicians in the field.
You have all of this information that you could feed them, right? So we actually look at all of our services as some way of manipulating data. So for localization clients, it means that we believe we can go further, and we believe our language services clients should really see how much more strategic they can be by being really the multilingual and the data owners within the organization. So that's the part where what we're seeing with our traditional clients is that they now have new needs and they need us to help them figure out how to serve these needs. So imagine you're a localization buyer, a localization manager, and you have all these AI initiatives going on in your organization. Take something very simple, an internal AI assistant. At some point someone comes and says, oh, you know what? We have workforce in France, in Chile, in Italy, in Japan. We want them to use this AI assistant. This is not going to be in English. They want to get the information in their language. Where do they go? And I think a lot of localization entities within clients are starting to think, hmm, how can I be more strategic and really support these initiatives? So that's the part on language services. Another aspect is also using the data as insights to help sales. So let me give you an example. Let's say you're a retail site and you're selling shoes, let's say for the swimming pool, this kind of flip-flop type shoes, right? And you've sold a few thousand, you've got a lot of reviews, right? If you improve even the description of that product, to what extent is that going to increase your sales? We can do this in a much easier way today with all the technology that's available, and you can be much closer to your client when it comes to their sales goals, when it comes to the actual client goals. I think in our industry maybe we tend to be a little bit too navel-gazing and look at all the ingredients that come to make a recipe.
We tend to really focus on what the client is trying to achieve and how our services can help them. Then the bigger part of what we do is pure data services, and this is really to fuel AI. So if you imagine AI systems, you've got the models and so on and so forth, you have layers coming on, IT platforms and so on. But really it only functions if the data is good. So again, back to this AI assistant. Imagine that you're in an organization and you want to get access to knowledge. You've got this AI assistant that can help you. If you're not getting the right answers, how are you going to continue using it, if it's giving you things that you know are not correct, or if it's hallucinating, or if it's biased, or if it's incomplete? Because these are the three major issues you have with AI, right? The data is not complete, so the system doesn't know how to answer you; it's telling you things that are not true; or it's giving you bias. It can be gender bias, it can be... So this is all down to data, and that's what we fix, that's what we address. So for our clients, we do a number of things. They ask us to collect data because this is going to feed the system. So there are parameters to collect this data. They ask us often, and that's where we do the most work, to improve the data. And there are a number of techniques, you know, I don't need to get technical there, but what we do is that they give us a set, they tell us what the problem is, and we help improve this set. And then when they reapply the data into the system, you know, it's by iteration, it gets better and better. So what does it mean? You're using the AI assistant and you're happy because it's really helping you get the knowledge faster, it's accurate, and it enhances your productivity.
[00:11:01] Speaker A: But how much of this is also a transformation that's happening internally?
So a lot of it, yes, it's a projection of what you were doing, and then how much of it is also a transformation you're undergoing?
[00:11:12] Speaker B: So this is something that's super clear to everybody at Datamundi. We've been talking about this for months, and the way I run the company is very transparent. So everyone working with us knows what our strategic focus is and what we're doing. So how are we executing that strategy? Already since, I would say, September, October of last year, the strategic pivot was very clear to everyone, including the steps on execution. So it's not just, of course, marketing, right? There's everything that's happening behind it. And if I can maybe illustrate what it means: from a sales perspective, since last summer we've been training all of our sales and account managers, intensely, in really understanding the services, the use cases, what it means, what to ask clients, where to find out. So you've got all of this machine behind it. And it was very interesting because it also came from salespeople saying to me, you know, in my first couple of weeks, I can see the addressable market is so much bigger. I want to sell there, but I need to be equipped. So we put this in place. It's also in production. We've actually shifted a lot of people from other parts of the company into the data unit. I mean, that more than doubled in the last year. And concretely that means also, if you're a project manager, what are you going to do? In your work, to some extent, there are similarities. If you're a project manager, you organize, right, you manage time and quality and money, but you need to understand what you're doing, the subject matter. So we've also been training operations, and we've been restructuring how we do the third-party stack. So we have a platform, it's called ADA platform, and there it's about being able to follow the needs of the clients.
Because I think one thing maybe to understand about data services is the speed at which they evolve and how the client needs evolve. So in order to be successful, you need to be very nimble, you need to be very innovative, and you need to have that alignment, right? Alignment between people in the field who are selling, your operations people, your tech people and of course, and that's very important, your talent, your talent pool. And that's been something that is definitely a tough part because what we do is high-quality data. So typically you're talking SMEs, you're talking to some extent also linguists, right, depending on the type of work. But what we're seeing more and more is you need the highly specialized SMEs, so you've got different levels. And that is something, I think, for global service providers who are looking at data services, not to underestimate: what it is to actually curate the supply chain. I think that's one of the biggest and most interesting challenges as well, right? Because you're trying to find literally needles in haystacks.
[00:14:16] Speaker A: And of course, thank you for sharing that. It gives some perspective on the conversation. Of course, as we look at the case studies and what the company is actually doing, I will refer back to those who read the press release in Multilingual or on your website. And the idea is that it brings agility, it brings efficiencies, it brings all these very positive elements of scalability and speed to the clients. Are there any real case studies, even if anonymous, that you can tell us about, of something like this working for your clients?
[00:14:59] Speaker B: Yeah. So I can say that during some recent presidential elections, we worked for a client and we were able, 24/7, to update a system that you call on to get your latest news, voice-activated, and literally within seconds being able to provide the news that was happening in a given country, across a number of different countries.
So that's one good use case. We also have use cases with existing clients that created chatbots for their support, for their call centers. So using data and curating data in order to actually alleviate a lot of the calls that would, you know, happen live, and being able to direct that to the chatbot. And the measurement there was the usage of the chatbot, because you know yourself that if you're using an online chatbot and after a while it's getting in a loop and not giving you the answers, you give up, you want to talk to an agent. So the measurement there is the time, you know, and the number of cases that can be resolved without having to involve an agent. That's another use case. We do that quite a bit. Also we have use cases that maybe are more related to language, where we would literally clean data sets that have bias. Right. So, you know, as with languages, take French: if you're talking about a nurse, nurse is typically associated with the feminine gender. Right. But there are plenty of male nurses. So how do you fix the data sets in order not to keep this bias or not to infuse more bias within it? So that's another use case that's quite frequent. So this is just to give you a little bit of an example of the variety. And that's actually also very interesting because imagine you're a team leader or project manager. You're going to be getting different problems. It's not always the same problem. Sometimes you can put the same things in a bucket because you do the typical same work for a client. Often you're going to see an evolution of the needs and things. I mean, we're doing things like SQL-to-Python prompts. I mean some things that are really technical, where you need to find resources that are not so easy to find, or clients that say, you know what, we would like you to find 2,000 people that you're going to record on a specific device with a specific app.
And we need it within these parameters, and then we need you to do X, Y or Z with this data. So what you see also, and I think that's interesting, Eddie, is that we tend to specialize in high-quality data, but some of our clients also need some more kind of crowd. But again, it's a curated crowd because you need to have them do specific tasks. But we're seeing that to feed the systems, the variety of needs can be quite large. Oh, I don't hear you.
[00:18:16] Speaker A: I was muted. My apologies. Thank you for sharing those case studies. I think our listeners are really going to appreciate it. It's going to give them perspective and show them some of the variety, as you mentioned, that's out there. One of the things that you have mentioned I'd like to dig a little bit deeper into. You have made some references to it and we can probably infer the answer, but it's related to the evolution of companies within our industry, the localization industry, the language services industry. It seems like, and you mentioned it as well, one of the key elements or one of the key differentiators that companies have in our industry is the multilingualism and the knowledge of, as you said in your own words, the nuance. We internally call it kind of like the texture. How critical is multilingualism in this new, let's say, stage of our economies around the world?
[00:19:22] Speaker B: I think that it's crucial because again, you know, a lot of AI has been created in English, and you get to a point where you know that AI is also used at a global level in an incredible manner. Right. And it's been used by anyone, even just to get translations. You can ask your questions in whatever language and it's going to give you the answers in many, many languages.
What I'm talking about is the professional AI applications, where you're going to be a user in a given country and you will want that experience in your language. In the same way that if you're buying a product or you're consulting a user notice, you're going to be so much more comfortable if you can have that in your own language. I think that because the adoption of AI is just so wide, language is going to be more and more crucial. What I do believe, however, is that it is not that easy or simple. So to me, the very basic bread-and-butter thing, that's something that gets highly automated. And I believe that applies to a lot of sectors, to the very basic tasks. That's the whole purpose of AI. It's going to replace a lot of things that can be basic. So what is going to be crucial for individuals and talent that have the multilingual knowledge is the depths that they can go to to really understand how the systems function, recognize patterns and be able to address that. So this is really the matter that's at stake here: to elevate the game and understand how language plays and how to fix it in order for the models to do better.
[00:21:04] Speaker A: And that's fantastic because multilingualism almost gives you this superpower to be able to reverse engineer what you're seeing as results. And unless you have that nuance, it's going to be really difficult to gain an advantage in the conversation. Right. Of course, as the conversation evolves very rapidly now, it seems like fear has dissipated and we are starting to see the first signs of life. In this new evolution, clients are changing and their demands are changing. What they are asking for is changing. You've been in data for a long time. How have the data needs of the clients evolved, and what are their needs looking like right now?
[00:21:50] Speaker B: What I saw is that initially the needs of the clients were somewhat basic and quite related to volume.
Right. As they were building systems. So the classic things: is this a cat, is this a dog? And you know, so you would have big crowds of people, or even in the search kind of domain, you could have people who were maybe not full-time employees, but they would do this as kind of a side job or on a freelance basis. So it was kind of more, I don't like to say lower level, but more basic tasks. Right. That was really what the whole world of data was known for. But that has evolved. Right. And as these systems, all the foundational systems, are evolving so fast, I mean, between OpenAI and Claude and Llama and Gemini, I mean DeepSeek, you're seeing that the speed at which these systems are developing is just mind-blowing. Right? And the needs have increased in terms of complexity. That's what we've seen. And we've seen a lot of organizations also trying to figure out, you know, how to win this arms race, and not necessarily on volume but on quality. If you're interested in that, there are some really good resources you can find online that show you how the different systems work better. And I have a daughter who's in college, and I sent her the other day this ranking of different systems related to maths, saying, like, this is better. And she replied to me, so this is really useful. That's always my acid test, right, to see how much the younger generation picks up on that. And she was like, yeah, actually I've seen that, X, Y and Z. So the systems have all developed their speciality and their domain expertise. And with that the requirements have become more complex in order to make these systems better. So you're going to have systems that are more generic, and what's feeding them is more generic. And then you're going to have systems that are really specialist in some area. And we're back to this SME thing. So let me give you another example. I'm based here in Turkey.
Let's say I'm a cardiologist, right? And my hospital gives me a system that essentially is going to help me validate a diagnosis, right? So a patient comes, I do whatever tests I need to do, and I enter this data in Turkish, right, in the system. You want the system to be correct. So who validates this data? Well, we need to have doctors who are going to check the data set and make sure it is actually correct and it's not just going to come up with weird diagnoses. So that's the level of depth that some of these systems go to for the use cases. And that means also the level we need to get to in sourcing the resources that are going to validate and curate the data sets.
[00:24:56] Speaker A: And it's great to think about the elements that internally will need to come into place to make sure that all of this happens. This translates to your clients. And you have said, or at least you've said to your clients, we are your partner, not just a provider. Outside of the clichés, what does this mean in practical terms? When someone becomes a client of Datamundi, what does it mean internally for you?
[00:25:23] Speaker B: That's a good one because I think you've put your finger on it, about the aspect that's around consulting with the clients. You cannot provide data services if you don't understand what the client is trying to do. It's not like they just send you a doc and your data sets and here you go, here's a set of instructions. The normal or the typical process means that you actually sit down with the client's data scientists and you try to understand with them what they're trying to do. And then it's not just about doing it. What's expected is that you're going to do it better than they thought about. And by that I mean that you're going to be able to bring some process improvements or another way of doing things.
Let's say that a client wants you to improve product descriptions on a retail site. How can you do this in a way where the person who's going to be fixing these descriptions has got context? Well, maybe we build a tool around that and we say, hey, look, Mr. Client or Mrs. Client, we now have the ability to give our annotators the same environment that they would have if they went onto your site and looked at the product. So that's going to make the descriptions that much better, that much more relevant. And that's a process improvement. Or another process improvement is to say to a client, we've applied all of these techniques, we're not getting more improvements on these data sets for X, Y or Z reasons. That's also very valuable because you tell the client, okay, we've tried all of this, we've hit a wall now. What else can we do? So that aspect of consultative selling and production is extremely relevant and very important.
[00:27:17] Speaker A: It seems, of course, that the future is bright, not without challenges. And the best, from what we've seen, is that they've come to do certain things internally and then they start communicating to the world what they've done. And then their clients are more confident than ever, and values stay similar and they evolve. What are some of the values from Summa Linguae Technologies, now Datamundi, that you see standing the test of time to do this very, very massive endeavor that we're talking about?
[00:27:57] Speaker B: I think it kind of revolves around three main values. And of course the values come from the people, right? So if people are not on board with those values, it doesn't work. Right. One of the strong values we have is open-mindedness, meaning people being curious, trying new things, failing, and not being blamed for failing, but rather, you know, being praised for coming up with ideas.
So we have our Innovation Hub leader, Gert, who is channeling ideas and seeing, okay, what can we implement? So that open-mindedness, very important. Second, resilience, which means that you're going to stumble and fall and get back up. And I'm very blessed to see that in our organization we have incredible talent and they're very resilient. I mean, we've got people working weekends, doing incredible things for clients, never complaining, coming up with ideas. I mean, just even with the rebrand, if you can imagine, we changed all of the email address domains, this and that. The IT team was fabulous, working really hand in hand with our marketing team. So that resilience, very important. So innovation or open-mindedness, resilience. And the third thing is transparency, and that applies in everything that we do. It starts with me and the direct teams and their teams. So we want everybody to embrace, of course, our strategy. But to do that you need to be very clear about what it is and what it means and how it gets executed. And that's always been, you know, my policy, and I see that at Datamundi that is something that people value. So if you think about these values and how they translate towards clients: this open-mindedness, it's your innovation, how you try to always do better. The resilience is that you don't give up, you always find a way to help, you know, solve their challenges. And the transparency is also that honesty towards clients, to be able to say, okay, we've done all this, that's not... or, yes, we need a hundred doctors and it's not going to be Tuesday that we're going to have them ready, it's going to be Thursday, and this is why and how we're going to do it for you. So that transparency, I think clients also value that. You build this partnership with them because they know this is not an easy task that we're trying to do.
[00:30:34] Speaker A: Thank you for sharing that. It's going to be great for our audience to listen to. We are coming to the end of our conversation, but before we do that, I want to put you on the spot. I know it's very difficult to predict the future, and the world is a dynamic place, more so these days than we were used to. Also very exciting times. What do you see in the future? What are some of the things that you expect are going to take place?
[00:31:03] Speaker B: I think that we are going to see further disruption and I don't think it's a bad thing. It's a scary thing, of course. But if I told you, Eddie, that instead of watching a movie on a streaming platform, you would have to go out, get on your bike or in your car, go to a, whatever, DVD, you know, like a Blockbuster, spend 20 minutes to pick a film, and then, oh, you'd better watch it within 48 hours, otherwise you get a penalty, you'd say, come on. So I think that change is there and it's going to be much deeper than we expect. But it's not necessarily a bad thing. It's going to help clients actually do more and be faster and better and cheaper. So I see that it's going to be painful, all these changes that are being brought by technology, but it's also going to be for the resilience of business in general. And then the second thing that I see is that this addressable market of AI is massive. It has like a 25% year-on-year growth just in the data-for-AI services alone. So I do see that this is really a very vibrant sector. Not an easy one to get into, but if you find your niche and your specialty, this is something where the language skills, the multilingual skills, are going to be an asset that you can bring to the table.
[00:32:38] Speaker A: Thank you so much.
That's so related to what I wanted to ask to end the conversation, which is your final thoughts and a message, which you've given some already, but your final thoughts and message to your current and future team members, your current and future partners and clients.
[00:32:58] Speaker B: So my take is: embrace technology, don't be afraid of tech, don't put your head in the sand, embrace technology and keep an open mind, because there's always a path to find. I am not an optimist or somebody who's pessimistic. I'm quite realistic, with a little bit of optimism of course. And I really think that if you embrace and you learn, there's actually a very good future. And at the same time, our global services industry, the language service industry, is going to evolve quite dramatically, in my opinion, in the coming years. So we put the seat belt on and we get on this ride, because it's an exciting one. And rather than denying it or saying it's not happening, I think if we embrace it, we're going to be part of the success of that sector.
[00:33:55] Speaker A: That is fantastic. Veronique, thank you so much for your time today.
[00:33:59] Speaker B: My pleasure. Thank you so much, Eddie.
[00:34:01] Speaker A: And this was our conversation with Veronique Ozkaya. She's the co-CEO at Datamundi AI and this was an amazing conversation. I hope our audience has loved it as much as we have. We are going to be releasing more on this later on in the form of a snippet. So we appreciate your comments and your likes. Without any more to say, thank you so much for listening. My name is Eddie Arrieta, CEO here at Multilingual Magazine and Media. Until the next time, goodbye. Thank you, Veronique.
[00:34:33] Speaker B: Thank you, Eddie. Pleasure.
