LocWorld51: Amy Grace O'Brien, Adobe and Roberto Silva, Amazon Web Services.

Episode 190 June 12, 2024 00:10:34
Localization Today

Hosted By

Eddie Arrieta

Show Notes

Amy Grace O'Brien, Sr. Linguistic Quality Manager at Adobe, and Roberto Silva, Terminology Program Manager at Amazon Web Services, talk about the crucial role of terminology in the language industry.

Episode Transcript

[00:00:00] Speaker A: The following is our conversation with Amy O'Brien from Adobe and Roberto Silva from Amazon Web Services. We talk about the role of terminology in the language industry. This is a very exciting conversation. Enjoy.

[00:00:16] Speaker B: I'm Amy Grace O'Brien. I'm a linguistic quality manager at Adobe. What that mainly involves is me being responsible for all things quality: working on the localization resources that we own, such as style guides, and mainly terminology management, which is a big passion of mine. If anybody wants to talk to me about it, that's what I do and that's what I'm known for, I guess, in the company. And LQAs and things like that. That's my role.

[00:00:48] Speaker C: I am Roberto Silva. I am the terminology program manager for the Amazon Web Services localization team. Like Amy, I focus on the management of all the linguistic assets, including the term bases or glossaries, the style guides, and everything related to the quality of those assets, so also LQA, classification, taxonomy, and ontology. Basically, ensuring that those assets are in compliance with all the internal branding, editorial, and trademark requirements.

[00:01:21] Speaker B: I think terminologists are not really known in the industry. Certainly when I came to this conference this week, I was desperate to finally meet Roberto because, you know, when you're looking for people who share the same pain points as you in the industry, you want somebody to rant with, right? And there's also the positive aspect: sharing best practices and discussing up-and-coming initiatives with each other.
And anyway, what I was explaining is that for the longest time, terminologists have sort of been an afterthought, you know, like, "Oh yeah, I suppose we could document terminology. I suppose that's important." People have really been relying on the human linguists to actually propose terms, suggest terminology, and document it correctly. But now we're moving towards AI first, so MT localization and building LLMs. I don't know about Amazon, but certainly at Adobe that's something we're considering: building our own LLMs to make sure the quality is even better. And now, all of a sudden, hey, what do we need to train that with? We're going to need decent terminology. Garbage in, garbage out, right? So if your terminology is not up there, you're going to be lost. You're going to be stuck in the ice age.

[00:02:43] Speaker C: Yeah. I believe Amy already summarized everything correctly. We were a niche, and we are still a niche. We are literally always undercover. Nobody, even within the localization sphere, has ever properly considered the importance of a terminologist. The term base and the glossary are the core asset. Without them, nothing can be done in localization for quality purposes. It's where you have your language governed. The term entries are so crucial. And specifically moving forward, it will be even more crucial than ever before. Here at the conference yesterday, we had a moment describing the shift: from being a moon around a planet, terminology is going to be the sun of the solar system, because indeed we will need to have the tasks and govern.
So it's even more responsibility, because otherwise all the ethics problems that we can face, and the potential legal liabilities, will rise even more than what we have already seen in a more traditional environment, if I can say so. We have cases, for example: for a lot of companies, as much as this is the expertise of, I believe, the localization teams and the internationalization teams, there are legal requirements in some countries. The most immediate I can think of is maybe Canadian French, where everything must be in Canadian French. And if that compliance isn't guaranteed and ensured, the Office québécois de la langue française, sorry, the office for the French language there in Canada, will immediately fine the company that is not complying with the rule. And that's where terminology will be crucial, because it's also, as I said, of crucial importance for legal and trademark matters, but especially for trademark violations and breaches. Somebody's going to translate, I don't know, a trademark while you have it registered, and you're basically going to be losing the trademark and the copyright rights in a given country, because perhaps there wasn't enough terminology management, enough maintenance of the term base itself. Or I can think of biases in the AI, in terms of gender, of race, a lot of stuff, especially from the more creative side. If those items are not properly managed within a repository, as the term base or the style guide or anything similar can be, you're risking that, based on the data you have available, the trained model will give you results that are absolutely risky and potentially dangerous for your own branding, if that makes sense.

[00:05:11] Speaker B: I think Roberto's example is absolutely relevant, but I think it's the most extreme and the highest risk for companies.
But not only that. I also think the messier risks are more low level but super annoying, and probably the most frequent issue they're going to run into is inconsistencies, right? That's our bread and butter. That's what we have to deal with day in, day out. But I think if companies don't put in the work right now, before they move to an LLM, to clean up their data, make sure it's consistent, and document it, they're going to have pain. They're going to be stuck in refinement of the LLMs for years.

[00:05:47] Speaker C: I think this is the case I've been having in all the companies I've worked for in the past. I've always worked as a terminologist, also by education. I mean, I studied all of this in the past. The problem is that indeed, as you said, there is no framework, there is no governance, even within localization teams, mature localization teams. And I found it astonishing that nobody, as Amy said at the beginning, implemented it. You need to have a clear responsibility of who's doing what. You need to implement a so-called RACI, and you need to define, first of all, what the insertion criteria for your glossary are, because you cannot put everything in, as I've seen several times. You need to understand which parts you can translate and which you cannot, the so-called DNT, do not translate. So you need to have a daily interaction with the legal, trademark, or branding department. You need to have a proper classification of the pieces of information that you're going to provide for each concept entry, which are called the metadata in our technical jargon. So there is a need for the given PM, potentially a terminologist specifically, to implement a set of rules that clearly define and identify what your content is.
Define the ontology, for example, the series to map your content, in order to have the map of domains, right, classify it properly, and from there identify with the several stakeholders how to properly detect it and, as I said, map it for the term base itself. Because otherwise, without anything, you risk ending up with a sort of dictionary, which in terminology science is absolutely the contrary.

[00:07:26] Speaker B: I have three. When you talked about that, I thought about three different things. The first thing I would say is, companies need to educate themselves. The stakeholders, the people who are making those decisions, need to educate themselves. I can't tell you how many times, and you must also run into this, people have misunderstood the difference between a TM and a TB. We run into that all the time and, you know, I can understand, but they need to educate themselves, or somebody needs to educate them. Second is exactly what Roberto was saying: put in a RACI. Someone needs to decide. And especially if they're a young company, the earlier they put that in, the better. Somebody in branding or legal needs to take a decision and finalize that decision. Do a survey with your linguists, figure out what metadata is most important, and do not accept anything less than that. And be prepared to have pushback, people saying, "Oh, do I really have to provide a definition?" Yes, you do. You have to be prepared to be, you know, I'm a little bit the bitch at Adobe, right? I'm going to have to put it out there. That's what I have to do. Otherwise it would be the Wild West. And no, I don't want that.

[00:09:12] Speaker C: And if I can add to that: indeed you also need to have the final say. I know I sound a bit authoritative saying that, but I see a lot of term edits because person XYZ wakes up in the morning and says, "I want to have this called A, B, C."
Preferential changes for terminology are completely out of scope. I mean, I don't want to say that everything is set in stone; it can be changed over time. But every term must be on the, I want to say on the linguist, but you always need to avoid a lot of, you know, arbitrations. I had cases, not in the company I'm working for, but in the past, in my past times as a freelancer as well, where an arbitration of a term took us around nine months because everybody was bringing their own valid opinion. In that case, it's a never-ending story. So when you build up your framework, as Amy said, it's a Wild West if you don't make it clear that, after all, if there is total discrepancy, it is the terminologist, having the expertise, who has the final say.

[00:10:15] Speaker A: This was our conversation with Amy O'Brien from Adobe and Roberto Silva from Amazon Web Services. We talked about the role of terminology in the language industry. My name is Eddie Arrieta, CEO of MultiLingual magazine, and this is Localization Today. Thanks for listening.
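
Illustration (not part of the conversation): the governance rules the speakers describe, mandatory metadata for every concept entry and a do-not-translate (DNT) flag, could be enforced with a small check like the sketch below. Everything in it is an assumption made for illustration: the `ConceptEntry` class, the `validate_entry` function, and the list of required fields are hypothetical. The speakers name only "definition" as a required piece of metadata, and neither company's actual tooling is described in the episode.

```python
from dataclasses import dataclass, field

# Hypothetical list of mandatory metadata. Only "definition" is named in the
# conversation; "domain" and "owner" are assumptions for this sketch.
REQUIRED_METADATA = ("definition", "domain", "owner")


@dataclass
class ConceptEntry:
    """One concept in a term base (hypothetical, simplified model)."""
    source_term: str
    do_not_translate: bool = False  # the DNT flag
    metadata: dict[str, str] = field(default_factory=dict)
    translations: dict[str, str] = field(default_factory=dict)  # locale -> term


def validate_entry(entry: ConceptEntry) -> list[str]:
    """Return a list of problems; an empty list means the entry is accepted."""
    problems = []
    # Reject entries that lack any mandatory metadata field.
    for key in REQUIRED_METADATA:
        if not entry.metadata.get(key, "").strip():
            problems.append(f"missing metadata: {key}")
    # A DNT term must appear unchanged in every locale.
    if entry.do_not_translate:
        for locale, target in entry.translations.items():
            if target != entry.source_term:
                problems.append(f"DNT term translated for {locale}: {target}")
    return problems


# Usage: a made-up DNT product name with no metadata and one bad translation.
entry = ConceptEntry(
    source_term="ExampleCloud",
    do_not_translate=True,
    translations={"fr-CA": "ExampleCloud", "de-DE": "BeispielWolke"},
)
for problem in validate_entry(entry):
    print(problem)
```

A check like this only enforces the rules once they exist; deciding which fields are mandatory and who owns each decision is the RACI work the speakers describe.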
