Seventy Years of Machine Translation: The legacy of the Georgetown–IBM experiment

Episode 174 | May 02, 2024 | 00:07:42
Localization Today

Hosted By

Eddie Arrieta

Show Notes

By Rodrigo Fuentes Corradi.

This year marks 70 years since the first public demonstration of MT, which arguably sparked the language AI revolution that we see today. On January 7, 1954, a team from Georgetown University and IBM automatically translated 60 Russian sentences into English.

Episode Transcript

[00:00:06] Speaker A: This is Localization Today, a podcast from MultiLingual Media covering the most relevant daily news in the language industry.

[00:00:15] Speaker B: Seventy Years of Machine Translation: The legacy of the Georgetown–IBM experiment, by Rodrigo Fuentes Corradi. In 2024, the possibilities of artificial intelligence in the language industry seem endless, but the AI future that we see so clearly today actually began in the middle of the last century. This year marks 70 years since the first public demonstration of machine translation, which arguably sparked the language AI revolution.

[00:00:48] Speaker B: The first undertaking to solve what became the MT challenge was a response to the Cold War, led by Leon Dostert, a Georgetown University professor and pioneering linguist who developed interpreting systems at the Nuremberg trials, and Cuthbert Hurd, head of the Applied Science Department at IBM. The Georgetown–IBM experiment aimed at automatically translating about 60 Russian sentences into English. The carefully chosen sentences were derived from both scientific documents and general interest sources in order to appeal to a broad audience. On January 7, 1954, the team gathered at IBM's New York headquarters to demonstrate their progress. According to John Hutchins' article on the topic, though the experiment was small scale, with an initial vocabulary of just 250 lexical items and a set of only six rules, it was ultimately able to illustrate some grammatical and morphological problems and to give some idea of what might be feasible in the future.

[00:01:50] Speaker B: The original AI hype cycle. The event was reported on by the press both in the US and abroad, and it garnered considerable public engagement. Given both the growing interest in computers and the political backdrop, US government funding for more experimentation was soon made available, with a prediction, partly based on the excitement generated, that MT systems would be capable of translating almost everything within five years. However, the reality was that it would take a whole lot more patience than first anticipated. The subsequent years were to prove bumpy due to the complexity of the Russian language and the technological limitations.

[00:02:30] Speaker B: According to Hutchins, after eight years of work, the Georgetown University MT project tried to produce useful output in 1962, but they had to resort to post-editing. The post-edited translation took slightly longer to do and was more expensive than conventional human translation.

[00:02:48] Speaker B: Government funding came under increasing scrutiny, culminating in the creation of the Automatic Language Processing Advisory Committee (ALPAC) and its 1966 report, Languages and Machines: Computers in Translation and Linguistics. The report highlighted slow progress, lack of quality, and high costs. It noted that research funding over the previous decade amounted to $20 million, while real government translation costs stood at only $1 million per year.

[00:03:19] Speaker B: More damaging were the criticisms that the methodology of early experiments, perhaps in the enthusiasm for attention and investment, was not credible. The small-scale demonstration did not robustly test the MT system, as the selected test sentences were expected to perform well. The report stated that there was no immediate or predictable prospect of useful machine translation. The initiative, born in the Cold War, was placed on ice. Most MT proponents were understandably disappointed.

[00:03:51] Speaker B: The foundation for future solutions. The ALPAC report pulled the rug out from under MT efforts for the next three decades. However, while the report effectively paused investment, it also shone a light on a potential hybrid solution, one that reintroduced humans into the equation. The report described a system of human-aided machine translation, relying on post-editors to make up for the deficiencies of the machine output, which set the stage for MT post-editing (MTPE). Eventually, computer processing power increased, allowing a resurgence of MT innovations. MT quality started to improve as research shifted from recreating language rules to applying machine learning techniques through combinations of algorithms, data, and probability. This became known as statistical MT (SMT). In the mid-2010s, deep learning and artificial neural networks enabled neural MT (NMT), resulting in dramatically improved translation accuracy and fluency. NMT models have now facilitated the widespread use of MT in the language industry.

[00:05:02] Speaker B: The Georgetown–IBM legacy. The Georgetown–IBM experiment and the subsequent ALPAC report laid the foundation of MT technology. They also clarified the importance of human-led translation and MTPE, which has emerged as a credible response to global enterprise content challenges. Chiefly, these early experiments managed to push the theories of the early pioneers into the realm of practical and public demonstration, thus illuminating their value, if not their actual viability. Even if the topic was to remain dormant for a few decades to come, the Georgetown–IBM experiment played a key role in the development of MT as we know it today. Seventy years on from those initial attempts to solve the language challenge, the emergence of large language models (LLMs) and generative AI (gen AI) has caused another stir.

[00:05:56] Speaker B: LLMs and their future productization will now drive innovation with features such as MT quality estimators, content insight capabilities, and summarization. As we enter this new era, what exactly can we learn from the Georgetown–IBM experiment?

[00:06:12] Speaker B: Well, to kick-start the language AI story, those early pioneers needed to engage with the public imagination, draw attention to their cause, and find support and benefactors. With current public discourse around gen AI mired in uncertainty, outreach programs will likely contribute to changing attitudes and increased usage.

[00:06:32] Speaker B: Moreover, persistence and patience are indispensable. Successes in early AI language experiments proved elusive, to say the least, but look where we are now. MT optimization and gen AI advances will depend on determination and growing human expertise. The challenge is not new for translators and linguists, who seem to have technology adoption and innovation in their professional DNA. Only by being proactive in the face of AI-driven changes will we achieve the next level of progress.

[00:07:05] Speaker B: This article was written by Rodrigo Fuentes Corradi. He has worked in the language industry for the past 25 years, specializing in machine translation technology and human processes and capabilities. Originally published in MultiLingual magazine, issue 227, April 2024.

[00:07:26] Speaker A: Thank you for listening to Localization Today. To subscribe to MultiLingual magazine, go to multilingual.com/subscribe.

Other Episodes

Episode 225

November 06, 2024 00:21:02

From Innovation to Integration: The Future of memoQ with Globalese

We discuss memoQ's recent acquisition of Globalese and the technological synergies between the two companies, along with their vision for AI-powered translation solutions.

Episode 74

April 20, 2022 00:03:10

CITLoB president Sandeep Nulkar on partnership with ATC

In March, the UK-based Association of Translation Companies (ATC) and the Indian Confederation of Interpreting, Translation, and Localisation Businesses (CITLoB) announced a new partnership...

Episode 110

October 20, 2023 00:06:04

LocWorld50 - Interview with Maddie's Mundo

Maddie shares her passion for the Hispanic culture and Spanish language, reflecting on her experiences from learning Spanish from scratch. She reveals how the...
