News 

Google announces its Google Neural Machine Translation system (GNMT), which uses state-of-the-art training techniques to achieve the largest improvements to date in machine translation quality. In a new technical report, Google describes the full research result: “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation.”

 

Just a few years ago, Google started using Recurrent Neural Networks (RNNs) to directly learn the mapping between an input sequence and an output sequence, whereas Phrase-Based Machine Translation (PBMT) breaks an input sentence into words and phrases to be translated largely independently. Neural Machine Translation (NMT) considers the complete input sentence as a single unit for translation. The advantage of this approach is that it requires fewer engineering design choices than earlier phrase-based translation systems. When it was first introduced, NMT showed accuracy comparable to existing phrase-based systems on modest-sized public benchmark data sets.

 

Since then, researchers have proposed several techniques to improve NMT, including handling rare words by mimicking an external alignment model, using attention to align input words with output words, and breaking words into smaller units to cope with rare words. Despite these improvements, NMT was not fast or accurate enough to be used in a production system such as Google Translate.
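The idea of breaking words into smaller units can be illustrated with a minimal sketch: segment a word by greedily matching the longest known subword at each position, so a rare word decomposes into pieces the model has seen before. The toy vocabulary here is hypothetical; real systems learn their subword vocabulary from data.

```python
def segment(word, vocab):
    """Greedy longest-match-first subword segmentation (illustrative sketch)."""
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate piece first, shrinking until one matches.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                pieces.append(piece)
                i = j
                break
        else:
            # No known piece covers this character; emit an unknown marker.
            pieces.append("<unk>")
            i += 1
    return pieces

# Toy subword vocabulary (hypothetical, for demonstration only).
vocab = {"trans", "lat", "ion", "un", "believ", "able"}
print(segment("translation", vocab))   # ['trans', 'lat', 'ion']
print(segment("unbelievable", vocab))  # ['un', 'believ', 'able']
```

Even if "unbelievable" never appeared in training data, the model can still represent it through its familiar pieces, which is what makes this a workaround for rare words.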

 

The visualization below shows the progression of GNMT as it translates a Chinese sentence to English. First, the network encodes the Chinese words as a list of vectors, where each vector represents the meaning of all words read so far (“Encoder”). Once the entire sentence is read, the decoder begins, generating the English sentence one word at a time (“Decoder”). To generate the translated word at each step, the decoder pays attention to a weighted distribution over the encoded Chinese vectors most relevant to generating that English word (“Attention”; the transparency of the blue links represents how much the decoder attends to an encoded word).
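The attention step described above can be sketched in a few lines of plain Python: score each encoded source vector against the current decoder state, normalize the scores into weights with a softmax, and form a weighted sum (the "context") that the decoder uses to produce the next word. The vectors below are illustrative toy values, not GNMT's actual parameters.

```python
import math

def attend(decoder_state, encoder_vectors):
    # Score each encoded source vector against the decoder state (dot product).
    scores = [sum(d * e for d, e in zip(decoder_state, vec))
              for vec in encoder_vectors]
    # Softmax turns raw scores into attention weights that sum to 1.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]
    # The context vector is the weighted sum of the encoder vectors.
    dim = len(encoder_vectors[0])
    context = [sum(w * vec[k] for w, vec in zip(weights, encoder_vectors))
               for k in range(dim)]
    return weights, context

# Toy encodings standing in for the encoded source ("Chinese") words.
encoded = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
weights, context = attend([1.0, 0.0], encoded)
print(weights)  # the first source vector gets the largest weight
```

The weights correspond to the transparency of the blue links in the visualization: the higher the weight on a source word, the more the decoder relies on it for the current output word.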

 

The company states, “Using human-rated side-by-side comparison as a metric, the GNMT system produces translations that are vastly improved compared to the previous phrase-based production system. GNMT reduces translation errors by more than 55%-85% on several major language pairs measured on sampled sentences from Wikipedia and news websites with the help of bilingual human raters.”

 

Along with the release of this research paper, Google has also launched GNMT in production on a notoriously difficult language pair: Chinese to English. The Google Translate mobile and web apps now use GNMT for 100 percent of machine translations from Chinese to English – around 18 million translations every day.

 

The company concluded by saying, “Machine translation is by no means solved. GNMT can still make significant errors that a human translator would never make, like dropping words and mistranslating proper names or rare terms, and translating sentences in isolation rather than considering the context of the paragraph or page. There is still a lot of work we can do to serve our users better. However, GNMT represents a significant milestone. We would like to celebrate it with the many researchers and engineers—both within Google and the wider community—who have contributed to this direction of research in the past few years.”

By Tahseen Jamil on Sep 28, 2016

From: http://www.c-sharpcorner.com/news/google-announces-neural-machine-translation-system-for-improved-translation