Quick Thought: Universal translator and same-language translator

Quick Thoughts are random thoughts looking for comments

Let’s imagine a universal translator able to translate any language into any other. Sourcing a corpus of translation pairs is a major hurdle. However, there is an almost infinite corpus of translation pairs: a language with itself. Translating English to English is easy, even for a computer.

Let’s give the black-box universal translator three inputs: a source text, the language of the source text, and the language of the desired translation. What would be the consequences, for the learning system inside the black box, of the constraint that when the two languages are the same, the output must be identical to the input?
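A minimal sketch of how such same-language training pairs could be generated from any monolingual corpus (the function name and tuple layout are hypothetical, not part of the post):

```python
# Hypothetical sketch: turn a monolingual corpus into identity
# translation pairs (source, src_lang, tgt_lang, target), where
# the target language equals the source language and the correct
# output is the input itself.
def make_identity_pairs(sentences, lang):
    for s in sentences:
        yield (s, lang, lang, s)

english = ["The cat sat on the mat.", "It is raining."]
pairs = list(make_identity_pairs(english, "en"))
# Each pair asks the translator to map a sentence to itself.
```

Any text in any language can be fed through this, which is what makes the corpus "almost infinite."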

Obviously, the black box could quickly learn that bypassing the translation does the trick. However, that would probably require the internal circuitry to allow for a bypass, and that could be constrained out. So:

  • Could we expect any interesting results?
  • Could the input eventually be forced down to a language-independent universal representation?
  • Let’s say there is a language-independent universal representation kernel. If the input arrives without information about the output language, and the output carries no information about the input language, does that force the network to create a universal representation, or would the kernel just wither away?
  • Is it possible to invert a network? Probably not in a truly bijective way, but enough to model the fact that text representation \(\rightarrow\) universal representation is the inverse (for some definition of the word) of universal representation \(\rightarrow\) text representation in the same language?
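The last question can be illustrated with a toy linear analogy (an assumption for illustration, not the post's model): if the encoder "text \(\rightarrow\) universal representation" were a matrix, the decoder that inverts it on the encoder's image is its pseudo-inverse.

```python
import numpy as np

# Toy linear analogy: encode a 5-dim "text" vector into a 3-dim
# "universal representation", then decode with the pseudo-inverse.
rng = np.random.default_rng(0)
E = rng.normal(size=(3, 5))   # encoder: text -> universal representation
D = np.linalg.pinv(E)         # decoder: best least-squares inverse of E

x = rng.normal(size=5)        # a "text" vector
z = E @ x                     # its universal representation
x_hat = D @ z                 # decoded text

# D is only an inverse on what E can represent: encoding the decoded
# vector recovers z exactly (E @ D is the identity on the 3-dim kernel),
# but x_hat need not equal x, since information was compressed away.
```

For an actual neural network there is no closed-form inverse; the analogous move is to train the decoder with a round-trip reconstruction loss such as \(\lVert \mathrm{decode}(\mathrm{encode}(x)) - x \rVert\), which is exactly what the same-language constraint amounts to.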

Comments welcome.

Emmanuel Rialland
Consultant Finance - Machine Learning