{"id":2404,"date":"2022-11-30T15:47:57","date_gmt":"2022-11-30T15:47:57","guid":{"rendered":"https:\/\/www.codeastar.com\/?p=2404"},"modified":"2022-12-02T18:40:47","modified_gmt":"2022-12-02T18:40:47","slug":"the-lazy-and-easy-pre-trained-translator-of-the-year","status":"publish","type":"post","link":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/","title":{"rendered":"The Lazy (and Easy) Pre-Trained Translator of the Year"},"content":{"rendered":"\n<p>We made <a href=\"https:\/\/www.codeastar.com\/nmt-make-an-easy-neural-machine-translator\/\">our own Neural Machine Translator (NMT)<\/a> in 2019, which helped us translate Dutch to English. Now it is 2022, and many things have changed in the world of Data Science. The arrival of <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/google-research\/bert\" target=\"_blank\">Bidirectional Encoder Representations from Transformers (BERT)<\/a>, a pre-trained transformer model, in 2018 opened a new chapter for <a rel=\"noreferrer noopener\" href=\"https:\/\/en.wikipedia.org\/wiki\/Natural_language_processing\" target=\"_blank\">Natural Language Processing&nbsp;(NLP)<\/a>. And nowadays we have <a rel=\"noreferrer noopener\" href=\"https:\/\/openai.com\/api\/\" target=\"_blank\">Generative Pre-trained Transformer 3&nbsp;(GPT-3)<\/a>, a far larger model than BERT with 175 billion(!) parameters. As technologies evolve day after day, we should take advantage of that evolution. So our project this time is to take a shortcut and use a pre-trained model to build our translator.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Pick a Pre-Trained Model<\/h3>\n\n\n\n<p>We mentioned that BERT is the new gold standard of Data Science, so can we use it as our pre-trained translator model? Well, BERT is good, but it is not a good translation model. BERT is a pre-trained model specialized in contextual word representation. 
It can tell the difference between &#8220;mouse&#8221; as a computer peripheral and as a small mammal. But the way it is trained, using masked words, may cause problems in handling translation. There is a <a rel=\"noreferrer noopener\" href=\"https:\/\/arxiv.org\/abs\/2002.06823\" target=\"_blank\">research paper<\/a> on using BERT in neural translation. However, judging by its content and setup, it is not the &#8220;lazy and easy&#8221; thing we wanted.<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img data-attachment-id=\"2434\" data-permalink=\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/jerpy_bert\/\" data-orig-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?fit=725%2C279&amp;ssl=1\" data-orig-size=\"725,279\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Jerpy_bert\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?fit=300%2C115&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?fit=725%2C279&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" width=\"725\" height=\"279\" src=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?resize=725%2C279&#038;ssl=1\" alt=\"BERT on word representation\" class=\"wp-image-2434\" srcset=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?w=725&amp;ssl=1 725w, 
https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/Jerpy_bert.png?resize=300%2C115&amp;ssl=1 300w\" sizes=\"(max-width: 725px) 100vw, 725px\" data-recalc-dims=\"1\" \/><figcaption class=\"wp-element-caption\">BERT on word representation<\/figcaption><\/figure>\n\n\n\n<p>Then what about the almighty GPT-3? Yes, GPT-3 can do practically everything, including translation. But then the problem is on our side. The place I am currently living in, Hong Kong, is one of the regions GPT-3 does not support. There is an open source GPT-3 alternative, <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/kingoflolz\/mesh-transformer-jax\/\" target=\"_blank\">GPT-J<\/a>, which contains 6 billion parameters. This powerful open source pre-trained model does need a bit of time to set up, so it violates our &#8220;lazy and easy&#8221; principle.<\/p>\n\n\n\n<p>When one door closes, another opens. On the internet nowadays, there are always more doors than we expect. So we have <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/argosopentech\/argos-translate\" target=\"_blank\">Argos Translate<\/a>, an open source translator powered by <a rel=\"noreferrer noopener\" href=\"https:\/\/opennmt.net\/\" target=\"_blank\">OpenNMT<\/a> and <a rel=\"noreferrer noopener\" href=\"https:\/\/github.com\/google\/sentencepiece\" target=\"_blank\">SentencePiece<\/a>. And you may notice that Argos Translate uses the same tech stack we used last time for our own translator. This time, we can skip the training part and go straight to the translating part, easy peasy!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">No Training, Just Straight to Results<\/h3>\n\n\n\n<p>Although we don&#8217;t need any training to use Argos Translate, we do need to install the library :]]. 
Like what we did in the past, we use <a rel=\"noreferrer noopener\" href=\"https:\/\/pipenv.pypa.io\/\" target=\"_blank\">pipenv<\/a> to do the deployment job.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$pipenv --three\n$pipenv shell\n$pipenv install argostranslate<\/code><\/pre>\n\n\n\n<p>Okay, we are good to go. We are CodeAStar, so let&#8217;s do what we do best here: code!<\/p>\n\n\n\n<p>WAIT!<\/p>\n\n\n\n<p>To become a good developer, always remember: design first, then code. So we have the following sequence diagram:<\/p>\n\n\n\n<figure class=\"wp-block-image aligncenter size-large\"><img data-attachment-id=\"2413\" data-permalink=\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/pre-trained-translator-websequencediagrams\/\" data-orig-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?fit=748%2C916&amp;ssl=1\" data-orig-size=\"748,916\" data-comments-opened=\"1\" data-image-meta=\"{&quot;aperture&quot;:&quot;0&quot;,&quot;credit&quot;:&quot;&quot;,&quot;camera&quot;:&quot;&quot;,&quot;caption&quot;:&quot;&quot;,&quot;created_timestamp&quot;:&quot;0&quot;,&quot;copyright&quot;:&quot;&quot;,&quot;focal_length&quot;:&quot;0&quot;,&quot;iso&quot;:&quot;0&quot;,&quot;shutter_speed&quot;:&quot;0&quot;,&quot;title&quot;:&quot;&quot;,&quot;orientation&quot;:&quot;0&quot;}\" data-image-title=\"Pre-Trained-Translator-WebSequenceDiagrams\" data-image-description=\"\" data-image-caption=\"\" data-medium-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?fit=245%2C300&amp;ssl=1\" data-large-file=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?fit=748%2C916&amp;ssl=1\" decoding=\"async\" loading=\"lazy\" width=\"748\" height=\"916\" 
src=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?resize=748%2C916&#038;ssl=1\" alt=\"Argo Translator Sequence Diagram \" class=\"wp-image-2413\" srcset=\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?w=748&amp;ssl=1 748w, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/08\/Pre-Trained-Translator-WebSequenceDiagrams.png?resize=245%2C300&amp;ssl=1 245w\" sizes=\"(max-width: 748px) 100vw, 748px\" data-recalc-dims=\"1\" \/><\/figure>\n\n\n\n<p>The translation flow is straightforward. But there are some things we can learn from the diagram:<\/p>\n\n\n\n<ol>\n<li>we need to provide a way for a user to enter the translation language pair, e.g. from a GUI (Graphical User Interface) or command line arguments<\/li>\n\n\n\n<li>the application should allow a user to submit input through a GUI or a command line interface<\/li>\n\n\n\n<li>an internet connection is required; otherwise we have to download the language packages to a local drive first<\/li>\n\n\n\n<li>like the input, the application should present the translated outcome in different ways, such as a GUI, file output or console output.<\/li>\n<\/ol>\n\n\n\n<p>With those design considerations sorted out, we can finally start working.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">EZ coding for the pre-trained translator<\/h3>\n\n\n\n<p>When we say &#8220;EZ coding&#8221;, it should come with an easy interface. So we are building the pre-trained translator with a command line interface, i.e. 
<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$python ez_trans.py &lt;FROM LANG&gt; &lt;TO LANG&gt; &lt;INPUT FILE&gt;<\/code><\/pre>\n\n\n\n<p>We need some code to handle the command line arguments, and for that we have <a rel=\"noreferrer noopener\" href=\"https:\/\/docs.python.org\/3\/library\/argparse.html\" target=\"_blank\">argparse<\/a>, the argument parser bundled with Python.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nimport argparse\nfrom argostranslate import package, translate\n\nparser = argparse.ArgumentParser()\nparser.add_argument(&quot;from_lang&quot;, help = &quot;From Language, e.g. en&quot;)\nparser.add_argument(&quot;to_lang&quot;, help = &quot;To Language, e.g. es&quot;)\nparser.add_argument(&quot;input_file&quot;, help = &quot;Input Text File, e.g. abc.txt&quot;)\nargs = parser.parse_args()\n<\/pre><\/div>\n\n\n<p>After getting the user inputs, it is time for us to get the pre-trained language models from Argos Translate.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\ntry:\n    print(&quot;Getting the ArgosTranslate package index...&quot;)\n    available_packages = package.get_available_packages()\nexcept Exception:\n    # refresh the package index, then retry\n    package.update_package_index()\n    available_packages = package.get_available_packages()\n\ntry: \n    selected_package = list(\n    filter(\n         lambda x: x.from_code == args.from_lang and x.to_code == args.to_lang, available_packages\n    ))&#x5B;0]\nexcept IndexError: \n    print(f&quot;Error for finding language pair for &#x5B;{args.from_lang}] to &#x5B;{args.to_lang}]&quot;)\n    exit()\n\nprint(f&quot;Download '{selected_package}' model from the ArgosTrans if no model is found in the current system...&quot;)\ndownload_path = selected_package.download()\npackage.install_from_path(download_path)\n\ninstalled_languages = translate.get_installed_languages()\nargo_from_lang = 
list(filter(lambda x: x.code == args.from_lang,installed_languages))&#x5B;0]\nargo_to_lang = list(filter(lambda x: x.code == args.to_lang,installed_languages))&#x5B;0]\ntranslation = argo_from_lang.get_translation(argo_to_lang)\ntranslated_lines = &#x5B;]\n<\/pre><\/div>\n\n\n<p>Then we have only a few things left: open the input file, translate it line by line with Argos Translate and save the result to the output file. Since we are working with foreign languages, remember to pass &#8220;<em>encoding=&#039;utf8&#039;<\/em>&#8221; to the &#8220;<em>open(&#8230;)<\/em>&#8221; call.<\/p>\n\n\n<div class=\"wp-block-syntaxhighlighter-code \"><pre class=\"brush: python; title: ; notranslate\" title=\"\">\nprint(f&quot;Reading '{args.input_file}' and starting to translate...&quot;)\n#load text from file\nwith open(args.input_file, encoding='utf8') as f:\n    lines = f.read().splitlines()\n    for line in lines:\n        translated_line = translation.translate(line)\n        translated_lines.append(translated_line)\ntranslated_output = '\\n'.join(translated_lines)\n\n#save our translated result into a file\nwith open(&quot;output_&quot;+args.input_file,'w',encoding='utf8') as o:\n    o.write(translated_output)\nprint(f&quot;Translated output is saved as 'output_{args.input_file}', enjoy!&quot;)\n<\/pre><\/div>\n\n\n<p>Our EZ pre-trained translator is done! See? The whole program, including comments, is just under 50 lines!<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">Test Drive on the Pre-Trained Translator<\/h3>\n\n\n\n<p>It is a tiny translator, but does size matter? Let&#8217;s find out. Since I am studying Portuguese (or Portugu\u00eas), we will take a Portuguese passage from my textbook and use our EZ translator to translate it into English. Then we can see how well it matches the actual translation.<\/p>\n\n\n\n<p>Here is our input:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\">\n<p>Afonso: Ol\u00e1! Eu chamo-me Afonso. E voc\u00ea, como \u00e9 que se chama? <br>Russell: Ol\u00e1! 
Eu sou o Russell. <br>Afonso: De onde \u00e9? <br>Russell: Sou da Austr\u00e1lia, de Darwin. E voc\u00e9? <br>Afonso: Eu sou Portugal, de \u00c9vora. Que l\u00ednguas fala? <br>Russell: Falo ingl\u00eas, alem\u00e3o e um pouco de portugu\u00eas. <br>Afonso: Eu falo portugu\u00eas, canton\u00eas, ingl\u00eas e um pouco de mandarim.<\/p>\n<\/blockquote>\n\n\n\n<p>This is a conversation between two men, Afonso and Russell, introducing themselves and asking each other what languages they speak.<\/p>\n\n\n\n<p>What we do next is save the conversation into a file, then run our EZ translator for Portuguese (pt) to English (en) translation.<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>$python ez_trans.py pt en pt_test.txt<\/code><\/pre>\n\n\n\n<p>And we get the output file as:<\/p>\n\n\n\n<blockquote class=\"wp-block-quote\">\n<p>Afonso: Hello! My name is Afonso. What&#8217;s your name?<br>Russell: Hello! I&#8217;m Russell.<br>Afonso: Where are you from?<br>Russell: I&#8217;m from Australia, from Darwin. And you?<br>Afonso: I&#8217;m Portual from \u00c9vora. What languages do you speak?<br>Russell: I speak English, German and a little Portuguese.<br>Afonso: I speak Portuguese, Cantonese, English and a little Mandarin.<\/p>\n<\/blockquote>\n\n\n\n<p>This is exactly what they are talking about. So our tiny EZ translator does score big in the translation task. Admittedly, I am taking an elementary course, so the conversation is rather simple and straightforward for our EZ translator. 
Overall, it is a tiny, fast and accurate machine translator.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\">What have we learned in this post?<\/h3>\n\n\n\n<ol>\n<li>Life is short: if there is something we can use, just use it and don&#8217;t reinvent the wheel<\/li>\n\n\n\n<li>Always design first, code later<\/li>\n\n\n\n<li>The use of Argos Translate<\/li>\n\n\n\n<li><span style=\"font-size: 1rem; font-weight: inherit;\">Size doesn&#8217;t matter in the world of coding<\/span><\/li>\n<\/ol>\n\n\n\n<p>(the complete source package can be found at\u00a0<strong>GitHub<\/strong>:\u00a0<a href=\"https:\/\/github.com\/codeastar\/lazy_translator\" target=\"_blank\" rel=\"noreferrer noopener\">https:\/\/github.com\/codeastar\/lazy_translator<\/a>)<\/p>\n\n\n\n<p> <\/p>\n","protected":false},"excerpt":{"rendered":"<p>We made our own Neural Machine Translator (NMT) in 2019, which helped us translate Dutch to English. Now it is 2022, and many things have changed in the world of Data Science. The arrival of Bidirectional Encoder Representations from Transformers (BERT), a pre-trained transformer model, in 2018 opened a new chapter for Natural Language 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":2436,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"om_disable_all_campaigns":false,"_monsterinsights_skip_tracking":false,"_monsterinsights_sitenote_active":false,"_monsterinsights_sitenote_note":"","_monsterinsights_sitenote_category":0,"site-sidebar-layout":"default","site-content-layout":"default","ast-site-content-layout":"","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","theme-transparent-header-meta":"default","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-gradient":""}},"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_newsletter_tier_id":0,"jetpack_publicize_message":"","jetpack_is_tweetstorm":false,"jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","enabled":false}}},"categories":[18],"tags":[119,160,140,87,173,174],"jetpack_publicize_connections":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v21.8.1 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Pre-Trained Translator &#8902; Code A Star<\/title>\n<meta name=\"description\" content=\"Let&#039;s make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of code!\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, 
max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Pre-Trained Translator &#8902; Code A Star\" \/>\n<meta property=\"og:description\" content=\"Let&#039;s make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of code!\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\" \/>\n<meta property=\"og:site_name\" content=\"Code A Star\" \/>\n<meta property=\"article:publisher\" content=\"codeastar\" \/>\n<meta property=\"article:author\" content=\"codeastar\" \/>\n<meta property=\"article:published_time\" content=\"2022-11-30T15:47:57+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-12-02T18:40:47+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/lazy_translator.png\" \/>\n\t<meta property=\"og:image:width\" content=\"2000\" \/>\n\t<meta property=\"og:image:height\" content=\"782\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/png\" \/>\n<meta name=\"author\" content=\"Raven Hon\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@codeastar\" \/>\n<meta name=\"twitter:site\" content=\"@codeastar\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Raven Hon\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"6 minutes\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\"},\"author\":{\"name\":\"Raven Hon\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"headline\":\"The Lazy (and Easy) Pre-Trained Translator of the Year\",\"datePublished\":\"2022-11-30T15:47:57+00:00\",\"dateModified\":\"2022-12-02T18:40:47+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\"},\"wordCount\":1038,\"commentCount\":0,\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"keywords\":[\"easy\",\"Machine Translation\",\"NLP\",\"Open Source\",\"OpenNMT\",\"Pre-Trained\"],\"articleSection\":[\"Learn Machine Learning\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\",\"url\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\",\"name\":\"Pre-Trained Translator &#8902; Code A Star\",\"isPartOf\":{\"@id\":\"https:\/\/www.codeastar.com\/#website\"},\"datePublished\":\"2022-11-30T15:47:57+00:00\",\"dateModified\":\"2022-12-02T18:40:47+00:00\",\"description\":\"Let's make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of 
code!\",\"breadcrumb\":{\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/\"]}]},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"https:\/\/www.codeastar.com\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"The Lazy (and Easy) Pre-Trained Translator of the Year\"}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/www.codeastar.com\/#website\",\"url\":\"https:\/\/www.codeastar.com\/\",\"name\":\"Code A Star\",\"description\":\"We don&#039;t wish upon a star, we code a star\",\"publisher\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/www.codeastar.com\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd\",\"name\":\"Raven Hon\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"contentUrl\":\"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1\",\"width\":70,\"height\":70,\"caption\":\"Raven Hon\"},\"logo\":{\"@id\":\"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/\"},\"description\":\"Raven Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on 
various projects from console, web, game, banking and mobile applications in different sized companies.\",\"sameAs\":[\"https:\/\/www.codeastar.com\",\"codeastar\",\"https:\/\/twitter.com\/codeastar\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Pre-Trained Translator &#8902; Code A Star","description":"Let's make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of code!","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/","og_locale":"en_US","og_type":"article","og_title":"Pre-Trained Translator &#8902; Code A Star","og_description":"Let's make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of code!","og_url":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/","og_site_name":"Code A Star","article_publisher":"codeastar","article_author":"codeastar","article_published_time":"2022-11-30T15:47:57+00:00","article_modified_time":"2022-12-02T18:40:47+00:00","og_image":[{"width":2000,"height":782,"url":"https:\/\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/lazy_translator.png","type":"image\/png"}],"author":"Raven Hon","twitter_card":"summary_large_image","twitter_creator":"@codeastar","twitter_site":"@codeastar","twitter_misc":{"Written by":"Raven Hon","Est. 
reading time":"6 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#article","isPartOf":{"@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/"},"author":{"name":"Raven Hon","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"headline":"The Lazy (and Easy) Pre-Trained Translator of the Year","datePublished":"2022-11-30T15:47:57+00:00","dateModified":"2022-12-02T18:40:47+00:00","mainEntityOfPage":{"@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/"},"wordCount":1038,"commentCount":0,"publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"keywords":["easy","Machine Translation","NLP","Open Source","OpenNMT","Pre-Trained"],"articleSection":["Learn Machine Learning"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/","url":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/","name":"Pre-Trained Translator &#8902; Code A Star","isPartOf":{"@id":"https:\/\/www.codeastar.com\/#website"},"datePublished":"2022-11-30T15:47:57+00:00","dateModified":"2022-12-02T18:40:47+00:00","description":"Let's make an easy Pre-Trained Translator using Neural Machine Translation (NMT) with less than 50 lines of 
code!","breadcrumb":{"@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/"]}]},{"@type":"BreadcrumbList","@id":"https:\/\/www.codeastar.com\/the-lazy-and-easy-pre-trained-translator-of-the-year\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.codeastar.com\/"},{"@type":"ListItem","position":2,"name":"The Lazy (and Easy) Pre-Trained Translator of the Year"}]},{"@type":"WebSite","@id":"https:\/\/www.codeastar.com\/#website","url":"https:\/\/www.codeastar.com\/","name":"Code A Star","description":"We don&#039;t wish upon a star, we code a star","publisher":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.codeastar.com\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/832d202eb92a3d430097e88c6d0550bd","name":"Raven Hon","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/","url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","contentUrl":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/08\/logo70.png?fit=70%2C70&ssl=1","width":70,"height":70,"caption":"Raven Hon"},"logo":{"@id":"https:\/\/www.codeastar.com\/#\/schema\/person\/image\/"},"description":"Raven Hon is\u00a0a 20 years+ veteran in information technology industry who has worked on various projects from console, web, game, banking and mobile applications in different sized 
companies.","sameAs":["https:\/\/www.codeastar.com","codeastar","https:\/\/twitter.com\/codeastar"]}]}},"jetpack_featured_media_url":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2022\/11\/lazy_translator.png?fit=2000%2C782&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p8PcRO-CM","jetpack-related-posts":[{"id":1895,"url":"https:\/\/www.codeastar.com\/word-embedding-in-nlp-and-python-part-1\/","url_meta":{"origin":2404,"position":0},"title":"Word Embedding in NLP and Python &#8211; Part 1","author":"Raven Hon","date":"April 30, 2019","format":false,"excerpt":"We have handled text in machine learning using TFIDF. And we can use it to build word cloud for analytic purpose. But is it the capability of a machine can do on text? Definitely not, as we just haven't let machine to \"learn\" about text yet. TFIDF is a statistics\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Happy word embedding","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/04\/happy_embedding.png?fit=800%2C779&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/04\/happy_embedding.png?fit=800%2C779&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/04\/happy_embedding.png?fit=800%2C779&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/04\/happy_embedding.png?fit=800%2C779&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":1941,"url":"https:\/\/www.codeastar.com\/recurrent-neural-network-rnn-in-nlp-and-python-part-2\/","url_meta":{"origin":2404,"position":1},"title":"RNN (Recurrent Neural Network) in NLP and Python &#8211; Part 2","author":"Raven Hon","date":"May 15, 2019","format":false,"excerpt":"From our Part 1 of NLP and 
Python topic, we talked about word pre-processing for a machine to handle words. This time, we are going to talk about building a model for a machine to classify words. We learned to use CNN to classify images in past. Then we use\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Recurrent Neural Network","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/05\/rnn.png?fit=1000%2C723&ssl=1&resize=700%2C400 2x"},"classes":[]},{"id":2134,"url":"https:\/\/www.codeastar.com\/nmt-make-an-easy-neural-machine-translator\/","url_meta":{"origin":2404,"position":2},"title":"NMT &#8211; make an easy Neural Machine Translator","author":"Raven Hon","date":"November 10, 2019","format":false,"excerpt":"I haven't updated this blog for a few months. As there is something big happened in my hometown. But life must go on, so we come back here and learn a new topic --- NMT (Neural Machine Translation). 
You may try the translate service before (if not, let's Google Translate),\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"NMT","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2020\/01\/nmt_4.png?fit=1052%2C551&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2020\/01\/nmt_4.png?fit=1052%2C551&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2020\/01\/nmt_4.png?fit=1052%2C551&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2020\/01\/nmt_4.png?fit=1052%2C551&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2020\/01\/nmt_4.png?fit=1052%2C551&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":2027,"url":"https:\/\/www.codeastar.com\/save-and-load-your-rnn-model\/","url_meta":{"origin":2404,"position":3},"title":"Save and Load your RNN model","author":"Raven Hon","date":"June 25, 2019","format":false,"excerpt":"In this blog, we tasted different kinds of machine learning projects so far. Our projects included prediction on stock price, image recognizer on hand writing, NLP on comment classification and others. There was one thing in common --- we used long time to train a model. 
It is okay to\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Save and Load Model","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/06\/save_n_load.png?fit=1100%2C400&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/06\/save_n_load.png?fit=1100%2C400&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/06\/save_n_load.png?fit=1100%2C400&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/06\/save_n_load.png?fit=1100%2C400&ssl=1&resize=700%2C400 2x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2019\/06\/save_n_load.png?fit=1100%2C400&ssl=1&resize=1050%2C600 3x"},"classes":[]},{"id":1066,"url":"https:\/\/www.codeastar.com\/bartener-machine-learning\/","url_meta":{"origin":2404,"position":4},"title":"&#8220;Do you have a dog?&#8221; explained in Machine Learning","author":"Raven Hon","date":"May 19, 2018","format":false,"excerpt":"You have probably read the above comic in 9gag or imgur before. It is a funny joke, but on the other hand, it is also a material for our Machine Learning topic. It sounds weird? Oh yeah, sometimes knowledge comes from strange ideas. 
The Comic Here is the comic, for\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"\"Do you have a dog?\" in Machine Learning","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2018\/05\/dyhad.png?fit=377%2C221&ssl=1&resize=350%2C200","width":350,"height":200},"classes":[]},{"id":2529,"url":"https:\/\/www.codeastar.com\/stable-diffusion-quick-and-easy-guide-for-everyone-part-1\/","url_meta":{"origin":2404,"position":5},"title":"Easy Guide for Beginner: Learn how to use Stable Diffusion WebUI &#8211; Part 1","author":"Raven Hon","date":"May 29, 2023","format":false,"excerpt":"We have been learning programming and AI since the beginning of this website. Now, we are excited to explore the world of generative AI art using the Stable Diffusion WebUI. This innovative web interface provides us with a simple and efficient way to generate AI art. 
In one of our\u2026","rel":"","context":"In &quot;Learn Machine Learning&quot;","block_context":{"text":"Learn Machine Learning","link":"https:\/\/www.codeastar.com\/category\/machine-learning\/"},"img":{"alt_text":"Stable Diffusion","src":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2023\/05\/Stable_Diffusion.png?fit=1024%2C512&ssl=1&resize=350%2C200","width":350,"height":200,"srcset":"https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2023\/05\/Stable_Diffusion.png?fit=1024%2C512&ssl=1&resize=350%2C200 1x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2023\/05\/Stable_Diffusion.png?fit=1024%2C512&ssl=1&resize=525%2C300 1.5x, https:\/\/i0.wp.com\/www.codeastar.com\/wp-content\/uploads\/2023\/05\/Stable_Diffusion.png?fit=1024%2C512&ssl=1&resize=700%2C400 2x"},"classes":[]}],"_links":{"self":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/2404"}],"collection":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/comments?post=2404"}],"version-history":[{"count":19,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/2404\/revisions"}],"predecessor-version":[{"id":2475,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/posts\/2404\/revisions\/2475"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/media\/2436"}],"wp:attachment":[{"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/media?parent=2404"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/categories?post=2404"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.codeastar.com\/wp-json\/wp\/v2\/tags?post=2404"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{
rel}","templated":true}]}}