Feature: Machine translation for user-generated content #6127

Merged
tramuntanal merged 19 commits into develop from feat/machine-translation on Aug 17, 2020

@mrcasals
Contributor

@mrcasals mrcasals commented May 26, 2020

🎩 What? Why?

We're working on adding a machine-translation system for Decidim. This comes from:

https://meta.decidim.org/processes/roadmap/f/122/proposals/15238

This PR will serve as a base branch for all related changes, so that we don't accidentally release the feature halfway done.

Implementation

The whole system will sit behind two config options:

  1. Installation-wide (Decidim.config.enable_machine_translations), disabled by default
  2. Organization setting, disabled by default, only visible if the installation one is enabled.
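
The two-level gate described above can be sketched in plain Ruby. This is only an illustration under assumptions, not Decidim's code: `Installation`, `Organization`, `machine_translation_active?` and `organization_setting_visible?` are hypothetical names, and only the `enable_machine_translations` option name comes from the description.

```ruby
# Hypothetical stand-ins for the installation config and an organization record.
Installation = Struct.new(:enable_machine_translations, keyword_init: true)
Organization = Struct.new(:enable_machine_translations, keyword_init: true)

# Machine translation runs only when both switches are on.
def machine_translation_active?(installation, organization)
  installation.enable_machine_translations && organization.enable_machine_translations
end

# The organization setting is only shown when the installation allows it.
def organization_setting_visible?(installation)
  installation.enable_machine_translations
end

installation = Installation.new(enable_machine_translations: false) # disabled by default
organization = Organization.new(enable_machine_translations: true)

puts machine_translation_active?(installation, organization)   # prints "false"
puts organization_setting_visible?(installation)               # prints "false"
```

With the installation switch off, the organization value is irrelevant, which matches the "only visible if the installation one is enabled" rule.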

We'll create Decidim::TranslatedField, a new model that will hold all machine translations. This is the same approach the globalize gem takes. Keeping these translations in a separate model lets us tell machine translations apart from human translations.

This is the process:

  1. Someone creates or updates a translatable resource.
  2. We gather this resource, check whether its fields need to be translated, and for each one that does we create a new Decidim::TranslatedField in a "pending" state. This means we've scheduled a translation request but it hasn't finished yet.
  3. Using ActiveRecord hooks, we gather the translatable resource and send it to a job if machine translation is enabled. This job will send the translation request for that field and expect nothing in return. Translation requests usually happen asynchronously, from what we've gathered. We'll send one job per language that needs translation.
  4. We'll build a dummy translator that will serve as an example for other implementers. This dummy translator will only prepend the locale name to the field value, and update the Decidim::TranslatedField with it.
  5. We'll replace all translated_attribute(model.field) calls: the current method only receives the field value, so it has no way to reach the DB. We'll create a new translated(model, :field) that will behave similarly to the current one, but if there's no human translation available it will delegate the search to the Decidim::TranslatedField table.
  6. We'll add a custom RuboCop rule, run as part of the linter CI step, that will ensure we don't use the old translated_attribute(model.field) method and help us autofix this change. We'll also deprecate the old method, just in case.
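
The flow in steps 2, 4 and 5 can be sketched with an in-memory store. This is a simplified illustration, not the PR's implementation: `TranslationStore`, `DummyTranslator`, `Proposal`, the `:pending`/`:done` state names, the extra `store` and `locale` arguments to `translated`, and the `"ca - "` prefix format are all assumptions made for the example. The real design uses an ActiveRecord model and background jobs.

```ruby
# In-memory stand-in for the proposed Decidim::TranslatedField table (hypothetical shape).
TranslatedField = Struct.new(:resource, :field, :locale, :value, :state, keyword_init: true)

class TranslationStore
  def initialize
    @rows = []
  end

  # Step 2: record that a translation was requested but is not finished yet.
  def schedule(resource, field, locale)
    row = TranslatedField.new(resource: resource, field: field, locale: locale,
                              value: nil, state: :pending)
    @rows << row
    row
  end

  # Look up a finished machine translation for this exact resource object.
  def find_done(resource, field, locale)
    @rows.find do |r|
      r.resource.equal?(resource) && r.field == field && r.locale == locale && r.state == :done
    end
  end
end

# Step 4: a dummy translator that only prepends the locale name.
class DummyTranslator
  def translate(row, source_text)
    row.value = "#{row.locale} - #{source_text}"
    row.state = :done
  end
end

Proposal = Struct.new(:title, keyword_init: true)

# Step 5: prefer the human translation, then fall back to the machine one.
def translated(store, model, field, locale)
  human = model[field][locale]
  return human unless human.nil? || human.empty?

  store.find_done(model, field, locale)&.value
end

store = TranslationStore.new
proposal = Proposal.new(title: { "en" => "New bike lanes" })

# One pending row per missing locale; here the "job" runs inline.
%w[ca es].each do |locale|
  row = store.schedule(proposal, :title, locale)
  DummyTranslator.new.translate(row, proposal.title["en"])
end

puts translated(store, proposal, :title, "en") # prints "New bike lanes"
puts translated(store, proposal, :title, "ca") # prints "ca - New bike lanes"
```

Passing the model and the field name, rather than the field value, is what lets the lookup reach the machine-translation store when no human translation exists.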

With this system, we expect developers to be able to implement bridges to Google Translate, DeepL, Bing or similar services.

📌 Related Issues

📋 Subtasks

See related PRs.

📷 Screenshots (optional)

Description

@mrcasals
Contributor Author

I've updated the PR description, in case anyone's curious!

@anaghavl anaghavl force-pushed the feat/machine-translation branch from c8fe241 to 513b8f2 Compare June 3, 2020 08:51
@anaghavl anaghavl mentioned this pull request Jun 8, 2020
6 tasks
@mrcasals mrcasals mentioned this pull request Jul 14, 2020
6 tasks
@anaghavl anaghavl force-pushed the feat/machine-translation branch from 681aaa3 to 4ae9f10 Compare July 27, 2020 09:21
@anaghavl anaghavl changed the base branch from develop to release/0.11-stable July 27, 2020 11:53
@anaghavl anaghavl changed the base branch from release/0.11-stable to develop July 27, 2020 11:54
@anaghavl anaghavl force-pushed the feat/machine-translation branch 2 times, most recently from 4ae9f10 to 5cbf1c3 Compare July 27, 2020 12:09
@anaghavl anaghavl force-pushed the feat/machine-translation branch from 5cbf1c3 to 6cd43aa Compare July 28, 2020 11:36
@mrcasals
Contributor Author

mrcasals commented Aug 4, 2020

@decidim/core Hi! This is ready for reviews, can you check it please? 😄

@mrcasals mrcasals changed the title [WIP] Feature: Machine translation for user-generated content Feature: Machine translation for user-generated content Aug 4, 2020
@mrcasals mrcasals marked this pull request as ready for review August 4, 2020 13:16
@tramuntanal tramuntanal self-assigned this Aug 13, 2020
@tramuntanal tramuntanal self-requested a review August 13, 2020 09:33
@tramuntanal
Contributor

tramuntanal commented Aug 13, 2020

@mrcasals As a Decidim implementor I would like to know about this feature, but right now it is hidden in the code. Can you add some docs in docs/customization, please?

@mrcasals
Contributor Author

@tramuntanal OK, will do! This PR is also waiting for #6385; it would be awesome if you could review that PR too 🙏

Contributor

@tramuntanal tramuntanal left a comment

Good job!

My main concern is the lack of documentation:

  • documentation for implementors on how to configure a machine translation service, with links to real implementations of services (if any), in docs/customize
  • documentation for developers about the API that a compatible machine translation service should implement and the machine translation flow, the MachineTranslationSaveJob..., in docs/advanced

Also, I found many models with incorrect translatable_fields. Can you please re-check that all models have the correct fields defined? Maybe we're missing some.

@mrcasals
Contributor Author

@tramuntanal I've addressed the feedback. Can you re-review this PR and #6385 please? 😄

mrcasals and others added 19 commits August 14, 2020 14:18
* Base branch

* remove file

* Require confirmation on exiting a survey mid-answering (#6118)

* Require confirmation on exit

* Add specs

* Use path instead of url

* Fix changelog

* Trigger build

* Fix expected path on test

* Fix method call

* Take textareas and selects into account

* WIP adding concern

* Adding concern in all the models which have translatable fields

* removed :extended_data as translatable field

* WIP adding concern

* Adding concern in all the models which have translatable fields

* Revert "Require confirmation on exiting a survey mid-answering (#6118)"

This reverts commit bdeb933.

* Revert "remove file"

This reverts commit 2565dbb.

* Revert "Base branch"

This reverts commit 2a09cc4.

Co-authored-by: Marc Riera Casals <mrc2407@gmail.com>
…6128)

* Adding setting to organizations table and creating global config

* Adding config accessor to core.rb

* Base branch

* Added a check to display machine translation settings and changed initializer value

* Fixing lint issue in migration file

* Adding test and removing test file

Co-authored-by: decidim-bot <decidim-bot@users.noreply.github.com>
Co-authored-by: anagha <anagha1996@gmail.com>
* Base branch

* remove file

* Idenifying translatable fields in meetings and comments

Co-authored-by: Marc Riera Casals <mrc2407@gmail.com>
Co-authored-by: Marc Riera Casals <mrc2407@gmail.com>
Co-authored-by: Marc Riera <mrc2407@gmail.com>
@mrcasals mrcasals force-pushed the feat/machine-translation branch from fc17203 to 34519e4 Compare August 14, 2020 12:19
Contributor

@tramuntanal tramuntanal left a comment

Good @oriolgual and @mrcasals !

@tramuntanal tramuntanal merged commit 883a7d1 into develop Aug 17, 2020
@tramuntanal tramuntanal deleted the feat/machine-translation branch August 17, 2020 10:38
andreslucena added a commit to decidim/documentation that referenced this pull request Aug 27, 2020
@mrcasals mrcasals mentioned this pull request Aug 31, 2020