
Use content classification systems for better SPAM detection #10038

@andreslucena


Ref: SPAM06

This proposal was originally created by @ahukkanen and is available at
https://meta.decidim.org/processes/roadmap/f/122/proposals/16256

There are a couple of changes introduced by @decidim/product.

Is your feature request related to a problem? Please describe.

SPAM users are becoming an increasingly big problem for all Decidim instances. They register profiles to put advertisements in their profile bio or a SPAM link in their personal URL, and they flood the comments section with SPAM.

This is a real issue that causes a lot of extra work for the moderators of the platform. We should apply some automation to help with their work.

Describe the solution you'd like

There is a gem available named Classifier Reborn which provides two alternative content classification algorithms:

  • Bayes - The system is trained using a predefined set of sentences labelled as good or bad. When classifying content, it applies a word density search for the new content against this predefined database and provides the probability that the new content is good or bad.

  • Latent Semantic Indexer (LSI) - Follows similar logic to the above but adds semantic indexing to the equation. Slower but more flexible.
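
As a rough illustration of the Bayes approach described above, here is a minimal sketch in plain Ruby. It is not Classifier Reborn's implementation or API: `TinyBayes`, its method names, and the training sentences are all made up for this example, and it assumes simple add-one smoothing over word counts.

```ruby
# Illustrative naive Bayes text classifier (hypothetical, not the gem's code).
class TinyBayes
  def initialize(*categories)
    @word_counts = categories.to_h { |c| [c, Hash.new(0)] }
    @doc_counts = categories.to_h { |c| [c, 0] }
  end

  # Record the words of one labelled example.
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Returns a hash of category => probability; the values sum to 1.0.
  def probabilities(text)
    words = tokenize(text)
    vocabulary = @word_counts.values.flat_map(&:keys).uniq.size
    total_docs = @doc_counts.values.sum.to_f

    log_scores = @word_counts.to_h do |category, counts|
      total_words = counts.values.sum
      # Add-one smoothing keeps unseen words and empty categories non-zero.
      log_prior = Math.log((@doc_counts[category] + 1) / (total_docs + @doc_counts.size))
      log_likelihood = words.sum do |word|
        Math.log((counts[word] + 1.0) / (total_words + vocabulary + 1))
      end
      [category, log_prior + log_likelihood]
    end

    # Normalise in a numerically stable way.
    max = log_scores.values.max
    weights = log_scores.transform_values { |score| Math.exp(score - max) }
    sum = weights.values.sum
    weights.transform_values { |weight| weight / sum }
  end

  def classify(text)
    probabilities(text).max_by { |_, probability| probability }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[a-z0-9']+/)
  end
end

classifier = TinyBayes.new(:spam, :ham)
classifier.train(:spam, "You are the lucky winner! Claim your holiday prize.")
classifier.train(:spam, "Cheap pills, click this link to claim your prize")
classifier.train(:ham, "I think the new bike lanes proposal needs a budget estimate")
classifier.train(:ham, "Thanks for organizing the neighbourhood meeting")

classifier.classify("Claim your prize, lucky winner") # => :spam
```

A real integration would presumably delegate to the gem and persist the trained data; the sketch only shows how word counts turn into a probability per category.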

More information available from:

Based on one of these algorithms, we could calculate a SPAM probability score for any content the user enters, as well as for the user profile itself whenever it is updated. The profile matters because in recent years we have seen many users create SPAM profiles to get a backlink to their site for improved SEO scores.

The only automated action will be to report the user account (and SPAM contents) to the Moderation panel, so a human can review the report and hide/block it if it is indeed SPAM. In the future, once we have more experience, this could evolve into automatically hiding the content.
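
The report-only behaviour could look roughly like the sketch below. Everything here is hypothetical: `SpamReport`, `review_for_spam`, the `0.8` threshold, and the plain array standing in for the Moderation panel are assumptions for illustration, not existing Decidim code. The scorer is any callable that returns a SPAM probability between 0.0 and 1.0.

```ruby
# Hypothetical report record; a real system would use its own report model.
SpamReport = Struct.new(:reportable, :reason, :score)

# Reports the item when its score reaches the threshold. It never hides or
# blocks anything: that decision stays with a human moderator.
# The 0.8 default is an assumed value that would need tuning.
def review_for_spam(reportable, text, scorer:, queue:, threshold: 0.8)
  score = scorer.call(text)
  return nil if score < threshold

  report = SpamReport.new(reportable, "spam", score)
  queue << report
  report
end

# Stand-in scorer for the example; a trained classifier would go here.
keyword_scorer = ->(text) { text.match?(/lucky winner|claim your/i) ? 0.95 : 0.05 }
moderation_queue = []

review_for_spam("comment#1", "You are the lucky winner! Claim your holiday prize.",
                scorer: keyword_scorer, queue: moderation_queue)
review_for_spam("comment#2", "I support this proposal.",
                scorer: keyword_scorer, queue: moderation_queue)

moderation_queue.map(&:reportable) # => ["comment#1"]
```

Keeping the threshold as a parameter would make it easier to move later from reporting to automatic hiding without touching the scoring code.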

Describe alternatives you've considered

  • Manually moderating all users/content that are considered SPAM - very work heavy
  • Using 3rd party APIs to detect SPAM, but they are likely no better than what is suggested above, and they come with a cost (or alternatively with a privacy impact)

Additional context

The suggested content classification systems with the predefined databases are likely to work only for English. I haven't dug deeper into whether such databases are available for other languages.

But in our experience, most of the SPAM users are spamming in English, so I think such classification systems could solve the problem at least for English SPAM.

If the classification needs to be applied to other languages as well, there could be some way to train the system further with other datasets. By default, it could just be trained in English to get rid of most of the SPAM users.

See original proposal at Metadecidim.

Could this issue impact users' private data?

No.

Funded by

Decidim Association

Acceptance criteria

  • Given that I'm a sysadmin
    When I run the command bin/rails decidim:spam:train:moderation
    Then the algorithm is trained with the previously moderated content.
  • Given that I'm a sysadmin
    When I run the command bin/rails decidim:spam:train:file[path/to/file]
    Then the algorithm is trained with a spam database file.
  • Given that I'm a moderator or an admin
    When I block a participant
    Then the algorithm is trained with their profile data.
  • Given that I'm a moderator or an admin
    When I hide a piece of content
    Then the algorithm is trained with its data.
  • Given that I'm a registered confirmed user
    When I create a proposal with some words that appear as spam (for instance, "You are the lucky winner! Claim your holiday prize.")
    Then the system automatically reports this content.
  • Given that I'm a registered confirmed user
    When I edit a proposal with some words that appear as spam (for instance, "You are the lucky winner! Claim your holiday prize.")
    Then the system automatically reports this content.
  • Given that I'm a registered confirmed user
    When I create a comment with some words that appear as spam (for instance, "You are the lucky winner! Claim your holiday prize.")
    Then the system automatically reports this content.
  • Given that I'm a registered confirmed user
    When I edit a comment with some words that appear as spam (for instance, "You are the lucky winner! Claim your holiday prize.")
    Then the system automatically reports this content.
  • Given that I'm a registered confirmed user
    When I edit my profile with some words that appear as spam (for instance, "You are the lucky winner! Claim your holiday prize.")
    Then the system automatically reports this participant.
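
For the file-based training criterion, the loading step might be sketched as follows. The issue does not specify the spam database file format, so this assumes one example per line in the form `label<TAB>text`; `train_from_file` is a hypothetical helper and the classifier is any object that responds to `train(label, text)`.

```ruby
# Hypothetical loader: the "label<TAB>text" line format is an assumption,
# not something defined by this issue or by Decidim.
def train_from_file(classifier, path)
  trained = 0
  File.foreach(path, chomp: true) do |line|
    label, text = line.split("\t", 2)
    # Skip blank or malformed lines instead of aborting the whole run.
    next if label.nil? || text.nil? || text.strip.empty?

    classifier.train(label.strip.downcase.to_sym, text)
    trained += 1
  end
  trained # number of examples actually used
end
```

Returning the count of accepted lines would let the rake task print a short summary, which helps a sysadmin notice a file in the wrong format.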
