
Use content classification systems for better SPAM detection#10151

Closed
alecslupu wants to merge 14 commits into decidim:fature/prepare-analyzer-events from i-need-another-coffee:ale-add-spam-detection

Conversation

@alecslupu
Contributor

@alecslupu alecslupu commented Dec 14, 2022

🎩 What? Why?

This PR adds a spam detection mechanism, packaged as a standalone bundle that can also be installed in older Decidim installations. Please refer to decidim-tools-ai/Readme.md for configuration details.

📌 Related Issues

Link your PR to an issue

Testing

  1. Follow the installation instructions in the readme file
  2. Index the data
  3. Create some content and check whether it gets marked as spam

📷 Screenshots

Please add screenshots of the changes you're proposing
Description

♥️ Thank you!

@alecslupu alecslupu changed the title Add small minimal files for AI Tools Add AI Tools for spam detection Dec 14, 2022
@alecslupu alecslupu self-assigned this Dec 20, 2022
@alecslupu alecslupu force-pushed the ale-add-spam-detection branch 2 times, most recently from 19010ac to 9161cfa Compare January 5, 2023 19:20
@alecslupu alecslupu marked this pull request as ready for review January 6, 2023 14:31
@alecslupu alecslupu changed the title Add AI Tools for spam detection Use content classification systems for better SPAM detection Jan 6, 2023
@alecslupu alecslupu removed their assignment Jan 8, 2023
@alecslupu alecslupu force-pushed the ale-add-spam-detection branch from abeaf79 to 6d272fe Compare March 2, 2023 19:31
Contributor

@ahukkanen ahukkanen left a comment


Great work so far. There are still some rough corners that you are maybe already aware of. I've added some comments and suggestions for cleaning the code and improving its functionality.

It also turned out that this approach works fairly well, at least for comment spam, given some extra "clean content" training using the non-moderated comments. We still need some changes in the analysis (i.e. assigning weights) to make it perform better, as the analyzer is not optimal right now. For comparison, I also ran the same analysis against the bayes classifier alone.

I tested this against several datasets:

  • From one of our instances
    • 3254 spam comments
    • 6223 blocked user accounts
    • 615 clean comments (that should not be flagged)
  • From a second instance: 308 clean comments, used to test a classifier trained with the clean comments from the first dataset
  • Metadecidim public user accounts

With these datasets I got the following results. In the analysis I used CLD to detect the language of the content, and the classifier gave the content a lower score when the language was Finnish.
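The language-dependent scoring mentioned above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code used in the review: `adjusted_spam_score`, `FINNISH_DISCOUNT` and the toy detector are hypothetical names, and the detector lambda only stands in for a real language detector such as CLD.

```ruby
# Hypothetical sketch: lower a raw spam score when the content is detected as Finnish.
# FINNISH_DISCOUNT is an assumed value, not one taken from the review.
FINNISH_DISCOUNT = 0.5

# raw_score:       spam score produced by some classifier (0.0..1.0)
# detect_language: callable returning a language code for the text;
#                  a stand-in for a real detector such as CLD
def adjusted_spam_score(raw_score, text, detect_language:)
  language = detect_language.call(text)
  language == "fi" ? raw_score * FINNISH_DISCOUNT : raw_score
end

# Toy detector for this example only: treats text containing "ä" or "ö" as Finnish.
toy_detector = ->(text) { text.match?(/[äö]/i) ? "fi" : "en" }

finnish_score = adjusted_spam_score(0.8, "Tämä on hyvä ehdotus", detect_language: toy_detector)
english_score = adjusted_spam_score(0.8, "Buy cheap watches now", detect_language: toy_detector)
```

Passing the detector in as a callable keeps the scoring logic independent of whichever detection library is actually used.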

First instance

Pre-training data shipped with the module

Using bayes classifier alone:

  • 99.78% spam comments flagged as spam
  • 99.29% blocked users flagged as spam (based on profile description)
  • Error rate: 93.17% (clean comments flagged as spam)

Using Decidim::Tools::Ai::SpamContent::Classifier:

  • 11.34% spam comments flagged as spam
  • 15.54% blocked users flagged as spam (based on profile description)
  • Error rate: 1.79% (clean comments flagged as spam)

So the error rates are high so far, but then I added extra training with the clean comments data.

Pre-training data shipped with the module + extra language specific training

Using bayes classifier alone:

  • 98.46% spam comments flagged as spam
  • 99.33% blocked users flagged as spam (based on profile description)
  • Error rate: 1.79% (clean comments flagged as spam)

Using Decidim::Tools::Ai::SpamContent::Classifier:

  • 11.34% spam comments flagged as spam
  • 15.54% blocked users flagged as spam (based on profile description)
  • Error rate: 0% (clean comments flagged as spam)

So the bayes classifier alone performs better here, which is understandable since I trained it with this exact dataset. Let's try it with another set of comments.

Second instance

With the second instance I only tested the error rate, both with the pre-training data alone and with the extra training data from the first instance.

Pre-training data shipped with the module

Using bayes classifier alone:

  • Error rate: 95.13% (clean comments flagged as spam)

Using Decidim::Tools::Ai::SpamContent::Classifier:

  • Error rate: 3.57% (clean comments flagged as spam)

Pre-training data shipped with the module + extra language specific training (from the first instance)

Using bayes classifier alone:

  • Error rate: 11.69% (clean comments flagged as spam)

Using Decidim::Tools::Ai::SpamContent::Classifier:

  • Error rate: 2.27% (clean comments flagged as spam)
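The percentages reported for both instances are shares of a labelled sample that a classifier flags as spam: over spam comments this is the detection rate, over clean comments it is the error rate. A minimal sketch of that calculation follows; `flagged_percentage`, the toy classifier and the sample comments are all illustrative assumptions and not data from the review.

```ruby
# Percentage of samples that the classifier flags as spam, rounded to two decimals.
# classifier is any callable returning true when the text is considered spam.
def flagged_percentage(samples, classifier)
  return 0.0 if samples.empty?

  flagged = samples.count { |text| classifier.call(text) }
  (flagged * 100.0 / samples.size).round(2)
end

# Toy classifier for this example only: flags anything mentioning "casino".
toy_classifier = ->(text) { text.downcase.include?("casino") }

# Made-up samples, labelled by hand.
spam_comments = ["Best CASINO bonus", "casino casino", "Great idea, thanks"]
clean_comments = [
  "I support this proposal",
  "Where is the meeting?",
  "Casino street needs a bike lane",
  "Thanks!"
]

detection_rate = flagged_percentage(spam_comments, toy_classifier) # spam flagged as spam
error_rate = flagged_percentage(clean_comments, toy_classifier)    # clean flagged as spam
```

Reporting both numbers together matters: a classifier that flags everything scores a perfect detection rate and a useless error rate.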

Metadecidim

I fetched the users from the API and scraped their profile descriptions using automation to get a sample set of analysis data.

In total, there were 11705 records to be analyzed that had some content in their profile description.

The bayes classifier alone flagged 99.25% of the users as spam accounts and Decidim::Tools::Ai::SpamContent::Classifier flagged 9.23%. Note that this analysis only included users who have written something in the about section of their profile, which left about 4742 users unanalyzed (i.e. they would be considered clean).

I ran this test with the same pre-training data plus the clean data from the first instance, but the Finnish training data is likely irrelevant here.

I would not be surprised if about 10% of the user accounts at Metadecidim were spam accounts, as this would match our findings from some other instances. But clearly for this case, the bayes classifier would need more pre-training in English, Catalan and Spanish to work better for this dataset. I did not have any such data available, so I couldn't test how it would perform with more training in these languages.


@ahukkanen
Contributor

ahukkanen commented May 22, 2023

I did a bit more research about the Metadecidim users public data and it turns out that the above analysis is not actually far off. It seems around 99% of the users who had a profile description are actual spam accounts.

Judging by the numbers alone, I suspected there was something wrong with the classification, so I did further analysis and fed some more sample data to the bayes classifier: the publicly available SPAM email data, which I also translated into Spanish and Catalan.

Then I made a datasheet of all the analyzed accounts and their contents, which I reviewed manually, and it seems the classifier was doing its job mostly correctly. Most of the users were actually profile spammers. With the additional training data, the bayes classifier flagged about 99.19% of the users as spammers and Decidim::Tools::Ai::SpamContent::Classifier flagged about 9.33%.

So even with this data the bayes classifier seems to work quite well.

Note that it has also made mistakes with the data. Some users who I manually identified as real users have also been classified as spammers, but this is likely to be a very small subset of the data. We can improve on those if we mark these users as "ham" for the next analysis round. We could likely find these users pretty easily by looking at the most active users in Metadecidim who haven't been spamming. I would expect most of these profile spammer accounts to have no activity on the platform (or only a few spam comments).
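Marking a misclassified profile as "ham" and retraining can be illustrated with a tiny naive Bayes classifier. This is a pure-Ruby sketch for illustration only: `TinyBayes` and the training sentences are hypothetical, and it is not the classifier shipped with the module.

```ruby
# Minimal multinomial naive Bayes with add-one smoothing (illustration only).
class TinyBayes
  def initialize
    @word_counts = { spam: Hash.new(0), ham: Hash.new(0) }
    @doc_counts = { spam: 0, ham: 0 }
  end

  # Record one training document under the given category (:spam or :ham).
  def train(category, text)
    @doc_counts[category] += 1
    tokenize(text).each { |word| @word_counts[category][word] += 1 }
  end

  # Return the category with the highest log-probability for the text.
  def classify(text)
    scores = @doc_counts.keys.to_h { |category| [category, log_score(category, text)] }
    scores.max_by { |_category, score| score }.first
  end

  private

  def tokenize(text)
    text.downcase.scan(/[[:alpha:]]+/)
  end

  def vocabulary_size
    @word_counts.values.flat_map(&:keys).uniq.size
  end

  def log_score(category, text)
    total_docs = @doc_counts.values.sum
    total_words = @word_counts[category].values.sum
    prior = Math.log((@doc_counts[category] + 1.0) / (total_docs + @doc_counts.size))
    tokenize(text).sum(prior) do |word|
      Math.log((@word_counts[category][word] + 1.0) / (total_words + vocabulary_size))
    end
  end
end

classifier = TinyBayes.new
classifier.train(:spam, "cheap casino bonus click here")
classifier.train(:spam, "casino bonus online loans")
classifier.train(:ham, "the proposal improves the city park")
classifier.train(:ham, "thanks for organising the meeting in the city")

profile = "Charity casino night tickets"
before_review = classifier.classify(profile) # flagged because of "casino"

# A manual review identifies the author as a real user, so the text is fed back as ham.
classifier.train(:ham, profile)
after_review = classifier.classify(profile)
```

In this toy run the profile is classified as spam before the correction and as ham afterwards, which is the effect the "ham" marking round is meant to have on real users who were flagged by mistake.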

@alecslupu
Contributor Author

@ahukkanen, the concept of having that with_events method works pretty well for updating a resource, but it does not work well when we need to create the resource and then pass it on in the event.
For example, the create_meeting command:

def call
  return broadcast(:invalid) if form.invalid?

  transaction do
    create_meeting!
    schedule_upcoming_meeting_notification
    send_notification
    dispatch_system_event
  end

  create_follow_form_resource(form.current_user)
  broadcast(:ok, meeting)
end

Expanded, it would be:

def call
  return broadcast(:invalid) if form.invalid?

  transaction do
    create_meeting!
    schedule_upcoming_meeting_notification
    send_notification
    ActiveSupport::Notifications.publish(
      "decidim.system.meetings.meeting.created",
      resource: meeting, # this is the resource created in the create_meeting! method
      author: form.current_user,
      locale: I18n.locale
    )
  end

  create_follow_form_resource(form.current_user)
  broadcast(:ok, meeting)
end

Using with_events for this command would be impossible.
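To show how a listener could consume such an event, here is a minimal sketch of the publish and subscribe flow. `TinyEventBus` is a hypothetical pure-Ruby stand-in for ActiveSupport::Notifications so that the example runs on its own, and `Meeting`, `User` and `analysis_queue` are illustrative names; only the event name and the payload keys come from the command above.

```ruby
# Hypothetical stand-in for ActiveSupport::Notifications, just enough to show the flow.
class TinyEventBus
  def initialize
    @listeners = Hash.new { |hash, name| hash[name] = [] }
  end

  def subscribe(name, &listener)
    @listeners[name] << listener
  end

  # Deliver the payload to every listener registered for the event name.
  def publish(name, **payload)
    @listeners[name].each { |listener| listener.call(name, payload) }
  end
end

# Illustrative stand-ins for the real models.
Meeting = Struct.new(:title, :description)
User = Struct.new(:nickname)

bus = TinyEventBus.new
analysis_queue = []

# The listener needs the already created resource in the payload to analyze its content.
bus.subscribe("decidim.system.meetings.meeting.created") do |_name, payload|
  analysis_queue << {
    text: payload[:resource].description,
    author: payload[:author].nickname,
    locale: payload[:locale]
  }
end

# The resource is created first and only then published, as in the expanded command.
meeting = Meeting.new("Park clean-up", "Bring gloves and bags")
bus.publish(
  "decidim.system.meetings.meeting.created",
  resource: meeting,
  author: User.new("alice"),
  locale: :en
)
```

The ordering is the point of the sketch: the payload can only carry the resource once it exists, so the event has to be published after the creation step.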

@alecslupu alecslupu marked this pull request as draft July 3, 2023 20:56
@alecslupu alecslupu changed the base branch from develop to fature/prepare-analyzer-events July 9, 2023 23:06
@alecslupu alecslupu changed the base branch from fature/prepare-analyzer-events to develop July 9, 2023 23:19
@alecslupu alecslupu changed the base branch from develop to fature/prepare-analyzer-events July 9, 2023 23:19
@alecslupu alecslupu force-pushed the fature/prepare-analyzer-events branch 2 times, most recently from 9803056 to 0d11ecc Compare July 13, 2023 16:04
@alecslupu alecslupu force-pushed the ale-add-spam-detection branch 5 times, most recently from 3420091 to 910cc68 Compare July 15, 2023 09:20
Add Gitlab action workflow

Patch the generator

Running linters

Gemfiles
@alecslupu alecslupu force-pushed the ale-add-spam-detection branch from 910cc68 to d204dc7 Compare July 15, 2023 18:55
@alecslupu alecslupu marked this pull request as ready for review July 20, 2023 16:13
@alecslupu alecslupu marked this pull request as draft July 20, 2023 16:13
@ahukkanen ahukkanen deleted the branch decidim:fature/prepare-analyzer-events July 22, 2023 07:33
@ahukkanen ahukkanen closed this Jul 22, 2023