So @pwnallthethings decided to try calling out @wcathcart and @WhatsApp as “scanning content” like Apple; here’s why that’s incorrect & misleading

First up, a disclaimer: I worked at Facebook from 2013 to 2016, and I led the team that built “Messenger Secret Conversations”, referenced below. I know my shit, but I may be a little dated. Also: I am currently working on an Internet-Draft, with the intention of it becoming an RFC, to help people understand and falsify bogus claims of end-to-end security in encrypted messengers.

If I have misspoken about any of this, the relevant people know how to contact me to get it corrected; that said, readers should be aware that I am long out-of-date and writing at speed to address a “hot” issue. There may be errors.

tl;dr: none of what WhatsApp, Facebook, etc. do in the name of user safety involves leaking end-to-end encrypted content, nor does it involve rifling through your photo albums looking for “bad stuff”, nor scanning E2E messages for nudity à la Apple.

Matt (@pwnallthethings) wrote this, and @runasand responded:

https://twitter.com/runasand/status/1423757855082127362

Count me squarely with Runa on this one. There are at least 4 instant messengers of some form that are currently shipped by Facebook, and their characteristics are all different. I will outline them briefly below and discuss the issue in terms of my Internet-Draft, because unpicking stuff like this is meant to be the point of that document.

Facebook Messenger

Not end-to-end secure. Messages are held in cleartext on the Facebook servers. The servers scan messages for abusive content; images are checked against PhotoDNA; links are checked to see whether they point at websites that have a poor reputation, have been reported as insecure, contain malware, etc. Additional checks are applied where one or more participants are minors.

Facebook Messenger Secret Conversations

I worked on this. It will have evolved since I left in 2016, but it’s a “special feature” of Messenger that runs on a different communications stack. Messages are end-to-end encrypted using the Signal protocol, and multidevice support automatically enrolls the devices where you are logged in into your conversations. I no longer have any current knowledge of whether additional checks are applied to minor participants. It supports disappearing messages. Messages and content are not scanned, and especially not with PhotoDNA: Microsoft, who own PhotoDNA, generally will not license it to be put onto phones, tablets, etc., in case someone reverse-engineers and publishes it, leading to publication of weaknesses, bypasses, etc. If a message is copied/reported to the safety team, then of course it will be subject to all kinds of analysis, likely including PhotoDNA.
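The distinction between relayed ciphertext and a user-submitted report can be sketched in a few lines of Python. This is a toy model only: the names `Envelope`, `Report` and `SafetyQueue` are hypothetical, no real cryptography is performed, and nothing here describes Facebook's actual implementation. It only illustrates the claim that content becomes available for analysis when a participant copies it to the safety team, not when it is relayed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    """What the server relays in this model: ciphertext it cannot read."""
    sender: str
    ciphertext: bytes


@dataclass(frozen=True)
class Report:
    """What a participant chooses to send to the safety team: a plaintext copy."""
    reporter: str
    sender: str
    plaintext: bytes


class SafetyQueue:
    """Hypothetical server-side component; all names are assumptions."""

    def __init__(self) -> None:
        self.reviewable: list[bytes] = []

    def relay(self, envelope: Envelope) -> Envelope:
        # Relaying passes the ciphertext through untouched and
        # never adds anything to the review queue.
        return envelope

    def submit(self, report: Report) -> None:
        # Only copies that a user explicitly reports become reviewable.
        self.reviewable.append(report.plaintext)
```

In this model, calling `relay` any number of times leaves `reviewable` empty; only `submit` puts content in front of the (hypothetical) analysis pipeline.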

Note, importantly: your profile image and other “Messengery” stuff are still associated with all these conversations, and are in the clear on the Facebook servers because they are… well, part of Messenger. Nobody seems to complain about that.

WhatsApp

Started off as not-end-to-end-encrypted, became strongly end-to-end encrypted. Runs on yet another communications stack. Messages are end-to-end encrypted using the Signal protocol; multidevice support is apparently being rolled out, as are various niceties like “disappearing messages”. I have no knowledge of whether additional checks are applied to minor participants, or even whether there is a concept of “minor participant”. If a message is copied/reported to the safety team, then of course it will be subject to all kinds of analysis, likely including PhotoDNA.

Possibly because of its formerly unencrypted heritage, possibly as a technical, “discovery”, or strategic decision (I do not know either way), the names of Group Chats and the profile photos associated with them (and therefore presumably the profile photos of users) are not end-to-end encrypted. This content has PhotoDNA run against it, along with other signals such as text analysis of group names, for corporate safety reasons and to look for signals of abusers, e.g. reused profile photos from known-bad actors, attempts to rejoin conversations, etc.
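To make that boundary concrete, here is a minimal Python sketch under loudly stated assumptions. The names `GroupChat`, `server_scan` and `KNOWN_BAD_DIGESTS` are hypothetical, and SHA-256 is only a stand-in for image matching: PhotoDNA is a proprietary perceptual hash and works differently. The sketch is not WhatsApp's code; it only shows a server that inspects cleartext metadata (group name, profile photo) while never opening message ciphertext.

```python
import hashlib
from dataclasses import dataclass, field

# Hypothetical blocklist of digests of known-bad profile images.
# SHA-256 is a stand-in only; it is NOT how PhotoDNA matches images.
KNOWN_BAD_DIGESTS = {hashlib.sha256(b"known-bad-image-bytes").hexdigest()}


@dataclass
class GroupChat:
    # Cleartext metadata: readable by the server in this model.
    name: str
    profile_photo: bytes
    # End-to-end encrypted payloads: opaque ciphertext to the server.
    encrypted_messages: list = field(default_factory=list)


def server_scan(group: GroupChat) -> dict:
    """Scan only what the server can read; message ciphertext is never touched."""
    photo_digest = hashlib.sha256(group.profile_photo).hexdigest()
    return {
        "group_name_seen": group.name,
        "photo_matches_known_bad": photo_digest in KNOWN_BAD_DIGESTS,
        "messages_inspected": 0,  # this function never reads encrypted_messages
    }
```

The point of the design is that `server_scan` has nothing useful to do with `encrypted_messages`: the signal it produces comes entirely from the metadata fields.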

If you want to know more about WhatsApp and where the boundary between E2E versus non-E2E exists, watch this:

Instagram Direct Messages

I have no technical experience of Instagram Direct Messages.

What does it mean to trust an E2E encrypted messenger?

When you adopt a messenger solution, you essentially bring it – and all of its infrastructure – into what is called your “Trusted Computing Base”; even with a “gold standard” app like Signal, you are trusting:

  • the app authors
  • the app builders
  • the app stores
  • the app deployment & infrastructure
  • the inherent, claimed trust and data-protection mechanisms of the app

…to all act “as described” and also according to a bunch of nebulous, poorly-specified expectations of what an end-to-end secure, encrypted messenger is and how it “should behave”.

Back in 1995, if you were sending PGP-encrypted emails to a maillist called “cypherpunks”, then it was pretty likely that the thematic content of your email would have been encryption, even if the content itself was obscured. The same remains true for some forms of group chat today. Signal (apparently) does not leak thematic metadata like group names and group profile icons. WhatsApp does. Telegram, who knows? If this matter disturbs you, choose a solution which better fits your needs, much as you almost certainly choose not to use PGP-over-Email nowadays. Choose and adopt a solution to insert into your Trusted Computing Base, one which best fits your need.

I recommend that you do not – unless you can point at an existing defined standard which enables you to – attempt to compare E2E encrypted messengers, and then blame/shame them for “not doing enough”. “Gut feelings” and (surprisingly) “established norms” are not a good metric in this space – there is much diversity of approach.

A Standard Metric

Speaking of “existing defined standards”, this matter is covered on Slide 28 of my recent IETF presentation:

…and the whole presentation is linked below, and the current working draft is on GitHub.

I welcome your feedback.

Comments

3 responses to “So @pwnallthethings decided to try calling out @wcathcart and @WhatsApp as “scanning content” like Apple; here’s why that’s incorrect & misleading”

  1. Thank you for this post, Alec (I would’ve DM’d you on Twitter but can’t cuz you don’t follow me). I’m not responding to this post’s subject, as I’m not one who thinks WhatsApp is “scanning content like Apple”; I’m just very interested in this privacy protection vs CSAM eradication “debate” while not claiming to be an expert on either “side” (youth online risk being a vast subject of which I’ve focused on the non-criminal aspects that affect the vast majority of U18 Internet users); I just have a couple of questions in the interest of learning more….

    1) Thx for including the link to Matt Jones’s talk. Question about that: Is WhatsApp doing all the same things to detect and remove CSAM-sending/distributing accounts as with spam? In other words could Matt have subbed in “CSAM” for “spam”?

    2) Agree with what you say in your tl;dr based on what I’ve heard as a member of FB’s safety advisory, but are you saying that Apple will actually be “rifling through [iPhone users’] photo albums looking for [CSAM]”? That’s not what I’m “hearing” from multiple descriptions of the technology; in fact, if Apple’s aim was to help reduce (recorded) child sexual exploitation, not just find known CSAM, they would be doing more than designing technology that just scans for the latter. I’m honestly not sure what anti-CSAM activists are applauding; do they feel this is a step in the right direction on Apple’s part? (I’m not asking you that; just wondering out loud.)

    BTW, I searched for looking for context on what he said, and your tweet was the only search result Google turned up. So another question is, what did you mean by that tweet?

