- Download Docker Desktop, restart your computer after installation, and make sure Docker Desktop is running before continuing. In Docker Desktop, especially on Macs, increase the available memory to at least 8GB.
- Clone this repo to your local machine and cd into the repo in a terminal or CMD
- Run docker-compose build to build (required on first boot and after each update) and docker-compose up to start the server
- Copy and paste the Jupyter server notebook URL from the terminal into your local web browser.
To update, git pull the repo and run docker-compose build to rebuild. A successful build or rebuild ends with Successfully tagged meme-observatory_jupyter:latest.
Configure settings such as the subreddit to download from in config.py.
Not Starting:
- Make sure Docker Desktop is running
- Pull the repo and run docker-compose build
- Try Kernel > Restart and Clear Output in Jupyter
Download Troubles:
- Clear /data/platforms/reddit/ of all files
- Make sure the folder structure /data/platforms/reddit/<subreddit-name>/ exists
Rendering Issues:
- Clear the mosaics folder in /data/platforms/reddit/<subreddit-name>
- Create the mosaics folder in /<subreddit-name> if it doesn't exist
- Clear all .gz files in /<subreddit-name> and restart the whole process
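
The folder checks and rendering cleanup above can also be scripted. The following is a minimal sketch, not part of the repo: the function name reset_subreddit_dir and its data_root and subreddit parameters are hypothetical, and it assumes data_root points at wherever the data folder lives on your machine. It deletes files, so check the path before running it.

```python
from pathlib import Path


def reset_subreddit_dir(data_root, subreddit):
    """Ensure the expected folders exist and clear stale render artifacts.

    data_root and subreddit are illustrative parameters, not names from the repo.
    Returns the sorted list of files that were deleted.
    """
    subreddit_dir = Path(data_root) / "platforms" / "reddit" / subreddit
    mosaics_dir = subreddit_dir / "mosaics"

    # Create <subreddit-name>/ and its mosaics folder if either is missing.
    mosaics_dir.mkdir(parents=True, exist_ok=True)

    # Collect every file inside mosaics/ plus the .gz files next to it.
    stale = [p for p in mosaics_dir.iterdir() if p.is_file()]
    stale += list(subreddit_dir.glob("*.gz"))

    for path in stale:
        path.unlink()
    return sorted(stale)
```

For example, reset_subreddit_dir("data", "<subreddit-name>") would leave an empty mosaics folder in place, after which the whole process can be restarted from the notebook.
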
Email us: code@memeticinfluence.com
Tweet us: @memeinfluence
Text us: 855-420-MEME (6363)
Intro from Leon Yin
The Meme Observatory is an open source computer vision toolkit used to trace and measure image-based activity online. These notebooks download imagery from various sources, such as subreddits, and then create image-clustered mosaics over time. This helps us understand, visually, what sorts of source content and edits are being spread into an online community.
In order to use the two functions of the Meme Observatory (image mosaics and reverse image search), images need to be transformed into differential features. To do this we use two computer vision techniques: d-hashing and feature extraction using a pre-trained neural network.
D-Hashing creates a fingerprint for an image that stays the same regardless of size or minor color variations. With this technique it is easy to check for duplicate images. This method is also quick and computationally light. We use the imagehash Python library to do this.
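
To make the fingerprint idea concrete, here is a minimal pure-Python sketch of a difference hash. It is an illustration rather than the imagehash implementation: it assumes the image has already been shrunk to a small grayscale grid, the pixel values are made up, and the comparison direction (left brighter than right) is just one possible convention.

```python
def dhash(pixels):
    """Difference hash of a grayscale grid given as rows of brightness values.

    Each row yields one bit per pair of horizontally adjacent pixels.
    """
    bits = 0
    for row in pixels:
        for left, right in zip(row, row[1:]):
            # Bit is 1 when brightness drops moving left to right, else 0.
            bits = (bits << 1) | (1 if left > right else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions where two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# A tiny 2x3 "image" (hypothetical values), a uniformly brighter copy,
# and a copy with one pixel changed.
original = [[10, 40, 20],
            [90, 30, 60]]
brighter = [[value + 15 for value in row] for row in original]
edited = [[10, 40, 20],
          [90, 30, 10]]

print(hamming_distance(dhash(original), dhash(brighter)))  # 0
print(hamming_distance(dhash(original), dhash(edited)))    # 1
```

Because only the relative brightness of neighbouring pixels is recorded, a uniform brightness shift leaves the hash unchanged, while a local edit flips a small number of bits. A distance of zero, or below a small threshold, is what flags a likely duplicate.
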
Neural networks are able to learn numeric attributes used to differentiate between images. These numbers are continuous, which allows us to calculate similarity. These features are what allow us to cluster images for the mosaic and rank similarity for the reverse image search. Thanks to the dedication of open source developers and researchers, implementation is relatively easy. However, it requires a lot of matrix math, which is a lot of work for a regular computer. This process is greatly accelerated using a computer with a graphics processing unit (GPU). We use PyTorch to do this.
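
As a rough illustration of how continuous features support ranking for a reverse image search, the sketch below scores stored feature vectors against a query vector. Everything in it is an assumption made for the example: the three-number vectors stand in for the much longer vectors a pre-trained network would produce, the file names are invented, and cosine similarity is one common choice of measure, not necessarily the one the notebooks use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def rank_by_similarity(query, features):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in features.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical feature vectors keyed by invented file names.
features = {
    "meme_a.jpg": [0.9, 0.1, 0.0],
    "meme_a_edit.jpg": [0.8, 0.2, 0.1],
    "unrelated.jpg": [0.0, 0.1, 0.9],
}
ranking = rank_by_similarity([0.9, 0.1, 0.0], features)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

The same pairwise scores could feed a clustering step for the mosaic. With real network features the arithmetic is identical but runs over far larger vectors and many more images, which is where a GPU helps.
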
These two techniques serve somewhat different purposes. The Meme Observatory architecture intends to take advantage of both techniques when appropriate.
We seek to empower newsrooms, researchers and members of civil society groups to easily research and debunk coordinated hoaxes, harassment campaigns, and racist propaganda that originate from unindexed online communities.
Specifically, the Meme Observatory will help evidence-based reporting and research into content that is ephemeral, unindexed and toxic in nature. The Meme Observatory would allow a greater variety of users to navigate and investigate these spaces in a more secure and systematic way than is currently available. Formalizing how we observe this content is of utmost importance, as extended contact with these spaces is unnecessary and can lead to vicarious trauma and, in some rare cases, radicalization. The Meme Observatory allows users to distance themselves from tertiary material not relevant to their investigation, while providing context vertically and horizontally.
Credits - visit us at www.memeticinfluence.com
Developed by:
memetic influence
Ported by Jansen Derr
Last Updated: 2021-02-26
Technique by:
Leon Yin + Dr. Joan Donovan - HKS Shorenstein Center - TaSC
Published: 2019-04-06


