Automate Image Compression with Python over FTP

Image compression isn’t new to the tech SEO world, but with site performance and Core Web Vitals now influencing rankings, it’s time to take action. I’ve done dozens of site audits and find that roughly 80% of performance issues fall into two buckets: images or JavaScript. When images are a major issue, I cheer — it’s one of the easiest problems to fix. A common culprit is sites delivering uncompressed images: files that contain more data (including metadata) than the human eye needs to see a clear, undistorted image.

We can strip that unneeded data, often reducing file sizes by 80% or more. That can have a profound impact on site performance. Too often I see pages loading images larger than 1MB. Unless you’re running an art or photography store (where maximum quality matters), that is excessive. Many common CMSs now include built-in or plugin-based image compression, but I still encounter legacy or uncommon CMSs that don’t — and in those cases you must use FTP. If that’s you, this tutorial will help.

In this tutorial I show how to set up an automated process that downloads new images for the day, compresses them, and uploads them back to the server — all for free.

Requirements and Assumptions

  • Python 3 is installed and basic Python syntax is understood
  • Access to a Linux installation (I recommend Ubuntu) or Google Cloud Platform (some alterations will need to be made)
  • An FTP login to the host server with access to the server image folder

Import Modules and Set Authentication

Before we begin, watch indentation when copying code snippets; they sometimes don’t copy perfectly. Most modules used here ship with a standard Python 3 installation, while dateutil and PIL are third-party packages. PIL is published on PyPI under the name Pillow, and I found I needed to upgrade it to version 8.2. You can update it with this command in your terminal:

pip3 install --upgrade Pillow

  • ftplib: handles the FTP connection
  • pathlib: helps identify the image extension cleanly
  • dateutil: extends the datetime module
  • datetime: handles date and time functions
  • PIL: processes the image compression
  • os: for opening and writing files locally
  • glob: finds local files whose names match a pattern

Import Python Modules

First, import the modules listed above.

from ftplib import FTP
import pathlib
from dateutil import parser
from datetime import date
from datetime import datetime
from PIL import Image
import PIL
import os
import glob

Setup FTP Connection

Next, set up the FTP connection variables. Note: the script as written does not use FTP over TLS. If you require FTP over TLS, make small modifications such as using the FTP_TLS class in place of FTP (see the ftplib documentation). After defining the variables, make the connection and change to the folder where images are stored. Many systems use subfolders; if so, you’ll need a recursive method to loop through them, which is beyond this tutorial.

host = "YOUR_SERVER_IP"
port = 21  # must be an integer; 21 is the default FTP port, change it if your host differs
username = "YOUR_USERNAME"
password = "YOUR_PASSWORD"
img_folder_path = "WHERE_YOUR_IMAGES_ARE_STORED_ON_SERVER"

ftp = FTP()
ftp.set_debuglevel(2)  # verbose protocol output; set to 0 once the script works
ftp.connect(host, port)
ftp.login(username, password)

ftp.cwd(img_folder_path)
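
If your host requires FTP over TLS, the connection step might look like the sketch below. It assumes explicit TLS on the control port your host gives you. The names open_secure_ftp and ftp_factory are hypothetical, introduced here for illustration, and are not part of the original script.

```python
from ftplib import FTP_TLS


def open_secure_ftp(host, port, username, password, folder, ftp_factory=FTP_TLS):
    """Connect with explicit FTP over TLS and change into the image folder.

    ftp_factory is a hypothetical hook so the function can be exercised
    without a real server; by default it is ftplib.FTP_TLS.
    """
    ftp = ftp_factory()
    ftp.connect(host, int(port))   # the port must be an integer
    ftp.login(username, password)  # FTP_TLS.login() secures the control channel first
    ftp.prot_p()                   # also encrypt the data channel used for transfers
    ftp.cwd(folder)
    return ftp
```

Calling ftp = open_secure_ftp(host, port, username, password, img_folder_path) would then stand in for the plain FTP() connection shown above.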

Setup Script Variables

Next, capture the names of all files in the folder using ftp.mlsd(). We only optimize JPG and PNG files, so create a list of extensions to match against server files. Define a local path to store downloaded (uncompressed) images. Store today’s date to match files uploaded today so you only download newly uploaded images — this prevents repeatedly downloading and reprocessing an entire directory. Finally, create a log file if one doesn’t already exist to track optimizations over time.

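names = ftp.mlsd()
imglist = [".jpg",".jpeg",".png",".JPG",".JPEG",".PNG"]
rawpath = "UNCOMPRESSED_IMG_PATH_LOCAL"  # must end with a slash because filenames are appended to it

today = date.today()
# the loop compares against str(datetime)[:10], which is YYYY-MM-DD, so use dashes here
todayshort = today.strftime("%Y-%m-%d")

# the opt subfolder must exist before the log and optimized images are written to it
os.makedirs(rawpath + "opt/", exist_ok=True)
logfile = open(rawpath + "opt/log.txt", "a")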

Loop Through Images

With the file list captured, process only those with a modified date equal to today and an extension in our list. Files uploaded previously, or files with other extensions (for example GIFs or WebP), are skipped.

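for name, facts in names:
    # "modify" is YYYYMMDDHHMMSS, sometimes followed by fractional seconds, so keep the first 14 characters
    mod_date = str(datetime.strptime(facts["modify"][:14],"%Y%m%d%H%M%S"))[:10]
    if pathlib.Path(name).suffix in imglist and mod_date == todayshort: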
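
The tutorial leaves subfolders out of scope, but as a rough sketch, the same date-and-extension filter could be wrapped in a recursive walk. This assumes the server supports MLSD and reports a type fact for each entry. The names iter_todays_images and IMAGE_SUFFIXES are hypothetical and not part of the original script.

```python
import pathlib
from datetime import date, datetime

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # compared in lowercase


def iter_todays_images(ftp, folder, today):
    """Yield server paths of JPG/PNG files under folder that were modified on `today`.

    ftp is any object with an ftplib-style mlsd(path) method.
    """
    for name, facts in ftp.mlsd(folder):
        entry_type = facts.get("type")
        path = f"{folder.rstrip('/')}/{name}" if folder else name
        if entry_type == "dir":
            # descend into subfolders; "cdir" and "pdir" entries are skipped
            yield from iter_todays_images(ftp, path, today)
        elif entry_type == "file":
            modified = facts.get("modify", "")
            if len(modified) < 14:
                continue  # no usable timestamp reported
            # keep YYYYMMDDHHMMSS and drop any fractional seconds
            mod_date = datetime.strptime(modified[:14], "%Y%m%d%H%M%S").date()
            if pathlib.Path(name).suffix.lower() in IMAGE_SUFFIXES and mod_date == today:
                yield path
```

Each yielded path could be passed to the download step in place of name. MLSD timestamps are defined in UTC, so passing a UTC date instead of date.today() may match more reliably for uploads close to midnight.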

Optimize Images

When a file matches the criteria, use ftp.retrbinary() to download it to the local path defined by rawpath. Create a filename variable that includes the local path for the uncompressed image, and create an opt subfolder to store the optimized version so you can revert if needed. Never overwrite the original on your local machine; always keep backups.

Then use the Image module from PIL to open the image and re-save it to the opt subfolder with optimize=True and a reasonable quality (65 is a good starting point). Quality can be adjusted: some images tolerate values as low as 35, but most sit comfortably between 65 and 80. The quality setting applies to JPEG files; PNG is a lossless format, so for PNGs only optimize=True has an effect.

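        # download the original into the local raw folder
        filename = rawpath + name
        with open(filename, 'wb') as raw_file:
            ftp.retrbinary("RETR " + name, raw_file.write)

        filename_opt = rawpath + "opt/" + name

        # re-save a compressed copy in the opt subfolder; the local original is left untouched
        with Image.open(filename) as picture:
            picture.save(filename_opt, optimize=True, quality=65)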
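
As a sketch of the compression step on its own, the helper below re-saves one image and reports the file sizes before and after. It assumes Pillow is installed, and compress_image is a hypothetical name rather than part of the original script.

```python
import os

from PIL import Image


def compress_image(src_path, dst_path, quality=65):
    """Re-save src_path to dst_path with Pillow and return (original_bytes, optimized_bytes)."""
    with Image.open(src_path) as picture:
        # optimize=True asks the encoder for an extra pass to shrink the file;
        # quality is used by the JPEG encoder and has no effect on PNG output
        picture.save(dst_path, optimize=True, quality=quality)
    return os.path.getsize(src_path), os.path.getsize(dst_path)
```

Comparing the two returned sizes before uploading is an easy way to spot images that did not shrink, in which case you may prefer to leave the server copy alone.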

Upload and Log

With the optimized file ready, upload it back to the server to replace the original (keep your local backup until you’re satisfied). Then log the details and close any open files and connections.

        # upload the optimized copy under the same name, replacing the server original
        with open(filename_opt, 'rb') as fp:
            ftp.storbinary('STOR %s' % os.path.basename(filename_opt), fp, 1024)

        org_size = os.path.getsize(filename)
        opt_size = os.path.getsize(filename_opt)
        savings = (org_size - opt_size) / org_size * 100

        # one line per image: date, name, sizes in kilobytes, and percentage saved
        logfile.write(f"{today} - {name} Org: {org_size/1024:.1f}kb Opt: {opt_size/1024:.1f}kb {savings:.1f}% savings\n")

ftp.quit()
logfile.close()
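
The savings figure written to the log is easy to check by hand. Here is a minimal sketch, assuming sizes in bytes; format_log_line, the date, and the filename banner.jpg are hypothetical values chosen for illustration.

```python
from datetime import date


def format_log_line(day, name, org_size, opt_size):
    """Build one log line with sizes in kilobytes and the percentage saved."""
    # guard against a zero-byte original to avoid dividing by zero
    savings = (org_size - opt_size) / org_size * 100 if org_size else 0.0
    return (
        f"{day} - {name} "
        f"Org: {org_size / 1024:.1f}kb "
        f"Opt: {opt_size / 1024:.1f}kb "
        f"{savings:.1f}% savings\n"
    )


# A 200 KB original compressed to 50 KB is a 75% saving.
print(format_log_line(date(2021, 5, 1), "banner.jpg", 204800, 51200), end="")
# prints: 2021-05-01 - banner.jpg Org: 200.0kb Opt: 50.0kb 75.0% savings
```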

That’s it — now it’s time to automate.

Automating the Compression

Here are two options to automate this process:

  1. Send it to the cloud and use Google Cloud Platform. I have a tutorial on setting up Google Cloud Platform with Cloud Functions and Cloud Scheduler.
  2. Automate locally using your system’s cronjob facility if you’re on Linux or macOS. See below.

Linux provides a built-in scheduler via crontab. The crontab stores script entries that control when scripts execute (any time of day, day of week, day of month, etc.), giving you flexible scheduling.

If you go this route, add a shebang to the top of your script. It tells Linux to run the script with Python 3:

#!/usr/bin/python3

To edit the crontab, run:

crontab -e

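It likely opens the crontab file in the vi editor. Add the following line on a blank line at the bottom of the file to run the script at midnight every Sunday. Keep in mind that the script only processes files modified on the day it runs, so a daily run late in the evening (for example 55 23 * * *) will catch that day’s uploads, while a weekly midnight run will catch very little. Use crontab.guru to customize the schedule, and update the path to your script.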

0 0 * * SUN /usr/bin/python3 PATH_TO_SCRIPT/filename.py

If you want to capture a run log, use this variant and customize the path:

0 0 * * SUN /usr/bin/python3 PATH_TO_SCRIPT/filename.py > PATH_TO_FILE/FILENAME.log 2>&1

Save the crontab file and you’re set. Note that the machine must be powered on when the cronjob is scheduled to run.

Conclusion

There you have it — you can now optimize images over FTP for free and automate the process. Set it and forget it. Don’t forget: Core Web Vitals was scheduled to become a ranking factor in May 2021. Future extensions to this tutorial include handling multiple folders and recursion. Now get out there and try it! Follow me on Twitter and share your applications and ideas.

Thanks to James Phoenix for the inspiration after reading his tutorial here:
https://sempioneer.com/python-for-seo/how-to-compress-images-in-python/

Python FTP and Image Compression FAQ

How can image compression be automated using Python over FTP?

Python scripts can automate compressing images and transferring them over FTP by leveraging FTP libraries and image-processing modules.

What Python libraries are commonly used for image compression and FTP interactions?

For image compression, the Pillow library is commonly used. To interact with FTP, the ftplib module enables automated transfers and integration in scripts.

What specific steps are involved in automating image compression with Python over FTP?

The steps include compressing images with Pillow, connecting to an FTP server with ftplib, and transferring optimized images back to the server. These actions are combined into a script for scheduling and automation.

Are there any considerations or limitations to be aware of when automating image compression with Python over FTP?

Consider image size and resolution, weigh the potential quality loss from compression, and make sure FTP connections are secured with appropriate authentication to protect data during transfer.

Where can I find examples and documentation for automating image compression with Python over FTP?

Consult official documentation for Pillow and ftplib, and review online tutorials and Python resources for practical examples and guidance on automating image compression and FTP operations.

Greg Bernhardt