Skip to content

feat: add last_chunk_time to accurately clean up abandoned uploads without interrupting long transfers#2617

Merged
monkeyiq merged 3 commits intofilesender:development3from
victoritis:feature/abandoned-upload-cleanup-last-chunk-time
Mar 17, 2026
Merged

feat: add last_chunk_time to accurately clean up abandoned uploads without interrupting long transfers#2617
monkeyiq merged 3 commits intofilesender:development3from
victoritis:feature/abandoned-upload-cleanup-last-chunk-time

Conversation

@victoritis
Copy link
Copy Markdown

Feature: last_chunk_time — Precise Detection of Abandoned Uploads

Summary

This PR introduces a new field last_chunk_time to the transfers table and the Transfer class.
Its purpose is to accurately track when the last file chunk was received for a transfer, enabling
the cleanup cron job to distinguish between genuinely abandoned uploads and large files that are
legitimately still uploading.

It also fixes an unrelated but critical bug in allFailed() where date-only comparison caused an
effective cleanup window of 24–48 h instead of the configured value.


Motivation

Real-world context

This problem was encountered in a storage-constrained FileSender installation where:

  1. Disk space is limited. Available storage is a finite, scarce resource that must be managed carefully.
  2. Legitimate uploads can take more than a day. Users regularly upload very large files (hundreds of GB)
    whose transfer takes longer than 24 hours to complete. These must not be interrupted.
  3. FileSender accounts for in-progress uploads when computing available quota / storage.
    An incomplete transfer with status uploading is treated as "space in use", even if the user
    abandoned the upload days ago and no more chunks will ever arrive.

The combination of these three factors creates a compounding problem:

Multiple users start uploads and abandon them mid-way (browser crash, network drop, user gives up).
The transfers remain in uploading status for failed_transfer_cleanup_days (default: 7 days).
During those 7 days, FileSender's storage accounting treats the partially uploaded bytes as
occupied space
, effectively blocking new uploads that would otherwise fit on disk.
Administrators are misled into believing the disk is full when in fact most of the "used" space
belongs to phantom, abandoned transfers.

The original cleanup mechanism was too coarse

The cron job (scripts/task/cron.php) calls Transfer::allFailed() to detect and delete stuck
incomplete transfers. Before this change, the detection query was:

created < :date
  AND (status = 'created' OR status = 'started' OR status = 'uploading')

The :date threshold was computed as NOW() - failed_transfer_cleanup_days.

This had two conflicting flaws that could not be resolved simultaneously:

Goal Required action Conflict
Free space quickly by deleting truly abandoned uploads Set failed_transfer_cleanup_days to a small value (e.g. 1 day) Deletes actively uploading large files that simply take more than 1 day
Allow large multi-day uploads to complete Set failed_transfer_cleanup_days to a large value (e.g. 7 days) Phantom abandoned uploads occupy storage for up to 7 days before cleanup

There was no way to distinguish between a transfer that is actively uploading chunks and one
that was silently abandoned — both look identical from the created column alone.

The default value of failed_transfer_cleanup_days = 7 merely masked the problem for most users.
Any installation with limited disk or with active large uploads was in an impossible bind.

What this PR enables

By tracking last_chunk_time — the timestamp of the most recent chunk received — the system can
now make the correct distinction:

  • Truly abandoned upload: last_chunk_time stops advancing → cleaned up after
    failed_transfer_cleanup_days of inactivity.
  • Large active upload: last_chunk_time advances by at least one chunk per minute →
    never cleaned up, regardless of how long the transfer has been running.

This means failed_transfer_cleanup_days now means exactly what its name says: days of
inactivity, not days since creation. A value of 1 day is now safe and effective: it frees
abandoned upload space within 24 hours while allowing a 10-day upload of a multi-TB file to
complete without any risk of premature deletion.


Changes

1. classes/data/Transfer.class.php

1a. New field in dataMap

'last_chunk_time' => array(
    'type' => 'datetime',
    'null' => true
),

Adds last_chunk_time as a nullable datetime column, mapped to the ORM.

1b. New FAILED query — uses last_chunk_time instead of created

const FAILED = "last_chunk_time < :date
    AND (status = 'created' OR status = 'started' OR status = 'uploading')
    ORDER BY expires ASC";

The cleanup condition now checks inactivity (time since last chunk), not age (time since
creation). A transfer uploading a 500 GB file for 5 days is not abandoned — it last received a
chunk a minute ago.

1c. Property declaration

protected $last_chunk_time = null;

Required for PHP's magic method system. Without this explicit declaration, __set() is triggered
on assignment and, since last_chunk_time was not in the whitelist, would throw a
PropertyAccessException (HTTP 500 on every POST /rest.php/transfer).

1d. __get() whitelist

, 'last_chunk_time'

Added to the in_array list inside __get() so $transfer->last_chunk_time can be read without
triggering PropertyAccessException. Needed by File::writeChunk().

1e. __set() handler

} elseif ($property == 'last_chunk_time') {
    $this->last_chunk_time = (int)$value;

Allows writing $transfer->last_chunk_time = time() safely from both create() and
File::writeChunk().

1f. Initialization in create()

$transfer->last_chunk_time = time();

Every new transfer gets last_chunk_time set to its creation timestamp. This ensures that if no
chunk is ever uploaded (e.g. the user closes the browser before starting), the transfer will still
be cleaned up after failed_transfer_cleanup_days of inactivity.

1g. Fix: allFailed() — use full datetime, not date-only

Before:

return self::all(self::FAILED, array(
    ':date' => date('Y-m-d', time() - ($days * 24 * 3600))
));

After:

return self::all(self::FAILED, array(
    ':date' => date('Y-m-d H:i:s', time() - ($days * 24 * 3600))
));

date('Y-m-d', ...) truncates to midnight, so the effective cleanup window is anywhere between
N and N+1 days depending on when the cron runs. Using 'Y-m-d H:i:s' makes the threshold
exact: a transfer inactive for exactly failed_transfer_cleanup_days × 24 h will be eligible for
cleanup regardless of the time of day.


2. classes/data/File.class.phpwriteChunk() update

public function writeChunk($data, $offset = null) {
    // ... existing chunk write logic ...

    // Update last_chunk_time on the parent transfer, throttled to once per minute
    // to avoid thousands of DB writes for large files with many small chunks.
    if (!$this->transfer->last_chunk_time
        || (time() - $this->transfer->last_chunk_time) >= 60) {
        $this->transfer->last_chunk_time = time();
        $this->transfer->save();
    }
}

Each time a chunk is successfully written, the transfer's last_chunk_time is updated. The
throttle (>= 60 seconds) prevents excessive UPDATE queries: a 1 GB file with 5 MB chunks
generates ~200 chunks — without throttling that would be 200 UPDATE statements per file per
transfer. With throttling it is bounded to one per minute regardless of file count or chunk size.


3. scripts/upgrade/database.php — schema migration + backfill

3a. Column creation

The updateTable() function (already present in database.php) detects and adds missing columns.
It will automatically create last_chunk_time as a DATETIME NULL column on the transfers
table for existing installations.

3b. Backfill for pre-existing rows

// Active uploads: give them NOW() so the cron does not delete them immediately.
UPDATE transfers
   SET last_chunk_time = NOW()
 WHERE last_chunk_time IS NULL
   AND status IN ('created', 'started', 'uploading');

// Everything else: use creation date (irrelevant for cleanup,
// since FAILED only targets incomplete statuses).
UPDATE transfers
   SET last_chunk_time = created
 WHERE last_chunk_time IS NULL;

This backfill runs every time database.php is executed, but is idempotent: it only touches rows
where last_chunk_time IS NULL, so subsequent runs are no-ops.

The distinction between active and non-active statuses during backfill is intentional:

  • Active (created/started/uploading): setting NOW() grants them a fresh
    failed_transfer_cleanup_days window, preventing immediate deletion on the next cron run.
  • Completed (available/closed/etc.): the value doesn't matter for cleanup purposes, but
    created is a semantically accurate fallback.

Files changed

File Change
classes/data/Transfer.class.php dataMap, FAILED, property, __get, __set, create(), allFailed()
classes/data/File.class.php writeChunk() throttled update of last_chunk_time
scripts/upgrade/database.php Column migration + idempotent backfill

Testing

Manual test procedure

  1. Upload any file to create a transfer (status becomes available).
  2. Confirm last_chunk_time is populated:
    SELECT id, status, last_chunk_time, created FROM transfers ORDER BY id DESC LIMIT 5;
  3. Simulate an abandoned upload:
    UPDATE transfers
       SET status = 'uploading',
           last_chunk_time = DATE_SUB(NOW(), INTERVAL 25 HOUR)
     WHERE id = <your_id>;
  4. Run the cron:
    php scripts/task/cron.php --verbose
  5. Confirm the transfer was deleted (failed, deleting it in output) and the row is gone.
  6. Verify that a transfer with last_chunk_time = NOW() is not deleted by the cron.

Throttle verification

Upload a large file and observe the database during upload:

SELECT last_chunk_time FROM transfers WHERE id = <id>;

The timestamp should advance roughly once per minute, not on every chunk.


Backwards compatibility

  • The new column is nullable — no existing row is broken if the migration hasn't run yet.
  • The backfill is idempotent — safe to run on every startup.
  • No API surface changes. All modifications are internal to the ORM layer and the cron task.
  • failed_transfer_cleanup_days semantics are preserved; only the reference timestamp changes
    from "when the transfer was created" to "when the last chunk was received".

@victoritis
Copy link
Copy Markdown
Author

Fix: "Forget it" button on resume-upload modal

Problem

When an upload failed and the user dismissed the resume modal with
"Forget it", only the localStorage entry was removed.
The transfer remained orphaned in the database with uploading status,
reserving disk space indefinitely until the cleanup cron detected it
(using last_chunk_time as the abandonment signal).

Solution

Before clearing localStorage, a DELETE /rest.php/transfer/{id} request
is sent, triggering FileSender's cascade delete:
storage files, recipients, and audit log entries are all removed.

If the DELETE fails (e.g. the cron already cleaned it up), localStorage
is cleared anyway so the user is never left stuck.

Modified file

  • www/js/upload_page.js — "Forget it" button handler inside the
    resume-upload modal (filesender.ui.transfer.resumeOrForgetTransfer)

@monkeyiq monkeyiq merged commit 00b5135 into filesender:development3 Mar 17, 2026
5 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants