-
Notifications
You must be signed in to change notification settings - Fork 2.1k
RFC: prisma migrate commit #25889
Description
RFC: prisma migrate commit
Problem statement
When working on a new feature in your application that involves changing the database schema you often need need multiple iterations until you get it right, especially when adding new models.
The case of initial prototyping before creation of the first migration is supported in Prisma well but iterating on new changes without thorough planning is not as ergonomic.
Applying each and every change with migrate dev will lead to long and messy migration history that needs to be squashed and rewritten before merging, especially if some of the steps may be undoing the others. Unnecessary intermediate migrations can lead to longer deployment and migration times, increased risk of errors and increased risk of degraded user experience due to locking if they touch existing tables. Squashing them is not hard but it’s a hassle.
Using db push during development, like during the initial prototyping, and then creating a migration using migrate dev before committing the code is more straightforward and leads to better results. The downside, however, is that creating that migration requires resetting the development database:
Drift detected: Your database schema is not in sync with your migration history.
The following is a summary of the differences between the expected database schema given your migrations files, and the actual schema of the database.
It should be understood as the set of changes to get from the expected schema to the actual schema.
[*] Changed the `Post` table
[+] Added column `banana`
? We need to reset the PostgreSQL database "db" at "localhost:5432".
Do you want to continue? All data will be lost. > (y/N)
While it is certainly a good practice to have a seed script with good test data that can be applied any moment and treat the development database as disposable, it is not always the case, and sometimes it is desirable to keep the local data.
Proposed solution
I propose a new command for creating migrations: migrate commit. Unlike migrate dev, it requires the database schema to be in sync with the Prisma schema but not with the latest migration.
This command writes the pending changes in the database schema into a new migration file, and marks this migration as applied. This allows to use db push to interactively iterate on the schema, and then commit the changes at once without losing any data.
Essentially this command expects the database schema to be different than the state recorded in the migration history, and diffs from the migrations directory to the database rather than to the schema file (hence the requirement that the database schema and the Prisma schema are in sync, which is not required or even expected by migrate dev).
Other use case: baselining
Mechanically this is very similar to baselining, the only difference being that for baselining we diff from the empty state and not from the existing migrations.
Since this command should support creating the migration history when it doesn’t exist, it means we also get a high-level command for baselining for free: prisma db pull && prisma migrate commit.
Couldn't prisma migrate dev just not reset the database if database schema and Prisma schema are in sync?
It could in principle. One way to justify it is that if the database schema is in sync with the Prisma schema, then it is also in some sense in sync with the migration history — just not with the current state of it but with the future state that is exactly being created right now — it’s not a part of a completely different timeline as far as the migration history is concerned.
But it would make the mental model and the code more complicated:
- The command would have somewhat different behavior depending on whether the schema is in sync or not.
- You'd still get the schema drift warning and the reset prompt if you applied some changes with
db push, then edited the schema file once more and ranmigrate devto migrate to the final state of the schema file, which may be confusing and/or annoying. migrate dev --create-onlywould leave the database in a weird state where the migration is de-facto applied but not recorded as applied, so there would need to be an additional check to ignore or forbid this flag in this case.
To me it feels like these are two separate commands after all.
When is it not useful?
Basically in the cases where you’d want to use migrate dev --create-only, such as when you need to write data migrations.
When is it especially useful?
Iteratively working on mostly additive changes in the scope of one code change, e.g., create a model, then add a bunch more fields, then remove one, then realize you need to add one more.
Implementation
This does not require any substantial changes in the Schema Engine outside of adding a new handler in the JSON-RPC frontend, it will fully rely on the existing core functionality.
This can also be implemented fully in TypeScript in the @prisma/migrate package using the existing JSON-RPC methods but it will require one extra diffing if we want to print a nice summary of the changes and will duplicate the migration file and directory creation logic, so the engine-side implementation is better.
It can also be implemented in terms of the existing low-level CLI commands, here’s the reference implementation as a shell script:
#!/bin/sh
set -eu
print_help () {
>&2 echo "Commit all changes interactively created with \`prisma db push\` to a new"
>&2 echo "applied migration without schema drift errors and without losing any data."
>&2 echo
>&2 echo "Usage: $0 migration_name [--schema=/path/to/schema.prisma] [--migrations=/path/to/migrations_directory] [--help]"
>&2 echo
>&2 echo " migration_name The name of the migration to create"
>&2 echo " --schema=<schema_path> The path to the schema.prisma file"
>&2 echo " --migrations=<migrations_dir> The path to the migrations directory"
>&2 echo " --help Print this help message"
}
if [ $# -lt 1 ]; then
print_help
exit 1
fi
migration_name="$(date -u +"%Y%m%d%H%M%S")_$1"
shift
while [ $# -gt 0 ]; do
case "$1" in
--schema=*)
schema_file="${1#*=}"
shift
;;
--migrations=*)
migrations_directory="${1#*=}"
shift
;;
--help)
print_help
exit 0
;;
*)
>&2 echo "Error: unexpected argument \`$1\`"
exit 1
;;
esac
done
schema_file="${schema_file:-$(pwd)/prisma/schema.prisma}"
migrations_directory="${migrations_directory:-$(pwd)/prisma/migrations}"
npx prisma migrate diff \
--from-schema-datasource="$schema_file" \
--to-schema-datamodel="$schema_file" \
--exit-code > /dev/null || {
if [ $? -eq 2 ]; then
>&2 echo "Error: the database and the schema are not in sync. Please run \`prisma db push\` first."
fi
exit 1
}
echo "Creating migration \`$migration_name\`"
if [ -d "$migrations_directory" ]; then
from_arg="--from-migrations=$migrations_directory"
else
from_arg="--from-empty"
provider=$(grep -A 3 '^ *datasource ' prisma/schema.prisma | grep '^ *provider *=' | sed -e 's/^ *provider *= *"\(.*\)"/\1/')
mkdir -p "$migrations_directory"
echo "provider = \"$provider\"" > "$migrations_directory/migration_lock.toml"
>&2 echo "Initialized migrations directory at \`$migrations_directory\`"
fi
migrate_diff () {
set +e
npx prisma migrate diff \
"$from_arg" \
--to-schema-datasource="$schema_file" \
--exit-code \
"$@"
error_code=$?
set -e
case $error_code in
0)
>&2 echo "Error: no changes in the database since the last migration."
return 1
;;
2)
# Non-empty diff, continuing
return 0
;;
*)
return $error_code
;;
esac
}
# Print the human-readable summary to the screen
migrate_diff
mkdir -p "$migrations_directory/$migration_name"
# Write the actual SQL migration to the disk
migrate_diff --script > "$migrations_directory/$migration_name/migration.sql"
npx prisma migrate resolve --applied "$migration_name" --schema="$schema_file"