I still see production schemas where a single column choice quietly costs gigabytes, slows indexing, or creates messy data fixes later. That choice is often CHAR vs VARCHAR. You might not notice it in a small dev database, but in real systems—millions of rows, multiple replicas, and strict SLAs—it shows up as bloated storage, avoidable I/O, and confusing comparisons. When I audit an OLTP database, this is one of the first places I look because small gains here scale everywhere else. You should care too: these data types are foundational, and they affect storage, performance, indexing, and data quality.
I’m going to walk you through how CHAR and VARCHAR behave in SQL Server, when I choose each one, and the edge cases that surprise even experienced developers. I’ll include runnable examples, show how they compare under indexing and collation rules, and give you practical rules that have held up for me across 2024–2026 workloads. If you’re designing new schemas or cleaning up old ones, this will save you time and downtime.
What CHAR and VARCHAR Actually Store
CHAR is fixed-length. When you define CHAR(10), SQL Server allocates 10 bytes for every row, even if you store only 3 characters. VARCHAR is variable-length. When you define VARCHAR(10), SQL Server stores only the characters you put in, plus a tiny overhead.
That overhead is 2 bytes per row for the length. So a VARCHAR(10) with 3 characters uses 5 bytes total. A CHAR(10) with 3 characters uses 10 bytes. That difference is nothing in a single row, but in 100 million rows, it’s a huge pile of disk and cache.
Here’s a quick demo you can run in SQL Server to feel the difference:
-- Create a demo table
DROP TABLE IF EXISTS dbo.CharVsVarcharDemo;
CREATE TABLE dbo.CharVsVarcharDemo (
Id INT IDENTITY(1,1) PRIMARY KEY,
FixedCode CHAR(10) NOT NULL,
VarCode VARCHAR(10) NOT NULL
);
-- Insert varying lengths
INSERT INTO dbo.CharVsVarcharDemo (FixedCode, VarCode)
VALUES
('ABC', 'ABC'),
('ABCDE', 'ABCDE'),
('ABCDEFGHIJ', 'ABCDEFGHIJ');
-- Inspect storage size by row
SELECT
Id,
FixedCode,
VarCode,
DATALENGTH(FixedCode) AS FixedBytes,
DATALENGTH(VarCode) AS VarBytes
FROM dbo.CharVsVarcharDemo;
The results will show FixedBytes always equals 10, while VarBytes equals the actual length. I use DATALENGTH because LEN ignores trailing spaces and can mislead you during analysis.
In practice, I choose CHAR only when I have fixed-length data that is truly fixed in reality, not just “usually short.” Examples: ISO country codes (2 characters), state codes (2 characters), or fixed-format identifiers that you control and always fill completely. For anything else, I default to VARCHAR.
How Trailing Spaces Behave (The Trap You’ll Hit Eventually)
SQL Server treats trailing spaces in CHAR and VARCHAR in a way that surprises many people. When you store data in CHAR, SQL Server pads the value with spaces up to the fixed length. When you compare values with =, SQL Server ignores trailing spaces, following the ANSI SQL padding rules. That means 'ABC' and 'ABC ' compare equal.
This can be helpful but also confusing. I’ve watched APIs send VARCHAR values with trailing spaces, then watched comparisons silently succeed even though the raw values differ. That can hide input bugs. With CHAR, trailing spaces are always there, even if you didn’t type them.
Try this:
SELECT
CASE WHEN 'ABC' = 'ABC ' THEN 'Equal' ELSE 'Not Equal' END AS ComparisonResult,
DATALENGTH('ABC') AS BytesA,
DATALENGTH('ABC ') AS BytesB;
You’ll see Equal, but the byte counts differ. This is not just a curiosity; it affects indexes, unique constraints, and how you design keys.
If you store user input with trailing spaces in a VARCHAR, SQL Server still treats it as equal to the trimmed value in comparisons. If you need to preserve trailing spaces as meaningful data (rare), validate in your app layer or compare through a VARBINARY conversion, because equality ignores trailing spaces even under binary collations. I only do this for system-generated values where spaces carry real meaning.
Storage and Page Density: Why It Matters for Performance
SQL Server stores rows on 8 KB pages. The more rows you fit per page, the fewer pages you need to read and cache. Fewer pages generally means faster scans, faster index seeks, and less memory pressure. Fixed-length columns eat space even when empty, reducing page density.
Here’s a simplified way I think about it: If CHAR(50) is used for a column that typically stores 10 characters, you’re wasting 40 bytes per row. At 10 million rows, that’s about 400 MB of wasted storage just for that one column, not counting indexes. That cost often shows up as extra I/O, especially in rowstore indexes.
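You can measure that waste directly instead of estimating. The query below is a sketch against a hypothetical dbo.Orders table with a StatusCode CHAR(50) column; substitute your own table and column names:

```sql
-- Estimate bytes lost to CHAR padding in one column
-- (dbo.Orders and StatusCode are placeholder names)
SELECT
    COUNT(*) AS TotalRows,
    SUM(50 - DATALENGTH(RTRIM(StatusCode))) AS WastedBytes,
    SUM(50 - DATALENGTH(RTRIM(StatusCode))) / 1048576.0 AS WastedMB
FROM dbo.Orders;
```

DATALENGTH(RTRIM(...)) gives the bytes the real content needs, so the difference from the declared width is pure padding.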
To explore page usage, you can use sp_spaceused and sys.dm_db_index_physical_stats:
EXEC sp_spaceused 'dbo.CharVsVarcharDemo';
SELECT
OBJECT_NAME(ips.object_id) AS TableName,
ips.index_type_desc,
ips.page_count,
ips.avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.CharVsVarcharDemo'), NULL, NULL, 'SAMPLED') ips;
When I’m tuning, I compare these metrics before and after switching a column from CHAR to VARCHAR in a staging environment. I usually see improved page density, which translates into more cache hits and lower read latency on heavy scans, often in the 10–15 ms range.
Indexing and Sorting Behavior
Both CHAR and VARCHAR can be indexed, but the choice affects index size and search performance. Larger index keys mean fewer entries per page, which leads to deeper index trees and more IO for seeks. That’s why CHAR can hurt when it’s used for variable-length data: you inflate the index for no real benefit.
I’ve also run into surprises with indexing on CHAR because the padding can produce seemingly redundant keys. Suppose you store a short code in CHAR(10) and expect the index to differentiate 'ABC' and 'ABC '. It won’t: those compare equal because the shorter value is padded with spaces before comparison. That can violate assumptions about uniqueness.
If you need to enforce true uniqueness including trailing spaces (rare), compare and index the value as VARBINARY or store a computed hash; a binary collation alone won’t make equality space-sensitive. Here’s a pattern I’ve used for strict uniqueness without changing collation globally:
-- Example: enforce trailing-space-sensitive uniqueness with a computed column
ALTER TABLE dbo.CharVsVarcharDemo
ADD VarCodeBinary AS CONVERT(VARBINARY(200), VarCode) PERSISTED;
CREATE UNIQUE INDEX UX_CharVsVarcharDemo_VarCodeBinary
ON dbo.CharVsVarcharDemo(VarCodeBinary);
I don’t recommend this unless you truly need it, because it adds complexity. The usual fix is to trim input and stick to VARCHAR.
Real-World Use Cases and What I Choose
Here’s how I decide in practice. I’ll give you the same guidance I give teams I work with.
Use CHAR when
- The value length is fixed by definition, and every value is always full length.
- You control the data format and enforce it strictly.
- You care about fixed-width exports or legacy integrations that require padded values.
Examples that fit:
- ISO country code (CHAR(2)) or state code (CHAR(2)) when strict fixed length is enforced.
- Fixed-format status codes that are always 4 characters, like ACTV, SUSP, CLOS.
- Hash or checksum that is always fixed length and stored in ASCII (though for hashes I prefer BINARY).
Use VARCHAR when
- The length varies in the real world.
- User input is involved.
- You expect partial values, optional middle names, or system-generated strings that can expand.
Examples that fit:
- Names, emails, addresses, product titles, slugs.
- JSON strings or serialized metadata.
- External identifiers where length is not guaranteed across providers.
If you’re unsure, I default to VARCHAR and set a reasonable max length. In 2026, with better tooling and schema linting, it’s easy to monitor and tighten later. It’s much harder to expand a CHAR column without a messy migration.
NULLs, Empty Strings, and Defaults
CHAR and VARCHAR treat NULL the same, but the fixed-width nature of CHAR can hide the difference between NULL and an empty string padded with spaces. This is a frequent bug:
-- Insert with empty string
INSERT INTO dbo.CharVsVarcharDemo (FixedCode, VarCode)
VALUES ('', '');
SELECT
Id,
FixedCode,
VarCode,
DATALENGTH(FixedCode) AS FixedBytes,
DATALENGTH(VarCode) AS VarBytes
FROM dbo.CharVsVarcharDemo;
FixedCode shows 10 bytes even though you inserted an empty string, because it’s padded. If you rely on LEN(FixedCode) = 0 to detect empty values, you’ll get 0, but the storage is still there. That matters for checks and for data quality audits. I recommend consistent input validation: treat empty strings as NULL or enforce a minimal length. Don’t let CHAR padding define your data semantics.
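If you adopt the “empty means NULL” policy, you can enforce it in the schema rather than in every caller. A minimal sketch against the demo table above (the constraint name is my own invention; clean out any existing blank rows first or the ALTER will fail validation):

```sql
-- Reject blank or whitespace-only values; represent "missing" as NULL instead
ALTER TABLE dbo.CharVsVarcharDemo
ADD CONSTRAINT CK_CharVsVarcharDemo_VarCode_NotBlank
CHECK (LEN(RTRIM(VarCode)) > 0);
```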
Collation and Comparisons: Case, Accent, and Binary
Both CHAR and VARCHAR follow collation rules for comparisons and ordering. This matters because the same string can compare differently depending on collation settings. If you’re doing case-sensitive comparisons or you need binary-safe behavior, you need to know how collation interacts with type.
The key point: collation controls case sensitivity, accent sensitivity, and sort order, but it does not change how trailing spaces behave in equality checks. SQL Server follows the ANSI SQL padding rules for the = operator: the shorter operand is padded with spaces before the comparison, even under a binary collation, so CHAR and VARCHAR values that differ only in trailing spaces still compare equal. If you need a byte-exact comparison, compare the values as VARBINARY.
Here’s a demo:
-- Byte-exact comparison via VARBINARY (a binary collation alone won't catch trailing spaces)
SELECT
CASE
WHEN CONVERT(VARBINARY(10), 'ABC') = CONVERT(VARBINARY(10), 'ABC ')
THEN 'Equal'
ELSE 'Not Equal'
END AS BinaryComparison;
This will return Not Equal. If you’re enforcing exact matches for security tokens or checksums, compare as VARBINARY or store a binary type, rather than relying on CHAR or VARCHAR equality in any collation.
Variable-Length Columns and Row Overflow
VARCHAR can store up to 8,000 bytes in-row. If you define VARCHAR(8000) and store data larger than the row can fit, SQL Server can use row-overflow storage. This doesn’t apply to CHAR because it can’t exceed its fixed size, but it also means CHAR(8000) is dangerously large in a rowstore table.
I usually avoid very wide CHAR columns. If you’re storing text that can vary or grow, use VARCHAR, and if it can exceed 8,000 bytes, use VARCHAR(MAX) with appropriate indexing and full-text strategies. In 2026, I’m seeing more teams store JSON in VARCHAR(MAX) or NVARCHAR(MAX) and then index key fields using computed columns. That’s a modern pattern that aligns well with VARCHAR but is awkward with CHAR.
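That JSON pattern looks roughly like this. It’s a sketch with hypothetical table and field names; JSON_VALUE requires SQL Server 2016 or later:

```sql
-- JSON stored in a MAX column; one key field surfaced as an indexable computed column
CREATE TABLE dbo.Events (
    Id  INT IDENTITY(1,1) PRIMARY KEY,
    Doc NVARCHAR(MAX) NOT NULL,
    EventType AS JSON_VALUE(Doc, '$.type')  -- extracted field, usable in an index
);

CREATE INDEX IX_Events_EventType ON dbo.Events(EventType);
```

The MAX column stays flexible while the computed column gives you seeks on the field you actually filter by.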
CHAR and VARCHAR in OLTP vs Analytics
In OLTP systems, row size and page density are critical. I prioritize VARCHAR for variable fields because it reduces storage and improves cache utilization. In analytics or ETL staging, fixed-width CHAR sometimes helps with bulk data imports or fixed-format files, but even there I usually normalize into VARCHAR once the data lands.
If you’re using columnstore indexes, SQL Server compresses data in segments. VARCHAR often compresses better because repeated patterns take fewer bytes, while CHAR has trailing spaces that add noise to compression. I’ve seen 10–20% better columnstore compression when switching fixed-width “mostly short” columns to VARCHAR in large warehouses.
Common Mistakes I See (And How You Can Avoid Them)
Here are the errors that keep showing up in code reviews and production incidents:
1) Using CHAR(50) for names or emails
You waste space and make indexing heavier. Use VARCHAR and size it based on known constraints: for example, VARCHAR(320) for email per practical standards.
2) Assuming LEN reflects storage
LEN ignores trailing spaces, which can hide CHAR padding. Use DATALENGTH when you’re auditing storage or diagnosing unexpected bloat.
3) Treating empty string as NULL interchangeably
If you use CHAR, you might see empty strings stored as padded spaces. Decide on a single policy (I prefer NULL for “missing”) and enforce it in constraints or the app layer.
4) Indexing wide CHAR columns as keys
This leads to large index pages, deeper trees, and slower seeks. If the column is variable in length, use VARCHAR and consider a surrogate key.
5) Mixing data types in joins
Joining a CHAR column to a VARCHAR column can result in implicit conversions and poor plans. I standardize types across related columns and add explicit casts in queries if necessary.
When I Recommend CHAR Even in 2026
I’m not anti-CHAR. I just use it deliberately. When you have truly fixed-length values and you care about uniform storage or predictable output formatting, CHAR is a good choice. I still use it for:
- Country and state codes
- Fixed-length codes defined by a standard (e.g., currency codes when fixed 3 characters)
- Padded legacy exports where fixed-width is required
If you don’t have a strong reason, I default to VARCHAR. This mirrors how I think about schemas: keep them adaptable, reduce waste, and favor the data model that matches reality rather than a hypothetical standard.
A Practical Migration Strategy (If You’re Fixing Old Schemas)
If you’re stuck with overly wide CHAR columns and want to move to VARCHAR, here’s a safe path I’ve used:
1) Measure actual lengths
Query max length using DATALENGTH and MAX to see the real usage:
SELECT
MAX(DATALENGTH(LegacyCode)) AS MaxBytes,
AVG(DATALENGTH(LegacyCode) * 1.0) AS AvgBytes
FROM dbo.LegacyTable;
2) Pick a realistic VARCHAR length
Use VARCHAR(n) instead of VARCHAR(MAX) unless you truly need it. Overly wide VARCHAR doesn’t cost space per row, but it can affect memory grants and plan choices.
3) Add a new column and backfill
Create a new VARCHAR column, populate it with RTRIM to remove trailing spaces, and compare results:
ALTER TABLE dbo.LegacyTable
ADD NewCode VARCHAR(20) NULL;
UPDATE dbo.LegacyTable
SET NewCode = RTRIM(LegacyCode);
4) Switch dependencies
Update indexes, constraints, and app code to use the new column. Then drop the old column when you’re confident.
This approach avoids long locks and keeps a rollback path. In 2026, I often pair this with online index rebuilds and deployment pipelines that handle phased schema changes.
Traditional vs Modern: How I Think About It Now
If you’re building systems today, the tools and patterns are more forgiving than they were a decade ago. That doesn’t mean you can ignore fundamentals. Here’s a quick comparison of how I see the shift:
Traditional approach: CHAR for a fixed “feel” with VARCHAR used cautiously, oversized columns “just in case,” keys built from natural string values, inputs cleaned in the app layer only, and fixed-width formats baked into the schema. Modern approach: VARCHAR by default with reasoned limits, constraints enforced in the database, and fixed-width formatting kept at the boundaries.
In short: I’m comfortable using VARCHAR more liberally, but I’m stricter about consistency and constraints. In 2026, the real win is not just choosing the right type once—it’s making sure your system keeps that choice honest over time.
Deeper Example: Inserting, Comparing, and Indexing at Scale
Let’s build a slightly bigger test so you can see how CHAR vs VARCHAR behaves in a realistic table. This isn’t a benchmark; it’s a demonstration of how padding and index size show up in metadata.
DROP TABLE IF EXISTS dbo.CodeTest;
GO
CREATE TABLE dbo.CodeTest (
Id INT IDENTITY(1,1) PRIMARY KEY,
FixedCode CHAR(12) NOT NULL,
VarCode VARCHAR(12) NOT NULL,
Payload VARCHAR(100) NOT NULL DEFAULT('x')
);
GO
-- Insert 100k rows with varying code lengths
;WITH n AS (
SELECT TOP (100000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS rn
FROM sys.all_objects a CROSS JOIN sys.all_objects b
)
INSERT INTO dbo.CodeTest (FixedCode, VarCode, Payload)
SELECT
RIGHT('000000000000' + CAST(rn AS VARCHAR(12)), 6) AS FixedCode,
CAST(rn AS VARCHAR(12)) AS VarCode,
REPLICATE('x', (rn % 100) + 1)
FROM n;
GO
-- Observe row lengths
SELECT
AVG(DATALENGTH(FixedCode)) AS AvgFixedBytes,
AVG(DATALENGTH(VarCode)) AS AvgVarBytes
FROM dbo.CodeTest;
GO
-- Create indexes on both columns
CREATE INDEX IX_CodeTest_FixedCode ON dbo.CodeTest(FixedCode);
CREATE INDEX IX_CodeTest_VarCode ON dbo.CodeTest(VarCode);
GO
-- Compare index sizes
EXEC sp_spaceused 'dbo.CodeTest';
You’ll see that FixedCode averages 12 bytes, while VarCode averages fewer because some values are shorter. The indexes are the key point: the fixed-length index is larger, even though the logical data is the same. That doesn’t prove all queries are slower, but it’s a solid indicator that you’re carrying extra I/O with no user-facing benefit.
Edge Cases That Bite in Production
Here are the less obvious issues that show up after months of use:
1) String Concatenation and Implicit Padding
CHAR values are padded before concatenation, which can create unexpected strings:
SELECT
'X' + CAST('ABC' AS CHAR(5)) + 'Y' AS Result;
You’ll get 'XABC  Y', because 'ABC' is padded to five characters before concatenation. If that result is used to build keys or hashes, you can get mismatches between environments where one uses CHAR and another uses VARCHAR.
2) Sorting and Ordering Subtleties
When ordering by CHAR, the trailing spaces don’t affect order in standard collations. But if you switch to a binary collation for case-sensitive matching, the sort order changes from linguistic rules to byte order. That can break assumptions in pagination queries or tests that expect a fixed order.
3) Unicode Mixing
If you store Unicode data, you’re going to use NCHAR or NVARCHAR, not CHAR or VARCHAR. But mixed schemas still happen: a CHAR column in one table, NVARCHAR in another. When you join these, SQL Server may convert the CHAR to Unicode, which can affect performance and index usage. The rule I follow is simple: if there’s any chance of multilingual data, go Unicode across the schema so conversions don’t happen mid-query.
4) Trimming in Triggers and Constraints
I’ve seen databases add RTRIM in triggers or computed columns to normalize CHAR input. That works, but it’s easy to forget when you add new columns or copy the pattern. A simpler, more robust approach is to store the data in VARCHAR, trim it in the app layer, and then add a check constraint to enforce length and format.
Practical Scenarios: Choosing the Right Type
Let’s make this tangible with a few common scenarios.
Scenario A: API Tokens
Tokens are usually fixed length and case-sensitive. But they are not human-readable and should be treated as binary-safe. I prefer VARBINARY or BINARY depending on the generation method. If you must store a token as a string (e.g., base64), use VARCHAR with a binary collation so matches are case-sensitive, and remember that equality still ignores trailing spaces, so trim on write. CHAR is fine only if the token is always fixed length and you never trim it, but I still lean toward VARCHAR because it reduces confusion around padding.
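Column-level collation keeps this local to the token column instead of a database-wide change. A sketch with illustrative names; the binary collation makes comparisons case-sensitive, though equality still ignores trailing spaces:

```sql
-- Only this column compares byte-by-byte (case-sensitive);
-- the rest of the database keeps its default collation
CREATE TABLE dbo.ApiTokens (
    Id        INT IDENTITY(1,1) PRIMARY KEY,
    TokenText VARCHAR(88) COLLATE Latin1_General_BIN2 NOT NULL
);
```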
Scenario B: Country Codes
Country codes are a perfect use case for CHAR(2) when they’re ISO-defined and always two characters. This is one of the few places where CHAR is not just acceptable but arguably better: it signals fixed semantics, makes the schema self-documenting, and avoids downstream checks for length.
Scenario C: Customer Names
Names are variable. Use VARCHAR, pick a reasonable max length, and add validation. If you have a system with high sensitivity to memory grants or you want to prevent uncontrolled growth, don’t use VARCHAR(MAX); use VARCHAR(200) or similar and log or reject outliers.
Scenario D: SKU or Product Code
If the SKU format is fixed and enforced (always 8 characters, always uppercase), CHAR(8) is fine. If it’s not fully fixed or can evolve, use VARCHAR and add a constraint to enforce a pattern. The constraint gives you flexibility while keeping the data clean.
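For the enforced-format case, the constraint version keeps VARCHAR’s flexibility while rejecting malformed values. A sketch assuming a hypothetical dbo.Products table; note that the character-class check behaves case-insensitively under case-insensitive collations:

```sql
-- Exactly 8 uppercase alphanumeric characters, enforced without hard-coding CHAR(8)
ALTER TABLE dbo.Products
ADD CONSTRAINT CK_Products_Sku_Format
CHECK (LEN(Sku) = 8 AND Sku NOT LIKE '%[^A-Z0-9]%');
```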
Scenario E: Legacy Fixed-Width Files
For ingest, you might store the raw values in CHAR temporarily for staging. That’s fine. But normalize into VARCHAR for the core schema. Keep the fixed-width formatting at the boundary, not in the heart of the database.
Performance Considerations with Query Plans
When query performance gets weird, type choice is often the hidden reason. Here’s what I watch:
- Implicit conversion: If you join CHAR to VARCHAR, SQL Server can convert one side, potentially turning an index seek into a scan. I look at the execution plan for CONVERT_IMPLICIT and standardize the column types.
- Key width: Wide CHAR keys push index row sizes up. In hot tables, that can mean fewer entries per page, deeper B-trees, and more logical reads. I rarely accept wide CHAR keys for primary keys.
- Memory grants: Very wide string types can inflate estimated row size, leading to larger memory grants for sorts and hashes. While VARCHAR stores only what it needs, the optimizer’s estimates can still be influenced by declared length. That’s another reason I keep VARCHAR(n) tight and avoid VARCHAR(MAX) unless needed.
If you want to see how this impacts a real query, run a simple filter with actual execution plans and compare logical reads:
SET STATISTICS IO ON;
SELECT *
FROM dbo.CodeTest
WHERE FixedCode = '000123';
SELECT *
FROM dbo.CodeTest
WHERE VarCode = '123';
SET STATISTICS IO OFF;
The logic is simple, but the point is to get in the habit of comparing I/O and plan shapes when you change data types. Even small differences can compound in production.
Data Quality and Constraints: My Playbook
Data type choice is only half the story. The other half is the constraints that enforce meaning. Here’s what I tend to add for string columns:
- Length constraints: CHECK (LEN(RTRIM(Col)) BETWEEN 2 AND 10) for codes; CHECK (Col <> '') if empty is disallowed.
- Pattern constraints: CHECK (Col NOT LIKE '%[^A-Z0-9]%') for uppercase alphanumeric codes.
- Domain tables: a lookup table for codes instead of trusting free-form strings.
These checks matter more for VARCHAR because you don’t get implicit padding. But they also prevent the silent data quality drift that often happens with CHAR columns.
Alternative Approaches: When Neither CHAR nor VARCHAR Is Best
Sometimes the correct answer is “use a different type.” Here are a few examples:
- Hashes and checksums: Use BINARY(16) or VARBINARY(32) for MD5/SHA. It’s smaller and avoids collation issues.
- GUIDs: Use UNIQUEIDENTIFIER rather than storing as text. It’s more compact and indexable, though still not tiny.
- Booleans: Use BIT instead of CHAR(1) with 'Y'/'N'.
- Structured IDs: If your “code” is really numeric, use INT or BIGINT and format it in your application.
I bring this up because CHAR vs VARCHAR is a string question, and sometimes the right answer is “don’t store it as a string at all.”
Monitoring and Auditing: Keeping the Schema Honest
Once your schema is in place, you still need to watch for drift. I use small, periodic checks to verify actual lengths and distribution:
SELECT
COUNT(*) AS TotalRows,
MAX(DATALENGTH(VarCode)) AS MaxBytes,
AVG(DATALENGTH(VarCode) * 1.0) AS AvgBytes,
SUM(CASE WHEN DATALENGTH(VarCode) = 0 THEN 1 ELSE 0 END) AS EmptyCount
FROM dbo.CodeTest;
This is how you catch the slow creep toward “just make it VARCHAR(2000) so it never fails.” Use actual data to make changes, not guesses.
AI-Assisted Workflows (Useful, But Don’t Outsource Judgment)
AI tools can help detect overly wide string columns and suggest adjustments based on observed data. I’ve used schema linting tools that inspect DATALENGTH distributions and flag waste. That’s helpful for speed, but I still do a manual pass. The AI can’t always see downstream contracts, exports, or legacy integrations that require fixed-width behavior. Use tools to surface candidates, then confirm with human context.
A practical workflow I recommend:
1) Scan for CHAR(n) columns with low average length.
2) Verify if those columns are truly fixed by standard or by habit.
3) If not fixed, propose a VARCHAR migration in a staging environment.
4) Measure storage, index size, and query performance before/after.
This keeps the process data-driven rather than opinion-driven.
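Step 1 of that workflow can start from the catalog views. This lists every CHAR/NCHAR column as a candidate; it reads only metadata, so you still measure actual lengths per column afterward:

```sql
-- Candidate columns for a CHAR -> VARCHAR review (metadata only)
SELECT
    OBJECT_SCHEMA_NAME(c.object_id) AS SchemaName,
    OBJECT_NAME(c.object_id)        AS TableName,
    c.name                          AS ColumnName,
    t.name                          AS TypeName,
    c.max_length                    AS DeclaredBytes
FROM sys.columns AS c
JOIN sys.types   AS t ON t.user_type_id = c.user_type_id
WHERE t.name IN ('char', 'nchar')
ORDER BY SchemaName, TableName, ColumnName;
```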
Expanded Pitfalls and How to Fix Them
Let’s get specific about common fixes that prevent outages:
- Implicit conversion in joins: Standardize the column type across related tables. If you can’t change a table, cast in the query explicitly and test the plan.
- Over-allocated CHAR: If a column is CHAR(50) but the maximum actual length is 8, move to VARCHAR(10) and add a check constraint. Keep a fallback by adding a new column first.
- Duplicate “unique” values: If uniqueness fails due to trailing spaces being ignored, normalize input with RTRIM and add a unique index on the trimmed value (or compare through a VARBINARY conversion if spaces are meaningful).
- Empty strings vs NULL: Pick a policy. I prefer NULL for missing data, and I enforce CHECK (Col IS NOT NULL AND LEN(RTRIM(Col)) > 0) when the field is required.
These are not theoretical. I’ve fixed production defects caused by each of these in the last two years.
Migration Checklist (Operationalized)
Here’s the more concrete checklist I use so migrations don’t break production:
1) Inventory: List all CHAR columns, their sizes, and usage frequency.
2) Measure: Capture MAX, AVG, and 95th percentile lengths.
3) Decide: Determine which columns are truly fixed by definition.
4) Plan: For candidates, pick a VARCHAR(n) and determine if any external systems rely on fixed-width formatting.
5) Stage: Add new columns and backfill. Compare counts and validate equality with RTRIM.
6) Switch: Update indexes and app code. Use a dual-write period if needed.
7) Cleanup: Drop old columns and rebuild indexes.
8) Monitor: Add alerting for length violations or drift.
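The validation in step 5 can be a single query: count the rows where the new column disagrees with the trimmed legacy value, using the table and column names from the migration example earlier. Zero mismatches means the switch in step 6 is safe:

```sql
-- Any nonzero count means the backfill is incomplete or inconsistent
SELECT COUNT(*) AS MismatchedRows
FROM dbo.LegacyTable
WHERE NewCode IS NULL
   OR NewCode <> RTRIM(LegacyCode);
```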
If you follow this, the change is boring—and boring is what you want in production.
Why This Still Matters
With SSDs and large memory, it’s tempting to treat storage as cheap and index size as a minor concern. But the real cost is not just storage; it’s I/O, cache churn, and operational complexity. The type choice flows through everything: replication, backups, log shipping, index maintenance, and performance troubleshooting.
Choosing VARCHAR for variable data is one of the simplest wins. Choosing CHAR for truly fixed data gives you clarity and consistency. The mistake is mixing the two out of habit. I’m not aiming for perfection; I’m aiming for a schema that behaves predictably under real load.
Summary Rules I Actually Use
If you want a short list you can keep on a sticky note, this is it:
- If the data is truly fixed length, use CHAR and enforce it.
- If the data varies, use VARCHAR and set a sane max length.
- If the data is binary in nature, use a binary type.
- Don’t use LEN to measure storage; use DATALENGTH.
- Avoid joins across mixed types; align the schema.
- Treat empty strings and NULLs intentionally, not accidentally.
That’s the difference between a schema that ages well and one that needs constant patching.
Final Take
CHAR is not wrong. It’s just easy to use it in the wrong places. VARCHAR is not a magic fix. It’s just more aligned with how most real-world data behaves. If you select the type based on reality rather than tradition, you’ll see smaller tables, faster indexes, and fewer weird bugs. That’s why I keep coming back to this simple choice—it’s a foundational decision with outsized impact.
If you’re designing new schemas, default to VARCHAR and use CHAR intentionally. If you’re cleaning up legacy schemas, measure first, migrate carefully, and validate with real data. This is one of those changes that pays back again and again.


