The .gitattributes configuration file allows customizing path, file and repository-specific policies in Git for how files are handled during operations like checkout, filters, diffs and merges.
In this advanced 3000+ word guide, we will do an exhaustive tour of .gitattributes:
- Understanding The What and Why
- Specifying Text vs Binary Files
- Configuring Line Ending Conversions
- Enabling Partial Commits With pathspec
- Applying Custom Filters on Commit or Checkout
- Configuring File Export and Archive Behavior
- Setting Up Different Merge Strategies
- Cascading Settings with .git/info/attributes
- Juggling Multiple .gitattributes Files
- Getting Help From IDE Integrations
- Takeaways and Recommendations
Understanding The What and Why
The .gitattributes file allows configuring path, file and repository-specific handling policies in Git repositories – like file conversions, filters to apply on commit, custom diffing and merge strategies.
Without .gitattributes, Git handles all files according to internal generic logic and heuristics. But sometimes you need to override that behavior for certain file types and paths.
For example, you may need to:
- Normalize line endings to LF on commit, but checkout with CRLF on Windows
- Configure custom text formatting or encryption filters to apply when committing certain file types
- Use a different merging strategy for specific paths to avoid conflicts
- Exclude documentation files from code exports and archives
.gitattributes gives you a way to customize all this easily without having to resort to hooks and scripts.
Git ships with a fixed set of built-in attributes, including text, eol, diff, merge, filter, ident, export-ignore and export-subst, and you can define your own custom attributes and macros on top of them.
Specifying Text vs Binary Files
One basic configuration is specifying whether files should be treated as text or binary, as this determines whether Git applies line ending conversion and textual diffs and merges to them.
You do this with the text attribute and the binary macro (shorthand for -diff -merge -text). For example, to mark JPEG images as binary and disable any conversions:
# Mark JPEGs as binary - disable conversions
*.jpg binary
*.jpeg binary
# Mark text files as text
*.txt text
*.text text
Some differences in handling are:
Text Files
- Line endings are normalized on commit and converted on checkout
- Diffs are line-based and show content changes
- The default three-way text merge driver is used
Binary Files
- No line ending conversion
- Diffs only report that the binary files differ
- No content merge: on conflict Git keeps your version in the working tree and leaves the file for you to resolve
So text files get content-aware handling, while binary disables conversions and textual diffs for formats like images where they don't add value.
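As a quick check of the rules above, the following sketch creates a throwaway repository and asks Git how it resolves the attributes. The file names photo.jpg and notes.txt are hypothetical and do not need to exist, since git check-attr only evaluates the patterns.

```shell
#!/bin/sh
# Sketch: see how the "binary" macro and the "text" attribute resolve.
set -e
cd "$(mktemp -d)"
git init -q .

cat > .gitattributes <<'EOF'
*.jpg binary
*.txt text
EOF

# "binary" is a macro, so it shows up as three unset attributes.
jpg_attrs=$(git check-attr text diff merge -- photo.jpg)
txt_attrs=$(git check-attr text -- notes.txt)

printf '%s\n' "$jpg_attrs" "$txt_attrs"
# prints:
# photo.jpg: text: unset
# photo.jpg: diff: unset
# photo.jpg: merge: unset
# notes.txt: text: set
```

Querying the individual attributes rather than the macro makes it visible that binary is only shorthand for turning off text conversion, textual diffs and content merges.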
Configuring Line Ending Conversions
One of the most common issues .gitattributes helps solve is dealing with cross-platform line ending differences between Linux/Mac vs Windows in the same repository.
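By default, with no attributes and core.autocrlf unset, Git simply stores whatever line endings exist in the files. (Git for Windows typically enables core.autocrlf during installation, so the effective default can differ per machine.)
Mixed line endings then cause problems such as whole-file diffs, spurious merge conflicts and scripts that fail on the wrong platform.
So Git provides flexibility to handle this using .gitattributes.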
Challenges With Inconsistent Line Endings
Let's look at the typical challenges that crop up due to differing line endings:
- Diffs that mark every line as changed when an editor rewrites a file's line endings
- Merge conflicts caused purely by EOL differences rather than by real content changes
- Shell scripts that break when checked out with CRLF, because the trailing carriage return becomes part of the interpreter path or of a command
As these show, differing line endings introduce constant low-level friction for repositories spanning Mac, Linux and Windows systems. Developers lose time on spurious line ending issues unrelated to the code itself.
Standard End-of-line Normalization Approaches
To tackle this, .gitattributes provides a standard solution – commit normalized line endings, but then checkout files using platform EOL conventions.
This means:
On Commit to Git Repo
- Normalize to LF across platforms
On Checkout from Repo
- Apply platform default line endings
- CRLF checkout on Windows
- LF on Mac and Linux
So all developers share a common format in the repository, which keeps history clean, while their local checked-out files transparently use platform defaults.
For example:
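# Core config: normalize to LF on commit, native endings on checkout
* text=auto
# Windows scripts always need CRLF
*.bat text eol=crlf
*.ps1 text eol=crlf
Now text files are committed with LF endings and checked out with each platform's native line endings, while batch and PowerShell scripts are checked out with CRLF on every platform.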
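To see normalization happen, here is a minimal sketch in a throwaway repository. It assumes a single catch-all rule, * text=auto, and a hypothetical file notes.txt written with CRLF endings; the two core.* settings are only there to keep the demo independent of your personal Git configuration.

```shell
#!/bin/sh
# Sketch: a CRLF file is stored with LF once text=auto applies to it.
set -e
cd "$(mktemp -d)"
git init -q .
git config core.autocrlf false
git config core.safecrlf false

printf '* text=auto\n' > .gitattributes
printf 'one\r\ntwo\r\n' > notes.txt      # CRLF in the working tree

# Git may print a notice that CRLF will be replaced by LF; that is expected.
git add notes.txt 2>/dev/null

# The i/ column describes the index (repository) copy, w/ the working tree copy.
eol_info=$(git ls-files --eol notes.txt)
echo "$eol_info"
```

The i/ and w/ columns in the output show the committed copy using LF while the working tree file still contains CRLF.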
Mixed End-of-line Strategies For Specific File Types
For more complex projects, you can also choose different normalize/checkout strategies per file type.
For example:
Force consistent LF normalization for cross-platform code:
# Standardize Shell & Python
*.sh text eol=lf
*.py text eol=lf
Use the platform's native EOLs for edited prose files (eol accepts only lf or crlf, so leave it unset):
# Native EOLs for Markdown
*.md text
Let Git auto-detect whether build manifests are text:
# Auto-detect text content
*.xml text=auto
Mix and match approaches based on which formats face issues with differing line endings.
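A per-type setup like this can be verified without committing anything. The sketch below assumes three rules (forced LF for shell scripts, native endings for Markdown, auto-detection for XML) and hypothetical file names; it only asks Git which values it would apply.

```shell
#!/bin/sh
# Sketch: inspect which eol/text values Git resolves per file type.
set -e
cd "$(mktemp -d)"
git init -q .

cat > .gitattributes <<'EOF'
*.sh  text eol=lf
*.md  text
*.xml text=auto
EOF

sh_eol=$(git check-attr eol -- build.sh)
md_eol=$(git check-attr eol -- README.md)
xml_text=$(git check-attr text -- pom.xml)

printf '%s\n' "$sh_eol" "$md_eol" "$xml_text"
# prints:
# build.sh: eol: lf
# README.md: eol: unspecified
# pom.xml: text: auto
```

An unspecified eol means Git falls back to its configuration (core.eol and core.autocrlf), which by default gives the platform's native line ending on checkout.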
Debugging Conversion Issues
You can verify if line ending conversion is applied using:
git ls-files --eol
And check if attributes take effect via:
git check-attr eol <filename>
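If conversions don't work as expected, make sure the rule lives in a .gitattributes file at or above the affected paths (usually the repository root), and run git add --renormalize . to re-apply the rules to files that were committed before the attributes were added.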
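A common surprise is that files committed before the attributes existed keep their old line endings in the repository. The sketch below reproduces that situation in a throwaway repository with a hypothetical legacy.txt, then fixes it with git add --renormalize, which assumes Git 2.16 or newer.

```shell
#!/bin/sh
# Sketch: re-apply line ending rules to a file committed with CRLF.
set -e
cd "$(mktemp -d)"
git init -q .
git config core.autocrlf false
git config core.safecrlf false
git config user.name "Example"
git config user.email "example@example.com"

# Commit a CRLF file while no attributes are in place.
printf 'one\r\ntwo\r\n' > legacy.txt
git add legacy.txt
git commit -q -m "Add file with CRLF endings"
before=$(git ls-files --eol legacy.txt)

# Introduce the rule, then renormalize the tracked files.
printf '* text=auto\n' > .gitattributes
git add --renormalize .
after=$(git ls-files --eol legacy.txt)

echo "before: $before"
echo "after:  $after"
```

After renormalizing, the staged copy uses LF and shows up as a modification that still needs to be committed.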
Enabling Partial Commits With pathspec
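Git has no pathspec attribute in .gitattributes. Pathspecs are patterns you pass on the command line to limit a command to certain paths, which lets you commit parts of a working tree independently for easier change management.
For example, with a layout like:
api/module.api.js
docs/module.docs.html
You can commit just API changes or docs changes in isolation:
# Commit just API module changes, choosing hunks interactively
git commit -p -- api/
# Commit just documentation changes
git commit -p -- docs/
Attributes and pathspecs do meet in one place: some commands, such as git ls-files, accept the :(attr:<name>) pathspec magic to select files by the attributes assigned to them.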
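As a minimal sketch of committing one part of a tree in isolation, the script below passes a pathspec directly to git commit. The repository, file names and commit messages are hypothetical, and -p is left out so the script can run without prompts.

```shell
#!/bin/sh
# Sketch: commit only the changes under api/ and leave docs/ pending.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

mkdir api docs
echo 'v1' > api/module.api.js
echo 'v1' > docs/module.docs.html
git add .
git commit -q -m "Initial commit"

# Change both areas, then commit only what lives under api/.
echo 'v2' > api/module.api.js
echo 'v2' > docs/module.docs.html
git commit -q -m "Update API module" -- api/

committed=$(git diff-tree --no-commit-id --name-only -r HEAD)
pending=$(git diff --name-only)

echo "committed: $committed"   # committed: api/module.api.js
echo "pending:   $pending"     # pending:   docs/module.docs.html
```

The documentation change stays in the working tree and can go into its own commit afterwards.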
Applying Custom Filters on Commit or Checkout
You can configure custom clean and smudge script filters to process files on git add and git checkout via the filter attribute.
For example:
On git add to staging:
- Cleanup formatting
- Reorder content
- Strip sections
On git checkout:
- Reapply formatting
- Restore stripped content
This allows version controlling a transformed representation of files instead of absolute file snapshots.
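For illustration, here is a filter that strips copyright headers before committing C++ files. The filter attribute names a filter driver, and the driver's clean and smudge commands are defined in Git configuration:
# .gitattributes - route C++ files through the cpp_header filter
*.cpp filter=cpp_header
# Git config - clean runs on git add, smudge runs on checkout
[filter "cpp_header"]
    clean = cpp_strip_header
    smudge = cpp_readd_header
Where cpp_strip_header and cpp_readd_header are scripts that read the file content on standard input and write the processed content to standard output.
You could integrate linters, formatters, compression scripts, etc. this way.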
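Here is a small runnable sketch of the clean/smudge mechanism. It does not implement the header-stripping scripts; instead it assumes a hypothetical filter named trimws whose clean step removes trailing whitespace with sed and whose smudge step passes content through unchanged.

```shell
#!/bin/sh
# Sketch: a clean filter that strips trailing whitespace on git add.
set -e
cd "$(mktemp -d)"
git init -q .

# The driver is defined in Git config; the attribute only names it.
git config filter.trimws.clean "sed -e 's/[[:space:]]*\$//'"
git config filter.trimws.smudge cat
printf '*.txt filter=trimws\n' > .gitattributes

printf 'hello   \n' > notes.txt          # three trailing spaces
git add notes.txt

staged=$(git cat-file -p :notes.txt)     # content as stored by Git
worktree=$(cat notes.txt)                # content on disk, untouched

printf '[%s]\n' "$staged" "$worktree"
# prints:
# [hello]
# [hello   ]
```

Because the driver lives in Git configuration rather than in the repository, every clone has to define it before the attribute has any effect.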
Configuring File Export and Archive Behavior
When exporting a Git tree as an archive or source tarball using git archive, every tracked file is included by default, and you can mark files or paths to leave out with export-ignore:
# Exclude temp files from export archives
scratch export-ignore
temporary export-ignore
# CHANGELOG needs no attribute: files are exported unless marked export-ignore
Now CHANGELOG is retained when generating archives while the scratch and temporary paths get stripped out.
This controls what files consumers receive in distributed tar/zip packages.
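The effect on an archive can be checked by listing its contents. This sketch assumes a throwaway repository with a hypothetical scratch directory and a CHANGELOG file; note that git archive reads attributes from the committed tree, so the .gitattributes file has to be committed first.

```shell
#!/bin/sh
# Sketch: paths marked export-ignore are left out of git archive output.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

mkdir scratch
echo 'temporary notes' > scratch/tmp.txt
echo 'release notes' > CHANGELOG
printf 'scratch export-ignore\n' > .gitattributes

git add .
git commit -q -m "Initial commit"

# List the entries of the tar stream instead of writing it to disk.
listing=$(git archive --format=tar HEAD | tar -tf -)
echo "$listing"
```

The listing contains CHANGELOG and .gitattributes but nothing under scratch. Adding .gitattributes export-ignore would keep the attributes file itself out of the archive as well.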
Setting Up Different Merge Strategies
To avoid tricky merge conflicts, .gitattributes allows overriding how certain files are merged with custom strategies.
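You define the strategy with the merge attribute, which accepts these forms:
merge (the default text driver), -merge (treat the file as binary), merge=union (keep lines from both sides), and merge=<driver> (a custom driver defined via merge.<driver>.driver in Git configuration).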
Some examples:
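# Markdown - Combine both versions
*.md merge=union
# Configs - Prefer remote changes (requires a custom driver named theirs)
docker-compose.yml merge=theirs
# Logs - Keep lines from both sides
logs/*.log merge=union
Now markdown docs and log files keep the lines added on both sides, and configuration changes defer to the incoming version once a theirs driver has been defined in Git configuration. Git ships only the text, binary and union drivers, so ours and theirs must be set up as custom drivers.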
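To see merge=union at work, this sketch builds two branches that both append a line to the same place in a hypothetical NOTES.md, a change that would normally conflict. Branch and file names are made up for the demo.

```shell
#!/bin/sh
# Sketch: merge=union keeps lines from both sides instead of conflicting.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

printf '*.md merge=union\n' > .gitattributes
printf 'base entry\n' > NOTES.md
git add .
git commit -q -m "Base"

git checkout -q -b feature
printf 'base entry\nfeature entry\n' > NOTES.md
git commit -q -am "Feature entry"

git checkout -q -                         # back to the initial branch
printf 'base entry\nmain entry\n' > NOTES.md
git commit -q -am "Main entry"

git merge -q --no-edit feature
merged=$(cat NOTES.md)
echo "$merged"
```

The merged file contains both new entries and no conflict markers. Since union never reports a conflict, it is best reserved for files where keeping every line is always acceptable.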
Choosing the Right Merge Strategy
With the default merge driver, when both branches change the same lines of a file, Git cannot auto-resolve the changes. Developers have to edit the file manually to pick a version, which means extra work on every conflict.
Overriding merge strategies reduces this problem by:
- Favoring one branch's changes – a custom ours or theirs driver
- Keeping the lines from both sides – union
- Applying format-aware logic – a custom driver script
So based on context, picking an alternate strategy avoids recurring merge issues.
For example, favoring one branch makes sense for generated output, while union works for aggregating log data.
Strategies to Reduce Merge Conflicts
Some standard strategies to consider:
1. Preferring One Side
Use merge=ours or merge=theirs, backed by a custom driver, to favor one branch version, for example for lock files and generated output.
Helps avoid regenerating files or downgrading dependencies on each merge.
2. Combining Histories
Concatenate data histories like logs via merge=union.
Ensures logs accumulate data from both sides. Note that union keeps lines from both versions without conflict markers and does not guarantee their order.
3. Interleaving Ordered Entries
Git has no built-in driver that keeps sorted data in order. Merging multi-line sorted data such as score tables needs a custom merge driver that re-sorts the combined entries.
These and other strategies reduce noise from inconsequential merges.
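Preferring one side needs a custom driver, because Git has no built-in ours or theirs merge driver. A minimal sketch, assuming a driver named ours whose command is simply true (it exits successfully without touching the current version) and a hypothetical generated.txt:

```shell
#!/bin/sh
# Sketch: a custom "ours" merge driver keeps the current branch's version.
set -e
cd "$(mktemp -d)"
git init -q .
git config user.name "Example"
git config user.email "example@example.com"

# "true" succeeds and leaves our version of the file as the merge result.
git config merge.ours.driver true
printf 'generated.txt merge=ours\n' > .gitattributes

echo 'base build' > generated.txt
git add .
git commit -q -m "Base"

git checkout -q -b feature
echo 'feature build' > generated.txt
git commit -q -am "Feature build"

git checkout -q -                         # back to the initial branch
echo 'main build' > generated.txt
git commit -q -am "Main build"

git merge -q --no-edit feature
kept=$(cat generated.txt)
echo "$kept"                              # main build
```

The driver only runs when both branches changed the file. If just one side changed it, Git takes that side's version as usual.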
Cascading Settings With .git/info/attributes
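You can override attributes specified in .gitattributes for your own clone using .git/info/attributes. This file is not versioned or shared, and it takes precedence over every .gitattributes file.
For example, having:
# Root level policy
* text=auto eol=lf
You could override locally with:
# Allow native EOLs for prose
*.md text !eol
Values here trump .gitattributes, so they are useful for local overrides that should not be committed. For defaults shared across all repositories on a machine, Git also reads the file named by core.attributesFile, at the lowest precedence.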
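The precedence can be confirmed with git check-attr. This sketch assumes a committed-style policy of * text=auto eol=lf and a local override that resets eol for Markdown files with the ! prefix, which returns an attribute to the unspecified state; the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: .git/info/attributes overrides .gitattributes in this clone only.
set -e
cd "$(mktemp -d)"
git init -q .

printf '* text=auto eol=lf\n' > .gitattributes

mkdir -p .git/info
printf '*.md text !eol\n' > .git/info/attributes

md_eol=$(git check-attr eol -- README.md)
txt_eol=$(git check-attr eol -- notes.txt)

printf '%s\n' "$md_eol" "$txt_eol"
# prints:
# README.md: eol: unspecified
# notes.txt: eol: lf
```

Only Markdown files lose the forced LF setting, and other clones of the repository are unaffected because the override is never committed.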
Juggling Multiple .gitattributes Files
Git only reads files named .gitattributes and has no include mechanism, so rules cannot be split into themed files such as eol.attributes. Large repositories can instead place a .gitattributes file in each directory that needs its own rules:
.gitattributes # repository-wide defaults
project/
|- .gitattributes # overrides for the project subtree
Recall that a .gitattributes file only affects paths nested under its own directory, and its patterns are relative to that directory.
Rules from higher-level files apply to nested paths unless a .gitattributes file closer to the path overrides them.
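How nested files interact can be sketched with two .gitattributes files: one at the root and one in a hypothetical project directory that overrides only the eol value.

```shell
#!/bin/sh
# Sketch: a nested .gitattributes overrides the root file for its subtree.
set -e
cd "$(mktemp -d)"
git init -q .

printf '*.txt text eol=lf\n' > .gitattributes
mkdir project
printf '*.txt eol=crlf\n' > project/.gitattributes

root_eol=$(git check-attr eol -- notes.txt)
nested_eol=$(git check-attr eol -- project/notes.txt)
nested_text=$(git check-attr text -- project/notes.txt)

printf '%s\n' "$root_eol" "$nested_eol" "$nested_text"
# prints:
# notes.txt: eol: lf
# project/notes.txt: eol: crlf
# project/notes.txt: text: set
```

The nested file changes eol for its subtree, while the text setting from the root still applies because the nested file does not mention it.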
Getting Help From IDE Integrations
Many IDEs like VSCode, Atom, Sublime provide support for authoring .gitattributes:
- Syntax help
- Documentation lookups
- Autocomplete suggestions
- Error flagging
These make it easier to write valid .gitattributes configurations with editor assistance.
Takeaways and Recommendations
The .gitattributes file enables configuring repository, file and path-specific handling in Git repositories by:
- Streamlining end-of-line normalization
- Marking files as binary or text
- Applying input/output filters on add/checkout
- Defining custom merge strategies
- Controlling file exports and archive behaviors
Key recommendations are:
- Use liberal comments explaining why attributes are applied
- Group related rules under commented sections by theme, like EOL, filtering, exports, etc.
- Debug with git check-attr to validate settings
- Override via .git/info/attributes for local, uncommitted policies
- Leverage IDE assistance for authoring attributes
Following common practice, we recommend sticking to LF normalization by default in almost all cases. Override with eol=crlf only for Windows script formats like .bat and .cmd, which require CRLF endings. Other text files should keep LF everywhere.
With those practices, .gitattributes can greatly enhance managing complex repository histories spanning platforms and technologies.


