Python Pandas - Merge DataFrame with indicator value

To merge Pandas DataFrame with indicator information, use the merge() function with the indicator parameter set to True. This adds a special _merge column showing the source of each row.

What is the Indicator Parameter?

The indicator parameter creates a column that tracks whether each row comes from the left DataFrame, right DataFrame, or both ?

import pandas as pd

# Create DataFrame1
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

print("DataFrame1:")
print(dataFrame1)
DataFrame1:
       Car  Units
0      BMW    100
1    Lexus    150
2     Audi    110
3  Mustang     80
4  Bentley    110
5   Jaguar     90

Creating the Second DataFrame

import pandas as pd

# Create DataFrame2
dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Tesla', 'Mustang', 'Mercedes', 'Jaguar'],
    "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})

print("DataFrame2:")
print(dataFrame2)
DataFrame2:
       Car  Reg_Price
0      BMW       7000
1    Lexus       1500
2    Tesla       5000
3  Mustang       8000
4 Mercedes       9000
5   Jaguar       6000

Merging with Indicator

Now merge the DataFrames using indicator=True to track row sources ?

import pandas as pd

# Create both DataFrames
dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi', 'Mustang', 'Bentley', 'Jaguar'],
    "Units": [100, 150, 110, 80, 110, 90]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Tesla', 'Mustang', 'Mercedes', 'Jaguar'],
    "Reg_Price": [7000, 1500, 5000, 8000, 9000, 6000]
})

# Merge DataFrames with indicator
mergedRes = pd.merge(dataFrame1, dataFrame2, how="left", indicator=True)
print("Merged dataframe with indicator:")
print(mergedRes)
Merged dataframe with indicator:
       Car  Units  Reg_Price      _merge
0      BMW    100     7000.0        both
1    Lexus    150     1500.0        both
2     Audi    110        NaN   left_only
3  Mustang     80     8000.0        both
4  Bentley    110        NaN   left_only
5   Jaguar     90     6000.0        both

Understanding the _merge Column

The _merge column contains three possible values ?

Value Meaning Description
both Present in both DataFrames Row exists in left and right
left_only Only in left DataFrame Row exists only in first DataFrame
right_only Only in right DataFrame Row exists only in second DataFrame

Custom Indicator Column Name

You can specify a custom name for the indicator column ?

import pandas as pd

dataFrame1 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Audi'],
    "Units": [100, 150, 110]
})

dataFrame2 = pd.DataFrame({
    "Car": ['BMW', 'Lexus', 'Tesla'],
    "Reg_Price": [7000, 1500, 5000]
})

# Custom indicator column name
mergedRes = pd.merge(dataFrame1, dataFrame2, how="outer", indicator="source")
print(mergedRes)
     Car  Units  Reg_Price      source
0    BMW  100.0     7000.0        both
1  Lexus  150.0     1500.0        both
2   Audi  110.0        NaN   left_only
3  Tesla    NaN     5000.0  right_only

Conclusion

The indicator parameter in merge() adds a tracking column showing row sources. Use indicator=True for default _merge column, or specify a custom name for better clarity in your analysis.

Updated on: 2026-03-26T02:14:10+05:30

5K+ Views

Kickstart Your Career

Get certified by completing the course

Get Started
Advertisements