Python – Strip whitespace from a Pandas DataFrame
To strip whitespace from a Pandas DataFrame, use the str.strip() method on string columns. This removes both leading and trailing whitespace characters from text data.
Importing Pandas
First, let us import the required Pandas library with an alias:
import pandas as pd
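Before working with a full DataFrame, it may help to see what str.strip() does on a small Series next to its one-sided variants str.lstrip() and str.rstrip(). This is only a minimal sketch; the sample values are hypothetical and are not part of the DataFrame used in the rest of this article.

```python
import pandas as pd

# Hypothetical sample values, not taken from the article's DataFrame
s = pd.Series(['  left', 'right  ', '  both  '])

both = s.str.strip()    # removes leading and trailing whitespace
left = s.str.lstrip()   # removes leading whitespace only
right = s.str.rstrip()  # removes trailing whitespace only

print(both.tolist())   # ['left', 'right', 'both']
print(left.tolist())   # ['left', 'right  ', 'both  ']
print(right.tolist())  # ['  left', 'right', '  both']
```

If only one side of the text needs cleaning, the one-sided variants leave the other side untouched.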
Creating a DataFrame with Whitespace
Let's create a DataFrame with 3 columns where the first column has values with leading or trailing whitespace:
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Refrigerators', 'Chairs', 'Diaries'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

print("Original DataFrame:")
print(dataFrame)

Original DataFrame:
  Product Category   Product Name  Quantity
0         Computer       Keyboard        10
1     Mobile Phone        Charger        50
2      Electronics        SmartTV        10
3       Appliances  Refrigerators        20
4        Furniture         Chairs        25
5       Stationery        Diaries        50
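Leading and trailing spaces are easy to miss in printed DataFrame output. One possible way to make them visible, sketched here using only the "Product Category" values from the example above, is to print the column as a Python list (which shows the quotes) and to build a boolean mask of the values that stripping would change. The names col and needs_strip are illustrative only.

```python
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': [' Computer', ' Mobile Phone', 'Electronics ',
                         'Appliances', ' Furniture', 'Stationery'],
})

col = dataFrame['Product Category']

# tolist() prints each value in quotes, so the padding becomes visible
print(col.tolist())

# Boolean mask: True where stripping would change the value
needs_strip = col != col.str.strip()
print(needs_strip.tolist())  # [True, True, True, False, True, False]
```

A check like this can be run before cleaning to confirm that a column really contains padded values.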
Stripping Whitespace from a Single Column
To remove whitespace from the single column "Product Category", call str.strip() on that column and assign the result back:
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
    'Product Name': ['Keyboard', 'Charger', 'SmartTV', 'Refrigerators', 'Chairs', 'Diaries'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

# Strip whitespace from a single column
dataFrame['Product Category'] = dataFrame['Product Category'].str.strip()

print("DataFrame after removing whitespaces:")
print(dataFrame)

DataFrame after removing whitespaces:
  Product Category   Product Name  Quantity
0         Computer       Keyboard        10
1     Mobile Phone        Charger        50
2      Electronics        SmartTV        10
3       Appliances  Refrigerators        20
4        Furniture         Chairs        25
5       Stationery        Diaries        50
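Real data often has missing entries in text columns. The sketch below, which uses a hypothetical Series with one None value rather than the article's data, suggests that str.strip() leaves missing values as missing instead of raising an error, so no special handling should be needed before stripping.

```python
import pandas as pd

# Hypothetical column with a missing entry (None), not from the article's data
s = pd.Series([' Computer', None, 'Electronics '])

# Missing entries pass through str.strip() and remain missing
stripped = s.str.strip()

print(stripped.tolist())
print(stripped.isna().tolist())  # [False, True, False]
```

How the missing entry is displayed (None or NaN) can vary with the pandas version and the column's dtype, which is why the example checks it with isna().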
Stripping Whitespace from All String Columns
To strip whitespace from all string columns in the DataFrame, select them with select_dtypes() and apply str.strip() to each one:
import pandas as pd

dataFrame = pd.DataFrame({
    'Product Category': [' Computer', ' Mobile Phone', 'Electronics ', 'Appliances', ' Furniture', 'Stationery'],
    'Product Name': [' Keyboard ', 'Charger', 'SmartTV ', 'Refrigerators', ' Chairs', 'Diaries'],
    'Quantity': [10, 50, 10, 20, 25, 50]
})

# Strip whitespace from all object/string columns
string_columns = dataFrame.select_dtypes(include=['object']).columns
dataFrame[string_columns] = dataFrame[string_columns].apply(lambda x: x.str.strip())

print("DataFrame after stripping all string columns:")
print(dataFrame)

DataFrame after stripping all string columns:
  Product Category   Product Name  Quantity
0         Computer       Keyboard        10
1     Mobile Phone        Charger        50
2      Electronics        SmartTV        10
3       Appliances  Refrigerators        20
4        Furniture         Chairs        25
5       Stationery        Diaries        50
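An object column can hold a mix of strings and other values, and the .str accessor returns NaN for the non-string entries. As an alternative to the select_dtypes() approach shown above (not a replacement for it), the following sketch strips only the values that are strings and leaves everything else unchanged. The DataFrame df, the column 'Code', and the helper strip_if_str are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical mixed-type data: 'Code' holds both strings and an integer
df = pd.DataFrame({
    'Code': [' A1 ', 42, 'B2 '],
    'Quantity': [10, 50, 10],
})


def strip_if_str(value):
    """Strip whitespace from strings; return any other value unchanged."""
    return value.strip() if isinstance(value, str) else value


# Series.map applies the helper to every value, one column at a time
cleaned = df.apply(lambda col: col.map(strip_if_str))

print(cleaned['Code'].tolist())      # ['A1', 42, 'B2']
print(cleaned['Quantity'].tolist())  # [10, 50, 10]
```

This element-wise version is likely slower than str.strip() on large DataFrames, so it is probably best kept for columns that are known to contain mixed types.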
Conclusion
Use str.strip() to remove leading and trailing whitespace from pandas DataFrame columns. Apply it to single columns or use select_dtypes() to strip all string columns at once.
