Processing time with Pandas DataFrame

Time-based data appears throughout data analysis. Pandas provides tools for generating and processing timestamps: pd.date_range() for building datetime sequences, the .dt accessor for extracting components, pd.to_datetime() for parsing strings, and comparison-based filtering on datetime columns.

Setting Up the Environment

Before working with time data, install pandas using the following command:

pip install pandas

Creating a DataFrame with Time Series

Use pd.date_range() to generate a datetime sequence with a specified frequency:

import pandas as pd

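# Create a DataFrame with a time column: 4 timestamps, 3 hours apart.
# Lowercase 'h' is used because the uppercase alias 'H' is deprecated since pandas 2.2.
data_struct = pd.DataFrame()
data_struct['time'] = pd.date_range('2019-07-14', periods=4, freq='3h')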
print(data_struct['time'])
0   2019-07-14 00:00:00
1   2019-07-14 03:00:00
2   2019-07-14 06:00:00
3   2019-07-14 09:00:00
Name: time, dtype: datetime64[ns]
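
As a complement to the periods-based call above, the sketch below bounds the range with explicit start and end values instead. The variable names (hourly, df) are illustrative and not part of the original example.

```python
import pandas as pd

# Bound the range with start and end instead of a period count.
# Both endpoints are included by default.
hourly = pd.date_range(start='2019-07-14 00:00', end='2019-07-14 09:00', freq='3h')

df = pd.DataFrame({'time': hourly})
print(len(df))              # 4 rows: 00:00, 03:00, 06:00, 09:00
print(df['time'].iloc[-1])  # last timestamp equals the end value
```

Passing end is convenient when the time window is known but the number of rows is not.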

Extracting Date Components

Use the .dt accessor to extract specific components such as the year, month, or day:

import pandas as pd

data_struct = pd.DataFrame()
data_struct['time'] = pd.date_range('2019-07-14', periods=4, freq='3h')

# Extract year component
data_struct['year'] = data_struct['time'].dt.year
print(data_struct.head())
                 time  year
0 2019-07-14 00:00:00  2019
1 2019-07-14 03:00:00  2019
2 2019-07-14 06:00:00  2019
3 2019-07-14 09:00:00  2019
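
The same accessor exposes other components as well. The following is a minimal sketch that pulls several of them at once; the column names (month, day, hour, weekday) are chosen here for illustration only.

```python
import pandas as pd

df = pd.DataFrame({'time': pd.date_range('2019-07-14', periods=4, freq='3h')})

# Each .dt attribute returns a Series aligned with the original rows
df['month'] = df['time'].dt.month
df['day'] = df['time'].dt.day
df['hour'] = df['time'].dt.hour

# day_name() is a method, not an attribute, and returns English names by default
df['weekday'] = df['time'].dt.day_name()

print(df)
```

Note that .dt works on a datetime column (a Series); a DatetimeIndex exposes the same attributes directly, without the accessor.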

Converting String to DateTime

Convert time strings to datetime objects using pd.to_datetime() with a format specification:

import pandas as pd
import numpy as np

# Create time strings
dt_timestring = np.array(['14-07-2019 07:26 AM', '13-07-2019 11:01 PM'])

# Convert to datetime objects
timestamps = [pd.to_datetime(date, format="%d-%m-%Y %I:%M %p", errors="coerce") 
              for date in dt_timestring]
print(timestamps)
[Timestamp('2019-07-14 07:26:00'), Timestamp('2019-07-13 23:01:00')]
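
pd.to_datetime() also accepts a whole Series, so the loop is optional. The sketch below converts a column in one call and shows what errors="coerce" does with a value that does not match the format. The third string is a made-up invalid entry added purely for illustration.

```python
import pandas as pd

# The last entry is a hypothetical malformed value, not from the original data
raw = pd.Series(['14-07-2019 07:26 AM', '13-07-2019 11:01 PM', 'not a date'])

# One vectorized call; unparseable entries become NaT instead of raising
parsed = pd.to_datetime(raw, format='%d-%m-%Y %I:%M %p', errors='coerce')

print(parsed)
print('Unparseable rows:', parsed.isna().sum())
```

Counting the NaT values after conversion is a quick way to check how many rows failed to parse.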

Setting DateTime as Index

Set a datetime column as the DataFrame index to enable time-based operations:

import pandas as pd

# Create DataFrame with date index
data_struct1 = pd.DataFrame()
data_struct1['date'] = pd.date_range('2019-07-18', periods=5, freq='2h')
# drop=False keeps 'date' as a regular column in addition to the index;
# with the default drop=True the DataFrame would have no columns left to print
data_struct1 = data_struct1.set_index('date', drop=False)
print(data_struct1.head())
                                   date
date                                   
2019-07-18 00:00:00 2019-07-18 00:00:00
2019-07-18 02:00:00 2019-07-18 02:00:00
2019-07-18 04:00:00 2019-07-18 04:00:00
2019-07-18 06:00:00 2019-07-18 06:00:00
2019-07-18 08:00:00 2019-07-18 08:00:00
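
One time-based operation that a datetime index enables is label slicing with .loc. The sketch below assumes a hypothetical reading column with placeholder values, since the original DataFrame has no data column of its own.

```python
import pandas as pd

idx = pd.date_range('2019-07-18', periods=5, freq='2h', name='date')

# 'reading' and its values are placeholders for illustration
df = pd.DataFrame({'reading': [10, 11, 12, 13, 14]}, index=idx)

# Slicing a DatetimeIndex by label includes both endpoints
window = df.loc['2019-07-18 02:00':'2019-07-18 06:00']
print(window)
```

Unlike positional slicing, this selects the rows at 02:00, 04:00, and 06:00, because label-based slices are closed on both ends.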

Time-Based Filtering

Filter DataFrame rows by comparing a datetime column against a timestamp string:

import pandas as pd

# Create sample time data
data_struct2 = pd.DataFrame()
data_struct2['date'] = pd.date_range('2019-07-17', periods=3, freq='4h')
print("Original DataFrame:")
print(data_struct2)

# Filter data after specific time
filtered_data = data_struct2[data_struct2['date'] > '2019-07-17 04:00:00']
print("\nFiltered data:")
print(filtered_data)
Original DataFrame:
                 date
0 2019-07-17 00:00:00
1 2019-07-17 04:00:00
2 2019-07-17 08:00:00

Filtered data:
                 date
2 2019-07-17 08:00:00
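
The single comparison above extends naturally to a time window. The following sketch shows two ways to do that, assuming a slightly longer sample range than the original; the names start, end, in_window, and closed are illustrative.

```python
import pandas as pd

# Six timestamps: 00:00, 04:00, 08:00, 12:00, 16:00, 20:00
df = pd.DataFrame({'date': pd.date_range('2019-07-17', periods=6, freq='4h')})

start = pd.Timestamp('2019-07-17 04:00')
end = pd.Timestamp('2019-07-17 16:00')

# Half-open window: start included, end excluded.
# Parentheses are required because & binds tighter than the comparisons.
in_window = df[(df['date'] >= start) & (df['date'] < end)]
print(in_window)

# Series.between includes both endpoints by default
closed = df[df['date'].between(start, end)]
print(closed)
```

The half-open form is often preferable when consecutive windows must not overlap, while between() is shorter when both bounds should be kept.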

Common Time Frequencies

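Frequency Code    Description    Example
'h'               Hourly         Every hour
'D'               Daily          Every day
'W'               Weekly         Every week
'ME'              Monthly        End of month

Note: 'h' and 'ME' are the spellings preferred since pandas 2.2; the older aliases 'H' and 'M' are deprecated from that version on.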
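
To see how the frequency codes behave, the sketch below generates three timestamps for a few of them. The monthly code is left out on purpose, because its alias differs between pandas versions; the dictionary name samples is illustrative.

```python
import pandas as pd

# Three timestamps per code, all starting from the same date
samples = {
    code: pd.date_range('2019-07-14', periods=3, freq=code)
    for code in ['h', 'D', 'W']
}

for code, stamps in samples.items():
    print(code, list(stamps.strftime('%Y-%m-%d %H:%M')))
```

A number can be placed in front of any code to set a multiple, as in the '3h' and '4h' frequencies used earlier.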

Conclusion

Pandas covers the common time series tasks shown here: date_range() for generating sequences, the .dt accessor for extracting components, to_datetime() for parsing strings, and comparison-based filtering on datetime columns. Together these make temporal data straightforward to build, inspect, and subset.

Updated on: 2026-03-25T06:14:53+05:30
