Processing time with Pandas DataFrame
Time-based data appears in most data analysis work. Pandas can generate timestamps with date_range(), parse them from strings with to_datetime(), extract their components through the .dt accessor, and filter rows by datetime conditions.
Setting Up the Environment
Before working with time data, install pandas using the following command:
pip install pandas
Creating a DataFrame with Time Series
Use pd.date_range() to generate a datetime sequence with a specified frequency:
import pandas as pd
# Create DataFrame with time column
data_struct = pd.DataFrame()
# '3h' means every 3 hours (the uppercase '3H' spelling is deprecated since pandas 2.2)
data_struct['time'] = pd.date_range('2019-07-14', periods=4, freq='3h')
print(data_struct['time'])
0 2019-07-14 00:00:00
1 2019-07-14 03:00:00
2 2019-07-14 06:00:00
3 2019-07-14 09:00:00
Name: time, dtype: datetime64[ns]
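date_range() also accepts an end point instead of a count. The following is a minimal sketch, assuming the same start date as above; the variable names span and by_periods are illustrative only.

```python
import pandas as pd

# Give start and end instead of periods; both endpoints are included by default
span = pd.date_range(start='2019-07-14 00:00', end='2019-07-14 09:00', freq='3h')
print(len(span))

# The same sequence expressed with periods, as in the example above
by_periods = pd.date_range('2019-07-14', periods=4, freq='3h')
print(span.equals(by_periods))
```

Specifying an end is convenient when the time window is known in advance, while periods is convenient when the number of rows is known.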
Extracting Date Components
Use the .dt accessor to extract specific components like year, month, or day:
import pandas as pd
data_struct = pd.DataFrame()
# '3h' means every 3 hours (the uppercase '3H' spelling is deprecated since pandas 2.2)
data_struct['time'] = pd.date_range('2019-07-14', periods=4, freq='3h')
# Extract year component
data_struct['year'] = data_struct['time'].dt.year
print(data_struct.head())
time year
0 2019-07-14 00:00:00 2019
1 2019-07-14 03:00:00 2019
2 2019-07-14 06:00:00 2019
3 2019-07-14 09:00:00 2019
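The .dt accessor exposes other components in the same way. The sketch below builds a small frame from the same four timestamps; the components frame and its column names are chosen here for illustration.

```python
import pandas as pd

# Same four timestamps as above, three hours apart
times = pd.Series(pd.date_range('2019-07-14', periods=4, freq='3h'), name='time')

components = pd.DataFrame({
    'time': times,
    'month': times.dt.month,         # month number, 7 for July
    'day': times.dt.day,             # day of the month
    'hour': times.dt.hour,           # hour of the day
    'weekday': times.dt.day_name(),  # weekday name as a string
})
print(components)
```

Each attribute returns a Series aligned with the original rows, so the results can be assigned directly as new columns.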
Converting String to DateTime
Convert time strings to datetime objects using pd.to_datetime() with a format specification:
import pandas as pd
import numpy as np
# Create time strings
dt_timestring = np.array(['14-07-2019 07:26 AM', '13-07-2019 11:01 PM'])
# Convert to datetime objects
timestamps = [pd.to_datetime(date, format="%d-%m-%Y %I:%M %p", errors="coerce")
              for date in dt_timestring]
print(timestamps)
[Timestamp('2019-07-14 07:26:00'), Timestamp('2019-07-13 23:01:00')]
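pd.to_datetime() can also convert a whole Series in one call, and errors="coerce" determines what happens to values that do not match the format. The following sketch assumes a deliberately malformed third entry, added here only to show that behaviour.

```python
import pandas as pd

# The third entry is a hypothetical bad input, included for illustration
raw = pd.Series(['14-07-2019 07:26 AM', '13-07-2019 11:01 PM', 'not a date'])

# Passing the Series converts every element in one call;
# errors="coerce" turns unparseable values into NaT instead of raising
parsed = pd.to_datetime(raw, format="%d-%m-%Y %I:%M %p", errors="coerce")
print(parsed)

# NaT behaves like a missing value, so isna() locates the rows that failed
print(raw[parsed.isna()])
```

Checking for NaT after conversion is a simple way to find input rows that need cleaning.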
Setting DateTime as Index
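Set a datetime column as the DataFrame index to enable time-based operations:
import pandas as pd
# Create DataFrame with date index
data_struct1 = pd.DataFrame()
data_struct1['date'] = pd.date_range('2019-07-18', periods=5, freq='2h')
# drop=False keeps 'date' as a column as well as the index;
# the default drop=True would leave a frame with no columns
data_struct1 = data_struct1.set_index('date', drop=False)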
print(data_struct1.head())
date
date
2019-07-18 00:00:00 2019-07-18 00:00:00
2019-07-18 02:00:00 2019-07-18 02:00:00
2019-07-18 04:00:00 2019-07-18 04:00:00
2019-07-18 06:00:00 2019-07-18 06:00:00
2019-07-18 08:00:00 2019-07-18 08:00:00
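Once the index holds datetimes, rows can be selected by time labels and grouped into time bins. This is a minimal sketch; the value column and its numbers are hypothetical data added so that there is something to select and aggregate.

```python
import pandas as pd

# 'value' is a hypothetical measurement column added for illustration
index = pd.date_range('2019-07-18', periods=5, freq='2h', name='date')
readings = pd.DataFrame({'value': [10, 20, 30, 40, 50]}, index=index)

# Label-based slicing on a DatetimeIndex includes both endpoints
morning = readings.loc['2019-07-18 02:00':'2019-07-18 06:00']
print(morning)

# resample() groups rows into fixed-width time bins, here 4 hours each
four_hourly = readings.resample('4h').sum()
print(four_hourly)
```

Both operations rely on the index being a DatetimeIndex, which is the main reason to move the datetime column into the index.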
Time-Based Filtering
Filter DataFrame rows based on datetime conditions:
import pandas as pd
# Create sample time data
data_struct2 = pd.DataFrame()
# '4h' means every 4 hours (the uppercase '4H' spelling is deprecated since pandas 2.2)
data_struct2['date'] = pd.date_range('2019-07-17', periods=3, freq='4h')
print("Original DataFrame:")
print(data_struct2)
# Filter data after specific time
filtered_data = data_struct2[data_struct2['date'] > '2019-07-17 04:00:00']
print("\nFiltered data:")
print(filtered_data)
Original DataFrame:
date
0 2019-07-17 00:00:00
1 2019-07-17 04:00:00
2 2019-07-17 08:00:00
Filtered data:
date
2 2019-07-17 08:00:00
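The same comparison approach extends to a time window with a lower and an upper bound. The sketch below assumes a slightly longer series of six timestamps; the names frame, start, and end are illustrative.

```python
import pandas as pd

frame = pd.DataFrame({'date': pd.date_range('2019-07-17', periods=6, freq='4h')})

start = pd.Timestamp('2019-07-17 04:00:00')
end = pd.Timestamp('2019-07-17 12:00:00')

# Combine two comparisons with &; each comparison needs its own parentheses
in_window = frame[(frame['date'] >= start) & (frame['date'] <= end)]
print(in_window)

# Series.between() expresses the same closed interval more compactly
same_rows = frame[frame['date'].between(start, end)]
print(same_rows.equals(in_window))
```

Using explicit Timestamp objects for the bounds keeps the comparison unambiguous, although pandas also accepts datetime strings here, as in the example above.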
Common Time Frequencies
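| Frequency Code | Description | Example |
|---|---|---|
| 'h' (spelled 'H' before pandas 2.2) | Hourly | freq='3h' gives one timestamp every 3 hours |
| 'D' | Daily | freq='D' gives one timestamp per calendar day |
| 'W' | Weekly | freq='W' gives one timestamp per week, on Sunday |
| 'ME' ('M' before pandas 2.2) | Month end | freq='ME' gives the last calendar day of each month |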
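The effect of each frequency code is easiest to see side by side. The sketch below generates three timestamps per frequency from the same start date; it uses the pd.offsets.MonthEnd() offset object for the monthly case, which is a choice made here so the example does not depend on which month-end alias the installed pandas version accepts.

```python
import pandas as pd

start = '2019-07-14'

hourly = pd.date_range(start, periods=3, freq='h')
daily = pd.date_range(start, periods=3, freq='D')
weekly = pd.date_range(start, periods=3, freq='W')
# An offset object sidesteps the alias spelling difference between pandas versions
month_end = pd.date_range(start, periods=3, freq=pd.offsets.MonthEnd())

for name, idx in [('hourly', hourly), ('daily', daily),
                  ('weekly', weekly), ('month end', month_end)]:
    print(name, list(idx.strftime('%Y-%m-%d %H:%M')))
```

Any code can be prefixed with an integer multiplier, as in the 3-hour and 4-hour examples earlier in this article.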
Conclusion
Pandas covers the main steps of time series processing: date_range() generates timestamps, to_datetime() parses strings, the .dt accessor extracts components, and a datetime column or index supports comparison-based filtering.
