Difference between data frames and matrices in Python Pandas?
In this article, we will explore the differences between DataFrames and matrices in Python. Both are 2-dimensional data structures, but they serve different purposes and have distinct characteristics. The DataFrame comes from the pandas library; pandas has no dedicated matrix type, so a matrix is typically represented as a NumPy array or a nested list.
In general, a DataFrame can hold a different data type in each column (numeric, string, categorical, etc.), while a matrix stores only one type of data.
DataFrame in Python
In Python, a pandas DataFrame is a two-dimensional, mutable, tabular data structure whose columns can hold different data types. Both axes are labeled: rows by an index and columns by column names. DataFrames are useful in data preprocessing because they provide many data handling methods. They can also be used to create pivot tables and to plot data with Matplotlib.
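To make the description above concrete, here is a minimal sketch of per-column data types, labeled access, and in-place mutation. The sample values and the names `df` and `age_of_b` are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration
df = pd.DataFrame(
    {
        "Name": ["Virat", "Rohit", "Meera"],
        "Age": [25, 30, 28],
        "Salary": [52000.0, 61000.5, 58000.0],
    },
    index=["a", "b", "c"],  # custom row labels
)

# Each column keeps its own data type
print(df.dtypes)

# Label-based access: the value in row "b", column "Age"
age_of_b = df.loc["b", "Age"]
print(age_of_b)

# DataFrames are mutable: add a derived column in place
df["AgeNextYear"] = df["Age"] + 1
print(df)
```

The exact dtype names printed for each column can vary between pandas versions and platforms, so no output is shown here.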
Applications of DataFrame
DataFrames support a variety of tasks, such as fitting statistical models.
Data processing and cleaning by label. A plain matrix offers no such handling methods, so it is usually converted to a DataFrame first.
Transposing rows to columns and vice versa, which is useful in data science.
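The transposing and processing tasks listed above could look roughly like this. The `sales` table and its figures are hypothetical, and this is only a sketch of two common operations rather than a complete workflow.

```python
import pandas as pd

# Hypothetical quarterly sales figures, used only for illustration
sales = pd.DataFrame(
    {"Q1": [100, 80], "Q2": [120, 90]},
    index=["North", "South"],
)

# .T swaps rows and columns; the labels move together with the data
transposed = sales.T
print(transposed)

# Processing by label: sum across the quarter columns for each region
totals = sales.sum(axis=1)
print(totals)
```

Because the labels travel with the data, `transposed.loc["Q2", "South"]` still refers to the same value as `sales.loc["South", "Q2"]`.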
Creating a Sample DataFrame
Example
The following program creates a DataFrame using the pd.DataFrame() constructor:
# importing the pandas module with an alias name
import pandas as pd
# creating a dataframe from a dictionary of column name -> list of values
inputDataframe = pd.DataFrame({
    'Name': ['Virat', 'Rohit', 'Meera', 'Nick', 'Sana'],
    'Jobrole': ['Developer', 'Analyst', 'Help Desk', 'Database Developer', 'Finance accountant'],
    'Age': [25, 30, 28, 25, 40]
})
# displaying the dataframe
print(inputDataframe)
Output
    Name             Jobrole  Age
0  Virat           Developer   25
1  Rohit             Analyst   30
2  Meera           Help Desk   28
3   Nick  Database Developer   25
4   Sana  Finance accountant   40
Matrix in Python
A matrix is a homogeneous collection of values organized in a two-dimensional rectangular grid: an m×n array whose elements all share the same data type. The number of rows and columns is fixed. Python supports arithmetic operations on matrices such as addition, subtraction, multiplication, and division, most commonly through the NumPy library.
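As a small sketch of these properties, the example below assumes a NumPy array as the matrix representation, which is a common choice but not the only one. The array names are hypothetical.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[10, 20], [30, 40]])

print(a.shape)  # (number of rows, number of columns)
print(a.dtype)  # a single dtype applies to every element

# Element-wise arithmetic between matrices of the same shape
total = a + b
difference = b - a
product = a * b   # element-wise product, not the matrix product
quotient = b / a  # true division produces floating-point values

# Mixing integers and floats forces one common dtype for the whole array
mixed = np.array([[1, 2.5], [3, 4]])
print(mixed.dtype)
```

Note that `*` on NumPy arrays multiplies element by element; the matrix product uses the `@` operator or `np.matmul` instead.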
Applications of Matrix
It is very useful in economics for calculating statistics such as GDP (Gross Domestic Product) or per capita income.
It is also useful for studying electrical and electronic circuits.
Matrices are used in survey research, for example when plotting graphs.
They are also useful in probability and statistics.
Matrix Multiplication by Converting a Matrix to DataFrame
Example
The following program demonstrates matrix multiplication using DataFrames:
# importing pandas module
import pandas as pd
# input matrix 1
inputMatrix_1 = [[1, 2, 2],
                 [1, 2, 0],
                 [1, 0, 2]]
# input matrix 2
inputMatrix_2 = [[1, 0, 1],
                 [2, 1, 1],
                 [2, 1, 2]]
# creating a dataframe of first matrix
df_1 = pd.DataFrame(data=inputMatrix_1)
# creating a dataframe of second matrix
df_2 = pd.DataFrame(data=inputMatrix_2)
# printing the dataframe of input matrix 1
print("inputMatrix_1:")
print(df_1)
print("The dimensions(shape) of input matrix 1:")
print(df_1.shape)
print()
# printing the dataframe of input matrix 2
print("inputMatrix_2:")
print(df_2)
print("The dimensions(shape) of input matrix 2:")
print(df_2.shape)
print()
# multiplying both the matrices inputMatrix_1 and inputMatrix_2
result_mult = df_1.dot(df_2)
# Printing the resultant of matrix multiplication
print("Resultant Matrix after Matrix multiplication:")
print(result_mult)
print("The dimensions(shape) of Resultant Matrix:")
print(result_mult.shape)
Output
inputMatrix_1:
   0  1  2
0  1  2  2
1  1  2  0
2  1  0  2
The dimensions(shape) of input matrix 1:
(3, 3)

inputMatrix_2:
   0  1  2
0  1  0  1
1  2  1  1
2  2  1  2
The dimensions(shape) of input matrix 2:
(3, 3)

Resultant Matrix after Matrix multiplication:
   0  1  2
0  9  4  7
1  5  2  3
2  5  2  5
The dimensions(shape) of Resultant Matrix:
(3, 3)
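For comparison, the same product can be computed directly on NumPy arrays, without going through a DataFrame. This is a sketch under the assumption that NumPy is available; the names `m1`, `m2`, `np_result`, and `df_result` are hypothetical.

```python
import numpy as np
import pandas as pd

m1 = [[1, 2, 2], [1, 2, 0], [1, 0, 2]]
m2 = [[1, 0, 1], [2, 1, 1], [2, 1, 2]]

# Matrix product on plain NumPy arrays using the @ operator
np_result = np.array(m1) @ np.array(m2)
print(np_result)

# The same product through DataFrames, as in the program above
df_result = pd.DataFrame(m1).dot(pd.DataFrame(m2))

# to_numpy() drops the labels, so the two results can be compared directly
print((df_result.to_numpy() == np_result).all())
```

One design difference is worth keeping in mind: `DataFrame.dot` matches the column labels of the left operand against the index labels of the right operand, whereas `@` on arrays works purely by position.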
Comparison
| Aspect | Matrix | DataFrame |
|---|---|---|
| Data Types | Homogeneous (same data type) | Heterogeneous (multiple data types) |
| Structure | Fixed m×n array | Variable rows and columns |
| Labels | No column/row labels | Labeled rows and columns |
| Use Cases | Mathematical operations | Data analysis and manipulation |
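The rows of this table can be observed when converting between the two structures. The sketch below again assumes a NumPy array as the matrix; the labels `r1`, `r2`, `x`, `y` and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

matrix = np.array([[1, 2], [3, 4]])

# Matrix -> DataFrame: the values are kept and labels are attached
labeled = pd.DataFrame(matrix, index=["r1", "r2"], columns=["x", "y"])
print(labeled.loc["r2", "x"])

# DataFrame -> matrix: the labels are dropped, only the values remain
back = labeled.to_numpy()
print(back)

# A DataFrame with mixed column types has to fall back to one common dtype
mixed = pd.DataFrame({"name": ["a", "b"], "score": [1.5, 2.5]})
as_array = mixed.to_numpy()
print(as_array.dtype)
```

In the last step the string and float columns share no numeric type, so the resulting array uses the generic `object` dtype, which loses most of the speed advantage of a homogeneous matrix.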
Conclusion
DataFrames are ideal for data analysis with mixed data types and labeled axes, while matrices are better for mathematical computations with homogeneous numeric data. Choose based on your specific use case and data requirements.
