28

Another Pandas question!

I am writing some unit tests that compare two data frames for equality. However, the assertion does not appear to look at the values in the data frames, only at their structure:

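import numpy as np
import pandas as pd

dates = pd.date_range('20130101', periods=6)

# Two frames with the same index and columns but different random values
df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))

print(df1)
print(df2)
# Called inside a unittest.TestCase method (named assertCountEqual in Python 3)
self.assertItemsEqual(df1, df2)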

-->True

Do I need to convert the data frames to another data structure before asserting equality?

4 Answers

44

Ah, of course there is a solution for this already:

from pandas.util.testing import assert_frame_equal

EDIT: pandas.util.testing was deprecated in 2020. From version 1.0 forward, use:

from pandas.testing import assert_frame_equal

See the pandas API documentation for assert_frame_equal for the full list of options.
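A minimal sketch of how this could be used, assuming pandas 1.0 or later. The seed and the helper name frames_match are made up for illustration. It also shows why the original assertion passed: iterating over a DataFrame yields its column labels, so an item-based assertion only ever sees the structure.

```python
import numpy as np
import pandas as pd
from pandas.testing import assert_frame_equal

np.random.seed(0)  # arbitrary seed, only to make the sketch repeatable
dates = pd.date_range("20130101", periods=6)

df1 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))
df2 = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list("ABCD"))

# Iterating a DataFrame yields column labels, not cell values.
column_labels = list(df1)


def frames_match(left, right):
    """Hypothetical helper: True if assert_frame_equal raises nothing."""
    try:
        assert_frame_equal(left, right)
    except AssertionError:
        return False
    return True


same_as_copy = frames_match(df1, df1.copy())  # identical values and labels
same_as_other = frames_match(df1, df2)        # same labels, different values
```

Inside a unittest.TestCase method, calling assert_frame_equal(df1, df2) directly should be enough, since it raises AssertionError on a mismatch.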


6

While assert_frame_equal is useful in unit tests, I find df1.equals(df2) more useful during analysis: it returns a boolean instead of raising, so you can go on to check which values are not equal.
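As a rough illustration of that workflow, the sketch below checks equality with equals and then locates the differing cells with an element-wise comparison. The frames and the single altered value are invented for the example, and the element-wise step assumes both frames share the same index and columns.

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1.0, 2.0, 3.0], "B": [4.0, 5.0, 6.0]})
df2 = df1.copy()
df2.loc[1, "B"] = 50.0  # introduce a single difference

# equals returns a plain boolean and never raises on a mismatch.
are_equal = df1.equals(df2)

# Element-wise mask of differing cells. Unlike equals, this treats
# NaN as different from NaN, so it is only a rough locator.
diff_mask = df1 != df2

# Collect the (row label, column label) pairs where the frames differ.
diff_cells = [
    (row, col)
    for row in diff_mask.index
    for col in diff_mask.columns
    if diff_mask.at[row, col]
]
```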


4

Also numpy's utilities work:

import numpy.testing as npt

npt.assert_array_equal(df1, df2)
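One caveat worth keeping in mind: NumPy compares the underlying values only, so differences in the index or column labels go unnoticed. The sketch below makes that visible by converting explicitly with to_numpy() (available from pandas 0.24); the frames and variable names are invented for the example.

```python
import numpy.testing as npt
import pandas as pd
from pandas.testing import assert_frame_equal

df1 = pd.DataFrame({"A": [1, 2], "B": [3, 4]})
df2 = pd.DataFrame({"X": [1, 2], "Y": [3, 4]})  # same values, different labels

# NumPy only sees the raw values, so this comparison succeeds.
try:
    npt.assert_array_equal(df1.to_numpy(), df2.to_numpy())
    values_match = True
except AssertionError:
    values_match = False

# pandas also compares the column labels, so this comparison fails.
try:
    assert_frame_equal(df1, df2)
    frames_match = True
except AssertionError:
    frames_match = False
```

If the labels matter for the test, assert_frame_equal is probably the safer choice.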


0
In [62]: import numpy as np

In [63]: import pandas as pd

In [64]: np.random.seed(30)

In [65]: df_old = pd.DataFrame(np.random.randn(4,5))

In [66]: df_old
Out[66]: 
          0         1         2         3         4
0 -1.264053  1.527905 -0.970711  0.470560 -0.100697
1  0.303793 -1.725962  1.585095  0.134297 -1.106855
2  1.578226  0.107498 -0.764048 -0.775189  1.383847
3  0.760385 -0.285646  0.538367 -2.083897  0.937782

In [67]: np.random.seed(30)

In [68]: df_new = pd.DataFrame(np.random.randn(4,5))

In [69]: df_new
Out[69]: 
          0         1         2         3         4
0 -1.264053  1.527905 -0.970711  0.470560 -0.100697
1  0.303793 -1.725962  1.585095  0.134297 -1.106855
2  1.578226  0.107498 -0.764048 -0.775189  1.383847
3  0.760385 -0.285646  0.538367 -2.083897  0.937782

In [70]: df_old.equals(df_new)  # equality check; returns a boolean (True or False)
Out[70]: True
