Compare Two pandas DataFrames in Python (Example) | Find Differences Row by Row | merge() Function

preview_player
Показать описание
Python code of this video:

import pandas as pd # Import pandas library in Python

data1 = pd.DataFrame({'x1':range(20, 26), # Create first pandas DataFrame
'x2':['a', 'a', 'a', 'b', 'b', 'b'],
'x3':range(21, 27)})
print(data1) # Print first pandas DataFrame

data2 = pd.DataFrame({'x1':range(22, 29), # Create second pandas DataFrame
'x2':['a', 'a', 'b', 'b', 'b', 'b', 'b'],
'x3':range(23, 30)})
print(data2) # Print second pandas DataFrame

indicator = True,
how = 'outer')
print(data_12) # Print merged DataFrames

print(data_12_diff) # Print differences between DataFrames

Follow me on Social Media:

Рекомендации по теме
Комментарии
Автор

Thank you very much, I was looking for something like this for some time.

ali
Автор

Thank you so much, will this work if 2 dataframes has Millions of Records? In my case, Data has been migrated from SQL to Snowflake. Ensure that all the records are migrated correctly, If any of the records do not match, generate a report of those records. Since My tables has millions of records how to I start with validating

divya
Автор

is it possible to take any value from the dataframe and compare how many greater or equal values there are in that same column?

poekewquiaro
Автор

Can you compare two Large excel files with the same columns ?

fay
Автор

Hi, i tried this code to compare data in two dataframes. For some of the rows i am getting false breaks. Even though all the data is exactly same, its still showing as different

kalyanisonar
Автор

Why use lambda ?when you can just use

vensandy_Data_Analyst
Автор

Hi, I would like to know how to publish this data comparison output to a HTmL report. Any help regarding this please?

srujjankotagiri