August 19, 2019

Compare 2 excel inputs and highlight the difference in python

I have 2 input files with almost similar columns


I want to compare 2 files for the difference and highlight the difference and generate the difference file as output

The flexibility of comparing 2 files always lies with the user

as here we are using the the column mapping file to compare 2 inputs.

refer below code


import pandas as pd
import numpy as np

import  gc
dfInput1 = pd.read_excel(
"E:/001 E books/006 Python/003 Projects/003 Compare 2 excel files/"Input files.xlsx", sheet="Sheet1")

dfInput2 = pd.read_excel(
"E:/001 E books/006 Python/003 Projects/003 Compare 2 excel files/input2.xlsx", sheet="Sheet2")

dfOutputMapping = pd.read_excel(
"E:/001 E books/006 Python/003 Projects/003 Compare 2 excel files/OutputMapping.xlsx", sheet="Sheet1")


dffinalOP = pd.merge(dfInput1, dfInput2,
left_on="Row ID", right_on="Ip2_Row ID", how="left")

# Added the output columns  here
for  index, col1 in dfOutputMapping.iterrows():
   
dffinalOP[col1[
'Output']] = (dffinalOP[col1['File1_IP']] == dffinalOP[col1['File2_IP']])

dffinalOP.to_csv(
"E:/001 E books/006 Python/003 Projects/003 Compare 2 excel files/output.csv")



Reference file format and output

Input file 1

Input file 2
Mapping file : Defines which columns to compare from input file 1 and input  file 2

Output file with status in last column


This code can be further developed as per your requirement.

please share your suggestions. 

Cheeeeeeeeeeeers !!