In IT work, we often have to compare two columns of an Excel sheet and then present a report based on the outcome. We usually rely on Excel's own functionality for this. However, Excel is difficult to use when the data set is huge or when the data is in another format (such as .csv). In this post, I will use Python's pandas library to compare the data in two columns of the same Excel file. The same approach can also be used to compare columns from two different Excel files.
To start, I have the data below in Book1.xlsx.
Here in Book1.xlsx, I have two columns, Note1 and Note2. Both columns hold text data. I will compare the data in Note1 with Note2; if the two values in a row match, I will export the whole row to a new Excel file.
Next, I import the pandas library, read the data from the Excel file, and print its column names.
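A minimal sketch of this step, assuming Book1.xlsx sits in the working directory and its first sheet holds the Note1 and Note2 columns. The function name load_book is hypothetical, and reading .xlsx files with pandas needs an Excel engine such as openpyxl to be installed.

```python
import pandas as pd


def load_book(path):
    """Read the first sheet of an Excel workbook into a DataFrame."""
    # pandas selects the engine (for example openpyxl for .xlsx) from the file extension.
    df = pd.read_excel(path)
    # Print the column names to confirm that Note1 and Note2 are present.
    print(df.columns.tolist())
    return df
```

Calling load_book("Book1.xlsx") returns the sheet as a DataFrame that the later steps can work on.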
Now, I compare the two columns row by row, appending each matching value to a list and keeping those lists in a dictionary.
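The comparison could look roughly like the sketch below. The function name matching_rows and the sample values are made up for illustration and are not the contents of Book1.xlsx; the dictionary holds one list per column so that the whole row is kept.

```python
import pandas as pd


def matching_rows(df, left="Note1", right="Note2"):
    """Collect the rows whose values in the two columns are equal."""
    # One list per column, so every column of a matching row is kept.
    matched = {col: [] for col in df.columns}
    for _, row in df.iterrows():
        # Blank cells that pandas reads as NaN never compare equal,
        # so rows with missing values are skipped.
        if row[left] == row[right]:
            for col in df.columns:
                matched[col].append(row[col])
    return matched


# Hypothetical sample data standing in for the sheet.
sample = pd.DataFrame({
    "Note1": ["abc", "def", "ghi"],
    "Note2": ["abc", "xyz", "ghi"],
})
print(matching_rows(sample))
# {'Note1': ['abc', 'ghi'], 'Note2': ['abc', 'ghi']}
```

For large sheets, a boolean filter such as df[df["Note1"] == df["Note2"]] selects the same rows without an explicit loop; the loop is shown here because it follows the list-and-dictionary approach described in this post.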
After comparing the two columns and collecting the matched records in a dictionary, I create a dataframe from this dictionary and then export the dataframe to an Excel file.
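A minimal sketch of the export step, assuming the matched records are held in a dictionary that maps each column name to a list of values. The function name export_matches is hypothetical, and writing .xlsx files needs an Excel engine such as openpyxl.

```python
import pandas as pd


def export_matches(matched, path="output.xlsx"):
    """Build a DataFrame from the matched records and write it to Excel."""
    # Each dictionary key becomes a column; the lists must be of equal length.
    result = pd.DataFrame(matched)
    # index=False keeps the pandas row index out of the spreadsheet.
    result.to_excel(path, index=False)
    return result
```

Passing a full path instead of just "output.xlsx" writes the file to that location rather than the working directory.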
Finally, an Excel file named output.xlsx is created at the location specified. Below is the result in output.xlsx.
Please do share your thoughts about this post.
Thank you. Happy learning.