You get a new data file and the numbers are close, but slightly different from the other file you already had. Suddenly, you’ve got two different sets of data. Now you have to spend two hours matching things up.
Okay, let’s NOT do this. Instead let’s use three steps and make this project a ten minute exercise. Then we can bank that extra time and use some of it towards that side project you’ve been wanting to do.
Step 1: Find the criteria
Ask the person who sent you the data to include the criteria they used to run the report. Ninety percent of your effort will be trying to find this information by looking in the data, but if you ask up front you can save yourself a lot of time. Many times the issue is simply that each set of data is using different date ranges (fiscal vs. calendar, or up through yesterday vs. including today, etc.)
Step 2: Find the primary data items
Create a concatenation of the “key” fields in each report. So if you’re looking at customer, region, and part, then join those three fields together into one field. Do this for both reports. Then sort the data by this field. You can use this list as a unique set of data and that alone may be all you need. Here’s a video on how to get the unique records in Microsoft Office Excel 2003 (Tip: Office Excel 2007 has a one button solution for this – click on the “Data” Ribbon and select “Remove Duplicates”.)
Step 3: Find the subtotals of the primary data
Create a subtotal for each set of data by the concatenated field. Use this subtotal on each set of data. This usually reduces the checking from thousands of rows to less than one hundred. If not, you may want to adjust your concatenation to have fewer fields. Once you’ve done this for both sets of data, the differences jump out at you and you can look at the records within that concatenated “key” to isolate your differences.
What other steps should be included? Share with the community in the comments.
Photo credit: ajturner