Cleaning is a task that I can only enjoy with music. There are few things that can compete with simultaneously sweeping and singing. Sadly, due to the decrease in attention to detail, I cannot benefit from this luxury when data cleaning.
One thing I can enjoy when data cleaning is thinking of ways to make it more efficient. I do hold an opinion that data cleaning is a process we need to keep sacred and simple in order to understand the attributes our datasets have, but when express lanes can be installed, you bet I am going indulge my curiosities. Which is what happened when working on a project that needed to combine 12 different datasets together.
The purpose of the project is to analyze patient and satisfaction data from 12 different sites. The problem is each site downloads their data in different formats. Some download data into numeric output and others text. The text responses differ across sites as well. Some have spaces in-between responses that are than one word in length, others do not. Which means I used to go in and “Find & Replace” each column for each value. The result is a combined dataset in SPSS.
Excel Macro. The function of the macro is to look at each column and “find & replace” the current values without consuming a whole workday. The site-specific response is replaced with a standardized SPSS value for easy merging. The first site was written from scratch to standardize the first campus. Once that was done, the steps were minimized to pulling specific columns from the original dataset and then changing the values the macro had to find. Finally, you can “run” your macro and the dataset is all standardized and ready to merge into SPSS.
This method took the repetition out of the process and minimized the amount of human error. Also, writing syntax in any program can come with a feeling of accomplishment once it works.
Have you found another solution? What approach would you take? Let us know what you think on Twitter.
One of the ways we communicate important news and info is through our email newsletters. You can sign up to stay connected!