Bad data is sneaky. It looks fine sitting in a spreadsheet — until your totals are wrong, your VLOOKUP returns errors, or your pivot table groups the same thing under three different names. The root cause is almost always the same: messy, inconsistent, or improperly formatted data.
Cleaning up data isn't glamorous, but it's one of the highest-value things you can do in Excel. A clean dataset means reliable formulas, accurate reports, and decisions you can actually trust. Here's a practical step-by-step guide to cleaning up messy Excel data — no technical background required.
Extra spaces are the most common and most invisible data problem in Excel. They hide at the beginning or end of a cell and look like nothing — but they cause VLOOKUP to return #N/A, make identical values not match, and create duplicate groups in pivot tables.
" Smith"
"Smith "
" Smith"
"Smith"
"Smith"
"Smith"
The TRIM function removes all leading, trailing, and extra internal spaces from a cell in one step:
Apply this to a helper column next to your data, then copy and paste the results as values over the original column to replace the messy data with clean data.
When the same name or category is entered different ways — "new york", "New York", "NEW YORK" — Excel treats them as three completely different values. Your pivot tables split them into separate rows, your SUMIF misses entries, and your reports look unprofessional.
"new york"
"NEW YORK"
"New york"
"New York"
"New York"
"New York"
Three functions handle this depending on what you need:
This is one of the most common and damaging data problems — numbers that look like numbers but are actually stored as text. Your SUM formula ignores them. Your AVERAGE is wrong. And nothing tells you something is broken.
The quickest fix is to select the column, look for the small green warning triangle in the corner of affected cells, click it, and choose "Convert to Number."
If that doesn't work, use the VALUE function to force the conversion:
Apply to a helper column, paste as values, and replace the original data — same process as TRIM above.
Duplicate rows silently inflate your totals. If a sale is entered twice, your revenue figures are wrong — and you may never know unless you specifically check for duplicates.
Excel has a built-in tool for this. Select your data, go to Data → Remove Duplicates, choose which columns to check for duplicates, and click OK. Excel tells you how many duplicates it found and removed.
If you want to find duplicates without removing them yet, use conditional formatting to highlight them first. Go to Home → Conditional Formatting → Highlight Cell Rules → Duplicate Values. This lets you review them before deciding what to delete.
When the same category gets entered multiple ways — "Mktg", "Marketing", "marketing dept" — your summaries by category become unreliable. Find & Replace is the fastest way to standardize these across an entire column at once.
Press Ctrl+H to open Find & Replace. Type the inconsistent version in the Find box and the correct version in the Replace box, then click Replace All. Repeat for each variation.
"Mktg"
"marketing"
"Marketing Dept"
"Marketing"
"Marketing"
"Marketing"
Sometimes data arrives with information crammed into one cell that should be in separate columns — like "John Smith" instead of separate First Name and Last Name columns. Or the opposite: first and last name in separate columns when you need them combined.
To split one cell into two: Use Text to Columns (Data → Text to Columns) and split by space, comma, or whatever separator your data uses. Or use LEFT, RIGHT, and MID functions to extract specific portions.
To combine two cells into one: Use the ampersand operator or the CONCAT function:
Before you trust any dataset enough to build reports or make decisions from it, run through this quick checklist:
Working through this checklist on a new dataset takes 20-30 minutes the first time. But it's time well spent — clean data means every formula, pivot table, and report built on top of it will be accurate and reliable from the start.
If you've inherited a particularly messy dataset and don't have time to clean it yourself, that's one of the most common things we help with at HelpMyData. Send us the file and we'll return it clean, consistent, and ready to use.
Send it over and describe the problem. We'll clean it up, fix the formatting, and make sure everything is consistent and ready to use — starting at just $75.
📧 Get It Cleaned Up Today