I was only kidding. One of them IS Power Query. Seeing as Power Query is one of the best things to happen to Excel data analysis since the invention of the …. PivotTable, how could I possibly write about splitting data without mentioning Power Query? Note that Power Query is now confusingly called Get and Transform in Excel 2016 but I’m going to stick with the name Power Query - for now. I know that it’s easy to stick your head in the sand and try to pretend that the world of Excel isn’t changing but I tried that, and it didn’t work.
When using Excel for the purpose of data analysis, some of the new analytical tools now available in Modern Excel (Excel 2010+) mean that the old way of doing things are now completely outdated. Of course, the old methods still work, but it shouldn’t mean that we continue to do things in a manner that takes longer, is less efficient, or results in an inferior solution.
Let’s take a simple but common example of a list of names we’d like to manipulate to show “J. Brown” instead of “Johanna Brown”. If you’d like to follow along on the tutorial, then you can download the file here. There are several ways of achieving this.
Firstly, (and this is the old way of course) we can manipulate the data using “Text to Columns” on the Data Tab as shown in Figure 1. This is also called “parsing”.
Once the surname appears in column C, we simply add a formula such as =LEFT(A2)&". "&B2 into cell C2 and copy it down to bring the name “J. Brown” instead of the full name. The LEFT function pulls the first character of cell A2, and the ampersand (&) strings the cells and text together (or you can also use the CONCATENATE function). Now, this method has been around since the beginning of time, and is still a great way to split data manually. If you’re using Excel 2007 then you won’t have Flash Fill or Power Query, so this is going to be the only way to do this. Although there are now several much easier ways to perform this analysis, looking at this method is not without merit, particularly as the ampersand or CONCATENATE function is extremely useful in many models, particularly dashboard reports.
If you’re using Excel 2013 or later, then Flash Fill is much easier. Simply start typing the names as you want them in the column adjacent to the raw data. After you’ve entered the name in the correct format a couple of times, Flash Fill will kick in, as shown below in Figure 2. Press enter, and the data will populate down the column.
This method is great if you’re in a hurry but it’s a once-off solution, and not particularly reusable. If the source data changes, it will need to be recreated.
Using Power Query is going to create a solution that is (once you’ve got the hang of it) much quicker and easier to build than using parsing and concatenate, and it’s going to be reusable, unlike Flash Fill. Whilst it might take some time to get your head around, once you get a handle on it, you won’t look back.
If you’re using Excel 2010 or 2013, you’ll need to download the Power Query add in first before you begin. If you don’t see the tab, go to File – Options – Add Ins and in the down-down box at the bottom, select COM Add Ins and press Go. Select Power Query and press OK.
1. Click anywhere in the data, and select From Table in the Get and Transform group on the Data Tab, or the Power Query tab if you’re using Excel 2010 or 2007.
2. In order to use Power Query, it firstly needs to convert the data to a structured reference table. Press OK, and your data will automatically convert to a table.
3. This now takes you into the Power Query window as shown in Figure 3
4. On the Home Tab, select Split Column – By Delimiter, change the drop-down from comma to space on the dialog box, as shown in Figure 3, check all the other options are what you wish to show, and press OK.
5. The data should now show in two columns. If it doesn’t, or you don’t like what you see, you can simply undo the action by hitting the X in the “Applied Steps” box on the right hand, and start again. These steps are the actions which are recorded and will be applied each time the query is run, even if the data has changed. (You’ll see what I mean in a minute.)
6. Now, press Close & Load in the top left-hand corner. This will bring the data back into ordinary Excel, so creating a new tab where you can continue to use it.
7. In order to create the merged column with “J. Brown”, you’ve got two choices; you can continue to work in the table, and complete it in ordinary Excel, or you can go back into Power Query.
If you are happy to complete your analysis in Excel, stay in the table, and either use Flash Fill, or use the formula =LEFT(A2)&". "&B2. Note that because you’re in a table, the formula will automatically convert to =LEFT([@[Name.1]])&". "&[@[Name.2]]
1. To complete your analysis in Power Query, go back into the query by double-clicking on the Query Editor on the right-hand side of your screen.
2. Select Add Custom Column from the Add Column tab. Now, this is where Power Query formulas deviate from ordinary Excel formulas. But stick with me. It’s worth it!
3. Type =Text.Start and then select Name.1 from the available columns. Note the unlike ordinary Excel formulas, Power Query formulas are case sensitive.
4. Add the rest of the formula so that it looks like this: Text.Start([Name.1],1)&". "&[Name.2] and press OK. Your query should now show the new column as shown in Figure 5 below:
Although this option is more difficult to build, the really beauty of it is that if you go back to your original data, and make a change, once you have refreshed the query (by right-hand clicking and pressing refresh), the data automatically updates. This allows you to have your data entry on one page, and the analysis on another. Try this by going back to your original data, changing the names, hit refresh and watch the data change.