Getting Started with Microsoft Power Query
I always enjoy the PASS Summits, and most of the time there is always a handful of things that I get excited about. I don’t even wait ‘til I get back to work to try these out, I usually try it out right after the session or the night after the session.
Lately I have been more involved in data warehousing, ETL and visualization/dashboarding/analytics. I have not had the chance to look more closely at some of the new packages or tools from any vendor for that matter. At this week’s PASS summit, I attended many of the Power BI and Power Query sessions. After I saw the first session on Power Query, I was hooked. There were some data sets that I’ve been trying to scrape with other tools – SSIS, PowerShell etc – that seemed to have been scraped and cleaned much, much easier (or should I say magically) by Power Query. Power Query – you had me at “hello”.
What is Power Query?
Let me quote Microsoft on this definition.
Microsoft Power Query for Excel, is a new add-in that provides a seamless experience for data discovery, data transformation and enrichment for Information Workers, BI professionals and other Excel users
To put it differently, it is the desktop ETL tool you wish you had, which you now do. I am no Excel guru, and I usually don’t look to Excel if I need to do any kind of data dump, cleanup and transform. Before even thinking of Excel, I would probably think of ways to do the cleanup and transform in T-SQL, SSIS, PowerShell, C# (sometimes in this order, sometimes not). After seeing Power Query, my list and order of preference for these tools have just changed.
So what do you need to start your Power Query adventure?
Download and Install
First off, you will need to download the Power Query add-in. You can download it from http://www.microsoft.com/en-us/download/details.aspx?id=39379. There is an Excel 2010 and Excel 2013 version, and there is also a 32 and a 62 bit version. So just make sure you know which Excel version you have. Should you make a mistake, the installer will tell you anyway, so no harm done.
Once you install the add-in, you will see the new Power Query tab in Excel.
Let me walk you through a very, very simple demo. Let’s say you want to know the top universities around the world. You don’t know where to get that data set, but you know it must be there – somewhere out there.
Let’s start by opening Excel, and choosing the Power Query tab.
Once there, you can click on Online Search, and search for “top universities”. You will notice that this will fetch a few resources from the web, and once you highlight it shows you a preview of the data set it can scrape.
Double click on the first link that shows up in the search, and this downloads the data set to Excel.
Once the records are downloaded, you should see the records in your Excel sheet with Country, University and each individual year in the columns. You also should see one query on the right hand Workbook Queries pane, which indicates the number of rows that were loaded.
We need to do a bit more transformations in this data set.
Click on the one entry in the Workbook Queries pane. This will open up a new window that allows you to do more work and cleanup on the data set.
Scroll all the way to the right. There is a field there called key. Let’s remove that by right clicking on the field, and clicking on remove.
We will also need to unpivot the years, so that we get a row per ranking per year, instead of a row per university. To do this, highlight the years columns, and click on Unpivot.
Notice as well that in the Query Settings pane, there are now two entries under Applied Steps. Power Query records all the steps you make, which allows you to easily replay the steps of your transformation.
After the unpivot, you should see all the universities in the worksheet, with an entry per year. You should also see this step under the Applied Steps box. Click on the Close & Load once done.
There you go, that’s a simple, mini-demo of Power Query. Although it is a “simple” demo, this could have been a lot more complicated using other tools. With Power Query, it’s a matter of following a few steps. It’s simply amazing.
Get your references
Since I am the the PASS Summit, I picked up Chris Webb’s (@technitrain) Apress book Power Query for Power BI and Excel. Lady luck is on my side, and I finally met Chris in person (after a few chats on LinkedIn and Twitter). The first thing the socially awkward me did? I unashamedly ask him to sign my book (thank you Chris!)
I skimmed through the whole book already the night after I bought it to get a better idea of Power Query overall. I plan to do more close reading (and lots of experiments!) in the coming days and weeks. At this point I am just super excited to get started.
Here are some additional resources on getting started with Power Query:
Getting Started with Excel and Power BI Series
Getting Started with Power Query Part I
Getting Started with Power Query Part II
Microsoft Power Query for Excel Help
Chris Webb’s BI Blog – Power Query
Reza Rad’s Blog – Power Query
Rafael Salas’ Blog – Power Query
Matt Masson’s Blog – Power Query
In addition to checking for updates, send a smile or a frown. The team monitors the smiles, the frowns and the feedback, and uses these as basis for what’s going to come next.
Matt Masson just posted a Notepad++ language file for M (Power Query’s formula language). Will come in quite handy when working with M.