Merg­ing tables in Pow­er Query with sum­ma­rized data pro­vides a pow­er­ful way to con­sol­i­date infor­ma­tion with­out over­whelm­ing detail. By aggre­gat­ing invoice details such as counts, sums, aver­ages, and medi­ans, you can cre­ate con­cise reports that high­light key insights while main­tain­ing the orig­i­nal table structure.

This tech­nique is par­tic­u­lar­ly valu­able when deal­ing with sales data where each invoice con­tains mul­ti­ple line items that need to be condensed.

In my video tuto­r­i­al, I demon­strate how to effec­tive­ly merge tables with sum­ma­riza­tion in Pow­er Query:

Understanding the Problem: Merging Tables with Summarization

When work­ing with rela­tion­al data like invoic­es and their detailed items, we often need to com­bine infor­ma­tion from mul­ti­ple tables. How­ev­er, sim­ply merg­ing tables can result in redun­dant data and unnec­es­sary com­plex­i­ty. The chal­lenge is to merge data while simul­ta­ne­ous­ly sum­ma­riz­ing the details to pro­vide mean­ing­ful insights.

In this demon­stra­tion, I work with two tables: one con­tain­ing basic invoice infor­ma­tion and anoth­er with detailed line items for each invoice. The goal is to enhance the invoice table with sum­ma­rized infor­ma­tion from the details table, such as the count of items, aver­age quan­ti­ties, sum of prices, and even sta­tis­ti­cal mea­sures like medians.

Loading Tables into Power Query

The first step in our process is to load both tables into Pow­er Query for manip­u­la­tion. This gives us access to the pow­er­ful data trans­for­ma­tion capa­bil­i­ties that allow for sophis­ti­cat­ed merg­ing and summarization.

To begin, I nav­i­gate to the Data tab in Excel and select “From Table/Range” to import our first table into Pow­er Query. For this ini­tial details table, I choose to load it only as a con­nec­tion rather than import­ing it direct­ly into the work­sheet. This is done by select­ing “Close and Load To” and then choos­ing the “Con­nec­tion only” option.

After load­ing the details table, I fol­low the same process to load the invoice table. This time, how­ev­er, we’ll per­form our merge oper­a­tion direct­ly with­in this query rather than cre­at­ing a new one.

Merging the Tables

With both tables loaded into Pow­er Query, we can now merge them using the com­mon invoice ID field. This is where the pow­er of our approach becomes evident.

To per­form the merge:

  1. Select the “Merge Queries” com­mand from the ribbon
  2. Choose the sec­ond table (details) to merge with
  3. Select match­ing columns for merg­ing (invoice ID in both tables)
  4. Choose “Left Out­er Join” as the join kind to ensure all invoic­es are included

After press­ing OK, Pow­er Query cre­ates a new col­umn in our invoice table that con­tains the match­ing rows from the details table. You’ll notice that for each invoice, we now have a nest­ed table show­ing all the cor­re­spond­ing detail rows. For exam­ple, some invoic­es might have two detail rows, oth­ers three, and some even five or more.

The Key Difference: Aggregating Instead of Expanding

The crit­i­cal step that dif­fer­en­ti­ates this approach from a stan­dard merge is what we do next. Instead of expand­ing these nest­ed tables (which would cre­ate dupli­cate invoice rows), we want to aggre­gate the infor­ma­tion they contain.

When we click on the expand but­ton (the small tri­an­gle) in the col­umn head­er, we get options to expand or aggre­gate. By choos­ing to aggre­gate, we can per­form cal­cu­la­tions across all the detail rows for each invoice.

Creating Summarized Metrics

The pow­er of this tech­nique lies in the vari­ety of aggre­ga­tions we can per­form on the detailed data. For our invoice exam­ple, I demon­strate sev­er­al use­ful metrics:

  • Count of items: Sim­ple count of how many line items exist for each invoice
  • Aver­age quan­ti­ty: Instead of sum­ming quan­ti­ties, we cal­cu­late the aver­age num­ber of items per line
  • Sum of unit price: Total val­ue of the invoice
  • Medi­an of unit price: Sta­tis­ti­cal mea­sure show­ing the mid­dle val­ue of all unit prices on the invoice

To cre­ate these aggre­ga­tions, I select the appro­pri­ate options from the drop­down menu when expand­ing the nest­ed table col­umn. By uncheck­ing “Use orig­i­nal col­umn name as pre­fix,” we get clean­er col­umn names in our results.

Refining the Results

After cre­at­ing our aggre­gat­ed columns, we may want to refine the col­umn names for clar­i­ty. This can be done direct­ly in the for­mu­la bar in Pow­er Query. For exam­ple, I rename one of the columns to “Count” to bet­ter reflect its purpose.

It’s also impor­tant to check the data types of our new columns. Pow­er Query auto­mat­i­cal­ly assigns data types, but they may need adjust­ment. In our case, the “Count” col­umn is cor­rect­ly set as a dec­i­mal num­ber, which works well since counts will always be whole numbers.

When work­ing with mon­e­tary val­ues like the sum of unit prices, Pow­er Query may pre­serve the cur­ren­cy data type. How­ev­er, it’s worth not­ing that this data type infor­ma­tion does­n’t auto­mat­i­cal­ly trans­late to for­mat­ting in Excel. We’ll need to apply cur­ren­cy for­mat­ting in Excel after load­ing the data.

Loading the Final Results

Once we’re sat­is­fied with our merged and sum­ma­rized table, we can load it back to Excel. From the Home tab in Pow­er Query, I select “Close and Load To” and choose to cre­ate a new table in the exist­ing worksheet.

The final result is an enhanced invoice table that includes aggre­gat­ed infor­ma­tion from the details table with­out dupli­cat­ing invoice records. This gives us a con­cise view of our data with valu­able insights such as the num­ber of items per invoice, aver­age quan­ti­ties, total val­ues, and sta­tis­ti­cal measures.

This tech­nique is par­tic­u­lar­ly valu­able for finan­cial analy­sis, sales report­ing, and any sit­u­a­tion where you need to com­bine mas­ter records with sum­ma­rized infor­ma­tion from relat­ed detail records. By lever­ag­ing Pow­er Query’s merg­ing and aggre­ga­tion capa­bil­i­ties, we can cre­ate more infor­ma­tive reports while main­tain­ing a clean, non-redun­dant data structure.