Merging tables in Power Query with summarized data provides a powerful way to consolidate information without overwhelming detail. By aggregating invoice details such as counts, sums, averages, and medians, you can create concise reports that highlight key insights while maintaining the original table structure.
This technique is particularly valuable when dealing with sales data where each invoice contains multiple line items that need to be condensed.
In my video tutorial, I demonstrate how to effectively merge tables with summarization in Power Query:
Understanding the Problem: Merging Tables with Summarization
When working with relational data like invoices and their detailed items, we often need to combine information from multiple tables. However, simply merging tables can result in redundant data and unnecessary complexity. The challenge is to merge data while simultaneously summarizing the details to provide meaningful insights.
In this demonstration, I work with two tables: one containing basic invoice information and another with detailed line items for each invoice. The goal is to enhance the invoice table with summarized information from the details table, such as the count of items, average quantities, sum of prices, and even statistical measures like medians.
Loading Tables into Power Query
The first step in our process is to load both tables into Power Query for manipulation. This gives us access to the powerful data transformation capabilities that allow for sophisticated merging and summarization.
To begin, I navigate to the Data tab in Excel and select “From Table/Range” to import our first table into Power Query. For this initial details table, I choose to load it only as a connection rather than importing it directly into the worksheet. This is done by selecting “Close and Load To” and then choosing the “Connection only” option.
After loading the details table, I follow the same process to load the invoice table. This time, however, we’ll perform our merge operation directly within this query rather than creating a new one.
Merging the Tables
With both tables loaded into Power Query, we can now merge them using the common invoice ID field. This is where the power of our approach becomes evident.
To perform the merge:
- Select the “Merge Queries” command from the ribbon
- Choose the second table (details) to merge with
- Select matching columns for merging (invoice ID in both tables)
- Choose “Left Outer Join” as the join kind to ensure all invoices are included
After pressing OK, Power Query creates a new column in our invoice table that contains the matching rows from the details table. You’ll notice that for each invoice, we now have a nested table showing all the corresponding detail rows. For example, some invoices might have two detail rows, others three, and some even five or more.
The Key Difference: Aggregating Instead of Expanding
The critical step that differentiates this approach from a standard merge is what we do next. Instead of expanding these nested tables (which would create duplicate invoice rows), we want to aggregate the information they contain.
When we click on the expand button (the small triangle) in the column header, we get options to expand or aggregate. By choosing to aggregate, we can perform calculations across all the detail rows for each invoice.
Creating Summarized Metrics
The power of this technique lies in the variety of aggregations we can perform on the detailed data. For our invoice example, I demonstrate several useful metrics:
- Count of items: Simple count of how many line items exist for each invoice
- Average quantity: Instead of summing quantities, we calculate the average number of items per line
- Sum of unit price: Total value of the invoice
- Median of unit price: Statistical measure showing the middle value of all unit prices on the invoice
To create these aggregations, I select the appropriate options from the dropdown menu when expanding the nested table column. By unchecking “Use original column name as prefix,” we get cleaner column names in our results.
Refining the Results
After creating our aggregated columns, we may want to refine the column names for clarity. This can be done directly in the formula bar in Power Query. For example, I rename one of the columns to “Count” to better reflect its purpose.
It’s also important to check the data types of our new columns. Power Query automatically assigns data types, but they may need adjustment. In our case, the “Count” column is correctly set as a decimal number, which works well since counts will always be whole numbers.
When working with monetary values like the sum of unit prices, Power Query may preserve the currency data type. However, it’s worth noting that this data type information doesn’t automatically translate to formatting in Excel. We’ll need to apply currency formatting in Excel after loading the data.
Loading the Final Results
Once we’re satisfied with our merged and summarized table, we can load it back to Excel. From the Home tab in Power Query, I select “Close and Load To” and choose to create a new table in the existing worksheet.
The final result is an enhanced invoice table that includes aggregated information from the details table without duplicating invoice records. This gives us a concise view of our data with valuable insights such as the number of items per invoice, average quantities, total values, and statistical measures.
This technique is particularly valuable for financial analysis, sales reporting, and any situation where you need to combine master records with summarized information from related detail records. By leveraging Power Query’s merging and aggregation capabilities, we can create more informative reports while maintaining a clean, non-redundant data structure.