Learning how to rank data within specific groups is a valuable Excel skill that can help you organize and analyze information more effectively. By using the COUNTIFS function, you can easily determine the ranking position of elements within their respective categories without manual calculation, saving time and enhancing your data analysis capabilities.
This technique is particularly useful when working with sales data, performance metrics, or any dataset where you need to identify top performers within different categories.
In my video, I demonstrate how to implement ranking within groups using a practical example.
Understanding Rank in Groups
When analyzing data, we often need to rank items not just overall but within specific categories or groups. For example, you might want to know which salesperson performed best in each region, or which product sold most in each store. This is where ranking within groups becomes essential.
In my demonstration, I work with a dataset that’s sorted by town (our grouping variable) and color-coded for easier visualization. The data includes different towns like Moria, Ironforge, and Azmarin, with each town representing a distinct group for our ranking purposes.
What makes this technique powerful is that you can immediately see how items rank within their specific categories rather than just seeing an overall ranking that might be dominated by one particular group. This provides much more actionable insights for decision-making.
Using COUNTIFS for Group Ranking
The core of this technique relies on Excel’s COUNTIFS function, which counts rows where multiple criteria are met simultaneously. This is perfect for group ranking because we need to check two things at once: whether a row belongs to the same group as the current row, and whether its value is greater than or equal to the current row’s value. The formula takes the general form =COUNTIFS(group_range, current_group, value_range, ">=" & current_value), where:
group_range is the column containing our grouping variable (town names in column A)
current_group is the town in the current row we’re evaluating (A2 for the first data row)
value_range is the column containing the values we want to rank (revenue)
current_value is the specific value in the current row that we’re trying to rank
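To make the counting logic concrete, here is a minimal Python sketch of what the COUNTIFS call computes for each row. It is an illustration of the logic only, not the Excel implementation: the town names come from the example, while the revenue figures and the function name rank_in_group are invented for this sketch.

```python
# Sketch of the COUNTIFS logic: for each row, count the rows in the same
# group whose value is greater than or equal to the current row's value.
# The revenue numbers below are invented sample data.

def rank_in_group(groups, values):
    """Return one rank per row, computed within the row's own group."""
    ranks = []
    for current_group, current_value in zip(groups, values):
        count = sum(
            1
            for group, value in zip(groups, values)
            if group == current_group and value >= current_value
        )
        ranks.append(count)
    return ranks


towns = ["Moria", "Moria", "Moria", "Ironforge", "Ironforge", "Azmarin"]
revenue = [500, 900, 700, 300, 800, 650]

ranks = rank_in_group(towns, revenue)
for town, value, rank in zip(towns, revenue, ranks):
    print(town, value, rank)
```

Each row is compared only against rows from its own town, which is why the ranking restarts at 1 in every group.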
Step-by-Step Implementation
To implement this ranking system in your own spreadsheet, follow these steps:
1. First, ensure your data is organized with a clear grouping variable (like town, region, category, etc.)
2. Create a new column for your rank results
3. In this column, enter the COUNTIFS formula, which will:
Check the group column to find rows in the same group as the current row
Check the value column to find values greater than or equal to the current row’s value
Count how many rows meet both criteria — this count becomes the rank
The formula uses absolute references (with F4 to create $ signs) for the ranges and relative references for the current values. This ensures that when you copy the formula down, it still refers to the correct columns while adapting to each row’s specific values.
Analyzing the Results
After applying the COUNTIFS formula to our data, we can see that each item gets a rank within its group. In our example:
For the town of Moria, we have ranks from 1 to 5
For Ironforge, ranks range from 1 to 4
For Azmarin, ranks go from 1 to 3
What’s particularly noteworthy is that the ranks don’t have to appear in sequence in your spreadsheet. When we sort the entire dataset by revenue from largest to smallest, we can see that the highest-ranked items from each town appear mixed together. For example, after the top performers from Moria, we might see Ironforge’s top performer, then more from Moria, then perhaps Azmarin’s best.
This demonstrates how our ranking works independently within each group, regardless of where the items appear in the sorted list. The second-ranked item in Azmarin might appear as the tenth row in our sorted data, but it still correctly shows as rank 2 within its group.
Practical Applications
This ranking technique has numerous practical applications:
Sales analysis — Identify top-performing products within each category
Employee performance — Rank staff within departments or regions
Tournament results — Rank competitors within age groups or divisions
Academic performance — Rank students within classes or subjects
Market analysis — Compare performance of stocks within industry sectors
By implementing this group ranking system, you can quickly identify patterns that might otherwise be obscured when looking at data as a whole. It allows you to make fair comparisons within relevant peer groups rather than across dissimilar categories.
Additional Tips for Working with Group Rankings
When implementing group rankings in your spreadsheets, consider these helpful tips:
Color-coding your data by groups (as shown in my video) makes it much easier to visually identify the different categories and understand the rankings at a glance.
You can easily modify the ranking logic by changing the operator in the formula. For example, if you want to rank from smallest to largest instead, you would use “<=” instead of “>=” in your COUNTIFS formula.
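For ties (when multiple items have the same value), this formula assigns the same rank to all tied items, but that shared rank is the lowest position in the tie: two items tied for the top spot within a group both receive rank 2, and no item receives rank 1. If you want tied items to share the best position instead, count the values strictly greater than the current one (“>” instead of “>=”) and add 1 to the result. Other tie-handling schemes might need more complex formulas or additional columns.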
Remember that this technique works with any type of numerical data you want to rank — sales figures, scores, times, quantities, or any other measurable metric — as long as you have a clear grouping variable.
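The effect of the comparison operator and of ties is easier to see with a small experiment. The following Python sketch mirrors the counting logic with two optional switches. The parameter names ascending and competition, like the sample figures, are assumptions made for this illustration.

```python
import operator

# Sketch: how the comparison operator changes the ranking, including ties.
# Sample data is invented; two Moria rows are deliberately tied at 900.

def rank_in_group(groups, values, ascending=False, competition=False):
    """Rank each value within its group by counting matching rows.

    ascending=False mirrors the ">=" criterion (largest value ranks first);
    ascending=True mirrors "<=" (smallest value ranks first).
    competition=True counts strictly better values and adds 1, so tied
    rows share the best position of the tie instead of the worst.
    """
    if competition:
        compare = operator.lt if ascending else operator.gt
        offset = 1
    else:
        compare = operator.le if ascending else operator.ge
        offset = 0
    ranks = []
    for current_group, current_value in zip(groups, values):
        count = sum(
            1
            for group, value in zip(groups, values)
            if group == current_group and compare(value, current_value)
        )
        ranks.append(count + offset)
    return ranks


towns = ["Moria", "Moria", "Moria", "Ironforge", "Ironforge"]
revenue = [900, 900, 700, 300, 800]

default_ranks = rank_in_group(towns, revenue)
ascending_ranks = rank_in_group(towns, revenue, ascending=True)
competition_ranks = rank_in_group(towns, revenue, competition=True)
print(default_ranks, ascending_ranks, competition_ranks)
```

With the default setting the two tied Moria rows both get rank 2. With competition=True they both get rank 1 and the next row gets rank 3.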
Dynamically removing top and bottom rows in Power Query can significantly streamline your data cleaning process when dealing with inconsistent data imports. This technique allows you to automatically eliminate unnecessary header rows or footer information based on specific conditions rather than fixed row counts, making your data transformation process more robust and adaptable to changing source files.
The ability to use conditions rather than static numbers is particularly valuable when working with regularly updated reports that may contain varying amounts of metadata or summary information.
In my video, I demonstrate how to implement this dynamic row removal technique, which I learned at an Excel London Meetup.
Understanding the Problem with Static Row Removal
When importing data from text files, CSV files, or other sources, you often encounter extraneous information at the top and bottom of your dataset. These might include title rows, explanatory notes, summary statistics, or footer information that aren’t part of the actual data you need to analyze. Using Power Query’s standard “Remove Top Rows” or “Remove Bottom Rows” functions with a fixed number works fine when your source data structure never changes, but becomes problematic when the number of these rows varies.
In the example shown in my video, we have multiple rows of metadata before the actual header row (which contains “Date” as the first column value), and several rows of additional information at the bottom of the data that need to be removed. There are also some missing values throughout the data that make simple filtering ineffective.
Dynamically Removing Top Rows Based on Conditions
The conventional approach to removing top rows in Power Query involves specifying a fixed number. However, this can be problematic when the number of header rows changes. The dynamic solution involves using a condition rather than a fixed count.
Here’s how to implement this technique:
Go to the Home tab in Power Query Editor
Select “Remove Rows” and then “Remove Top Rows”
Enter a placeholder number to create the step, then in the formula bar replace that number with the “each” keyword followed by a condition
The formula will look something like this: each [Column1] <> "Date" (assuming “Date” is the header text in your first column). This tells Power Query to keep removing rows from the top until it reaches a row where the first column contains the text “Date”, so the header row itself is kept.
The Technical Details of the Table.Skip Function
Behind the scenes, Power Query uses the Table.Skip function when you remove top rows. This function has a hidden capability that isn’t obvious from the user interface — it can accept either a count or a condition parameter.
When using a condition, the syntax changes from simply providing a number to using the format:
each [condition]
The condition is evaluated for each row, starting from the top, and rows are removed until the condition is no longer true. This allows for dynamic adaptation to varying source data structures.
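The "remove while the condition is true" behaviour can be sketched in a few lines of Python. This is only an analogy for what Table.Skip does with a condition, not Power Query code. The sample rows and the helper name skip_while are invented for illustration.

```python
from itertools import dropwhile

# Sketch of "remove top rows while a condition holds".
# The rows below are invented: two metadata rows, a header row, then data.
rows = [
    {"Column1": "Monthly sales report", "Column2": None},
    {"Column1": "Generated by the finance system", "Column2": None},
    {"Column1": "Date", "Column2": "Revenue"},
    {"Column1": "2024-01-01", "Column2": "120"},
    {"Column1": None, "Column2": "95"},  # missing value inside the data
]


def skip_while(rows, condition):
    """Drop leading rows for as long as condition(row) is true."""
    return list(dropwhile(condition, rows))


# Equivalent in spirit to: each [Column1] <> "Date"
cleaned = skip_while(rows, lambda row: row["Column1"] != "Date")
for row in cleaned:
    print(row)
```

The check stops at the first row that fails the condition, so the row with a missing value further down is left alone. A condition that is true for every row would remove everything.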
Dynamically Removing Bottom Rows
Similarly, we can apply the same concept to remove rows from the bottom of our dataset. This is particularly useful when dealing with files that contain summary information, notes, or other footer data that should be excluded from analysis.
The process for removing bottom rows dynamically is:
Go to the Home tab in Power Query Editor
Select “Remove Rows” and then “Remove Bottom Rows”
Enter a placeholder number to create the step, then in the formula bar replace that number with a condition using the “each” keyword
For example, you might use a formula like: each [Merchant] = "" to remove rows from the bottom where the Merchant column contains an empty text string. Or you might use each [Revenue] = null to remove rows where the Revenue column contains null values.
Handling Different Types of Empty Values
When working with bottom rows, it’s important to understand the different types of empty values that might appear in your data:
Empty text strings: represented by "" in formulas
Null values: represented by null in formulas
Missing values: which might be null or empty depending on the data source
In the video demonstration, I show how to handle both empty text strings and null values as conditions for removing bottom rows. The key is to identify which column and which type of empty value reliably indicates the footer section of your data.
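The same idea applied from the bottom can be sketched in Python as follows. In Power Query the Remove Bottom Rows command generates a Table.RemoveLastN step, which, as far as I know, accepts a condition in the same way. The code below is only an analogy for that behaviour, with invented sample rows and hypothetical helper names.

```python
# Sketch of "remove bottom rows while a condition holds", treating both
# empty text and null (None) as blank. Sample rows are invented.

def is_blank(value):
    """True for the two kinds of empty value discussed above."""
    return value is None or value == ""


def remove_last_while(rows, condition):
    """Drop trailing rows for as long as condition(row) is true."""
    end = len(rows)
    while end > 0 and condition(rows[end - 1]):
        end -= 1
    return rows[:end]


rows = [
    {"Merchant": "Acme", "Revenue": 120},
    {"Merchant": "", "Revenue": 80},        # blank inside the data: kept
    {"Merchant": "Globex", "Revenue": 95},
    {"Merchant": "", "Revenue": None},      # footer row
    {"Merchant": None, "Revenue": None},    # footer row
]

cleaned = remove_last_while(rows, lambda row: is_blank(row["Merchant"]))
for row in cleaned:
    print(row)
```

Only the unbroken run of matching rows at the bottom is removed. The row with a blank merchant in the middle of the data survives because the scan stops at the first row that does not match.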
Practical Applications and Benefits
This dynamic row removal technique is particularly valuable in several scenarios:
When working with regularly updated reports where the structure might change slightly between versions, this approach ensures your Power Query solution remains robust. It’s also helpful when dealing with data exports from systems that include varying amounts of metadata or when consolidating multiple files that might have different header structures.
The major benefits include:
More resilient data transformation processes that don’t break when source formats change slightly
Reduced need for manual intervention when processing new data
Ability to handle files with inconsistent structure automatically
Greater flexibility compared to static row removal or simple filtering
This technique demonstrates the power of M language in Power Query, allowing for solutions that go beyond what’s immediately available in the user interface. By understanding and leveraging these more advanced capabilities, you can create more robust data transformation processes.
Important Considerations
When implementing this technique, keep in mind a few important points:
The condition you use must reliably identify the boundary between the rows you want to keep and those you want to remove. Choose column values that are consistently present (or consistently absent) at these boundaries. Also be aware that if your condition never evaluates to false, you could potentially remove all rows from your dataset, so testing with representative sample data is essential.
Additionally, remember that this technique works even when you have missing values in your actual data. As shown in the video, the rows are only removed when they match the specific condition you’ve defined, allowing rows with some missing values to be retained as long as they don’t match your removal condition.
Pivot tables offer a powerful way to analyze data, particularly when you need to understand proportions within hierarchical categories. In this tutorial, I’ll show you how to quickly add sums in a pivot table and display values as a percentage of their parent row, allowing for immediate visualization of how individual items contribute to their category totals.
This technique is especially valuable when analyzing sales data across product categories and individual items.
My step-by-step video tutorial shows this process in action.
Setting Up Your Pivot Table
To begin creating an informative pivot table with percentage calculations, we need to start with a simple dataset that contains hierarchical information. In my example, I’m using a dataset that includes categories (such as fruits, vegetables, and sweets) along with the specific products within each category and their corresponding revenue figures.
The process of creating the pivot table is straightforward:
Select your data range
Navigate to the Insert tab in the Excel ribbon
Click on “PivotTable”
Choose to place the pivot table on an existing worksheet (I selected cell F1 in my demonstration)
Click “OK” to create the basic pivot table structure
Once your pivot table framework is established, you’ll need to structure it properly to show both categories and their constituent products. In the PivotTable Fields panel, drag the appropriate fields to build your hierarchical view.
Structuring Your Pivot Table
For proper hierarchical analysis, you’ll want to arrange your fields in a logical order. In the Rows section of the PivotTable Fields panel, add your Category field first, followed by the Product field. This creates a nested structure where products appear under their respective categories.
For the values section, we need to add the Revenue field twice — once to show the raw sum and once to show the percentage of parent. Simply drag the Revenue field to the Values area twice. By default, Excel will sum these values, which is exactly what we want for this analysis.
Changing the Layout
By default, Excel displays pivot tables in compact form, but for better readability, I prefer the tabular layout. To change this:
Go to the Design tab under PivotTable Tools
Click on “Report Layout”
Select “Show in Tabular Form”
This adjustment separates the Category and Product into distinct columns, making your data more readable and easier to analyze at a glance.
Adding Percentage of Parent Row
Now comes the key part — transforming one of our revenue columns to show percentage of parent row. This calculation will show how each product contributes proportionally to its category total, and how each category contributes to the grand total. Follow these steps:
Right-click on any cell within the second Sum of Revenue column
Select “Show Values As” from the context menu
Choose “% of Parent Row Total”
This simple change transforms the raw numbers into percentages, giving you immediate insight into the proportional contribution of each item. For instance, in my example, you can now see that apples represent approximately 35% of all fruit sales, while the fruits category as a whole represents about 41% of total sales across all categories.
Understanding the Results
After applying the percentage of parent row calculation, your pivot table automatically adjusts to show meaningful proportions at every level:
Individual products show their percentage contribution to their immediate category
Category subtotals show their percentage contribution to the grand total
The grand total always equals 100%
In my demonstration, this clearly showed that sweets accounted for approximately 40% of total sales, vegetables for about 18%, and fruits for approximately 41%. Within each category, you can similarly see the proportional contribution of each product.
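For readers who want to check the arithmetic behind “% of Parent Row Total”, here is a small Python sketch of the calculation. The category names follow the example, but the products, revenue figures and the function name percent_of_parent are invented, so the percentages differ from those in my demonstration.

```python
from collections import defaultdict

# Sketch of the "% of Parent Row Total" arithmetic:
# product share = product total / category total
# category share = category total / grand total


def percent_of_parent(rows):
    """Return (product shares, category shares) for (category, product, revenue) rows."""
    product_totals = defaultdict(float)
    category_totals = defaultdict(float)
    for category, product, revenue in rows:
        product_totals[(category, product)] += revenue
        category_totals[category] += revenue
    grand_total = sum(category_totals.values())
    product_share = {
        key: total / category_totals[key[0]]
        for key, total in product_totals.items()
    }
    category_share = {
        category: total / grand_total
        for category, total in category_totals.items()
    }
    return product_share, category_share


# Invented sample data
sales = [
    ("Fruits", "Apples", 300),
    ("Fruits", "Pears", 100),
    ("Vegetables", "Carrots", 200),
    ("Sweets", "Candy", 400),
]

product_share, category_share = percent_of_parent(sales)
for (category, product), share in product_share.items():
    print(f"{category:<11}{product:<9}{share:.0%}")
for category, share in category_share.items():
    print(f"{category:<11}{'(total)':<9}{share:.0%}")
```

Products are divided by their own category total, while categories are divided by the grand total, which is why the percentages add up to 100% at each level of the hierarchy.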
Finalizing Your Pivot Table
To make your pivot table more understandable, it’s important to rename the column headers to accurately reflect what each column represents. In our case:
Change the heading of the first sum column to simply read “Sum”
Rename the second column to “Percentage of Parent”
These descriptive headers ensure that anyone viewing your pivot table will immediately understand what the numbers represent without needing additional explanation.
With these adjustments complete, you now have a comprehensive pivot table that not only shows the raw revenue figures but also clearly illustrates the proportional relationships between categories and their constituent products. This dual-view approach provides both absolute and relative perspectives on your data, enabling more nuanced analysis and decision-making.
This technique is particularly valuable when analyzing sales performance, budget allocations, or any hierarchical data where understanding proportional relationships is important. By visualizing both raw numbers and percentages simultaneously, you gain deeper insights into your data structure and can more effectively communicate those insights to others.
In this tutorial, I demonstrate a practical application of recursive functions in Power Query to calculate hierarchy levels in organizational structures or MLM systems. Using a custom function with recursion, we can efficiently determine each person’s position in a hierarchical structure based on their referrer relationships, providing valuable insights for organizational analysis and reporting.
Understanding hierarchy levels is essential for visualizing reporting structures, tracking MLM downlines, or mapping any parent-child relationships in your data.
Understanding the Problem: Hierarchy Levels in Organizations
In many organizational structures, particularly in multi-level marketing (MLM) systems or corporate hierarchies, understanding the level depth of each member is crucial. The level represents how many steps a person is from the top of the organization. For instance, in our example, John is at the top (level 0), Anne is directly below John (level 1), and Thomas is below Anne (level 2).
Our sample data contains an ID column that uniquely identifies each person and a Referrer column that indicates who brought that person into the organization. The referral relationship establishes the hierarchical structure we need to analyze. Our goal is to calculate each person’s hierarchy level automatically using Power Query’s recursive capabilities.
Setting Up Power Query
To begin working with our data, we need to import it into Power Query where we can create and apply our recursive function:
Select any cell in your data table
Go to the Data tab
Click From Table/Range to import your data into Power Query
This imports your data containing the ID and Referrer columns into the Power Query Editor, where we can start building our solution.
Creating a Recursive Function to Calculate Hierarchy Levels
The core of our solution is a custom function that can call itself (recursion) to track up through the hierarchy until it reaches the top. Here’s how to create it:
In the Power Query Editor, go to the Home tab
Click New Source > Other Sources > Blank Query
Now we need to define our function. Our function will require two parameters: the person’s ID we want to calculate the level for and the complete table of people data to reference. The function will:
Find the row for the current person
Get their referrer’s ID
Check if they have a referrer
If they don’t (they’re at the top), return 0
If they do have a referrer, call the same function for the referrer and add 1 to the result
The M code for our function looks like this:
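(personID, personTable) =>
let
    // Find the row for the current person ({0} takes the first match)
    personRow = Table.SelectRows(personTable, each [ID] = personID){0},
    // Read the ID of the person who referred them
    personAboveID = personRow[Referrer],
    // No referrer means top of the hierarchy; otherwise recurse one level up
    result = if personAboveID = null
        then 0
        else @HierarchyLevel(personAboveID, personTable) + 1
in
    result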
Make sure to name your function HierarchyLevel so that the recursive reference to itself works properly. The @ symbol in front of the function name is M’s scoping operator: it lets the function body refer to the function’s own name, which is what makes the recursive call possible.
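If the recursion is easier to follow outside of M, the same logic can be sketched in Python. This is an analogy only, not the Power Query implementation. The dictionary referrer_of stands in for the table lookup, IDs 1, 2 and 5 follow the example, and ID 7 for Paul is a made-up value because the article does not state his ID.

```python
# Python analogy of the recursive HierarchyLevel function.
# referrer_of maps each person's ID to their referrer's ID (None = no referrer).
# IDs 1, 2 and 5 follow the example; ID 7 for Paul is hypothetical.
referrer_of = {
    1: None,  # John, top of the hierarchy
    2: 1,     # Anne, referred by John
    5: 2,     # Thomas, referred by Anne
    7: 5,     # Paul, referred by Thomas
}


def hierarchy_level(person_id, referrer_of):
    """Return how many steps person_id is from the top of the hierarchy."""
    referrer = referrer_of[person_id]
    if referrer is None:
        return 0  # base case: nobody above this person
    # Recursive case: one level deeper than the referrer
    return hierarchy_level(referrer, referrer_of) + 1


# Apply the function to every person, like adding a custom column per row
levels = {person: hierarchy_level(person, referrer_of) for person in referrer_of}
print(levels)
```

Every call moves one step up the chain, so the number of nested calls equals the person’s level.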
Applying the Function to Our Data
After creating our function, we need to apply it to every row in our data table:
Go back to your main query with the imported data
Click Add Column > Invoke Custom Function
Select your HierarchyLevel function
For personID, select the ID column
For personTable, we need to reference the current table
When setting the personTable parameter, we initially might try to reference a column name, but this will generate errors. Instead, we need to reference the entire table from the previous step. In Power Query, we can do this by referring to the previous step name.
Once correctly configured, the function will calculate the hierarchy level for each person in our table. Thomas, who is referred by Anne (ID 2), who in turn is referred by John (ID 1), will show as level 2. John, who has no referrer, will be at level 0.
Testing and Validating the Recursion
To verify our function works correctly, we can examine the calculated levels for each person in our organization:
John (ID 1): Level 0 (top of hierarchy, no referrer)
Anne (ID 2): Level 1 (referred by John)
Thomas (ID 5): Level 2 (referred by Anne)
Paul: Level 3 (referred by Thomas)
We can further test by changing referrer relationships. For example, if we change Paul’s referrer from Thomas (ID 5) to someone who is already at level 3, Paul would then become level 4. After making such changes in the source data, we can simply refresh our Power Query to see the updated hierarchy levels.
A few points to keep in mind about this function:
Top-level members (those with null referrers) are assigned level 0
The function will work for organizations of any depth, continuing to recurse up the chain until it reaches the top
If the data contains circular references (Person A refers to Person B who refers back to Person A), the recursion could create an infinite loop, so consider adding error handling for this scenario in real applications
In a real-world scenario, you might want to enhance this function to handle more complex requirements, such as detecting circular references or processing multiple hierarchies within the same dataset.
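One possible way to guard against circular references is to remember which IDs have already been visited while walking up the chain. The Python sketch below shows the idea with a loop instead of recursion. It is an illustration under my own assumptions (the function name, the error message and the sample IDs are invented), not something shown in the video.

```python
# Sketch of cycle detection while walking up a referral chain.
# Uses a loop plus a "seen" set instead of recursion.

def hierarchy_level_safe(person_id, referrer_of):
    """Return the level of person_id, or raise ValueError on a circular chain."""
    seen = set()
    level = 0
    current = person_id
    while referrer_of[current] is not None:
        if current in seen:
            raise ValueError(f"Circular referral chain detected at ID {current}")
        seen.add(current)
        current = referrer_of[current]
        level += 1
    return level


healthy = {1: None, 2: 1, 5: 2}
circular = {1: 2, 2: 1}  # person 1 and person 2 refer to each other

print(hierarchy_level_safe(5, healthy))
try:
    hierarchy_level_safe(1, circular)
except ValueError as error:
    print(error)
```

The same check could be ported to M by passing a list of visited IDs as an extra function parameter, though I have not shown that here.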
Loading the Results Back to Excel
Once you’re satisfied with your hierarchy level calculations:
In the Power Query Editor, go to Home > Close & Load
Your data table with the new hierarchy level column will appear in Excel
Any time your hierarchy changes, simply refresh the query to recalculate all levels
This technique allows you to maintain an up-to-date view of your organizational structure with minimal effort. The recursive approach handles hierarchies of any depth without requiring complex formulas or manual tracking.
With this solution in place, you can easily build reports and visualizations that leverage hierarchy level information, enabling better insights into your organizational structure, MLM downlines, or any hierarchical data you’re working with.
Splitting text by length in Power Query can transform cluttered data into organized, usable information without complex formulas. This technique allows you to break down text strings into separate columns based on specific character positions, making it especially useful when dealing with fixed-width data formats that contain multiple pieces of information.
The real power of this approach lies in its ability to handle irregular splitting requirements where each section has a different length.
In my video, I demonstrate the complete process of splitting text by length in Power Query.
Understanding the Data Structure
When working with text data that needs to be split, it’s essential to first analyze the structure of your text. In the demonstration, I work with a dataset where each text string contains several pieces of information with varying lengths:
Person information (30 characters)
Delimiter characters (semicolons, pipes) that need to be removed
Date information (appearing as numbers)
Currency values in different formats
The challenge lies in the fact that each section has a different length, making standard split functions less effective. This is precisely where Power Query’s split by position feature becomes invaluable.
Importing Data to Power Query
The first step in the process is to import your data into Power Query. This can be done easily by selecting your data table and using the From Table/Range option in the Data tab. Once your data is in Power Query, you’ll have access to powerful transformation tools that aren’t available in standard Excel.
Power Query provides a user-friendly interface where you can see your data and apply various transformations step by step. This visual approach makes it easier to track changes and ensure that your data is being processed correctly.
Splitting Text by Position
With the data imported into Power Query, we can now split the text column based on specific positions. Here’s how to do it:
Go to the Home tab in Power Query
Select Split Column and choose “By Positions” (not “By Number of Characters”)
Enter the specific positions where you want to split the text
In the example, I needed to split at positions 0, 30, 31, 39, and 41. It’s important to note that Power Query counts from zero for the first position, not one. These numbers represent the starting points for each section of text.
After pressing OK, Power Query creates new columns based on these position splits. The result is five separate columns, each containing a distinct part of the original text string.
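To see how the zero-based positions map to columns, here is a short Python sketch of the same split. The positions come from the example, while the sample line (name, delimiters, date and amount) and the function name split_by_positions are invented to match the described layout.

```python
# Sketch of splitting a fixed-width string at zero-based start positions.
# Each piece runs from one position up to (not including) the next.

def split_by_positions(text, positions):
    """Split text into pieces that start at the given zero-based positions."""
    bounds = list(positions) + [len(text)]
    return [text[start:end] for start, end in zip(bounds, bounds[1:])]


# Invented sample line: 30-character person field, 1 delimiter character,
# 8-character date, 2 delimiter characters, then the payout.
line = "John Smith".ljust(30) + ";" + "20240115" + "||" + "£1,250.00"

parts = split_by_positions(line, [0, 30, 31, 39, 41])
for part in parts:
    print(repr(part))
```

Five start positions produce five pieces, and the last piece simply runs to the end of the text.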
Refining the Split Data
Renaming and Removing Columns
After splitting the text, we need to organize our data by giving meaningful names to the important columns and removing unnecessary ones. In the formula bar, I renamed the columns to reflect their content:
“Text.1” became “Person”
“Text.3” became “Date”
“Text.5” became “Payout”
For columns containing delimiters or other unwanted information (in this case, “Text.2” and “Text.4”), we can simply delete them by selecting them with Ctrl+click and pressing the Delete key.
Correcting Data Types
Once we have our properly named columns, we need to ensure that each column has the correct data type. In the demonstration:
The “Date” column was initially recognized as an integer and needed to be converted to the date data type
The “Payout” column contained currency values in different formats that needed proper interpretation
Power Query can automatically detect and convert data types, but sometimes manual intervention is necessary. By clicking on the data type icon in the column header, you can force Power Query to interpret the data as a specific type.
Working with Regional Settings
An important aspect of working with dates and currency values is understanding how regional settings affect data interpretation. In Power Query, you can adjust these settings to match the format of your data.
To access these settings, go to:
Options and Settings
Query Options
Current Workbook
Regional Settings
In my demonstration, the regional settings were set to “English (United Kingdom)” which allowed Power Query to correctly interpret the pound (£) currency symbols regardless of their position in the text. If your data uses different regional formats, you can adjust these settings to match your needs.
For individual columns, you can also click on the data type icon and select “Using Locale” to specify both the data type and regional format for that particular column. This gives you fine-grained control over how Power Query interprets your data.
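The kind of interpretation the locale settings perform can be sketched in Python as well. Both helpers rest on assumptions that should be checked against real data: that the date digits are in year-month-day order, and that payouts use UK formatting (comma as thousands separator, period as decimal point) with the £ sign before or after the number. The function names are invented for this sketch.

```python
from datetime import date, datetime
from decimal import Decimal

# Sketch of type conversion after the split.
# Assumption: dates are 8 digits in yyyymmdd order.
# Assumption: payouts use UK formatting, with "£" before or after the number.


def parse_date(text):
    """Convert an 8-digit yyyymmdd string to a date."""
    return datetime.strptime(text.strip(), "%Y%m%d").date()


def parse_payout(text):
    """Strip the pound sign and thousands separators, then convert to Decimal."""
    cleaned = text.replace("£", "").replace(",", "").strip()
    return Decimal(cleaned)


print(parse_date("20240115"))
print(parse_payout("£1,250.00"))
print(parse_payout("300£"))
```

Decimal is used here instead of float so that currency amounts are not affected by binary rounding.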
Finalizing the Transformation
After completing all the necessary transformations — splitting the text, renaming columns, removing unnecessary columns, and setting the correct data types — the final step is to load the transformed data back into Excel.
From the Home tab, select “Close & Load” to export your properly organized table back to Excel. The result is a clean, structured dataset with separate columns for person information, dates, and payment values, all with the appropriate data types.
This technique of splitting text by position in Power Query is particularly useful when dealing with fixed-width data exports from legacy systems, standardized report outputs, or any situation where text strings contain multiple data points at known positions. By mastering this approach, you can quickly transform dense, combined text fields into organized and usable data.