Remove Top or Bottom Rows Automatically

Dynamically removing top and bottom rows in Power Query can significantly streamline your data cleaning process when dealing with inconsistent data imports. This technique allows you to automatically eliminate unnecessary header rows or footer information based on specific conditions rather than fixed row counts, making your data transformation process more robust and adaptable to changing source files.

The ability to use conditions rather than static numbers is particularly valuable when working with regularly updated reports that may contain varying amounts of metadata or summary information.

In my video, I demonstrate how to implement this dynamic row removal technique, which I learned at an Excel London Meetup:

Understanding the Problem with Static Row Removal

When importing data from text files, CSV files, or other sources, you often encounter extraneous information at the top and bottom of your dataset. This might include title rows, explanatory notes, summary statistics, or footer information that isn't part of the actual data you need to analyze. Using Power Query's standard "Remove Top Rows" or "Remove Bottom Rows" functions with a fixed number works fine when your source data structure never changes, but becomes problematic when the number of these rows varies.

In the example shown in my video, we have multiple rows of metadata before the actual header row (which contains "Date" as the first column value), and several rows of additional information at the bottom of the data that need to be removed. There are also some missing values throughout the data that make simple filtering ineffective.

Dynamically Removing Top Rows Based on Conditions

The conventional approach to removing top rows in Power Query involves specifying a fixed number. However, this breaks down when the number of header rows changes. The dynamic solution uses a condition rather than a fixed count.

Here's how to implement this technique:

  1. Go to the Home tab in the Power Query Editor
  2. Select "Remove Rows" and then "Remove Top Rows"
  3. Instead of entering a number, modify the formula to use the "each" keyword followed by a condition

The formula will look something like this: each [Column1] <> "Date" (assuming "Date" is the header text in your first column). This tells Power Query to keep removing rows until it finds a row where the first column contains the text "Date".

The Technical Details of the Table.Skip Function

Behind the scenes, Power Query uses the Table.Skip function when you remove top rows. This function has a hidden capability that isn't obvious from the user interface: its second argument can be either a row count or a condition.

When using a condition, the syntax changes from simply providing a number to using the format:

each <condition>

where <condition> is any expression that evaluates to true or false for a row. The condition is evaluated for each row, starting from the top, and rows are removed until the condition is no longer true. This allows for dynamic adaptation to varying source data structures.
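Putting this together, a complete query might look like the following sketch. The file path, CSV settings, and column name here are placeholders for your own source, not values from the video:

```m
let
    // Load the raw file; the path and CSV settings are illustrative
    Source = Csv.Document(File.Contents("C:\Data\report.csv")),
    // Skip rows from the top while the first column is not "Date"
    RemovedTopRows = Table.Skip(Source, each [Column1] <> "Date"),
    // The first remaining row is the real header row
    PromotedHeaders = Table.PromoteHeaders(RemovedTopRows, [PromoteAllScalars = true])
in
    PromotedHeaders
```

Because the skip count is derived from the data itself, the query keeps working even if the source file gains or loses metadata rows.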

Dynamically Removing Bottom Rows

Similarly, we can apply the same concept to remove rows from the bottom of our dataset. This is particularly useful when dealing with files that contain summary information, notes, or other footer data that should be excluded from analysis.

The process for removing bottom rows dynamically is:

  1. Go to the Home tab in the Power Query Editor
  2. Select "Remove Rows" and then "Remove Bottom Rows"
  3. Replace the static number with a condition using the "each" keyword

For example, you might use a formula like each [Merchant] = "" to remove rows from the bottom where the Merchant column contains an empty text string. Or you might use each [Revenue] = null to remove rows where the Revenue column contains null values.
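Under the hood this uses Table.RemoveLastN, which, like Table.Skip, accepts either a count or a condition as its second argument. A minimal sketch, where the step and column names are placeholders:

```m
// Remove rows from the bottom while the Merchant column is an empty string
RemovedBottomRows = Table.RemoveLastN(PreviousStep, each [Merchant] = "")
```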

Handling Different Types of Empty Values

When working with bottom rows, it's important to understand the different types of empty values that might appear in your data:

  • Empty text strings, represented by "" in formulas
  • Null values, represented by null in formulas
  • Missing values, which might be null or empty depending on the data source

In the video demonstration, I show how to handle both empty text strings and null values as conditions for removing bottom rows. The key is to identify which column and which type of empty value reliably indicates the footer section of your data.
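If the footer rows mix both kinds of emptiness, the two checks can be combined into a single condition. A hedged sketch, assuming Merchant and Revenue columns as in the earlier examples:

```m
// Remove bottom rows where Merchant is a blank string or Revenue is null
RemovedBottomRows = Table.RemoveLastN(PreviousStep,
    each [Merchant] = "" or [Revenue] = null)
```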

Practical Applications and Benefits

This dynamic row removal technique is particularly valuable in several scenarios:

When working with regularly updated reports whose structure might change slightly between versions, this approach keeps your Power Query solution robust. It's also helpful when dealing with data exports from systems that include varying amounts of metadata, or when consolidating multiple files that might have different header structures.

The major benefits include:

  • More resilient data transformation processes that don't break when source formats change slightly
  • Reduced need for manual intervention when processing new data
  • Ability to handle files with inconsistent structure automatically
  • Greater flexibility compared to static row removal or simple filtering

This technique demonstrates the power of the M language in Power Query, allowing for solutions that go beyond what's immediately available in the user interface. By understanding and leveraging these more advanced capabilities, you can create more robust data transformation processes.

Important Considerations

When implementing this technique, keep a few important points in mind:

The condition you use must reliably identify the boundary between the rows you want to keep and those you want to remove. Choose column values that are consistently present (or consistently absent) at these boundaries. Also be aware that if your condition never evaluates to false, you could remove all rows from your dataset, so testing with representative sample data is essential.

Additionally, remember that this technique works even when you have missing values in your actual data. As shown in the video, rows are only removed when they match the specific condition you've defined, so rows with some missing values are retained as long as they don't match your removal condition.

Excel Pivot Table Hack: Display Percentages of Parent Row Totals

Pivot tables offer a powerful way to analyze data, particularly when you need to understand proportions within hierarchical categories. In this tutorial, I'll show you how to quickly add sums in a pivot table and display values as a percentage of their parent row, allowing for immediate visualization of how individual items contribute to their category totals.

This technique is especially valuable when analyzing sales data across product categories and individual items.

Watch my step-by-step video tutorial below to see this process in action:

Setting Up Your Pivot Table

To begin creating an informative pivot table with percentage calculations, we need to start with a simple dataset that contains hierarchical information. In my example, I'm using a dataset that includes categories (such as fruits, vegetables, and sweets) along with the specific products within each category and their corresponding revenue figures.

The process of creating the pivot table is straightforward:

  1. Select your data range
  2. Navigate to the Insert tab in the Excel ribbon
  3. Click on "Pivot Table"
  4. Choose to place the pivot table on an existing worksheet (I selected cell F1 in my demonstration)
  5. Click "OK" to create the basic pivot table structure

Once your pivot table framework is established, you'll need to structure it properly to show both categories and their constituent products. In the PivotTable Fields panel, drag the appropriate fields to build your hierarchical view.

Structuring Your Pivot Table

For proper hierarchical analysis, you'll want to arrange your fields in a logical order. In the Rows section of the PivotTable Fields panel, add your Category field first, followed by the Product field. This creates a nested structure where products appear under their respective categories.

For the Values section, we need to add the Revenue field twice: once to show the raw sum and once to show the percentage of parent. Simply drag the Revenue field to the Values area twice. By default, Excel will sum these values, which is exactly what we want for this analysis.

Changing the Layout

By default, Excel displays pivot tables in compact form, but for better readability, I prefer the tabular layout. To change this:

  1. Go to the Design tab under PivotTable Tools
  2. Click on "Report Layout"
  3. Select "Show in Tabular Form"

This adjustment separates Category and Product into distinct columns, making your data more readable and easier to analyze at a glance.

Adding Percentage of Parent Row

Now comes the key part: transforming one of our revenue columns to show the percentage of the parent row. This calculation shows how each product contributes proportionally to its category total, and how each category contributes to the grand total. Follow these steps:

  1. Right-click on any cell within the second Sum of Revenue column
  2. Select "Show Values As" from the context menu
  3. Choose "% of Parent Row Total"

This simple change transforms the raw numbers into percentages, giving you immediate insight into the proportional contribution of each item. For instance, in my example, you can now see that apples represent approximately 35% of all fruit sales, while the fruits category as a whole represents about 41% of total sales across all categories.

Understanding the Results

After applying the percentage of parent row calculation, your pivot table automatically adjusts to show meaningful proportions at every level:

  • Individual products show their percentage contribution to their immediate category
  • Category subtotals show their percentage contribution to the grand total
  • The grand total always equals 100%

In my demonstration, this clearly showed that sweets accounted for approximately 40% of total sales, vegetables for about 18%, and fruits for approximately 41%. Within each category, you can similarly see the proportional contribution of each product.

Finalizing Your Pivot Table

To make your pivot table more understandable, it's important to rename the column headers so they accurately reflect what each column represents. In our case:

  1. Change the heading of the first sum column to simply read "Sum"
  2. Rename the second column to "Percentage of Parent"

These descriptive headers ensure that anyone viewing your pivot table will immediately understand what the numbers represent without needing additional explanation.

With these adjustments complete, you now have a comprehensive pivot table that shows the raw revenue figures and clearly illustrates the proportional relationships between categories and their constituent products. This dual-view approach provides both absolute and relative perspectives on your data, enabling more nuanced analysis and decision-making.

This technique is particularly valuable when analyzing sales performance, budget allocations, or any hierarchical data where understanding proportional relationships is important. By visualizing both raw numbers and percentages simultaneously, you gain deeper insights into your data structure and can more effectively communicate those insights to others.

How to Calculate Hierarchy Levels in Power Query

In this tutorial, I demonstrate a practical application of recursive functions in Power Query to calculate hierarchy levels in organizational structures or MLM systems. Using a custom function with recursion, we can efficiently determine each person's position in a hierarchical structure based on their referrer relationships, providing valuable insights for organizational analysis and reporting.

Understanding hierarchy levels is essential for visualizing reporting structures, tracking MLM downlines, or mapping any parent-child relationships in your data.

Understanding the Problem: Hierarchy Levels in Organizations

In many organizational structures, particularly in multi-level marketing (MLM) systems or corporate hierarchies, understanding the depth of each member is crucial. The level represents how many steps a person is from the top of the organization. For instance, in our example, John is at the top (level 0), Anne is directly below John (level 1), and Thomas is below Anne (level 2).

Our sample data contains an ID column that uniquely identifies each person and a Referrer column that indicates who brought that person into the organization. The referral relationship establishes the hierarchical structure we need to analyze. Our goal is to calculate each person's hierarchy level automatically using Power Query's recursive capabilities.

Setting Up Power Query

To begin working with our data, we need to import it into Power Query, where we can create and apply our recursive function:

  1. Select any cell in your data table
  2. Go to the Data tab
  3. Click From Table/Range to import your data into Power Query

This imports your data containing the ID and Referrer columns into the Power Query Editor, where we can start building our solution.

Creating a Recursive Function to Calculate Hierarchy Levels

The core of our solution is a custom function that can call itself (recursion) to walk up through the hierarchy until it reaches the top. Here's how to create it:

  1. In the Power Query Editor, go to the Home tab
  2. Click New Source > Other Sources > Blank Query

Now we need to define our function. It requires two parameters: the ID of the person whose level we want to calculate, and the complete table of people to reference. The function will:

  1. Find the row for the current person
  2. Get their referrer's ID
  3. Check if they have a referrer
  4. If they don't (they're at the top), return 0
  5. If they do have a referrer, call the same function for the referrer and add 1 to the result

The M code for our function looks like this:

(personID, personTable) =>
let
    personRow = Table.SelectRows(personTable, each [ID] = personID){0},
    personAboveID = personRow[Referrer],
    result =
        if personAboveID = null
        then 0
        else @HierarchyLevel(personAboveID, personTable) + 1
in
    result

Make sure to name your query HierarchyLevel so that the recursive reference to itself works properly. The @ symbol in front of the function name is M's scoping operator; it makes the name refer to the function itself, which is what allows the function to call itself again.

Applying the Function to Our Data

After creating our function, we need to apply it to every row in our data table:

  1. Go back to your main query with the imported data
  2. Click Add Column > Invoke Custom Function
  3. Select your HierarchyLevel function
  4. For personID, select the ID column
  5. For personTable, we need to reference the current table

When setting the personTable parameter, we might initially try to reference a column name, but this will generate errors. Instead, we need to reference the entire table from the previous step. In Power Query, we can do this by referring to the previous step name.

Once correctly configured, the function calculates the hierarchy level for each person in our table. Thomas, who is referred by Anne (ID 2), who in turn is referred by John (ID 1), will show as level 2. John, who has no referrer, will be at level 0.
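The generated step amounts to a Table.AddColumn call along these lines, where Source stands for whatever your previous step happens to be named:

```m
// Call the recursive function for every row, passing the whole table
AddedLevel = Table.AddColumn(Source, "Level",
    each HierarchyLevel([ID], Source), Int64.Type)
```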

Testing and Validating the Recursion

To verify that our function works correctly, we can examine the calculated levels for each person in our organization:

  • John (ID 1): level 0 (top of hierarchy, no referrer)
  • Anne (ID 2): level 1 (referred by John)
  • Thomas (ID 5): level 2 (referred by Anne)
  • Paul: level 3 (referred by Thomas)

We can test further by changing referrer relationships. For example, if we change Paul's referrer from Thomas (ID 5) to someone who is already at level 3, Paul would then become level 4. After making such changes in the source data, we can simply refresh our Power Query to see the updated hierarchy levels.

Handling Edge Cases in Your Hierarchy

Our recursive function handles several important edge cases:

  1. Top-level members (those with null referrers) are assigned level 0
  2. The function works for organizations of any depth, continuing to recurse up the chain until it reaches the top
  3. If the data contains circular references (Person A refers to Person B, who refers back to Person A), the recursion creates an infinite loop, so consider adding error handling for this scenario in real applications

In a real-world scenario, you might want to enhance this function to handle more complex requirements, such as detecting circular references or processing multiple hierarchies within the same dataset.
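One simple guard against circular references, not shown in the video, is to cap the recursion depth at the table's row count; any referrer chain longer than that must contain a cycle. A hedged sketch, where the query name HierarchyLevelSafe and the optional depth parameter are my additions:

```m
(personID, personTable, optional depth as number) =>
let
    currentDepth = if depth = null then 0 else depth,
    personRow = Table.SelectRows(personTable, each [ID] = personID){0},
    personAboveID = personRow[Referrer],
    result =
        // A valid chain can never be longer than the number of people
        if currentDepth > Table.RowCount(personTable)
        then error "Circular referrer chain detected"
        else if personAboveID = null
        then 0
        else @HierarchyLevelSafe(personAboveID, personTable, currentDepth + 1) + 1
in
    result
```

Raising an error surfaces the bad data immediately on refresh instead of hanging the query.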

Loading the Results Back to Excel

Once you’re sat­is­fied with your hier­ar­chy lev­el calculations:

  1. In the Pow­er Query Edi­tor, go to Home > Close & Load
  2. Your data table with the new hier­ar­chy lev­el col­umn will appear in Excel
  3. Any time your hier­ar­chy changes, sim­ply refresh the query to recal­cu­late all levels

This pow­er­ful tech­nique allows you to main­tain an up-to-date view of your orga­ni­za­tion­al struc­ture with min­i­mal effort. The recur­sive approach effi­cient­ly han­dles even large hier­ar­chies with­out requir­ing com­plex for­mu­las or man­u­al tracking.

With this solu­tion in place, you can eas­i­ly build reports and visu­al­iza­tions that lever­age hier­ar­chy lev­el infor­ma­tion, enabling bet­ter insights into your orga­ni­za­tion­al struc­ture, MLM down­lines, or any hier­ar­chi­cal data you’re work­ing with.

How to Split Text by Position in Power Query

Splitting text by length in Power Query can transform cluttered data into organized, usable information without complex formulas. This technique allows you to break down text strings into separate columns based on specific character positions, making it especially useful when dealing with fixed-width data formats that contain multiple pieces of information.

The real power of this approach lies in its ability to handle irregular splitting requirements where each section has a different length.

In my video, I demonstrate the complete process of splitting text by length in Power Query:

Understanding the Data Structure

When working with text data that needs to be split, it's essential to first analyze the structure of your text. In the demonstration, I work with a dataset where each text string contains several pieces of information with varying lengths:

  • Person information (30 characters)
  • Delimiter characters (semicolons, pipes) that need to be removed
  • Date information (appearing as numbers)
  • Currency values in different formats

The challenge lies in the fact that each section has a different length, making standard split functions less effective. This is precisely where Power Query's split-by-position feature becomes invaluable.

Importing Data to Power Query

The first step is to import your data into Power Query. This can be done easily by selecting your data table and using the From Table/Range option on the Data tab. Once your data is in Power Query, you'll have access to powerful transformation tools that aren't available in standard Excel.

Power Query provides a user-friendly interface where you can see your data and apply various transformations step by step. This visual approach makes it easier to track changes and ensure that your data is being processed correctly.

Splitting Text by Position

With the data imported into Power Query, we can now split the text column at specific positions. Here's how to do it:

  1. Go to the Home tab in Power Query
  2. Select Split Column and choose "By Positions" (not "By Number of Characters")
  3. Enter the specific positions where you want to split the text

In the example, I needed to split at positions 0, 30, 31, 39, and 41. It's important to note that Power Query counts the first position as zero, not one. These numbers represent the starting points of each section of text.

After pressing OK, Power Query creates new columns based on these position splits. The result is five separate columns, each containing a distinct part of the original text string.
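The generated step uses Table.SplitColumn with Splitter.SplitTextByPositions. Roughly like this, though the step and column names will vary with your query:

```m
// Split the source column at character positions 0, 30, 31, 39, 41
SplitByPositions = Table.SplitColumn(Source, "Column1",
    Splitter.SplitTextByPositions({0, 30, 31, 39, 41}),
    {"Text.1", "Text.2", "Text.3", "Text.4", "Text.5"})
```

Each number in the list is the zero-based starting position of one output column, which is why the list begins with 0.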

Refining the Split Data

Renaming and Removing Columns

After splitting the text, we need to organize our data by giving meaningful names to the important columns and removing unnecessary ones. In the formula bar, I renamed the columns to reflect their content:

  • "Text.1" became "Person"
  • "Text.3" became "Date"
  • "Text.5" became "Payout"

For columns containing delimiters or other unwanted information (in this case, "Text.2" and "Text.4"), we can simply delete them by selecting them with Ctrl+click and pressing the Delete key.
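In M, those two clean-up actions correspond to Table.RenameColumns and Table.RemoveColumns. A sketch, assuming the split step was named SplitByPositions:

```m
// Give the useful columns meaningful names
RenamedColumns = Table.RenameColumns(SplitByPositions,
    {{"Text.1", "Person"}, {"Text.3", "Date"}, {"Text.5", "Payout"}}),
// Drop the columns that only held delimiter characters
RemovedColumns = Table.RemoveColumns(RenamedColumns, {"Text.2", "Text.4"})
```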

Correcting Data Types

Once we have our properly named columns, we need to ensure that each column has the correct data type. In the demonstration:

  • The "Date" column was initially recognized as an integer and needed to be converted to the date data type
  • The "Payout" column contained currency values in different formats that needed proper interpretation

Power Query can automatically detect and convert data types, but sometimes manual intervention is necessary. By clicking the data type icon in the column header, you can force Power Query to interpret the data as a specific type.

Working with Regional Settings

An important aspect of working with dates and currency values is understanding how regional settings affect data interpretation. In Power Query, you can adjust these settings to match the format of your data.

To access these settings, go to:

  1. Options and Settings
  2. Query Options
  3. Current Workbook
  4. Regional Settings

In my demonstration, the regional settings were set to "English (United Kingdom)", which allowed Power Query to correctly interpret the pound (£) currency symbols regardless of their position in the text. If your data uses different regional formats, you can adjust these settings to match your needs.

For individual columns, you can also click the data type icon and select "Using Locale" to specify both the data type and the regional format for that particular column. This gives you fine-grained control over how Power Query interprets your data.
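"Using Locale" maps to the optional culture argument of Table.TransformColumnTypes. A sketch, with column and step names assumed from the earlier examples:

```m
// Convert types using the en-GB culture so UK dates and £ values parse correctly
ChangedTypes = Table.TransformColumnTypes(RemovedColumns,
    {{"Date", type date}, {"Payout", Currency.Type}}, "en-GB")
```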

Finalizing the Transformation

After completing all the necessary transformations (splitting the text, renaming columns, removing unnecessary columns, and setting the correct data types), the final step is to load the transformed data back into Excel.

From the Home tab, select "Close & Load" to export your properly organized table back to Excel. The result is a clean, structured dataset with separate columns for person information, dates, and payment values, all with the appropriate data types.

This technique of splitting text by position in Power Query is particularly useful when dealing with fixed-width data exports from legacy systems, standardized report outputs, or any situation where text strings contain multiple data points at known positions. By mastering this approach, you can quickly transform dense, combined text fields into organized and usable data.

How to Import Fixed-Width Files to Excel

Importing fixed-width column data to Excel can be challenging because each column has a set number of characters, often resulting in unnecessary spaces. In this article, I'll show you how to properly import fixed-width CSV files using Power Query, ensuring clean data without those problematic extra spaces.

This technique is essential for handling structured data exports from legacy systems or standardized reports.

Understanding Fixed-Width Column Data

Fixed-width column data is a format where each column occupies exactly the same number of characters in every row. Unlike delimited formats (such as comma-separated values), fixed-width files allocate a predetermined number of characters to each field, so shorter values get padded with spaces to maintain the fixed width. For example, the date column might always be 10 characters wide, the merchant column 20 characters, and so on.

This format is common in older systems, government data, and standardized reports, but it presents unique challenges when importing to Excel. The main issue is that the extra spaces can cause problems with data analysis and visualization if not properly handled during the import process.

Importing Fixed-Width Data Using Power Query

Power Query provides excellent tools for handling fixed-width column data. Here's the process for properly importing this type of data:

Initial Import Steps

To begin importing fixed-width data, follow these steps:

  1. Go to the Data tab in Excel
  2. Click the "From Text/CSV" option
  3. Locate and select your fixed-width file

When the Power Query import window opens, you might need to adjust its size to see all available options; the third column of settings may be hidden if the window is too narrow. Make sure to expand the window so that all import settings are visible.

Configuring Import Settings

During the import process, you'll need to configure several important settings:

First, check the file origin (encoding); UTF-8 works well for most modern files. Then verify that the delimiter is set correctly. For fixed-width files, Power Query typically detects this format automatically: you should see "Fixed Width" selected rather than comma, tab, or another delimiter.

The data type detection setting is another crucial choice. You have three options:

  • Based on first 200 rows
  • Based on entire dataset
  • Do not detect data types

In most cases with fixed-width data, it's better to select "Do not detect data types" initially and handle data type conversion after cleaning the spaces. This prevents Power Query from making assumptions about your data types that might be skewed by the extra spaces.

Handling Extra Spaces in Power Query

The key challenge with fixed-width data is managing the extra spaces that pad each column. When examining the data in Power Query, you'll notice these extra spaces in both the column headers and the text fields. These spaces are necessary in the original file to maintain the fixed-width format, but they're problematic for analysis in Excel.

For example, when clicking on a column header, you might see something like "Date " with several spaces after the actual name. Similarly, text fields like product names or merchant names will have trailing spaces to fill their allocated width.

Removing Extra Spaces

To clean up the data properly:

  1. Select all columns in the Power Query Editor by clicking the first column header and then Shift-clicking the last column
  2. Go to the Transform tab
  3. Click "Format" and select "Trim"

This important step removes all leading and trailing whitespace from every cell in your dataset while preserving spaces that are actually part of your data (like spaces between words in product names). After trimming, you'll notice a significant improvement in how your data looks.
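With every column selected, the UI writes out one Text.Trim transformation per column. An equivalent shorthand, which I find tidier, passes Text.Trim as the default transformation so it applies to all columns (a sketch; Source stands for your previous step):

```m
// An empty operation list plus a default transformation trims every column
TrimmedText = Table.TransformColumns(Source, {}, Text.Trim)
```

This version also keeps working if the file later gains extra columns, since nothing is listed by name.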

Setting Headers and Data Types

After removing the extra spaces, you can proceed with configuring headers and data types:

Use the "Use First Row as Headers" option (found on the Transform tab or by right-clicking the first row) to set your headers properly. Now that the spaces have been removed, your column names will appear clean and professional.

With clean data, you can now set appropriate data types for each column. For example, date columns should be formatted as dates, and revenue or cost columns as currency or decimal numbers. Power Query will help identify appropriate types, but you can change them manually by clicking the data type icon in each column header.
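In M, those two steps correspond to Table.PromoteHeaders followed by Table.TransformColumnTypes. A sketch with illustrative column names:

```m
// Promote the first (now trimmed) row to column headers
PromotedHeaders = Table.PromoteHeaders(TrimmedText, [PromoteAllScalars = true]),
// Then assign proper types; the column names here are examples
ChangedTypes = Table.TransformColumnTypes(PromotedHeaders,
    {{"Date", type date}, {"Merchant", type text}, {"Revenue", Currency.Type}})
```

Doing the trim before promoting headers is what keeps names like "Date " from becoming padded column headers.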

Loading Data to Excel

The final step is to load your cleaned data into Excel:

  1. Click "Close & Load" (or "Close & Load To" for more options)
  2. Select whether to load as a table or to a specific location
  3. Choose an existing worksheet or create a new one

When loading the data, you have the option to add it to the data model if you plan to use it with Power Pivot, but this isn't necessary for basic Excel analysis. One important thing to note is that while Power Query may detect currency data types, this doesn't automatically apply Excel's currency formatting to your table. You'll need to apply appropriate number formatting in Excel after importing.

Fortunately, when you refresh your data in the future, Excel will maintain any number formatting you've applied to the table, making future updates seamless.

Benefits of Using Power Query for Fixed-Width Data

Using Power Query to import fixed-width data offers several advantages over other methods:

The ability to transform data before loading it into Excel saves significant time compared to manual cleanup after import. The process is also repeatable: if you receive updated versions of the same file, you can simply refresh your query rather than repeating the entire import and cleanup process.

Additionally, Power Query maintains a record of all transformations applied, making your data preparation process transparent and easily adjustable if needed. This approach ensures consistency in how your fixed-width data is processed each time.

Merge Tables with Summarized Data for Cleaner Reports

Merging tables in Power Query with summarized data provides a powerful way to consolidate information without overwhelming detail. By aggregating invoice details such as counts, sums, averages, and medians, you can create concise reports that highlight key insights while maintaining the original table structure.

This technique is particularly valuable when dealing with sales data where each invoice contains multiple line items that need to be condensed.

In my video tutorial, I demonstrate how to effectively merge tables with summarization in Power Query:

Understanding the Problem: Merging Tables with Summarization

When work­ing with rela­tion­al data like invoic­es and their detailed items, we often need to com­bine infor­ma­tion from mul­ti­ple tables. How­ev­er, sim­ply merg­ing tables can result in redun­dant data and unnec­es­sary com­plex­i­ty. The chal­lenge is to merge data while simul­ta­ne­ous­ly sum­ma­riz­ing the details to pro­vide mean­ing­ful insights.

In this demon­stra­tion, I work with two tables: one con­tain­ing basic invoice infor­ma­tion and anoth­er with detailed line items for each invoice. The goal is to enhance the invoice table with sum­ma­rized infor­ma­tion from the details table, such as the count of items, aver­age quan­ti­ties, sum of prices, and even sta­tis­ti­cal mea­sures like medians.

Loading Tables into Power Query

The first step in our process is to load both tables into Pow­er Query for manip­u­la­tion. This gives us access to the pow­er­ful data trans­for­ma­tion capa­bil­i­ties that allow for sophis­ti­cat­ed merg­ing and summarization.

To begin, I nav­i­gate to the Data tab in Excel and select “From Table/Range” to import our first table into Pow­er Query. For this ini­tial details table, I choose to load it only as a con­nec­tion rather than import­ing it direct­ly into the work­sheet. This is done by select­ing “Close and Load To” and then choos­ing the “Con­nec­tion only” option.

After load­ing the details table, I fol­low the same process to load the invoice table. This time, how­ev­er, we’ll per­form our merge oper­a­tion direct­ly with­in this query rather than cre­at­ing a new one.

Merging the Tables

With both tables loaded into Pow­er Query, we can now merge them using the com­mon invoice ID field. This is where the pow­er of our approach becomes evident.

To per­form the merge:

  1. Select the “Merge Queries” com­mand from the ribbon
  2. Choose the sec­ond table (details) to merge with
  3. Select match­ing columns for merg­ing (invoice ID in both tables)
  4. Choose “Left Out­er Join” as the join kind to ensure all invoic­es are included

After press­ing OK, Pow­er Query cre­ates a new col­umn in our invoice table that con­tains the match­ing rows from the details table. You’ll notice that for each invoice, we now have a nest­ed table show­ing all the cor­re­spond­ing detail rows. For exam­ple, some invoic­es might have two detail rows, oth­ers three, and some even five or more.
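In M, the merge step above corresponds to a Table.NestedJoin call. A minimal sketch, assuming the two queries are named Invoices and Details and share an "Invoice ID" column (those names are assumptions, not the exact ones from the video):

```powerquery
// Sketch of the merge step; query and column names are assumptions
let
    Source = Invoices,
    // A left outer join keeps every invoice; matching detail rows
    // arrive as a nested table in the new "Details" column
    Merged = Table.NestedJoin(
        Source, {"Invoice ID"},
        Details, {"Invoice ID"},
        "Details", JoinKind.LeftOuter
    )
in
    Merged
```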

The Key Difference: Aggregating Instead of Expanding

The crit­i­cal step that dif­fer­en­ti­ates this approach from a stan­dard merge is what we do next. Instead of expand­ing these nest­ed tables (which would cre­ate dupli­cate invoice rows), we want to aggre­gate the infor­ma­tion they contain.

When we click on the expand but­ton (the small tri­an­gle) in the col­umn head­er, we get options to expand or aggre­gate. By choos­ing to aggre­gate, we can per­form cal­cu­la­tions across all the detail rows for each invoice.

Creating Summarized Metrics

The pow­er of this tech­nique lies in the vari­ety of aggre­ga­tions we can per­form on the detailed data. For our invoice exam­ple, I demon­strate sev­er­al use­ful metrics:

  • Count of items: Sim­ple count of how many line items exist for each invoice
  • Aver­age quan­ti­ty: Instead of sum­ming quan­ti­ties, we cal­cu­late the aver­age num­ber of items per line
  • Sum of unit price: Total val­ue of the invoice
  • Medi­an of unit price: Sta­tis­ti­cal mea­sure show­ing the mid­dle val­ue of all unit prices on the invoice

To cre­ate these aggre­ga­tions, I select the appro­pri­ate options from the drop­down menu when expand­ing the nest­ed table col­umn. By uncheck­ing “Use orig­i­nal col­umn name as pre­fix,” we get clean­er col­umn names in our results.
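Behind the scenes, choosing to aggregate rather than expand generates a Table.AggregateTableColumn step. A sketch, with illustrative field names ("Quantity" and "Unit Price" are assumptions):

```powerquery
// Aggregate the nested "Details" column instead of expanding it
Aggregated = Table.AggregateTableColumn(
    Merged, "Details",
    {
        {"Quantity", List.Count, "Count of Items"},
        {"Quantity", List.Average, "Average Quantity"},
        {"Unit Price", List.Sum, "Sum of Unit Price"},
        {"Unit Price", List.Median, "Median of Unit Price"}
    }
)
```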

Refining the Results

After cre­at­ing our aggre­gat­ed columns, we may want to refine the col­umn names for clar­i­ty. This can be done direct­ly in the for­mu­la bar in Pow­er Query. For exam­ple, I rename one of the columns to “Count” to bet­ter reflect its purpose.

It's also important to check the data types of our new columns. Power Query assigns data types automatically, but they may need adjustment. In our case, the "Count" column should be a whole number rather than a decimal, since counts are always integers.

When work­ing with mon­e­tary val­ues like the sum of unit prices, Pow­er Query may pre­serve the cur­ren­cy data type. How­ev­er, it’s worth not­ing that this data type infor­ma­tion does­n’t auto­mat­i­cal­ly trans­late to for­mat­ting in Excel. We’ll need to apply cur­ren­cy for­mat­ting in Excel after load­ing the data.

Loading the Final Results

Once we’re sat­is­fied with our merged and sum­ma­rized table, we can load it back to Excel. From the Home tab in Pow­er Query, I select “Close and Load To” and choose to cre­ate a new table in the exist­ing worksheet.

The final result is an enhanced invoice table that includes aggre­gat­ed infor­ma­tion from the details table with­out dupli­cat­ing invoice records. This gives us a con­cise view of our data with valu­able insights such as the num­ber of items per invoice, aver­age quan­ti­ties, total val­ues, and sta­tis­ti­cal measures.

This tech­nique is par­tic­u­lar­ly valu­able for finan­cial analy­sis, sales report­ing, and any sit­u­a­tion where you need to com­bine mas­ter records with sum­ma­rized infor­ma­tion from relat­ed detail records. By lever­ag­ing Pow­er Query’s merg­ing and aggre­ga­tion capa­bil­i­ties, we can cre­ate more infor­ma­tive reports while main­tain­ing a clean, non-redun­dant data structure.

How to Create Sales Summaries with Power Query


Sales data analy­sis often requires sum­ma­riz­ing results by spe­cif­ic cat­e­gories to gain valu­able busi­ness insights. In this arti­cle, I demon­strate how to effi­cient­ly cal­cu­late sales sum­maries using Pow­er Query’s group­ing func­tion­al­i­ty, pro­vid­ing a pow­er­ful alter­na­tive to tra­di­tion­al Excel for­mu­las for ana­lyz­ing mer­chant per­for­mance, income totals, and prod­uct distribution.

The group­ing tech­nique shown allows for mul­ti­ple aggre­ga­tions in a sin­gle oper­a­tion, dra­mat­i­cal­ly improv­ing effi­cien­cy when work­ing with larg­er datasets.

Watch my detailed tuto­r­i­al video below to fol­low along with each step:

Understanding the Sales Summary Challenge

When ana­lyz­ing sales data, we often need to cre­ate sum­maries that show per­for­mance met­rics grouped by spe­cif­ic cat­e­gories. In this demon­stra­tion, I’m work­ing with a dataset con­tain­ing sales trans­ac­tions that include mer­chants, prod­ucts, dates, and income val­ues. My goal is to sum­ma­rize this data to show sev­er­al key met­rics for each merchant:

  • Count of sales transactions
  • Sum of income
  • Aver­age income per transaction
  • List of unique prod­ucts sold

While these cal­cu­la­tions can be done using Excel for­mu­las (which I’ve pre­pared in the work­sheet for com­par­i­son), using Pow­er Query offers sig­nif­i­cant advan­tages, espe­cial­ly for larg­er datasets or repet­i­tive analy­sis tasks.

Importing Data into Power Query

The first step in our process is to bring the data into Pow­er Query where we can per­form the group­ing operations:

  1. Select the data table in Excel
  2. Nav­i­gate to the Data tab in the Excel ribbon
  3. Click “From Table/Range” to import the data into Pow­er Query

Once the data is loaded into the Pow­er Query Edi­tor, we can see all our sales records and begin the group­ing process to cre­ate our summary.

Creating Groups in Power Query

To sum­ma­rize our data by mer­chant, we need to use the Group By func­tion in Pow­er Query:

  1. Select the Mer­chant col­umn in the Pow­er Query Editor
  2. Go to the Home tab and click “Group By”

The default Group By dia­log offers basic func­tion­al­i­ty with just one col­umn for group­ing and a sin­gle aggre­ga­tion. For our needs, we require the Advanced option since we want to cre­ate mul­ti­ple aggre­ga­tions based on the mer­chant column.

Setting Up Advanced Grouping

In the advanced Group By dia­log, we need to con­fig­ure mul­ti­ple aggregations:

  1. Set the group­ing col­umn to Mer­chant
  2. Add the fol­low­ing aggregations: 
    • “Count Rows” for the num­ber of sales (named “Count”)
    • “Sum” of the Income col­umn (named “Sum”)
    • “Aver­age” of the Income col­umn (named “Aver­age”)
    • “All Rows” to cap­ture all relat­ed data (named “List”)

After click­ing OK, Pow­er Query cre­ates a new table with our mer­chant groups and the spec­i­fied aggre­ga­tions. The Count, Sum, and Aver­age val­ues match the Excel for­mu­las we pre­pared ear­li­er, con­firm­ing that our group­ing is work­ing correctly.
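The advanced Group By dialog generates a Table.Group step roughly like the following (column names taken from the example; the exact generated code may differ slightly):

```powerquery
Grouped = Table.Group(
    Source, {"Merchant"},
    {
        {"Count", each Table.RowCount(_), Int64.Type},
        {"Sum", each List.Sum([Income]), type number},
        {"Average", each List.Average([Income]), type number},
        // "All Rows" keeps the full nested table for each merchant
        {"List", each _, type table}
    }
)
```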

Customizing the Product List

The “All Rows” aggre­ga­tion cre­ates a nest­ed table for each mer­chant con­tain­ing all their sales records. How­ev­er, we only want a list of unique prod­ucts sold by each mer­chant, not all the row data. Since the Group By dia­log does­n’t offer this spe­cif­ic option, we need to mod­i­fy the M for­mu­la directly.

Modifying the M Formula

To cus­tomize our prod­uct list, we need to edit the M code that Pow­er Query generated:

  1. The default “All Rows” aggre­ga­tion pro­duces tables with all columns for each merchant
  2. We need to change this to extract only the Prod­uct col­umn values
  3. Fur­ther­more, we need to elim­i­nate dupli­cates from these lists

The aggregation Power Query generates for "All Rows" is each _, which returns the entire nested table for each merchant. We need to replace this with a reference to just the Product column, [Product], and wrap it in the List.Distinct function to remove duplicates.

The modified aggregation should look like: each List.Distinct([Product])
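After the edit, the Table.Group step might look like this sketch (column names assumed from the example; only the last aggregation changes):

```powerquery
Grouped = Table.Group(
    Source, {"Merchant"},
    {
        {"Count", each Table.RowCount(_), Int64.Type},
        {"Sum", each List.Sum([Income]), type number},
        {"Average", each List.Average([Income]), type number},
        // was: {"List", each _, type table}
        {"List", each List.Distinct([Product])}
    }
)
```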

Transforming the Product List into Text

Now that we have lists of unique prod­ucts for each mer­chant, we can trans­form these lists into com­ma-sep­a­rat­ed text val­ues for bet­ter readability:

  1. Click on the expand but­ton in the List col­umn header
  2. Select “Extract Val­ues…” rather than “Expand to New Rows”
  3. Choose a cus­tom delim­iter (com­ma fol­lowed by space: “, ”)
  4. Click OK

This trans­forms our prod­uct lists into read­able text strings show­ing all the unique prod­ucts each mer­chant has sold, sep­a­rat­ed by commas.
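The Extract Values command generates a Text.Combine transformation; the step looks roughly like this (step and column names are assumptions):

```powerquery
// Turn each list of unique products into comma-separated text;
// List.Transform(_, Text.From) guards against non-text values
Extracted = Table.TransformColumns(
    Grouped,
    {"List", each Text.Combine(List.Transform(_, Text.From), ", "), type text}
)
```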

Loading the Results Back to Excel

Once we’ve cre­at­ed our sum­ma­ry with all the required met­rics, we can load the results back to Excel:

  1. Go to the Home tab and click “Close & Load To…”
  2. Select “Exist­ing Work­sheet” and choose where to place the results
  3. Click OK

After load­ing, we can for­mat the Sum and Aver­age columns as cur­ren­cy to improve read­abil­i­ty. The final result is a clean sum­ma­ry table show­ing the count of sales, total income, aver­age income, and unique prod­ucts for each merchant.

Benefits of Using Power Query for Grouping

While our exam­ple used a rel­a­tive­ly small dataset, the advan­tages of using Pow­er Query for this type of analy­sis become even more appar­ent with larg­er data volumes:

  • Effi­cien­cy — Per­forms cal­cu­la­tions much faster than Excel for­mu­las for large datasets
  • Repeata­bil­i­ty — The query can be refreshed when data changes with­out redo­ing formulas
  • Flex­i­bil­i­ty — Allows for com­plex aggre­ga­tions and trans­for­ma­tions that would be dif­fi­cult with Excel formulas
  • Data clean­ing — Inte­grates with oth­er Pow­er Query trans­for­ma­tions for com­pre­hen­sive data preparation

By mas­ter­ing this group­ing tech­nique in Pow­er Query, you can cre­ate sophis­ti­cat­ed sales sum­maries and oth­er data aggre­ga­tions with just a few clicks, sav­ing sig­nif­i­cant time and effort com­pared to tra­di­tion­al Excel methods.

Automate Student Exam Grading with Excel Power Query


Using append­ing in Pow­er Query offers a stream­lined approach to auto­mat­i­cal­ly grade stu­dent exams based on pre­de­fined thresh­olds. This method elim­i­nates man­u­al grad­ing errors and saves sig­nif­i­cant time when pro­cess­ing large sets of stu­dent scores, pro­vid­ing an effi­cient way to man­age aca­d­e­m­ic assessments.

The tech­nique com­bines stu­dent scores with grade thresh­olds to gen­er­ate accu­rate results that can be eas­i­ly updat­ed when grad­ing cri­te­ria change.

Watch my step-by-step tuto­r­i­al on how to imple­ment this grad­ing sys­tem using Pow­er Query:

Understanding the Grading Process with Power Query

Pow­er Query’s append­ing fea­ture allows us to com­bine mul­ti­ple tables in a way that cre­ates rela­tion­ships between stu­dent scores and grade thresh­olds. The append­ing process essen­tial­ly stacks tables on top of each oth­er, cre­at­ing a sin­gle dataset that we can manip­u­late to auto­mat­i­cal­ly assign grades. This approach is par­tic­u­lar­ly use­ful in edu­ca­tion­al set­tings where instruc­tors need to process mul­ti­ple exam results efficiently.

The pri­ma­ry advan­tage of using this method is that when grad­ing cri­te­ria change, you don’t need to man­u­al­ly reas­sign grades to each stu­dent. Sim­ply update the thresh­old val­ues in your source table, refresh the query, and all grades will auto­mat­i­cal­ly update to reflect the new standards.

Required Resources

Before start­ing the grad­ing process, you’ll need two essen­tial tables:

  • A table con­tain­ing stu­dent names and their exam points
  • A table defin­ing grade thresh­olds (what point ranges cor­re­spond to which let­ter grades)

These tables will serve as the foun­da­tion for our auto­mat­ed grad­ing sys­tem. The struc­ture of these tables does­n’t need to be complex—just clear­ly orga­nized with appro­pri­ate head­ers to iden­ti­fy the data.
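As a sketch, the two source tables could be written as inline M tables. The student names match the example later in the article, but the threshold values here are illustrative assumptions, not the exact ones from the video:

```powerquery
// Student scores and grade thresholds as inline tables
Students = #table(
    {"Student", "Points"},
    {{"Jennifer", 13}, {"Robert", 34}, {"Olivia", 40}}
),
Thresholds = #table(
    {"Points", "Grade"},
    {{0, "F"}, {40, "E"}, {50, "D"}, {60, "C"}, {75, "B"}, {90, "A"}}
)
```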

Setting Up the Tables in Power Query

To begin imple­ment­ing the grad­ing sys­tem, we need to import both tables into Pow­er Query. This process allows us to manip­u­late and com­bine the data efficiently.

First, with the stu­dent scores table select­ed, nav­i­gate to the Data tab in Excel. From there, select “From Table/Range” to import the first table into Pow­er Query. When prompt­ed to load the data, choose “Cre­ate Con­nec­tion Only” as we’ll be manip­u­lat­ing this data fur­ther before bring­ing it back to Excel.

Repeat the same process for your grade thresh­olds table. Hav­ing both tables acces­si­ble in Pow­er Query is essen­tial for the next step where we’ll com­bine them through appending.

Appending the Tables

Once both tables are in Pow­er Query, we can com­bine them using the append function:

On the Home tab with­in Pow­er Query Edi­tor, click the Append Queries but­ton (not “Append Queries as New”). In the dia­log box that appears, con­firm that you want to append just the two tables you’ve import­ed. Select the stu­dent table as your pri­ma­ry table and the grades thresh­old table as the table to append.

After append­ing, you’ll notice that the result­ing table con­tains all rows from both orig­i­nal tables. The stu­dent data appears first, fol­lowed by the thresh­old data. This com­bined dataset gives us every­thing we need to assign grades automatically.

Organizing and Sorting the Data

For our grad­ing sys­tem to work prop­er­ly, we need to sort the append­ed data by the num­ber of points. This sort­ing is a cru­cial step because it allows us to arrange scores in ascend­ing order, mak­ing it eas­i­er to assign appro­pri­ate grades.

Click on the col­umn con­tain­ing the points/scores and sort from A to Z (ascend­ing). This action arranges all scores from low­est to high­est, with the grade thresh­olds inte­grat­ed among the stu­dent scores based on their point values.

Because of how the data is now arranged, the grade thresh­olds appear at posi­tions where they nat­u­ral­ly divide the stu­dent scores into dif­fer­ent grade categories.

Assigning Grades Automatically

With our data prop­er­ly sort­ed, we can now use a sim­ple but pow­er­ful tech­nique to assign grades to each stu­dent. The key to this process is the Fill Down com­mand in Pow­er Query.

The grade col­umn ini­tial­ly has val­ues only in the rows that came from our thresh­old table. To assign these grades to stu­dents, select the grade col­umn and nav­i­gate to the Trans­form tab. Click on “Fill” and then “Down.” This action copies each grade down to all rows below it until anoth­er grade is encountered.

As a result, each stu­dent row is auto­mat­i­cal­ly assigned the grade cor­re­spond­ing to their score based on the defined thresh­olds. For exam­ple, if the thresh­old for grade C is 40 points, stu­dents with 40 or more points (but below the next thresh­old) will be assigned a C.
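The append, sort, and fill-down steps can be sketched in M as follows (query and column names are assumptions; if a student score ties exactly with a threshold, a secondary sort may be needed to put the threshold row first):

```powerquery
// Stack the two tables; threshold rows have a Grade but no Student,
// student rows have a Student but no Grade
Appended = Table.Combine({Students, Thresholds}),
Sorted = Table.Sort(Appended, {{"Points", Order.Ascending}}),
// Copy each grade down onto the student rows beneath its threshold
Filled = Table.FillDown(Sorted, {"Grade"}),
// Keep only the student rows
StudentsOnly = Table.SelectRows(Filled, each [Student] <> null)
```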

Filtering and Finalizing the Results

After fill­ing down the grades, we need to fil­ter out the thresh­old rows to see only the stu­dent results:

Click the filter button on the column containing student names and use "Remove Empty" (or deselect the null entry) so that only rows containing actual student data remain. This leaves us with just the student information and their corresponding grades.

With the data fil­tered and grades assigned, we can now load this infor­ma­tion back to Excel. Nav­i­gate to the Home tab and select “Close & Load To.” Choose where you want to place the results in your workbook.

Updating Grades When Thresholds Change

One of the most pow­er­ful aspects of this approach is how eas­i­ly you can adjust grad­ing criteria:

If you need to change the grad­ing thresh­olds, sim­ply update the val­ues in your orig­i­nal thresh­old table. For exam­ple, you might change the min­i­mum score for an A from 90 to 88 points. After mak­ing these changes, save your file and then refresh the query (right-click on your results table and select “Refresh”).

The entire grad­ing process will auto­mat­i­cal­ly run again with the new thresh­olds, updat­ing all stu­dent grades accord­ing­ly. This makes the sys­tem extreme­ly flex­i­ble and saves sig­nif­i­cant time when adjust­ments are needed.

Real-World Application Example

In my video demon­stra­tion, I showed how this sys­tem works with a sam­ple dataset. For instance, Jen­nifer earned 13 points which result­ed in an F grade, Robert received 34 points also result­ing in an F, while Olivia scored 40 points which met the thresh­old for an E grade.

After chang­ing the thresh­olds and refresh­ing the query, the grades updat­ed auto­mat­i­cal­ly to reflect the new stan­dards. This dynam­ic updat­ing capa­bil­i­ty is what makes this approach so valu­able for edu­ca­tors who may need to adjust grad­ing scales based on class per­for­mance or oth­er factors.

How to Clean one column based on the other column

How to Clean one column based on the other column

Pow­er Query offers pow­er­ful solu­tions for clean­ing data across columns by lever­ag­ing infor­ma­tion con­tained in relat­ed columns. In this arti­cle, I’ll demon­strate how to effi­cient­ly remove cryp­tocur­ren­cy sym­bols from a name col­umn when those sym­bols already exist in a sep­a­rate col­umn, even when the sym­bols have vary­ing lengths.

This tech­nique is par­tic­u­lar­ly valu­able when deal­ing with incon­sis­tent data for­mats that require stan­dard­iza­tion for analysis.

Watch my step-by-step tuto­r­i­al video to see these tech­niques in action:

Understanding the Data Cleaning Challenge

When working with financial data, especially cryptocurrency information, you'll often encounter datasets where the same information appears redundantly across multiple columns. In my example, I have a table containing cryptocurrency data where the name column starts with the symbol followed by the cryptocurrency name (like "BTC Bitcoin"), while a separate symbol column already contains just the symbol ("BTC").

The challenge lies in cleaning the name column by removing these redundant symbols. What makes this particularly tricky is that cryptocurrency symbols vary in length: some are three characters (BTC), others four (DOGE) or more. This inconsistency means we can't simply remove a fixed number of characters from each name.

Preparing the Data in Power Query

To begin the clean­ing process, we need to load our data into Pow­er Query where we can apply the nec­es­sary transformations:

  1. Nav­i­gate to the Data tab in Excel
  2. Select your data range and choose to load it to Pow­er Query
  3. Remove any unnec­es­sary columns to focus on the task at hand

In my case, I ini­tial­ly removed sev­er­al columns to focus on just the key columns for this demon­stra­tion — the name col­umn con­tain­ing both cryp­tocur­ren­cy names and sym­bols, and the sym­bol col­umn con­tain­ing just the sym­bols. This sim­pli­fied approach allows us to con­cen­trate specif­i­cal­ly on the clean­ing tech­nique with­out distraction.

Solution #1: Using Text After Delimiter

The first solu­tion uses Pow­er Query’s built-in func­tions to extract the rel­e­vant part of the text. This approach is more straight­for­ward and uses the UI-based options:

To imple­ment this solution:

  1. Select the name col­umn in Pow­er Query
  2. Go to the Add Col­umn tab
  3. Choose Extract and then Text After Delimiter

Ini­tial­ly, when using the UI, you might try enter­ing a spe­cif­ic sym­bol like “BTC” as the delim­iter. How­ev­er, this would only work for one cryp­tocur­ren­cy. The real pow­er comes from mod­i­fy­ing the for­mu­la to use the sym­bol col­umn as a reference.

After cre­at­ing the ini­tial for­mu­la, you’ll need to edit it in the for­mu­la bar to make it dynam­ic. Replace the hard­cod­ed delim­iter text with a ref­er­ence to the sym­bol col­umn. This way, Pow­er Query will use each row’s cor­re­spond­ing sym­bol to extract just the cryp­tocur­ren­cy name, regard­less of the sym­bol­’s length.
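The edited step might look like the following sketch ("Removed Columns" as the previous step name and the Text.Trim wrapper are my assumptions):

```powerquery
// Use each row's own symbol as the delimiter;
// Text.Trim removes any leftover leading space
Extracted = Table.AddColumn(
    #"Removed Columns", "Clean Name",
    each Text.Trim(Text.AfterDelimiter([name], [symbol])),
    type text
)
```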

Solution #2: Using Text.RemoveRange Function

The sec­ond solu­tion is more com­plex but demon­strates the flex­i­bil­i­ty of Pow­er Query’s M lan­guage. For this approach:

  1. Go to the Add Col­umn tab
  2. Select Cus­tom Column
  3. Use the Text.RemoveRange func­tion to pre­cise­ly remove the sym­bol part

The for­mu­la will look some­thing like:

Text.RemoveRange([name], 0, Text.Length([symbol]))

This func­tion works by remov­ing a spe­cif­ic range of char­ac­ters from the text. It requires three parameters:

  • The source text (our name column)
  • The start­ing posi­tion (0‑based in Pow­er Query)
  • The num­ber of char­ac­ters to remove (which we deter­mine using the length of the symbol)

Remem­ber that Pow­er Query uses zero-based index­ing, mean­ing the first char­ac­ter posi­tion is 0, not 1. This is cru­cial when work­ing with text manip­u­la­tion functions.
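Putting the function into a custom column step gives a sketch like this (the previous step name and the Text.Trim wrapper are assumptions):

```powerquery
// Remove the symbol from the start of the name, using the
// symbol's length to decide how many characters to drop
Cleaned = Table.AddColumn(
    #"Removed Columns", "Clean Name",
    each Text.Trim(Text.RemoveRange([name], 0, Text.Length([symbol]))),
    type text
)
```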

Finalizing the Transformation

After apply­ing either solu­tion, we need to clean up our query:

  1. Rename the new­ly cre­at­ed col­umn to some­thing mean­ing­ful (like “name”)
  2. Delete the orig­i­nal name col­umn that con­tained the symbols
  3. If need­ed, restore any pre­vi­ous­ly removed columns by delet­ing the “Removed Columns” step
  4. Arrange columns in the desired order by drag­ging them

In my exam­ple, I also demon­strat­ed how to clean anoth­er col­umn (cir­cu­lat­ing sup­ply) that had a sim­i­lar issue but with a sim­pler pat­tern. Since all the unwant­ed sym­bols in this col­umn were pre­ced­ed by a space, I could use the Extract Text Before Delim­iter func­tion with a space as the delimiter.

Loading the Cleaned Data Back to Excel

Once all trans­for­ma­tions are com­plete, we can load the cleaned data back to Excel:

  1. Go to the Home tab in Pow­er Query
  2. Select Close & Load To
  3. Choose whether to load to a new work­sheet or an exist­ing location

In my demon­stra­tion, I loaded the data to cell A13 on the exist­ing work­sheet, which allows for easy com­par­i­son between the orig­i­nal and cleaned data.

Why This Approach Matters

The beau­ty of using Pow­er Query for this type of data clean­ing is that it cre­ates a repeat­able process. If your data source updates, you can sim­ply refresh the query, and all the clean­ing steps will be reap­plied auto­mat­i­cal­ly. This saves sig­nif­i­cant time com­pared to man­u­al clean­ing or using for­mu­las that would need to be copied down for each new set of data.

Addi­tion­al­ly, by lever­ag­ing infor­ma­tion from one col­umn to clean anoth­er, we cre­ate a more robust solu­tion that can han­dle vari­a­tions in the data for­mat with­out break­ing. This approach is far supe­ri­or to hard-cod­ing spe­cif­ic text val­ues or char­ac­ter counts that might change across dif­fer­ent datasets.

How to Remove Empty Columns in Excel with Power Query (Automated Method)


In this guide, I’ll show you how to auto­mat­i­cal­ly remove com­plete­ly emp­ty columns in your Excel data using Pow­er Query. This pow­er­ful tech­nique saves you from tedious man­u­al dele­tion and works even when columns con­tain null val­ues, emp­ty cells, and emp­ty text strings (rep­re­sent­ed by “”).

The approach I’ll demon­strate uses a spe­cif­ic Pow­er Query for­mu­la that iden­ti­fies and removes columns with no mean­ing­ful data, great­ly improv­ing your data clean­ing workflow.

Watch my detailed video walk­through of this process:

Understanding Empty Columns in Power Query

When working with data in Excel, we often encounter columns that appear empty but can actually contain different types of "emptiness" that Power Query recognizes differently. It's important to understand that Power Query distinguishes between null values (truly empty cells) and cells containing empty text strings ("").

An emp­ty text string occurs when a cell con­tains a for­mu­la that returns “” or when you paste such val­ues. These look emp­ty visu­al­ly in Excel, but Pow­er Query treats them dif­fer­ent­ly than cells with null val­ues. Our goal is to cre­ate a solu­tion that iden­ti­fies and removes columns regard­less of which type of “empti­ness” they con­tain, as long as the entire col­umn con­tains no mean­ing­ful data.

Importing Data into Power Query

To begin the process of removing empty columns, we first need to import our data into Power Query. Here's how to do this:

  1. Go to the Data tab in Excel
  2. Click on “Get Data” → “From File” → “From Excel”
  3. Locate and select your Excel file
  4. Choose the work­sheet con­tain­ing your data (in my exam­ple, Sheet1)

After import­ing, you’ll notice that some columns might appear emp­ty but con­tain dif­fer­ent types of empti­ness — some cells might have null val­ues (shown as “null” in Pow­er Query), while oth­ers might be tru­ly emp­ty or con­tain emp­ty text strings. Our code will han­dle all these sce­nar­ios automatically.

The Power Query Solution

Rather than man­u­al­ly delet­ing emp­ty columns, we’ll use a cus­tom for­mu­la in Pow­er Query that auto­mat­i­cal­ly iden­ti­fies and removes all emp­ty columns. This approach is much more effi­cient, espe­cial­ly when work­ing with large datasets.

The Complete Code

Here’s the com­plete Pow­er Query for­mu­la that we’ll use to remove emp­ty columns:

= Table.RemoveColumns(#"Changed Type", List.Select(Table.ColumnNames(#"Changed Type"), each List.Count(List.RemoveMatchingItems(Table.Column(#"Changed Type", _), {null, ""})) = 0))

To use this code in your own projects, you'll need to replace #"Changed Type" with the name of your previous step in Power Query. This reference appears in three places in the formula, so be sure to update all instances.

Breaking Down the Formula

Let’s break down this for­mu­la to under­stand how it works:

The for­mu­la starts with Table.RemoveColumns(), which is the func­tion that will even­tu­al­ly remove our emp­ty columns from the table. This func­tion takes two para­me­ters: the table we’re work­ing with, and a list of col­umn names to remove.

To gen­er­ate the list of col­umn names to remove, we use List.Select(), which fil­ters a list based on a con­di­tion. We’re fil­ter­ing the list of all col­umn names (obtained using Table.ColumnNames()) to find those that meet our “emp­ty col­umn” criteria.

The core log­ic hap­pens in this part:

List.Count(List.RemoveMatchingItems(Table.Column(#"Changed Type", _), {null, ""})) = 0

For each col­umn, we:

  1. Extract all val­ues in the col­umn using Table.Column()
  2. Remove any val­ues that are emp­ty strings or null using List.RemoveMatchingItems()
  3. Count how many items remain using List.Count()
  4. Check if this count equals zero (mean­ing the col­umn is com­plete­ly empty)

If the count equals zero, the col­umn is con­sid­ered emp­ty and will be includ­ed in our list of columns to remove.
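To see the core test in isolation, we can evaluate it against two sample columns:

```powerquery
// A column of only nulls and empty strings counts as empty
List.Count(List.RemoveMatchingItems({null, "", null}, {null, ""})) = 0  // true: remove it
// A single real value is enough to keep the column
List.Count(List.RemoveMatchingItems({null, "x", ""}, {null, ""})) = 0   // false: keep it
```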

Implementing the Solution

To imple­ment this solu­tion in your Pow­er Query Editor:

  1. Click the fx button next to the formula bar to add a custom step
  2. Paste the complete formula shown above into the formula bar
  3. Replace #"Changed Type" with the name of your previous step (typically the step after promoting headers)
  4. Press Enter to apply the transformation

You should now see your data with all emp­ty columns removed. The trans­for­ma­tion hap­pens imme­di­ate­ly, leav­ing only columns that con­tain at least one non-emp­ty value.

Testing the Solution

To ver­i­fy that our solu­tion works cor­rect­ly, we can test it with dif­fer­ent types of data. In my video, I demon­strat­ed this by:

  1. Starting with a dataset that contained several empty columns (with various types of emptiness)
  2. Applying our Power Query solution to remove these columns
  3. Modifying the source data to add a new column with a single value
  4. Refreshing the query to verify that the column with a value is preserved

The solu­tion cor­rect­ly iden­ti­fied and retained columns that had at least one mean­ing­ful val­ue, while remov­ing columns that were com­plete­ly empty.

Advanced Usage and Adaptations

This solution can be easily adapted to work with different datasets. The key is to ensure you're referencing the correct previous step in your Power Query transformation sequence. If your previous step is named differently, simply replace all instances of #"Changed Type" with the appropriate step name.

When work­ing with very large datasets, this auto­mat­ed approach can save sig­nif­i­cant time com­pared to man­u­al­ly iden­ti­fy­ing and remov­ing emp­ty columns. Addi­tion­al­ly, the solu­tion is dynam­ic — if your source data changes and a pre­vi­ous­ly emp­ty col­umn now con­tains val­ues, refresh­ing the query will auto­mat­i­cal­ly retain that column.

Loading the Results

After apply­ing our trans­for­ma­tion, we can load the results back to Excel:

  1. Click “Close & Load” (or “Close & Load To…” if you want to spec­i­fy a destination)
  2. Choose your pre­ferred des­ti­na­tion (work­sheet, data mod­el, etc.)
  3. The cleaned data, with emp­ty columns removed, will be loaded to your spec­i­fied location

The result­ing table will con­tain only mean­ing­ful columns that have at least one val­ue. This makes your data more con­cise and eas­i­er to work with.

When your source data changes, sim­ply refresh the query to apply the same trans­for­ma­tions to the updat­ed data. The emp­ty col­umn removal log­ic will be reap­plied auto­mat­i­cal­ly, adapt­ing to any changes in your source data’s structure.