Dynam­i­cal­ly remov­ing top and bot­tom rows in Pow­er Query can sig­nif­i­cant­ly stream­line your data clean­ing process when deal­ing with incon­sis­tent data imports. This tech­nique allows you to auto­mat­i­cal­ly elim­i­nate unnec­es­sary head­er rows or foot­er infor­ma­tion based on spe­cif­ic con­di­tions rather than fixed row counts, mak­ing your data trans­for­ma­tion process more robust and adapt­able to chang­ing source files.

The abil­i­ty to use con­di­tions rather than sta­t­ic num­bers is par­tic­u­lar­ly valu­able when work­ing with reg­u­lar­ly updat­ed reports that may con­tain vary­ing amounts of meta­da­ta or sum­ma­ry information.

In my video, I demon­strate how to imple­ment this dynam­ic row removal tech­nique that I learned at an Excel Lon­don Meetup:

Understanding the Problem with Static Row Removal

When import­ing data from text files, CSV files, or oth­er sources, you often encounter extra­ne­ous infor­ma­tion at the top and bot­tom of your dataset. These might include title rows, explana­to­ry notes, sum­ma­ry sta­tis­tics, or foot­er infor­ma­tion that aren’t part of the actu­al data you need to ana­lyze. Using Pow­er Query’s stan­dard “Remove Top Rows” or “Remove Bot­tom Rows” func­tions with a fixed num­ber works fine when your source data struc­ture nev­er changes, but becomes prob­lem­at­ic when the num­ber of these rows varies.

In the exam­ple shown in my video, we have mul­ti­ple rows of meta­da­ta before the actu­al head­er row (which con­tains “Date” as the first col­umn val­ue), and sev­er­al rows of addi­tion­al infor­ma­tion at the bot­tom of the data that need to be removed. There are also some miss­ing val­ues through­out the data that make sim­ple fil­ter­ing ineffective.

Dynamically Removing Top Rows Based on Conditions

The con­ven­tion­al approach to remov­ing top rows in Pow­er Query involves spec­i­fy­ing a fixed num­ber. How­ev­er, this can be prob­lem­at­ic when the num­ber of head­er rows changes. The dynam­ic solu­tion involves using a con­di­tion rather than a fixed count.

Here’s how to imple­ment this technique:

  1. Go to the Home tab in Pow­er Query Editor
  2. Select “Remove Rows” and then “Remove Top Rows”
  3. Instead of enter­ing a num­ber, mod­i­fy the for­mu­la to use the “each” key­word fol­lowed by a condition

The for­mu­la will look some­thing like this: each [Column1] <> “Date” (assum­ing “Date” is the head­er text in your first col­umn). This tells Pow­er Query to keep remov­ing rows until it finds a row where the first col­umn con­tains the text “Date”.

The Technical Details of the Table.Skip Function

Behind the scenes, Pow­er Query uses the Table.Skip func­tion when you remove top rows. This func­tion has a hid­den capa­bil­i­ty that isn’t obvi­ous from the user inter­face — it can accept either a count or a con­di­tion parameter.

When using a con­di­tion, the syn­tax changes from sim­ply pro­vid­ing a num­ber to using the format:

each [condition]

The con­di­tion is eval­u­at­ed for each row, start­ing from the top, and rows are removed until the con­di­tion is no longer true. This allows for dynam­ic adap­ta­tion to vary­ing source data structures.

Dynamically Removing Bottom Rows

Sim­i­lar­ly, we can apply the same con­cept to remove rows from the bot­tom of our dataset. This is par­tic­u­lar­ly use­ful when deal­ing with files that con­tain sum­ma­ry infor­ma­tion, notes, or oth­er foot­er data that should be exclud­ed from analysis.

The process for remov­ing bot­tom rows dynam­i­cal­ly is:

  1. Go to the Home tab in Pow­er Query Editor
  2. Select “Remove Rows” and then “Remove Bot­tom Rows”
  3. Replace the sta­t­ic num­ber with a con­di­tion using the “each” keyword

For exam­ple, you might use a for­mu­la like: each [Mer­chant] = “” to remove rows from the bot­tom where the Mer­chant col­umn con­tains an emp­ty text string. Or you might use each [Rev­enue] = null to remove rows where the Rev­enue col­umn con­tains null values.

Handling Different Types of Empty Values

When work­ing with bot­tom rows, it’s impor­tant to under­stand the dif­fer­ent types of emp­ty val­ues that might appear in your data:

  • Emp­ty text strings — rep­re­sent­ed by “” in formulas
  • Null val­ues — rep­re­sent­ed by null in formulas
  • Miss­ing val­ues — which might be null or emp­ty depend­ing on the data source

In the video demon­stra­tion, I show how to han­dle both emp­ty text strings and null val­ues as con­di­tions for remov­ing bot­tom rows. The key is to iden­ti­fy which col­umn and which type of emp­ty val­ue reli­ably indi­cates the foot­er sec­tion of your data.

Practical Applications and Benefits

This dynam­ic row removal tech­nique is par­tic­u­lar­ly valu­able in sev­er­al scenarios:

When work­ing with reg­u­lar­ly updat­ed reports where the struc­ture might change slight­ly between ver­sions, this approach ensures your Pow­er Query solu­tion remains robust. It’s also help­ful when deal­ing with data exports from sys­tems that include vary­ing amounts of meta­da­ta or when con­sol­i­dat­ing mul­ti­ple files that might have dif­fer­ent head­er structures.

The major ben­e­fits include:

  • More resilient data trans­for­ma­tion process­es that don’t break when source for­mats change slightly
  • Reduced need for man­u­al inter­ven­tion when pro­cess­ing new data
  • Abil­i­ty to han­dle files with incon­sis­tent struc­ture automatically
  • Greater flex­i­bil­i­ty com­pared to sta­t­ic row removal or sim­ple filtering

This tech­nique demon­strates the pow­er of M lan­guage in Pow­er Query, allow­ing for solu­tions that go beyond what’s imme­di­ate­ly avail­able in the user inter­face. By under­stand­ing and lever­ag­ing these more advanced capa­bil­i­ties, you can cre­ate more robust data trans­for­ma­tion processes.

Important Considerations

When imple­ment­ing this tech­nique, keep in mind a few impor­tant points:

The con­di­tion you use must reli­ably iden­ti­fy the bound­ary between the rows you want to keep and those you want to remove. Choose col­umn val­ues that are con­sis­tent­ly present (or con­sis­tent­ly absent) at these bound­aries. Also be aware that if your con­di­tion nev­er eval­u­ates to false, you could poten­tial­ly remove all rows from your dataset, so test­ing with rep­re­sen­ta­tive sam­ple data is essential.

Addi­tion­al­ly, remem­ber that this tech­nique works even when you have miss­ing val­ues in your actu­al data. As shown in the video, the rows are only removed when they match the spe­cif­ic con­di­tion you’ve defined, allow­ing rows with some miss­ing val­ues to be retained as long as they don’t match your removal condition.