Dynamically removing top and bottom rows in Power Query can significantly streamline your data cleaning process when dealing with inconsistent data imports. This technique allows you to automatically eliminate unnecessary header rows or footer information based on specific conditions rather than fixed row counts, making your data transformation process more robust and adaptable to changing source files.
The ability to use conditions rather than static numbers is particularly valuable when working with regularly updated reports that may contain varying amounts of metadata or summary information.
In my video, I demonstrate how to implement this dynamic row removal technique that I learned at an Excel London Meetup:
Understanding the Problem with Static Row Removal
When importing data from text files, CSV files, or other sources, you often encounter extraneous information at the top and bottom of your dataset. These might include title rows, explanatory notes, summary statistics, or footer information that aren’t part of the actual data you need to analyze. Using Power Query’s standard “Remove Top Rows” or “Remove Bottom Rows” functions with a fixed number works fine when your source data structure never changes, but becomes problematic when the number of these rows varies.
In the example shown in my video, we have multiple rows of metadata before the actual header row (which contains “Date” as the first column value), and several rows of additional information at the bottom of the data that need to be removed. There are also some missing values throughout the data that make simple filtering ineffective.
Dynamically Removing Top Rows Based on Conditions
The conventional approach to removing top rows in Power Query involves specifying a fixed number. However, this can be problematic when the number of header rows changes. The dynamic solution involves using a condition rather than a fixed count.
Here’s how to implement this technique:
Go to the Home tab in Power Query Editor
Select “Remove Rows” and then “Remove Top Rows”
Instead of entering a number, modify the formula to use the “each” keyword followed by a condition
The formula will look something like this: each [Column1] <> "Date" (assuming "Date" is the header text in your first column). This tells Power Query to keep removing rows until it finds a row where the first column contains the text "Date".
The Technical Details of the Table.Skip Function
Behind the scenes, Power Query uses the Table.Skip function when you remove top rows. This function has a hidden capability that isn’t obvious from the user interface — it can accept either a count or a condition parameter.
When using a condition, the second argument changes from a simple number to a function written with the each keyword:
each <condition>
Here <condition> stands for any expression that can be evaluated against a row, such as [Column1] <> "Date".
The condition is evaluated for each row, starting from the top, and rows are removed until the condition is no longer true. This allows for dynamic adaptation to varying source data structures.
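Put together, a Remove Top Rows step edited to use a condition might look like this in the formula bar (a sketch, assuming the previous step is named Source and the header text appears in Column1):

```
= Table.Skip(Source, each [Column1] <> "Date")
```

Power Query keeps skipping rows from the top for as long as the condition returns true, so the first row it keeps is the one where Column1 equals "Date".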
Dynamically Removing Bottom Rows
Similarly, we can apply the same concept to remove rows from the bottom of our dataset. This is particularly useful when dealing with files that contain summary information, notes, or other footer data that should be excluded from analysis.
The process for removing bottom rows dynamically is:
Go to the Home tab in Power Query Editor
Select “Remove Rows” and then “Remove Bottom Rows”
Replace the static number with a condition using the “each” keyword
For example, you might use a formula like: each [Merchant] = "" to remove rows from the bottom where the Merchant column contains an empty text string. Or you might use each [Revenue] = null to remove rows where the Revenue column contains null values.
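Behind the scenes this uses the Table.RemoveLastN function which, like Table.Skip, accepts a condition in place of a count. A sketch of such a step, assuming the previous step is named #"Removed Top Rows" and your data has a Merchant column:

```
= Table.RemoveLastN(#"Removed Top Rows", each [Merchant] = "")
```

Rows are removed from the bottom upwards for as long as the condition holds, so the genuine data rows above the footer section are left untouched.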
Handling Different Types of Empty Values
When working with bottom rows, it’s important to understand the different types of empty values that might appear in your data:
Empty text strings — represented by "" in formulas
Null values — represented by null in formulas
Missing values — which might be null or empty depending on the data source
In the video demonstration, I show how to handle both empty text strings and null values as conditions for removing bottom rows. The key is to identify which column and which type of empty value reliably indicates the footer section of your data.
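If you are not sure whether your footer rows contain nulls or empty strings, you can cover both cases in a single condition (the step and column names here are illustrative):

```
= Table.RemoveLastN(#"Removed Top Rows", each [Merchant] = "" or [Merchant] = null)
```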
Practical Applications and Benefits
This dynamic row removal technique is particularly valuable in several scenarios:
When working with regularly updated reports where the structure might change slightly between versions, this approach ensures your Power Query solution remains robust. It’s also helpful when dealing with data exports from systems that include varying amounts of metadata or when consolidating multiple files that might have different header structures.
The major benefits include:
More resilient data transformation processes that don’t break when source formats change slightly
Reduced need for manual intervention when processing new data
Ability to handle files with inconsistent structure automatically
Greater flexibility compared to static row removal or simple filtering
This technique demonstrates the power of M language in Power Query, allowing for solutions that go beyond what’s immediately available in the user interface. By understanding and leveraging these more advanced capabilities, you can create more robust data transformation processes.
Important Considerations
When implementing this technique, keep in mind a few important points:
The condition you use must reliably identify the boundary between the rows you want to keep and those you want to remove. Choose column values that are consistently present (or consistently absent) at these boundaries. Also be aware that if your condition never evaluates to false, you could potentially remove all rows from your dataset, so testing with representative sample data is essential.
Additionally, remember that this technique works even when you have missing values in your actual data. As shown in the video, the rows are only removed when they match the specific condition you’ve defined, allowing rows with some missing values to be retained as long as they don’t match your removal condition.
Importing fixed-width column data to Excel can be challenging when each column has a set number of characters, often resulting in unnecessary spaces. In this article, I’ll show you how to properly import fixed-width CSV files using Power Query, ensuring clean data without those problematic extra spaces.
This technique is essential for handling structured data exports from legacy systems or standardized reports.
Understanding Fixed-Width Column Data
Fixed-width column data is a specific format where each column occupies exactly the same number of characters in every row. Unlike delimited formats (like comma-separated values), fixed-width files allocate a predetermined number of spaces for each field. This means shorter values get padded with spaces to maintain the fixed width. For example, the date column might always be 10 characters wide, the merchant column 20 characters, and so on.
This format is common in older systems, government data, and standardized reports, but it presents unique challenges when importing to Excel. The main issue is that these extra spaces can cause problems with data analysis and visualization if not properly handled during the import process.
Importing Fixed-Width Data Using Power Query
Power Query provides excellent tools for handling fixed-width column data. Here’s the process to properly import this type of data:
Initial Import Steps
To begin importing fixed-width data, follow these steps:
Go to the Data tab in Excel
Click on “From Text/CSV” option
Locate and select your fixed-width file
When the Power Query Editor opens, you might need to adjust the window size to see all available options, as sometimes the third column of settings may be hidden if the window is too narrow. Make sure to expand the window to see all import settings.
Configuring Import Settings
During the import process, you’ll need to configure several important settings:
First, check the file origin (encoding) — UTF‑8 works well for most modern files. Then, verify that the delimiter is set correctly. For fixed-width files, Power Query typically automatically detects this format. You should see “Fixed Width” selected rather than comma, tab, or other delimiters.
The data type detection setting is another crucial choice. You have three options:
Based on first 200 rows
Based on entire dataset
Do not detect data types
In most cases with fixed-width data, it’s better to select “Do not detect data types” initially and handle data type conversion after cleaning the spaces. This prevents Power Query from making assumptions about your data types that might be affected by the extra spaces.
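For reference, when Power Query imports a file as fixed width, the generated Source step uses Csv.Document with a list of column start offsets rather than a delimiter character. A sketch, with an illustrative file path, offsets, and encoding (your generated arguments will differ):

```
= Csv.Document(
    File.Contents("C:\Data\report.txt"),
    3,              // number of columns
    {0, 10, 30},    // start offset of each fixed-width column
    ExtraValues.Ignore,
    65001           // UTF-8 encoding
)
```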
Handling Extra Spaces in Power Query
The key challenge with fixed-width data is managing the extra spaces that pad each column. When examining the data in Power Query, you’ll notice these extra spaces in both the column headers and text fields. These spaces are necessary in the original file to maintain the fixed-width format, but they’re problematic for analysis in Excel.
For example, when clicking on a column header, you might see something like “Date ” with several spaces after the actual name. Similarly, text fields like product names or merchant names will have trailing spaces to fill their allocated width.
Removing Extra Spaces
To clean up the data properly:
Select all columns in the Power Query Editor by clicking the first column header and then Shift-clicking the last column
Go to the Transform tab
Click on “Format” and select “Trim”
This important step removes all leading and trailing white spaces from every cell in your dataset while preserving spaces that are actually part of your data (like spaces between words in product names). After trimming the data, you’ll notice a significant improvement in how your data looks.
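In the formula bar, the generated step lists every selected column explicitly, but there is a shorter form that applies Text.Trim everywhere via the default-transformation parameter of Table.TransformColumns. A sketch, assuming the previous step is named Source:

```
= Table.TransformColumns(Source, {}, Text.Trim)
```

The empty list means no column gets a specific transformation, so the third argument, Text.Trim, is applied to every column as the default.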
Setting Headers and Data Types
After removing the extra spaces, you can proceed with configuring headers and data types:
Use the “Use First Row as Headers” option (found in the Transform tab or by right-clicking on the first row) to set your headers properly. Now that the spaces have been removed, your column names will appear clean and professional.
With clean data, you can now set appropriate data types for each column. For example, date columns should be formatted as dates, and revenue or cost columns as currency or decimal numbers. Power Query will help identify appropriate types, but you can manually change them by clicking the data type icon in each column header.
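In M, these two actions correspond to the Table.PromoteHeaders and Table.TransformColumnTypes steps. A sketch, with illustrative step and column names:

```
#"Promoted Headers" = Table.PromoteHeaders(#"Trimmed Text", [PromoteAllScalars=true]),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",
    {{"Date", type date}, {"Revenue", Currency.Type}, {"Cost", Currency.Type}})
```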
Loading Data to Excel
The final step is to load your cleaned data into Excel:
Click “Close & Load” (or “Close & Load To” for more options)
Select whether to load as a table or to a specific location
Choose an existing worksheet or create a new one
When loading the data, you have the option to add it to the data model if you plan to use it with Power Pivot, but this isn’t necessary for basic Excel analysis. One important thing to note is that while Power Query may detect currency data types, this doesn’t automatically apply Excel’s currency formatting to your table. You’ll need to apply appropriate number formatting in Excel after importing.
Fortunately, when you refresh your data in the future, Excel will maintain any number formatting you’ve applied to the table, making future updates seamless.
Benefits of Using Power Query for Fixed-Width Data
Using Power Query for importing fixed-width data offers several advantages over other methods:
The ability to transform data before loading it to Excel saves significant time compared to manual cleanup after import. The process is also repeatable — if you receive updated versions of the same file, you can simply refresh your query rather than repeating the entire import and cleanup process.
Additionally, Power Query maintains a record of all transformations applied, making your data preparation process transparent and easily adjustable if needed. This approach ensures consistency in how your fixed-width data is processed each time.
Power Query offers powerful solutions for cleaning data across columns by leveraging information contained in related columns. In this article, I’ll demonstrate how to efficiently remove cryptocurrency symbols from a name column when those symbols already exist in a separate column, even when the symbols have varying lengths.
This technique is particularly valuable when dealing with inconsistent data formats that require standardization for analysis.
Watch my step-by-step tutorial video to see these techniques in action:
Understanding the Data Cleaning Challenge
When working with financial data, especially cryptocurrency information, you’ll often encounter datasets where the same information appears redundantly across multiple columns. In my example, I have a table containing cryptocurrency data where the name column includes both the cryptocurrency name and its symbol (like “Bitcoin BTC”), while a separate symbol column already contains just the symbol (“BTC”).
The challenge lies in cleaning the name column by removing these redundant symbols. What makes this particularly tricky is that cryptocurrency symbols vary in length — some might be three characters (BTC), others four (DOGE) or more. This inconsistency means we can’t simply remove a fixed number of characters from the end of each name.
Preparing the Data in Power Query
To begin the cleaning process, we need to load our data into Power Query where we can apply the necessary transformations:
Navigate to the Data tab in Excel
Select your data range and choose to load it to Power Query
Remove any unnecessary columns to focus on the task at hand
In my case, I initially removed several columns to focus on just the key columns for this demonstration — the name column containing both cryptocurrency names and symbols, and the symbol column containing just the symbols. This simplified approach allows us to concentrate specifically on the cleaning technique without distraction.
Solution #1: Using Text After Delimiter
The first solution uses Power Query’s built-in functions to extract the relevant part of the text. This approach is more straightforward and uses the UI-based options:
To implement this solution:
Select the name column in Power Query
Go to the Add Column tab
Choose Extract and then Text After Delimiter
Initially, when using the UI, you might try entering a specific symbol like “BTC” as the delimiter. However, this would only work for one cryptocurrency. The real power comes from modifying the formula to use the symbol column as a reference.
After creating the initial formula, you’ll need to edit it in the formula bar to make it dynamic. Replace the hardcoded delimiter text with a reference to the symbol column. This way, Power Query will use each row’s corresponding symbol to extract just the cryptocurrency name, regardless of the symbol’s length.
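The edited step might look like this (a sketch; the column names Name and Symbol are illustrative, and this assumes the symbol precedes the name in the combined text, as in the video's data; if the symbol trails the name instead, Text.BeforeDelimiter works the same way):

```
= Table.AddColumn(#"Removed Columns", "Text After Delimiter",
    each Text.AfterDelimiter([Name], [Symbol]), type text)
```

Because the delimiter is now a column reference, each row uses its own symbol, whatever its length.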
Solution #2: Using Text.RemoveRange Function
The second solution is more complex but demonstrates the flexibility of Power Query’s M language. For this approach:
Go to the Add Column tab
Select Custom Column
Use the Text.RemoveRange function to precisely remove the symbol part
This function works by removing a specific range of characters from the text. It requires three parameters:
The source text (our name column)
The starting position (0‑based in Power Query)
The number of characters to remove (which we determine using the length of the symbol)
Remember that Power Query uses zero-based indexing, meaning the first character position is 0, not 1. This is crucial when working with text manipulation functions.
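A sketch of such a custom column, assuming the symbol sits at the end of the name and the columns are called Name and Symbol:

```
= Table.AddColumn(#"Removed Columns", "Clean Name",
    each Text.RemoveRange(
        [Name],
        Text.Length([Name]) - Text.Length([Symbol]),   // 0-based start position of the symbol
        Text.Length([Symbol])                          // number of characters to remove
    ), type text)
```

You may still want to trim the result afterwards if a separating space remains.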
Finalizing the Transformation
After applying either solution, we need to clean up our query:
Rename the newly created column to something meaningful (like “name”)
Delete the original name column that contained the symbols
If needed, restore any previously removed columns by deleting the “Removed Columns” step
Arrange columns in the desired order by dragging them
In my example, I also demonstrated how to clean another column (circulating supply) that had a similar issue but with a simpler pattern. Since all the unwanted symbols in this column were preceded by a space, I could use the Extract Text Before Delimiter function with a space as the delimiter.
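That simpler transformation can also be applied in place with Text.BeforeDelimiter (a sketch; the step and column names are illustrative):

```
= Table.TransformColumns(#"Extracted Text Before Delimiter",
    {{"Circulating Supply", each Text.BeforeDelimiter(_, " "), type text}})
```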
Loading the Cleaned Data Back to Excel
Once all transformations are complete, we can load the cleaned data back to Excel:
Go to the Home tab in Power Query
Select Close & Load To
Choose whether to load to a new worksheet or an existing location
In my demonstration, I loaded the data to cell A13 on the existing worksheet, which allows for easy comparison between the original and cleaned data.
Why This Approach Matters
The beauty of using Power Query for this type of data cleaning is that it creates a repeatable process. If your data source updates, you can simply refresh the query, and all the cleaning steps will be reapplied automatically. This saves significant time compared to manual cleaning or using formulas that would need to be copied down for each new set of data.
Additionally, by leveraging information from one column to clean another, we create a more robust solution that can handle variations in the data format without breaking. This approach is far superior to hard-coding specific text values or character counts that might change across different datasets.
In this guide, I’ll show you how to automatically remove completely empty columns in your Excel data using Power Query. This powerful technique saves you from tedious manual deletion and works even when columns contain null values, empty cells, and empty text strings (represented by "").
The approach I’ll demonstrate uses a specific Power Query formula that identifies and removes columns with no meaningful data, greatly improving your data cleaning workflow.
Watch my detailed video walkthrough of this process:
Understanding Empty Columns in Power Query
When working with data in Excel, we often encounter columns that appear empty but can actually contain different types of “emptiness” that Power Query recognizes differently. It’s important to understand that Power Query distinguishes between null values (truly empty cells) and cells containing empty text strings ("").
An empty text string occurs when a cell contains a formula that returns "" or when you paste such values. These look empty visually in Excel, but Power Query treats them differently than cells with null values. Our goal is to create a solution that identifies and removes columns regardless of which type of “emptiness” they contain, as long as the entire column contains no meaningful data.
Importing Data into Power Query
To begin the process of removing empty columns, we first need to import our data into Power Query. Here’s how to do this:
Go to the Data tab in Excel
Click on “Get Data” → “From File” → “From Excel”
Locate and select your Excel file
Choose the worksheet containing your data (in my example, Sheet1)
After importing, you’ll notice that some columns might appear empty but contain different types of emptiness — some cells might have null values (shown as “null” in Power Query), while others might be truly empty or contain empty text strings. Our code will handle all these scenarios automatically.
The Power Query Solution
Rather than manually deleting empty columns, we’ll use a custom formula in Power Query that automatically identifies and removes all empty columns. This approach is much more efficient, especially when working with large datasets.
The Complete Code
Here’s the complete Power Query formula that we’ll use to remove empty columns:
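This is a reconstruction of the formula as described in the breakdown below; it assumes your previous step is named #"Change Type", so adjust that name to match your own query:

```
= Table.RemoveColumns(
    #"Change Type",
    List.Select(
        Table.ColumnNames(#"Change Type"),
        // a column is "empty" if nothing remains after
        // discarding empty strings and nulls
        each List.Count(
            List.RemoveMatchingItems(
                Table.Column(#"Change Type", _),
                {"", null}
            )
        ) = 0
    )
)
```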
To use this code in your own projects, you’ll need to replace #"Change Type" with the name of your previous step in Power Query. This reference appears in three places in the formula, so be sure to update all instances.
Breaking Down the Formula
Let’s break down this formula to understand how it works:
The formula starts with Table.RemoveColumns(), which is the function that will eventually remove our empty columns from the table. This function takes two parameters: the table we’re working with, and a list of column names to remove.
To generate the list of column names to remove, we use List.Select(), which filters a list based on a condition. We’re filtering the list of all column names (obtained using Table.ColumnNames()) to find those that meet our “empty column” criteria. For each column name, the condition performs these steps:
Extract all values in the column using Table.Column()
Remove any values that are empty strings or null using List.RemoveMatchingItems()
Count how many items remain using List.Count()
Check if this count equals zero (meaning the column is completely empty)
If the count equals zero, the column is considered empty and will be included in our list of columns to remove.
Implementing the Solution
To implement this solution in your Power Query Editor:
Click the fx button next to the formula bar (or right-click the last applied step and choose “Insert Step After”) to create a new transformation step
In the formula bar, paste the complete code shown above
Replace #"Change Type" with the name of your previous step (typically the step after promoting headers)
Press Enter to apply the transformation
You should now see your data with all empty columns removed. The transformation happens immediately, leaving only columns that contain at least one non-empty value.
Testing the Solution
To verify that our solution works correctly, we can test it with different types of data. In my video, I demonstrated this by:
1. Starting with a dataset that contained several empty columns (with various types of emptiness)
2. Applying our Power Query solution to remove these columns
3. Modifying the source data to add a new column with a single value
4. Refreshing the query to verify that the column with a value is preserved
The solution correctly identified and retained columns that had at least one meaningful value, while removing columns that were completely empty.
Advanced Usage and Adaptations
This solution can be easily adapted to work with different datasets. The key is to ensure you’re referencing the correct previous step in your Power Query transformation sequence. If your previous step is named differently, simply replace all instances of #"Change Type" with the appropriate step name.
When working with very large datasets, this automated approach can save significant time compared to manually identifying and removing empty columns. Additionally, the solution is dynamic — if your source data changes and a previously empty column now contains values, refreshing the query will automatically retain that column.
Loading the Results
After applying our transformation, we can load the results back to Excel:
Click “Close & Load” (or “Close & Load To…” if you want to specify a destination)
Choose your preferred destination (worksheet, data model, etc.)
The cleaned data, with empty columns removed, will be loaded to your specified location
The resulting table will contain only meaningful columns that have at least one value. This makes your data more concise and easier to work with.
When your source data changes, simply refresh the query to apply the same transformations to the updated data. The empty column removal logic will be reapplied automatically, adapting to any changes in your source data’s structure.