How to Import Fixed-Width Files to Excel

How to Import Fixed-Width Files to Excel

Import­ing fixed-width col­umn data to Excel can be chal­leng­ing when each col­umn has a set num­ber of char­ac­ters, often result­ing in unnec­es­sary spaces. In this arti­cle, I’ll show you how to prop­er­ly import fixed-width CSV files using Pow­er Query, ensur­ing clean data with­out those prob­lem­at­ic extra spaces.

This tech­nique is essen­tial for han­dling struc­tured data exports from lega­cy sys­tems or stan­dard­ized reports.

Understanding Fixed-Width Column Data

Fixed-width col­umn data is a spe­cif­ic for­mat where each col­umn occu­pies exact­ly the same num­ber of char­ac­ters in every row. Unlike delim­it­ed for­mats (like com­ma-sep­a­rat­ed val­ues), fixed-width files allo­cate a pre­de­ter­mined num­ber of spaces for each field. This means short­er val­ues get padded with spaces to main­tain the fixed width. For exam­ple, the date col­umn might always be 10 char­ac­ters wide, the mer­chant col­umn 20 char­ac­ters, and so on.

This for­mat is com­mon in old­er sys­tems, gov­ern­ment data, and stan­dard­ized reports, but it presents unique chal­lenges when import­ing to Excel. The main issue is that these extra spaces can cause prob­lems with data analy­sis and visu­al­iza­tion if not prop­er­ly han­dled dur­ing the import process.

Importing Fixed-Width Data Using Power Query

Pow­er Query pro­vides excel­lent tools for han­dling fixed-width col­umn data. Here’s the process to prop­er­ly import this type of data:

Initial Import Steps

To begin import­ing fixed-width data, fol­low these steps:

  1. Go to the Data tab in Excel
  2. Click on “From Text/CSV” option
  3. Locate and select your fixed-width file

When the Pow­er Query Edi­tor opens, you might need to adjust the win­dow size to see all avail­able options, as some­times the third col­umn of set­tings may be hid­den if the win­dow is too nar­row. Make sure to expand the win­dow to see all import settings.

Configuring Import Settings

Dur­ing the import process, you’ll need to con­fig­ure sev­er­al impor­tant settings:

First, check the file ori­gin (encod­ing) — UTF‑8 works well for most mod­ern files. Then, ver­i­fy that the delim­iter is set cor­rect­ly. For fixed-width files, Pow­er Query typ­i­cal­ly auto­mat­i­cal­ly detects this for­mat. You should see “Fixed Width” select­ed rather than com­ma, tab, or oth­er delimiters.

The data type detec­tion set­ting is anoth­er cru­cial choice. You have three options:

  • Based on first 200 rows
  • Based on entire dataset
  • Do not detect data types

In most cas­es with fixed-width data, it’s bet­ter to select “Do not detect data types” ini­tial­ly and han­dle data type con­ver­sion after clean­ing the spaces. This pre­vents Pow­er Query from mak­ing assump­tions about your data types that might be affect­ed by the extra spaces.

Handling Extra Spaces in Power Query

The key chal­lenge with fixed-width data is man­ag­ing the extra spaces that pad each col­umn. When exam­in­ing the data in Pow­er Query, you’ll notice these extra spaces in both the col­umn head­ers and text fields. These spaces are nec­es­sary in the orig­i­nal file to main­tain the fixed-width for­mat, but they’re prob­lem­at­ic for analy­sis in Excel.

For exam­ple, when click­ing on a col­umn head­er, you might see some­thing like “Date ” with sev­er­al spaces after the actu­al name. Sim­i­lar­ly, text fields like prod­uct names or mer­chant names will have trail­ing spaces to fill their allo­cat­ed width.

Removing Extra Spaces

To clean up the data properly:

  1. Select all columns in the Pow­er Query Edi­tor by click­ing the first col­umn head­er and then Shift-click­ing the last column
  2. Go to the Trans­form tab
  3. Click on “For­mat” and select “Trim”

This impor­tant step removes all lead­ing and trail­ing white spaces from every cell in your dataset while pre­serv­ing spaces that are actu­al­ly part of your data (like spaces between words in prod­uct names). After trim­ming the data, you’ll notice a sig­nif­i­cant improve­ment in how your data looks.

Setting Headers and Data Types

After remov­ing the extra spaces, you can pro­ceed with con­fig­ur­ing head­ers and data types:

Use the “Use First Row as Head­ers” option (found in the Trans­form tab or by right-click­ing on the first row) to set your head­ers prop­er­ly. Now that the spaces have been removed, your col­umn names will appear clean and professional.

With clean data, you can now set appro­pri­ate data types for each col­umn. For exam­ple, date columns should be for­mat­ted as dates, and rev­enue or cost columns as cur­ren­cy or dec­i­mal num­bers. Pow­er Query will help iden­ti­fy appro­pri­ate types, but you can man­u­al­ly change them by click­ing the data type icon in each col­umn header.

Loading Data to Excel

The final step is to load your cleaned data into Excel:

  1. Click “Close & Load” (or “Close & Load To” for more options)
  2. Select whether to load as a table or to a spe­cif­ic location
  3. Choose an exist­ing work­sheet or cre­ate a new one

When load­ing the data, you have the option to add it to the data mod­el if you plan to use it with Pow­er Piv­ot, but this isn’t nec­es­sary for basic Excel analy­sis. One impor­tant thing to note is that while Pow­er Query may detect cur­ren­cy data types, this does­n’t auto­mat­i­cal­ly apply Excel’s cur­ren­cy for­mat­ting to your table. You’ll need to apply appro­pri­ate num­ber for­mat­ting in Excel after importing.

For­tu­nate­ly, when you refresh your data in the future, Excel will main­tain any num­ber for­mat­ting you’ve applied to the table, mak­ing future updates seamless.

Benefits of Using Power Query for Fixed-Width Data

Using Pow­er Query for import­ing fixed-width data offers sev­er­al advan­tages over oth­er methods:

The abil­i­ty to trans­form data before load­ing it to Excel saves sig­nif­i­cant time com­pared to man­u­al cleanup after import. The process is also repeat­able — if you receive updat­ed ver­sions of the same file, you can sim­ply refresh your query rather than repeat­ing the entire import and cleanup process.

Addi­tion­al­ly, Pow­er Query main­tains a record of all trans­for­ma­tions applied, mak­ing your data prepa­ra­tion process trans­par­ent and eas­i­ly adjustable if need­ed. This approach ensures con­sis­ten­cy in how your fixed-width data is processed each time.

Merge Tables with Summarized Data for Cleaner Reports

Merge Tables with Summarized Data for Cleaner Reports

Merg­ing tables in Pow­er Query with sum­ma­rized data pro­vides a pow­er­ful way to con­sol­i­date infor­ma­tion with­out over­whelm­ing detail. By aggre­gat­ing invoice details such as counts, sums, aver­ages, and medi­ans, you can cre­ate con­cise reports that high­light key insights while main­tain­ing the orig­i­nal table structure.

This tech­nique is par­tic­u­lar­ly valu­able when deal­ing with sales data where each invoice con­tains mul­ti­ple line items that need to be condensed.

In my video tuto­r­i­al, I demon­strate how to effec­tive­ly merge tables with sum­ma­riza­tion in Pow­er Query:

Understanding the Problem: Merging Tables with Summarization

When work­ing with rela­tion­al data like invoic­es and their detailed items, we often need to com­bine infor­ma­tion from mul­ti­ple tables. How­ev­er, sim­ply merg­ing tables can result in redun­dant data and unnec­es­sary com­plex­i­ty. The chal­lenge is to merge data while simul­ta­ne­ous­ly sum­ma­riz­ing the details to pro­vide mean­ing­ful insights.

In this demon­stra­tion, I work with two tables: one con­tain­ing basic invoice infor­ma­tion and anoth­er with detailed line items for each invoice. The goal is to enhance the invoice table with sum­ma­rized infor­ma­tion from the details table, such as the count of items, aver­age quan­ti­ties, sum of prices, and even sta­tis­ti­cal mea­sures like medians.

Loading Tables into Power Query

The first step in our process is to load both tables into Pow­er Query for manip­u­la­tion. This gives us access to the pow­er­ful data trans­for­ma­tion capa­bil­i­ties that allow for sophis­ti­cat­ed merg­ing and summarization.

To begin, I nav­i­gate to the Data tab in Excel and select “From Table/Range” to import our first table into Pow­er Query. For this ini­tial details table, I choose to load it only as a con­nec­tion rather than import­ing it direct­ly into the work­sheet. This is done by select­ing “Close and Load To” and then choos­ing the “Con­nec­tion only” option.

After load­ing the details table, I fol­low the same process to load the invoice table. This time, how­ev­er, we’ll per­form our merge oper­a­tion direct­ly with­in this query rather than cre­at­ing a new one.

Merging the Tables

With both tables loaded into Pow­er Query, we can now merge them using the com­mon invoice ID field. This is where the pow­er of our approach becomes evident.

To per­form the merge:

  1. Select the “Merge Queries” com­mand from the ribbon
  2. Choose the sec­ond table (details) to merge with
  3. Select match­ing columns for merg­ing (invoice ID in both tables)
  4. Choose “Left Out­er Join” as the join kind to ensure all invoic­es are included

After press­ing OK, Pow­er Query cre­ates a new col­umn in our invoice table that con­tains the match­ing rows from the details table. You’ll notice that for each invoice, we now have a nest­ed table show­ing all the cor­re­spond­ing detail rows. For exam­ple, some invoic­es might have two detail rows, oth­ers three, and some even five or more.

The Key Difference: Aggregating Instead of Expanding

The crit­i­cal step that dif­fer­en­ti­ates this approach from a stan­dard merge is what we do next. Instead of expand­ing these nest­ed tables (which would cre­ate dupli­cate invoice rows), we want to aggre­gate the infor­ma­tion they contain.

When we click on the expand but­ton (the small tri­an­gle) in the col­umn head­er, we get options to expand or aggre­gate. By choos­ing to aggre­gate, we can per­form cal­cu­la­tions across all the detail rows for each invoice.

Creating Summarized Metrics

The pow­er of this tech­nique lies in the vari­ety of aggre­ga­tions we can per­form on the detailed data. For our invoice exam­ple, I demon­strate sev­er­al use­ful metrics:

  • Count of items: Sim­ple count of how many line items exist for each invoice
  • Aver­age quan­ti­ty: Instead of sum­ming quan­ti­ties, we cal­cu­late the aver­age num­ber of items per line
  • Sum of unit price: Total val­ue of the invoice
  • Medi­an of unit price: Sta­tis­ti­cal mea­sure show­ing the mid­dle val­ue of all unit prices on the invoice

To cre­ate these aggre­ga­tions, I select the appro­pri­ate options from the drop­down menu when expand­ing the nest­ed table col­umn. By uncheck­ing “Use orig­i­nal col­umn name as pre­fix,” we get clean­er col­umn names in our results.

Refining the Results

After cre­at­ing our aggre­gat­ed columns, we may want to refine the col­umn names for clar­i­ty. This can be done direct­ly in the for­mu­la bar in Pow­er Query. For exam­ple, I rename one of the columns to “Count” to bet­ter reflect its purpose.

It’s also impor­tant to check the data types of our new columns. Pow­er Query auto­mat­i­cal­ly assigns data types, but they may need adjust­ment. In our case, the “Count” col­umn is cor­rect­ly set as a dec­i­mal num­ber, which works well since counts will always be whole numbers.

When work­ing with mon­e­tary val­ues like the sum of unit prices, Pow­er Query may pre­serve the cur­ren­cy data type. How­ev­er, it’s worth not­ing that this data type infor­ma­tion does­n’t auto­mat­i­cal­ly trans­late to for­mat­ting in Excel. We’ll need to apply cur­ren­cy for­mat­ting in Excel after load­ing the data.

Loading the Final Results

Once we’re sat­is­fied with our merged and sum­ma­rized table, we can load it back to Excel. From the Home tab in Pow­er Query, I select “Close and Load To” and choose to cre­ate a new table in the exist­ing worksheet.

The final result is an enhanced invoice table that includes aggre­gat­ed infor­ma­tion from the details table with­out dupli­cat­ing invoice records. This gives us a con­cise view of our data with valu­able insights such as the num­ber of items per invoice, aver­age quan­ti­ties, total val­ues, and sta­tis­ti­cal measures.

This tech­nique is par­tic­u­lar­ly valu­able for finan­cial analy­sis, sales report­ing, and any sit­u­a­tion where you need to com­bine mas­ter records with sum­ma­rized infor­ma­tion from relat­ed detail records. By lever­ag­ing Pow­er Query’s merg­ing and aggre­ga­tion capa­bil­i­ties, we can cre­ate more infor­ma­tive reports while main­tain­ing a clean, non-redun­dant data structure.

How to Create Sales Summaries with Power Query

How to Create Sales Summaries with Power Query

Sales data analy­sis often requires sum­ma­riz­ing results by spe­cif­ic cat­e­gories to gain valu­able busi­ness insights. In this arti­cle, I demon­strate how to effi­cient­ly cal­cu­late sales sum­maries using Pow­er Query’s group­ing func­tion­al­i­ty, pro­vid­ing a pow­er­ful alter­na­tive to tra­di­tion­al Excel for­mu­las for ana­lyz­ing mer­chant per­for­mance, income totals, and prod­uct distribution.

The group­ing tech­nique shown allows for mul­ti­ple aggre­ga­tions in a sin­gle oper­a­tion, dra­mat­i­cal­ly improv­ing effi­cien­cy when work­ing with larg­er datasets.

Watch my detailed tuto­r­i­al video below to fol­low along with each step:

Understanding the Sales Summary Challenge

When ana­lyz­ing sales data, we often need to cre­ate sum­maries that show per­for­mance met­rics grouped by spe­cif­ic cat­e­gories. In this demon­stra­tion, I’m work­ing with a dataset con­tain­ing sales trans­ac­tions that include mer­chants, prod­ucts, dates, and income val­ues. My goal is to sum­ma­rize this data to show sev­er­al key met­rics for each merchant:

  • Count of sales transactions
  • Sum of income
  • Aver­age income per transaction
  • List of unique prod­ucts sold

While these cal­cu­la­tions can be done using Excel for­mu­las (which I’ve pre­pared in the work­sheet for com­par­i­son), using Pow­er Query offers sig­nif­i­cant advan­tages, espe­cial­ly for larg­er datasets or repet­i­tive analy­sis tasks.

Importing Data into Power Query

The first step in our process is to bring the data into Pow­er Query where we can per­form the group­ing operations:

  1. Select the data table in Excel
  2. Nav­i­gate to the Data tab in the Excel ribbon
  3. Click “From Table/Range” to import the data into Pow­er Query

Once the data is loaded into the Pow­er Query Edi­tor, we can see all our sales records and begin the group­ing process to cre­ate our summary.

Creating Groups in Power Query

To sum­ma­rize our data by mer­chant, we need to use the Group By func­tion in Pow­er Query:

  1. Select the Mer­chant col­umn in the Pow­er Query Editor
  2. Go to the Home tab and click “Group By”

The default Group By dia­log offers basic func­tion­al­i­ty with just one col­umn for group­ing and a sin­gle aggre­ga­tion. For our needs, we require the Advanced option since we want to cre­ate mul­ti­ple aggre­ga­tions based on the mer­chant column.

Setting Up Advanced Grouping

In the advanced Group By dia­log, we need to con­fig­ure mul­ti­ple aggregations:

  1. Set the group­ing col­umn to Mer­chant
  2. Add the fol­low­ing aggregations: 
    • “Count Rows” for the num­ber of sales (named “Count”)
    • “Sum” of the Income col­umn (named “Sum”)
    • “Aver­age” of the Income col­umn (named “Aver­age”)
    • “All Rows” to cap­ture all relat­ed data (named “List”)

After click­ing OK, Pow­er Query cre­ates a new table with our mer­chant groups and the spec­i­fied aggre­ga­tions. The Count, Sum, and Aver­age val­ues match the Excel for­mu­las we pre­pared ear­li­er, con­firm­ing that our group­ing is work­ing correctly.

Customizing the Product List

The “All Rows” aggre­ga­tion cre­ates a nest­ed table for each mer­chant con­tain­ing all their sales records. How­ev­er, we only want a list of unique prod­ucts sold by each mer­chant, not all the row data. Since the Group By dia­log does­n’t offer this spe­cif­ic option, we need to mod­i­fy the M for­mu­la directly.

Modifying the M Formula

To cus­tomize our prod­uct list, we need to edit the M code that Pow­er Query generated:

  1. The default “All Rows” aggre­ga­tion pro­duces tables with all columns for each merchant
  2. We need to change this to extract only the Prod­uct col­umn values
  3. Fur­ther­more, we need to elim­i­nate dupli­cates from these lists

The orig­i­nal for­mu­la includes {_} which ref­er­ences the entire row. We need to replace this with a ref­er­ence to just the Prod­uct col­umn: [Prod­uct]. Addi­tion­al­ly, we need to wrap this in the List.Distinct func­tion to remove duplicates:

The mod­i­fied for­mu­la sec­tion should look like: List.Distinct([Product])

Transforming the Product List into Text

Now that we have lists of unique prod­ucts for each mer­chant, we can trans­form these lists into com­ma-sep­a­rat­ed text val­ues for bet­ter readability:

  1. Click on the expand but­ton in the List col­umn header
  2. Select “Extract Val­ues…” rather than “Expand to New Rows”
  3. Choose a cus­tom delim­iter (com­ma fol­lowed by space: “, ”)
  4. Click OK

This trans­forms our prod­uct lists into read­able text strings show­ing all the unique prod­ucts each mer­chant has sold, sep­a­rat­ed by commas.

Loading the Results Back to Excel

Once we’ve cre­at­ed our sum­ma­ry with all the required met­rics, we can load the results back to Excel:

  1. Go to the Home tab and click “Close & Load To…”
  2. Select “Exist­ing Work­sheet” and choose where to place the results
  3. Click OK

After load­ing, we can for­mat the Sum and Aver­age columns as cur­ren­cy to improve read­abil­i­ty. The final result is a clean sum­ma­ry table show­ing the count of sales, total income, aver­age income, and unique prod­ucts for each merchant.

Benefits of Using Power Query for Grouping

While our exam­ple used a rel­a­tive­ly small dataset, the advan­tages of using Pow­er Query for this type of analy­sis become even more appar­ent with larg­er data volumes:

  • Effi­cien­cy — Per­forms cal­cu­la­tions much faster than Excel for­mu­las for large datasets
  • Repeata­bil­i­ty — The query can be refreshed when data changes with­out redo­ing formulas
  • Flex­i­bil­i­ty — Allows for com­plex aggre­ga­tions and trans­for­ma­tions that would be dif­fi­cult with Excel formulas
  • Data clean­ing — Inte­grates with oth­er Pow­er Query trans­for­ma­tions for com­pre­hen­sive data preparation

By mas­ter­ing this group­ing tech­nique in Pow­er Query, you can cre­ate sophis­ti­cat­ed sales sum­maries and oth­er data aggre­ga­tions with just a few clicks, sav­ing sig­nif­i­cant time and effort com­pared to tra­di­tion­al Excel methods.

Automate Student Exam Grading with Excel Power Query

Automate Student Exam Grading with Excel Power Query

Using append­ing in Pow­er Query offers a stream­lined approach to auto­mat­i­cal­ly grade stu­dent exams based on pre­de­fined thresh­olds. This method elim­i­nates man­u­al grad­ing errors and saves sig­nif­i­cant time when pro­cess­ing large sets of stu­dent scores, pro­vid­ing an effi­cient way to man­age aca­d­e­m­ic assessments.

The tech­nique com­bines stu­dent scores with grade thresh­olds to gen­er­ate accu­rate results that can be eas­i­ly updat­ed when grad­ing cri­te­ria change.

Watch my step-by-step tuto­r­i­al on how to imple­ment this grad­ing sys­tem using Pow­er Query:

Understanding the Grading Process with Power Query

Pow­er Query’s append­ing fea­ture allows us to com­bine mul­ti­ple tables in a way that cre­ates rela­tion­ships between stu­dent scores and grade thresh­olds. The append­ing process essen­tial­ly stacks tables on top of each oth­er, cre­at­ing a sin­gle dataset that we can manip­u­late to auto­mat­i­cal­ly assign grades. This approach is par­tic­u­lar­ly use­ful in edu­ca­tion­al set­tings where instruc­tors need to process mul­ti­ple exam results efficiently.

The pri­ma­ry advan­tage of using this method is that when grad­ing cri­te­ria change, you don’t need to man­u­al­ly reas­sign grades to each stu­dent. Sim­ply update the thresh­old val­ues in your source table, refresh the query, and all grades will auto­mat­i­cal­ly update to reflect the new standards.

Required Resources

Before start­ing the grad­ing process, you’ll need two essen­tial tables:

  • A table con­tain­ing stu­dent names and their exam points
  • A table defin­ing grade thresh­olds (what point ranges cor­re­spond to which let­ter grades)

These tables will serve as the foun­da­tion for our auto­mat­ed grad­ing sys­tem. The struc­ture of these tables does­n’t need to be complex—just clear­ly orga­nized with appro­pri­ate head­ers to iden­ti­fy the data.

Setting Up the Tables in Power Query

To begin imple­ment­ing the grad­ing sys­tem, we need to import both tables into Pow­er Query. This process allows us to manip­u­late and com­bine the data efficiently.

First, with the stu­dent scores table select­ed, nav­i­gate to the Data tab in Excel. From there, select “From Table/Range” to import the first table into Pow­er Query. When prompt­ed to load the data, choose “Cre­ate Con­nec­tion Only” as we’ll be manip­u­lat­ing this data fur­ther before bring­ing it back to Excel.

Repeat the same process for your grade thresh­olds table. Hav­ing both tables acces­si­ble in Pow­er Query is essen­tial for the next step where we’ll com­bine them through appending.

Appending the Tables

Once both tables are in Pow­er Query, we can com­bine them using the append function:

On the Home tab with­in Pow­er Query Edi­tor, click the Append Queries but­ton (not “Append Queries as New”). In the dia­log box that appears, con­firm that you want to append just the two tables you’ve import­ed. Select the stu­dent table as your pri­ma­ry table and the grades thresh­old table as the table to append.

After append­ing, you’ll notice that the result­ing table con­tains all rows from both orig­i­nal tables. The stu­dent data appears first, fol­lowed by the thresh­old data. This com­bined dataset gives us every­thing we need to assign grades automatically.

Organizing and Sorting the Data

For our grad­ing sys­tem to work prop­er­ly, we need to sort the append­ed data by the num­ber of points. This sort­ing is a cru­cial step because it allows us to arrange scores in ascend­ing order, mak­ing it eas­i­er to assign appro­pri­ate grades.

Click on the col­umn con­tain­ing the points/scores and sort from A to Z (ascend­ing). This action arranges all scores from low­est to high­est, with the grade thresh­olds inte­grat­ed among the stu­dent scores based on their point values.

Because of how the data is now arranged, the grade thresh­olds appear at posi­tions where they nat­u­ral­ly divide the stu­dent scores into dif­fer­ent grade categories.

Assigning Grades Automatically

With our data prop­er­ly sort­ed, we can now use a sim­ple but pow­er­ful tech­nique to assign grades to each stu­dent. The key to this process is the Fill Down com­mand in Pow­er Query.

The grade col­umn ini­tial­ly has val­ues only in the rows that came from our thresh­old table. To assign these grades to stu­dents, select the grade col­umn and nav­i­gate to the Trans­form tab. Click on “Fill” and then “Down.” This action copies each grade down to all rows below it until anoth­er grade is encountered.

As a result, each stu­dent row is auto­mat­i­cal­ly assigned the grade cor­re­spond­ing to their score based on the defined thresh­olds. For exam­ple, if the thresh­old for grade C is 40 points, stu­dents with 40 or more points (but below the next thresh­old) will be assigned a C.

Filtering and Finalizing the Results

After fill­ing down the grades, we need to fil­ter out the thresh­old rows to see only the stu­dent results:

Click the fil­ter but­ton on the col­umn con­tain­ing stu­dent names and select “Fil­ter out emp­ty val­ues” or specif­i­cal­ly select only the rows con­tain­ing actu­al stu­dent data. This leaves us with just the stu­dent infor­ma­tion and their cor­re­spond­ing grades.

With the data fil­tered and grades assigned, we can now load this infor­ma­tion back to Excel. Nav­i­gate to the Home tab and select “Close & Load To.” Choose where you want to place the results in your workbook.

Updating Grades When Thresholds Change

One of the most pow­er­ful aspects of this approach is how eas­i­ly you can adjust grad­ing criteria:

If you need to change the grad­ing thresh­olds, sim­ply update the val­ues in your orig­i­nal thresh­old table. For exam­ple, you might change the min­i­mum score for an A from 90 to 88 points. After mak­ing these changes, save your file and then refresh the query (right-click on your results table and select “Refresh”).

The entire grad­ing process will auto­mat­i­cal­ly run again with the new thresh­olds, updat­ing all stu­dent grades accord­ing­ly. This makes the sys­tem extreme­ly flex­i­ble and saves sig­nif­i­cant time when adjust­ments are needed.

Real-World Application Example

In my video demon­stra­tion, I showed how this sys­tem works with a sam­ple dataset. For instance, Jen­nifer earned 13 points which result­ed in an F grade, Robert received 34 points also result­ing in an F, while Olivia scored 40 points which met the thresh­old for an E grade.

After chang­ing the thresh­olds and refresh­ing the query, the grades updat­ed auto­mat­i­cal­ly to reflect the new stan­dards. This dynam­ic updat­ing capa­bil­i­ty is what makes this approach so valu­able for edu­ca­tors who may need to adjust grad­ing scales based on class per­for­mance or oth­er factors.

How to Clean one column based on the other column

How to Clean one column based on the other column

Pow­er Query offers pow­er­ful solu­tions for clean­ing data across columns by lever­ag­ing infor­ma­tion con­tained in relat­ed columns. In this arti­cle, I’ll demon­strate how to effi­cient­ly remove cryp­tocur­ren­cy sym­bols from a name col­umn when those sym­bols already exist in a sep­a­rate col­umn, even when the sym­bols have vary­ing lengths.

This tech­nique is par­tic­u­lar­ly valu­able when deal­ing with incon­sis­tent data for­mats that require stan­dard­iza­tion for analysis.

Watch my step-by-step tuto­r­i­al video to see these tech­niques in action:

Understanding the Data Cleaning Challenge

When work­ing with finan­cial data, espe­cial­ly cryp­tocur­ren­cy infor­ma­tion, you’ll often encounter datasets where the same infor­ma­tion appears redun­dant­ly across mul­ti­ple columns. In my exam­ple, I have a table con­tain­ing cryp­tocur­ren­cy data where the name col­umn includes both the cryp­tocur­ren­cy name and its sym­bol (like “Bit­coin BTC”), while a sep­a­rate sym­bol col­umn already con­tains just the sym­bol (“BTC”).

The chal­lenge lies in clean­ing the name col­umn by remov­ing these redun­dant sym­bols. What makes this par­tic­u­lar­ly tricky is that cryp­tocur­ren­cy sym­bols vary in length — some might be three char­ac­ters (BTC), oth­ers four (DOGE) or more. This incon­sis­ten­cy means we can’t sim­ply remove a fixed num­ber of char­ac­ters from the end of each name.

Preparing the Data in Power Query

To begin the clean­ing process, we need to load our data into Pow­er Query where we can apply the nec­es­sary transformations:

  1. Nav­i­gate to the Data tab in Excel
  2. Select your data range and choose to load it to Pow­er Query
  3. Remove any unnec­es­sary columns to focus on the task at hand

In my case, I ini­tial­ly removed sev­er­al columns to focus on just the key columns for this demon­stra­tion — the name col­umn con­tain­ing both cryp­tocur­ren­cy names and sym­bols, and the sym­bol col­umn con­tain­ing just the sym­bols. This sim­pli­fied approach allows us to con­cen­trate specif­i­cal­ly on the clean­ing tech­nique with­out distraction.

Solution #1: Using Text After Delimiter

The first solu­tion uses Pow­er Query’s built-in func­tions to extract the rel­e­vant part of the text. This approach is more straight­for­ward and uses the UI-based options:

To imple­ment this solution:

  1. Select the name col­umn in Pow­er Query
  2. Go to the Add Col­umn tab
  3. Choose Extract and then Text After Delimiter

Ini­tial­ly, when using the UI, you might try enter­ing a spe­cif­ic sym­bol like “BTC” as the delim­iter. How­ev­er, this would only work for one cryp­tocur­ren­cy. The real pow­er comes from mod­i­fy­ing the for­mu­la to use the sym­bol col­umn as a reference.

After cre­at­ing the ini­tial for­mu­la, you’ll need to edit it in the for­mu­la bar to make it dynam­ic. Replace the hard­cod­ed delim­iter text with a ref­er­ence to the sym­bol col­umn. This way, Pow­er Query will use each row’s cor­re­spond­ing sym­bol to extract just the cryp­tocur­ren­cy name, regard­less of the sym­bol­’s length.

Solution #2: Using Text.RemoveRange Function

The sec­ond solu­tion is more com­plex but demon­strates the flex­i­bil­i­ty of Pow­er Query’s M lan­guage. For this approach:

  1. Go to the Add Col­umn tab
  2. Select Cus­tom Column
  3. Use the Text.RemoveRange func­tion to pre­cise­ly remove the sym­bol part

The for­mu­la will look some­thing like:

Text.RemoveRange([name], 0, Text.Length([symbol]))

This func­tion works by remov­ing a spe­cif­ic range of char­ac­ters from the text. It requires three parameters:

  • The source text (our name column)
  • The start­ing posi­tion (0‑based in Pow­er Query)
  • The num­ber of char­ac­ters to remove (which we deter­mine using the length of the symbol)

Remem­ber that Pow­er Query uses zero-based index­ing, mean­ing the first char­ac­ter posi­tion is 0, not 1. This is cru­cial when work­ing with text manip­u­la­tion functions.

Finalizing the Transformation

After apply­ing either solu­tion, we need to clean up our query:

  1. Rename the new­ly cre­at­ed col­umn to some­thing mean­ing­ful (like “name”)
  2. Delete the orig­i­nal name col­umn that con­tained the symbols
  3. If need­ed, restore any pre­vi­ous­ly removed columns by delet­ing the “Removed Columns” step
  4. Arrange columns in the desired order by drag­ging them

In my exam­ple, I also demon­strat­ed how to clean anoth­er col­umn (cir­cu­lat­ing sup­ply) that had a sim­i­lar issue but with a sim­pler pat­tern. Since all the unwant­ed sym­bols in this col­umn were pre­ced­ed by a space, I could use the Extract Text Before Delim­iter func­tion with a space as the delimiter.

Loading the Cleaned Data Back to Excel

Once all trans­for­ma­tions are com­plete, we can load the cleaned data back to Excel:

  1. Go to the Home tab in Pow­er Query
  2. Select Close & Load To
  3. Choose whether to load to a new work­sheet or an exist­ing location

In my demon­stra­tion, I loaded the data to cell A13 on the exist­ing work­sheet, which allows for easy com­par­i­son between the orig­i­nal and cleaned data.

Why This Approach Matters

The beau­ty of using Pow­er Query for this type of data clean­ing is that it cre­ates a repeat­able process. If your data source updates, you can sim­ply refresh the query, and all the clean­ing steps will be reap­plied auto­mat­i­cal­ly. This saves sig­nif­i­cant time com­pared to man­u­al clean­ing or using for­mu­las that would need to be copied down for each new set of data.

Addi­tion­al­ly, by lever­ag­ing infor­ma­tion from one col­umn to clean anoth­er, we cre­ate a more robust solu­tion that can han­dle vari­a­tions in the data for­mat with­out break­ing. This approach is far supe­ri­or to hard-cod­ing spe­cif­ic text val­ues or char­ac­ter counts that might change across dif­fer­ent datasets.