Pow­er Query offers pow­er­ful solu­tions for clean­ing data across columns by lever­ag­ing infor­ma­tion con­tained in relat­ed columns. In this arti­cle, I’ll demon­strate how to effi­cient­ly remove cryp­tocur­ren­cy sym­bols from a name col­umn when those sym­bols already exist in a sep­a­rate col­umn, even when the sym­bols have vary­ing lengths.

This tech­nique is par­tic­u­lar­ly valu­able when deal­ing with incon­sis­tent data for­mats that require stan­dard­iza­tion for analysis.

Watch my step-by-step tuto­r­i­al video to see these tech­niques in action:

Understanding the Data Cleaning Challenge

When work­ing with finan­cial data, espe­cial­ly cryp­tocur­ren­cy infor­ma­tion, you’ll often encounter datasets where the same infor­ma­tion appears redun­dant­ly across mul­ti­ple columns. In my exam­ple, I have a table con­tain­ing cryp­tocur­ren­cy data where the name col­umn includes both the cryp­tocur­ren­cy name and its sym­bol (like “Bit­coin BTC”), while a sep­a­rate sym­bol col­umn already con­tains just the sym­bol (“BTC”).

The chal­lenge lies in clean­ing the name col­umn by remov­ing these redun­dant sym­bols. What makes this par­tic­u­lar­ly tricky is that cryp­tocur­ren­cy sym­bols vary in length — some might be three char­ac­ters (BTC), oth­ers four (DOGE) or more. This incon­sis­ten­cy means we can’t sim­ply remove a fixed num­ber of char­ac­ters from the end of each name.

Preparing the Data in Power Query

To begin the clean­ing process, we need to load our data into Pow­er Query where we can apply the nec­es­sary transformations:

  1. Nav­i­gate to the Data tab in Excel
  2. Select your data range and choose to load it to Pow­er Query
  3. Remove any unnec­es­sary columns to focus on the task at hand

In my case, I ini­tial­ly removed sev­er­al columns to focus on just the key columns for this demon­stra­tion — the name col­umn con­tain­ing both cryp­tocur­ren­cy names and sym­bols, and the sym­bol col­umn con­tain­ing just the sym­bols. This sim­pli­fied approach allows us to con­cen­trate specif­i­cal­ly on the clean­ing tech­nique with­out distraction.

Solution #1: Using Text After Delimiter

The first solu­tion uses Pow­er Query’s built-in func­tions to extract the rel­e­vant part of the text. This approach is more straight­for­ward and uses the UI-based options:

To imple­ment this solution:

  1. Select the name col­umn in Pow­er Query
  2. Go to the Add Col­umn tab
  3. Choose Extract and then Text After Delimiter

Ini­tial­ly, when using the UI, you might try enter­ing a spe­cif­ic sym­bol like “BTC” as the delim­iter. How­ev­er, this would only work for one cryp­tocur­ren­cy. The real pow­er comes from mod­i­fy­ing the for­mu­la to use the sym­bol col­umn as a reference.

After cre­at­ing the ini­tial for­mu­la, you’ll need to edit it in the for­mu­la bar to make it dynam­ic. Replace the hard­cod­ed delim­iter text with a ref­er­ence to the sym­bol col­umn. This way, Pow­er Query will use each row’s cor­re­spond­ing sym­bol to extract just the cryp­tocur­ren­cy name, regard­less of the sym­bol­’s length.

Solution #2: Using Text.RemoveRange Function

The sec­ond solu­tion is more com­plex but demon­strates the flex­i­bil­i­ty of Pow­er Query’s M lan­guage. For this approach:

  1. Go to the Add Col­umn tab
  2. Select Cus­tom Column
  3. Use the Text.RemoveRange func­tion to pre­cise­ly remove the sym­bol part

The for­mu­la will look some­thing like:

Text.RemoveRange([name], 0, Text.Length([symbol]))

This func­tion works by remov­ing a spe­cif­ic range of char­ac­ters from the text. It requires three parameters:

  • The source text (our name column)
  • The start­ing posi­tion (0‑based in Pow­er Query)
  • The num­ber of char­ac­ters to remove (which we deter­mine using the length of the symbol)

Remem­ber that Pow­er Query uses zero-based index­ing, mean­ing the first char­ac­ter posi­tion is 0, not 1. This is cru­cial when work­ing with text manip­u­la­tion functions.

Finalizing the Transformation

After apply­ing either solu­tion, we need to clean up our query:

  1. Rename the new­ly cre­at­ed col­umn to some­thing mean­ing­ful (like “name”)
  2. Delete the orig­i­nal name col­umn that con­tained the symbols
  3. If need­ed, restore any pre­vi­ous­ly removed columns by delet­ing the “Removed Columns” step
  4. Arrange columns in the desired order by drag­ging them

In my exam­ple, I also demon­strat­ed how to clean anoth­er col­umn (cir­cu­lat­ing sup­ply) that had a sim­i­lar issue but with a sim­pler pat­tern. Since all the unwant­ed sym­bols in this col­umn were pre­ced­ed by a space, I could use the Extract Text Before Delim­iter func­tion with a space as the delimiter.

Loading the Cleaned Data Back to Excel

Once all trans­for­ma­tions are com­plete, we can load the cleaned data back to Excel:

  1. Go to the Home tab in Pow­er Query
  2. Select Close & Load To
  3. Choose whether to load to a new work­sheet or an exist­ing location

In my demon­stra­tion, I loaded the data to cell A13 on the exist­ing work­sheet, which allows for easy com­par­i­son between the orig­i­nal and cleaned data.

Why This Approach Matters

The beau­ty of using Pow­er Query for this type of data clean­ing is that it cre­ates a repeat­able process. If your data source updates, you can sim­ply refresh the query, and all the clean­ing steps will be reap­plied auto­mat­i­cal­ly. This saves sig­nif­i­cant time com­pared to man­u­al clean­ing or using for­mu­las that would need to be copied down for each new set of data.

Addi­tion­al­ly, by lever­ag­ing infor­ma­tion from one col­umn to clean anoth­er, we cre­ate a more robust solu­tion that can han­dle vari­a­tions in the data for­mat with­out break­ing. This approach is far supe­ri­or to hard-cod­ing spe­cif­ic text val­ues or char­ac­ter counts that might change across dif­fer­ent datasets.