image 8
image 8

press to zoom
cta-section1-original-image
cta-section1-original-image

press to zoom
hero-image
hero-image

press to zoom
image 8
image 8

press to zoom
1/3
  • Oscar Martinez (Valerock)

Power Query Text Cleaning Function

Updated: Mar 15

I've developed a Power Query function for cleaning up text columns through time. The operations the function perform are:

  1. Replace characters

  2. Replaces matching words

  3. Remove characters

  4. Trims text

  5. Replace full cell contents

Instead of a boring explanation, see the video below that shows how the text "University fo Zürich" goes through different transformations; all the rows of a column will go through the same transformation.

Instructions for using the function

1. Copy and paste the following code into a new query, creating a new function.

2. In the table containing the column you want to clean, paste the following code at the end; make sure you replace the "Previous Step" and "Column to be cleaned" parameters to match your table and previous code.

3. Depending on your needs, your will need to modify the code.

  • If you need to replace different/more characters, add more rows to this part of the code.

//This step replaces characters
    #"ReplaceMatchingChar" = Text.Combine(
        List.ReplaceMatchingItems(
            Text.ToList (Source), { 
                {"ü", "u"},
                {"ö", "o"}
            }
        )
    ),
  • If you need to replace more complete words, modify this section.

//This step replaces words
    #"ReplaceMatchingWord" = Text.Combine(
        List.ReplaceMatchingItems(
            Text.Split(#"ReplaceMatchingChar"," "),
            {
                {"fo", "of"},
                {"teh", "the"}              
            }
        )," "
    ),
  • If you need to remove more characters, add more between the double-quotes.

//This step removes characters
    #"Cleaned" = 
        Table.TransformColumns(
            #table(1, {{#"ReplaceMatchingWord"}}),
            {{
                    "Column1", 
                    each 
                        List.Accumulate(Text.ToList(",."),_,(String,Remove) => Text.Replace(String,Remove,""))
            }}
        ),
  • If you need to replace more complete cell contents, add more rows to this part of the code

//This step replaces full cell contents
    #"ReplaceFullContent" = List.ReplaceMatchingItems(
        {#"Trimmed Text"{0}[Column1]},
        {
            {"University of Zurich","UZH"},
            {"University of the Abroad","University abroad"}
        }
    ),



120 views

Recent Posts

See All