Oscar Martinez
Power Query Text Cleaning Function
Updated: Nov 11, 2022
I've developed a Power Query function for cleaning up text columns through time. The operations the function perform are:
Replace characters
Replaces matching words
Remove characters
Trims text
Replace full cell contents
Instead of a boring explanation, see the video below that shows how the text "University fo Zürich" goes through different transformations; all the rows of a column will go through the same transformation.
Instructions for using the function
1. Copy and paste the following code into a new query, creating a new function.
2. In the table containing the column you want to clean, paste the following code at the end; make sure you replace the "Previous Step" and "Column to be cleaned" parameters to match your table and previous code.
3. Depending on your needs, your will need to modify the code.
If you need to replace different/more characters, add more rows to this part of the code.
//This step replaces characters
#"ReplaceMatchingChar" = Text.Combine(
List.ReplaceMatchingItems(
Text.ToList (Source), {
{"ü", "u"},
{"ö", "o"}
}
)
),
If you need to replace more complete words, modify this section.
//This step replaces words
#"ReplaceMatchingWord" = Text.Combine(
List.ReplaceMatchingItems(
Text.Split(#"ReplaceMatchingChar"," "),
{
{"fo", "of"},
{"teh", "the"}
}
)," "
),
If you need to remove more characters, add more between the double-quotes.
//This step removes characters
#"Cleaned" =
Table.TransformColumns(
#table(1, {{#"ReplaceMatchingWord"}}),
{{
"Column1",
each
List.Accumulate(Text.ToList(",."),_,(String,Remove) => Text.Replace(String,Remove,""))
}}
),
If you need to replace more complete cell contents, add more rows to this part of the code
//This step replaces full cell contents
#"ReplaceFullContent" = List.ReplaceMatchingItems(
{#"Trimmed Text"{0}[Column1]},
{
{"University of Zurich","UZH"},
{"University of the Abroad","University abroad"}
}
),