C# Code: How to transform Åäö to Aao

In one project I extended the Friendly Url Rewriter to rewrite all URLs that follow a specific pattern to a search page.

Search Engine Optimization (SEO) with Friendly URL Rewriter

Instead of having to use a URL that looks like this:

http://www.example.com/Search.aspx?country=åland&type=hotel

We accept URLs in a friendlier format and rewrite them internally to the parameterized format above:

http://www.example.com/åland-hotel.html

No file with that name exists on the server. Instead, we recognize the pattern with a regular expression like this one:

^(?<country>.+)-(?<type>.+)\.html$
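As a minimal sketch of how such a pattern could drive the rewrite: the class and method names below (FriendlyUrl, Rewrite) are hypothetical and not part of the Friendly Url Rewriter. The sketch assumes the incoming path has already had its leading slash removed. Because the first group is greedy, a hyphenated country keeps its hyphens and only the last segment becomes the type.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FriendlyUrl
{
    // The pattern from the post, with named groups for the two parameters.
    private static readonly Regex Pattern =
        new Regex(@"^(?<country>.+)-(?<type>.+)\.html$", RegexOptions.IgnoreCase);

    // Returns the internal search URL, or null when the path does not match.
    // Assumption: "path" is the file part of the URL without a leading slash.
    public static string Rewrite(string path)
    {
        Match m = Pattern.Match(path);
        if (!m.Success)
            return null;

        // Percent-encode the captured values so non-ASCII letters are safe in a query string.
        string country = Uri.EscapeDataString(m.Groups["country"].Value);
        string type = Uri.EscapeDataString(m.Groups["type"].Value);
        return "/Search.aspx?country=" + country + "&type=" + type;
    }
}
```

For example, FriendlyUrl.Rewrite("åland-hotel.html") yields a Search.aspx URL with country and type as separate parameters, while a path such as "index.aspx" returns null and can be left to the normal request pipeline.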

Google AdWords does not allow diacritic marks in URLs

Google AdWords does not allow letters with accents, rings, or umlauts in a URL, so we needed a generic matching algorithm that would recognize both “aland” and “åland” when comparing search parameters.

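There is a nice Unicode function, String.Normalize(), that makes it very easy to transform unwanted characters into something allowed. Called with NormalizationForm.FormD, it splits each accented letter into its base letter followed by separate combining marks, which can then be filtered out.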
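To see what the normalization does, here is a small sketch that prints the code units of a string after decomposition. The helper names (NormalizeDemo, CodeUnits, Decompose) are made up for illustration; only String.Normalize and NormalizationForm.FormD come from the framework.

```csharp
using System.Text;

public static class NormalizeDemo
{
    // Formats each UTF-16 code unit as U+XXXX so the decomposition is visible.
    public static string CodeUnits(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }

    // Decomposes the input with FormD and lists the resulting code units.
    public static string Decompose(string s)
    {
        return CodeUnits(s.Normalize(NormalizationForm.FormD));
    }
}
```

Calling NormalizeDemo.Decompose("\u00C5") (the precomposed Å) returns "U+0041 U+030A": a plain A followed by the combining ring above. That second character is the part a diacritics filter can drop.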

Fabrice wrote a snippet of code to transform Åäö to Aao that I used:

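// Requires: using System; using System.Globalization; using System.Text;
public static String RemoveDiacritics(string s)
{
    // FormD decomposes each accented letter into a base letter followed by combining marks.
    string normalizedString = s.Normalize(NormalizationForm.FormD);
    StringBuilder stringBuilder = new StringBuilder();
    for (int i = 0; i < normalizedString.Length; i++)
    {
        char c = normalizedString[i];
        // Keep everything except the combining marks (accents, rings, umlauts).
        if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            stringBuilder.Append(c);
    }
    return stringBuilder.ToString();
}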
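
One way the function could be used for the matching described earlier is to strip the diacritics from both sides before comparing. This is only a sketch: SearchKey and Matches are hypothetical names, and the case-insensitive comparison is an assumption about how the search parameters should be treated. RemoveDiacritics is repeated here so the example compiles on its own.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class SearchKey
{
    // Same technique as the snippet above: decompose, then drop the combining marks.
    public static string RemoveDiacritics(string s)
    {
        string normalized = s.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(normalized.Length);
        foreach (char c in normalized)
        {
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Hypothetical helper: true when two terms differ only by diacritics or letter case.
    public static bool Matches(string a, string b)
    {
        return string.Equals(RemoveDiacritics(a), RemoveDiacritics(b),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, SearchKey.Matches("åland", "aland") and SearchKey.Matches("Åland", "aland") are both true. Note that the approach only affects letters that Unicode decomposes into a base letter plus a mark; letters such as ø or ß have no such decomposition and pass through unchanged, so they would need separate handling.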