C# Code: How to transform Åäö to Aao

In one project I extended the Friendly URL Rewriter to rewrite every URL that follows a specific pattern to a search page.

Search Engine Optimization (SEO) with Friendly URL Rewriter

Instead of having to use a URL that looks like this:

http://www.example.com/Search.aspx?country=åland&type=hotel

We accept URLs in a friendlier format and rewrite them internally to the parameterized format above:

http://www.example.com/åland-hotel.html

No file with that name exists on the server. Instead, we recognize the pattern with a regular expression like this one:

^(?<country>.+)-(?<type>.+)\.html$
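As a rough illustration of how that pattern could drive the rewrite, here is a minimal sketch in plain C#. The class and method names (FriendlyUrlSketch, ToInternalUrl) are made up for this example, and the hook into the Friendly URL Rewriter itself is not shown; only the Search.aspx parameters come from the URLs above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of EPiServer: maps a friendly file name
// to the internal search URL using the pattern from the post.
public static class FriendlyUrlSketch
{
    private static readonly Regex Pattern =
        new Regex(@"^(?<country>.+)-(?<type>.+)\.html$", RegexOptions.IgnoreCase);

    // Returns the internal URL, or null when the file name does not match.
    public static string ToInternalUrl(string fileName)
    {
        Match match = Pattern.Match(fileName);
        if (!match.Success)
            return null;

        // Escape the captured values so they are safe inside a query string.
        string country = Uri.EscapeDataString(match.Groups["country"].Value);
        string type = Uri.EscapeDataString(match.Groups["type"].Value);
        return "/Search.aspx?country=" + country + "&type=" + type;
    }
}
```

Because the first group is greedy, the split happens at the last hyphen: "new-zealand-hotel.html" yields country "new-zealand" and type "hotel". A name without a hyphen, such as "index.html", does not match and is left alone.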

Google AdWords does not allow diacritical marks in URLs

Google AdWords does not allow letters with accents, rings, or umlauts in a URL, so we needed a generic matching algorithm that would recognize both “aland” and “åland” when comparing search parameters.

The .NET method String.Normalize() makes this easy. Normalization Form D splits each accented letter into its base letter followed by a separate combining mark, and the combining marks can then be dropped.
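To see what the normalization step does on its own, the following sketch prints the UTF-16 code units of a string before and after decomposition. NormalizeSketch and CodeUnits are names invented for this illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper for inspecting what String.Normalize() produces.
public static class NormalizeSketch
{
    // Lists the UTF-16 code units of a string as space-separated U+XXXX values.
    public static string CodeUnits(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0)
                sb.Append(' ');
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

For the single character å, NormalizeSketch.CodeUnits("\u00E5") gives "U+00E5", while NormalizeSketch.CodeUnits("\u00E5".Normalize(NormalizationForm.FormD)) gives "U+0061 U+030A": a plain "a" followed by the combining ring above.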

Fabrice wrote a snippet of code to transform Åäö to Aao that I used:

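// Requires: using System.Globalization; using System.Text;
public static string RemoveDiacritics(string s)
{
    // Form D splits each accented letter into a base letter plus combining marks.
    string normalizedString = s.Normalize(NormalizationForm.FormD);
    StringBuilder stringBuilder = new StringBuilder(normalizedString.Length);
    for (int i = 0; i < normalizedString.Length; i++)
    {
        char c = normalizedString[i];
        // Keep every character except the combining marks (accents, rings, umlauts).
        if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            stringBuilder.Append(c);
    }
    return stringBuilder.ToString();
}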
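
One way the function could be used for the comparison described earlier is to strip the diacritics from both sides and then compare case-insensitively. This is only a sketch: DiacriticInsensitiveMatch and Matches are hypothetical names, and RemoveDiacritics is repeated so that the example compiles on its own.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical wrapper around the snippet above, for comparing search parameters.
public static class DiacriticInsensitiveMatch
{
    public static string RemoveDiacritics(string s)
    {
        string normalized = s.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(normalized.Length);
        foreach (char c in normalized)
        {
            // Drop combining marks, keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString();
    }

    // True when the two values are equal once diacritics and case are ignored.
    public static bool Matches(string a, string b)
    {
        return string.Equals(RemoveDiacritics(a), RemoveDiacritics(b),
                             StringComparison.OrdinalIgnoreCase);
    }
}
```

With this, Matches("åland", "aland") and Matches("Åland", "aland") both return true, while Matches("åland", "oland") returns false. Letters such as ø or ł have no combining-mark decomposition, so this approach leaves them unchanged.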

  1. edenstrom.wordpress.com

    Hi Fredrik, noticed that you had bought AdWords. That’s serious blogging! Any monetary contributions from EPiServer, a gift certificate or is it plain traffic gathering? ;)

  2. Fredrik Haglund

    Hi Martin!
    No, it comes out of my own wallet, but I needed to learn how it works in detail, and why not experiment with your own site? And yes, it is always satisfying when someone reads what you write about, but I do not spend much on it…
    /Fredrik

  3. nick

    Hi,

    Could you please provide a code sample for this?
    I am trying URL rewriting by extending the EPiServer URL rewriter but am running into some issues.

    Any help will be really appreciated.

    Many thanks,
    Nick

  4. nick

    Hi,

    I am overriding the HttpUrlRewriteToInternal function and passing it the URL that should be used internally (not shown in the address bar),
    but for some reason the URL in my address bar still changes.

    Do I need to override some other function as well to prevent this?

    Could you please help with this?

    regards,
    Nick

  5. nick

    To continue: if I pass a relative path to the .aspx file to the HttpUrlRewriteToInternal function, URL rewriting works,
    but the only issue is that I am then unable to refer to the EPiServer custom properties on that page.

    Could you please let me know if there is a solution for this?

    Thanks a lot,
    Nick

  6. Ted Nyberg

    Nick, if you’re interested in custom URL rewriting you could have a look at the following post on implementing a custom URL rewrite provider in EPiServer: http://labs.episerver.com/en/Blogs/Ted-Nyberg/Dates/112276/7/Implementing-a-custom-URL-rewrite-provider-for-EPiServer/

  7. ojejej

    Try the wReplace tool, which removes diacritics:

    http://wwidgets.com/us_wReplace.html

    A replacement table for removing diacritics (accents) is available, so you can test your solutions.
