C# code to linkify urls in a string

It's a pretty simple task you can acheive it with Regex and a ready-to-go regular expression from:

  • http://regexlib.com/

Something like:

var html = Regex.Replace(html, @"^(http|https|ftp)\://[a-zA-Z0-9\-\.]+" +
                         "\.[a-zA-Z]{2,3}(:[a-zA-Z0-9]*)?/?" +
                         "([a-zA-Z0-9\-\._\?\,\'/\\\+&%\$#\=~])*$",
                         "<a href=\"$1\">$1</a>");

You may also be interested not only in creating links but in shortening URLs. Here is a good article on this subject:

  • Resolve and shorten URLs in C#

See also:

  • Regular Expression Workbench at MSDN
  • Converting a URL into a Link in C# Using Regular Expressions
  • Regex to find URL within text and make them as link
  • Regex.Replace Method at MSDN
  • The Problem With URLs by Jeff Atwood
  • Parsing URLs with Regular Expressions and the Regex Object
  • Format URLs in string to HTML Links in C#
  • Automatically hyperlink URL and Email in ASP.NET Pages with C#

well, after a lot of research on this, and several attempts to fix times when

  1. people enter in http://www.sitename.com and www.sitename.com in the same post
  2. fixes to parenthisis like (http://www.sitename.com) and http://msdn.microsoft.com/en-us/library/aa752574(vs.85).aspx
  3. long urls like: http://www.amazon.com/gp/product/b000ads62g/ref=s9_simz_gw_s3_p74_t1?pf_rd_m=atvpdkikx0der&pf_rd_s=center-2&pf_rd_r=04eezfszazqzs8xfm9yd&pf_rd_t=101&pf_rd_p=470938631&pf_rd_i=507846

we are now using this HtmlHelper extension... thought I would share and get any comments:

    private static Regex regExHttpLinks = new Regex(@"(?<=\()\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\))|(?<=(?<wrap>[=~|_#]))\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|](?=\k<wrap>)|\b(https?://|www\.)[-A-Za-z0-9+&@#/%?=~_()|!:,.;]*[-A-Za-z0-9+&@#/%=~_()|]", RegexOptions.Compiled | RegexOptions.IgnoreCase);

    public static string Format(this HtmlHelper htmlHelper, string html)
    {
        if (string.IsNullOrEmpty(html))
        {
            return html;
        }

        html = htmlHelper.Encode(html);
        html = html.Replace(Environment.NewLine, "<br />");

        // replace periods on numeric values that appear to be valid domain names
        var periodReplacement = "[[[replace:period]]]";
        html = Regex.Replace(html, @"(?<=\d)\.(?=\d)", periodReplacement);

        // create links for matches
        var linkMatches = regExHttpLinks.Matches(html);
        for (int i = 0; i < linkMatches.Count; i++)
        {
            var temp = linkMatches[i].ToString();

            if (!temp.Contains("://"))
            {
                temp = "http://" + temp;
            }

            html = html.Replace(linkMatches[i].ToString(), String.Format("<a href=\"{0}\" title=\"{0}\">{1}</a>", temp.Replace(".", periodReplacement).ToLower(), linkMatches[i].ToString().Replace(".", periodReplacement)));
        }

        // Clear out period replacement
        html = html.Replace(periodReplacement, ".");

        return html;
    }

protected string Linkify( string SearchText ) {
    // this will find links like:
    // http://www.mysite.com
    // as well as any links with other characters directly in front of it like:
    // href="http://www.mysite.com"
    // you can then use your own logic to determine which links to linkify
    Regex regx = new Regex( @"\b(((\S+)?)(@|mailto\:|(news|(ht|f)tp(s?))\://)\S+)\b", RegexOptions.IgnoreCase );
    SearchText = SearchText.Replace( "&nbsp;", " " );
    MatchCollection matches = regx.Matches( SearchText );

    foreach ( Match match in matches ) {
        if ( match.Value.StartsWith( "http" ) ) { // if it starts with anything else then dont linkify -- may already be linked!
            SearchText = SearchText.Replace( match.Value, "<a href='" + match.Value + "'>" + match.Value + "</a>" );
        }
    }

    return SearchText;
}