YouTube-like GUID

A 9-character string is not a GUID. Given that, you could use the hexadecimal representation of an int, which gives you an 8-character string.

You can use an ID you might already have. You can also call .GetHashCode on different simple types to get a different int, or XOR several fields together. Keep in mind that hash codes are not guaranteed to be unique. And if you are into it, you might even use a random number: hey, you have over 2,000,000,000 possible values if you stick to the positives ;)
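
As a rough illustration of the ideas above, here is a minimal sketch. The class and method names (QuickId, ToHex, FromFields, NextRandom) are made up for this example, and none of these variants guarantees uniqueness on its own.

```csharp
using System;

static class QuickId
{
    private static readonly Random Rng = new Random();

    // "X8" formats the int as exactly 8 uppercase hexadecimal characters.
    public static string ToHex(int id) => id.ToString("X8");

    // XOR two integer fields into one int. Different inputs can produce
    // the same result, so this is a convenience, not a uniqueness guarantee.
    public static string FromFields(int a, int b) => ToHex(a ^ b);

    // Random.Next() returns a non-negative int below int.MaxValue.
    public static string NextRandom() => ToHex(Rng.Next());
}
```

For example, QuickId.ToHex(255) gives 000000FF.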


URL Friendly Solution

As mentioned in the accepted answer, base64 is a good solution but it can cause issues if you want to use the GUID in a URL. This is because + and / are valid base64 characters, but have special meaning in URLs.

Luckily, there are URL-friendly characters that base64 does not use, such as - and _, so we can substitute them for + and /. Here is a more complete answer:

public string ToShortString(Guid guid)
{
    var base64Guid = Convert.ToBase64String(guid.ToByteArray());

    // Replace URL unfriendly characters
    base64Guid = base64Guid.Replace('+', '-').Replace('/', '_');

    // Remove the trailing ==
    return base64Guid.Substring(0, base64Guid.Length - 2);
}

public Guid FromShortString(string str)
{
    str = str.Replace('_', '/').Replace('-', '+');
    var byteArray = Convert.FromBase64String(str + "==");
    return new Guid(byteArray);
}

Usage:

Guid guid = Guid.NewGuid();
string shortStr = ToShortString(guid);
// shortStr will look something like 2LP8GcHr-EC4D__QTizUWw
Guid guid2 = FromShortString(shortStr);
Assert.AreEqual(guid, guid2);

EDIT:

Can we do better? (Theoretical limit)

The above yields a 22 character, URL friendly GUID. This is because a GUID uses 128 bits, so representing it in base64 requires log_{64}2^128 characters, which is 21.33, which rounds up to 22.

There are actually 66 URL friendly characters (we aren't using . and ~). So theoretically, we could use base66 to get log_{66}2^128 characters, which is 21.18, which also rounds up to 22.

So this is optimal for a full, valid GUID.

However, a GUID uses 6 bits to indicate the version (4 bits) and variant (2 bits), which in our case are constant. So we technically only need 122 bits, which in both bases rounds up to 21 characters (log_{64}2^122 = 20.33). So with more manipulation, we could remove another character. This requires wrangling the bits out, however, so I leave this as an exercise for the reader.
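
For reference, the character counts above all come from the same formula: the number of characters needed is the bit count divided by the bits per character (log base 2 of the alphabet size), rounded up. Written out, with values rounded to one decimal place:

```latex
\[
\left\lceil \frac{128}{\log_2 64} \right\rceil = \lceil 21.3 \rceil = 22,
\qquad
\left\lceil \frac{128}{\log_2 66} \right\rceil = \lceil 21.2 \rceil = 22
\]
\[
\left\lceil \frac{122}{\log_2 64} \right\rceil = \lceil 20.3 \rceil = 21,
\qquad
\left\lceil \frac{122}{\log_2 66} \right\rceil = \lceil 20.2 \rceil = 21
\]
```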

How does YouTube do it?

YouTube IDs use 11 characters. How do they do it?

A GUID uses 122 random bits, which makes collisions virtually impossible. This means you can generate a random GUID and be practically certain it is unique without checking. However, we don't need so many bits for just a regular ID.

We could use a smaller ID. If we use 66 bits or fewer, we have a higher risk of collision, but can represent this ID with 11 characters (even in base64). One could either accept the risk of collision, or test for a collision and regenerate.

With 122 bits (a regular GUID), you would have to generate roughly 3 × 10^17 GUIDs to have a 1% chance of collision.

With 66 bits, you would have to generate ~10^9 (about 1 billion) IDs to have a 1% chance of collision. That is not that many IDs.
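
These figures can be reproduced with the standard birthday approximation, n ≈ sqrt(2 · N · ln(1 / (1 − p))), where N = 2^bits is the number of possible IDs and p is the collision probability. The sketch below is only an illustration; the class and method names are invented for this example.

```csharp
using System;

static class CollisionMath
{
    // Approximate number of uniformly random IDs that must be generated
    // before the chance of at least one collision reaches `probability`.
    // Uses the birthday approximation n ≈ sqrt(2 * N * ln(1 / (1 - p))).
    public static double IdsForCollisionChance(int bits, double probability)
    {
        double space = Math.Pow(2, bits); // N = 2^bits possible IDs
        return Math.Sqrt(2 * space * Math.Log(1 / (1 - probability)));
    }
}
```

Under this approximation, a 1% collision chance is reached after roughly 3 × 10^17 IDs at 122 bits, roughly 1.2 × 10^9 at 66 bits, and roughly 6 × 10^8 at 64 bits.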

My guess is YouTube uses 64 bits (which is more memory friendly than 66 bits), and checks for collisions to regenerate the ID if necessary.
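
To make the "check and regenerate" idea concrete, here is a minimal sketch. It is not how YouTube is known to do it. The UniqueIdGenerator class is hypothetical, and the in-memory HashSet is only a stand-in for whatever uniqueness check a real system would use (for example, a unique index in a database).

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

class UniqueIdGenerator
{
    private readonly RandomNumberGenerator rng = RandomNumberGenerator.Create();

    // Stand-in for a persistent store of IDs that were already handed out.
    private readonly HashSet<string> issued = new HashSet<string>();

    public string Next()
    {
        while (true)
        {
            byte[] buffer = new byte[8]; // 64 random bits
            rng.GetBytes(buffer);

            // 8 bytes encode to 11 base64 characters plus one '=' of padding.
            string id = Convert.ToBase64String(buffer)
                .TrimEnd('=')
                .Replace('+', '-')
                .Replace('/', '_');

            // Add returns false if the ID already exists; in that case, retry.
            if (issued.Add(id))
            {
                return id;
            }
        }
    }
}
```

Each call to Next() returns an 11-character, URL-friendly ID that has not been returned before by the same instance.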

If you want to abandon GUIDs in favor of smaller IDs, here is code for that:

class IdFactory
{
    // Note: System.Random is not cryptographically secure, so these IDs are guessable.
    private readonly Random random = new Random();
    public int CharacterCount { get; }
    public IdFactory(int characterCount)
    {
        CharacterCount = characterCount;
    }

    public string Generate()
    {
        // bitCount = characterCount * log(targetBase) / log(2); base64 gives 6 bits per character
        var bitCount = 6 * CharacterCount;
        var byteCount = (int)Math.Ceiling(bitCount / 8f);
        byte[] buffer = new byte[byteCount];
        random.NextBytes(buffer);

        string id = Convert.ToBase64String(buffer);
        // Replace URL unfriendly characters
        id = id.Replace('+', '-').Replace('/', '_');
        // Trim characters to fit the count
        return id.Substring(0, CharacterCount);
    }
}

Usage:

var factory = new IdFactory(characterCount: 11);
string id = factory.Generate();
// id will look like Mh3darwiZhp

This uses a 64-character alphabet, which is not optimal (66 URL-friendly characters are available), but it requires much less code because we can reuse Convert.ToBase64String. Be a lot more careful about collisions if you use this.


You could use Base64:

string base64Guid = Convert.ToBase64String(Guid.NewGuid().ToByteArray());

That generates a string like E1HKfn68Pkms5zsZsvKONw==. Since a GUID is always 128 bits, the == will always be present at the end, so you can omit it, which gives you a 22-character string. This isn't as short as a YouTube ID, though.