How to get a single Unicode character from a string

First, you may want to read https://blog.golang.org/strings. It will answer some of your questions.

A string in Go can contain arbitrary bytes. When you write str[i], the result is a byte, and the index is always a byte offset, not a character position.
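To make the byte/character distinction concrete, here is a minimal sketch. The helper name byteAndRuneLen and the sample string are made up for this illustration; only len, indexing, and for...range are standard Go.

```go
package main

import "fmt"

// byteAndRuneLen reports the length of s in bytes and in runes.
// (Hypothetical helper name, used only for this illustration.)
func byteAndRuneLen(s string) (nBytes, nRunes int) {
	nBytes = len(s) // len counts bytes, not characters
	for range s {   // range decodes one UTF-8 rune per iteration
		nRunes++
	}
	return nBytes, nRunes
}

func main() {
	str := "héllo"
	fmt.Println(byteAndRuneLen(str)) // 6 5
	fmt.Println(str[1])              // 195: first byte of the 2-byte encoding of 'é'
}
```

Here str[1] does not give 'é' but only the first of the two bytes that encode it, which is why plain indexing is not enough for non-ASCII text.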

Most of the time, though, strings hold UTF-8 encoded text, and there are several ways to deal with UTF-8 encoding in a string.

For instance, you can use the for...range statement to iterate over a string rune by rune.

var first rune
for _, c := range str {
    first = c
    break
}
// first now contains the first rune of the string

You can also leverage the unicode/utf8 package. For instance:

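r, size := utf8.DecodeRuneInString(str)
// r contains the first rune of the string
// size is the size of the rune in bytes
// if str is empty, r is utf8.RuneError and size is 0;
// if the encoding is invalid, r is utf8.RuneError and size is 1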

If the string is encoded in UTF-8, there is no direct (constant-time) way to access the nth rune of the string, because the size of a rune is not constant: it takes between 1 and 4 bytes. If you need this feature, you can easily write your own helper function that walks the string from the start (with for...range, or with the unicode/utf8 package).
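As a rough sketch of such a helper, assuming a 0-based rune index: the name nthRune and its (rune, bool) signature are hypothetical and not part of the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// nthRune returns the rune at index n (counted in runes, starting at 0).
// ok is false when n is out of range. The name and signature are
// hypothetical, not part of the standard library.
func nthRune(s string, n int) (r rune, ok bool) {
	if n < 0 {
		return utf8.RuneError, false
	}
	for _, c := range s { // decodes one rune per iteration
		if n == 0 {
			return c, true
		}
		n--
	}
	return utf8.RuneError, false // string has fewer than n+1 runes
}

func main() {
	r, ok := nthRune("héllo", 1)
	fmt.Printf("%c %v\n", r, ok) // é true
}
```

This walks the string from the beginning, so it costs time proportional to n. If you need many lookups on the same string, converting once with []rune(str) and indexing the resulting slice is an alternative, at the price of an allocation. Note that for...range yields U+FFFD (utf8.RuneError) for bytes that are not valid UTF-8.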
