Go strings, chars and runes

 ·  🎃 kr0m

In this article, we will explore how to manipulate strings in Go, the internal representation of characters, the challenges of working with UTF-8 characters, and how Go resolves these challenges with the concept of runes.

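The article is divided into the following sections: Basic String, Internal Representation of Characters, UTF-8 and Runes, Immutability, Length, Comparison, and Concatenation.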



Basic String:

In Go, a string is essentially a read-only slice of bytes. A string literal is created by writing a sequence of characters between double quotes: " ". Let’s see a simple example.

vi strings00.go
package main

import (
    "fmt"
)

func main() {
    name := "Hello World"
    fmt.Println(name)
}

If we run the code, we will see the following output:

go run strings00.go

Hello World

Internal Representation of Characters:

The character data of a string is stored byte by byte, so we can index into the string to obtain each byte. In the following example, we print every byte both as a character and in hexadecimal.

vi strings01.go
package main

import (
    "fmt"
)

func printChars(s string) {
    fmt.Printf("Characters: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%c ", s[i])
    }
    fmt.Printf("\n")
}

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
    fmt.Printf("\n")
}

func main() {
    name := "Hello World"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    printBytes(name)
}

If we run the code, we will see the following output:

go run strings01.go

String: Hello World
Characters: H e l l o   W o r l d
Bytes: 48 65 6c 6c 6f 20 57 6f 72 6c 64

UTF-8 and Runes:

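Go strings are normally encoded in UTF-8. ASCII characters (unaccented English letters, digits, and basic punctuation) occupy a single byte, but any other character occupies 2, 3, or 4 bytes. If we try to display such a string byte by byte, as in the previous example, multi-byte characters are split apart and displayed incorrectly.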

vi strings02.go
package main

import (
    "fmt"
)

func printChars(s string) {
    fmt.Printf("Characters: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%c ", s[i])
    }
    fmt.Printf("\n")
}

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
    fmt.Printf("\n")
}

func main() {
    name := "Hello señor"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    printBytes(name)
}

If we run the code, we will see the following output:

go run strings02.go

String: Hello señor
Characters: H e l l o   s e Ã ± o r
Bytes: 48 65 6c 6c 6f 20 73 65 c3 b1 6f 72
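To make the byte widths concrete, here is a minimal sketch that prints how many bytes several characters need once encoded as UTF-8. It relies on utf8.RuneLen from the standard library; the helper name byteWidth and the sample characters are assumptions chosen only for illustration.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteWidth returns the number of bytes needed to encode r in UTF-8.
// Hypothetical helper: a thin wrapper around utf8.RuneLen.
func byteWidth(r rune) int {
	return utf8.RuneLen(r)
}

func main() {
	// One sample character for each possible width (1 to 4 bytes).
	for _, r := range []rune{'n', 'ñ', '€', '🎃'} {
		fmt.Printf("%c -> %d byte(s)\n", r, byteWidth(r))
	}
}
```

The ñ needs two bytes (c3 b1), which is why printing "señor" byte by byte shows two unrelated symbols in its place.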

We can solve this problem using runes. A rune is an integer value that holds one Unicode code point. In the following code, we generate a slice of runes from the string, so each element we iterate over is a complete character rather than a single byte.

vi strings03.go
package main

import (
    "fmt"
)

func printChars(s string) {
    fmt.Printf("Characters: ")
    runes := []rune(s)
    for i := 0; i < len(runes); i++ {
        fmt.Printf("%c ", runes[i])
    }
    fmt.Printf("\n")
}

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
    fmt.Printf("\n")
}

func main() {
    name := "Hello World"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    printBytes(name)

    fmt.Printf("\n\n")

    name = "Señor"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    printBytes(name)
}

If we run the code, we will see the following output:

go run strings03.go

String: Hello World
Characters: H e l l o   W o r l d
Bytes: 48 65 6c 6c 6f 20 57 6f 72 6c 64


String: Señor
Characters: S e ñ o r
Bytes: 53 65 c3 b1 6f 72

If we iterate with a for range loop, we do not need a rune slice at all, since the loop itself decodes the string one rune at a time.

vi strings04.go
package main

import (
    "fmt"
)

func printChars(s string) {
    fmt.Printf("Characters: ")
    for _, char := range s {
        fmt.Printf("%c", char)
    }
    fmt.Printf("\n")
}

func printBytes(s string) {
    fmt.Printf("Bytes: ")
    for i := 0; i < len(s); i++ {
        fmt.Printf("%x ", s[i])
    }
    fmt.Printf("\n")
}

func main() {
    name := "Hello señor"
    fmt.Printf("String: %s\n", name)
    printChars(name)
    printBytes(name)
}

If we run the code, we will see the following output:

go run strings04.go

String: Hello señor
Characters: Hello señor
Bytes: 48 65 6c 6c 6f 20 73 65 c3 b1 6f 72
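One detail worth knowing about for range over a string is that the index it yields is the byte offset where each rune starts, not the rune's position. The sketch below illustrates this; runeOffsets is a hypothetical helper introduced only for this example.

```go
package main

import "fmt"

// runeOffsets returns the byte offset at which each rune of s starts,
// as reported by a for range loop. Hypothetical helper for illustration.
func runeOffsets(s string) []int {
	offsets := make([]int, 0, len(s))
	for i := range s {
		offsets = append(offsets, i)
	}
	return offsets
}

func main() {
	name := "señor"
	for i, r := range name {
		// i is a byte offset, r is the decoded rune.
		fmt.Printf("byte offset %d: %c (U+%04X)\n", i, r, r)
	}
	fmt.Println(runeOffsets(name))
}
```

For "señor" the offsets jump from 2 to 4, because the ñ occupies bytes 2 and 3.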

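As a general rule, whenever we process text character by character, it is advisable to work with runes (or a for range loop) rather than with individual bytes, to avoid unpleasant surprises.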


Immutability:

Strings in Go are immutable, meaning their content cannot be altered. For example, this function would produce an error:

func mutate(s string) string {
    s[0] = 'a'
    return s
}
./strings05.go:8:5: cannot assign to s[0] (neither addressable nor a map index expression)

If we want to modify the content, we need to convert the string to a slice of runes, change the slice, and convert it back, which produces a new string.

vi strings05.go
package main

import (
    "fmt"
)

func mutate(s string) string {
    runes := []rune(s)
    runes[0] = 'a'
    return string(runes)
}

func main() {
    s := "hello"
    fmt.Println(mutate(s))
}

If we run the code, we will see the following output:

go run strings05.go

aello
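The same idea can be generalized a little. The following sketch, under the assumption that positions are counted in runes rather than bytes, replaces one character and shows that the original string is left untouched. The name replaceRune and its out-of-range behavior are choices made for this example, not something taken from the standard library.

```go
package main

import "fmt"

// replaceRune returns a copy of s with the rune at position i (counted in
// runes, not bytes) replaced by r. If i is out of range, s is returned as-is.
// Hypothetical helper for illustration.
func replaceRune(s string, i int, r rune) string {
	runes := []rune(s) // the conversion copies the data
	if i < 0 || i >= len(runes) {
		return s
	}
	runes[i] = r
	return string(runes)
}

func main() {
	original := "señor"
	modified := replaceRune(original, 2, 'n')
	fmt.Println(original) // the original string is not modified
	fmt.Println(modified)
}
```

Because the conversion to []rune copies the data, every "modification" really builds a new string.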

Length:

Another issue we might encounter is counting the length of strings. By viewing the hexadecimal representation of the characters, we can see that “señor” occupies 6 bytes even though it has only 5 characters.

name := "Hello World"
Bytes: 48 65 6c 6c 6f 20 57 6f 72 6c 64
"Hello " -> 48 65 6c 6c 6f 20
"World"  -> 57 6f 72 6c 64

name := "Hello señor"
Bytes: 48 65 6c 6c 6f 20 73 65 c3 b1 6f 72
"Hello " -> 48 65 6c 6c 6f 20
"señor"  -> 73 65 c3 b1 6f 72

The len() function always returns the number of bytes in a string, in this case 12. If we want to obtain the number of runes (11 here), we must use utf8.RuneCountInString.

vi strings06.go

package main

import (
    "fmt"
    "unicode/utf8"
)

func main() {
    name := "Hello señor"
    fmt.Printf("String len: %v\n", len(name))
    fmt.Printf("String utf8.RuneCountInString: %v\n", utf8.RuneCountInString(name))
}

If we run the code, we will see the following output:

go run strings06.go

String len: 12
String utf8.RuneCountInString: 11
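The difference between bytes and runes also matters when cutting a string. As a rough sketch, slicing by byte position can split a multi-byte character, while counting runes keeps the result valid. firstRunes is a hypothetical helper written for this example; utf8.ValidString is part of the standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// firstRunes returns the first n runes of s, or all of s if it has fewer.
// Hypothetical helper for illustration.
func firstRunes(s string, n int) string {
	if n <= 0 {
		return ""
	}
	count := 0
	for i := range s {
		if count == n {
			return s[:i] // i is the byte offset of rune number n
		}
		count++
	}
	return s
}

func main() {
	name := "Hello señor"
	byteCut := name[:9] // cuts the two-byte ñ in half
	runeCut := firstRunes(name, 9)
	fmt.Println(utf8.ValidString(byteCut))
	fmt.Println(utf8.ValidString(runeCut))
	fmt.Println(runeCut)
}
```

Here the byte-based cut ends in the middle of the ñ and is no longer valid UTF-8, whereas the rune-based cut yields "Hello señ".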

Comparison:

String comparison is straightforward; we only need to use the == operator, which compares the two strings byte by byte.

vi strings07.go
package main

import (
    "fmt"
)

func compareStrings(str1 string, str2 string) {
    if str1 == str2 {
        fmt.Printf("%s and %s are equal\n", str1, str2)
        return
    }
    fmt.Printf("%s and %s are not equal\n", str1, str2)
}

func main() {
    string1 := "Go"
    string2 := "Go"
    compareStrings(string1, string2)

    string3 := "hello"
    string4 := "world"
    compareStrings(string3, string4)
}

If we run the code, we will see the following output:

go run strings07.go

Go and Go are equal
hello and world are not equal
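Since == is an exact, byte-by-byte comparison, "Go" and "go" are considered different. When case should be ignored, one option is strings.EqualFold from the standard library, sketched below. The wrapper name equalIgnoreCase is an assumption made for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// equalIgnoreCase reports whether a and b are equal under Unicode case
// folding. Hypothetical wrapper around strings.EqualFold.
func equalIgnoreCase(a, b string) bool {
	return strings.EqualFold(a, b)
}

func main() {
	a, b := "Go", "go"
	fmt.Println(a == b)                // exact comparison
	fmt.Println(equalIgnoreCase(a, b)) // case-insensitive comparison
	fmt.Println(equalIgnoreCase("SEÑOR", "señor"))
}
```

EqualFold works on runes, so it also handles non-ASCII letters such as Ñ and ñ.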

Concatenation:

Concatenation is also very simple; we just need to join the two strings using the + operator.

vi strings08.go
package main

import (
    "fmt"
)

func main() {
    string1 := "Go"
    string2 := "is awesome"
    result := string1 + " " + string2
    fmt.Println(result)
}

If we run the code, we will see the following output:

go run strings08.go

Go is awesome
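
The + operator is fine for joining a few strings. When many pieces are joined in a loop, each + creates a new string, so a common alternative is strings.Builder from the standard library. The following is a minimal sketch; joinWords is a hypothetical helper, and strings.Join would achieve the same result.

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords concatenates words separated by a single space, using
// strings.Builder to avoid allocating a new string on every step.
// Hypothetical helper for illustration.
func joinWords(words []string) string {
	var b strings.Builder
	for i, w := range words {
		if i > 0 {
			b.WriteByte(' ')
		}
		b.WriteString(w)
	}
	return b.String()
}

func main() {
	fmt.Println(joinWords([]string{"Go", "is", "awesome"}))
}
```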