Something about Go strings that you should know

What will be the output of the following code snippet?

package main

import (
    "fmt"
)

func main() {
    var s = "abcdè"
    fmt.Println(len(s))
}

The string s has 5 characters, so the output should be 5, right? Or is it? 🤔

If we run the code here: https://play.golang.com/p/4F4ZkyWJAiQ, we find that the output is 6. That's interesting. Why is that? Let's try to understand.

In Go, a string is in effect a read-only slice of bytes, and len returns the number of bytes, not the number of characters. Let's examine our string and see what it contains.

package main

import (
    "fmt"
)

func main() {
    var s = "abcdè"
    fmt.Printf("% x", s)
}

https://play.golang.com/p/Q0m5l6fndhy

On executing the code, we are greeted with the following output.

61 62 63 64 c3 a8

Now things are starting to unfold. The string actually holds 6 bytes, which is why len reported 6.
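The same bytes can be reached by indexing the string directly. The following is a minimal sketch, not part of the original snippets: bytesOf is a hypothetical helper name, and the literal is written as "abcd\u00e8" on the assumption that è is the single code point U+00E8.

```go
package main

import "fmt"

// bytesOf returns a copy of the bytes that make up s.
// (Hypothetical helper, named only for this illustration.)
func bytesOf(s string) []byte {
	return []byte(s)
}

func main() {
	// Assumption: "\u00e8" is the same è used in the article's string.
	s := "abcd\u00e8"

	// len counts bytes, so the string and its byte slice agree.
	fmt.Println(len(s), len(bytesOf(s)))

	// Indexing a string yields one byte at a time, not one character.
	for i := 0; i < len(s); i++ {
		fmt.Printf("s[%d] = %#x\n", i, s[i])
	}
}
```

Indexes 4 and 5 both belong to the last character, so s[4] on its own is only half of è.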

But what do these values mean? Let's find out.

package main

import (
    "fmt"
)

func main() {
    var s = "abcdè"
    fmt.Printf("%+q", s)
}

The %+q verb prints a quoted string and escapes every non-ASCII character.

https://play.golang.com/p/IlQUDUX9DHB

"abcd\u00e8"

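Since %+q escapes non-ASCII characters, we can see that the last character is the Unicode code point U+00E8. Go source code is UTF-8, so the string literal is stored as UTF-8 bytes, and c3 a8 is the two-byte UTF-8 encoding of U+00E8. The other four characters are ASCII and take one byte each (61 62 63 64). In other words, our string holds the UTF-8 encoding of its characters, not one byte per character. We can confirm this by printing the bytes in a shell: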
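To see how the code point 00E8 becomes the bytes c3 a8, here is a small sketch under stated assumptions. encodeRune and encodeTwoByte are hypothetical helper names; the first relies on the standard library's utf8.EncodeRune, and the second spells out the two-byte bit layout by hand, which only applies to code points from U+0080 to U+07FF.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// encodeRune returns the UTF-8 bytes for r using the standard library.
// (Hypothetical helper, named only for this illustration.)
func encodeRune(r rune) []byte {
	buf := make([]byte, utf8.UTFMax)
	n := utf8.EncodeRune(buf, r)
	return buf[:n]
}

// encodeTwoByte builds the two-byte form 110xxxxx 10xxxxxx by hand.
// It is only valid for code points in the range U+0080..U+07FF.
func encodeTwoByte(r rune) []byte {
	return []byte{
		0xC0 | byte(r>>6),   // top 5 bits behind the 110 prefix
		0x80 | byte(r&0x3F), // low 6 bits behind the 10 prefix
	}
}

func main() {
	fmt.Printf("% x\n", encodeRune('a'))       // ASCII: a single byte
	fmt.Printf("% x\n", encodeRune(0xE8))      // two bytes
	fmt.Printf("% x\n", encodeTwoByte(0xE8))   // same two bytes, by hand
}
```

For 0xE8 (binary 11101000), the top bits 00011 give 0xC3 and the low six bits 101000 give 0xA8, matching the hex dump above.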

$ printf '\x61\n'
a
$ printf '\x62\n'
b
$ printf '\x63\n'
c
$ printf '\x64\n'
d
$ printf '\xC3\xA8\n'
è
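If what you want is the number of characters rather than bytes, one option is to count runes (Unicode code points) instead. This is a hedged sketch rather than the only way to do it: runeStarts is a hypothetical helper, and the string is again written with the \u00e8 escape on the assumption that è is a single code point.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// runeStarts returns the byte index at which each rune of s begins.
// (Hypothetical helper, named only for this illustration.)
func runeStarts(s string) []int {
	var starts []int
	// Ranging over a string decodes UTF-8 and steps rune by rune.
	for i := range s {
		starts = append(starts, i)
	}
	return starts
}

func main() {
	s := "abcd\u00e8"

	// Bytes versus runes.
	fmt.Println(len(s), utf8.RuneCountInString(s))

	// i is the byte offset of each rune, r is the decoded code point.
	for i, r := range s {
		fmt.Printf("%d: %c %U\n", i, r, r)
	}

	// A multi-byte rune makes the next offset jump by more than one.
	fmt.Println(runeStarts("\u00e8a"))
}
```

Note that a rune is a code point, so text that uses combining marks can still show fewer visible characters than it has runes.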

So that is the reason behind this behaviour. Feel free to post any questions and feedback in the comments below.
