geektimes

Algorithms in Go: Bit Manipulation

  • понедельник, 12 апреля 2021 г. в 00:13:16
https://habr.com/en/post/551732/
  • Programming
  • Algorithms
  • Go
  • Interview


This article is a part of Algorithms in Go series where we discuss common algorithmic problems and their solution patterns.


In this edition, we take a closer look at bit manipulations. Bit operations can be extremely powerful and useful in an entire class of algorithmic problems, including problems that at first glance does not have to do anything with bits.


Let's consider the following problem: six friends meet in the bar and decide who pays for the next round. They would like to select a random person among them for that. How can they do a random selection using only a single coin?



The solution to this problem is not particularly obvious (for me:), so let's simplify a problem for a moment to develop our understanding. How would we do the selection if there were only three friends? In other words, how would we "mimic" a three-sided coin with a two-sided coin?


Well, we can throw the coin twice and enlist all possible options:


  • tails and tails: Joey
  • tails and heads: Phoebe
  • heads and tails: Rachel
  • heads and heads: no-op, do the selection again

Now, this looks suspiciously close to a binary encoding.


  • 0 and 0: Joey
  • 0 and 1: Phoebe
  • 1 and 0: Rachel
  • 1 and 1: no-op, do the selection again

So in general, we need to conduct enough trials to encode all the options. If we get an invalid outcome, we repeat the process.


The signature for the function will look as follows:


type Choice string

// 6 possible choices for six friends.
var choices = []Choice{"Joey", "Phoebe", "Rachel", "Chandler", "Ross", "Monica"}

// select a random friend from the list of choices.
func Select() Choice {
    ...
}

Let's start with the tests before implementing the actual solution. How can we test a random generator? First, we need to ensure that we have a uniform distribution, i.e. that each possible choice can be selected with the same probability. It is not possible to test this claim on a single trial. However, we can check this statistically. We do a series of trials and count how many times each friend was chosen. After all trials, each friend must have been chosen approximately the same number of times. In our test, we conduct 10,000 trials and allow a delta of ten percent.


func TestUniformity(t *testing.T) {
    n := 10_000 // 10,000 trials
    buckets := make([]int, len(choices))
    for i := 0; i < n; i++ {
        trial := Select()
        buckets[trial-1]++
    }
    delta := 0.1 // 10 percent
    diff := delta * float64(buckets[0])
    // expected range of the results
    min, max := float64(buckets[0])-diff, float64(buckets[0])+diff

    for _, value := range buckets[1:] {
        assert.Greater(t, float64(value), min)
        assert.Less(t, float64(value), max)
    }
}

However, this test alone is not sufficient. Let's write a mock generator that passes this test but actually does not produce random numbers.


func mockGenerator() func() Choice {
    state := -1
    f := func() Choice {
        state = (state + 1) % 6
        return choices[state]
    }
    return f
}

This generator returns a selection function that produces the same sequence ["Joey", "Phoebe", "Rachel", "Chandler", "Ros", "Monica"] in a circle. The distribution is uniform, however, it is not random.


Let's add one more test that detects the circles in the output. We define the size of the window and then split the output. If each window contains exactly the same sequence, then we find a circle. If not, we adjust the size of the window and repeat the process. If we reach the size of the window of 1 and didn't find the circle, then the test passes.


func TestCircle(t *testing.T) {
    trials := make([]Choice, 0, 10000)
    for i := 0; i < 10000; i++ {
        trials = append(trials, Select())
    }

    for size := 100; size >= 1; size-- {
        first := trials[:size]
        circle := true
        for start := size; start+size <= len(trials); start = start + size {
            next := trials[start : start+size]
            if !cmp.Equal(first, next) {
                circle = false
                break
            } else {
                circle = true
            }
        }
        assert.False(t, circle, size)
    }
}

Note, that in our case each window will be represented as a slice of Choice. In Golang, the equality operation is not defined on slices, therefore we use a helper library github.com/google/go-cmp/cmp.


Our mock generator won't pass the second test, as it produces the circled sequence.


Now we can proceed to the actual implementation of the function. How many trials would we need to conduct? In other words, how many bits do we need to encode all possible choices. Each bite can have two possible values 0 or 1, therefore, a sequence of bytes can represent 2n values, where n is the number of bytes.


numberOfChoices = 2n
n = log2(numberOfChoices)

Therefore, we require at least n bytes (or trials). We need to round up this number to the closest integer. Some outcomes won't be meaningful: i.e. if have three friends, then we need at least two bytes. However, we can encode four values with two bytes, therefore, we will need to repeat the selection if we get the invalid result.


So we can outline our algorithm:


  1. Calculate the number of bytes n needed to represent all possible choices.
  2. Generate the sequence of bytes of size n at random.
  3. Convert the sequence of byte to a decimal number.
  4. Check whether the number has a meaning, i.e. whether this number has an associated choice. If the check was successful, return the number. Otherwise, repeat the process.

func Select() Choice {
    numberOfChoices := len(choices)
    raw := math.Log2(float64(numberOfChoices))
    n := int(math.Ceil(raw))                   

    var choice []int
    for i := 0; i < n; i++ {
        trial := rand.Intn(2) // select at random `0` or `1`
        choice = append(choice, trial)
    }

    var str string
    for _, c := range choice {
        str += strconv.Itoa(c)
    }

        // convert to a decimal
    dec, err := strconv.ParseInt(str, 2, 0)
    if err != nil {
        panic(err)
    }

    if int(dec) >= numberOfChoices {
        return Select()
    }

    return choices[dec]
}

Here we used a standard library function ParseInt to convert a string representation of a binary number to a decimal. We can avoid the double conversation from a binary integer to a string and then back a decimal integer, and do the direct conversion:


var choice int
for i := 0; i < n; i++ {
    trial := rand.Intn(2)
    choice = choice*2 + trial
}

The rest of the function remains the same. Let's write a simple benchmark to see whether the change made any difference.


func Benchmark(b *testing.B) {
    for i := 0; i < b.N; i++ {
        a := Select()
        _ = a
    }
}

func Benchmark1(b *testing.B) {
    for i := 0; i < b.N; i++ {
    a := Select1()
    _ = a
    }
}

The second version is two times faster than the first one.


Can we do better than that?


Let's take a closer look at our task. We need to make n choices (choosing between 0 and 1), where n is the number of bits. To do so we use a for loop and at every iteration, we add one bit of information to our final result. When we have selected all the required bits, we got a sequence of bits. This sequence of bits can be represented as a decimal number. This number, in turn, represents a particular friend that has been chosen to pay the bill.


We can employ left shift operation to build up the resulting decimal number. We start with out equal to zero, and at every iteration we shift out to the left by one bit, making space for the next bit. Then we select the next bit at random and assign it to variable next. After that we apply union operation to out and next, so updated out now also takes into account the currently selected bit. We repeat this process until we have selected enough bits to represent all friends, i.e. 2 ** bits must become equal or larger than len(choice). As function math.Pow in Golang defined only for floats, the above condition can be more succinctly represented using another left shift operation (1 << bits) < len(choices)


var bits, out int
for (1 << bits) < len(choices) {
    next := rand.Intn(2)
    out = (out << 1) | next
    bits++
}

In case if resulting out does not have a meaning, i.e. it does not represent any friend, we need to repeat the process again.


func Select2() Choice {
    for {
        var bits, out int
        for (1 << bits) < len(choices) {
            next := rand.Intn(2)
            out = (out << 1) | next
            bits++
        }
        if out < len(choices) {
            return choices[out]
        }
    }
}

According to the benchmark, this version is two times faster than our second version.


In this post, we implemented a random generator employing bit operations. More algorithmic patterns such as Matrix Spiral or Dutch National Flag can be found in the series Algorithms in Go.