本周因为公司需要搞个技术小组分享形式的内部会议，所以很匆忙地赶了一些粗制滥造的算法内容出来。主要有最简单的bfs、dfs、union-find、popcount等算法。以下为内容：

Graph

intro and definition

subgraph
connectivity
trees and forest
1 simple unbalanced tree sort

BFS

Definition: A BFS traversal of a graph returns the nodes of the graph level by level.
Application form: by queue

A queue is a line. If you’re the first to get in a bus line, you’re the first to get on the bus. First In, First Out.

leetcode

515. Find Largest Value in Each Tree Row
You need to find the largest value in each row of a binary tree.

Example:
Input: 

          1
         / \
        3   2
       / \   \  
      5   3   9 

Output: [1, 3, 9]

solution


type TreeNode struct {
	Val   int
	Left  *TreeNode
	Right *TreeNode
}

func bfs(root *TreeNode) []int {
	if root == nil {
		return nil
	}
	var queue []*TreeNode
	var res []int
	queue = append(queue, root)
	pos := 0
	for pos < len(queue) {
		max := ^int(^uint(0) >> 1)
		length := len(queue)
		for i := pos; i < length; i++ {
			t := queue[i]
			if t.Val > max {
				max = t.Val
			}
			if queue[i].Left != nil {
				queue = append(queue, queue[i].Left)
			}
			if queue[i].Right != nil {
				queue = append(queue, queue[i].Right)
			}
			pos++
		}
		res = append(res, max)
	}
	return res
}

func main() {
	root := &TreeNode{Val: 1}
	root.Left = &TreeNode{Val: 3}
	root.Right = &TreeNode{Val: 2}
	root.Left.Left = &TreeNode{Val: 5}
	root.Left.Right = &TreeNode{Val: 3}
	root.Right.Right = &TreeNode{Val: 9}
	fmt.Println(bfs(root))
	fmt.Println(bfs(nil))
}

Attention

empty data set may cause exception
list length should not change during process of fetching one element, should use offset to start with

DFS

definition
usage
complexity
further usage (1. path between two vertices 2. find a cycle in the graph)

leetcode

547. Friend Circles
There are N students in a class. Some of them are friends, while some are not. Their friendship is transitive in nature. For example, if A is a direct friend of B, and B is a direct friend of C, then A is an indirect friend of C. And we defined a friend circle is a group of students who are direct or indirect friends.

Given a N*N matrix M representing the friend relationship between students in the class. If M[i][j] = 1, then the ith and jth students are direct friends with each other, otherwise not. And you have to output the total number of friend circles among all the students.

Example 1:
Input: 
[[1,1,0],
 [1,1,0],
 [0,0,1]]
Output: 2
Explanation:The 0th and 1st students are direct friends, so they are in a friend circle. 
The 2nd student himself is in a friend circle. So return 2.
Example 2:
Input: 
[[1,1,0],
 [1,1,1],
 [0,1,1]]
Output: 1
Explanation:The 0th and 1st students are direct friends, the 1st and 2nd students are direct friends, 
so the 0th and 2nd students are indirect friends. All of them are in the same friend circle, so return 1.
Note:
N is in range [1,200].
M[i][i] = 1 for all students.
If M[i][j] = 1, then M[j][i] = 1.

DFS solution

func dfs(M [][]int, i int, visit []bool) {
	for j := range M {
		if M[i][j] == 1 && !visit[j] {
			visit[j] = true
			dfs(M, j, visit)
		}
	}
}

func findCircleNum(M [][]int) int {
	visit := make([]bool, len(M))
	ans := 0
	for i := range M {
		if !visit[i] {
			ans++
			dfs(M, i, visit)
		}
	}
	return ans
}

Union find solution


var size int

var pre []int
var rank []int
var count int

func findPre(x int) int {
	if pre[x] == x {
		return x
	}
	return findPre(x)
}

func compFindPre(x int) int {
	if pre[x] == x {
		return x
	}
	pre[x] = compFindPre(pre[x])
	return pre[x]
}

func union(x, y int) {
	rootX := compFindPre(x)
	rootY := compFindPre(y)
	if rootX == rootY {
		return
	}

	if rank[rootX] < rank[rootY] {
		pre[rootX] = rootY
	} else {
		pre[rootY] = rootX
		if rank[rootX] == rank[rootY] {
			rank[rootX]++
		}
	}
	count--
}

func FindCircleNum(M [][]int) int {
	size = len(M)
	count = size
	pre = make([]int, size)
	rank = make([]int, size)
	for i := 0; i < size; i++ {
		pre[i] = i
		rank[i] = 1
	}
	for i := 0; i < size; i++ {
		for j := i + 1; j < size; j++ {
			if M[i][j] == 1 {
				union(i, j)
			}
		}
	}
	return count
}

Union find

In computer science, a disjoint-set data structure (also called a union–find data structure or merge–find set) is a data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It provides near-constant-time operations (bounded by the inverse Ackermann function) to add new sets, to merge existing sets, and to determine whether elements are in the same set. In addition to many other uses (see the Applications section), disjoint-sets play a key role in Kruskal’s algorithm for finding the minimum spanning tree of a graph.

make connections

usage seriano
find
union
find and compression

Greatest Common Divisor Algorithm

In mathematics, the Euclidean algorithm[a], or Euclid’s algorithm, is an efficient method for computing the greatest common divisor (GCD) of two numbers, the largest number that divides both of them without leaving a remainder. It is named after the ancient Greek mathematician Euclid, who first described it in his Elements (c. 300 BC). It is an example of an algorithm, a step-by-step procedure for performing a calculation according to well-defined rules, and is one of the oldest algorithms in common use. It can be used to reduce fractions to their simplest form, and is a part of many other number-theoretic and cryptographic calculations.

The Euclidean algorithm is based on the principle that the greatest common divisor of two numbers does not change if the larger number is replaced by its difference with the smaller number. For example, 21 is the GCD of 252 and 105 (as 252 = 21 × 12 and 105 = 21 × 5), and the same number 21 is also the GCD of 105 and 252 − 105 = 147. Since this replacement reduces the larger of the two numbers, repeating this process gives successively smaller pairs of numbers until the two numbers become equal. When that occurs, they are the GCD of the original two numbers. By reversing the steps, the GCD can be expressed as a sum of the two original numbers each multiplied by a positive or negative integer, e.g., 21 = 5 × 105 + (−2) × 252. The fact that the GCD can always be expressed in this way is known as Bézout’s identity.

Binary Greatest Common Divisor Algoroithm

The binary GCD algorithm, also known as Stein’s algorithm, is an algorithm that computes the greatest common divisor of two nonnegative integers. Stein’s algorithm uses simpler arithmetic operations than the conventional Euclidean algorithm; it replaces division with arithmetic shifts, comparisons, and subtraction. Although the algorithm was first published by the Israeli physicist and programmer Josef Stein in 1967,[1] it may have been known in 1st-century China.


// Stein's Algorithm
func BinaryGCD(u, v int64) int64 {
	if u == v {
		return u
	}
	if v == 0 {
		return u
	}
	if u == 0 {
		return v
	}
	if ^u&1 == 1 { // u is even
		if v&1 == 1 { // v is odd
			return BinaryGCD(u>>1, v)
		} else {
			return BinaryGCD(u>>1, v>>1) << 1
		}
	}
	// u is odd
	if ^v&1 == 1 { // v is even
		return BinaryGCD(u, v>>1)
	}
	// v is odd
	if u > v {
		return BinaryGCD((u-v)>>1, v)
	} else {
		return BinaryGCD((v-u)>>1, u)
	}
}

func BGCDRecurrence(u, v int) int {
	if u == v {
		return u
	}
	if u == 0 {
		return v
	}
	if v == 0 {
		return u
	}
	var shift uint
	for shift = 0; (u|v)&1 == 0; shift++ {
		u >>= 1
		v >>= 1
	}
	for u&1 == 0 {
		u >>= 1
	}
	for {
		for v&1 == 0 {
			v >>= 1
		}
		if u > v {
			u, v = v, u
		}
		v = (v - u) >> 1
		if v == 0 {
			break
		}
	}
	return u << shift
}

Common GCD Algorithm

func CommonGCD(x, y int64) int64 {
	if x < y {
		x, y = y, x
	}
	for x%y != 0 {
		x, y = y, x%y
	}
	return y
}

Benchmark

// small numbers
// Euclidean algorithm
func BenchmarkCommonGCD(b *testing.B) {
	for i := 0; i < b.N; i++ {
		CommonGCD(18, 12)
	}
}
// Stein's Algorithm
func BenchmarkBinaryGCD(b *testing.B) {
	for i := 0; i < b.N; i++ {
		BinaryGCD(18, 12)
	}
}

Euclidean

goos: windows
goarch: amd64
pkg: github.com/d0zingcat/labs/gcd/common
BenchmarkCommonGCD-4   	100000000	        20.0 ns/op
PASS
ok  	github.com/d0zingcat/labs/gcd/common	2.104s

Binary

goos: windows
goarch: amd64
pkg: github.com/d0zingcat/labs/gcd/binary
BenchmarkBinaryGCD-4   	100000000	        10.5 ns/op
PASS
ok  	github.com/d0zingcat/labs/gcd/binary	1.165s

ALMOST double its effience!!!

// int64 big numbers
// Euclidean algorithm
func BenchmarkCommonGCD(b *testing.B) {
	for i := 0; i < b.N; i++ {
		r := rand.New(rand.NewSource(time.Now().UnixNano()))
		CommonGCD(r.Int63(), r.Int63())
	}
}

// Stein's Algorithm
func BenchmarkBinaryGCD(b *testing.B) {
	for i := 0; i < b.N; i++ {
		r := rand.New(rand.NewSource(time.Now().UnixNano()))
		BinaryGCD(r.Int63(), r.Int63())
	}
}

Euclidean

goos: windows
goarch: amd64
pkg: github.com/d0zingcat/labs/gcd/common
BenchmarkCommonGCD-4   	  200000	     10610 ns/op
PASS
ok  	github.com/d0zingcat/labs/gcd/common	2.388s

Binary

goos: windows
goarch: amd64
pkg: github.com/d0zingcat/labs/gcd/binary
BenchmarkBinaryGCD-4   	  200000	      9885 ns/op
PASS
ok  	github.com/d0zingcat/labs/gcd/binary	2.192s

Each op differs by almost 0.5 ms! still quite fast in practice

Popcount Algorithm(Hamming Weight)

Question: how to count all the 1s in a 0-1 binary string of a number

arithmatical op: %2 == 1; n /= 2
bitwise op

2.1 iterated popcount
2.2 sparse popcount
2.3 dense popcount
2.4 lookup popcount
2.5 parallel popcount
2.6 to be continued…. cannot understand any more

code

var pc [256]byte = [...]byte{
	0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4, 1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
	1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
	1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
	2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
	1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5, 2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
	2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
	2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6, 3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
	3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7, 4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8,
}

//func init() {
//	for i := range pc {
//		pc[i] = pc[i/2] + byte(i&1)
//	}
//}

func LookupCount(x uint64) int {
	// x != 0 && x & (x-1) == 0 is power to 2
	// equavilent to x & 0xff
	return int(pc[byte(x>>(0*8))] +
		pc[byte(x>>(1*8))] +
		pc[byte(x>>(2*8))] +
		pc[byte(x>>(3*8))] +
		pc[byte(x>>(4*8))] +
		pc[byte(x>>(5*8))] +
		pc[byte(x>>(6*8))] +
		pc[byte(x>>(7*8))])
}

func StupidCount(n int) int {
	count := 0
	for n != 0 {
		if n%2 == 1 {
			count++
		}
		n /= 2
	}
	return count
}

// log2N
func NaiveCount(n int) int {
	count := 0
	for n != 0 {
		count += (n & 0x1)
		n >>= 1
	}
	return count
}

func SparseCount(n int) int {
	count := 0
	for n != 0 {
		count++
		// Zero the lowest-order one-bit
		n &= n - 1
	}
	return count
}

func DenseCount(n int64) int {
	count := 64
	n = ^n
	for n != 0 {
		count--
		n &= (n - 1)
	}
	return count
}

func ParallelCount(n int64) int64 {
	const (
		//uint64_t is an unsigned 64-bit integer variable type (defined in C99 version of C language)
		m1  = 0x5555555555555555 //binary: 0101...
		m2  = 0x3333333333333333 //binary: 00110011..
		m4  = 0x0f0f0f0f0f0f0f0f //binary:  4 zeros,  4 ones ...
		m8  = 0x00ff00ff00ff00ff //binary:  8 zeros,  8 ones ...
		m16 = 0x0000ffff0000ffff //binary: 16 zeros, 16 ones ...
		m32 = 0x00000000ffffffff //binary: 32 zeros, 32 ones
		h01 = 0x0101010101010101 //the sum of 256 to the power of 0,1,2,3...
	)

	// or to be optimized
	const (
		mm1  = 0xffffffffffffffff / (0x2 + 1)
		mm2  = 0xffffffffffffffff / (0x4 + 1)
		mm4  = 0xffffffffffffffff / (0x10 + 1)
		mm8  = 0xffffffffffffffff / (0x100 + 1)
		mm16 = 0xffffffffffffffff / (0x10000 + 1)
		mm32 = 0xffffffffffffffff / (0x100000000 + 1)
	)
	n = (n & m1) + ((n >> 1) & m1)    //put count of each  2 bits into those  2 bits
	n = (n & m2) + ((n >> 2) & m2)    //put count of each  4 bits into those  4 bits
	n = (n & m4) + ((n >> 4) & m4)    //put count of each  8 bits into those  8 bits
	n = (n & m8) + ((n >> 8) & m8)    //put count of each 16 bits into those 16 bits
	n = (n & m16) + ((n >> 16) & m16) //put count of each 32 bits into those 32 bits
	n = (n & m32) + ((n >> 32) & m32) //put count of each 64 bits into those 64 bits
	return n
}

test


const N = 1<<62 + 5543445554

func TestStupidCount(t *testing.T) {
	fmt.Println(StupidCount(N))
}

func TestNaiveCount(t *testing.T) {
	fmt.Println(NaiveCount(N))
}

func TestSparseCount(t *testing.T) {
	fmt.Println(SparseCount(N))
}

func TestDenseCount(t *testing.T) {
	fmt.Println(DenseCount(N))
}

func TestLookupCount(t *testing.T) {
	fmt.Println(LookupCount(N))
}

func TestParallelCount(t *testing.T) {
	fmt.Println(ParallelCount(N))
}

func BenchmarkLookupCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		LookupCount(N)
	}
}

func BenchmarkStupidCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		StupidCount(N)
	}
}

func BenchmarkNaiveCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		NaiveCount(N)
	}
}

func BenchmarkSparseCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		SparseCount(N)
	}
}

func BenchmarkDenseCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		DenseCount(N)
	}
}

func BenchmarkParallelCount(b *testing.B) {
	for i := 0; i < b.N; i++ {
		ParallelCount(N)
	}
}

benchmark

goos: windows
goarch: amd64
pkg: github.com/d0zingcat/labs/popcount
BenchmarkLookupCount-4     	2000000000	         0.31 ns/op
BenchmarkStupidCount-4     	20000000	       107 ns/op
BenchmarkNaiveCount-4      	30000000	        38.9 ns/op
BenchmarkSparseCount-4     	200000000	         6.93 ns/op
BenchmarkDenseCount-4      	30000000	        41.2 ns/op
BenchmarkParallelCount-4   	2000000000	         0.31 ns/op
PASS
ok  	github.com/d0zingcat/labs/popcount	8.252s

as to lookup popcount, the approach is use space to exchange time
thus, parallel pop count uses devide-and-conquer strategy to count.

leetcode

191. Number of 1 Bits

Write a function that takes an unsigned integer and returns the number of '1' bits it has (also known as the Hamming weight).

Example 1:

Input: 11
Output: 3
Explanation: Integer 11 has binary representation 00000000000000000000000000001011 
Example 2:

Input: 128
Output: 1
Explanation: Integer 128 has binary representation 00000000000000000000000010000000

Solution

class Solution(object):
    def hammingWeight(self, n):
        """
        :type n: int
        :rtype: int
        """
        count = 0
        while n:
            count += 1
            n = n & (n-1)
        return count

Refer

Hamming weight

Bit-counting algorithms

Fast Bit Counting

Calculating Hamming Weight in O(1)

Hamming Weight的算法分析

popcount 算法分析

Hamming Weight的算法分析（转载）

Fermat number

Sieve of Eratosthenes derived from popcount

In mathematics, the sieve of Eratosthenes is a simple, ancient algorithm for finding all prime numbers up to any given limit.

It does so by iteratively marking as composite (i.e., not prime) the multiples of each prime, starting with the first prime number, 2. The multiples of a given prime are generated as a sequence of numbers starting from that prime, with constant difference between them that is equal to that prime. This is the sieve’s key distinction from using trial division to sequentially test each candidate number for divisibility by each prime.

Two Approaches to get prime table

normal

code

func PrimeNumbers(n int) (ans []int) {
	for i := 2; i < n; i ++ {
		if isPrime(i) {
			ans = append(ans, i)
		}
	}
	return ans
}

func isPrime(n int) bool {
	for i := 2; i < int(math.Sqrt(float64(n))) + 1; i ++ {
		if n % i == 0 {
			return false
		}
	}
	return true
}

test

func BenchmarkPrimeNumbers(b *testing.B) {
	for i := 0; i < b.N; i ++ {
		PrimeNumbers(1000)
	}
}

benchmark

goos: darwin
goarch: amd64
pkg: github.com/d0zingcat/learning/leetcode/primenumber/normal
BenchmarkPrimeNumbers-8   	   20000	     63843 ns/op
PASS

eratosthenes

code

func PrimeTable(n int) (ans []int) {
	var a []int
	for i := 0; i < n; i ++ {
		a = append(a, i)
	}
	for i := 2; i < n; i++ {
		if a[i] != 0 {
			ans = append(ans, a[i])
		}
		for j := i*2; j < n; j += i {
			a[j] = 0
		}
	}
	return
}

test

func BenchmarkPrimeTable(b *testing.B) {
	for i := 0; i < b.N; i ++ {
		PrimeTable(1000)
	}
}

benchmark

goos: darwin
goarch: amd64
pkg: github.com/d0zingcat/learning/leetcode/primenumber/eratosthenes
BenchmarkPrimeTable-8   	  100000	     10839 ns/op
PASS

The bigger n is, the more time the normal algorithm consumes each op!

Appearantly, use the prime table and sieve is the best way to get a bunch of primes that is smaller than n.

Refer

Euclidean algorithm

Sieve of Eratosthenes

2018-10-18

Contents

Graph

BFS

leetcode

DFS

leetcode

Union find

Greatest Common Divisor Algorithm

Popcount Algorithm(Hamming Weight)

Sieve of Eratosthenes derived from popcount