2018-10-18
Contents
本周因为公司需要搞个技术小组分享形式的内部会议,所以很匆忙地赶了一些粗制滥造的算法内容出来。主要有最简单的bfs、dfs、union-find、popcount等算法。以下为内容:
Graph
intro and definition
- subgraph
- connectivity
- trees and forest
- 1 simple unbalanced tree sort
BFS
- Definition: A BFS traversal of a graph returns the nodes of the graph level by level.
- Application form: by queue
A queue is a line. If you’re the first to get in a bus line, you’re the first to get on the bus. First In, First Out.
leetcode
1 | 515. Find Largest Value in Each Tree Row |
solution
1 |
|
Attention
- empty data set may cause exception
- list length should not change during process of fetching one element, should use offset to start with
DFS
- definition
- usage
- complexity
- further usage (1. path between two vertices 2. find a cycle in the graph)
leetcode
1 | 547. Friend Circles |
DFS solution
1 | func dfs(M [][]int, i int, visit []bool) { |
Union find solution
1 |
|
Union find
In computer science, a disjoint-set data structure (also called a union–find data structure or merge–find set) is a data structure that tracks a set of elements partitioned into a number of disjoint (non-overlapping) subsets. It provides near-constant-time operations (bounded by the inverse Ackermann function) to add new sets, to merge existing sets, and to determine whether elements are in the same set. In addition to many other uses (see the Applications section), disjoint-sets play a key role in Kruskal’s algorithm for finding the minimum spanning tree of a graph.
make connections
- usage seriano
- find
- union
- find and compression
Greatest Common Divisor Algorithm
In mathematics, the Euclidean algorithm[a], or Euclid’s algorithm, is an efficient method for computing the greatest common divisor (GCD) of two numbers, the largest number that divides both of them without leaving a remainder. It is named after the ancient Greek mathematician Euclid, who first described it in his Elements (c. 300 BC). It is an example of an algorithm, a step-by-step procedure for performing a calculation according to well-defined rules, and is one of the oldest algorithms in common use. It can be used to reduce fractions to their simplest form, and is a part of many other number-theoretic and cryptographic calculations.
The Euclidean algorithm is based on the principle that the greatest common divisor of two numbers does not change if the larger number is replaced by its difference with the smaller number. For example, 21 is the GCD of 252 and 105 (as 252 = 21 × 12 and 105 = 21 × 5), and the same number 21 is also the GCD of 105 and 252 − 105 = 147. Since this replacement reduces the larger of the two numbers, repeating this process gives successively smaller pairs of numbers until the two numbers become equal. When that occurs, they are the GCD of the original two numbers. By reversing the steps, the GCD can be expressed as a sum of the two original numbers each multiplied by a positive or negative integer, e.g., 21 = 5 × 105 + (−2) × 252. The fact that the GCD can always be expressed in this way is known as Bézout’s identity.
Binary Greatest Common Divisor Algoroithm
The binary GCD algorithm, also known as Stein’s algorithm, is an algorithm that computes the greatest common divisor of two nonnegative integers. Stein’s algorithm uses simpler arithmetic operations than the conventional Euclidean algorithm; it replaces division with arithmetic shifts, comparisons, and subtraction. Although the algorithm was first published by the Israeli physicist and programmer Josef Stein in 1967,[1] it may have been known in 1st-century China.
1 |
|
Common GCD Algorithm
1 | func CommonGCD(x, y int64) int64 { |
Benchmark
1 | // small numbers |
- Euclidean
1 | goos: windows |
- Binary
1 | goos: windows |
ALMOST double its effience!!!
1 | // int64 big numbers |
- Euclidean
1 | goos: windows |
- Binary
1 | goos: windows |
Each op differs by almost 0.5 ms! still quite fast in practice
Popcount Algorithm(Hamming Weight)
Question: how to count all the 1s in a 0-1 binary string of a number
- arithmatical op: %2 == 1; n /= 2
- bitwise op
2.1 iterated popcount
2.2 sparse popcount
2.3 dense popcount
2.4 lookup popcount
2.5 parallel popcount
2.6 to be continued…. cannot understand any more
code
1 | var pc [256]byte = [...]byte{ |
test
1 |
|
benchmark
1 | goos: windows |
as to lookup popcount, the approach is use space to exchange time
thus, parallel pop count uses devide-and-conquer strategy to count.
leetcode
1 | 191. Number of 1 Bits |
Solution
1 | class Solution(object): |
Refer
Calculating Hamming Weight in O(1)
Sieve of Eratosthenes derived from popcount
In mathematics, the sieve of Eratosthenes is a simple, ancient algorithm for finding all prime numbers up to any given limit.
It does so by iteratively marking as composite (i.e., not prime) the multiples of each prime, starting with the first prime number, 2. The multiples of a given prime are generated as a sequence of numbers starting from that prime, with constant difference between them that is equal to that prime. This is the sieve’s key distinction from using trial division to sequentially test each candidate number for divisibility by each prime.
Two Approaches to get prime table
- normal
code
1 | func PrimeNumbers(n int) (ans []int) { |
test
1 | func BenchmarkPrimeNumbers(b *testing.B) { |
benchmark
1 | goos: darwin |
- eratosthenes
code
1 | func PrimeTable(n int) (ans []int) { |
test
1 | func BenchmarkPrimeTable(b *testing.B) { |
benchmark
1 | goos: darwin |
The bigger n is, the more time the normal algorithm consumes each op!
Appearantly, use the prime table and sieve is the best way to get a bunch of primes that is smaller than n.
Refer
Author: d0zingcat
Link: https://infloop.life/2018/11/03/2018-10-18/
License: 知识共享署名-非商业性使用 4.0 国际许可协议