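What Are Approximation Algorithms, and Which Problems Do They Suit? This Article Has the Answers

This article introduces approximation algorithms and how well they apply to some standard problems. Approximation algorithms are a way of coping with the NP-completeness of optimization problems: the goal is to get close to the optimal value in polynomial time. Using the partition problem and the bin packing problem as examples, it looks at how different approximation algorithms are applied, including greedy number partitioning and the Karmarkar-Karp algorithm.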

Source: 机器之心

Bertrand Russell once said that all exact science is dominated by the idea of approximation.

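The COVID-19 pandemic has changed the world enormously, and scientists and researchers around the globe are working to develop effective vaccines. What they are doing is narrowing down the range of possibilities within a vast sample space by approximation, trying to arrive at some workable solutions. Approximation plays an important role in our lives.

Take online food delivery. We often order food online and enjoy fast delivery. But have you ever wondered what algorithm runs in the back end of these apps to get the courier to the destination in less time? The answer is an approximation algorithm, and the underlying problem is the traveling salesman problem.

Food delivery: a real-world application of the traveling salesman problem.

This article introduces approximation algorithms and their applicability to some standard problems, along with the factors that influence the choice of a particular algorithm.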

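What is an approximation algorithm?

An approximation algorithm is a way of dealing with the NP-completeness of an optimization problem. It does not guarantee an optimal solution; its goal is to get as close to the optimal value as possible in polynomial time.

Although it cannot promise the exact optimum, it converges on an approximation of the final solution. It aims for the following three key properties:

  • it runs efficiently, in polynomial time;

  • it returns a solution close to the optimum;

  • it works for every instance of the problem.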

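Background

Analyzing a mathematical expression usually means looking at its constants, its variables, and its order, which can be used to gauge the complexity of an approximation. This kind of analysis sorts problems into P problems and NP-hard problems.

Strategies for P and NP problems

P problems are those that can be solved in polynomial time.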

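NP stands for nondeterministic polynomial time. NP problems are those whose answers can be verified in polynomial time. For many of them, however, the best algorithms known so far need exponential time to find a solution.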

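P and NP strategies.

The real debate is whether P = NP or P ≠ NP, and so far neither has been proven. If a problem can be solved in polynomial time, efficient exact algorithms for it are available. For NP-complete problems, by contrast, there are two ways to proceed, and we choose whichever fits best.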

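First, if the input is small, an algorithm with exponential running time may be perfectly adequate.

Second, by replacing the exact algorithm with an approximation algorithm, we can still find a near-optimal solution in polynomial time.

The complexity of an approximation algorithm can be inferred from the input size and the approximation factor. Next, we use a few examples to explore how these algorithms apply to real problems.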

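Partition Problem

In computer science, the problem is defined as follows: given a multiset X of positive integers, decide whether it can be split into two subsets X1 and X2 whose elements add up to the same value.

For example, X={3,4,1,3,3,2,3,2,1} can be split into X1={3,3,2,3} and X2={4,2,3,1,1}, each with a sum of 11.

Similarly, X={1,3,1,2,1,2} can be split into X1={2,1,1,1} and X2={3,2}, each with a sum of 5. Interestingly, this is not the only solution: X1={1,3,1} and X2={2,1,2} also sum to 5 each, which shows that several valid splits can exist.

The problem is NP-complete, but it has a pseudo-polynomial-time dynamic programming solution, and there are heuristics that reach optimal or near-optimal solutions in many cases.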
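The pseudo-polynomial dynamic program mentioned above can be sketched as follows. This is a minimal illustration rather than the article's own code: the function name `can_partition` is hypothetical, and the sketch assumes the inputs are positive integers. Its running time is proportional to n times the target sum, which is why it is only pseudo-polynomial.

```python
def can_partition(numbers):
    """Decide whether positive integers can be split into two equal-sum subsets."""
    total = sum(numbers)
    if total % 2 == 1:
        return False  # an odd total can never be split evenly
    target = total // 2
    # reachable[s] is True when some subset of the numbers seen so far sums to s.
    reachable = [True] + [False] * target
    for n in numbers:
        # Go downwards so that each number is used at most once.
        for s in range(target, n - 1, -1):
            if reachable[s - n]:
                reachable[s] = True
    return reachable[target]


print(can_partition([3, 4, 1, 3, 3, 2, 3, 2, 1]))  # True
print(can_partition([1, 2, 5]))                    # False
```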

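Approach and decision steps

Now let us analyze the problem by breaking it down into separate standard problems. We want to find subsets of a multiset whose elements add up to the same value, so the problem can be decomposed into the following two:

  • Subset sum: find a subset of X whose elements add up to a given number W.

  • Multiway number partitioning: given an integer parameter W, decide how to split X into W subsets with equal sums.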
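To make the subset sum formulation concrete, here is a small sketch that returns one qualifying subset instead of a yes/no answer. The function name `find_subset_with_sum` is a hypothetical helper, and the sketch assumes positive integers; it records, for every reachable sum, one list of indices that produces it.

```python
def find_subset_with_sum(numbers, target):
    """Return a list of values from `numbers` that sums to `target`, or None."""
    # chosen[s] holds one list of indices whose values add up to s.
    chosen = {0: []}
    for index, n in enumerate(numbers):
        # Iterate over a snapshot so that each number is used at most once.
        for s, indices in list(chosen.items()):
            new_sum = s + n
            if new_sum <= target and new_sum not in chosen:
                chosen[new_sum] = indices + [index]
    if target not in chosen:
        return None
    return [numbers[i] for i in chosen[target]]


X = [3, 4, 1, 3, 3, 2, 3, 2, 1]
half = find_subset_with_sum(X, sum(X) // 2)
print(half, sum(half))
```

Asking for half of the total, as in the usage lines, turns a subset sum answer into one side of a partition.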

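Approximation algorithms

As noted above, once the partition problem has been broken down into multiway partitioning and subset sum, we can look at the algorithms developed for those problems, including:

Greedy number partitioning

The algorithm loops over all the numbers and assigns each one to the subset whose current sum is smallest. If the numbers are not sorted, its running time is O(n) and its approximation ratio is at most 3/2. Python pseudocode follows (this version sorts the input in descending order first):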

def find_partition(numbers):
    """Separate the available numbers into two series of roughly equal sum.

    Args:
        numbers: collection of numbers, for example a list of integers.

    Returns:
        Two lists of numbers.
    """
    X = []
    Y = []
    sum_X = 0
    sum_Y = 0
    # Visit the numbers from largest to smallest.
    for n in sorted(numbers, reverse=True):
        # Give the current number to whichever list has the smaller sum.
        if sum_X < sum_Y:
            X.append(n)
            sum_X = sum_X + n
        else:
            Y.append(n)
            sum_Y = sum_Y + n
    return (X, Y)

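If the numbers are sorted first, the running time rises to O(n log n) and the approximation ratio improves to 7/6. If the numbers are uniformly distributed in [0,1], the approximation ratio is about 1 + O(log log n / n).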
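The effect of sorting can be seen on a small input. The sketch below is an illustration under stated assumptions: `greedy_partition`, its `k` and `presort` parameters, and the sample numbers are all made up for this example, and a heap is used only to find the subset with the smallest sum.

```python
import heapq


def greedy_partition(numbers, k=2, presort=True):
    """Assign each number to the subset with the smallest current sum."""
    order = sorted(numbers, reverse=True) if presort else list(numbers)
    subsets = [[] for _ in range(k)]
    # Heap entries are (current sum, subset index).
    heap = [(0, i) for i in range(k)]
    for n in order:
        current_sum, i = heapq.heappop(heap)
        subsets[i].append(n)
        heapq.heappush(heap, (current_sum + n, i))
    return subsets


numbers = [4, 5, 6, 7, 8]  # a perfect split exists: {7, 8} and {4, 5, 6}
unsorted_result = greedy_partition(numbers, presort=False)
sorted_result = greedy_partition(numbers, presort=True)
print(max(sum(s) for s in unsorted_result))  # 18
print(max(sum(s) for s in sorted_result))    # 17
```

The best possible largest sum here is 15, so the unsorted run lands at 18/15 and the sorted run at 17/15, both within the bounds quoted above.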

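Illustration of the partition problem.

The figure shows all possible partitions as a binary tree. The root of the tree represents the largest number in the set, each level corresponds to one input number, and each branch corresponds to a different subset. Visiting all of these partitions calls for a depth-first traversal, which needs O(n) space and O(2^n) time.

Applicability:

The algorithm can be adapted to the situation in order to improve its running time. At each level, the first priority is to build the branch that assigns the current number to the subset with the smallest sum. The search therefore starts from the sums found by greedy number partitioning and then switches to optimization, looking for progressively better approximate solutions.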
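The idea of trying the greedy branch first and then searching on can be sketched as a depth-first search with pruning. This is one possible reading of the description, not the article's implementation; the name `best_partition_difference` is hypothetical and the sketch assumes non-negative integers.

```python
def best_partition_difference(numbers):
    """Return the smallest achievable difference between two subset sums."""
    nums = sorted(numbers, reverse=True)
    total = sum(nums)
    best = total  # worst case: every number in the same subset

    def dfs(i, sum_a, sum_b):
        nonlocal best
        if best == total % 2:
            return  # already as balanced as the total allows
        if i == len(nums):
            best = min(best, abs(sum_a - sum_b))
            return
        remaining = total - sum_a - sum_b
        # Prune: even giving everything left to the smaller side cannot improve.
        if abs(sum_a - sum_b) - remaining >= best:
            return
        n = nums[i]
        # Greedy branch first: put n into the subset with the smaller sum.
        if sum_a <= sum_b:
            dfs(i + 1, sum_a + n, sum_b)
            dfs(i + 1, sum_a, sum_b + n)
        else:
            dfs(i + 1, sum_a, sum_b + n)
            dfs(i + 1, sum_a + n, sum_b)

    dfs(0, 0, 0)
    return best


print(best_partition_difference([4, 5, 6, 7, 8]))  # 0
print(best_partition_difference([1, 2, 5]))        # 2
```

The first leaf reached is exactly the greedy solution; every later leaf can only lower `best`, so the search can be stopped early and still return a usable answer.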

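Karmarkar-Karp algorithm

The Karmarkar-Karp algorithm is the largest differencing method: it orders the numbers in descending order and repeatedly replaces the two largest numbers with their difference, putting that difference back into the set. A Java pseudocode implementation follows: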

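import java.util.Collections;
import java.util.PriorityQueue;

class KarmarkarKarp {
    // Returns the difference between the two subset sums found by largest differencing.
    static int karmarkarKarpPartition(int[] baseArr) {
        if (baseArr.length == 0) {
            return 0; // nothing to partition
        }
        // create max heap
        PriorityQueue<Integer> heap = new PriorityQueue<>(baseArr.length, Collections.reverseOrder());

        for (int value : baseArr) {
            heap.add(value);
        }

        // replace the two largest numbers with their difference until one number is left
        while (heap.size() > 1) {
            int val1 = heap.poll();
            int val2 = heap.poll();
            heap.add(val1 - val2);
        }

        return heap.poll();
    }
}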

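The algorithm takes an input set S and a parameter k. It splits S into k subsets whose sums are as nearly equal as possible, and that is the desired output. Its key steps are:

  • sort the numbers in descending order;

  • replace the numbers with their differences until only one number is left;

  • backtrack to complete the partition.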
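The three steps above, including the backtracking that recovers the actual subsets, can be sketched for the two-subset case as follows. This is an illustrative version under assumptions: the name `karmarkar_karp` is hypothetical, the inputs are taken to be non-negative integers, and the backtracking is done by two-coloring the "must be in different subsets" links recorded during differencing.

```python
import heapq
from collections import defaultdict, deque


def karmarkar_karp(numbers):
    """Largest differencing for two subsets: returns (difference, subset_a, subset_b)."""
    if not numbers:
        return 0, [], []
    # heapq is a min-heap, so values are negated to pop the largest first.
    heap = [(-value, index) for index, value in enumerate(numbers)]
    heapq.heapify(heap)
    opposite = defaultdict(list)  # index -> indices forced into the other subset

    while len(heap) > 1:
        neg_a, idx_a = heapq.heappop(heap)  # largest remaining value
        neg_b, idx_b = heapq.heappop(heap)  # second largest
        opposite[idx_a].append(idx_b)
        opposite[idx_b].append(idx_a)
        # The difference stands in for both numbers, on the side of idx_a.
        heapq.heappush(heap, (neg_a - neg_b, idx_a))

    difference, root = -heap[0][0], heap[0][1]

    # Backtrack: walk the recorded links and alternate sides.
    side = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for other in opposite[node]:
            if other not in side:
                side[other] = 1 - side[node]
                queue.append(other)

    subset_a = [numbers[i] for i in range(len(numbers)) if side[i] == 0]
    subset_b = [numbers[i] for i in range(len(numbers)) if side[i] == 1]
    return difference, subset_a, subset_b


print(karmarkar_karp([4, 5, 6, 7, 8]))  # (2, [4, 5, 7], [6, 8])
```

On this input the method settles for a difference of 2 even though a perfect split exists, which is why it counts as an approximation.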

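Applicability:

The complete version of the algorithm explores the possible partitions by building a binary tree. Each level corresponds to a pair of numbers: the left branch replaces the pair with their difference (placing them in different subsets), and the right branch replaces them with their sum (placing them in the same subset). The algorithm first finds the solution given by largest differencing and then goes on looking for better approximations. It needs O(n) space, but in the worst case its running time can reach O(2^n).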

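Bin packing problem

Bin packing has many real-world applications. One example is fundamentally improving India's waste management system: the task can be framed as a bin packing problem, helping the authorities decide how many bins are needed for x amount of garbage.

Container ship: a real-world application of the bin packing problem.

In computer science, the problem appears in a variety of memory management techniques. The aim is to pack objects of different shapes and sizes while removing redundancy and minimizing wasted space.

For example: given a set of n items with sizes s1, s2, ..., sn (0 <= si <= 1 for 1 <= i <= n), how can they be packed into the smallest number of bins?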

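Classic approaches:

1. Next Fit: check whether the current item fits in the current bin. If it does, put it there; otherwise open a new bin.

Consider an example: the items are 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6, and every bin has size 1.

Bin packing solution using Next Fit (M = total number of bins = 6).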
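The Next Fit rule can be sketched in a few lines. The function name `next_fit` and the tolerance `EPS` are assumptions of this sketch (the tolerance only guards against floating-point rounding); items are assumed to be no larger than the bin capacity.

```python
EPS = 1e-9  # tolerance for floating-point sums (an assumption of this sketch)


def next_fit(items, capacity=1.0):
    """Pack items in order, only ever looking at the most recently opened bin."""
    bins = []
    current, load = [], 0.0
    for size in items:
        if load + size <= capacity + EPS:
            current.append(size)
            load += size
        else:
            bins.append(current)  # close the current bin for good
            current, load = [size], size
    if current:
        bins.append(current)
    return bins


items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6]
packing = next_fit(items)
print(len(packing))  # 6
print(packing)
```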

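2. First Fit: scan the bins in order and put the new item in the first bin that has room for it. Open a new bin only when no existing bin can take the item.

Consider an example: the items are 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6, and every bin has size 1.

Bin packing solution using First Fit (M = total number of bins = 5).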
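A straightforward sketch of First Fit follows. The name `first_fit` and the tolerance `EPS` are assumptions of this example, and the linear scan over bins makes this version quadratic in the worst case.

```python
EPS = 1e-9  # tolerance for floating-point sums (an assumption of this sketch)


def first_fit(items, capacity=1.0):
    """Place each item in the first already-open bin that has room."""
    bins, loads = [], []
    for size in items:
        for i, load in enumerate(loads):
            if load + size <= capacity + EPS:
                bins[i].append(size)
                loads[i] += size
                break
        else:
            # No open bin could take the item, so start a new one.
            bins.append([size])
            loads.append(size)
    return bins


items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6]
packing = first_fit(items)
print(len(packing))  # 5
print(packing)
```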

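3. Best Fit: scan the bins in order and put each new item in the bin where it fits most tightly. If it fits in none of them, open a new bin.

Consider an example: the items are 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6, and every bin has size 1.

Bin packing solution using Best Fit (M = total number of bins = 5).

On this example the output matches First Fit. An advantage of the method is that it can be implemented efficiently, with a time complexity of O(n log n).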
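Best Fit differs from First Fit only in which open bin it picks. The sketch below uses a plain linear scan for clarity, so it runs in quadratic time; reaching O(n log n) would need an ordered structure over the bin loads, which is left out here. The name `best_fit` and the tolerance `EPS` are assumptions of this example.

```python
EPS = 1e-9  # tolerance for floating-point sums (an assumption of this sketch)


def best_fit(items, capacity=1.0):
    """Place each item in the fullest open bin that still has room for it."""
    bins, loads = [], []
    for size in items:
        best = None
        for i, load in enumerate(loads):
            if load + size <= capacity + EPS:
                # Prefer the bin that would be left with the least free space.
                if best is None or load > loads[best]:
                    best = i
        if best is None:
            bins.append([size])
            loads.append(size)
        else:
            bins[best].append(size)
            loads[best] += size
    return bins


items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6]
packing = best_fit(items)
print(len(packing))  # 5
print(packing)
```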

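Natural approach:

If the sizes of all items are known in advance, the natural solution is to sort them from largest to smallest first and then apply one of the following heuristics:

  • First Fit Decreasing

  • Best Fit Decreasing

Take the same example, 0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6. Sorted, it becomes 0.7, 0.6, 0.5, 0.5, 0.5, 0.4, 0.2, 0.2, 0.1.

Optimized approach (M = total number of bins = 4).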
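First Fit Decreasing on this example can be sketched as below, together with a simple lower bound: no packing can use fewer bins than the total item size divided by the bin capacity, rounded up. The names `first_fit_decreasing` and `EPS` are assumptions of this sketch.

```python
import math

EPS = 1e-9  # tolerance for floating-point sums (an assumption of this sketch)


def first_fit_decreasing(items, capacity=1.0):
    """Sort items from largest to smallest, then apply the First Fit rule."""
    bins, loads = [], []
    for size in sorted(items, reverse=True):
        for i, load in enumerate(loads):
            if load + size <= capacity + EPS:
                bins[i].append(size)
                loads[i] += size
                break
        else:
            bins.append([size])
            loads.append(size)
    return bins


items = [0.5, 0.7, 0.5, 0.2, 0.4, 0.2, 0.5, 0.1, 0.6]
packing = first_fit_decreasing(items)
lower_bound = math.ceil(sum(items) / 1.0 - EPS)
print(len(packing), lower_bound)  # 4 4
print(packing)
```

Since the packing uses exactly as many bins as the lower bound, four bins is optimal for this particular input.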


Original article: https://medium.com/aryan-gupta18/how-to-decide-suitability-of-approximation-algorithms-d8e45b90e530
