用户名: 密码: 验证码:
面向软件脆弱性分析的并行符号执行技术研究
详细信息    本馆镜像全文|  推荐本文 |  |   获取CNKI官网全文
摘要
符号执行技术在软件脆弱性分析领域的应用已经取得了较大的发展,相比于传统的模糊测试,在测试数据生成和程序执行路径分析方面具有较大优势,提高了脆弱性分析的能力。但是由于软件执行路径数巨大,在有限的资源条件下逐条路径分析效率较低,测试覆盖率难以提高,成为符号执行发展的瓶颈。
     随着高性能硬件平台和新型计算模型的发展,并行符号执行技术逐渐成为国内外研究的热点。在提高软件安全并行测试效率方面,算法的并行度及漏洞挖掘方法成为研究的关键问题。本文从符号执行、约束求解及其并行化技术等方面,对软件脆弱性分析进行了研究,主要工作可概括如下:
     1、在静态符号执行方面,为了提高符号执行的并行度,提出了以路径簇为负载单位的并行执行方法。通过程序依赖分析,将与程序目标点具有共享控制依赖条件多路径按规则归约成路径簇,以路径簇链为单位进行并行测试的负载调度,在有效减少实际执行路径数目的同时提高执行分析的并行度。
     2、为了处理环境交互问题,提出了基于混合输入的环境交互方法。通过定义程序执行的一致性模型,系统化地分析了符号执行与实际执行的逼近问题;提出了符号执行与具体执行的转换方法及维护一致性的启发式策略。通过区分可求解约束和复杂约束条件,提出了具体值和符号值混合代入法化简复杂约束的路径求解算法。
     3、在动态符号执行方面,为了提高针对特定目标的测试覆盖率,提出了基于敏感点逼近的并行测试方法。该方法结合了动静态分析两方面的优势。利用静态分析技术,提出了基于调用链回溯的敏感点到程序执行路径的距离计算方法,能够为导向型测试提供依据;设计了基于动态符号执行的分布式测试模型,实现了“距离分析——路径求解——用例生成——实际测试”反复迭代的测试过程。以敏感点为导向、以符号执行技术为基础的测试方法,加强了测试的针对性。
     4、在约束求解方面,提出了基于集中式存储的全局约束并行相容模型,并将该技术与回溯搜索结合,实现了并行求解方法。该模型利用动态分配约束条件的方法解决负载均衡问题,通过对变量域的集中式管理,保证了冲突检测的及时性,利用变量域剪枝单调性的特点,实现了异步相容检查,提高多节点间相容检查的并行程度。在并行搜索方面,依赖对搜索空间划分与调度,与相容技术进行融合,进一步提高了约束求解的效率。
     5、将静态符号执行技术应用于软件补丁比对,实现基本块级的语义差异分析。在函数级利用基于图形同构的匹配方法,找到最大同构子图;在此基础上,利用静态符号执行技术对基本块的输入输出行为进行分析,判断其功能的相似性,借以进行语义的差异分析。通过对比实验验证了匹配结果准确率的提升。
     本文提出的相关模型和算法已在课题组研制的大规模分布式并行漏洞挖掘系统中得到应用。实验结果表明了相关技术的可行性和有效性,对于提高软件脆弱性分析的效率有着很好的帮助。
Symbolic execution in the field of software vulnerability analysis has made greatimprovement. Compared with traditional fuzzy testing, it has the advantage of test inputgeneration and execution path analysis, and improves vulnerability detection greatly. Becausethere is a large quantity of execution paths in software, it is low efficiency to analyze path one byone, and test coverage is difficult to improve. It becomes the bottleneck of symbolic execution.
     With the development of high performance hardware platform and new computing model,parallel symbolic execution technology becomes domestic and foreign research focus gradually.In order to improve parallel efficiency of software security test, the degree of parallelism of thealgorithms and the method of vulnerability detection has become the critical issues. The study onsoftware vulnerability analysis includes symbolic execution, constraint solution and parallelmethod of them in the thesis. The main task can be summarized as follows.
     1. In respect of static symbolic execution, the parallel method in which load unit consists ofpath families is proposed to improve the degree of parallelism. Through Program DependenceAnalysis, multi paths with control dependency which have the same symbolic value of targetstatement and are reducted into path family. Path family is unit of load scheduling in parallel test.Both the number of actual execution paths is reduced and the degree of parallelism of analysis ispromoted.
     2. In order to address the problem of interaction with environment, symbolic execution withmixed concrete symbolic input is proposed. The consistency model of program execution isdefined to analyze systematically approximation between symbolic execution and concreteexecution. The method for switching back and forth between symbolic execution and concreteexecution and the heuristic strategy of maintaining consistency are put forward. bydistinguishing solvable constraint from complex constraint and using concrete and symbolicvalues to simplify complex constraint, mixed path constraint solution algorithm is designed andrealized.
     3. In respect of dynamic symbolic execution, the sensitive point oriented test method forparallel approach is put forward to improve coverage of target statements. The method combinesthe advantages of both static and dynamic. Static analysis is used for realize the method forcomputing the distance from sensitive point to program execution trace based on call chainbacktrack. It provides the basis of oriented test. Distributed test model is designed based ondynamic symbolic execution, which achieves the iterative testing process, including distanceanalysis, path condition solution, cases generation and actual test. The test method combiningsensitive point oriented and symbolic execution strengthens the pertinence.
     4. In respect of constant solution, Parallel Consistency Model for Global Constraints basedon Centralized Storage. As well as the model is integrated with backtrack search to parallelsolution method. Dynamic constraints distribution is used for addressing the issue ofload balancing. Collision is detected quickly due to centralized management of variable domain.Since pruning variable domain is monotonic, asynchronous consistency was realized to improvethe degree of parallelism. Parallel search is merged with parallel consistency by dividing andscheduling search space, to improve constraint solving efficiency.
     5. Static symbolic execution used in software security patches Comparison for the purposeof semantic differences analysis in basic block level. Through traditional structural comparison,syntax differences in function level are analyzed to find the maximum common subgraph. On thebasis, input and output behavior of basic block is analyzed by static symbolic execution. Thenthe I/O behavior is used for determining functional similarity so as to find semantic differences.Ultimately, the experimental results show the accuracy of match result is improved.
     Models and algorithms included in the thesis have been applied to Large scale DistributedParallel Vulnerability Detection System developed by our team. The experimental results showthat related technologies are feasible and effective and provide support for improve vulnerabilityanalysis efficiency.
引文
[1] Fuzzing testing[EB/OL].: http://en.wikipedia.org/wiki/Fuzz_testing,2013.
    [2] W. Halfond, A. Orso, and P. Manolios. Using Positive Tainting and Syntax aware Evaluation to CounterSQL Injection Attacks[C]. In Proceedings of the14th ACM SIGSOFT international symposium onFoundations of software engineering, New York, USA,2006.
    [3] A. Tsitovich. Detection of Security Vulnerabilities Using Guided Model Checking[C]. In ICLP’08:Proceedings of the24th International Conference on Logic Programming., Udine, Italy,2008.
    [4] J.C. King. Symbolic execution and program testing[J]. Communnications of ACM,1976,19(7):385394.
    [5] T. Ball, R. Majumdar, T. Millstein, S. Rajamani. Automatic predicate abstraction of C programs[C]. In:Proceedings of PLDI, Utah, USA,2001.
    [6] S. Khurshid, C.S. Pasareanu, and W. Visser. Generalized symbolic execution for model checking andtesting[C]. In: Proceedings of TACAS, Warsaw, Poland,2003.
    [7] C.S. Pasareanu and W. Visser. A survey of new trends in symbolic execution for software testing andanalysis[J]. International Journal on Software Tools for Technology Transfer,2009,11(4):339353.
    [8] S. Anand, C.S. Pasareanu, and W. Visser. JPF SE: A symbolic execution extension to Java PathFinder[C].In: Proceedings of TACAS, Braga, Portugal,2007.
    [9] The Choco Constraint[EB/OL]: Solver: http://choco.sourceforge.net/.
    [10] IASolver (The Brandeis Interval Arithmetic Constraint Solver)[EB/OL]: http://www.cs.brandeis.edu/~tim/Applets/IAsolver.html/.
    [11] Yices: An SMT Solver[EB/OL]: http://yices.csl.sri.com/.
    [12]杜子德.程序控制流图:一种可视化的程序设计工具[J].计算机研究与发展,1995,12:1520.
    [13] Ryder, B.G. Constructing the Call Graph of a program[J]. Software Engineering,1979, SE5(3):216226.
    [14] U. Montanari. Networks of constraints: Fundamental properties and applications to picture processing[J].Inform. Sci.,1974(7):95132.
    [15] Graph Coloring[EB/OL]: http://en.wikipedia.org/wiki/Graph_coloring.
    [16] The N by N Queens Problem[EB/OL]: http://www.math.utah.edu/~alfeld/queens/queens. html.
    [17] R. Dechter. Constraint Processing [M].San Francisco: Morgan Kaufmann Publishers,2003.
    [18] M. Bruynooghe. Solving combinatorial search problems by intelligent backtracking[J]. InformationProcessing Letters,12:3639,1981.
    [19] F. Rossi, P. Beek, T. Walsh. Handbook of Constraint Programming [M]. ELSEVIER Press,2006:1329.
    [20] R. Dechter. Bucket elimination: A unifying framework for reasoning[J].Artificial Intelligence,1999,113:4185.
    [21] R. Dechter, J. Pearl. Tree clustering for constraint networks[J]. Artificial Intelligence,1989,38(3):353366.
    [22] K. Apt. Principles of Constraint Programming [M]. Cambridge: Cambridge University Press,2003:910
    [23] R. Dechter, D. Frost. Backtracking Algorithms for Constraint Satisfaction Problems A TutorialSurvey[OL], available: http://citeseer.ist.psu.edu/correct/298588, June1,2006.
    [24] Fahiem Bacchus, Adam Grove. On The Forward Checking Algorithm[C]. In Proceedings of the FirstInternational Conference on Principles and Practice of Constraint Programming, Montanari,1995.
    [25] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein. Introduction toAlgorithm[M]. Massachusetts: The MIT Press,2001.
    [26] Jussien N, Lhomme O. Local search with constraint propagation and conflict based heuristics[J].Artificial Intelligence.2002,139(1):2145.
    [27] H.W.QiJ.R.Chao.The quasi physical personification algorithm for solving SAT Problem Solar[J].Seience in China(Series E),1997,27(2):179186
    [28] J. S. Russell, N. Peter. Artificial Intelligence: A Modern Approach[M]. NewJersey: Prentice,2003:111114.
    [29] W. M. Spears. Simulated Annealing for Hard Satisfiability Problems[C]. In Proceedings of the SecondDIMACS Implementation Challenge: Cliques, Coloring and Satisfiability, Providence, Rhode Island:AMS,1996.
    [30] F. Rossi, P. Beek, T. Walsh. Handbook of Constraint Programming [M]. ELSEVIER Press,2006:1329.
    [31] U. Montanari. Networks of constraints: Fundamental properties and applicationsto picture processing[J].Inform. Sci.,1974(7):95–132.
    [32] R. Debruyne and C. Bessiere. Some practicable filtering techniques for the constraint satisfactionproblem[C]. In Proceedings IJCAI’97, Nagoya, Japan,1997.
    [33] C. Lecoutre, S. Cardon, J. Vion. Conservative dual consistency[C]. In procedings of the22nd nationalconference on Artificial intelligence, California,2007.
    [34] A. Almossawi, K. Lim, T. Sinha. Analysis Tool Evaluation: Coverity Prevent[R]. USA: Carnegie MellonUniversity,2006.
    [35] J. Wagner. A new Approach to Fortify Your Software[EB/OL]. http://www.internetnews.com/dev news/article.php/3335651.2004.
    [36] C. Cadar, et al. EXE: Automatically Generating Inputs of Death[C]. In ACM CCS, Alexandria, USA,2006.
    [37] P. Saxena, P. Poosankam, S. MeCamant, D. Song. Loop extended symbolic execution on binaryprograms[C]. In Intemational Symposium on Software Testing and Analysis, Chicago, IL,2009.
    [38] P. Godefroid, A. Kiezun, Y. Miehael. Levin Grammar based Whitebox Fuzzing[C]. In proceedings ofPLDI’2008, Tueson,2008.
    [39] P. Godefroid, A. Nori, S. Rajamani, S. Deep. Tetali ComPositional May Must Program Analysis:Unleashing the Power of Altemation[C]. In proceedings of POPL’2010, Madrid,2010.
    [40] K. Sen. Concolic testing[C]. In Intl. Conf. on Automated Software Engineering, Atlanta, Georgia,2007.
    [41] S. Anand, P. Godefroid, N. Tillmann. Demand driven compositional symbolic execution[C]. InProceedings of Tools and Algorithms for the Construction and Analysis of Systems,Budapest, Hungary2008.
    [42] P. Godefroid. Compositional Dynamic Test Generation[C]. In POPL’2007, Nice,2007.
    [43] P. Godefroid, S.K. Lahiri, and C. Rubio Gonzalez, Incremental ComPositional Dynamic TestGeneration[R]. Technical report in Microsoft researcher, February2010.
    [44] P. Godefroid, N. Klarlund, and K. Sen. DART: Directed automated random testing[C]. In Proc. of theACM SIGPLAN2005Conference on Programming Language design and Implementation, Chicago,lllinois, USA,2005.
    [45] K. Sen, D. Marinov, and G. Agha. CUTE: A concolic unit testing engine for C[C]. In5th joint meeting ofthe European Software Engineering Conference and ACM SIG SOFT Symposium on the Foundationsof Software Engineering. Lisbon, Portugal,2005.
    [46] K. Sen and G. Agha. CUTE, and Jcute: Concolic unit testing and explicit path model checking tools[C]In of proceedings of the18th International Conference on Computer Aided Verification, Lecture Notesin Computer Science4144, Berlin, Heidelberg,2006.
    [47] C. Cadar, D. Dunbar, and D. R. Engler. KLEE: Unassisted and automatic generation of high coveragetests for complex systems programs[C]. In OSDI, San Diego,2008.
    [48] T. L. Wang, T. Wei, Z. Q. Lin, W. Zou. IntScope: Automatically Detecting Integer OverflowVulnerability in X86Binary Using Symbolic Execution[C]. In Proceedings of NDSS, San Diego,2009.
    [49] N. Nethercote, J. Seward. Valgrind: a framework for heavyweight dynamic binary instrumentation[J].SIGPLAN Not,2007,42:89100.
    [50] D. A. Molnar, D. Wagner. Catchconv: Symbolic execution and run time type inference for integerconversion errors[R]. Technical Report No. UCB/EECS200713,2007.
    [51] Y. Xie, A. Chou, D. Engler. Archer: Using symbolic, path sensitive analysis to detect memory accesserrors[C].In Proceedings of the ACM SIGSOFT Symposium on the Foundations of SoftwareEngineering, Lund, Sweden,2003.
    [52] G. Balakrisshnan, R. Gruian, T. Reps, T. Teitelbaum. CodeSurfer/X86A platform for Analyzing X86Executables[J]. Computer Science,20053443:250254.
    [53] P. Godefroid, M. Y. Levin, D. Molnar. Automated Whitebox Fuzz Testing[C]. In Proceedings ofNDSS’2008(Network and Distributed Systems Security), San Diego,2008.
    [54] M. Staats and C. P as areanu, Parallel Symbolic Execution for Structural Test Generation[C].International Symposium on Software Testing and Analysis(ISSTA10), Trento, Italy,2010.
    [55] P. Boonstoppel, C. Cadar, D. R. Engler. RWset: Attacking Path Explosion in Constraint Based TestGeneration[C]. In Proceedings of Tools and Algorithms for the Construction and Analysis of Systems,Budapest, Hungary,2008.
    [56] K. Sen. Concolic testing[C]. In Intl. Conf. on Automated Software Engineering, Atlanta, Georgia,2007.
    [57] S. Anand, P. Godefroid, N. Tillmann. Demand driven compositional symbolic execution[C]. InProceedings of Tools and Algorithms for the Construction and Analysis of Systems, Budapest, Hungary2008.
    [58] M. Kim, Y. Kim and G. Rothermel, A Scalable Distributed Concolic Testing Approach[C]. The7thInternational colloquium conference on Theoretical aspects of computing(ICTAC10), Natal, Brazil,2010.
    [59] J. H. Siddiqui, S. Khurshid. ParSym: Parallel Symbolic Execution[C]. ICSTE’10, Puerto Rico,2010.
    [60] R. Sasnauskas, O. S. Dustmann, B. L. Kaminski, K. Wehrle. Scalable Symbolic Execution of DistributedSystems[C]. ICDCS’11, Minneapoils, Minnesota, USA,2011.
    [61] W. Clocksin. Principles of the DelPhi parallel inference machine[J]. Computer Journal,1987,30(5):386392.
    [62] L. Perron. Search Procedures and parallelism in constraint programming[C]. In Principles and Practicesof Constraint Programming, Alexandria, USA,1999.
    [63] L. Michel, A. See, Van H P. Distributed Constraint based local search[C]. In Principles and Practices ofConstraint Programming, Nantes, France,2006.
    [64] M. Laurent, S. Andrew, V. H. Pascal. Parallelizing Constraint Programs Transparently[C]. In Principlesand Practices of Constraint Programming, Providence, USA,2007.
    [65] G. Chu and P.J. Stuckey. PMiniSAT: A Parallelization of MiniSAT2.0[C]. In SAT Race, Guangzhou,China,2008.
    [66] T. Schubert, M. Lewis, and B. Becker. PaMiraXT: Parallel SAT solving with threads and messagepassing[J]. JSAT,2009(6):203222.
    [67] G. Chu, C. Schulte, and P. J. Stuckey. Condence based Work Stealing in Parallel ConstraintProgramming[J]. InPrinciples and Practices of Constraint Programming,2009:226241.
    [68] L. Kottho and N. Moore. Distributed Solving Through Model Splitting[J]. CoRR.,2010, abs/1008.4328.
    [69] D. L. Waltz. Understanding line drawings of scenes with shadows[S]. The Psychology of ComputerVision, McGraw Hill,1975.
    [70] A. K. Mackworth. Consistency in networks of relations [J]. Artificial Intelligence,1977,8(1):99118.
    [71] R. Mohr, C. Henderson. Arc andpathconsistency revisited[J]. Artificial Intelligence,1986,28(2):225233.
    [72] Y. Deville, P. Hentenryck. An efficient arc consistency algorithm for a class of CSP Problem[C]. InProceedings of12thIJCAI,325330, Sydney, Australia,1991.
    [73] R. Mohr, T. C. Henderson. Arc and path consistency revisied[J]. Artificial Intelligence,1986,28:225233.
    [74] C. Bessiere. Arc consistency and arc consistency again[J]. Artificial Intelligence,1994,65:179190.
    [75] C. Bessiere, E. C. Freuder, J. C. Regin. Using inference to reduce arc consistency computation[C]. InProceedings of IJCAI’95, Montreal, Canada,1995:592598.
    [76] Y. Zhang, R. H. C. Yap. Making AC3an optimal algorithm[C]. In Proceedings of IJCAI’01, Seattle WA,2001.
    [77] L. Christophe, B. Frederic, H. Fred. Exploiting multidirectionality in coarse grained arc consistencyalgorithms[C]. In CP03:9th International Conference on Principles and Practice of ConstraintProgramming, Kinsale, Ireland,2003.
    [78]张晓亮.基于保留支持的约束相容性及求解技术的研究[D].长春:吉林大学,2012.
    [79] Z. S. Li, H. B. Li, Y.G. Zhang, Z. W. Wang. An approach of solving constraint satisfaction problembased on cycle cut[J]. Chinese Journal of Computers,2011,34(8):1528.
    [80]李宏博,李占山,王涛.改进求解约束满足问题粗粒度弧相容算法[J].软件学报,2012,23(7):18161823.
    [81]李占山,王孜文,艾阳.基于AC4的动态值启发式约束满足问题求解算法[J].吉林大学学报(工学版),2011,41(5):13781382.
    [82] T. Nguyen, Y. Deville. A distributed arc consistency algorithm[J]. Science of Computer Programming,1998,30(12):227250.
    [83] H. Youssef. Optimal Distributed Arc Consistency[J]. Constraints,2002,7(34):367385.
    [84] C. C. Rolf, K. Kuchcinski. Combining parallel search and parallel consistencyin constraintprogramming[C]. In TRICS workshop at CP2010, St Andrews, Scotland,2010.
    [85] K. Jeanne, J. Ottenstein, D. Joe. Warren. The Program Dependence Graph and its use in Optimization[J].ACM Transactions on Programming Languages and Systems,1987,9(3):319349.
    [86] Software artifact Infrastructure Repository[EB/OL]. http://sir.unl.edu/portal/index.php,2012.
    [87] Cloud9[EB/OL]. https://cloud9.epfl.ch,2012.
    [88]范文庆,分段符号执行及其环境交互问题研究[D].北京:北京邮电大学,2010.
    [89] N. Tillmann, J. D. Halleux. Pex White Box Test Generation for.Net[C]. TAP’08Proceedings of the2ndinternational conference on Tests and proofs, Prato, Italy,2008.
    [90] Fuzzgrind[EB/OL]. http://esec lab.sogeti.com/category/Tools.
    [91] The KLEE Symbolic Virtual Machine. http://klee.llvm.org/.
    [92] C. Cifuentes, C. Hoermann, N. Keynes. BegBunch Benchmarking for C Bug Detection Tools[C],DEFECTS’09, USA,2009.
    [93]刘杰,王嘉捷,魏强,王清贤.基于危险函数逼近的漏洞检测技术研究[C].第三届信息安全漏洞分析与风险评估大会,黄山,安徽,2010.
    [94]崔展齐,王林章,李宣东.一种目标制导的混合执行测试方法[J].计算机学报,2011,34(6):953954.
    [95] J. Burnim, K. Sen. Heuristics for scalable dynamic test generation. In ASE, pages443446,2008.
    [96] Sudoku[EB/OL]. http://en.wikipedia.org/wiki/Sudoku.
    [97] C. C. Rolf, K. Kuchcinski. Parallel Consistency in Constraint Programming[C]. The2009InternationalConference on Parallel and Distributed Processing Techniques and Applications, Las Vegas, USA,2009.
    [98] J. Gaschnig. Experimental case studies of backtrack vs. Waltz type vs. new algorithms for satisficingassignment problems[M]. In Proceedings of the Canadian Artificial Intelligence Conference,1978:268–277.
    [99] Satlib Benchmark Problems[OL]. http://www.cs.ubc.ca/~hoos/SATLIB/benchm.html.2012.7
    [100] M. CHRISTINE. Plant physiology: plant biology in the Genome Era[J/OL].http://cpai.ucc.ie/05/Benchmarks.html,2012.
    [101] B. David, P. Pongsin, S. Dawn. Automatic patch2based exploit generation is possible: techniques andimplications[C]. IEEE Symposiumon Security and Privacy, Oakland, California, USA,2008.
    [102] Z. Wang, K. Pierce and S. McFarling, BMAT A Binary Matching Tool for Stale Profile Propagation[J].The Journal of Instruction Level Parallelism (JILP), Vol.2, May2000.
    [103] T. Sabin. Comparing binaries with graph isomorphisms[EB/OL]. http://razor.bindview.com/publish/papers/comparingbinaries.htm.
    [104] H. Flake. Structural Comparison of Executable Objects. Detection of Intrusions and Malware&Vulnerability Assessment[C]. GI SIG SIDAR Workshop, Dortmund Germany,2004.
    [105] H. Flake. BinDiff[EB/OL]: http://www.zynamics.com/bindiff.html
    [106] O. J. Fight against1day exploits: Diffing binaries vs anti diffing binaries[C]. In BlackHat, USA,2009.
    [107]徐良华,孙玉龙,高丰.基于图形同构理论的安全补丁比较技术[J].计算机应用,2006,26(7):162321625.
    [108] PatchDiff2[CP/OL]: http://code.google.com/p/patchdiff2/,2013.
    [109] TurboDiff[CP/OL]: http://corelabs.coresecurity.com/index.php?module=Wiki&action=view&type=tool&name=turbodiff,2009..

© 2004-2018 中国地质图书馆版权所有 京ICP备05064691号 京公网安备11010802017129号

地址:北京市海淀区学院路29号 邮编:100083

电话:办公室:(+86 10)66554848;文献借阅、咨询服务、科技查新:66554700