This article covers Clang optimization levels and which passes each level enables. It may be a useful reference for anyone facing the same question.

Problem description

    On gcc, the manual explains what -O3, -Os, etc. translate to in terms of specific optimisation arguments (-funswitch-loops, -fcompare-elim, etc.)

    I'm looking for the same info for clang.

    I've looked online and in man clang, which only gives general information (-O2 optimises more than -O1, -Os optimises for size, …). I also looked here on Stack Overflow and found a related question, but I haven't found anything relevant in the source files it cites.

    Edit: I found an answer, but I'm still interested if anyone has a link to a user manual documenting all optimisation passes and the passes selected by -Ox. So far I have only found a list of passes, but nothing on optimisation levels.

    Solution

    I found a related question.

    To sum it up, to find out about compiler optimization passes:

    llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments
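
The raw output of this command can list the same pass more than once, which is why the lists further down are described as parsed. As a rough sketch, a small shell helper can reduce the output to a de-duplicated list. The name `unique_passes` is hypothetical, and the helper assumes each relevant line begins with `Pass Arguments:` followed by space-separated flags, which is an assumption about how the `opt` of these versions prints them:

```shell
# Hypothetical helper: reduce "-debug-pass=Arguments" output to unique flags.
# Assumption: relevant lines look like "Pass Arguments:  -flag1 -flag2 ...".
unique_passes() {
    # 1. keep only the pass lines and strip their prefix
    # 2. put one flag per line
    # 3. drop empty lines left by extra spaces
    # 4. de-duplicate in a locale-independent order
    sed -n 's/^Pass Arguments:[[:space:]]*//p' |
        tr -s ' ' '\n' |
        sed '/^$/d' |
        LC_ALL=C sort -u
}
```

It would be used as `llvm-as < /dev/null | opt -O3 -disable-output -debug-pass=Arguments 2>&1 | unique_passes`, where `2>&1` is added on the assumption that the pass list is written to standard error.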
    

    As pointed out in Geoff Nixon's answer (+1), clang additionally runs some higher level optimizations, which we can retrieve with:

    echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\#
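
`clang -###` prints the commands it would run instead of running them, with each argument in double quotes. A minimal sketch for turning that into one flag per line, so that two levels can be compared, might look like this. The name `cc1_flags` is made up for illustration, and the parsing assumes that no quoted argument contains a space:

```shell
# Hypothetical helper: list the flags of the cc1 invocation printed by -###.
# Assumptions: the cc1 command is on a line containing "-cc1" in double
# quotes, and no quoted argument contains a space.
cc1_flags() {
    # 1. keep only the cc1 command line
    # 2. put one argument per line
    # 3. strip the double quotes
    # 4. keep only tokens that start with a dash (flag values are dropped)
    grep -e '"-cc1"' |
        tr ' ' '\n' |
        tr -d '"' |
        grep -e '^-'
}
```

It would be used as `echo 'int;' | clang -xc -O3 - -o /dev/null -\#\#\# 2>&1 | cc1_flags`, since `-###` writes to standard error.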
    

    Documentation of individual passes is available in LLVM's Analysis and Transform Passes reference.



    With version 3.8 the passes are as follows:

    • baseline (-O0):

      • opt sets: -targetlibinfo -tti -verify
      • clang adds: -mdisable-fp-elim -mrelax-all
    • -O1 is based on -O0

      • opt adds: -globalopt -demanded-bits -branch-prob -inferattrs -ipsccp -dse -loop-simplify -scoped-noalias -barrier -adce -deadargelim -memdep -licm -globals-aa -rpo-functionattrs -basiccg -loop-idiom -forceattrs -mem2reg -simplifycfg -early-cse -instcombine -sccp -loop-unswitch -loop-vectorize -tailcallelim -functionattrs -loop-accesses -memcpyopt -loop-deletion -reassociate -strip-dead-prototypes -loops -basicaa -correlated-propagation -lcssa -domtree -always-inline -aa -block-freq -float2int -lower-expect -sroa -loop-unroll -alignment-from-assumptions -lazy-value-info -prune-eh -jump-threading -loop-rotate -indvars -bdce -scalar-evolution -tbaa -assumption-cache-tracker
      • clang adds: -momit-leaf-frame-pointer
      • clang drops: -mdisable-fp-elim -mrelax-all
    • -O2 is based on -O1

      • opt adds: -elim-avail-extern -mldst-motion -slp-vectorizer -gvn -inline -globaldce -constmerge
      • opt drops: -always-inline
      • clang adds: -vectorize-loops -vectorize-slp
    • -O3 is based on -O2

      • opt adds: -argpromotion
    • -Ofast is based on -O3, valid in clang but not in opt

      • clang adds: -fno-signed-zeros -freciprocal-math -ffp-contract=fast -menable-unsafe-fp-math -menable-no-nans -menable-no-infs
    • -Os is the same as -O2

    • -Oz is based on -Os

      • opt drops: -slp-vectorizer
      • clang drops: -vectorize-loops
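
The adds and drops entries in the list above are differences between the flag sets of two levels. One way such a difference could be computed is sketched below. The helper names `added_flags` and `dropped_flags` are hypothetical, and each takes two whitespace-separated flag strings:

```shell
# Hypothetical helpers for comparing two whitespace-separated flag lists.
# added_flags OLD NEW prints every flag of NEW that is absent from OLD.
added_flags() {
    # Normalise OLD to " -a -b " so that only whole flags match.
    old=" $(printf '%s ' $1)"
    for flag in $2; do
        case "$old" in
            *" $flag "*) ;;                  # present in both lists
            *) printf '%s\n' "$flag" ;;      # only in NEW
        esac
    done
}

# dropped_flags OLD NEW prints every flag of OLD that is absent from NEW.
dropped_flags() {
    added_flags "$2" "$1"
}
```

Called with the clang flags listed for -O0 and -O1, `added_flags '-mdisable-fp-elim -mrelax-all' '-momit-leaf-frame-pointer'` prints the -O1 addition, and `dropped_flags` with the same arguments prints the two flags that -O1 drops.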


    With version 3.7 the passes are as follows (parsed output of the command above):

    • default (-O0): -targetlibinfo -verify -tti

    • -O1 is based on -O0

      • adds: -sccp -loop-simplify -float2int -lazy-value-info -correlated-propagation -bdce -lcssa -deadargelim -loop-unroll -loop-vectorize -barrier -memcpyopt -loop-accesses -assumption-cache-tracker -reassociate -loop-deletion -branch-prob -jump-threading -domtree -dse -loop-rotate -ipsccp -instcombine -scoped-noalias -licm -prune-eh -loop-unswitch -alignment-from-assumptions -early-cse -inline-cost -simplifycfg -strip-dead-prototypes -tbaa -sroa -no-aa -adce -functionattrs -lower-expect -basiccg -loops -loop-idiom -tailcallelim -basicaa -indvars -globalopt -block-freq -scalar-evolution -memdep -always-inline
    • -O2 is based on -O1

      • adds: -elim-avail-extern -globaldce -inline -constmerge -mldst-motion -gvn -slp-vectorizer
      • removes: -always-inline
    • -O3 is based on -O2

      • adds: -argpromotion -verif
    • -Os is identical to -O2

    • -Oz is based on -Os

      • removes: -slp-vectorizer


    For version 3.6 the passes are as documented in GYUNGMIN KIM's post.


    With version 3.5 the passes are as follows (parsed output of the command above):

    • default (-O0): -targetlibinfo -verify -verify-di

    • -O1 is based on -O0

      • adds: -correlated-propagation -basiccg -simplifycfg -no-aa -jump-threading -sroa -loop-unswitch -ipsccp -instcombine -memdep -memcpyopt -barrier -block-freq -loop-simplify -loop-vectorize -inline-cost -branch-prob -early-cse -lazy-value-info -loop-rotate -strip-dead-prototypes -loop-deletion -tbaa -prune-eh -indvars -loop-unroll -reassociate -loops -sccp -always-inline -basicaa -dse -globalopt -tailcallelim -functionattrs -deadargelim -notti -scalar-evolution -lower-expect -licm -loop-idiom -adce -domtree -lcssa
    • -O2 is based on -O1

      • adds: -gvn -constmerge -globaldce -slp-vectorizer -mldst-motion -inline
      • removes: -always-inline
    • -O3 is based on -O2

      • adds: -argpromotion
    • -Os is identical to -O2

    • -Oz is based on -Os

      • removes: -slp-vectorizer


    With version 3.4 the passes are as follows (parsed output of the command above):

    • -O0: -targetlibinfo -preverify -domtree -verify

    • -O1 is based on -O0

      • adds: -adce -always-inline -basicaa -basiccg -correlated-propagation -deadargelim -dse -early-cse -functionattrs -globalopt -indvars -inline-cost -instcombine -ipsccp -jump-threading -lazy-value-info -lcssa -licm -loop-deletion -loop-idiom -loop-rotate -loop-simplify -loop-unroll -loop-unswitch -loops -lower-expect -memcpyopt -memdep -no-aa -notti -prune-eh -reassociate -scalar-evolution -sccp -simplifycfg -sroa -strip-dead-prototypes -tailcallelim -tbaa
    • -O2 is based on -O1

      • adds: -barrier -constmerge -domtree -globaldce -gvn -inline -loop-vectorize -preverify -slp-vectorizer -targetlibinfo -verify
      • removes: -always-inline
    • -O3 is based on -O2

      • adds: -argpromotion
    • -Os is identical to -O2

    • -Oz is based on -O2

      • removes: -barrier -loop-vectorize -slp-vectorizer


    With version 3.2 the passes are as follows (parsed output of the command above):

    • -O0: -targetlibinfo -preverify -domtree -verify

    • -O1 is based on -O0

      • adds: -sroa -early-cse -lower-expect -no-aa -tbaa -basicaa -globalopt -ipsccp -deadargelim -instcombine -simplifycfg -basiccg -prune-eh -always-inline -functionattrs -simplify-libcalls -lazy-value-info -jump-threading -correlated-propagation -tailcallelim -reassociate -loops -loop-simplify -lcssa -loop-rotate -licm -loop-unswitch -scalar-evolution -indvars -loop-idiom -loop-deletion -loop-unroll -memdep -memcpyopt -sccp -dse -adce -strip-dead-prototypes
    • -O2 is based on -O1

      • adds: -inline -globaldce -constmerge
      • removes: -always-inline
    • -O3 is based on -O2

      • adds: -argpromotion
    • -Os is identical to -O2

    • -Oz is identical to -Os


    Edit [March 2014]: removed duplicates from lists.

    Edit [April 2014]: added documentation link and options for 3.4.

    Edit [September 2014]: added options for 3.5.

    Edit [December 2015]: added options for 3.7 and mentioned the existing answer for 3.6.

    Edit [May 2016]: added options for 3.8, for both opt and clang, and mentioned the existing answer for clang (versus opt).

    08-18 16:36