the door way

在internal中没有找到gcc的大门,在manual却碰到了door way,因此,一个软件,manual才是最温柔的一面,internal里只有满面辛酸。


















MacJournal at hand

        以后这里的文章都是macjournal的产物咯。 蛮好用的,希望贴照片也更方面,还有模板什么的。 回头要试试排版功能,哈哈。 居然不能换行。。。。damn         what about english ? new works ? asdfasdf asdf <p> asf

gcc phases

入门,入门。 转贴,转贴。


This post is part of a series about GCC internals and specifically about howto create a new language front-end for GCC. For a list of related posts, please check this page.

If I would ask many people what the executable gcc is doing, most of the people would answer, “Well, it’s a compiler, though … it’s compiling the source file into a target file”. But that is NOT correct. The executable gcc is not a compiler although the abbreviation means GNU C compiler. gcc represents what is generally called a compiler driver or more generic a compilation driver, in the following just called driver. If you now think, this guy is completely insane, please add the -v option to one of your gcc commands and and check what gcc is really doing.

While a compiler really is only responsible for transforming the source file into (possibly optimized) target machine code, the driver is the high-level organizer in the overall compilation process creating a object/shared/executable file from one or more source files. Thereby, the driver divides the compilation process into several phases which greatly depend on the capabilities of the used programs. gcc in newer versions (for example 4.4.0 and newer) uses the following phases assuming that an executable is created from a single source file (gcc source.c -o exec):

1. compiler (cc1)
2. assembler (as; from GNU binutils)
3. collect2 (collect2; part of GCC)
   3.1. linker (ld; from GNU binutils)

Just a note: in earlier versions of gcc, the driver added a separate pre-processing phase before the compiler phase, but in the recent versions the pre-processor is omitted, because the compiler (cc1) incorporates the pre-processor, at least when using the default options. By using the -no-integrated-cpp option, you instruct the driver to split the compiler phase into 2 distinct phases: pre-processing and compiler.

Each of the phases above produces intermediate or temporary files (assuming that the -pipe option is not specified) as output files which then serve as input files for the subsequent phase, in detail:

  • pre-processor (if separate phase)
    • input: *.c
    • output: *.i
  • compiler
    • input: *.c & *.i files
    • output: *.s
  • assembler
    • input: *.s
    • output: *.o
  • collect2/linker
    • input: *.o
    • output: self-defined pattern, e.g. *.exe

Based on the file extension, the driver knows which phase to start with when processing the file. Though, if a *.s file is given as input file along with a *.o and *.c file to produce an executable in a single gcc command, gcc would run the *.s file through the assembler phase, the *.c file through the compiler and assembler phase and the resulting three *.o files finally through the collect2/linker phase. By the way, if an input file extension matches none of the defined extensions, the file is taken is collect2/linker input!

The default behavior of gcc is to try to produce an executable file from the input files. But, gcc could be instructed to stop after any of the above phases by using specific driver options:

-E : stop after pre-processing, produce a *.i file
-S : stop after compiler, produce a *.s file
-c : stop after assembler, produce a *.o file
none : stop after collect2/linker

Hint: If you want to run through all the phases with a single gcc command but neverthess keep the intermediate files, use the -save-temps option.

You may ask, why all this is important for a new language front-end in gcc? Right! Because your new language might require different or additional phases than the described ones and then you should know where to start with to bring your new language front-end driver to execute your specific phases. For this purpose, GCC (capital letters are used to distinguish the complete compiler project from the C-specific driver) is designed in a modular way to allow the front-ends to “register” new phases, but the addition/modification of the phases will be discussed in a later post. However, for the already interested reader, please take a look at the file gcc-x.y.z/gcc/gcc.c and one of the the files of an already existing front-end, for example of C++ ingcc-x.y.z/gcc/cp/lang-specs.h.

gcc scalar evolution

   hi all,下面这段文字会让你很快乐,至少我是这样。


   This pass analyzes the evolution of scalar variables in loop

   structures.  The algorithm is based on the SSA representation,

   and on the loop hierarchy tree.  This algorithm is not based on

   the notion of versions of a variable, as it was the case for the

   previous implementations of the scalar evolution algorithm, but

   it assumes that each defined name is unique.


   The notation used in this file is called "chains of recurrences",

   and has been proposed by Eugene Zima, Robert Van Engelen, and

   others for describing induction variables in programs.  For example

   "b -> {0, +, 2}_1" means that the scalar variable "b" is equal to 0

   when entering in the loop_1 and has a step 2 in this loop, in other

   words "for (b = 0; b < N; b+=2);".  Note that the coefficients of

   this chain of recurrence (or chrec [shrek]) can contain the name of

   other variables, in which case they are called parametric chrecs.

   For example, "b -> {a, +, 2}_1" means that the initial value of "b"

   is the value of "a".  In most of the cases these parametric chrecs

   are fully instantiated before their use because symbolic names can

   hide some difficult cases such as self-references described later

   (see the Fibonacci example).


   A short sketch of the algorithm is:


   Given a scalar variable to be analyzed, follow the SSA edge to

   its definition:


   - When the definition is a GIMPLE_ASSIGN: if the right hand side

   (RHS) of the definition cannot be statically analyzed, the answer

   of the analyzer is: "don't know".  

   Otherwise, for all the variables that are not yet analyzed in the

   RHS, try to determine their evolution, and finally try to

   evaluate the operation of the RHS that gives the evolution

   function of the analyzed variable.


   - When the definition is a condition-phi-node: determine the

   evolution function for all the branches of the phi node, and

   finally merge these evolutions (see chrec_merge).


   - When the definition is a loop-phi-node: determine its initial

   condition, that is the SSA edge defined in an outer loop, and

   keep it symbolic.  Then determine the SSA edges that are defined

   in the body of the loop.  Follow the inner edges until ending on

   another loop-phi-node of the same analyzed loop.  If the reached

   loop-phi-node is not the starting loop-phi-node, then we keep

   this definition under a symbolic form.  If the reached

   loop-phi-node is the same as the starting one, then we compute a

   symbolic stride on the return path.  The result is then the

   symbolic chrec {initial_condition, +, symbolic_stride}_loop.




   Example 1: Illustration of the basic algorithm.


   | a = 3

   | loop_1

   |   b = phi (a, c)

   |   c = b + 1

   |   if (c > 10) exit_loop

   | endloop


   Suppose that we want to know the number of iterations of the

   loop_1.  The exit_loop is controlled by a COND_EXPR (c > 10).  We

   ask the scalar evolution analyzer two questions: what's the

   scalar evolution (scev) of "c", and what's the scev of "10".  For

   "10" the answer is "10" since it is a scalar constant.  For the

   scalar variable "c", it follows the SSA edge to its definition,

   "c = b + 1", and then asks again what's the scev of "b".

   Following the SSA edge, we end on a loop-phi-node "b = phi (a,

   c)", where the initial condition is "a", and the inner loop edge

   is "c".  The initial condition is kept under a symbolic form (it

   may be the case that the copy constant propagation has done its

   work and we end with the constant "3" as one of the edges of the

   loop-phi-node).  The update edge is followed to the end of the

   loop, and until reaching again the starting loop-phi-node: b -> c

   -> b.  At this point we have drawn a path from "b" to "b" from

   which we compute the stride in the loop: in this example it is

   "+1".  The resulting scev for "b" is "b -> {a, +, 1}_1".  Now

   that the scev for "b" is known, it is possible to compute the

   scev for "c", that is "c -> {a + 1, +, 1}_1".  In order to

   determine the number of iterations in the loop_1, we have to

   instantiate_parameters (loop_1, {a + 1, +, 1}_1), that gives after some

   more analysis the scev {4, +, 1}_1, or in other words, this is

   the function "f (x) = x + 4", where x is the iteration count of

   the loop_1.  Now we have to solve the inequality "x + 4 > 10",

   and take the smallest iteration number for which the loop is

   exited: x = 7.  This loop runs from x = 0 to x = 7, and in total

   there are 8 iterations.  In terms of loop normalization, we have

   created a variable that is implicitly defined, "x" or just "_1",

   and all the other analyzed scalars of the loop are defined in

   function of this variable:


   a -> 3

   b -> {3, +, 1}_1

   c -> {4, +, 1}_1


   or in terms of a C program: 


   | a = 3

   | for (x = 0; x <= 7; x++)

   |   {

   |     b = x + 3

   |     c = x + 4

   |   }


   Example 2a: Illustration of the algorithm on nested loops.


   | loop_1

   |   a = phi (1, b)

   |   c = a + 2

   |   loop_2  10 times

   |     b = phi (c, d)

   |     d = b + 3

   |   endloop

   | endloop


   For analyzing the scalar evolution of "a", the algorithm follows

   the SSA edge into the loop's body: "a -> b".  "b" is an inner

   loop-phi-node, and its analysis as in Example 1, gives: 


   b -> {c, +, 3}_2

   d -> {c + 3, +, 3}_2


   Following the SSA edge for the initial condition, we end on "c = a

   + 2", and then on the starting loop-phi-node "a".  From this point,

   the loop stride is computed: back on "c = a + 2" we get a "+2" in

   the loop_1, then on the loop-phi-node "b" we compute the overall

   effect of the inner loop that is "b = c + 30", and we get a "+30"

   in the loop_1.  That means that the overall stride in loop_1 is

   equal to "+32", and the result is: 


   a -> {1, +, 32}_1

   c -> {3, +, 32}_1


   Example 2b: Multivariate chains of recurrences.


   | loop_1

   |   k = phi (0, k + 1)

   |   loop_2  4 times

   |     j = phi (0, j + 1)

   |     loop_3 4 times

   |       i = phi (0, i + 1)

   |       A[j + k] = ...

   |     endloop

   |   endloop

   | endloop


   Analyzing the access function of array A with

   instantiate_parameters (loop_1, "j + k"), we obtain the

   instantiation and the analysis of the scalar variables "j" and "k"

   in loop_1.  This leads to the scalar evolution {4, +, 1}_1: the end

   value of loop_2 for "j" is 4, and the evolution of "k" in loop_1 is

   {0, +, 1}_1.  To obtain the evolution function in loop_3 and

   instantiate the scalar variables up to loop_1, one has to use:

   instantiate_scev (block_before_loop (loop_1), loop_3, "j + k").

   The result of this call is {{0, +, 1}_1, +, 1}_2.


   Example 3: Higher degree polynomials.


   | loop_1

   |   a = phi (2, b)

   |   c = phi (5, d)

   |   b = a + 1

   |   d = c + a

   | endloop


   a -> {2, +, 1}_1

   b -> {3, +, 1}_1

   c -> {5, +, a}_1

   d -> {5 + a, +, a}_1


   instantiate_parameters (loop_1, {5, +, a}_1) -> {5, +, 2, +, 1}_1

   instantiate_parameters (loop_1, {5 + a, +, a}_1) -> {7, +, 3, +, 1}_1


   Example 4: Lucas, Fibonacci, or mixers in general.


   | loop_1

   |   a = phi (1, b)

   |   c = phi (3, d)

   |   b = c

   |   d = c + a

   | endloop


   a -> (1, c)_1

   c -> {3, +, a}_1


   The syntax "(1, c)_1" stands for a PEELED_CHREC that has the

   following semantics: during the first iteration of the loop_1, the

   variable contains the value 1, and then it contains the value "c".

   Note that this syntax is close to the syntax of the loop-phi-node:

   "a -> (1, c)_1" vs. "a = phi (1, c)".


   The symbolic chrec representation contains all the semantics of the

   original code.  What is more difficult is to use this information.


   Example 5: Flip-flops, or exchangers.


   | loop_1

   |   a = phi (1, b)

   |   c = phi (3, d)

   |   b = c

   |   d = a

   | endloop


   a -> (1, c)_1

   c -> (3, a)_1


   Based on these symbolic chrecs, it is possible to refine this

   information into the more precise PERIODIC_CHRECs: 


   a -> |1, 3|_1

   c -> |3, 1|_1


   This transformation is not yet implemented.


   Further readings:


   You can find a more detailed description of the algorithm in:  But note that

   this is a preliminary report and some of the details of the

   algorithm have changed.  I'm working on a research report that

   updates the description of the algorithms to reflect the design

   choices used in this implementation.


   A set of slides show a high level overview of the algorithm and run

   an example through the scalar evolution analyzer:


   The slides that I have presented at the GCC Summit'04 are available



























这个想法听上去很想profile driven optimizing,只不过是programmer driven。





这不就是profile driven optimizing嘛。






设计模式 随感

发现很多模式都在os里有用,看来用cpp快速开发个os也蛮诱人的 .

design patterns in large scale software design : os , browser, 3d designer , other big hierarchies