source: trunk/third/gcc/gcc.info-16 @ 11288

Revision 11288, 47.6 KB checked in by ghudson, 26 years ago
This commit was generated by cvs2svn to compensate for changes in r11287, which included commits to RCS files with non-trunk default branches.
1This is Info file gcc.info, produced by Makeinfo version 1.67 from the
2input file gcc.texi.
3
4   This file documents the use and the internals of the GNU compiler.
5
6   Published by the Free Software Foundation 59 Temple Place - Suite 330
7Boston, MA 02111-1307 USA
8
9   Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995, 1996, 1997, 1998
10Free Software Foundation, Inc.
11
12   Permission is granted to make and distribute verbatim copies of this
13manual provided the copyright notice and this permission notice are
14preserved on all copies.
15
16   Permission is granted to copy and distribute modified versions of
17this manual under the conditions for verbatim copying, provided also
18that the sections entitled "GNU General Public License," "Funding for
19Free Software," and "Protect Your Freedom--Fight `Look And Feel'" are
20included exactly as in the original, and provided that the entire
21resulting derived work is distributed under the terms of a permission
22notice identical to this one.
23
24   Permission is granted to copy and distribute translations of this
25manual into another language, under the above conditions for modified
26versions, except that the sections entitled "GNU General Public
27License," "Funding for Free Software," and "Protect Your Freedom--Fight
28`Look And Feel'", and this permission notice, may be included in
29translations approved by the Free Software Foundation instead of in the
30original English.
31
32
33File: gcc.info,  Node: Insns,  Next: Calls,  Prev: Assembler,  Up: RTL
34
35Insns
36=====
37
38   The RTL representation of the code for a function is a doubly-linked
39chain of objects called "insns".  Insns are expressions with special
40codes that are used for no other purpose.  Some insns are actual
41instructions; others represent dispatch tables for `switch' statements;
42others represent labels to jump to or various sorts of declarative
43information.
44
45   In addition to its own specific data, each insn must have a unique
46id-number that distinguishes it from all other insns in the current
47function (after delayed branch scheduling, copies of an insn with the
48same id-number may be present in multiple places in a function, but
49these copies will always be identical and will only appear inside a
50`sequence'), and chain pointers to the preceding and following insns.
51These three fields occupy the same position in every insn, independent
52of the expression code of the insn.  They could be accessed with `XEXP'
53and `XINT', but instead three special macros are always used:
54
55`INSN_UID (I)'
56     Accesses the unique id of insn I.
57
58`PREV_INSN (I)'
59     Accesses the chain pointer to the insn preceding I.  If I is the
60     first insn, this is a null pointer.
61
62`NEXT_INSN (I)'
63     Accesses the chain pointer to the insn following I.  If I is the
64     last insn, this is a null pointer.
65
66   The first insn in the chain is obtained by calling `get_insns'; the
67last insn is the result of calling `get_last_insn'.  Within the chain
68delimited by these insns, the `NEXT_INSN' and `PREV_INSN' pointers must
69always correspond: if INSN is not the first insn,
70
71     NEXT_INSN (PREV_INSN (INSN)) == INSN
72
73is always true and if INSN is not the last insn,
74
75     PREV_INSN (NEXT_INSN (INSN)) == INSN
76
77is always true.
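
   The two invariants above can be checked mechanically.  The following
is a minimal sketch in C, assuming a toy `struct toy_insn' that carries
only the three mandatory fields.  The type and the `toy_' functions are
hypothetical names made up for this illustration; they are not GCC's
own `rtx' objects or access macros.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn: only the three mandatory fields.
   The names are hypothetical; GCC itself reaches these fields
   through INSN_UID, PREV_INSN and NEXT_INSN.  */
struct toy_insn
{
  int uid;
  struct toy_insn *prev;
  struct toy_insn *next;
};

/* Link the N insns stored in the array INSNS into one chain and
   give them the ids 1..N.  Return the first insn, or NULL if N
   is zero.  */
static struct toy_insn *
toy_link_chain (struct toy_insn *insns, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      insns[i].uid = (int) i + 1;
      insns[i].prev = i > 0 ? &insns[i - 1] : NULL;
      insns[i].next = i + 1 < n ? &insns[i + 1] : NULL;
    }
  return n > 0 ? &insns[0] : NULL;
}

/* Return 1 if both chain invariants hold for every insn
   reachable from FIRST, else 0.  */
static int
toy_chain_is_consistent (const struct toy_insn *first)
{
  const struct toy_insn *insn;

  for (insn = first; insn != NULL; insn = insn->next)
    {
      /* NEXT_INSN (PREV_INSN (INSN)) == INSN, unless INSN is first.  */
      if (insn->prev != NULL && insn->prev->next != insn)
        return 0;
      /* PREV_INSN (NEXT_INSN (INSN)) == INSN, unless INSN is last.  */
      if (insn->next != NULL && insn->next->prev != insn)
        return 0;
    }
  return 1;
}
```

   A pass that splices insns into or out of the chain could run a check
of this kind in a debugging build to catch a half-updated pointer pair.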
78
79   After delay slot scheduling, some of the insns in the chain might be
80`sequence' expressions, which contain a vector of insns.  The value of
81`NEXT_INSN' in all but the last of these insns is the next insn in the
82vector; the value of `NEXT_INSN' of the last insn in the vector is the
83same as the value of `NEXT_INSN' for the `sequence' in which it is
84contained.  Similar rules apply for `PREV_INSN'.
85
86   This means that the above invariants are not necessarily true for
87insns inside `sequence' expressions.  Specifically, if INSN is the
88first insn in a `sequence', `NEXT_INSN (PREV_INSN (INSN))' is the insn
89containing the `sequence' expression, as is the value of `PREV_INSN
90(NEXT_INSN (INSN))' if INSN is the last insn in the `sequence'
91expression.  You can use these expressions to find the containing
92`sequence' expression.
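
   The pointer rules for insns inside a `sequence' can be modelled in a
few lines.  This is only a sketch under stated assumptions: the
`struct toy_insn' type and the `toy_' functions are hypothetical, and
the vector of the `sequence' is passed in as a plain C array rather
than being stored in the insn.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for an insn; the names are hypothetical.  */
struct toy_insn
{
  int uid;
  struct toy_insn *prev;
  struct toy_insn *next;
};

/* Set the chain pointers of the N insns in VEC as if they were the
   vector of the `sequence' insn SEQ.  Inside the vector the
   pointers run from element to element; at the two ends they
   take the values that SEQ itself has.  SEQ must already be
   linked to its neighbors.  */
static void
toy_fill_sequence (const struct toy_insn *seq, struct toy_insn **vec,
                   size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    {
      vec[i]->prev = i > 0 ? vec[i - 1] : seq->prev;
      vec[i]->next = i + 1 < n ? vec[i + 1] : seq->next;
    }
}

/* FIRST is the first insn inside a `sequence'.  Return the insn
   that contains the `sequence', or NULL if nothing precedes it.  */
static struct toy_insn *
toy_container_of_first (const struct toy_insn *first)
{
  return first->prev != NULL ? first->prev->next : NULL;
}

/* LAST is the last insn inside a `sequence'.  Return the insn
   that contains the `sequence', or NULL if nothing follows it.  */
static struct toy_insn *
toy_container_of_last (const struct toy_insn *last)
{
  return last->next != NULL ? last->next->prev : NULL;
}
```

   Note that the lookup only works from the two ends of the vector; for
an insn in the middle, both expressions simply yield the insn itself.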
93
94   Every insn has one of the following six expression codes:
95
96`insn'
97     The expression code `insn' is used for instructions that do not
98     jump and do not do function calls.  `sequence' expressions are
99     always contained in insns with code `insn' even if one of those
100     insns should jump or do function calls.
101
102     Insns with code `insn' have four additional fields beyond the three
103     mandatory ones listed above.  These four are described in a table
104     below.
105
106`jump_insn'
107     The expression code `jump_insn' is used for instructions that may
108     jump (or, more generally, may contain `label_ref' expressions).  If
109     there is an instruction to return from the current function, it is
110     recorded as a `jump_insn'.
111
112     `jump_insn' insns have the same extra fields as `insn' insns,
113     accessed in the same way and in addition contain a field
114     `JUMP_LABEL' which is defined once jump optimization has completed.
115
116     For simple conditional and unconditional jumps, this field
117     contains the `code_label' to which this insn will (possibly
118     conditionally) branch.  In a more complex jump, `JUMP_LABEL'
119     records one of the labels that the insn refers to; the only way to
120     find the others is to scan the entire body of the insn.
121
122     Return insns count as jumps, but since they do not refer to any
123     labels, they have zero in the `JUMP_LABEL' field.
124
125`call_insn'
126     The expression code `call_insn' is used for instructions that may
127     do function calls.  It is important to distinguish these
128     instructions because they imply that certain registers and memory
129     locations may be altered unpredictably.
130
131     `call_insn' insns have the same extra fields as `insn' insns,
132     accessed in the same way and in addition contain a field
133     `CALL_INSN_FUNCTION_USAGE', which contains a list (chain of
134     `expr_list' expressions) containing `use' and `clobber'
135     expressions that denote hard registers used or clobbered by the
136     called function.  A register specified in a `clobber' in this list
137     is modified *after* the execution of the `call_insn', while a
138     register in a `clobber' in the body of the `call_insn' is
139     clobbered before the insn completes execution.  `clobber'
140     expressions in this list augment registers specified in
141     `CALL_USED_REGISTERS' (*note Register Basics::.).
142
143`code_label'
144     A `code_label' insn represents a label that a jump insn can jump
145     to.  It contains two special fields of data in addition to the
146     three standard ones.  `CODE_LABEL_NUMBER' is used to hold the
147     "label number", a number that identifies this label uniquely among
148     all the labels in the compilation (not just in the current
149     function).  Ultimately, the label is represented in the assembler
150     output as an assembler label, usually of the form `LN' where N is
151     the label number.
152
153     When a `code_label' appears in an RTL expression, it normally
154     appears within a `label_ref' which represents the address of the
155     label, as a number.
156
157     The field `LABEL_NUSES' is only defined once the jump optimization
158     phase is completed and contains the number of times this label is
159     referenced in the current function.
160
161`barrier'
162     Barriers are placed in the instruction stream when control cannot
163     flow past them.  They are placed after unconditional jump
164     instructions to indicate that the jumps are unconditional and
165     after calls to `volatile' functions, which do not return (e.g.,
166     `exit').  They contain no information beyond the three standard
167     fields.
168
169`note'
170     `note' insns are used to represent additional debugging and
171     declarative information.  They contain two nonstandard fields, an
172     integer which is accessed with the macro `NOTE_LINE_NUMBER' and a
173     string accessed with `NOTE_SOURCE_FILE'.
174
175     If `NOTE_LINE_NUMBER' is positive, the note represents the
176     position of a source line and `NOTE_SOURCE_FILE' is the source
177     file name that the line came from.  These notes control generation
178     of line number data in the assembler output.
179
180     Otherwise, `NOTE_LINE_NUMBER' is not really a line number but a
181     code with one of the following values (and `NOTE_SOURCE_FILE' must
182     contain a null pointer):
183
184    `NOTE_INSN_DELETED'
185          Such a note is completely ignorable.  Some passes of the
186          compiler delete insns by altering them into notes of this
187          kind.
188
189    `NOTE_INSN_BLOCK_BEG'
190    `NOTE_INSN_BLOCK_END'
191          These types of notes indicate the position of the beginning
192          and end of a level of scoping of variable names.  They
193          control the output of debugging information.
194
195    `NOTE_INSN_EH_REGION_BEG'
196    `NOTE_INSN_EH_REGION_END'
197          These types of notes indicate the position of the beginning
198          and end of a level of scoping for exception handling.
199          `NOTE_BLOCK_NUMBER' identifies which `CODE_LABEL' is
200          associated with the given region.
201
202    `NOTE_INSN_LOOP_BEG'
203    `NOTE_INSN_LOOP_END'
204          These types of notes indicate the position of the beginning
205          and end of a `while' or `for' loop.  They enable the loop
206          optimizer to find loops quickly.
207
208    `NOTE_INSN_LOOP_CONT'
209          Appears at the place in a loop that `continue' statements
210          jump to.
211
212    `NOTE_INSN_LOOP_VTOP'
213          This note indicates the place in a loop where the exit test
214          begins for those loops in which the exit test has been
215          duplicated.  This position becomes another virtual start of
216          the loop when considering loop invariants.
217
218    `NOTE_INSN_FUNCTION_END'
219          Appears near the end of the function body, just before the
220          label that `return' statements jump to (on machines where a
221          single instruction does not suffice for returning).  This
222          note may be deleted by jump optimization.
223
224    `NOTE_INSN_SETJMP'
225          Appears following each call to `setjmp' or a related function.
226
227     These codes are printed symbolically when they appear in debugging
228     dumps.
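
     The rule that ties `NOTE_LINE_NUMBER' to `NOTE_SOURCE_FILE' can be
     written down as a small predicate.  This is a sketch only: the
     `struct toy_note_insn' type, the `toy_' functions and the
     numeric values chosen for the codes are all assumptions made
     for illustration, not GCC's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical values for a few note codes.  Only the fact that
   they are not positive matters for this sketch.  */
enum toy_note_code
{
  TOY_NOTE_INSN_DELETED = -1,
  TOY_NOTE_INSN_BLOCK_BEG = -2,
  TOY_NOTE_INSN_BLOCK_END = -3
};

/* Toy stand-in for the two nonstandard fields of a `note'.  */
struct toy_note_insn
{
  int line_number;          /* plays the role of NOTE_LINE_NUMBER */
  const char *source_file;  /* plays the role of NOTE_SOURCE_FILE */
};

/* Return 1 if NOTE records the position of a source line.  */
static int
toy_note_is_source_line (const struct toy_note_insn *note)
{
  return note->line_number > 0;
}

/* Return 1 if NOTE obeys the rule from the text: a line note
   names its file, and any other note has a null file name.  */
static int
toy_note_is_well_formed (const struct toy_note_insn *note)
{
  if (toy_note_is_source_line (note))
    return note->source_file != NULL;
  return note->source_file == NULL;
}
```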
229
230   The machine mode of an insn is normally `VOIDmode', but some phases
231use the mode for various purposes; for example, the reload pass sets it
232to `HImode' if the insn needs reloading but not register elimination
233and `QImode' if both are required.  The common subexpression
234elimination pass sets the mode of an insn to `QImode' when it is the
235first insn in a block that has already been processed.
236
237   Here is a table of the extra fields of `insn', `jump_insn' and
238`call_insn' insns:
239
240`PATTERN (I)'
241     An expression for the side effect performed by this insn.  This
242     must be one of the following codes: `set', `call', `use',
243     `clobber', `return', `asm_input', `asm_output', `addr_vec',
244     `addr_diff_vec', `trap_if', `unspec', `unspec_volatile',
245     `parallel', or `sequence'.  If it is a `parallel', each element of
246     the `parallel' must be one of these codes, except that `parallel'
247     expressions cannot be nested and `addr_vec' and `addr_diff_vec'
248     are not permitted inside a `parallel' expression.
249
250`INSN_CODE (I)'
251     An integer that says which pattern in the machine description
252     matches this insn, or -1 if the matching has not yet been
253     attempted.
254
255     Such matching is never attempted and this field remains -1 on an
256     insn whose pattern consists of a single `use', `clobber',
257     `asm_input', `addr_vec' or `addr_diff_vec' expression.
258
259     Matching is also never attempted on insns that result from an `asm'
260     statement.  These contain at least one `asm_operands' expression.
261     The function `asm_noperands' returns a non-negative value for such
262     insns.
263
264     In the debugging output, this field is printed as a number
265     followed by a symbolic representation that locates the pattern in
266     the `md' file as some small positive or negative offset from a
267     named pattern.
268
269`LOG_LINKS (I)'
270     A list (chain of `insn_list' expressions) giving information about
271     dependencies between instructions within a basic block.  Neither a
272     jump nor a label may come between the related insns.
273
274`REG_NOTES (I)'
275     A list (chain of `expr_list' and `insn_list' expressions) giving
276     miscellaneous information about the insn.  It is often information
277     pertaining to the registers used in this insn.
278
279   The `LOG_LINKS' field of an insn is a chain of `insn_list'
280expressions.  Each of these has two operands: the first is an insn, and
281the second is another `insn_list' expression (the next one in the
282chain).  The last `insn_list' in the chain has a null pointer as second
283operand.  The significant thing about the chain is which insns appear
284in it (as first operands of `insn_list' expressions).  Their order is
285not significant.
286
287   This list is originally set up by the flow analysis pass; it is a
288null pointer until then.  Flow only adds links for those data
289dependencies which can be used for instruction combination.  For each
290insn, the flow analysis pass adds a link to insns which store into
291registers values that are used for the first time in this insn.  The
292instruction scheduling pass adds extra links so that every dependence
293will be represented.  Links represent data dependencies,
294antidependencies and output dependencies; the machine mode of the link
295distinguishes these three types: antidependencies have mode
296`REG_DEP_ANTI', output dependencies have mode `REG_DEP_OUTPUT', and
297data dependencies have mode `VOIDmode'.
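
   As an illustration of how such a chain is walked, here is a minimal
sketch in C.  It assumes a toy `struct toy_insn_list' in which the
first operand is reduced to the insn's id and the machine mode of the
link is reduced to an enumeration; all of these names are hypothetical
and do not appear in GCC.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the three link modes named in the
   text.  */
enum toy_dep_kind
{
  TOY_DEP_TRUE,    /* data dependence; the text uses VOIDmode */
  TOY_DEP_ANTI,    /* the text uses REG_DEP_ANTI */
  TOY_DEP_OUTPUT   /* the text uses REG_DEP_OUTPUT */
};

/* Toy `insn_list': the first operand is an insn (here only its
   id), the second operand is the rest of the chain, or NULL.  */
struct toy_insn_list
{
  enum toy_dep_kind kind;
  int insn_uid;
  struct toy_insn_list *rest;
};

/* Count the links of kind KIND in the chain LINKS.  */
static int
toy_count_links (const struct toy_insn_list *links,
                 enum toy_dep_kind kind)
{
  int count = 0;

  for (; links != NULL; links = links->rest)
    if (links->kind == kind)
      count++;
  return count;
}

/* Return 1 if the insn with id UID appears anywhere in LINKS.
   The position in the chain is irrelevant, because the order of
   the links is not significant.  */
static int
toy_depends_on (const struct toy_insn_list *links, int uid)
{
  for (; links != NULL; links = links->rest)
    if (links->insn_uid == uid)
      return 1;
  return 0;
}
```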
298
299   The `REG_NOTES' field of an insn is a chain similar to the
300`LOG_LINKS' field but it includes `expr_list' expressions in addition
301to `insn_list' expressions.  There are several kinds of register notes,
302which are distinguished by the machine mode, which in a register note
303is really understood as being an `enum reg_note'.  The first operand OP
304of the note is data whose meaning depends on the kind of note.
305
306   The macro `REG_NOTE_KIND (X)' returns the kind of register note.
307Its counterpart, the macro `PUT_REG_NOTE_KIND (X, NEWKIND)' sets the
308register note type of X to be NEWKIND.
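
   The way these two macros are typically combined with a walk over the
`REG_NOTES' chain can be sketched as follows.  The sketch assumes a toy
note type whose operand is just a register number; `struct toy_note',
the `TOY_' macros and `toy_find_note' are hypothetical names that only
mirror the shape of the real interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of the note kinds described in the text.  */
enum toy_reg_note
{
  TOY_REG_DEAD,
  TOY_REG_INC,
  TOY_REG_UNUSED
};

/* Toy register note: a kind, an operand (here a register
   number) and a pointer to the rest of the chain.  */
struct toy_note
{
  enum toy_reg_note kind;
  int regno;
  struct toy_note *rest;
};

/* Toy counterparts of REG_NOTE_KIND and PUT_REG_NOTE_KIND.  */
#define TOY_NOTE_KIND(X) ((X)->kind)
#define TOY_PUT_NOTE_KIND(X, NEWKIND) ((X)->kind = (NEWKIND))

/* Return the first note in NOTES that has kind KIND and refers
   to register REGNO, or NULL if there is none.  A negative
   REGNO matches any register.  */
static struct toy_note *
toy_find_note (struct toy_note *notes, enum toy_reg_note kind,
               int regno)
{
  struct toy_note *note;

  for (note = notes; note != NULL; note = note->rest)
    if (TOY_NOTE_KIND (note) == kind
        && (regno < 0 || note->regno == regno))
      return note;
  return NULL;
}
```

   Changing the kind in place, as `PUT_REG_NOTE_KIND' does, leaves the
operand and the position of the note in the chain untouched.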
309
310   Register notes are of three classes: They may say something about an
311input to an insn, they may say something about an output of an insn, or
312they may create a linkage between two insns.  There is also a set of
313values that are only used in `LOG_LINKS'.
314
315   These register notes annotate inputs to an insn:
316
317`REG_DEAD'
318     The value in OP dies in this insn; that is to say, altering the
319     value immediately after this insn would not affect the future
320     behavior of the program.
321
322     This does not necessarily mean that the register OP has no useful
323     value after this insn since it may also be an output of the insn.
324     In such a case, however, a `REG_DEAD' note would be redundant and
325     is usually not present until after the reload pass, but no code
326     relies on this fact.
327
328`REG_INC'
329     The register OP is incremented (or decremented; at this level
330     there is no distinction) by an embedded side effect inside this
331     insn.  This means it appears in a `post_inc', `pre_inc',
332     `post_dec' or `pre_dec' expression.
333
334`REG_NONNEG'
335     The register OP is known to have a nonnegative value when this
336     insn is reached.  This is used so that decrement and branch until
337     zero instructions, such as the m68k dbra, can be matched.
338
339     The `REG_NONNEG' note is added to insns only if the machine
340     description has a `decrement_and_branch_until_zero' pattern.
341
342`REG_NO_CONFLICT'
343     This insn does not cause a conflict between OP and the item being
344     set by this insn even though it might appear that it does.  In
345     other words, if the destination register and OP could otherwise be
346     assigned the same register, this insn does not prevent that
347     assignment.
348
349     Insns with this note are usually part of a block that begins with a
350     `clobber' insn specifying a multi-word pseudo register (which will
351     be the output of the block), a group of insns that each set one
352     word of the value and have the `REG_NO_CONFLICT' note attached,
353     and a final insn that copies the output to itself with an attached
354     `REG_EQUAL' note giving the expression being computed.  This block
355     is encapsulated with `REG_LIBCALL' and `REG_RETVAL' notes on the
356     first and last insns, respectively.
357
358`REG_LABEL'
359     This insn uses OP, a `code_label', but is not a `jump_insn'.  The
360     presence of this note allows jump optimization to be aware that OP
361     is, in fact, being used.
362
363   The following notes describe attributes of outputs of an insn:
364
365`REG_EQUIV'
366`REG_EQUAL'
367     This note is only valid on an insn that sets only one register and
368     indicates that that register will be equal to OP at run time; the
369     scope of this equivalence differs between the two types of notes.
370     The value which the insn explicitly copies into the register may
371     look different from OP, but they will be equal at run time.  If the
372     output of the single `set' is a `strict_low_part' expression, the
373     note refers to the register that is contained in `SUBREG_REG' of
374     the `subreg' expression.
375
376     For `REG_EQUIV', the register is equivalent to OP throughout the
377     entire function, and could validly be replaced in all its
378     occurrences by OP.  ("Validly" here refers to the data flow of the
379     program; simple replacement may make some insns invalid.)  For
380     example, when a constant is loaded into a register that is never
381     assigned any other value, this kind of note is used.
382
383     When a parameter is copied into a pseudo-register at entry to a
384     function, a note of this kind records that the register is
385     equivalent to the stack slot where the parameter was passed.
386     Although in this case the register may be set by other insns, it
387     is still valid to replace the register by the stack slot
388     throughout the function.
389
390     A `REG_EQUIV' note is also used on an instruction which copies a
391     register parameter into a pseudo-register at entry to a function,
392     if there is a stack slot where that parameter could be stored.
393     Although other insns may set the pseudo-register, it is valid for
394     the compiler to replace the pseudo-register by the stack slot
395     throughout the function, provided the compiler ensures that the
396     stack slot is properly initialized by making the replacement in
397     the initial copy instruction as well.  This is used on machines
398     for which the calling convention allocates stack space for
399     register parameters.  See `REG_PARM_STACK_SPACE' in *Note Stack
400     Arguments::.
401
402     In the case of `REG_EQUAL', the register that is set by this insn
403     will be equal to OP at run time at the end of this insn but not
404     necessarily elsewhere in the function.  In this case, OP is
405     typically an arithmetic expression.  For example, when a sequence
406     of insns such as a library call is used to perform an arithmetic
407     operation, this kind of note is attached to the insn that produces
408     or copies the final value.
409
410     These two notes are used in different ways by the compiler passes.
411     `REG_EQUAL' is used by passes prior to register allocation (such as
412     common subexpression elimination and loop optimization) to tell
413     them how to think of that value.  `REG_EQUIV' notes are used by
414     register allocation to indicate that there is an available
415     substitute expression (either a constant or a `mem' expression for
416     the location of a parameter on the stack) that may be used in
417     place of a register if insufficient registers are available.
418
419     Except for stack homes for parameters, which are indicated by a
420     `REG_EQUIV' note and are not useful to the early optimization
421     passes, and pseudo registers that are equivalent to a memory
422     location throughout their entire life, which is not detected until
423     later in the compilation, all equivalences are initially indicated
424     by an attached `REG_EQUAL' note.  In the early stages of register
425     allocation, a `REG_EQUAL' note is changed into a `REG_EQUIV' note
426     if OP is a constant and the insn represents the only set of its
427     destination register.
428
429     Thus, compiler passes prior to register allocation need only check
430     for `REG_EQUAL' notes and passes subsequent to register allocation
431     need only check for `REG_EQUIV' notes.
432
433`REG_UNUSED'
434     The register OP being set by this insn will not be used in a
435     subsequent insn.  This differs from a `REG_DEAD' note, which
436     indicates that the value in an input will not be used subsequently.
437     These two notes are independent; both may be present for the same
438     register.
439
440`REG_WAS_0'
441     The single output of this insn contained zero before this insn.
442     OP is the insn that set it to zero.  You can rely on this note if
443     it is present and OP has not been deleted or turned into a `note';
444     its absence implies nothing.
445
446   These notes describe linkages between insns.  They occur in pairs:
447one insn has one of a pair of notes that points to a second insn, which
448has the inverse note pointing back to the first insn.
449
450`REG_RETVAL'
451     This insn copies the value of a multi-insn sequence (for example, a
452     library call), and OP is the first insn of the sequence (for a
453     library call, the first insn that was generated to set up the
454     arguments for the library call).
455
456     Loop optimization uses this note to treat such a sequence as a
457     single operation for code motion purposes and flow analysis uses
458     this note to delete such sequences whose results are dead.
459
460     A `REG_EQUAL' note will also usually be attached to this insn to
461     provide the expression being computed by the sequence.
462
463`REG_LIBCALL'
464     This is the inverse of `REG_RETVAL': it is placed on the first
465     insn of a multi-insn sequence, and it points to the last one.
466
467`REG_CC_SETTER'
468`REG_CC_USER'
469     On machines that use `cc0', the insns which set and use `cc0' are
470     adjacent.  However, when branch delay slot
471     filling is done, this may no longer be true.  In this case a
472     `REG_CC_USER' note will be placed on the insn setting `cc0' to
473     point to the insn using `cc0' and a `REG_CC_SETTER' note will be
474     placed on the insn using `cc0' to point to the insn setting `cc0'.
475
476   These values are only used in the `LOG_LINKS' field, and indicate
477the type of dependency that each link represents.  Links which indicate
478a data dependence (a read after write dependence) do not use any code;
479they simply have mode `VOIDmode' and are printed without any
480descriptive text.
481
482`REG_DEP_ANTI'
483     This indicates an anti dependence (a write after read dependence).
484
485`REG_DEP_OUTPUT'
486     This indicates an output dependence (a write after write
487     dependence).
488
489   These notes describe information gathered from gcov profile data.
490They are stored in the `REG_NOTES' field of an insn as an `expr_list'.
491
492`REG_EXEC_COUNT'
493     This is used to indicate the number of times a basic block was
494     executed according to the profile data.  The note is attached to
495     the first insn in the basic block.
496
497`REG_BR_PROB'
498     This is used to specify the ratio of branches to non-branches of a
499     branch insn according to the profile data.  The value is stored as
500     a value between 0 and REG_BR_PROB_BASE; larger values indicate a
501     higher probability that the branch will be taken.
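
     A scaled value of this kind can be derived from two execution
     counts.  The sketch below is an assumption-laden illustration:
     the base of 10000 is chosen for the example only (the text does
     not give the value of `REG_BR_PROB_BASE'), the value is computed
     as the fraction of executions in which the branch was taken, and
     `toy_branch_prob' is a hypothetical helper, not a GCC function.

```c
#include <assert.h>

/* Assumed scale for this sketch only; the real value of
   REG_BR_PROB_BASE is not given in the text.  */
#define TOY_BR_PROB_BASE 10000

/* Scale the fraction TAKEN / TOTAL into the range
   0 .. TOY_BR_PROB_BASE, rounding to the nearest integer.
   Return -1 when there is no profile data (TOTAL is zero) or
   the counts are inconsistent.  */
static int
toy_branch_prob (long long taken, long long total)
{
  if (total <= 0 || taken < 0 || taken > total)
    return -1;
  return (int) ((taken * TOY_BR_PROB_BASE + total / 2) / total);
}
```

     Storing the probability as an integer against a fixed base keeps
     the note free of floating point arithmetic.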
502
503   For convenience, the machine mode in an `insn_list' or `expr_list'
504is printed using these symbolic codes in debugging dumps.
505
506   The only difference between the expression codes `insn_list' and
507`expr_list' is that the first operand of an `insn_list' is assumed to
508be an insn and is printed in debugging dumps as the insn's unique id;
509the first operand of an `expr_list' is printed in the ordinary way as
510an expression.
511
512
513File: gcc.info,  Node: Calls,  Next: Sharing,  Prev: Insns,  Up: RTL
514
515RTL Representation of Function-Call Insns
516=========================================
517
518   Insns that call subroutines have the RTL expression code `call_insn'.
519These insns must satisfy special rules, and their bodies must use a
520special RTL expression code, `call'.
521
522   A `call' expression has two operands, as follows:
523
524     (call (mem:FM ADDR) NBYTES)
525
526Here NBYTES is an operand that represents the number of bytes of
527argument data being passed to the subroutine, FM is a machine mode
528(which must match the definition of the `FUNCTION_MODE' macro in the
529machine description) and ADDR represents the address of the subroutine.
530
531   For a subroutine that returns no value, the `call' expression as
532shown above is the entire body of the insn, except that the insn might
533also contain `use' or `clobber' expressions.
534
535   For a subroutine that returns a value whose mode is not `BLKmode',
536the value is returned in a hard register.  If this register's number is
537R, then the body of the call insn looks like this:
538
539     (set (reg:M R)
540          (call (mem:FM ADDR) NBYTES))
541
542This RTL expression makes it clear (to the optimizer passes) that the
543appropriate register receives a useful value in this insn.
544
545   When a subroutine returns a `BLKmode' value, it is handled by
546passing to the subroutine the address of a place to store the value.
547So the call insn itself does not "return" any value, and it has the
548same RTL form as a call that returns nothing.
549
550   On some machines, the call instruction itself clobbers some register,
551for example to contain the return address.  `call_insn' insns on these
552machines should have a body which is a `parallel' that contains both
553the `call' expression and `clobber' expressions that indicate which
554registers are destroyed.  Similarly, if the call instruction requires
555some register other than the stack pointer that is not explicitly
556mentioned in its RTL, a `use' subexpression should mention that
557register.
558
559   Functions that are called are assumed to modify all registers listed
560in the configuration macro `CALL_USED_REGISTERS' (*note Register
561Basics::.) and, with the exception of `const' functions and library
562calls, to modify all of memory.
563
564   Insns containing just `use' expressions directly precede the
565`call_insn' insn to indicate which registers contain inputs to the
566function.  Similarly, if registers other than those in
567`CALL_USED_REGISTERS' are clobbered by the called function, insns
568containing a single `clobber' follow immediately after the call to
569indicate which registers.
570
571
572File: gcc.info,  Node: Sharing,  Next: Reading RTL,  Prev: Calls,  Up: RTL
573
574Structure Sharing Assumptions
575=============================
576
577   The compiler assumes that certain kinds of RTL expressions are
578unique; there do not exist two distinct objects representing the same
579value.  In other cases, it makes an opposite assumption: that no RTL
580expression object of a certain kind appears in more than one place in
581the containing structure.
582
583   These assumptions refer to a single function; except for the RTL
584objects that describe global variables and external functions, and a
585few standard objects such as small integer constants, no RTL objects
586are common to two functions.
587
588   * Each pseudo-register has only a single `reg' object to represent
589     it, and therefore only a single machine mode.
590
591   * For any symbolic label, there is only one `symbol_ref' object
592     referring to it.
593
594   * There is only one `const_int' expression with value 0, only one
595     with value 1, and only one with value -1.  Some other integer
596     values are also stored uniquely.
597
598   * There is only one `pc' expression.
599
600   * There is only one `cc0' expression.
601
602   * There is only one `const_double' expression with value 0 for each
603     floating point mode.  Likewise for values 1 and 2.
604
605   * No `label_ref' or `scratch' appears in more than one place in the
606     RTL structure; in other words, it is safe to do a tree-walk of all
607     the insns in the function and assume that each time a `label_ref'
608     or `scratch' is seen it is distinct from all others that are seen.
609
610   * Only one `mem' object is normally created for each static variable
611     or stack slot, so these objects are frequently shared in all the
612     places they appear.  However, separate but equal objects for these
613     variables are occasionally made.
614
615   * When a single `asm' statement has multiple output operands, a
616     distinct `asm_operands' expression is made for each output operand.
617     However, these all share the vector which contains the sequence of
618     input operands.  This sharing is used later on to test whether two
619     `asm_operands' expressions come from the same statement, so all
620     optimizations must carefully preserve the sharing if they copy the
621     vector at all.
622
623   * No RTL object appears in more than one place in the RTL structure
624     except as described above.  Many passes of the compiler rely on
625     this by assuming that they can modify RTL objects in place without
626     unwanted side-effects on other insns.
627
628   * During initial RTL generation, shared structure is freely
629     introduced.  After all the RTL for a function has been generated,
630     all shared structure is copied by `unshare_all_rtl' in
631     `emit-rtl.c', after which the above rules are guaranteed to be
632     followed.
633
634   * During the combiner pass, shared structure within an insn can exist
635     temporarily.  However, the shared structure is copied before the
636     combiner is finished with the insn.  This is done by calling
637     `copy_rtx_if_shared', which is a subroutine of `unshare_all_rtl'.
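
   The idea of copying structure only where it is actually shared can
be sketched with a per-object mark.  This is a simplified model, not
the code in `emit-rtl.c': `struct toy_rtx', its `used' field and
`toy_copy_if_shared' are hypothetical, every object has exactly two
operands, and the kinds of expression that are allowed to be shared
(such as `reg' objects) are ignored.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy expression object with at most two operands.  USED is set
   once the object has been seen during the walk.  */
struct toy_rtx
{
  int code;
  int used;
  struct toy_rtx *op[2];
};

/* Walk the expression X.  The first time an object is reached it
   is marked and kept; each later time it is reached, a fresh copy
   is made and used in that place instead.  Return the object
   that should appear where X appeared.  */
static struct toy_rtx *
toy_copy_if_shared (struct toy_rtx *x)
{
  int i;

  if (x == NULL)
    return NULL;

  if (x->used)
    {
      /* Seen before: this place needs its own copy.  */
      struct toy_rtx *copy = malloc (sizeof *copy);

      if (copy == NULL)
        abort ();
      *copy = *x;
      x = copy;
    }
  x->used = 1;

  /* The operands of a copy still point at marked objects, so the
     recursion copies them as well.  */
  for (i = 0; i < 2; i++)
    x->op[i] = toy_copy_if_shared (x->op[i]);
  return x;
}
```

   After the walk, no object in this toy model is reachable from two
places, so a pass can modify any operand in place without affecting
another one.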
638
639
640File: gcc.info,  Node: Reading RTL,  Prev: Sharing,  Up: RTL
641
642Reading RTL
643===========
644
645   To read an RTL object from a file, call `read_rtx'.  It takes one
646argument, a stdio stream, and returns a single RTL object.
647
648   Reading RTL from a file is very slow.  This is not currently a
649problem since reading RTL occurs only as part of building the compiler.
650
651   People frequently have the idea of using RTL stored as text in a
652file as an interface between a language front end and the bulk of GNU
653CC.  This idea is not feasible.
654
655   GNU CC was designed to use RTL internally only.  Correct RTL for a
656given program is very dependent on the particular target machine.  And
657the RTL does not contain all the information about the program.
658
659   The proper way to interface GNU CC to a new language front end is
660with the "tree" data structure.  There is no manual for this data
661structure, but it is described in the files `tree.h' and `tree.def'.
662
663
664File: gcc.info,  Node: Machine Desc,  Next: Target Macros,  Prev: RTL,  Up: Top
665
666Machine Descriptions
667********************
668
669   A machine description has two parts: a file of instruction patterns
670(`.md' file) and a C header file of macro definitions.
671
672   The `.md' file for a target machine contains a pattern for each
673instruction that the target machine supports (or at least each
674instruction that is worth telling the compiler about).  It may also
675contain comments.  A semicolon causes the rest of the line to be a
676comment, unless the semicolon is inside a quoted string.
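
   The comment rule can be sketched in C as follows.  This is only an
illustration, not the reader that GNU CC uses: `md_comment_start' is a
hypothetical name, the function looks at a single line, and treating a
backslash inside a string as an escape character is an assumption of
the sketch.

```c
#include <stddef.h>

/* Return the index of the semicolon that starts a comment in LINE,
   or -1 if the line contains no comment.  Semicolons inside
   double-quoted strings are skipped.  */
long
md_comment_start (const char *line)
{
  int in_string = 0;
  size_t i;

  for (i = 0; line[i] != '\0'; i++)
    {
      char c = line[i];

      if (in_string)
        {
          if (c == '\\' && line[i + 1] != '\0')
            i++;                  /* skip the escaped character */
          else if (c == '"')
            in_string = 0;
        }
      else if (c == '"')
        in_string = 1;
      else if (c == ';')
        return (long) i;
    }
  return -1;
}
```

   A string that continues over several lines, such as a C output
statement, would need the `in_string' state to be carried from one line
to the next; the sketch does not do that.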
677
678   See the next chapter for information on the C header file.
679
680* Menu:
681
682* Patterns::            How to write instruction patterns.
683* Example::             An explained example of a `define_insn' pattern.
684* RTL Template::        The RTL template defines what insns match a pattern.
685* Output Template::     The output template says how to make assembler code
686                          from such an insn.
687* Output Statement::    For more generality, write C code to output
688                          the assembler code.
689* Constraints::         When not all operands are general operands.
690* Standard Names::      Names mark patterns to use for code generation.
691* Pattern Ordering::    When the order of patterns makes a difference.
692* Dependent Patterns::  Having one pattern may make you need another.
693* Jump Patterns::       Special considerations for patterns for jump insns.
694* Insn Canonicalizations::Canonicalization of Instructions
695* Peephole Definitions::Defining machine-specific peephole optimizations.
696* Expander Definitions::Generating a sequence of several RTL insns
697                         for a standard operation.
698* Insn Splitting::    Splitting Instructions into Multiple Instructions
699* Insn Attributes::     Specifying the value of attributes for generated insns.
700
701
702File: gcc.info,  Node: Patterns,  Next: Example,  Up: Machine Desc
703
704Everything about Instruction Patterns
705=====================================
706
707   Each instruction pattern contains an incomplete RTL expression, with
708pieces to be filled in later, operand constraints that restrict how the
709pieces can be filled in, and an output pattern or C code to generate
710the assembler output, all wrapped up in a `define_insn' expression.
711
712   A `define_insn' is an RTL expression containing four or five
713operands:
714
715  1. An optional name.  The presence of a name indicates that this
716     instruction pattern can perform a certain standard job for the
717     RTL-generation pass of the compiler.  This pass knows certain
718     names and will use the instruction patterns with those names, if
719     the names are defined in the machine description.
720
721     The absence of a name is indicated by writing an empty string
722     where the name should go.  Nameless instruction patterns are never
723     used for generating RTL code, but they may permit several simpler
724     insns to be combined later on.
725
726     Names that are not thus known and used in RTL-generation have no
727     effect; they are equivalent to no name at all.
728
729  2. The "RTL template" (*note RTL Template::.) is a vector of
730     incomplete RTL expressions which show what the instruction should
731     look like.  It is incomplete because it may contain
732     `match_operand', `match_operator', and `match_dup' expressions
733     that stand for operands of the instruction.
734
735     If the vector has only one element, that element is the template
736     for the instruction pattern.  If the vector has multiple elements,
737     then the instruction pattern is a `parallel' expression containing
738     the elements described.
739
740  3. A condition.  This is a string which contains a C expression that
741     is the final test to decide whether an insn body matches this
742     pattern.
743
744     For a named pattern, the condition (if present) may not depend on
745     the data in the insn being matched, but only the
746     target-machine-type flags.  The compiler needs to test these
747     conditions during initialization in order to learn exactly which
748     named instructions are available in a particular run.
749
750     For nameless patterns, the condition is applied only when matching
751     an individual insn, and only after the insn has matched the
752     pattern's recognition template.  The insn's operands may be found
753     in the vector `operands'.
754
755  4. The "output template": a string that says how to output matching
756     insns as assembler code.  `%' in this string specifies where to
757     substitute the value of an operand.  *Note Output Template::.
758
759     When simple substitution isn't general enough, you can specify a
760     piece of C code to compute the output.  *Note Output Statement::.
761
762  5. Optionally, a vector containing the values of attributes for insns
763     matching this pattern.  *Note Insn Attributes::.
764
765
766File: gcc.info,  Node: Example,  Next: RTL Template,  Prev: Patterns,  Up: Machine Desc
767
768Example of `define_insn'
769========================
770
771   Here is an actual example of an instruction pattern, for the
77268000/68020.
773
     (define_insn "tstsi"
       [(set (cc0)
             (match_operand:SI 0 "general_operand" "rm"))]
       ""
       "*
     { if (TARGET_68020 || ! ADDRESS_REG_P (operands[0]))
         return \"tstl %0\";
       return \"cmpl #0,%0\"; }")
782
783   This is an instruction that sets the condition codes based on the
784value of a general operand.  It has no condition, so any insn whose RTL
785description has the form shown may be handled according to this
786pattern.  The name `tstsi' means "test a `SImode' value" and tells the
787RTL generation pass that, when it is necessary to test such a value, an
788insn to do so can be constructed using this pattern.
789
790   The output control string is a piece of C code which chooses which
791output template to return based on the kind of operand and the specific
792type of CPU for which code is being generated.
793
794   `"rm"' is an operand constraint.  Its meaning is explained below.
795
796
797File: gcc.info,  Node: RTL Template,  Next: Output Template,  Prev: Example,  Up: Machine Desc
798
799RTL Template
800============
801
802   The RTL template is used to define which insns match the particular
803pattern and how to find their operands.  For named patterns, the RTL
804template also says how to construct an insn from specified operands.
805
806   Construction involves substituting specified operands into a copy of
807the template.  Matching involves determining the values that serve as
808the operands in the insn being matched.  Both of these activities are
809controlled by special expression types that direct matching and
810substitution of the operands.
811
812`(match_operand:M N PREDICATE CONSTRAINT)'
813     This expression is a placeholder for operand number N of the insn.
814     When constructing an insn, operand number N will be substituted
815     at this point.  When matching an insn, whatever appears at this
816     position in the insn will be taken as operand number N; but it
817     must satisfy PREDICATE or this instruction pattern will not match
818     at all.
819
820     Operand numbers must be chosen consecutively counting from zero in
821     each instruction pattern.  There may be only one `match_operand'
822     expression in the pattern for each operand number.  Usually
823     operands are numbered in the order of appearance in `match_operand'
824     expressions.  In the case of a `define_expand', any operand numbers
825     used only in `match_dup' expressions have higher values than all
826     other operand numbers.
827
828     PREDICATE is a string that is the name of a C function that
829     accepts two arguments, an expression and a machine mode.  During
830     matching, the function will be called with the putative operand as
831     the expression and M as the mode argument (if M is not specified,
832     `VOIDmode' will be used, which normally causes PREDICATE to accept
833     any mode).  If it returns zero, this instruction pattern fails to
834     match.  PREDICATE may be an empty string; then it means no test is
835     to be done on the operand, so anything which occurs in this
836     position is valid.
837
838     Most of the time, PREDICATE will reject modes other than M--but
839     not always.  For example, the predicate `address_operand' uses M
840     as the mode of memory ref that the address should be valid for.
841     Many predicates accept `const_int' nodes even though their mode is
842     `VOIDmode'.
843
844     CONSTRAINT controls reloading and the choice of the best register
845     class to use for a value, as explained later (*note
846     Constraints::.).
847
848     People are often unclear on the difference between the constraint
849     and the predicate.  The predicate helps decide whether a given
850     insn matches the pattern.  The constraint plays no role in this
851     decision; instead, it controls various decisions in the case of an
852     insn which does match.
853
854     On CISC machines, the most common PREDICATE is
855     `"general_operand"'.  This function checks that the putative
856     operand is either a constant, a register or a memory reference,
857     and that it is valid for mode M.
858
859     For an operand that must be a register, PREDICATE should be
860     `"register_operand"'.  Using `"general_operand"' would be valid,
861     since the reload pass would copy any non-register operands through
862     registers, but this would make GNU CC do extra work, it would
863     prevent invariant operands (such as constants) from being removed
864     from loops, and it would prevent the register allocator from doing
865     the best possible job.  On RISC machines, it is usually most
866     efficient to allow PREDICATE to accept only objects that the
867     constraints allow.
868
869     For an operand that must be a constant, you must be sure to either
870     use `"immediate_operand"' for PREDICATE, or make the instruction
871     pattern's extra condition require a constant, or both.  You cannot
872     expect the constraints to do this work!  If the constraints allow
873     only constants, but the predicate allows something else, the
874     compiler will crash when that case arises.
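
     The calling convention for a predicate can be illustrated with a
     toy model.  Everything named here is hypothetical: the `toy_'
     types stand in for the real `rtx' and `enum machine_mode', and
     `reg_or_small_int_operand' is not a predicate that GNU CC
     provides.  The sketch shows a predicate that rejects a register
     of the wrong mode, treats a void mode argument as "any mode", and
     accepts a constant integer whatever mode is requested.

```c
#include <assert.h>

/* Toy stand-ins for GCC's types; the real ones are much richer.  */
enum toy_code { TOY_REG, TOY_MEM, TOY_CONST_INT };
enum toy_mode { TOY_VOIDmode, TOY_SImode, TOY_DImode };

struct toy_rtx {
  enum toy_code code;
  enum toy_mode mode;
  long value;           /* register number or integer value */
};

/* Hypothetical predicate: accept a register of mode MODE, or a
   constant integer in the range -128..127.  A MODE of TOY_VOIDmode
   means "any mode".  Constant integers carry TOY_VOIDmode themselves,
   so they are accepted regardless of MODE.  Returns nonzero to accept
   and zero to reject.  */
int
reg_or_small_int_operand (const struct toy_rtx *x, enum toy_mode mode)
{
  assert (x != 0);
  switch (x->code)
    {
    case TOY_CONST_INT:
      return x->value >= -128 && x->value <= 127;
    case TOY_REG:
      return mode == TOY_VOIDmode || x->mode == mode;
    default:
      return 0;
    }
}
```

     In a machine description, such a function would be named by the
     PREDICATE string of a `match_operand'.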
875
876`(match_scratch:M N CONSTRAINT)'
877     This expression is also a placeholder for operand number N and
878     indicates that operand must be a `scratch' or `reg' expression.
879
880     When matching patterns, this is equivalent to
881
          (match_operand:M N "scratch_operand" CONSTRAINT)
883
884     but, when generating RTL, it produces a `(scratch:M)' expression.
885
886     If the last few expressions in a `parallel' are `clobber'
887     expressions whose operands are either a hard register or
888     `match_scratch', the combiner can add or delete them when
889     necessary.  *Note Side Effects::.
890
891`(match_dup N)'
892     This expression is also a placeholder for operand number N.  It is
893     used when the operand needs to appear more than once in the insn.
894
895     In construction, `match_dup' acts just like `match_operand': the
896     operand is substituted into the insn being constructed.  But in
897     matching, `match_dup' behaves differently.  It assumes that operand
898     number N has already been determined by a `match_operand'
899     appearing earlier in the recognition template, and it matches only
900     an identical-looking expression.
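
     The difference between the two placeholders during matching can
     be sketched with a toy matcher.  This is not GNU CC's recognizer:
     the `toy_' names are hypothetical, predicates and machine modes
     are omitted, and every expression is assumed to have at most two
     operands.  A `match_operand' records whatever it finds; a
     `match_dup' succeeds only if what it finds looks identical to the
     operand already recorded.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_CONST_INT, TOY_PLUS, TOY_SET,
                TOY_MATCH_OPERAND, TOY_MATCH_DUP };

struct toy_rtx {
  enum toy_code code;
  long num;     /* register number, integer value, or operand number */
  const struct toy_rtx *op[2];
};

#define TOY_MAX_OPERANDS 4

/* Structural ("identical-looking") equality of two expressions.  */
int
toy_equal (const struct toy_rtx *a, const struct toy_rtx *b)
{
  if (a == b)
    return 1;
  if (a == NULL || b == NULL)
    return 0;
  return (a->code == b->code && a->num == b->num
          && toy_equal (a->op[0], b->op[0])
          && toy_equal (a->op[1], b->op[1]));
}

/* Match X against PAT, recording operands in OPERANDS, which the
   caller must have filled with null pointers.  Operands are visited
   left to right, so a match_dup sees what an earlier match_operand
   recorded.  Returns nonzero on a match.  */
int
toy_match (const struct toy_rtx *pat, const struct toy_rtx *x,
           const struct toy_rtx *operands[TOY_MAX_OPERANDS])
{
  if (pat == NULL || x == NULL)
    return pat == x;
  switch (pat->code)
    {
    case TOY_MATCH_OPERAND:
      assert (pat->num >= 0 && pat->num < TOY_MAX_OPERANDS);
      operands[pat->num] = x;       /* no predicate in this sketch */
      return 1;
    case TOY_MATCH_DUP:
      assert (pat->num >= 0 && pat->num < TOY_MAX_OPERANDS);
      return toy_equal (operands[pat->num], x);
    default:
      return (pat->code == x->code && pat->num == x->num
              && toy_match (pat->op[0], x->op[0], operands)
              && toy_match (pat->op[1], x->op[1], operands));
    }
}
```

     With a pattern shaped like `(set (match_operand 0) (plus
     (match_dup 0) (match_operand 1)))', an insn that adds a constant
     to register 5 and stores the result in register 5 matches, while
     one that stores the result in a different register does not.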
901
902`(match_operator:M N PREDICATE [OPERANDS...])'
903     This pattern is a kind of placeholder for a variable RTL expression
904     code.
905
906     When constructing an insn, it stands for an RTL expression whose
907     expression code is taken from that of operand N, and whose
908     operands are constructed from the patterns OPERANDS.
909
910     When matching an expression, it matches an expression if the
911     function PREDICATE returns nonzero on that expression *and* the
912     patterns OPERANDS match the operands of the expression.
913
914     Suppose that the function `commutative_operator' is defined as
915     follows, to match any expression whose operator is one of the
916     commutative arithmetic operators of RTL and whose mode is MODE:
917
          int
          commutative_operator (x, mode)
               rtx x;
               enum machine_mode mode;
          {
            enum rtx_code code = GET_CODE (x);
            /* Reject an expression whose mode is not the one requested.  */
            if (GET_MODE (x) != mode)
              return 0;
            /* Class 'c' covers the commutative arithmetic codes;
               EQ and NE are accepted as well.  */
            return (GET_RTX_CLASS (code) == 'c'
                    || code == EQ || code == NE);
          }
929
930     Then the following pattern will match any RTL expression consisting
931     of a commutative operator applied to two general operands:
932
          (match_operator:SI 3 "commutative_operator"
            [(match_operand:SI 1 "general_operand" "g")
             (match_operand:SI 2 "general_operand" "g")])
936
937     Here the vector `[OPERANDS...]' contains two patterns because the
938     expressions to be matched all contain two operands.
939
940     When this pattern does match, the two operands of the commutative
941     operator are recorded as operands 1 and 2 of the insn.  (This is
942     done by the two instances of `match_operand'.)  Operand 3 of the
943     insn will be the entire commutative expression: use `GET_CODE
944     (operands[3])' to see which commutative operator was used.
945
946     The machine mode M of `match_operator' works like that of
947     `match_operand': it is passed as the second argument to the
948     predicate function, and that function is solely responsible for
949     deciding whether the expression to be matched "has" that mode.
950
951     When constructing an insn, argument 3 of the gen-function will
952     specify the operation (i.e. the expression code) for the
953     expression to be made.  It should be an RTL expression, whose
954     expression code is copied into a new expression whose operands are
955     arguments 1 and 2 of the gen-function.  The subexpressions of
956     argument 3 are not used; only its expression code matters.
957
958     When `match_operator' is used in a pattern for matching an insn,
959     it is usually best if the operand number of the `match_operator' is
960     higher than that of the actual operands of the insn.  This improves
961     register allocation because the register allocator often looks at
962     operands 1 and 2 of insns to see if it can do register tying.
963
964     There is no way to specify constraints in `match_operator'.  The
965     operand of the insn which corresponds to the `match_operator'
966     never has any constraints because it is never reloaded as a whole.
967     However, if parts of its OPERANDS are matched by `match_operand'
968     patterns, those parts may have constraints of their own.
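
     The construction rule for `match_operator' can be sketched with a
     toy model.  The names are hypothetical and this is not what a
     real gen-function looks like; the sketch only shows that the
     expression code is taken from the operator argument while the
     operands come from the other arguments.

```c
#include <assert.h>
#include <stddef.h>

enum toy_code { TOY_REG, TOY_PLUS, TOY_MINUS };

struct toy_rtx {
  enum toy_code code;
  const struct toy_rtx *op[2];
};

/* Build the expression for a match_operator: take the code from
   OPERATOR_ARG and the operands from A and B.  The operands that
   OPERATOR_ARG itself holds are ignored.  */
struct toy_rtx
toy_gen_from_operator (const struct toy_rtx *operator_arg,
                       const struct toy_rtx *a,
                       const struct toy_rtx *b)
{
  struct toy_rtx result;

  assert (operator_arg != NULL);
  result.code = operator_arg->code;
  result.op[0] = a;
  result.op[1] = b;
  return result;
}
```

     Passing a `minus' expression as the operator argument therefore
     yields a new `minus' of A and B, whatever operands the argument
     held.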
969
970`(match_op_dup:M N [OPERANDS...])'
971     Like `match_dup', except that it applies to operators instead of
972     operands.  When constructing an insn, operand number N will be
973     substituted at this point.  But in matching, `match_op_dup' behaves
974     differently.  It assumes that operand number N has already been
975     determined by a `match_operator' appearing earlier in the
976     recognition template, and it matches only an identical-looking
977     expression.
978
979`(match_parallel N PREDICATE [SUBPAT...])'
980     This pattern is a placeholder for an insn that consists of a
981     `parallel' expression with a variable number of elements.  This
982     expression should only appear at the top level of an insn pattern.
983
984     When constructing an insn, operand number N will be substituted at
985     this point.  When matching an insn, it matches if the body of the
986     insn is a `parallel' expression with at least as many elements as
987     the vector of SUBPAT expressions in the `match_parallel', if each
988     SUBPAT matches the corresponding element of the `parallel', *and*
989     the function PREDICATE returns nonzero on the `parallel' that is
990     the body of the insn.  It is the responsibility of the predicate
991     to validate elements of the `parallel' beyond those listed in the
992     `match_parallel'.
993
994     A typical use of `match_parallel' is to match load and store
995     multiple expressions, which can contain a variable number of
996     elements in a `parallel'.  For example,
997
          (define_insn ""
            [(match_parallel 0 "load_multiple_operation"
               [(set (match_operand:SI 1 "gpc_reg_operand" "=r")
                     (match_operand:SI 2 "memory_operand" "m"))
                (use (reg:SI 179))
                (clobber (reg:SI 179))])]
            ""
            "loadm 0,0,%1,%2")
1006
1007     This example comes from `a29k.md'.  The function
1008     `load_multiple_operation' is defined in `a29k.c' and checks that
1009     subsequent elements in the `parallel' are the same as the `set' in
1010     the pattern, except that they are referencing subsequent registers
1011     and memory locations.
1012
1013     An insn that matches this pattern might look like:
1014
          (parallel
           [(set (reg:SI 20) (mem:SI (reg:SI 100)))
            (use (reg:SI 179))
            (clobber (reg:SI 179))
            (set (reg:SI 21)
                 (mem:SI (plus:SI (reg:SI 100)
                                  (const_int 4))))
            (set (reg:SI 22)
                 (mem:SI (plus:SI (reg:SI 100)
                                  (const_int 8))))])
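
     The kind of check such a predicate performs can be sketched with
     a toy model.  This is not the function in `a29k.c': the `toy_'
     names are hypothetical, each element of the `parallel' is reduced
     to a small record, and a word size of 4 bytes is assumed.  The
     first three elements are the ones listed in the pattern; the
     predicate validates the loads that follow them.

```c
#include <assert.h>

enum toy_kind { TOY_LOAD, TOY_USE, TOY_CLOBBER };

struct toy_elt {
  enum toy_kind kind;
  int reg;       /* register loaded, used or clobbered */
  int base;      /* base address register (loads only) */
  long offset;   /* byte offset from the base (loads only) */
};

#define TOY_FIXED_ELTS 3   /* the set, use and clobber in the pattern */
#define TOY_WORD_SIZE 4    /* assumed size of one loaded word */

/* Return nonzero if every element after the fixed ones is a load
   into the next register from the next word of memory, counting
   from the load in element 0.  */
int
toy_load_multiple_operation (const struct toy_elt *vec, int count)
{
  int i;

  assert (vec != 0);
  if (count < TOY_FIXED_ELTS || vec[0].kind != TOY_LOAD)
    return 0;
  for (i = TOY_FIXED_ELTS; i < count; i++)
    {
      int step = i - TOY_FIXED_ELTS + 1;   /* 1 for the first extra load */

      if (vec[i].kind != TOY_LOAD
          || vec[i].reg != vec[0].reg + step
          || vec[i].base != vec[0].base
          || vec[i].offset != vec[0].offset + (long) step * TOY_WORD_SIZE)
        return 0;
    }
  return 1;
}
```

     Applied to the five-element insn shown above (registers 20, 21
     and 22 loaded from offsets 0, 4 and 8 from register 100), the
     sketch accepts; it rejects the insn if a register or an offset is
     out of sequence.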
1025
1026`(match_par_dup N [SUBPAT...])'
1027     Like `match_op_dup', but for `match_parallel' instead of
1028     `match_operator'.
1029
1030`(address (match_operand:M N "address_operand" ""))'
1031     This complex of expressions is a placeholder for operand number
1032     N in a "load address" instruction: an operand which specifies a
1033     memory location in the usual way, but for which the actual operand
1034     value used is the address of the location, not the contents of the
1035     location.
1036
1037     `address' expressions never appear in RTL code, only in machine
1038     descriptions.  And they are used only in machine descriptions that
1039     do not use the operand constraint feature.  When operand
1040     constraints are in use, the letter `p' in the constraint serves
1041     this purpose.
1042
1043     M is the machine mode of the *memory location being addressed*,
1044     not the machine mode of the address itself.  That mode is always
1045     the same on a given target machine (it is `Pmode', which normally
1046     is `SImode'), so there is no point in mentioning it; thus, no
1047     machine mode is written in the `address' expression.  If some day
1048     support is added for machines in which addresses of different
1049     kinds of objects appear differently or are used differently (such
1050     as the PDP-10), different formats would perhaps need different
1051     machine modes and these modes might be written in the `address'
1052     expression.
1053