| 1 | This is Info file gcc.info, produced by Makeinfo-1.55 from the input |
---|
| 2 | file gcc.texi. |
---|
| 3 | |
---|
| 4 | This file documents the use and the internals of the GNU compiler. |
---|
| 5 | |
---|
| 6 | Published by the Free Software Foundation 59 Temple Place - Suite 330 |
---|
| 7 | Boston, MA 02111-1307 USA |
---|
| 8 | |
---|
| 9 | Copyright (C) 1988, 1989, 1992, 1993, 1994, 1995 Free Software |
---|
| 10 | Foundation, Inc. |
---|
| 11 | |
---|
| 12 | Permission is granted to make and distribute verbatim copies of this |
---|
| 13 | manual provided the copyright notice and this permission notice are |
---|
| 14 | preserved on all copies. |
---|
| 15 | |
---|
| 16 | Permission is granted to copy and distribute modified versions of |
---|
| 17 | this manual under the conditions for verbatim copying, provided also |
---|
| 18 | that the sections entitled "GNU General Public License," "Funding for |
---|
| 19 | Free Software," and "Protect Your Freedom--Fight `Look And Feel'" are |
---|
| 20 | included exactly as in the original, and provided that the entire |
---|
| 21 | resulting derived work is distributed under the terms of a permission |
---|
| 22 | notice identical to this one. |
---|
| 23 | |
---|
| 24 | Permission is granted to copy and distribute translations of this |
---|
| 25 | manual into another language, under the above conditions for modified |
---|
| 26 | versions, except that the sections entitled "GNU General Public |
---|
| 27 | License," "Funding for Free Software," and "Protect Your Freedom--Fight |
---|
| 28 | `Look And Feel'", and this permission notice, may be included in |
---|
| 29 | translations approved by the Free Software Foundation instead of in the |
---|
| 30 | original English. |
---|
| 31 | |
---|
| 32 | |
---|
| 33 | File: gcc.info, Node: Standard Names, Next: Pattern Ordering, Prev: Constraints, Up: Machine Desc |
---|
| 34 | |
---|
| 35 | Standard Pattern Names For Generation |
---|
| 36 | ===================================== |
---|
| 37 | |
---|
| 38 | Here is a table of the instruction names that are meaningful in the |
---|
| 39 | RTL generation pass of the compiler. Giving one of these names to an |
---|
| 40 | instruction pattern tells the RTL generation pass that it can use the |
---|
| 41 | pattern to accomplish a certain task. |
---|
| 42 | |
---|
| 43 | `movM' |
---|
| 44 | Here M stands for a two-letter machine mode name, in lower case. |
---|
| 45 | This instruction pattern moves data with that machine mode from |
---|
| 46 | operand 1 to operand 0. For example, `movsi' moves full-word data. |
---|
| 47 | |
---|
| 48 | If operand 0 is a `subreg' with mode M of a register whose own |
---|
| 49 | mode is wider than M, the effect of this instruction is to store |
---|
| 50 | the specified value in the part of the register that corresponds |
---|
| 51 | to mode M. The effect on the rest of the register is undefined. |
---|
| 52 | |
---|
| 53 | This class of patterns is special in several ways. First of all, |
---|
| 54 | each of these names *must* be defined, because there is no other |
---|
| 55 | way to copy a datum from one place to another. |
---|
| 56 | |
---|
| 57 | Second, these patterns are not used solely in the RTL generation |
---|
| 58 | pass. Even the reload pass can generate move insns to copy values |
---|
| 59 | from stack slots into temporary registers. When it does so, one |
---|
| 60 | of the operands is a hard register and the other is an operand |
---|
| 61 | that may need to be reloaded into a register. |
---|
| 62 | |
---|
| 63 | Therefore, when given such a pair of operands, the pattern must |
---|
| 64 | generate RTL which needs no reloading and needs no temporary |
---|
| 65 | registers--no registers other than the operands. For example, if |
---|
| 66 | you support the pattern with a `define_expand', then in such a |
---|
| 67 | case the `define_expand' mustn't call `force_reg' or any other such |
---|
| 68 | function which might generate new pseudo registers. |
---|
| 69 | |
---|
| 70 | This requirement exists even for subword modes on a RISC machine |
---|
| 71 | where fetching those modes from memory normally requires several |
---|
| 72 | insns and some temporary registers. Look in `spur.md' to see how |
---|
| 73 | the requirement can be satisfied. |
---|
| 74 | |
---|
| 75 | During reload a memory reference with an invalid address may be |
---|
| 76 | passed as an operand. Such an address will be replaced with a |
---|
| 77 | valid address later in the reload pass. In this case, nothing may |
---|
| 78 | be done with the address except to use it as it stands. If it is |
---|
| 79 | copied, it will not be replaced with a valid address. No attempt |
---|
| 80 | should be made to make such an address into a valid address and no |
---|
| 81 | routine (such as `change_address') that will do so may be called. |
---|
| 82 | Note that `general_operand' will fail when applied to such an |
---|
| 83 | address. |
---|
| 84 | |
---|
| 85 | The global variable `reload_in_progress' (which must be explicitly |
---|
| 86 | declared if required) can be used to determine whether such special |
---|
| 87 | handling is required. |
---|
| 88 | |
---|
| 89 | The variety of operands that have reloads depends on the rest of |
---|
| 90 | the machine description, but typically on a RISC machine these can |
---|
| 91 | only be pseudo registers that did not get hard registers, while on |
---|
| 92 | other machines explicit memory references will get optional |
---|
| 93 | reloads. |
---|
| 94 | |
---|
| 95 | If a scratch register is required to move an object to or from |
---|
| 96 | memory, it can be allocated using `gen_reg_rtx' prior to reload. |
---|
| 97 | But this is impossible during and after reload. If there are |
---|
| 98 | cases needing scratch registers after reload, you must define |
---|
| 99 | `SECONDARY_INPUT_RELOAD_CLASS' and perhaps also |
---|
| 100 | `SECONDARY_OUTPUT_RELOAD_CLASS' to detect them, and provide |
---|
| 101 | patterns `reload_inM' or `reload_outM' to handle them. *Note |
---|
| 102 | Register Classes::. |
---|
| 103 | |
---|
| 104 | The constraints on a `movM' must permit moving any hard register |
---|
| 105 | to any other hard register provided that `HARD_REGNO_MODE_OK' |
---|
| 106 | permits mode M in both registers and `REGISTER_MOVE_COST' applied |
---|
| 107 | to their classes returns a value of 2. |
---|
| 108 | |
---|
| 109 | It is obligatory to support floating point `movM' instructions |
---|
| 110 | into and out of any registers that can hold fixed point values, |
---|
| 111 | because unions and structures (which have modes `SImode' or |
---|
| 112 | `DImode') can be in those registers and they may have floating |
---|
| 113 | point members. |
---|
| 114 | |
---|
| 115 | There may also be a need to support fixed point `movM' |
---|
| 116 | instructions in and out of floating point registers. |
---|
| 117 | Unfortunately, I have forgotten why this was so, and I don't know |
---|
| 118 | whether it is still true. If `HARD_REGNO_MODE_OK' rejects fixed |
---|
| 119 | point values in floating point registers, then the constraints of |
---|
| 120 | the fixed point `movM' instructions must be designed to avoid |
---|
| 121 | ever trying to reload into a floating point register. |
---|
| 122 | |
---|
| 123 | `reload_inM' |
---|
| 124 | `reload_outM' |
---|
| 125 | Like `movM', but used when a scratch register is required to move |
---|
| 126 | between operand 0 and operand 1. Operand 2 describes the scratch |
---|
| 127 | register. See the discussion of the `SECONDARY_RELOAD_CLASS' |
---|
| 128 | macro in *note Register Classes::.. |
---|
| 129 | |
---|
| 130 | `movstrictM' |
---|
| 131 | Like `movM' except that if operand 0 is a `subreg' with mode M of |
---|
| 132 | a register whose natural mode is wider, the `movstrictM' |
---|
| 133 | instruction is guaranteed not to alter any of the register except |
---|
| 134 | the part which belongs to mode M. |
---|
| 135 | |
---|
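As a rough illustration of how `movstrictM' differs from `movM', the following C sketch models a 32-bit register whose low 16 bits play the role of the `subreg'. The function name and the choice of the low half are assumptions made for this example; they are not part of the machine description interface.

```c
#include <stdint.h>

/* Model of a strict 16-bit store into a 32-bit register image:
   replace the low 16 bits and leave the upper 16 bits untouched.
   A plain `movM' into the same subreg would be free to leave any
   value in the upper half.  */
uint32_t movstrict_low_hi(uint32_t reg, uint16_t value)
{
    return (reg & 0xFFFF0000u) | (uint32_t)value;
}
```

For instance, storing 0x1234 into a register holding 0xAABBCCDD leaves the upper half 0xAABB intact.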
| 136 | `load_multiple' |
---|
| 137 | Load several consecutive memory locations into consecutive |
---|
| 138 | registers. Operand 0 is the first of the consecutive registers, |
---|
| 139 | operand 1 is the first memory location, and operand 2 is a |
---|
| 140 | constant: the number of consecutive registers. |
---|
| 141 | |
---|
| 142 | Define this only if the target machine really has such an |
---|
| 143 | instruction; do not define this if the most efficient way of |
---|
| 144 | loading consecutive registers from memory is to do them one at a |
---|
| 145 | time. |
---|
| 146 | |
---|
| 147 | On some machines, there are restrictions as to which consecutive |
---|
| 148 | registers can be stored into memory, such as particular starting or |
---|
| 149 | ending register numbers or only a range of valid counts. For those |
---|
| 150 | machines, use a `define_expand' (*note Expander Definitions::.) |
---|
| 151 | and make the pattern fail if the restrictions are not met. |
---|
| 152 | |
---|
| 153 | Write the generated insn as a `parallel' with elements being a |
---|
| 154 | `set' of one register from the appropriate memory location (you may |
---|
| 155 | also need `use' or `clobber' elements). Use a `match_parallel' |
---|
| 156 | (*note RTL Template::.) to recognize the insn. See `a29k.md' and |
---|
| 157 | `rs6000.md' for examples of the use of this insn pattern. |
---|
| 158 | |
---|
| 159 | `store_multiple' |
---|
| 160 | Similar to `load_multiple', but store several consecutive registers |
---|
| 161 | into consecutive memory locations. Operand 0 is the first of the |
---|
| 162 | consecutive memory locations, operand 1 is the first register, and |
---|
| 163 | operand 2 is a constant: the number of consecutive registers. |
---|
| 164 | |
---|
| 165 | `addM3' |
---|
| 166 | Add operand 2 and operand 1, storing the result in operand 0. All |
---|
| 167 | operands must have mode M. This can be used even on two-address |
---|
| 168 | machines, by means of constraints requiring operands 1 and 0 to be |
---|
| 169 | the same location. |
---|
| 170 | |
---|
| 171 | `subM3', `mulM3' |
---|
| 172 | `divM3', `udivM3', `modM3', `umodM3' |
---|
| 173 | `sminM3', `smaxM3', `uminM3', `umaxM3' |
---|
| 174 | `andM3', `iorM3', `xorM3' |
---|
| 175 | Similar, for other arithmetic operations. |
---|
| 176 | |
---|
| 177 | `mulhisi3' |
---|
| 178 | Multiply operands 1 and 2, which have mode `HImode', and store a |
---|
| 179 | `SImode' product in operand 0. |
---|
| 180 | |
---|
| 181 | `mulqihi3', `mulsidi3' |
---|
| 182 | Similar widening-multiplication instructions of other widths. |
---|
| 183 | |
---|
| 184 | `umulqihi3', `umulhisi3', `umulsidi3' |
---|
| 185 | Similar widening-multiplication instructions that do unsigned |
---|
| 186 | multiplication. |
---|
| 187 | |
---|
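The widening multiplications above can be pictured in C as follows. This is only a sketch of the arithmetic, assuming `HImode' is 16 bits and `SImode' is 32 bits; the function names are invented for the example.

```c
#include <stdint.h>

/* Signed widening multiply: 16-bit operands, full 32-bit product.  */
int32_t widening_mul_hi_si(int16_t a, int16_t b)
{
    return (int32_t)a * (int32_t)b;
}

/* Unsigned widening multiply: 16-bit operands, full 32-bit product.  */
uint32_t widening_umul_hi_si(uint16_t a, uint16_t b)
{
    return (uint32_t)a * (uint32_t)b;
}
```

Because the operands are widened before multiplying, the product cannot overflow the result mode.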
| 188 | `mulM3_highpart' |
---|
| 189 | Perform a signed multiplication of operands 1 and 2, which have |
---|
| 190 | mode M, and store the most significant half of the product in |
---|
| 191 | operand 0. The least significant half of the product is discarded. |
---|
| 192 | |
---|
| 193 | `umulM3_highpart' |
---|
| 194 | Similar, but the multiplication is unsigned. |
---|
| 195 | |
---|
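A minimal C sketch of the high-part multiplications, assuming M is a 32-bit mode. The function names are hypothetical. The signed version uses floor division rather than a right shift of a negative value, since the latter is implementation-defined in C.

```c
#include <stdint.h>

/* Signed: most significant 32 bits of the 64-bit product.  */
int32_t smul_highpart_model(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * (int64_t)b;
    int64_t two32 = (int64_t)1 << 32;
    int64_t high = product / two32;     /* rounds toward zero */

    if (product % two32 < 0)
        high -= 1;                      /* adjust to floor */
    return (int32_t)high;
}

/* Unsigned: most significant 32 bits of the 64-bit product.  */
uint32_t umul_highpart_model(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * (uint64_t)b) >> 32);
}
```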
| 196 | `divmodM4' |
---|
| 197 | Signed division that produces both a quotient and a remainder. |
---|
| 198 | Operand 1 is divided by operand 2 to produce a quotient stored in |
---|
| 199 | operand 0 and a remainder stored in operand 3. |
---|
| 200 | |
---|
| 201 | For machines with an instruction that produces both a quotient and |
---|
| 202 | a remainder, provide a pattern for `divmodM4' but do not provide |
---|
| 203 | patterns for `divM3' and `modM3'. This allows optimization in the |
---|
| 204 | relatively common case when both the quotient and remainder are |
---|
| 205 | computed. |
---|
| 206 | |
---|
| 207 | If an instruction that just produces a quotient or just a remainder |
---|
| 208 | exists and is more efficient than the instruction that produces |
---|
| 209 | both, write the output routine of `divmodM4' to call |
---|
| 210 | `find_reg_note' and look for a `REG_UNUSED' note on the quotient |
---|
| 211 | or remainder and generate the appropriate instruction. |
---|
| 212 | |
---|
| 213 | `udivmodM4' |
---|
| 214 | Similar, but does unsigned division. |
---|
| 215 | |
---|
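The operand roles of `divmodM4' and `udivmodM4' can be sketched in C as below, assuming a 32-bit mode and assuming that signed division truncates toward zero as C's `/' and `%' do. The structure and function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

struct divmod_result  { int32_t quot;  int32_t rem;  };
struct udivmod_result { uint32_t quot; uint32_t rem; };

/* Operand 1 = num, operand 2 = den; operand 0 receives quot and
   operand 3 receives rem.  */
struct divmod_result divmod_model(int32_t num, int32_t den)
{
    struct divmod_result r;

    assert(den != 0);
    assert(!(num == INT32_MIN && den == -1));   /* would overflow */
    r.quot = num / den;
    r.rem = num % den;
    return r;
}

struct udivmod_result udivmod_model(uint32_t num, uint32_t den)
{
    struct udivmod_result r;

    assert(den != 0);
    r.quot = num / den;
    r.rem = num % den;
    return r;
}
```

In both cases the results satisfy `quot * den + rem == num'.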
| 216 | `ashlM3' |
---|
| 217 | Arithmetic-shift operand 1 left by a number of bits specified by |
---|
| 218 | operand 2, and store the result in operand 0. Here M is the mode |
---|
| 219 | of operand 0 and operand 1; operand 2's mode is specified by the |
---|
| 220 | instruction pattern, and the compiler will convert the operand to |
---|
| 221 | that mode before generating the instruction. |
---|
| 222 | |
---|
| 223 | `ashrM3', `lshrM3', `rotlM3', `rotrM3' |
---|
| 224 | Other shift and rotate instructions, analogous to the `ashlM3' |
---|
| 225 | instructions. |
---|
| 226 | |
---|
| 227 | `negM2' |
---|
| 228 | Negate operand 1 and store the result in operand 0. |
---|
| 229 | |
---|
| 230 | `absM2' |
---|
| 231 | Store the absolute value of operand 1 into operand 0. |
---|
| 232 | |
---|
| 233 | `sqrtM2' |
---|
| 234 | Store the square root of operand 1 into operand 0. |
---|
| 235 | |
---|
| 236 | The `sqrt' built-in function of C always uses the mode which |
---|
| 237 | corresponds to the C data type `double'. |
---|
| 238 | |
---|
| 239 | `ffsM2' |
---|
| 240 | Store into operand 0 one plus the index of the least significant |
---|
| 241 | 1-bit of operand 1. If operand 1 is zero, store zero. M is the |
---|
| 242 | mode of operand 0; operand 1's mode is specified by the instruction |
---|
| 243 | pattern, and the compiler will convert the operand to that mode |
---|
| 244 | before generating the instruction. |
---|
| 245 | |
---|
| 246 | The `ffs' built-in function of C always uses the mode which |
---|
| 247 | corresponds to the C data type `int'. |
---|
| 248 | |
---|
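The value computed by `ffsM2' can be modelled in C as follows. This is a sketch for a 32-bit operand; the function name is hypothetical.

```c
#include <stdint.h>

/* Return one plus the index of the least significant 1-bit of X,
   or zero if X is zero.  */
int ffs_model(uint32_t x)
{
    int index = 1;

    if (x == 0)
        return 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        index++;
    }
    return index;
}
```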
| 249 | `one_cmplM2' |
---|
| 250 | Store the bitwise-complement of operand 1 into operand 0. |
---|
| 251 | |
---|
| 252 | `cmpM' |
---|
| 253 | Compare operand 0 and operand 1, and set the condition codes. The |
---|
| 254 | RTL pattern should look like this: |
---|
| 255 | |
---|
| 256 | (set (cc0) (compare (match_operand:M 0 ...) |
---|
| 257 | (match_operand:M 1 ...))) |
---|
| 258 | |
---|
| 259 | `tstM' |
---|
| 260 | Compare operand 0 against zero, and set the condition codes. The |
---|
| 261 | RTL pattern should look like this: |
---|
| 262 | |
---|
| 263 | (set (cc0) (match_operand:M 0 ...)) |
---|
| 264 | |
---|
| 265 | `tstM' patterns should not be defined for machines that do not use |
---|
| 266 | `(cc0)'. Doing so would confuse the optimizer since it would no |
---|
| 267 | longer be clear which `set' operations were comparisons. The |
---|
| 268 | `cmpM' patterns should be used instead. |
---|
| 269 | |
---|
| 270 | `movstrM' |
---|
| 271 | Block move instruction. The addresses of the destination and |
---|
| 272 | source strings are the first two operands, and both are in mode |
---|
| 273 | `Pmode'. The number of bytes to move is the third operand, in |
---|
| 274 | mode M. |
---|
| 275 | |
---|
| 276 | The fourth operand is the known shared alignment of the source and |
---|
| 277 | destination, in the form of a `const_int' rtx. Thus, if the |
---|
| 278 | compiler knows that both source and destination are word-aligned, |
---|
| 279 | it may provide the value 4 for this operand. |
---|
| 280 | |
---|
| 281 | These patterns need not give special consideration to the |
---|
| 282 | possibility that the source and destination strings might overlap. |
---|
| 283 | |
---|
| 284 | `cmpstrM' |
---|
| 285 | Block compare instruction, with five operands. Operand 0 is the |
---|
| 286 | output; it has mode M. The remaining four operands are like the |
---|
| 287 | operands of `movstrM'. The two memory blocks specified are |
---|
| 288 | compared byte by byte in lexicographic order. The effect of the |
---|
| 289 | instruction is to store a value in operand 0 whose sign indicates |
---|
| 290 | the result of the comparison. |
---|
| 291 | |
---|
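The result that `cmpstrM' is expected to produce can be sketched in C as a byte-by-byte comparison. Treating the bytes as unsigned is an assumption of this example (it matches the C library's `memcmp'), and the function name is invented.

```c
#include <stddef.h>

/* Compare N bytes of A and B in lexicographic order.  The sign of
   the result indicates the outcome: negative if A sorts first,
   zero if the blocks are equal, positive if B sorts first.  */
int cmpstr_model(const unsigned char *a, const unsigned char *b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (a[i] != b[i])
            return a[i] < b[i] ? -1 : 1;
    }
    return 0;
}
```

Only the sign of the result is meaningful; a real instruction may produce any negative or positive value.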
| | `strlenM' |
---|
| 292 | Compute the length of a string, with four operands. Operand 0 is |
---|
| 293 | the result (of mode M), operand 1 is a `mem' referring to the |
---|
| 294 | first character of the string, operand 2 is the character to |
---|
| 295 | search for (normally zero), and operand 3 is a constant describing |
---|
| 296 | the known alignment of the beginning of the string. |
---|
| 297 | |
---|
| 298 | `floatMN2' |
---|
| 299 | Convert signed integer operand 1 (valid for fixed point mode M) to |
---|
| 300 | floating point mode N and store in operand 0 (which has mode N). |
---|
| 301 | |
---|
| 302 | `floatunsMN2' |
---|
| 303 | Convert unsigned integer operand 1 (valid for fixed point mode M) |
---|
| 304 | to floating point mode N and store in operand 0 (which has mode N). |
---|
| 305 | |
---|
| 306 | `fixMN2' |
---|
| 307 | Convert operand 1 (valid for floating point mode M) to fixed point |
---|
| 308 | mode N as a signed number and store in operand 0 (which has mode |
---|
| 309 | N). This instruction's result is defined only when the value of |
---|
| 310 | operand 1 is an integer. |
---|
| 311 | |
---|
| 312 | `fixunsMN2' |
---|
| 313 | Convert operand 1 (valid for floating point mode M) to fixed point |
---|
| 314 | mode N as an unsigned number and store in operand 0 (which has |
---|
| 315 | mode N). This instruction's result is defined only when the value |
---|
| 316 | of operand 1 is an integer. |
---|
| 317 | |
---|
| 318 | `ftruncM2' |
---|
| 319 | Convert operand 1 (valid for floating point mode M) to an integer |
---|
| 320 | value, still represented in floating point mode M, and store it in |
---|
| 321 | operand 0 (valid for floating point mode M). |
---|
| 322 | |
---|
| 323 | `fix_truncMN2' |
---|
| 324 | Like `fixMN2' but works for any floating point value of mode M by |
---|
| 325 | converting the value to an integer. |
---|
| 326 | |
---|
| 327 | `fixuns_truncMN2' |
---|
| 328 | Like `fixunsMN2' but works for any floating point value of mode M |
---|
| 329 | by converting the value to an integer. |
---|
| 330 | |
---|
| 331 | `truncMN' |
---|
| 332 | Truncate operand 1 (valid for mode M) to mode N and store in |
---|
| 333 | operand 0 (which has mode N). Both modes must be fixed point or |
---|
| 334 | both floating point. |
---|
| 335 | |
---|
| 336 | `extendMN' |
---|
| 337 | Sign-extend operand 1 (valid for mode M) to mode N and store in |
---|
| 338 | operand 0 (which has mode N). Both modes must be fixed point or |
---|
| 339 | both floating point. |
---|
| 340 | |
---|
| 341 | `zero_extendMN' |
---|
| 342 | Zero-extend operand 1 (valid for mode M) to mode N and store in |
---|
| 343 | operand 0 (which has mode N). Both modes must be fixed point. |
---|
| 344 | |
---|
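The truncation and extension operations correspond to the following C sketch, which assumes an 8-bit narrow mode and a 32-bit wide mode. The function names are hypothetical and are not the pattern names themselves.

```c
#include <stdint.h>

/* Sign-extend: the top bit of the narrow value is replicated.  */
int32_t sign_extend_qi_si(uint8_t bits)
{
    return (bits & 0x80u) ? (int32_t)bits - 256 : (int32_t)bits;
}

/* Zero-extend: the upper bits of the result are cleared.  */
uint32_t zero_extend_qi_si(uint8_t bits)
{
    return (uint32_t)bits;
}

/* Truncate: only the low-order bits of the wide value are kept.  */
uint8_t truncate_si_qi(uint32_t value)
{
    return (uint8_t)(value & 0xFFu);
}
```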
| 345 | `extv' |
---|
| 346 | Extract a bit field from operand 1 (a register or memory operand), |
---|
| 347 | where operand 2 specifies the width in bits and operand 3 the |
---|
| 348 | starting bit, and store it in operand 0. Operand 0 must have mode |
---|
| 349 | `word_mode'. Operand 1 may have mode `byte_mode' or `word_mode'; |
---|
| 350 | often `word_mode' is allowed only for registers. Operands 2 and 3 |
---|
| 351 | must be valid for `word_mode'. |
---|
| 352 | |
---|
| 353 | The RTL generation pass generates this instruction only with |
---|
| 354 | constants for operands 2 and 3. |
---|
| 355 | |
---|
| 356 | The bit-field value is sign-extended to a full word integer before |
---|
| 357 | it is stored in operand 0. |
---|
| 358 | |
---|
| 359 | `extzv' |
---|
| 360 | Like `extv' except that the bit-field value is zero-extended. |
---|
| 361 | |
---|
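The difference between `extv' and `extzv' can be sketched in C as follows. The sketch assumes a 32-bit word and counts bit positions from the least significant bit; the target's actual bit numbering may differ. The function names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

static uint32_t field_mask(unsigned width)
{
    return width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Zero-extending extract of WIDTH bits starting at bit POS.  */
uint32_t extzv_model(uint32_t word, unsigned width, unsigned pos)
{
    assert(width >= 1 && width <= 32 && pos <= 32 - width);
    return (word >> pos) & field_mask(width);
}

/* Sign-extending extract: the top bit of the field is replicated.  */
int32_t extv_model(uint32_t word, unsigned width, unsigned pos)
{
    uint32_t field = extzv_model(word, width, pos);
    uint32_t sign = 1u << (width - 1);

    /* Flip the sign bit, then subtract its weight.  */
    return (int32_t)((int64_t)(field ^ sign) - (int64_t)sign);
}
```

Extracting the 4-bit field at bit 4 of 0xF0 gives 15 when zero-extended and -1 when sign-extended.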
| 362 | `insv' |
---|
| 363 | Store operand 3 (which must be valid for `word_mode') into a bit |
---|
| 364 | field in operand 0, where operand 1 specifies the width in bits and |
---|
| 365 | operand 2 the starting bit. Operand 0 may have mode `byte_mode' or |
---|
| 366 | `word_mode'; often `word_mode' is allowed only for registers. |
---|
| 367 | Operands 1 and 2 must be valid for `word_mode'. |
---|
| 368 | |
---|
| 369 | The RTL generation pass generates this instruction only with |
---|
| 370 | constants for operands 1 and 2. |
---|
| 371 | |
---|
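A C sketch of the effect of `insv', under the same assumptions as before: a 32-bit word, with bit positions counted from the least significant bit. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Store the low WIDTH bits of VALUE into WORD at bit POS, leaving
   all other bits of WORD unchanged.  */
uint32_t insv_model(uint32_t word, unsigned width, unsigned pos,
                    uint32_t value)
{
    uint32_t mask;

    assert(width >= 1 && width <= 32 && pos <= 32 - width);
    mask = (width == 32 ? 0xFFFFFFFFu : ((1u << width) - 1u)) << pos;
    return (word & ~mask) | ((value << pos) & mask);
}
```

Bits of the value that do not fit in the field are discarded.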
| 372 | `movMODEcc' |
---|
| 373 | Conditionally move operand 2 or operand 3 into operand 0 according |
---|
| 374 | to the comparison in operand 1. If the comparison is true, |
---|
| 375 | operand 2 is moved into operand 0, otherwise operand 3 is moved. |
---|
| 376 | |
---|
| 377 | The mode of the operands being compared need not be the same as |
---|
| 378 | the operands being moved. Some machines, sparc64 for example, |
---|
| 379 | have instructions that conditionally move an integer value based |
---|
| 380 | on the floating point condition codes and vice versa. |
---|
| 381 | |
---|
| 382 | If the machine does not have conditional move instructions, do not |
---|
| 383 | define these patterns. |
---|
| 384 | |
---|
| 385 | `sCOND' |
---|
| 386 | Store zero or nonzero in the operand according to the condition |
---|
| 387 | codes. Value stored is nonzero iff the condition COND is true. |
---|
| 388 | COND is the name of a comparison operation expression code, such |
---|
| 389 | as `eq', `lt' or `leu'. |
---|
| 390 | |
---|
| 391 | You specify the mode that the operand must have when you write the |
---|
| 392 | `match_operand' expression. The compiler automatically sees which |
---|
| 393 | mode you have used and supplies an operand of that mode. |
---|
| 394 | |
---|
| 395 | The value stored for a true condition must have 1 as its low bit, |
---|
| 396 | or else must be negative. Otherwise the instruction is not |
---|
| 397 | suitable and you should omit it from the machine description. You |
---|
| 398 | describe to the compiler exactly which value is stored by defining |
---|
| 399 | the macro `STORE_FLAG_VALUE' (*note Misc::.). If a description |
---|
| 400 | cannot be found that can be used for all the `sCOND' patterns, you |
---|
| 401 | should omit those operations from the machine description. |
---|
| 402 | |
---|
| 403 | These operations may fail, but should do so only in relatively |
---|
| 404 | uncommon cases; if they would fail for common cases involving |
---|
| 405 | integer comparisons, it is best to omit these patterns. |
---|
| 406 | |
---|
| 407 | If these operations are omitted, the compiler will usually |
---|
| 408 | generate code that copies the constant one to the target and |
---|
| 409 | branches around an assignment of zero to the target. If this code |
---|
| 410 | is more efficient than the potential instructions used for the |
---|
| 411 | `sCOND' pattern followed by those required to convert the result |
---|
| 412 | into a 1 or a zero in `SImode', you should omit the `sCOND' |
---|
| 413 | operations from the machine description. |
---|
| 414 | |
---|
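The rule that a true value must have 1 as its low bit, or else be negative, can be illustrated with a C sketch of how a stored flag might be reduced to 0 or 1. This is only a model of the rule stated above, not the compiler's own code; the function name and the 32-bit width are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* STORED is the value an `sCOND' instruction produced;
   STORE_FLAG_VALUE is the value the machine stores for "true".
   Return 1 if STORED represents true, else 0.  */
int flag_is_true(int32_t stored, int32_t store_flag_value)
{
    /* The description above permits only these two kinds of value.  */
    assert(((uint32_t)store_flag_value & 1u) != 0 || store_flag_value < 0);

    if (((uint32_t)store_flag_value & 1u) != 0)
        return (int)((uint32_t)stored & 1u);    /* test the low bit */
    return stored < 0;                          /* test the sign bit */
}
```

A machine that stores -1 for true is covered by the low-bit test; one that stores only the sign bit is covered by the sign test.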
| 415 | `bCOND' |
---|
| 416 | Conditional branch instruction. Operand 0 is a `label_ref' that |
---|
| 417 | refers to the label to jump to. Jump if the condition codes meet |
---|
| 418 | condition COND. |
---|
| 419 | |
---|
| 420 | Some machines do not follow the model assumed here where a |
---|
| 421 | comparison instruction is followed by a conditional branch |
---|
| 422 | instruction. In that case, the `cmpM' (and `tstM') patterns should |
---|
| 423 | simply store the operands away and generate all the required insns |
---|
| 424 | in a `define_expand' (*note Expander Definitions::.) for the |
---|
| 425 | conditional branch operations. All calls to expand `bCOND' |
---|
| 426 | patterns are immediately preceded by calls to expand either a |
---|
| 427 | `cmpM' pattern or a `tstM' pattern. |
---|
| 428 | |
---|
| 429 | Machines that use a pseudo register for the condition code value, |
---|
| 430 | or where the mode used for the comparison depends on the condition |
---|
| 431 | being tested, should also use the above mechanism. *Note Jump |
---|
| 432 | Patterns:: |
---|
| 433 | |
---|
| 434 | The above discussion also applies to the `movMODEcc' and `sCOND' |
---|
| 435 | patterns. |
---|
| 436 | |
---|
| 437 | `call' |
---|
| 438 | Subroutine call instruction returning no value. Operand 0 is the |
---|
| 439 | function to call; operand 1 is the number of bytes of arguments |
---|
| 440 | pushed (in mode `SImode', except it is normally a `const_int'); |
---|
| 441 | operand 2 is the number of registers used as operands. |
---|
| 442 | |
---|
| 443 | On most machines, operand 2 is not actually stored into the RTL |
---|
| 444 | pattern. It is supplied for the sake of some RISC machines which |
---|
| 445 | need to put this information into the assembler code; they can put |
---|
| 446 | it in the RTL instead of operand 1. |
---|
| 447 | |
---|
| 448 | Operand 0 should be a `mem' RTX whose address is the address of the |
---|
| 449 | function. Note, however, that this address can be a `symbol_ref' |
---|
| 450 | expression even if it would not be a legitimate memory address on |
---|
| 451 | the target machine. If it is also not a valid argument for a call |
---|
| 452 | instruction, the pattern for this operation should be a |
---|
| 453 | `define_expand' (*note Expander Definitions::.) that places the |
---|
| 454 | address into a register and uses that register in the call |
---|
| 455 | instruction. |
---|
| 456 | |
---|
| 457 | `call_value' |
---|
| 458 | Subroutine call instruction returning a value. Operand 0 is the |
---|
| 459 | hard register in which the value is returned. There are three more |
---|
| 460 | operands, the same as the three operands of the `call' instruction |
---|
| 461 | (but with numbers increased by one). |
---|
| 462 | |
---|
| 463 | Subroutines that return `BLKmode' objects use the `call' insn. |
---|
| 464 | |
---|
| 465 | `call_pop', `call_value_pop' |
---|
| 466 | Similar to `call' and `call_value', except used if defined and if |
---|
| 467 | `RETURN_POPS_ARGS' is non-zero. They should emit a `parallel' |
---|
| 468 | that contains both the function call and a `set' to indicate the |
---|
| 469 | adjustment made to the stack pointer. |
---|
| 470 | |
---|
| 471 | For machines where `RETURN_POPS_ARGS' can be non-zero, the use of |
---|
| 472 | these patterns increases the number of functions for which the |
---|
| 473 | frame pointer can be eliminated, if desired. |
---|
| 474 | |
---|
| 475 | `untyped_call' |
---|
| 476 | Subroutine call instruction returning a value of any type. |
---|
| 477 | Operand 0 is the function to call; operand 1 is a memory location |
---|
| 478 | where the result of calling the function is to be stored; operand |
---|
| 479 | 2 is a `parallel' expression where each element is a `set' |
---|
| 480 | expression that indicates the saving of a function return value |
---|
| 481 | into the result block. |
---|
| 482 | |
---|
| 483 | This instruction pattern should be defined to support |
---|
| 484 | `__builtin_apply' on machines where special instructions are needed |
---|
| 485 | to call a subroutine with arbitrary arguments or to save the value |
---|
| 486 | returned. This instruction pattern is required on machines that |
---|
| 487 | have multiple registers that can hold a return value (i.e. |
---|
| 488 | `FUNCTION_VALUE_REGNO_P' is true for more than one register). |
---|
| 489 | |
---|
| 490 | `return' |
---|
| 491 | Subroutine return instruction. This instruction pattern name |
---|
| 492 | should be defined only if a single instruction can do all the work |
---|
| 493 | of returning from a function. |
---|
| 494 | |
---|
| 495 | Like the `movM' patterns, this pattern is also used after the RTL |
---|
| 496 | generation phase. In this case it is to support machines where |
---|
| 497 | multiple instructions are usually needed to return from a |
---|
| 498 | function, but some class of functions only requires one |
---|
| 499 | instruction to implement a return. Normally, the applicable |
---|
| 500 | functions are those which do not need to save any registers or |
---|
| 501 | allocate stack space. |
---|
| 502 | |
---|
| 503 | For such machines, the condition specified in this pattern should |
---|
| 504 | only be true when `reload_completed' is non-zero and the function's |
---|
| 505 | epilogue would only be a single instruction. For machines with |
---|
| 506 | register windows, the routine `leaf_function_p' may be used to |
---|
| 507 | determine if a register window push is required. |
---|
| 508 | |
---|
| 509 | Machines that have conditional return instructions should define |
---|
| 510 | patterns such as |
---|
| 511 | |
---|
| 512 | (define_insn "" |
---|
| 513 | [(set (pc) |
---|
| 514 | (if_then_else (match_operator |
---|
| 515 | 0 "comparison_operator" |
---|
| 516 | [(cc0) (const_int 0)]) |
---|
| 517 | (return) |
---|
| 518 | (pc)))] |
---|
| 519 | "CONDITION" |
---|
| 520 | "...") |
---|
| 521 | |
---|
| 522 | where CONDITION would normally be the same condition specified on |
---|
| 523 | the named `return' pattern. |
---|
| 524 | |
---|
| 525 | `untyped_return' |
---|
| 526 | Untyped subroutine return instruction. This instruction pattern |
---|
| 527 | should be defined to support `__builtin_return' on machines where |
---|
| 528 | special instructions are needed to return a value of any type. |
---|
| 529 | |
---|
| 530 | Operand 0 is a memory location where the result of calling a |
---|
| 531 | function with `__builtin_apply' is stored; operand 1 is a |
---|
| 532 | `parallel' expression where each element is a `set' expression |
---|
| 533 | that indicates the restoring of a function return value from the |
---|
| 534 | result block. |
---|
| 535 | |
---|
| 536 | `nop' |
---|
| 537 | No-op instruction. This instruction pattern name should always be |
---|
| 538 | defined to output a no-op in assembler code. `(const_int 0)' will |
---|
| 539 | do as an RTL pattern. |
---|
| 540 | |
---|
| 541 | `indirect_jump' |
---|
| 542 | An instruction to jump to an address which is operand zero. This |
---|
| 543 | pattern name is mandatory on all machines. |
---|
| 544 | |
---|
| 545 | `casesi' |
---|
| 546 | Instruction to jump through a dispatch table, including bounds |
---|
| 547 | checking. This instruction takes five operands: |
---|
| 548 | |
---|
| 549 | 1. The index to dispatch on, which has mode `SImode'. |
---|
| 550 | |
---|
| 551 | 2. The lower bound for indices in the table, an integer constant. |
---|
| 552 | |
---|
| 553 | 3. The total range of indices in the table--the largest index |
---|
| 554 | minus the smallest one (both inclusive). |
---|
| 555 | |
---|
| 556 | 4. A label that precedes the table itself. |
---|
| 557 | |
---|
| 558 | 5. A label to jump to if the index has a value outside the |
---|
| 559 | bounds. (If the machine-description macro |
---|
| 560 | `CASE_DROPS_THROUGH' is defined, then an out-of-bounds index |
---|
| 561 | drops through to the code following the jump table instead of |
---|
| 562 | jumping to this label. In that case, this label is not |
---|
| 563 | actually used by the `casesi' instruction, but it is always |
---|
| 564 | provided as an operand.) |
---|
| 565 | |
---|
| 566 | The table is an `addr_vec' or `addr_diff_vec' inside of a |
---|
| 567 | `jump_insn'. The number of elements in the table is one plus the |
---|
| 568 | difference between the upper bound and the lower bound. |
---|
| 569 | |
---|
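The bounds check and dispatch performed by `casesi' can be sketched in C as follows. Table entries are modelled as plain integers standing in for labels, and the function name is invented. The sketch assumes the lower bound plus the range fits in `SImode'.

```c
#include <stdint.h>

/* INDEX, LOWER and RANGE correspond to operands 1 to 3 above;
   TABLE has RANGE + 1 entries; DEFAULT_TARGET stands for the
   out-of-bounds label.  */
int casesi_model(int32_t index, int32_t lower, uint32_t range,
                 const int *table, int default_target)
{
    /* One unsigned comparison rejects indices on both sides: an
       index below LOWER wraps around to a value above RANGE.  */
    uint32_t offset = (uint32_t)index - (uint32_t)lower;

    if (offset > range)
        return default_target;
    return table[offset];
}
```

With a lower bound of 5 and a range of 2, the table has three entries and serves indices 5, 6 and 7; every other index selects the default target.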
| 570 | `tablejump' |
---|
| 571 | Instruction to jump to a variable address. This is a low-level |
---|
| 572 | capability which can be used to implement a dispatch table when |
---|
| 573 | there is no `casesi' pattern. |
---|
| 574 | |
---|
| 575 | This pattern requires two operands: the address or offset, and a |
---|
| 576 | label which should immediately precede the jump table. If the |
---|
| 577 | macro `CASE_VECTOR_PC_RELATIVE' is defined then the first operand |
---|
| 578 | is an offset which counts from the address of the table; |
---|
| 579 | otherwise, it is an absolute address to jump to. In either case, |
---|
| 580 | the first operand has mode `Pmode'. |
---|
| 581 | |
---|
| 582 | The `tablejump' insn is always the last insn before the jump table |
---|
| 583 | it uses. Its assembler code normally has no need to use the |
---|
| 584 | second operand, but you should incorporate it in the RTL pattern so |
---|
| 585 | that the jump optimizer will not delete the table as unreachable |
---|
| 586 | code. |
---|
| 587 | |
---|
| 588 | `save_stack_block' |
---|
| 589 | `save_stack_function' |
---|
| 590 | `save_stack_nonlocal' |
---|
| 591 | `restore_stack_block' |
---|
| 592 | `restore_stack_function' |
---|
| 593 | `restore_stack_nonlocal' |
---|
| 594 | Most machines save and restore the stack pointer by copying it to |
---|
| 595 | or from an object of mode `Pmode'. Do not define these patterns on |
---|
| 596 | such machines. |
---|
| 597 | |
---|
| 598 | Some machines require special handling for stack pointer saves and |
---|
| 599 | restores. On those machines, define the patterns corresponding to |
---|
| 600 | the non-standard cases by using a `define_expand' (*note Expander |
---|
| 601 | Definitions::.) that produces the required insns. The three types |
---|
| 602 | of saves and restores are: |
---|
| 603 | |
---|
| 604 | 1. `save_stack_block' saves the stack pointer at the start of a |
---|
| 605 | block that allocates a variable-sized object, and |
---|
| 606 | `restore_stack_block' restores the stack pointer when the |
---|
| 607 | block is exited. |
---|
| 608 | |
---|
| 609 | 2. `save_stack_function' and `restore_stack_function' do a |
---|
| 610 | similar job for the outermost block of a function and are |
---|
| 611 | used when the function allocates variable-sized objects or |
---|
| 612 | calls `alloca'. Only the epilogue uses the restored stack |
---|
| 613 | pointer, allowing a simpler save or restore sequence on some |
---|
| 614 | machines. |
---|
| 615 | |
---|
| 616 | 3. `save_stack_nonlocal' is used in functions that contain labels |
---|
| 617 | branched to by nested functions. It saves the stack pointer |
---|
| 618 | in such a way that the inner function can use |
---|
| 619 | `restore_stack_nonlocal' to restore the stack pointer. The |
---|
| 620 | compiler generates code to restore the frame and argument |
---|
| 621 | pointer registers, but some machines require saving and |
---|
| 622 | restoring additional data such as register window information |
---|
| 623 | or stack backchains. Place insns in these patterns to save |
---|
| 624 | and restore any such required data. |
---|
| 625 | |
---|
| 626 | When saving the stack pointer, operand 0 is the save area and |
---|
| 627 | operand 1 is the stack pointer. The mode used to allocate the |
---|
| 628 | save area is the mode of operand 0. You must specify an integral |
---|
| 629 | mode, or `VOIDmode' if no save area is needed for a particular |
---|
| 630 | type of save (either because no save is needed or because a |
---|
| 631 | machine-specific save area can be used). For restore operations, |
---|
| 632 | operand 0 is the stack pointer and operand 1 is the save area. If |
---|
| 633 | `save_stack_block' is defined, operand 0 must not be `VOIDmode' |
---|
| 634 | since these saves can be arbitrarily nested. |
---|
| 635 | |
---|
| 636 | A save area is a `mem' that is at a constant offset from |
---|
| 637 | `virtual_stack_vars_rtx' when the stack pointer is saved for use by |
---|
| 638 | nonlocal gotos and a `reg' in the other two cases. |
---|
| 639 | |
---|
| 640 | `allocate_stack' |
---|
| 641 | Subtract (or add if `STACK_GROWS_DOWNWARD' is undefined) operand 0 |
---|
| 642 | from the stack pointer to create space for dynamically allocated |
---|
| 643 | data. |
---|
| 644 | |
---|
| 645 | Do not define this pattern if all that must be done is the |
---|
| 646 | subtraction. Some machines require other operations such as stack |
---|
| 647 | probes or maintaining the back chain. Define this pattern to emit |
---|
| 648 | those operations in addition to updating the stack pointer. |
---|
| 649 | |
---|
| 650 | |
---|
| 651 | File: gcc.info, Node: Pattern Ordering, Next: Dependent Patterns, Prev: Standard Names, Up: Machine Desc |
---|
| 652 | |
---|
| 653 | When the Order of Patterns Matters |
---|
| 654 | ================================== |
---|
| 655 | |
---|
| 656 | Sometimes an insn can match more than one instruction pattern. Then |
---|
| 657 | the pattern that appears first in the machine description is the one |
---|
| 658 | used. Therefore, more specific patterns (patterns that will match |
---|
| 659 | fewer things) and faster instructions (those that will produce better |
---|
| 660 | code when they do match) should usually go first in the description. |
---|
| 661 | |
---|
| 662 | In some cases the effect of ordering the patterns can be used to hide |
---|
| 663 | a pattern when it is not valid. For example, the 68000 has an |
---|
| 664 | instruction for converting a fullword to floating point and another for |
---|
| 665 | converting a byte to floating point. An instruction converting an |
---|
| 666 | integer to floating point could match either one. We put the pattern |
---|
| 667 | to convert the fullword first to make sure that one will be used rather |
---|
| 668 | than the other. (Otherwise a large integer might be generated as a |
---|
| 669 | single-byte immediate quantity, which would not work.) Instead of using |
---|
| 670 | this pattern ordering it would be possible to make the pattern for |
---|
| 671 | convert-a-byte smart enough to deal properly with any constant value. |
---|
| 672 | |
---|
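The first-match rule can be illustrated with a toy matcher in C. This is a sketch under simplifying assumptions, not the recognizer GNU CC generates: the "insn" is reduced to a single integer value, and the names `struct pattern', `first_match', `specific_first' and `general_first' are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a machine description entry: a name and a
   predicate that says whether the pattern accepts a value. */
struct pattern {
    const char *name;
    int (*matches)(long value);
};

static int matches_byte(long v) { return v >= -128 && v <= 127; }
static int matches_any(long v)  { (void)v; return 1; }

/* The same two patterns in both possible orders. */
const struct pattern specific_first[] = {
    { "byte",     matches_byte },
    { "fullword", matches_any  },
};
const struct pattern general_first[] = {
    { "fullword", matches_any  },
    { "byte",     matches_byte },
};

/* Return the index of the first pattern that matches, or -1.
   Later patterns are never consulted once one has matched. */
int first_match(const struct pattern *pats, size_t n, long value)
{
    size_t i;

    assert(pats != NULL);
    for (i = 0; i < n; i++)
        if (pats[i].matches(value))
            return (int)i;
    return -1;
}
```

In `general_first' the byte pattern can never be chosen, because the fullword pattern ahead of it accepts every value. This is the sense in which ordering can hide a pattern.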
| 673 | |
---|
| 674 | File: gcc.info, Node: Dependent Patterns, Next: Jump Patterns, Prev: Pattern Ordering, Up: Machine Desc |
---|
| 675 | |
---|
| 676 | Interdependence of Patterns |
---|
| 677 | =========================== |
---|
| 678 | |
---|
| 679 | Every machine description must have a named pattern for each of the |
---|
| 680 | conditional branch names `bCOND'. The recognition template must always |
---|
| 681 | have the form |
---|
| 682 | |
---|
| 683 | (set (pc) |
---|
| 684 | (if_then_else (COND (cc0) (const_int 0)) |
---|
| 685 | (label_ref (match_operand 0 "" "")) |
---|
| 686 | (pc))) |
---|
| 687 | |
---|
| 688 | In addition, every machine description must have an anonymous pattern |
---|
| 689 | for each of the possible reverse-conditional branches. Their templates |
---|
| 690 | look like |
---|
| 691 | |
---|
| 692 | (set (pc) |
---|
| 693 | (if_then_else (COND (cc0) (const_int 0)) |
---|
| 694 | (pc) |
---|
| 695 | (label_ref (match_operand 0 "" "")))) |
---|
| 696 | |
---|
| 697 | They are necessary because jump optimization can turn direct-conditional |
---|
| 698 | branches into reverse-conditional branches. |
---|
| 699 | |
---|
| 700 | It is often convenient to use the `match_operator' construct to |
---|
| 701 | reduce the number of patterns that must be specified for branches. For |
---|
| 702 | example, |
---|
| 703 | |
---|
| 704 | (define_insn "" |
---|
| 705 | [(set (pc) |
---|
| 706 | (if_then_else (match_operator 0 "comparison_operator" |
---|
| 707 | [(cc0) (const_int 0)]) |
---|
| 708 | (pc) |
---|
| 709 | (label_ref (match_operand 1 "" ""))))] |
---|
| 710 | "CONDITION" |
---|
| 711 | "...") |
---|
| 712 | |
---|
| 713 | In some cases machines support instructions identical except for the |
---|
| 714 | machine mode of one or more operands. For example, there may be |
---|
| 715 | "sign-extend halfword" and "sign-extend byte" instructions whose |
---|
| 716 | patterns are |
---|
| 717 | |
---|
| 718 | (set (match_operand:SI 0 ...) |
---|
| 719 | (extend:SI (match_operand:HI 1 ...))) |
---|
| 720 | |
---|
| 721 | (set (match_operand:SI 0 ...) |
---|
| 722 | (extend:SI (match_operand:QI 1 ...))) |
---|
| 723 | |
---|
| 724 | Constant integers do not specify a machine mode, so an instruction to |
---|
| 725 | extend a constant value could match either pattern. The pattern it |
---|
| 726 | actually will match is the one that appears first in the file. For |
---|
| 727 | correct results, this must be the one for the widest possible mode |
---|
| 728 | (`HImode', here). If the pattern matches the `QImode' instruction, the |
---|
| 729 | results will be incorrect if the constant value does not actually fit |
---|
| 730 | that mode. |
---|
| 731 | |
---|
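The hazard can be seen with plain integer arithmetic. The sketch below is not compiler code; `sign_extend' is a hypothetical helper that extends the low bits of a value the way a sign-extend instruction of that width would.

```c
#include <assert.h>

/* Sign-extend the low `bits' bits of `value' (hypothetical helper).
   Bits above the chosen width are discarded first, as they would be
   if the constant were emitted as an operand of that width. */
long sign_extend(long value, int bits)
{
    long mask, sign;

    assert(bits > 0 && bits < 31);
    mask = (1L << bits) - 1;      /* keeps the low `bits' bits */
    sign = 1L << (bits - 1);      /* position of the sign bit  */
    return ((value & mask) ^ sign) - sign;
}
```

Extending the constant 300 as a 16-bit (`HImode') quantity yields 300, but extending it as an 8-bit (`QImode') quantity yields 44, because 300 does not fit in a byte.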
| 732 | Such instructions to extend constants are rarely generated because |
---|
| 733 | they are optimized away, but they do occasionally happen in nonoptimized |
---|
| 734 | compilations. |
---|
| 735 | |
---|
| 736 | If a constraint in a pattern allows a constant, the reload pass may |
---|
| 737 | replace a register with a constant permitted by the constraint in some |
---|
| 738 | cases. Similarly for memory references. Because of this substitution, |
---|
| 739 | you should not provide separate patterns for increment and decrement |
---|
| 740 | instructions. Instead, they should be generated from the same pattern |
---|
| 741 | that supports register-register add insns by examining the operands and |
---|
| 742 | generating the appropriate machine instruction. |
---|
| 743 | |
---|
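One way to picture this is a single output routine that inspects its operands and picks the instruction. The C below is a sketch only: the mnemonics `inc', `dec' and `add', and the names `choose_add_form' and `add_template', are hypothetical and do not come from any real machine description.

```c
#include <assert.h>

enum add_form { ADD_GENERAL, ADD_INC, ADD_DEC };

/* Decide which machine instruction one add pattern should emit.
   `src_is_const' says whether the second source operand ended up
   as a constant, which may happen only after reload. */
enum add_form choose_add_form(int src_is_const, long src_value)
{
    if (src_is_const && src_value == 1)
        return ADD_INC;
    if (src_is_const && src_value == -1)
        return ADD_DEC;
    return ADD_GENERAL;
}

/* Map the chosen form to an output template (made-up mnemonics). */
const char *add_template(enum add_form form)
{
    static const char *const templates[] = {
        "add %2,%0",   /* ADD_GENERAL */
        "inc %0",      /* ADD_INC     */
        "dec %0"       /* ADD_DEC     */
    };

    assert((int)form >= 0 && (int)form <= 2);
    return templates[form];
}
```

Because the choice is made when the insn is output, it still works when reload has replaced a register with the constant 1 or -1.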
| 744 | |
---|
| 745 | File: gcc.info, Node: Jump Patterns, Next: Insn Canonicalizations, Prev: Dependent Patterns, Up: Machine Desc |
---|
| 746 | |
---|
| 747 | Defining Jump Instruction Patterns |
---|
| 748 | ================================== |
---|
| 749 | |
---|
| 750 | For most machines, GNU CC assumes that the machine has a condition |
---|
| 751 | code. A comparison insn sets the condition code, recording the results |
---|
| 752 | of both signed and unsigned comparison of the given operands. A |
---|
| 753 | separate branch insn tests the condition code and branches or not |
---|
| 754 | according to its value. The branch insns come in distinct signed and |
---|
| 755 | unsigned flavors. Many common machines, such as the Vax, the 68000 and |
---|
| 756 | the 32000, work this way. |
---|
| 757 | |
---|
| 758 | Some machines have distinct signed and unsigned compare |
---|
| 759 | instructions, and only one set of conditional branch instructions. The |
---|
| 760 | easiest way to handle these machines is to treat them just like the |
---|
| 761 | others until the final stage where assembly code is written. At this |
---|
| 762 | time, when outputting code for the compare instruction, peek ahead at |
---|
| 763 | the following branch using `next_cc0_user (insn)'. (The variable |
---|
| 764 | `insn' refers to the insn being output, in the output-writing code in |
---|
| 765 | an instruction pattern.) If the RTL says that it is an unsigned branch, |
---|
| 766 | output an unsigned compare; otherwise output a signed compare. When |
---|
| 767 | the branch itself is output, you can treat signed and unsigned branches |
---|
| 768 | identically. |
---|
| 769 | |
---|
| 770 | The reason you can do this is that GNU CC always generates a pair of |
---|
| 771 | consecutive RTL insns, possibly separated by `note' insns, one to set |
---|
| 772 | the condition code and one to test it, and keeps the pair inviolate |
---|
| 773 | until the end. |
---|
| 774 | |
---|
| 775 | To go with this technique, you must define the machine-description |
---|
| 776 | macro `NOTICE_UPDATE_CC' to do `CC_STATUS_INIT'; in other words, no |
---|
| 777 | compare instruction is superfluous. |
---|
| 778 | |
---|
| 779 | Some machines have compare-and-branch instructions and no condition |
---|
| 780 | code. A similar technique works for them. When it is time to "output" |
---|
| 781 | a compare instruction, record its operands in two static variables. |
---|
| 782 | When outputting the branch-on-condition-code instruction that follows, |
---|
| 783 | actually output a compare-and-branch instruction that uses the |
---|
| 784 | remembered operands. |
---|
| 785 | |
---|
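The remembered-operands technique might look roughly like this in C. It is a sketch, not code from an actual port: the functions `output_compare' and `output_branch', the string operands, and the `cb' mnemonic are all hypothetical, and a real port would work with `rtx' operands rather than strings.

```c
#include <assert.h>
#include <stdio.h>

/* Operands of the most recent compare, remembered for the branch. */
static const char *saved_op0;
static const char *saved_op1;

/* "Output" a compare: emit no assembler text, only record the
   operands.  Returns the number of characters written (always 0). */
int output_compare(char *buf, size_t size, const char *op0, const char *op1)
{
    assert(buf != NULL && size > 0);
    saved_op0 = op0;
    saved_op1 = op1;
    buf[0] = '\0';
    return 0;
}

/* Output the branch as one compare-and-branch instruction that uses
   the remembered operands.  Returns the length of the text. */
int output_branch(char *buf, size_t size, const char *cond, const char *label)
{
    assert(saved_op0 != NULL && saved_op1 != NULL);
    return snprintf(buf, size, "cb%s %s,%s,%s",
                    cond, saved_op0, saved_op1, label);
}
```

Calling `output_compare' with "r1" and "r2" and then `output_branch' with "eq" and "L5" produces the single line `cbeq r1,r2,L5'. The scheme depends on the compare and the branch being output one right after the other.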
| 786 | It also works to define patterns for compare-and-branch instructions. |
---|
| 787 | In optimizing compilation, the pair of compare and branch instructions |
---|
| 788 | will be combined according to these patterns. But this does not happen |
---|
| 789 | if optimization is not requested. So you must use one of the solutions |
---|
| 790 | above in addition to any special patterns you define. |
---|
| 791 | |
---|
| 792 | In many RISC machines, most instructions do not affect the condition |
---|
| 793 | code and there may not even be a separate condition code register. On |
---|
| 794 | these machines, the restriction that the definition and use of the |
---|
| 795 | condition code be adjacent insns is not necessary and can prevent |
---|
| 796 | important optimizations. For example, on the IBM RS/6000, there is a |
---|
| 797 | delay for taken branches unless the condition code register is set three |
---|
| 798 | instructions earlier than the conditional branch. The instruction |
---|
| 799 | scheduler cannot perform this optimization if it is not permitted to |
---|
| 800 | separate the definition and use of the condition code register. |
---|
| 801 | |
---|
| 802 | On these machines, do not use `(cc0)', but instead use a register to |
---|
| 803 | represent the condition code. If there is a specific condition code |
---|
| 804 | register in the machine, use a hard register. If the condition code or |
---|
| 805 | comparison result can be placed in any general register, or if there are |
---|
| 806 | multiple condition registers, use a pseudo register. |
---|
| 807 | |
---|
| 808 | On some machines, the type of branch instruction generated may |
---|
| 809 | depend on the way the condition code was produced; for example, on the |
---|
| 810 | 68k and Sparc, setting the condition code directly from an add or |
---|
| 811 | subtract instruction does not clear the overflow bit the way that a test |
---|
| 812 | instruction does, so a different branch instruction must be used for |
---|
| 813 | some conditional branches. For machines that use `(cc0)', the set and |
---|
| 814 | use of the condition code must be adjacent (separated only by `note' |
---|
| 815 | insns) allowing flags in `cc_status' to be used. (*Note Condition |
---|
| 816 | Code::.) Also, either of the comparison and branch insns can be |
---|
| 817 | located from the other by using `prev_cc0_setter' and `next_cc0_user'. |
---|
| 818 | |
---|
| 819 | However, this is not true on machines that do not use `(cc0)'. On |
---|
| 820 | those machines, no assumptions can be made about the adjacency of the |
---|
| 821 | compare and branch insns and the above methods cannot be used. Instead, |
---|
| 822 | we use the machine mode of the condition code register to record |
---|
| 823 | different formats of the condition code register. |
---|
| 824 | |
---|
| 825 | Registers used to store the condition code value should have a mode |
---|
| 826 | that is in class `MODE_CC'. Normally, it will be `CCmode'. If |
---|
| 827 | additional modes are required (as for the add example mentioned above in |
---|
| 828 | the Sparc), define the macro `EXTRA_CC_MODES' to list the additional |
---|
| 829 | modes required (*note Condition Code::.). Also define `EXTRA_CC_NAMES' |
---|
| 830 | to list the names of those modes and `SELECT_CC_MODE' to choose a mode |
---|
| 831 | given an operand of a compare. |
---|
| 832 | |
---|
| 833 | If it is known during RTL generation that a different mode will be |
---|
| 834 | required (for example, if the machine has separate compare instructions |
---|
| 835 | for signed and unsigned quantities, like most IBM processors), it can |
---|
| 836 | be specified at that time. |
---|
| 837 | |
---|
| 838 | If the cases that require different modes would be made by |
---|
| 839 | instruction combination, the macro `SELECT_CC_MODE' determines which |
---|
| 840 | machine mode should be used for the comparison result. The patterns |
---|
| 841 | should be written using that mode. To support the case of the add on |
---|
| 842 | the Sparc discussed above, we have the pattern |
---|
| 843 | |
---|
| 844 | (define_insn "" |
---|
| 845 | [(set (reg:CC_NOOV 0) |
---|
| 846 | (compare:CC_NOOV |
---|
| 847 | (plus:SI (match_operand:SI 0 "register_operand" "%r") |
---|
| 848 | (match_operand:SI 1 "arith_operand" "rI")) |
---|
| 849 | (const_int 0)))] |
---|
| 850 | "" |
---|
| 851 | "...") |
---|
| 852 | |
---|
| 853 | The `SELECT_CC_MODE' macro on the Sparc returns `CC_NOOVmode' for |
---|
| 854 | comparisons whose argument is a `plus'. |
---|
| 855 | |
---|
| 856 | |
---|
| 857 | File: gcc.info, Node: Insn Canonicalizations, Next: Peephole Definitions, Prev: Jump Patterns, Up: Machine Desc |
---|
| 858 | |
---|
| 859 | Canonicalization of Instructions |
---|
| 860 | ================================ |
---|
| 861 | |
---|
| 862 | There are often cases where multiple RTL expressions could represent |
---|
| 863 | an operation performed by a single machine instruction. This situation |
---|
| 864 | is most commonly encountered with logical, branch, and |
---|
| 865 | multiply-accumulate instructions. In such cases, the compiler attempts |
---|
| 866 | to convert these multiple RTL expressions into a single canonical form |
---|
| 867 | to reduce the number of insn patterns required. |
---|
| 868 | |
---|
| 869 | In addition to algebraic simplifications, the following |
---|
| 870 | canonicalizations are performed: |
---|
| 871 | |
---|
| 872 | * For commutative and comparison operators, a constant is always |
---|
| 873 | made the second operand. If a machine only supports a constant as |
---|
| 874 | the second operand, only patterns that match a constant in the |
---|
| 875 | second operand need be supplied. |
---|
| 876 | |
---|
| 877 | For these operators, if only one operand is a `neg', `not', |
---|
| 878 | `mult', `plus', or `minus' expression, it will be the first |
---|
| 879 | operand. |
---|
| 880 | |
---|
| 881 | * For the `compare' operator, a constant is always the second operand |
---|
| 882 | on machines where `cc0' is used (*note Jump Patterns::.). On other |
---|
| 883 | machines, there are rare cases where the compiler might want to |
---|
| 884 | construct a `compare' with a constant as the first operand. |
---|
| 885 | However, these cases are not common enough for it to be worthwhile |
---|
| 886 | to provide a pattern matching a constant as the first operand |
---|
| 887 | unless the machine actually has such an instruction. |
---|
| 888 | |
---|
| 889 | An operand of `neg', `not', `mult', `plus', or `minus' is made the |
---|
| 890 | first operand under the same conditions as above. |
---|
| 891 | |
---|
| 892 | * `(minus X (const_int N))' is converted to `(plus X (const_int |
---|
| 893 | -N))'. |
---|
| 894 | |
---|
| 895 | * Within address computations (i.e., inside `mem'), a left shift is |
---|
| 896 | converted into the appropriate multiplication by a power of two. |
---|
| 897 | |
---|
| 898 | * De Morgan's Law is used to move bitwise negation inside a bitwise |
---|
| 899 | logical-and or logical-or operation. If this results in only one |
---|
| 900 | operand being a `not' expression, it will be the first one. |
---|
| 901 | |
---|
| 902 | A machine that has an instruction that performs a bitwise |
---|
| 903 | logical-and of one operand with the bitwise negation of the other |
---|
| 904 | should specify the pattern for that instruction as |
---|
| 905 | |
---|
| 906 | (define_insn "" |
---|
| 907 | [(set (match_operand:M 0 ...) |
---|
| 908 | (and:M (not:M (match_operand:M 1 ...)) |
---|
| 909 | (match_operand:M 2 ...)))] |
---|
| 910 | "..." |
---|
| 911 | "...") |
---|
| 912 | |
---|
| 913 | Similarly, a pattern for a "NAND" instruction should be written |
---|
| 914 | |
---|
| 915 | (define_insn "" |
---|
| 916 | [(set (match_operand:M 0 ...) |
---|
| 917 | (ior:M (not:M (match_operand:M 1 ...)) |
---|
| 918 | (not:M (match_operand:M 2 ...))))] |
---|
| 919 | "..." |
---|
| 920 | "...") |
---|
| 921 | |
---|
| 922 | In both cases, it is not necessary to include patterns for the many |
---|
| 923 | logically equivalent RTL expressions. |
---|
| 924 | |
---|
| 925 | * The only possible RTL expressions involving both bitwise |
---|
| 926 | exclusive-or and bitwise negation are `(xor:M X Y)' and `(not:M |
---|
| 927 | (xor:M X Y))'. |
---|
| 928 | |
---|
| 929 | * The sum of three items, one of which is a constant, will only |
---|
| 930 | appear in the form |
---|
| 931 | |
---|
| 932 | (plus:M (plus:M X Y) CONSTANT) |
---|
| 933 | |
---|
| 934 | * On machines that do not use `cc0', `(compare X (const_int 0))' |
---|
| 935 | will be converted to X. |
---|
| 936 | |
---|
| 937 | * Equality comparisons of a group of bits (usually a single bit) |
---|
| 938 | with zero will be written using `zero_extract' rather than the |
---|
| 939 | equivalent `and' or `sign_extract' operations. |
---|
| 940 | |
---|
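A few of the rules above can be mimicked on a toy expression type. The C below is a sketch under heavy simplification and is not how GNU CC represents RTL: `struct expr', `canonicalize' and `nand_as_ior' are hypothetical names, and operands are reduced to either a constant or a register number.

```c
#include <assert.h>

enum code { PLUS, MINUS, AND, IOR };

/* Toy binary expression.  Each operand is a constant value or a
   register number, according to its `_is_const' flag. */
struct expr {
    enum code code;
    int op0_is_const, op1_is_const;
    long op0, op1;
};

void canonicalize(struct expr *e)
{
    assert(e != 0);

    /* (minus X (const_int N)) becomes (plus X (const_int -N)). */
    if (e->code == MINUS && e->op1_is_const) {
        e->code = PLUS;
        e->op1 = -e->op1;
    }

    /* For commutative operators, a constant is made the second
       operand. */
    if (e->code != MINUS && e->op0_is_const && !e->op1_is_const) {
        long value = e->op0;
        e->op0 = e->op1;
        e->op1 = value;
        e->op0_is_const = 0;
        e->op1_is_const = 1;
    }
}

/* De Morgan's Law: the canonical form of NAND is the inclusive-or
   of the two negated operands. */
unsigned int nand_as_ior(unsigned int a, unsigned int b)
{
    return ~a | ~b;
}
```

The value computed by `nand_as_ior' equals `~(a & b)' for every pair of inputs, which is why a machine description needs a pattern only for the `ior' of two `not' expressions.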
| 941 | |
---|
| 942 | File: gcc.info, Node: Peephole Definitions, Next: Expander Definitions, Prev: Insn Canonicalizations, Up: Machine Desc |
---|
| 943 | |
---|
| 944 | Machine-Specific Peephole Optimizers |
---|
| 945 | ==================================== |
---|
| 946 | |
---|
| 947 | In addition to instruction patterns, the `md' file may contain |
---|
| 948 | definitions of machine-specific peephole optimizations. |
---|
| 949 | |
---|
| 950 | The combiner does not notice certain peephole optimizations when the |
---|
| 951 | data flow in the program does not suggest that it should try them. For |
---|
| 952 | example, sometimes two consecutive insns related in purpose can be |
---|
| 953 | combined even though the second one does not appear to use a register |
---|
| 954 | computed in the first one. A machine-specific peephole optimizer can |
---|
| 955 | detect such opportunities. |
---|
| 956 | |
---|
| 957 | A definition looks like this: |
---|
| 958 | |
---|
| 959 | (define_peephole |
---|
| 960 | [INSN-PATTERN-1 |
---|
| 961 | INSN-PATTERN-2 |
---|
| 962 | ...] |
---|
| 963 | "CONDITION" |
---|
| 964 | "TEMPLATE" |
---|
| 965 | "OPTIONAL INSN-ATTRIBUTES") |
---|
| 966 | |
---|
| 967 | The last string operand may be omitted if you are not using any |
---|
| 968 | machine-specific information in this machine description. If present, |
---|
| 969 | it must obey the same rules as in a `define_insn'. |
---|
| 970 | |
---|
| 971 | In this skeleton, INSN-PATTERN-1 and so on are patterns to match |
---|
| 972 | consecutive insns. The optimization applies to a sequence of insns when |
---|
| 973 | INSN-PATTERN-1 matches the first one, INSN-PATTERN-2 matches the next, |
---|
| 974 | and so on. |
---|
| 975 | |
---|
| 976 | Each of the insns matched by a peephole must also match a |
---|
| 977 | `define_insn'. Peepholes are checked only at the last stage just |
---|
| 978 | before code generation, and only optionally. Therefore, any insn which |
---|
| 979 | would match a peephole but no `define_insn' will cause a crash in code |
---|
| 980 | generation in an unoptimized compilation, or at various optimization |
---|
| 981 | stages. |
---|
| 982 | |
---|
| 983 | The operands of the insns are matched with `match_operand', |
---|
| 984 | `match_operator', and `match_dup', as usual. What is not usual is that |
---|
| 985 | the operand numbers apply to all the insn patterns in the definition. |
---|
| 986 | So, you can check for identical operands in two insns by using |
---|
| 987 | `match_operand' in one insn and `match_dup' in the other. |
---|
| 988 | |
---|
| 989 | The operand constraints used in `match_operand' patterns do not have |
---|
| 990 | any direct effect on the applicability of the peephole, but they will |
---|
| 991 | be validated afterward, so make sure your constraints are general enough |
---|
| 992 | to apply whenever the peephole matches. If the peephole matches but |
---|
| 993 | the constraints are not satisfied, the compiler will crash. |
---|
| 994 | |
---|
| 995 | It is safe to omit constraints in all the operands of the peephole; |
---|
| 996 | or you can write constraints which serve as a double-check on the |
---|
| 997 | criteria previously tested. |
---|
| 998 | |
---|
| 999 | Once a sequence of insns matches the patterns, the CONDITION is |
---|
| 1000 | checked. This is a C expression which makes the final decision whether |
---|
| 1001 | to perform the optimization (we do so if the expression is nonzero). If |
---|
| 1002 | CONDITION is omitted (in other words, the string is empty) then the |
---|
| 1003 | optimization is applied to every sequence of insns that matches the |
---|
| 1004 | patterns. |
---|
| 1005 | |
---|
| 1006 | The defined peephole optimizations are applied after register |
---|
| 1007 | allocation is complete. Therefore, the peephole definition can check |
---|
| 1008 | which operands have ended up in which kinds of registers, just by |
---|
| 1009 | looking at the operands. |
---|
| 1010 | |
---|
| 1011 | The way to refer to the operands in CONDITION is to write |
---|
| 1012 | `operands[I]' for operand number I (as matched by `(match_operand I |
---|
| 1013 | ...)'). Use the variable `insn' to refer to the last of the insns |
---|
| 1014 | being matched; use `prev_active_insn' to find the preceding insns. |
---|
| 1015 | |
---|
| 1016 | When optimizing computations with intermediate results, you can use |
---|
| 1017 | CONDITION to match only when the intermediate results are not used |
---|
| 1018 | elsewhere. Use the C expression `dead_or_set_p (INSN, OP)', where INSN |
---|
| 1019 | is the insn in which you expect the value to be used for the last time |
---|
| 1020 | (from the value of `insn', together with use of `prev_nonnote_insn'), |
---|
| 1021 | and OP is the intermediate value (from `operands[I]'). |
---|
| 1022 | |
---|
| 1023 | Applying the optimization means replacing the sequence of insns with |
---|
| 1024 | one new insn. The TEMPLATE controls ultimate output of assembler code |
---|
| 1025 | for this combined insn. It works exactly like the template of a |
---|
| 1026 | `define_insn'. Operand numbers in this template are the same ones used |
---|
| 1027 | in matching the original sequence of insns. |
---|
| 1028 | |
---|
| 1029 | The result of a defined peephole optimizer does not need to match |
---|
| 1030 | any of the insn patterns in the machine description; it does not even |
---|
| 1031 | have an opportunity to match them. The peephole optimizer definition |
---|
| 1032 | itself serves as the insn pattern to control how the insn is output. |
---|
| 1033 | |
---|
| 1034 | Defined peephole optimizers are run as assembler code is being |
---|
| 1035 | output, so the insns they produce are never combined or rearranged in |
---|
| 1036 | any way. |
---|
| 1037 | |
---|
| 1038 | Here is an example, taken from the 68000 machine description: |
---|
| 1039 | |
---|
| 1040 | (define_peephole |
---|
| 1041 | [(set (reg:SI 15) (plus:SI (reg:SI 15) (const_int 4))) |
---|
| 1042 | (set (match_operand:DF 0 "register_operand" "=f") |
---|
| 1043 | (match_operand:DF 1 "register_operand" "ad"))] |
---|
| 1044 | "FP_REG_P (operands[0]) && ! FP_REG_P (operands[1])" |
---|
| 1045 | "* |
---|
| 1046 | { |
---|
| 1047 | rtx xoperands[2]; |
---|
| 1048 | xoperands[1] = gen_rtx (REG, SImode, REGNO (operands[1]) + 1); |
---|
| 1049 | #ifdef MOTOROLA |
---|
| 1050 | output_asm_insn (\"move.l %1,(sp)\", xoperands); |
---|
| 1051 | output_asm_insn (\"move.l %1,-(sp)\", operands); |
---|
| 1052 | return \"fmove.d (sp)+,%0\"; |
---|
| 1053 | #else |
---|
| 1054 | output_asm_insn (\"movel %1,sp@\", xoperands); |
---|
| 1055 | output_asm_insn (\"movel %1,sp@-\", operands); |
---|
| 1056 | return \"fmoved sp@+,%0\"; |
---|
| 1057 | #endif |
---|
| 1058 | } |
---|
| 1059 | ") |
---|
| 1060 | |
---|
| 1061 | The effect of this optimization is to change |
---|
| 1062 | |
---|
| 1063 | jbsr _foobar |
---|
| 1064 | addql #4,sp |
---|
| 1065 | movel d1,sp@- |
---|
| 1066 | movel d0,sp@- |
---|
| 1067 | fmoved sp@+,fp0 |
---|
| 1068 | |
---|
| 1069 | into |
---|
| 1070 | |
---|
| 1071 | jbsr _foobar |
---|
| 1072 | movel d1,sp@ |
---|
| 1073 | movel d0,sp@- |
---|
| 1074 | fmoved sp@+,fp0 |
---|
| 1075 | |
---|
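That the two sequences leave the stack in the same state can be checked with a small simulation. This C sketch is not part of the machine description; `struct machine' and the function names are hypothetical, each array element stands for one 4-byte longword, and the final `fmoved', which is identical in both sequences, is left out.

```c
#include <assert.h>
#include <string.h>

/* Toy stack: `sp' indexes the top longword and the stack grows
   toward lower indices. */
struct machine {
    unsigned long stack[8];
    int sp;
};

/* addql #4,sp ; movel d1,sp@- ; movel d0,sp@- */
void original_sequence(struct machine *m, unsigned long d0, unsigned long d1)
{
    assert(m->sp >= 1 && m->sp < 8);
    m->sp += 1;                  /* addql #4,sp: pop one longword */
    m->stack[--m->sp] = d1;      /* movel d1,sp@-                 */
    m->stack[--m->sp] = d0;      /* movel d0,sp@-                 */
}

/* movel d1,sp@ ; movel d0,sp@- */
void optimized_sequence(struct machine *m, unsigned long d0, unsigned long d1)
{
    assert(m->sp >= 1 && m->sp < 8);
    m->stack[m->sp] = d1;        /* movel d1,sp@: overwrite the top */
    m->stack[--m->sp] = d0;      /* movel d0,sp@-                   */
}

/* Nonzero when both machines have the same pointer and contents. */
int same_state(const struct machine *a, const struct machine *b)
{
    return a->sp == b->sp
        && memcmp(a->stack, b->stack, sizeof a->stack) == 0;
}
```

Popping one longword and then pushing `d1' leaves the stack pointer where it started, so storing `d1' directly over the old top of stack has the same effect with one instruction fewer.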
| 1076 | INSN-PATTERN-1 and so on look *almost* like the second operand of |
---|
| 1077 | `define_insn'. There is one important difference: the second operand |
---|
| 1078 | of `define_insn' consists of one or more RTX's enclosed in square |
---|
| 1079 | brackets. Usually, there is only one: then the same action can be |
---|
| 1080 | written as an element of a `define_peephole'. But when there are |
---|
| 1081 | multiple actions in a `define_insn', they are implicitly enclosed in a |
---|
| 1082 | `parallel'. Then you must explicitly write the `parallel', and the |
---|
| 1083 | square brackets within it, in the `define_peephole'. Thus, if an insn |
---|
| 1084 | pattern looks like this, |
---|
| 1085 | |
---|
| 1086 | (define_insn "divmodsi4" |
---|
| 1087 | [(set (match_operand:SI 0 "general_operand" "=d") |
---|
| 1088 | (div:SI (match_operand:SI 1 "general_operand" "0") |
---|
| 1089 | (match_operand:SI 2 "general_operand" "dmsK"))) |
---|
| 1090 | (set (match_operand:SI 3 "general_operand" "=d") |
---|
| 1091 | (mod:SI (match_dup 1) (match_dup 2)))] |
---|
| 1092 | "TARGET_68020" |
---|
| 1093 | "divsl%.l %2,%3:%0") |
---|
| 1094 | |
---|
| 1095 | then the way to mention this insn in a peephole is as follows: |
---|
| 1096 | |
---|
| 1097 | (define_peephole |
---|
| 1098 | [... |
---|
| 1099 | (parallel |
---|
| 1100 | [(set (match_operand:SI 0 "general_operand" "=d") |
---|
| 1101 | (div:SI (match_operand:SI 1 "general_operand" "0") |
---|
| 1102 | (match_operand:SI 2 "general_operand" "dmsK"))) |
---|
| 1103 | (set (match_operand:SI 3 "general_operand" "=d") |
---|
| 1104 | (mod:SI (match_dup 1) (match_dup 2)))]) |
---|
| 1105 | ...] |
---|
| 1106 | ...) |
---|
| 1107 | |
---|