This file contains implementation notes for the GNU C Compiler for
the National Semiconductor 32032 chip (and the 32000 family).

The 32032 machine description and configuration file for this compiler
are, for NS32000 family machines, primarily machine independent.
However, since this release still depends on vendor-supplied
assemblers and linkers, the compiler must obey the existing
conventions of the actual machine to which this compiler is targeted.
In this case, that machine is a Sequent Balance 8000 running DYNIX 2.1.

The assembler for DYNIX 2.1 (and DYNIX 3.0, alas) does not cope with
the full generality of the addressing mode REGISTER RELATIVE.
Specifically, it generates incorrect code for operands of the
following form:

        sym(rn)

where `rn' is one of the general registers.  Correct code is generated
for operands of the form

        sym(pn)

where `pn' is one of the special processor registers (sb, fp, or sp).

An equivalent operand can be generated by the form

        sym[rn:b]

although this addressing mode is about twice as slow on the 32032.

The more efficient addressing mode is selected by defining the
constant SEQUENT_ADDRESS_BUG to 0.  It is currently defined to be 1,
which selects the slower sym[rn:b] form.

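
As a rough illustration of how such a switch could work, here is a
minimal C sketch of an operand formatter that picks between the two
spellings.  The helper name format_reg_relative and its signature are
made up for this note and are not taken from the compiler sources;
only SEQUENT_ADDRESS_BUG and the two operand forms come from the text
above.

```c
#include <stdio.h>

/* Default mirrors the setting described above; override with
   -DSEQUENT_ADDRESS_BUG=0 for an assembler without the bug.  */
#ifndef SEQUENT_ADDRESS_BUG
#define SEQUENT_ADDRESS_BUG 1
#endif

/* Hypothetical helper: write a register-relative operand for symbol
   SYM and general register number REGNO into BUF.  Returns the value
   of snprintf.  */
static int
format_reg_relative (char *buf, size_t size, const char *sym, int regno)
{
#if SEQUENT_ADDRESS_BUG
  /* Slower scaled-index spelling that the assembler handles correctly.  */
  return snprintf (buf, size, "%s[r%d:b]", sym, regno);
#else
  /* Plain register-relative spelling.  */
  return snprintf (buf, size, "%s(r%d)", sym, regno);
#endif
}
```

With the default setting of 1, format_reg_relative (buf, sizeof buf,
"_x", 3) writes _x[r3:b] into buf.  Operands based on sb, fp, or sp
would not need this treatment, since the assembler handles those
correctly.
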
Another bug in the assembler makes it impossible to compute with
explicit addresses.  In order to compute with a symbolic address, it
is necessary to load that address into a register using the "addr"
instruction.  For example, it is not possible to say

        cmpd _p,@_x

Rather one must say

        addr _x,rn
        cmpd _p,rn


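
The two-instruction workaround can be sketched as a small C routine
that emits the sequence as text.  This is only an illustration: the
function name emit_compare_with_address, its parameters, and the
choice of scratch register are assumptions of this note, not code
from the compiler.

```c
#include <stdio.h>

/* Hypothetical helper: emit "addr SYM,rN" followed by "cmpd OPERAND,rN"
   into BUF, where N is SCRATCH_REGNO.  The address of SYM is first
   loaded into the scratch register, then compared against OPERAND.
   Returns the value of snprintf.  */
static int
emit_compare_with_address (char *buf, size_t size, const char *operand,
                           const char *sym, int scratch_regno)
{
  return snprintf (buf, size, "\taddr %s,r%d\n\tcmpd %s,r%d\n",
                   sym, scratch_regno, operand, scratch_regno);
}
```

Calling it with operand "_p", symbol "_x", and scratch register 0
reproduces the addr/cmpd pair shown above with rn replaced by r0.
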
The ns32032 chip has a number of known bugs.  Any attempt to make the
compiler unaware of these deficiencies will surely bring disaster.
The current list of known bugs is as follows (list provided by Richard
Stallman):

  1) instructions with two overlapping operands in memory
     (unlikely in C code, perhaps impossible).

  2) floating point conversion instructions with constant
     operands (these may never happen, but I'm not certain).

  3) operands crossing a page boundary.  These can be prevented
     by setting the flag in tm.h that requires strict alignment.

  4) Scaled indexing in an insn following an insn that has a read-write
     operand in memory.  This can be prevented by placing a no-op in
     between.  I, Michael Tiemann, do not understand what exactly is meant
     by `read-write operand in memory'.  If this is referring to the special
     TOS mode, for example "addd 5,tos", then one need not fear, since this
     will never be generated.  However, if this includes "addd 5,-4(fp)"
     then there is room for disaster.  The Sequent compiler does not insert
     a no-op for code involving the latter, and I have been informed that
     Sequent is aware of this list of bugs, so I must assume that it is not
     a problem.

  5) The 32032 cannot shift by 32 bits.  It shifts modulo the word size
     of the operand.  Therefore, for 32-bit operations, 32-bit shifts are
     interpreted as zero bit shifts.  32-bit shifts have been removed from
     the compiler, but future hackers must be careful not to reintroduce
     them.

  6) The ns32032 is a very slow chip; however, some instructions are
     still very much slower than one might expect.  For example, it is
     almost always faster to double a quantity by adding it to itself than
     by shifting it by one, even if that quantity is deep in memory.  The
     MOVM instruction has a 20-cycle setup time, after which it moves data
     at about the speed that normal moves would.  It is also faster to use
     address generation instructions than shift instructions for left
     shifts of less than 4.  I do not claim that I generate optimal code
     for all given patterns, but where I did escape from National's "clean
     architecture", I did so because the timing specification from the data
     book says that I will win if I do.  I suppose this is called the
     "performance gap".

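
To make item 5 concrete, the following C sketch models a shift whose
count is taken modulo 32 next to a guarded version that treats a
32-bit shift as producing zero.  Both function names are invented for
this note, and the model assumes 32-bit operands only; it is a
simulation of the behaviour described above, not compiler code.

```c
#include <stdint.h>

/* Model of the behaviour in item 5: the count is reduced modulo the
   operand size (32 here), so a count of 32 acts as a count of 0.  */
static uint32_t
hw_shift_left32 (uint32_t value, unsigned count)
{
  return value << (count & 31u);
}

/* Guarded shift: counts of 32 or more yield zero, which is what code
   shifting out every bit would expect.  The explicit test also avoids
   an out-of-range shift count in C itself.  */
static uint32_t
safe_shift_left32 (uint32_t value, unsigned count)
{
  return count >= 32u ? 0u : value << count;
}
```

For counts below 32 the two functions agree.  They differ at 32,
where the modelled hardware shift leaves the value unchanged and the
guarded shift returns zero.
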
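
The timing advice in item 6 amounts to choosing a cheaper way to
compute a small left shift.  The sketch below shows one way to read
that rule in C: count 1 becomes an add of the value to itself, counts
2 and 3 become a multiply by 4 or 8 (the scale factors an address
generation instruction with scaled indexing can apply), and anything
else is a real shift.  The enum, the function names, and the exact
cut-offs are assumptions of this note rather than the compiler's
actual patterns.

```c
#include <stdint.h>

enum shift_strategy
{
  USE_ADD_TO_SELF,         /* x + x */
  USE_ADDRESS_GENERATION,  /* x scaled by 4 or 8 */
  USE_SHIFT_INSN           /* ordinary shift */
};

/* Hypothetical selection rule for a left shift by a constant COUNT.  */
static enum shift_strategy
choose_left_shift (unsigned count)
{
  if (count == 1)
    return USE_ADD_TO_SELF;
  if (count == 2 || count == 3)
    return USE_ADDRESS_GENERATION;
  return USE_SHIFT_INSN;
}

/* Compute X shifted left by COUNT using the chosen strategy, to show
   that each alternative yields the same value as the shift.  */
static uint32_t
left_shift_by_strategy (uint32_t x, unsigned count)
{
  switch (choose_left_shift (count))
    {
    case USE_ADD_TO_SELF:
      return x + x;
    case USE_ADDRESS_GENERATION:
      return x * (count == 2 ? 4u : 8u);
    default:
      /* Mask keeps the C shift count in range.  */
      return x << (count & 31u);
    }
}
```

Unsigned arithmetic wraps in C, so each branch gives the same result
as x << count for every count below 32.
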
Signed bitfield extraction has not been implemented.  It is not
provided by the NS32032, and while it is most certainly possible to do
better than the standard shift-left/shift-right sequence, it is also
quite hairy.  Also, since signed bitfields do not yet exist in C, this
omission seems relatively harmless.


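
For reference, the standard shift-left/shift-right sequence mentioned
above can be sketched in C as follows.  The function name and the
convention that bit positions count from the least significant bit
are assumptions of this note.  The sketch also relies on two
implementation-defined behaviours that GCC provides: converting an
out-of-range unsigned value to int32_t wraps, and right-shifting a
negative value is an arithmetic shift.

```c
#include <stdint.h>

/* Extract the WIDTH-bit field of WORD whose lowest bit is at position
   POS, and sign-extend it.  Requires 1 <= WIDTH and POS + WIDTH <= 32.  */
static int32_t
extract_signed_field (uint32_t word, unsigned pos, unsigned width)
{
  /* Shift left so the top bit of the field lands in bit 31.  */
  uint32_t left = word << (32u - pos - width);

  /* Arithmetic shift right brings the field down to bit 0 and copies
     its top bit into all higher bits.  */
  return (int32_t) left >> (32u - width);
}
```

For example, the 4-bit field at position 4 of 0xF0 has all bits set,
so it extracts as -1, while the same field of 0x70 extracts as 7.
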
Zero extractions could be better implemented if it were possible in
GCC to provide sized zero extractions: i.e. a byte zero extraction
would be allowed to yield a byte result.  The current implementation
of GCC manifests 68000-ist thinking, where bitfields are extracted
into a register, and automatically sign/zero extended to fill the
register.  See comments in ns32k.md around the "extzv" insn for more
details.


It should be noted that while the NS32000 family was designed to
provide odd-aligned addressing capability for multi-byte data (also
provided by the 68020, but not by the 68000 or 68010), many machines
do not opt to take advantage of this.  For example, on the Sequent,
although there is no advantage to long-word aligning word data, shorts
must be int-aligned in structs.  This is an example of another
machine-specific machine dependency.


Because the ns32032 has a coherent byte-order/bit-order
architecture, many instructions which would be different for
68000-style machines fold into the same instruction for the 32032.
The classic case is push effective address, where it does not matter
whether one is pushing a long, word, or byte address.  They all will
push the same address.


The macro FUNCTION_VALUE_REGNO_P is probably not sufficient; what is
needed is FUNCTION_VALUE_P, which also takes a MODE parameter.  In
this way it will be possible to determine more exactly whether a
register is really a function value register, or just one that happens
to look right.

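
A stand-alone C sketch of the difference between the two macros might
look like the following.  Everything here is illustrative: the mode
enum, the register numbers 0 and 8, and both macro bodies are
assumptions of this note and stand in for the real GCC definitions,
which work on machine modes and are not reproduced here.

```c
/* Simplified stand-in for a machine mode.  */
enum value_mode { MODE_INT, MODE_FLOAT };

/* Assumed register numbers, for illustration only.  */
#define INT_VALUE_REGNO   0
#define FLOAT_VALUE_REGNO 8

/* Register-number-only test: true for any register that could hold
   some function value, whatever its mode.  */
#define FUNCTION_VALUE_REGNO_P(N) \
  ((N) == INT_VALUE_REGNO || (N) == FLOAT_VALUE_REGNO)

/* Proposed test: true only if register N holds a function value of
   the given MODE.  */
#define FUNCTION_VALUE_P(N, MODE)                         \
  ((MODE) == MODE_FLOAT ? (N) == FLOAT_VALUE_REGNO        \
                        : (N) == INT_VALUE_REGNO)
```

Under these assumptions register 8 passes FUNCTION_VALUE_REGNO_P, but
FUNCTION_VALUE_P (8, MODE_INT) is false.  That is the case of a
register that merely happens to look right.
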