source: trunk/third/gcc/cp/gxxint.texi @ 8834

Revision 8834, 54.5 KB checked in by ghudson, 28 years ago (diff)
This commit was generated by cvs2svn to compensate for changes in r8833, which included commits to RCS files with non-trunk default branches.
1\input texinfo  @c -*-texinfo-*-
2@c %**start of header
3@setfilename g++int.info
4@settitle G++ internals
5@setchapternewpage odd
6@c %**end of header
7     
8@node Top, Limitations of g++, (dir), (dir)
9@chapter Internal Architecture of the Compiler
10
11This is meant to describe the C++ front-end for gcc in detail.
12Questions and comments to mrs@@cygnus.com.
13
14@menu
15* Limitations of g++::         
16* Routines::                   
17* Implementation Specifics::   
18* Glossary::                   
19* Macros::                     
20* Typical Behavior::           
21* Coding Conventions::         
22* Templates::                   
23* Access Control::             
24* Error Reporting::             
25* Parser::                     
26* Copying Objects::             
27* Exception Handling::         
28* Free Store::                 
29* Concept Index::               
30@end menu
31
32@node Limitations of g++, Routines, Top, Top
33@section Limitations of g++
34
35@itemize @bullet
36@item
Limitations on input source code: the parser handles about 240 nesting
levels with the parser stack size (@code{YYSTACKSIZE}) set to 500 (the
default), and requires around 16.4k of swap space per nesting level.  The
parser needs stack space of about 2.09 times the number of nesting
levels.
41
42@cindex pushdecl_class_level
43@item
44I suspect there are other uses of pushdecl_class_level that do not call
45set_identifier_type_value in tandem with the call to
46pushdecl_class_level.  It would seem to be an omission.
47
48@cindex access checking
49@item
50Access checking is unimplemented for nested types.
51
52@cindex @code{volatile}
53@item
54@code{volatile} is not implemented in general.
55
56@cindex pointers to members
57@item
58Pointers to members are only minimally supported, and there are places
59where the grammar doesn't even properly accept them yet.
60
61@cindex multiple inheritance
62@item
@code{this} will be wrong in virtual member functions defined in a
virtual base class, when they are overridden in a derived class and
called via a non-left-most object.
66
67An example would be:
68
69@example
70extern "C" int printf(const char*, ...);
71struct A @{ virtual void f() @{ @} @};
72struct B : virtual A @{ int b; B() : b(0) @{@} void f() @{ b++; @} @};
73struct C : B @{@};
74struct D : B @{@};
75struct E : C, D @{@};
76int main()
77@{
78  E e;
79  C& c = e; D& d = e;
80  c.f(); d.f();
81  printf ("C::b = %d, D::b = %d\n", e.C::b, e.D::b);
82  return 0;
83@}
84@end example
85
This will print out 2, 0 instead of 1, 1.
87
88@end itemize
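
The figures in the first item above can be related by simple arithmetic.
The following is only a sketch, not code from the compiler: it treats the
quoted costs (about 2.09 stack entries and about 16.4k of swap per
nesting level) as exact, and the function names are made up for
illustration.

```cpp
#include <cassert>

// Rough estimate of how many nesting levels fit in a parser stack of
// `stack_size` entries, assuming 2.09 entries per level (the figure
// quoted above, treated here as exact, which is an assumption).
int max_nesting_levels(int stack_size) {
    const double entries_per_level = 2.09;
    return static_cast<int>(stack_size / entries_per_level);
}

// Rough swap estimate in kilobytes, assuming 16.4k per nesting level.
double swap_kbytes(int nesting_levels) {
    const double kbytes_per_level = 16.4;
    return nesting_levels * kbytes_per_level;
}
```

With the default stack size of 500 this yields 239 levels, in line with
the figure of roughly 240 quoted above, and a little over 3900k of swap at
that depth.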
89
90@node Routines, Implementation Specifics, Limitations of g++, Top
91@section Routines
92
93This section describes some of the routines used in the C++ front-end.
94
@code{build_vtable} and @code{prepare_fresh_vtable} are used only within
the @file{cp-class.c} file, and only in @code{finish_struct} and
@code{modify_vtable_entries}.
98
99@code{build_vtable}, @code{prepare_fresh_vtable}, and
100@code{finish_struct} are the only routines that set @code{DECL_VPARENT}.
101
@code{finish_struct} can steal the virtual function table from parents;
this prevents @code{related_vslot} from working.  When
@code{finish_struct} steals, we know that
105
106@example
107get_binfo (DECL_FIELD_CONTEXT (CLASSTYPE_VFIELD (t)), t, 0)
108@end example
109
110@noindent
111will get the related binfo.
112
113@code{layout_basetypes} does something with the VIRTUALS.
114
Supposedly (according to Tiemann) most of the breadth-first searching
done, as in @code{get_base_distance} and in @code{get_binfo}, was not
the result of any design decision.  I have since found out that at least
one part of the compiler needs the notion of depth-first binfo searching.
I am going to try to convert the whole thing; it should just work.  The
term left-most refers to the depth-first left-most node.  It uses
@code{MAIN_VARIANT == type} as the condition to get left-most, because
the things that have @code{BINFO_OFFSET}s of zero are shared and will
have themselves as their own @code{MAIN_VARIANT}s.  The non-shared right
ones are copies of the left-most one; hence if a node is its own
@code{MAIN_VARIANT}, we know it IS a left-most one, and if it is not, it
is a non-left-most one.
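
The difference between the two search orders can be seen on a toy
lattice.  The sketch below is not the compiler's binfo code:
@code{ClassNode} and @code{leftmost_distance} are hypothetical names, and
the @code{-1} return value for ``not a base'' is this sketch's own
convention.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a class in an inheritance lattice (not the compiler's
// binfo structure): a name plus its direct bases in declaration order.
struct ClassNode {
    std::string name;
    std::vector<const ClassNode*> bases;
};

// Depth-first, left-to-right search for `target` starting at `from`.
// Returns the number of inheritance steps along the first (left-most)
// path found, or -1 if `target` is not reachable.
int leftmost_distance(const ClassNode* from, const ClassNode* target) {
    if (from == target)
        return 0;
    for (const ClassNode* base : from->bases) {
        int d = leftmost_distance(base, target);
        if (d >= 0)
            return d + 1;
    }
    return -1;
}
```

For a class @code{E} with bases @code{C} and @code{A}, where @code{C}
derives from @code{B} and @code{B} from @code{A}, this search reaches
@code{A} through @code{C} and @code{B} (three steps), whereas a
breadth-first search would find the direct base first (one step).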
127
@code{get_base_distance}'s path and distance matter in its use in:
129
130@itemize @bullet
131@item
132@code{prepare_fresh_vtable} (the code is probably wrong)
133@item
@code{init_vfields} depends upon distance, probably in a safe way.
@code{build_offset_ref} might use partial paths to do further lookups.
@code{hack_identifier} is probably not properly checking access.
137
138@item
139@code{get_first_matching_virtual} probably should check for
140@code{get_base_distance} returning -2.
141
142@item
143@code{resolve_offset_ref} should be called in a more deterministic
144manner.  Right now, it is called in some random contexts, like for
145arguments at @code{build_method_call} time, @code{default_conversion}
146time, @code{convert_arguments} time, @code{build_unary_op} time,
147@code{build_c_cast} time, @code{build_modify_expr} time,
148@code{convert_for_assignment} time, and
149@code{convert_for_initialization} time.
150
But there are still more contexts in which it needs to be called; one was
the ever-simple:
153
154@example
155if (obj.*pmi != 7)
156   @dots{}
157@end example
158
It seems that the problems were due to the fact that the @code{TREE_TYPE}
of the @code{OFFSET_REF} was not an @code{OFFSET_TYPE}, but rather the
type of the referent (like @code{INTEGER_TYPE}).  This problem was fixed
by changing @code{default_conversion} to check @code{TREE_CODE (x)},
instead of only checking @code{TREE_CODE (TREE_TYPE (x))} to see if it
was @code{OFFSET_TYPE}.
165
166@end itemize
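
For reference, the pointer-to-member test quoted in the last item above
can be written out as a complete fragment.  The type @code{Obj} and the
function name are illustrative only; the source gives just the
@code{if} condition.

```cpp
#include <cassert>

struct Obj {
    int i;
};

// Returns true when the data member selected by `pmi` differs from 7,
// mirroring the `obj.*pmi != 7` test quoted above.
bool member_differs_from_seven(const Obj& obj, int Obj::*pmi) {
    return obj.*pmi != 7;
}
```

Here @code{obj.*pmi} has the type of the referent (@code{int}), which is
the situation the fix to @code{default_conversion} had to recognize.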
167
168@node Implementation Specifics, Glossary, Routines, Top
169@section Implementation Specifics
170
171@itemize @bullet
172@item Explicit Initialization
173
174The global list @code{current_member_init_list} contains the list of
175mem-initializers specified in a constructor declaration.  For example:
176
177@example
178foo::foo() : a(1), b(2) @{@}
179@end example
180
181@noindent
182will initialize @samp{a} with 1 and @samp{b} with 2.
183@code{expand_member_init} places each initialization (a with 1) on the
184global list.  Then, when the fndecl is being processed,
185@code{emit_base_init} runs down the list, initializing them.  It used to
186be the case that g++ first ran down @code{current_member_init_list},
187then ran down the list of members initializing the ones that weren't
188explicitly initialized.  Things were rewritten to perform the
189initializations in order of declaration in the class.  So, for the above
190example, @samp{a} and @samp{b} will be initialized in the order that
191they were declared:
192
193@example
194class foo @{ public: int b; int a; foo (); @};
195@end example
196
197@noindent
198Thus, @samp{b} will be initialized with 2 first, then @samp{a} will be
199initialized with 1, regardless of how they're listed in the mem-initializer.
200
201@item Argument Matching
202
203In early 1993, the argument matching scheme in @sc{gnu} C++ changed
204significantly.  The original code was completely replaced with a new
205method that will, hopefully, be easier to understand and make fixing
206specific cases much easier.
207
208The @samp{-fansi-overloading} option is used to enable the new code; at
209some point in the future, it will become the default behavior of the
210compiler.
211
212The file @file{cp-call.c} contains all of the new work, in the functions
213@code{rank_for_overload}, @code{compute_harshness},
214@code{compute_conversion_costs}, and @code{ideal_candidate}.
215
216Instead of using obscure numerical values, the quality of an argument
217match is now represented by clear, individual codes.  The new data
218structure @code{struct harshness} (it used to be an @code{unsigned}
219number) contains:
220
221@enumerate a
222@item the @samp{code} field, to signify what was involved in matching two
223arguments;
224@item the @samp{distance} field, used in situations where inheritance
225decides which function should be called (one is ``closer'' than
226another);
227@item and the @samp{int_penalty} field, used by some codes as a tie-breaker.
228@end enumerate
229
230The @samp{code} field is a number with a given bit set for each type of
231code, OR'd together.  The new codes are:
232
233@itemize @bullet
234@item @code{EVIL_CODE}
235The argument was not a permissible match.
236
237@item @code{CONST_CODE}
238Currently, this is only used by @code{compute_conversion_costs}, to
239distinguish when a non-@code{const} member function is called from a
240@code{const} member function.
241
242@item @code{ELLIPSIS_CODE}
243A match against an ellipsis @samp{...} is considered worse than all others.
244
245@item @code{USER_CODE}
246Used for a match involving a user-defined conversion.
247
248@item @code{STD_CODE}
249A match involving a standard conversion.
250
251@item @code{PROMO_CODE}
252A match involving an integral promotion.  For these, the
253@code{int_penalty} field is used to handle the ARM's rule (XXX cite)
that a smaller @code{unsigned} type should promote to an @code{int}, not
to an @code{unsigned int}.
256
257@item @code{QUAL_CODE}
258Used to mark use of qualifiers like @code{const} and @code{volatile}.
259
260@item @code{TRIVIAL_CODE}
261Used for trivial conversions.  The @samp{int_penalty} field is used by
262@code{convert_harshness} to communicate further penalty information back
to @code{build_overload_call_real} when deciding which function should
be called.
265@end itemize
266
267The functions @code{convert_to_aggr} and @code{build_method_call} use
268@code{compute_conversion_costs} to rate each argument's suitability for
269a given candidate function (that's how we get the list of candidates for
270@code{ideal_candidate}).
271
272@end itemize
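
The @code{struct harshness} layout described under Argument Matching
above can be pictured roughly as follows.  This is a sketch only: the
text names the codes and the three fields but gives neither bit values
nor field types, so the constants, the types, and the helper functions
@code{is_viable} and @code{has_codes} are assumptions made for
illustration.

```cpp
#include <cassert>

// Hypothetical bit assignments: the text names the codes but not their
// values, so these constants are illustrative only.
enum : unsigned {
    EVIL_CODE     = 1u << 7,
    CONST_CODE    = 1u << 6,
    ELLIPSIS_CODE = 1u << 5,
    USER_CODE     = 1u << 4,
    STD_CODE      = 1u << 3,
    PROMO_CODE    = 1u << 2,
    QUAL_CODE     = 1u << 1,
    TRIVIAL_CODE  = 1u << 0
};

// Mirrors the three fields described above; the field types are assumed.
struct harshness {
    unsigned code;     // OR of the *_CODE bits involved in the match
    int distance;      // inheritance distance, when that decides the call
    int int_penalty;   // tie-breaker used by some codes
};

// A match is not permissible when the EVIL_CODE bit is set.
bool is_viable(const harshness& h) {
    return (h.code & EVIL_CODE) == 0;
}

// True when every bit in `mask` is set in the match's code.
bool has_codes(const harshness& h, unsigned mask) {
    return (h.code & mask) == mask;
}
```

A match that needed both a standard conversion and a qualifier
adjustment would then carry @code{STD_CODE | QUAL_CODE} in its
@code{code} field.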
273
274@node Glossary, Macros, Implementation Specifics, Top
275@section Glossary
276
277@table @r
278@item binfo
279The main data structure in the compiler used to represent the
280inheritance relationships between classes.  The data in the binfo can be
281accessed by the BINFO_ accessor macros.
282
283@item vtable
284@itemx virtual function table
285
286The virtual function table holds information used in virtual function
287dispatching.  In the compiler, they are usually referred to as vtables,
or vtbls.  The first index is not used in the normal way; I believe it
is probably used for the virtual destructor.
290
291@item vfield
292
293vfields can be thought of as the base information needed to build
294vtables.  For every vtable that exists for a class, there is a vfield.
295See also vtable and virtual function table pointer.  When a type is used
296as a base class to another type, the virtual function table for the
297derived class can be based upon the vtable for the base class, just
298extended to include the additional virtual methods declared in the
299derived class.  The virtual function table from a virtual base class is
300never reused in a derived class.  @code{is_normal} depends upon this.
301
302@item virtual function table pointer
303
304These are @code{FIELD_DECL}s that are pointer types that point to
305vtables.  See also vtable and vfield.
306@end table
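
As a rough picture of the vtable and vfield entries above, the following
hand-built model shows a table of function pointers, a pointer to it
stored in the object, and a derived table that copies the base table and
extends it.  It is a toy model, not the layout g++ emits: all names are
hypothetical, and leaving slot 0 empty is an assumption that only echoes
the remark that the first index is not used in the normal way.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One slot of a hand-built table: a function taking the object.
using Slot = int (*)(const void* self);

struct Object {
    const std::vector<Slot>* vptr;  // plays the role of the vtable pointer
    int value;
};

int base_get(const void* self) {
    return static_cast<const Object*>(self)->value;
}

int derived_twice(const void* self) {
    return 2 * static_cast<const Object*>(self)->value;
}

// Slot 0 is left empty, so the first real entry is index 1.
const std::vector<Slot> base_vtable = {nullptr, base_get};

// The derived table starts as a copy of the base table and is extended
// with the virtual function the derived class adds.
std::vector<Slot> make_derived_vtable() {
    std::vector<Slot> table = base_vtable;
    table.push_back(derived_twice);
    return table;
}

// Dispatch through the table using a slot index (1 is the first real one).
int call_slot(const Object& obj, std::size_t index) {
    return (*obj.vptr)[index](&obj);
}
```

An object whose @code{vptr} points at the derived table still answers
slot 1 with the inherited function, and slot 2 with the one the derived
class added.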
307
308@node Macros, Typical Behavior, Glossary, Top
309@section Macros
310
This section describes some of the macros used on trees.  The list
should be alphabetical.  Eventually all macros should be documented
here.  There are some PostScript drawings that can be used to better
understand some of the more complex data structures; contact Mike Stump
(@code{mrs@@cygnus.com}) for information about them.
316
317@table @code
318@item BINFO_BASETYPES
319A vector of additional binfos for the types inherited by this basetype.
320The binfos are fully unshared (except for virtual bases, in which
321case the binfo structure is shared).
322
323   If this basetype describes type D as inherited in C,
   and if the basetypes of D are E and F,
325   then this vector contains binfos for inheritance of E and F by C.
326
327Has values of:
328
329        TREE_VECs
330
331
332@item BINFO_INHERITANCE_CHAIN
333Temporarily used to represent specific inheritances.  It usually points
334to the binfo associated with the lesser derived type, but it can be
335reversed by reverse_path.  For example:
336
337@example
338        Z ZbY   least derived
339        |
340        Y YbX
341        |
342        X Xb    most derived
343
344TYPE_BINFO (X) == Xb
345BINFO_INHERITANCE_CHAIN (Xb) == YbX
346BINFO_INHERITANCE_CHAIN (Yb) == ZbY
347BINFO_INHERITANCE_CHAIN (Zb) == 0
348@end example
349
Not sure if the above is really true; get_base_distance has it point
towards the most derived type, the opposite of the above.
352
353Set by build_vbase_path, recursive_bounded_basetype_p,
354get_base_distance, lookup_field, lookup_fnfields, and reverse_path.
355
356What things can this be used on:
357
358        TREE_VECs that are binfos
359
360
361@item BINFO_OFFSET
362The offset where this basetype appears in its containing type.
363BINFO_OFFSET slot holds the offset (in bytes) from the base of the
364complete object to the base of the part of the object that is allocated
365on behalf of this `type'.  This is always 0 except when there is
366multiple inheritance.
367
368Used on TREE_VEC_ELTs of the binfos BINFO_BASETYPES (...) for example.
369
370
371@item BINFO_VIRTUALS
372A unique list of functions for the virtual function table.  See also
373TYPE_BINFO_VIRTUALS.
374
375What things can this be used on:
376
377        TREE_VECs that are binfos
378
379
380@item BINFO_VTABLE
381Used to find the VAR_DECL that is the virtual function table associated
382with this binfo.  See also TYPE_BINFO_VTABLE.  To get the virtual
383function table pointer, see CLASSTYPE_VFIELD.
384
385What things can this be used on:
386
387        TREE_VECs that are binfos
388
389Has values of:
390
391        VAR_DECLs that are virtual function tables
392
393
394@item BLOCK_SUPERCONTEXT
395In the outermost scope of each function, it points to the FUNCTION_DECL
396node.  It aids in better DWARF support of inline functions.
397
398
399@item CLASSTYPE_TAGS
400CLASSTYPE_TAGS is a linked (via TREE_CHAIN) list of member classes of a
401class. TREE_PURPOSE is the name, TREE_VALUE is the type (pushclass scans
402these and calls pushtag on them.)
403
404finish_struct scans these to produce TYPE_DECLs to add to the
405TYPE_FIELDS of the type.
406
It is expected that the name found in the TREE_PURPOSE slot is unique;
resolve_scope_to_name is one place that depends upon this
uniqueness.
410
411
412@item CLASSTYPE_METHOD_VEC
413The following is true after finish_struct has been called (on the
414class?) but not before.  Before finish_struct is called, things are
415different to some extent.  Contains a TREE_VEC of methods of the class.
416The TREE_VEC_LENGTH is the number of differently named methods plus one
417for the 0th entry.  The 0th entry is always allocated, and reserved for
418ctors and dtors.  If there are none, TREE_VEC_ELT(N,0) == NULL_TREE.
419Each entry of the TREE_VEC is a FUNCTION_DECL.  For each FUNCTION_DECL,
420there is a DECL_CHAIN slot.  If the FUNCTION_DECL is the last one with a
421given name, the DECL_CHAIN slot is NULL_TREE.  Otherwise it is the next
method that has the same name (but a different signature).  It might be
thought that, because the DECL_CHAIN slot is used in this way, we cannot
call pushdecl to put the method in the global scope (because that would
overwrite the TREE_CHAIN slot).  That would seem not to be true, because
they use different _CHAINs.  finish_struct_methods sets up one version of
the TREE_CHAIN slots on the FUNCTION_DECLs.
428
Friends are kept in TREE_LISTs, so that there's no need to use their
TREE_CHAIN slot for anything.
431
432Has values of:
433
434        TREE_VECs
435       
436
437@item CLASSTYPE_VFIELD
438Seems to be in the process of being renamed TYPE_VFIELD.  Use on types
439to get the main virtual function table pointer.  To get the virtual
440function table use BINFO_VTABLE (TYPE_BINFO ()).
441
442Has values of:
443
444        FIELD_DECLs that are virtual function table pointers
445
446What things can this be used on:
447
448        RECORD_TYPEs
449
450
451@item DECL_CLASS_CONTEXT
452Identifies the context that the _DECL was found in.  For virtual function
453tables, it points to the type associated with the virtual function
454table.  See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_FCONTEXT.
455
The difference between this and DECL_CONTEXT shows up for virtual
functions like:
458
459@example
460struct A
461@{
462  virtual int f ();
463@};
464
465struct B : A
466@{
467  int f ();
468@};
469
470DECL_CONTEXT (A::f) == A
471DECL_CLASS_CONTEXT (A::f) == A
472
473DECL_CONTEXT (B::f) == A
474DECL_CLASS_CONTEXT (B::f) == B
475@end example
476
477Has values of:
478
479        RECORD_TYPEs, or UNION_TYPEs
480
481What things can this be used on:
482
483        TYPE_DECLs, _DECLs
484
485
486@item DECL_CONTEXT
487Identifies the context that the _DECL was found in.  Can be used on
488virtual function tables to find the type associated with the virtual
489function table, but since they are FIELD_DECLs, DECL_FIELD_CONTEXT is a
490better access method.  Internally the same as DECL_FIELD_CONTEXT, so
don't use both.  See also DECL_FIELD_CONTEXT, DECL_FCONTEXT and
492DECL_CLASS_CONTEXT.
493
494Has values of:
495
496        RECORD_TYPEs
497
498
499What things can this be used on:
500
501@display
502VAR_DECLs that are virtual function tables
503_DECLs
504@end display
505
506
507@item DECL_FIELD_CONTEXT
508Identifies the context that the FIELD_DECL was found in.  Internally the
same as DECL_CONTEXT, so don't use both.  See also DECL_CONTEXT,
510DECL_FCONTEXT and DECL_CLASS_CONTEXT.
511
512Has values of:
513
514        RECORD_TYPEs
515
516What things can this be used on:
517
518@display
519FIELD_DECLs that are virtual function pointers
520FIELD_DECLs
521@end display
522
523
524@item DECL_NESTED_TYPENAME
Holds the fully qualified type name, for example Base::Derived.
526
527Has values of:
528
529        IDENTIFIER_NODEs
530
531What things can this be used on:
532
533        TYPE_DECLs
534
535
536@item DECL_NAME
537
538Has values of:
539
540@display
5410 for things that don't have names
542IDENTIFIER_NODEs for TYPE_DECLs
543@end display
544
545@item DECL_IGNORED_P
546A bit that can be set to inform the debug information output routines in
547the back-end that a certain _DECL node should be totally ignored.
548
549Used in cases where it is known that the debugging information will be
550output in another file, or where a sub-type is known not to be needed
551because the enclosing type is not needed.
552
A compiler-constructed virtual destructor in a derived class that does
not define an explicit destructor, where one was defined explicitly in a
base class, has this bit set as well.  Also used on __FUNCTION__ and
__PRETTY_FUNCTION__ to mark that they are ``compiler generated.''
c-decl.c and c-lex.c both want DECL_IGNORED_P set for ``internally
generated vars,'' and ``user-invisible variable.''

Functions built by the C++ front-end such as default destructors,
virtual destructors and default constructors want to be marked as
compiler generated, though it is unclear why.
563
564Currently, it is used in an absolute way in the C++ front-end, as an
565optimization, to tell the debug information output routines to not
566generate debugging information that will be output by another separately
567compiled file.
568
569
570@item DECL_VIRTUAL_P
571A flag used on FIELD_DECLs and VAR_DECLs.  (Documentation in tree.h is
572wrong.)  Used in VAR_DECLs to indicate that the variable is a vtable.
573It is also used in FIELD_DECLs for vtable pointers.
574
575What things can this be used on:
576
577        FIELD_DECLs and VAR_DECLs
578
579
580@item DECL_VPARENT
581Used to point to the parent type of the vtable if there is one, else it
582is just the type associated with the vtable.  Because of the sharing of
583virtual function tables that goes on, this slot is not very useful, and
584is in fact, not used in the compiler at all.  It can be removed.
585
586What things can this be used on:
587
588        VAR_DECLs that are virtual function tables
589
590Has values of:
591
592        RECORD_TYPEs maybe UNION_TYPEs
593
594
595@item DECL_FCONTEXT
596Used to find the first baseclass in which this FIELD_DECL is defined.
597See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_CLASS_CONTEXT.
598
599How it is used:
600
601        Used when writing out debugging information about vfield and
602        vbase decls.
603
604What things can this be used on:
605
606        FIELD_DECLs that are virtual function pointers
607        FIELD_DECLs
608
609
610@item DECL_REFERENCE_SLOT
Used to hold the initializer for the reference.
612
613What things can this be used on:
614
615        PARM_DECLs and VAR_DECLs that have a reference type
616
617
618@item DECL_VINDEX
619Used for FUNCTION_DECLs in two different ways.  Before the structure
620containing the FUNCTION_DECL is laid out, DECL_VINDEX may point to a
621FUNCTION_DECL in a base class which is the FUNCTION_DECL which this
622FUNCTION_DECL will replace as a virtual function.  When the class is
623laid out, this pointer is changed to an INTEGER_CST node which is
624suitable to find an index into the virtual function table.  See
625get_vtable_entry as to how one can find the right index into the virtual
function table.  The first index, 0, of a virtual function table is not
used in the normal way, so the first real index is 1.

DECL_VINDEX may be a TREE_LIST, which would seem to be a list of
overridden FUNCTION_DECLs.  add_virtual_function has code to deal with
this when it uses the variable base_fndecl_list, but it would seem that
somehow it is possible for the TREE_LIST to persist until method_call,
and it should not.
634
635
636What things can this be used on:
637
638        FUNCTION_DECLs
639
640
641@item DECL_SOURCE_FILE
642Identifies what source file a particular declaration was found in.
643
644Has values of:
645
646        "<built-in>" on TYPE_DECLs to mean the typedef is built in
647
648
649@item DECL_SOURCE_LINE
650Identifies what source line number in the source file the declaration
651was found at.
652
653Has values of:
654
655@display
6560 for an undefined label
657
6580 for TYPE_DECLs that are internally generated
659
6600 for FUNCTION_DECLs for functions generated by the compiler
661        (not yet, but should be)
662
6630 for ``magic'' arguments to functions, that the user has no
664        control over
665@end display
666
667
668@item TREE_USED
669
670Has values of:
671
672        0 for unused labels
673
674
675@item TREE_ADDRESSABLE
676A flag that is set for any type that has a constructor.
677
678
679@item TREE_COMPLEXITY
This seems to be a kludgy way to track recursion, popping, and pushing.
It only appears in cp-decl.c and cp-decl2.c, so it is a good candidate
for proper fixing, and removal.
683
684
685@item TREE_PRIVATE
686Set for FIELD_DECLs by finish_struct.  But not uniformly set.
687
688The following routines do something with PRIVATE access:
689build_method_call, alter_access, finish_struct_methods,
690finish_struct, convert_to_aggr, CWriteLanguageDecl, CWriteLanguageType,
691CWriteUseObject, compute_access, lookup_field, dfs_pushdecl,
692GNU_xref_member, dbxout_type_fields, dbxout_type_method_1
693
694
695@item TREE_PROTECTED
696The following routines do something with PROTECTED access:
697build_method_call, alter_access, finish_struct, convert_to_aggr,
698CWriteLanguageDecl, CWriteLanguageType, CWriteUseObject,
699compute_access, lookup_field, GNU_xref_member, dbxout_type_fields,
700dbxout_type_method_1
701
702
703@item TYPE_BINFO
704Used to get the binfo for the type.
705
706Has values of:
707
708        TREE_VECs that are binfos
709
710What things can this be used on:
711
712        RECORD_TYPEs
713
714
715@item TYPE_BINFO_BASETYPES
716See also BINFO_BASETYPES.
717
718@item TYPE_BINFO_VIRTUALS
719A unique list of functions for the virtual function table.  See also
720BINFO_VIRTUALS.
721
722What things can this be used on:
723
724        RECORD_TYPEs
725
726
727@item TYPE_BINFO_VTABLE
728Points to the virtual function table associated with the given type.
729See also BINFO_VTABLE.
730
731What things can this be used on:
732
733        RECORD_TYPEs
734
735Has values of:
736
737        VAR_DECLs that are virtual function tables
738
739
740@item TYPE_NAME
741Names the type.
742
743Has values of:
744
745@display
7460 for things that don't have names.
747should be IDENTIFIER_NODE for RECORD_TYPEs UNION_TYPEs and
748        ENUM_TYPEs.
749TYPE_DECL for RECORD_TYPEs, UNION_TYPEs and ENUM_TYPEs, but
750        shouldn't be.
751TYPE_DECL for typedefs, unsure why.
752@end display
753
754What things can one use this on:
755
756@display
757TYPE_DECLs
758RECORD_TYPEs
759UNION_TYPEs
760ENUM_TYPEs
761@end display
762
763History:
764
765        It currently points to the TYPE_DECL for RECORD_TYPEs,
766        UNION_TYPEs and ENUM_TYPEs, but it should be history soon.
767
768
769@item TYPE_METHODS
770Synonym for @code{CLASSTYPE_METHOD_VEC}.  Chained together with
771@code{TREE_CHAIN}.  @file{dbxout.c} uses this to get at the methods of a
772class.
773
774
775@item TYPE_DECL
Used to represent typedefs, and used to represent binding layers.
777
778Components:
779
780        DECL_NAME is the name of the typedef.  For example, foo would
781        be found in the DECL_NAME slot when @code{typedef int foo;} is
782        seen.
783
784        DECL_SOURCE_LINE identifies what source line number in the
785        source file the declaration was found at.  A value of 0
786        indicates that this TYPE_DECL is just an internal binding layer
787        marker, and does not correspond to a user supplied typedef.
788
789        DECL_SOURCE_FILE
790
791@item TYPE_FIELDS
792A linked list (via @code{TREE_CHAIN}) of member types of a class.  The
793list can contain @code{TYPE_DECL}s, but there can also be other things
794in the list apparently.  See also @code{CLASSTYPE_TAGS}.
795
796
797@item TYPE_VIRTUAL_P
798A flag used on a @code{FIELD_DECL} or a @code{VAR_DECL}, indicates it is
799a virtual function table or a pointer to one.  When used on a
800@code{FUNCTION_DECL}, indicates that it is a virtual function.  When
801used on an @code{IDENTIFIER_NODE}, indicates that a function with this
802same name exists and has been declared virtual.
803
804When used on types, it indicates that the type has virtual functions, or
805is derived from one that does.
806
807Not sure if the above about virtual function tables is still true.  See
808also info on @code{DECL_VIRTUAL_P}.
809
810What things can this be used on:
811
812        FIELD_DECLs, VAR_DECLs, FUNCTION_DECLs, IDENTIFIER_NODEs
813
814
815@item VF_BASETYPE_VALUE
816Get the associated type from the binfo that caused the given vfield to
817exist.  This is the least derived class (the most parent class) that
818needed a virtual function table.  It is probably the case that all uses
819of this field are misguided, but they need to be examined on a
820case-by-case basis.  See history for more information on why the
821previous statement was made.
822
823Set at @code{finish_base_struct} time.
824
825What things can this be used on:
826
827        TREE_LISTs that are vfields
828
829History:
830
831        This field was used to determine if a virtual function table's
832        slot should be filled in with a certain virtual function, by
833        checking to see if the type returned by VF_BASETYPE_VALUE was a
834        parent of the context in which the old virtual function existed.
835        This incorrectly assumes that a given type _could_ not appear as
836        a parent twice in a given inheritance lattice.  For single
837        inheritance, this would in fact work, because a type could not
838        possibly appear more than once in an inheritance lattice, but
839        with multiple inheritance, a type can appear more than once.
840
841
842@item VF_BINFO_VALUE
843Identifies the binfo that caused this vfield to exist.  If this vfield
844is from the first direct base class that has a virtual function table,
845then VF_BINFO_VALUE is NULL_TREE, otherwise it will be the binfo of the
846direct base where the vfield came from.  Can use @code{TREE_VIA_VIRTUAL}
847on result to find out if it is a virtual base class.  Related to the
848binfo found by
849
850@example
851get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
852@end example
853
854@noindent
855where @samp{t} is the type that has the given vfield.
856
857@example
858get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
859@end example
860
861@noindent
will return the binfo for the given vfield.
863
864May or may not be set at @code{modify_vtable_entries} time.  Set at
865@code{finish_base_struct} time.
866
867What things can this be used on:
868
869        TREE_LISTs that are vfields
870
871
872@item VF_DERIVED_VALUE
Identifies the type of the most derived class of the vfield, excluding
the class this vfield is for.
875
876Set at @code{finish_base_struct} time.
877
878What things can this be used on:
879
880        TREE_LISTs that are vfields
881
882
883@item VF_NORMAL_VALUE
884Identifies the type of the most derived class of the vfield, including
885the class this vfield is for.
886
887Set at @code{finish_base_struct} time.
888
889What things can this be used on:
890
891        TREE_LISTs that are vfields
892
893
894@item WRITABLE_VTABLES
This is an option that can be defined when building the compiler, that
will cause the compiler to output vtables into the data segment so that
the vtables may be written.  This is undefined by default, because
normally the vtables should be unwritable.  People who implement object
I/O facilities, or who want to change the dynamic type of objects, may
want to have the vtables writable.  Another way of achieving this would
be to make a copy of the vtable into writable memory, but the drawback
there is that that method only changes the type for one object.
903
904@end table
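
The remark under @code{BINFO_OFFSET} above, that the offset is always 0
except when there is multiple inheritance, can be observed from ordinary
C++.  The sketch below measures where each base subobject sits inside a
complete object; the types and the helper @code{base_offset} are
hypothetical, and the exact numbers depend on the target ABI.

```cpp
#include <cassert>
#include <cstddef>

struct Left  { int l; };
struct Right { int r; };
struct Both : Left, Right { int b; };

// Byte distance from the start of the complete object to the given base
// subobject, measured on a live object.
template <class Base>
std::ptrdiff_t base_offset(const Both& obj) {
    const char* whole = reinterpret_cast<const char*>(&obj);
    const char* part =
        reinterpret_cast<const char*>(static_cast<const Base*>(&obj));
    return part - whole;
}
```

On common implementations the left-most base reports 0 and the other
base a non-zero value; all that is certain is that the two non-empty
bases cannot share one offset.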
905
906@node Typical Behavior, Coding Conventions, Macros, Top
907@section Typical Behavior
908
909@cindex parse errors
910
911Whenever seemingly normal code fails with errors like
912@code{syntax error at `\@{'}, it's highly likely that grokdeclarator is
913returning a NULL_TREE for whatever reason.
914
915@node Coding Conventions, Templates, Typical Behavior, Top
916@section Coding Conventions
917
It should never be the case that trees are modified in-place by the
919back-end, @emph{unless} it is guaranteed that the semantics are the same
920no matter how shared the tree structure is.  @file{fold-const.c} still
921has some cases where this is not true, but rms hypothesizes that this
922will never be a problem.
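
The hazard behind this convention can be shown with a toy node type.
This is a sketch, not the compiler's tree representation: @code{Node}
and @code{with_value} are hypothetical, and copying before modifying is
just one way to keep a shared subtree's meaning intact.

```cpp
#include <cassert>
#include <memory>

// Toy expression node; not the compiler's tree representation.
struct Node {
    int value;
    std::shared_ptr<Node> operand;  // may be shared by several parents
};

// Returns a new node with the replaced value and leaves `n` untouched, so
// any other parent that shares `n` still sees the old semantics.
std::shared_ptr<Node> with_value(const std::shared_ptr<Node>& n, int v) {
    auto copy = std::make_shared<Node>(*n);
    copy->value = v;
    return copy;
}
```

Had the shared node been overwritten in place instead, every parent
pointing at it would have changed meaning at once.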
923
924@node Templates, Access Control, Coding Conventions, Top
@section Templates

A template is represented by a @code{TEMPLATE_DECL}.  The specific
fields used are:

@table @code
@item DECL_TEMPLATE_RESULT
The generic decl on which instantiations are based.  This looks just
like any other decl.

@item DECL_TEMPLATE_PARMS
The parameters to this template.
@end table

The generic decl is parsed as much like any other decl as possible,
given the parameterization.  The template decl is not built up until the
generic decl has been completed.  For template classes, a template decl
is generated for each member function and static data member, as well.

Template members of template classes are represented by a
@code{TEMPLATE_DECL} for the class's parameters around another
@code{TEMPLATE_DECL} for the member's parameters.

All declarations that are instantiations or specializations of templates
refer to their template and parameters through DECL_TEMPLATE_INFO.

How should I handle parsing member functions with the proper param
decls?  Set them up again or try to use the same ones?  Currently we do
the former.  We can probably do this without any extra machinery in
store_pending_inline, by deducing the parameters from the decl in
do_pending_inlines.  PRE_PARSED_TEMPLATE_DECL?

If a base is a parm, we can't check anything about it.  If a base is not
a parm, we need to check it for name binding.  Do finish_base_struct if
no bases are parameterized (only if none, including indirect, are
parms).  Nah, don't bother trying to do any of this until instantiation
-- we only need to do name binding in advance.

Always set up method vec and fields, inc. synthesized methods.  Really?
We can't know the types of the copy folks, or whether we need a
destructor, or can have a default ctor, until we know our bases and
fields.  Otherwise, we can assume and fix ourselves later.  Hopefully.

@node Access Control, Error Reporting, Templates, Top
@section Access Control
The function compute_access returns one of three values:

@table @code
@item access_public
means that the field can be accessed by the current lexical scope.

@item access_protected
means that the field cannot be accessed by the current lexical scope
because it is protected.

@item access_private
means that the field cannot be accessed by the current lexical scope
because it is private.
@end table

DECL_ACCESS is used for access declarations; alter_access creates a list
of types and accesses for a given decl.

Formerly, DECL_@{PUBLIC,PROTECTED,PRIVATE@} corresponded to the return
codes of compute_access and were used as a cache for compute_access.
Now they are not used at all.

TREE_PROTECTED and TREE_PRIVATE are used to record the access levels
granted by the containing class.  BEWARE: TREE_PUBLIC means something
completely unrelated to access control!

@node Error Reporting, Parser, Access Control, Top
@section Error Reporting

The C++ front-end uses a call-back mechanism to allow functions to print
out reasonable strings for types and functions without putting extra
logic in the functions where errors are found.  The interface is through
the @code{cp_error} function (or @code{cp_warning}, etc.).  The
syntax is exactly like that of @code{error}, except that a few more
conversions are supported:

@itemize @bullet
@item
%C indicates a value of `enum tree_code'.
@item
%D indicates a *_DECL node.
@item
%E indicates a *_EXPR node.
@item
%L indicates a value of `enum languages'.
@item
%P indicates the name of a parameter (i.e. "this", "1", "2", ...)
@item
%T indicates a *_TYPE node.
@item
%O indicates the name of an operator (MODIFY_EXPR -> "operator =").

@end itemize

There is some overlap between these; for instance, any of the node
options can be used for printing an identifier (though only @code{%D}
tries to decipher function names).

For a more verbose message (@code{class foo} as opposed to just @code{foo},
including the return type for functions), use @code{%#c}.
To have the line number on the error message indicate the line of the
DECL, use @code{cp_error_at} and its ilk; to indicate which argument you want,
use @code{%+D}, or it will default to the first.

@node Parser, Copying Objects, Error Reporting, Top
@section Parser

Some comments on the parser:

The @code{after_type_declarator} / @code{notype_declarator} hack is
necessary in order to allow redeclarations of @code{TYPENAME}s, for
instance

@example
typedef int foo;
class A @{
  char *foo;
@};
@end example

In the above, the first @code{foo} is parsed as a @code{notype_declarator},
and the second as an @code{after_type_declarator}.

Ambiguities:

There are currently four reduce/reduce ambiguities in the parser.  They are:

1) Between @code{template_parm} and
@code{named_class_head_sans_basetype}, for the tokens @code{aggr
identifier}.  This situation occurs in code looking like

@example
template <class T> class A @{ @};
@end example

It is ambiguous whether @code{class T} should be parsed as the
declaration of a template type parameter named @code{T} or an unnamed
constant parameter of type @code{class T}.  Section 14.6, paragraph 3 of
the January '94 working paper states that the first interpretation is
the correct one.  This ambiguity results in two reduce/reduce conflicts.

2) Between @code{primary} and @code{type_id} for code like @samp{int()}
in places where both can be accepted, such as the argument to
@code{sizeof}.  Section 8.1 of the pre-San Diego working paper specifies
that these ambiguous constructs will be interpreted as @code{typename}s.
This ambiguity results in six reduce/reduce conflicts between
@samp{absdcl} and @samp{functional_cast}.

3) Between @code{functional_cast} and
@code{complex_direct_notype_declarator}, for various token strings.
This situation occurs in code looking like

@example
int (*a);
@end example

This code is ambiguous; it could be a declaration of the variable
@samp{a} as a pointer to @samp{int}, or it could be a functional cast of
@samp{*a} to @samp{int}.  Section 6.8 specifies that the former
interpretation is correct.  This ambiguity results in 7 reduce/reduce
conflicts.  Another aspect of this ambiguity is code like
@samp{int (x[2]);}, which is resolved at the @samp{[} and accounts for 6
reduce/reduce conflicts between @samp{direct_notype_declarator} and
@samp{primary}/@samp{overqualified_id}.  Finally, there are 4 r/r
conflicts between @samp{expr_or_declarator} and @samp{primary} over code
like @samp{int (a);}, which could probably be resolved but would also
probably be more trouble than it's worth.  In all, this situation
accounts for 17 conflicts.  Ack!

The second case above is responsible for the failure to parse
@samp{LinppFile ppfile (String (argv[1]), &outs, argc, argv);} (from
Rogue Wave Math.h++) as an object declaration, and must be fixed so that
it does not resolve until later.

4) Indirectly between @code{after_type_declarator} and @code{parm}, for
type names.  This occurs in (as one example) code like

@example
typedef int foo, bar;
class A @{
  foo (bar);
@};
@end example

What is @code{bar} inside the class definition?  We currently interpret
it as a @code{parm}, as does Cfront, but IBM xlC interprets it as an
@code{after_type_declarator}.  I believe that xlC is correct, in light
of 7.1p2, which says "The longest sequence of @i{decl-specifiers} that
could possibly be a type name is taken as the @i{decl-specifier-seq} of
a @i{declaration}."  However, it seems clear that this rule must be
violated in the case of constructors.  This ambiguity accounts for 8
conflicts.

Unlike the others, this ambiguity is not recognized by the Working Paper.

@node  Copying Objects, Exception Handling, Parser, Top
@section Copying Objects

The generated copy assignment operator in g++ does not currently do the
right thing for multiple inheritance involving virtual bases; it just
calls the copy assignment operators for its direct bases.  What it
should probably do is:

1) Split up the copy assignment operator for all classes that have
vbases into "copy my vbases" and "copy everything else" parts.  Or do
the trickiness that the constructors do to ensure that vbases don't get
initialized by intermediate bases.

2) Wander through the class lattice, find all vbases for which no
intermediate base has a user-defined copy assignment operator, and call
their "copy everything else" routines.  If not all of my vbases satisfy
this criterion, warn, because this may be surprising behavior.

3) Call the "copy everything else" routine for my direct bases.

If we only have one direct base, we can just foist everything off onto
it.

This issue is currently under discussion in the core reflector
(2/28/94).

@node  Exception Handling, Free Store, Copying Objects, Top
@section Exception Handling

Note that exception handling in g++ is still under development.

This section describes the mapping of C++ exceptions in the C++
front-end into the back-end exception handling framework.

The basic mechanism of exception handling in the back-end is
unwind-protect a la elisp.  This is a general, robust, and language
independent representation for exceptions.

The C++ front-end exceptions are mapped into the unwind-protect
semantics by the C++ front-end.  The mapping is described below.

When -frtti is used, rtti is used to do exception object type checking;
when it isn't used, the encoded name for the type of the object being
thrown is used instead.  All code that originates exceptions, even code
that throws exceptions as a side effect, like dynamic casting, and all
code that catches exceptions must be compiled consistently: either all
with -frtti, or all with -fno-rtti.  It is not possible to mix
rtti-based exception handling objects with code that doesn't use rtti.
The exceptions to this are code that doesn't catch or throw exceptions,
catch (...), and code that just rethrows an exception.

Currently we use the normal mangling used in building function names
(int's are "i", const char * is PCc) to build the non-rtti-based type
descriptors for exception handling.  These descriptors are just plain
NULL terminated strings, and internally they are passed around as char
*.

In C++, all cleanups should be protected by exception regions.  The
region starts just after the reason why the cleanup is created has
ended.  For example, with an automatic variable that has a constructor,
it would be right after the constructor is run.  The region ends just
before the finalization is expanded.  Since the backend may expand the
cleanup multiple times along different paths (once for normal end of the
region, once for non-local gotos, once for returns, etc.), the backend
must take special care to protect the finalization expansion if the
expansion is for any other reason than normal region end and it is
`inline' (it is inside the exception region).  The backend can either
choose to move them out of line, or it can create an exception region
over the finalization to protect it; in the handler associated with
it, it would not run the finalization as it otherwise would have, but
rather just rethrow to the outer handler, careful to skip the normal
handler for the original region.

In Ada, they will use the more runtime intensive approach of having
fewer regions, but at the cost of additional work at run time, to keep a
list of things that need cleanups.  When a variable has finished
construction, they add the cleanup to the list; when they come to the
end of the lifetime of the variable, they run the list down.  If they
take a hit before the section finishes normally, they examine the list
for actions to perform.  I hope they add this logic into the back-end,
as it would be nice to get that alternative approach in C++.

On an rs6000, xlC stores exception objects on the stack, under the try
block.  When it unwinds down into a handler, the frame pointer is
adjusted back to the normal value for the frame in which the handler
resides, and the stack pointer is left unchanged from the time at which
the object was thrown.  This is so that there is always someplace for
the exception object, and nothing can overwrite it once we start
throwing.  The only bad part is that the stack remains large.

The following points out some things that work in g++'s exception
handling.

All completely constructed temps and local variables are cleaned up in
all unwound scopes.  Completely constructed parts of partially
constructed objects are cleaned up.  This includes partially built
arrays.  Exception specifications are now handled.

The following points out some flaws in g++'s exception handling, as it
now stands.

Only exact type matching or reference matching of throw types works when
-fno-rtti is used.  Exception handling only works on SPARC (like Suns),
i386, arm and rs6000 machines.  Partial support is in for all other
machines, but a stack unwinder called __unwind_function has to be
written, and added to libgcc2 for them.  See below for details on
__unwind_function.  Don't expect exception handling to work right if you
optimize; in fact, the compiler will probably core dump.  RTL_EXPRs for
EH cond variables for && and || exprs should probably be wrapped in
UNSAVE_EXPRs, and RTL_EXPRs tweaked so that they can be unsaved, and the
UNSAVE_EXPR code should be in the backend, or alternatively, UNSAVE_EXPR
should be ripped out and exactly one finalization allowed to be expanded
by the backend.  I talked with kenner about this, and we have to allow
multiple expansions.

We do pointer conversions on exception matching, a la 15.3 p2 case 3,
only when -frtti is given: `A handler with type T, const T, T&, or
const T& is a match for a throw-expression with an object of type E if
[3]T is a pointer type and E is a pointer type that can be converted to
T by a standard pointer conversion (_conv.ptr_) not involving
conversions to pointers to private or protected base classes.'

We don't call operator delete for new expressions that die because the
ctor threw an exception.  See except/18 for a test case.

15.2 para 13: The exception being handled should be rethrown if control
reaches the end of a handler of the function-try-block of a constructor
or destructor; right now, it is not.

15.2 para 12: If a return statement appears in a handler of a
function-try-block of a constructor, the program is ill-formed, but this
isn't diagnosed.

15.2 para 11: If the handlers of a function-try-block contain a jump
into the body of a constructor or destructor, the program is ill-formed,
but this isn't diagnosed.

15.2 para 9: Check that the fully constructed base classes and members
of an object are destroyed before entering the handler of a
function-try-block of a constructor or destructor for that object.

build_exception_variant should sort the incoming list, so that it
implements set compares, not exact list equality.  Type smashing should
smash exception specifications using set union.

Thrown objects are usually allocated on the heap, in the usual way, but
they are never deleted.  They should be deleted by the catch clauses.
If one runs out of heap space, throwing an object will probably never
work.  This could be relaxed some by passing an __in_chrg parameter to
track who has control over the exception object.  Thrown objects are not
allocated on the heap when they are pointer to object types.

When the backend returns a value, it can create new exception regions
that need protecting.  The new region should rethrow the object in
context of the last associated cleanup that ran to completion.

The structure of the code that is generated for C++ exception handling
code is shown below:

@example
Ln:                                     throw value;
        copy value onto heap
        jump throw (Ln, id, address of copy of value on heap)

                                        try @{
+Lstart:        the start of the main EH region
|...                                            ...
+Lend:          the end of the main EH region
                                        @} catch (T o) @{
                                                ...1
                                        @}
Lresume:
        nop     used to make sure there is something before
                the next region ends, if there is one
...                                     ...

        jump Ldone
[
Lmainhandler:    handler for the region Lstart-Lend
        cleanup
] zero or more, depending upon automatic vars with dtors
+Lpartial:
|        jump Lover
+Lhere:
        rethrow (Lhere, same id, same obj);
Lterm:          handler for the region Lpartial-Lhere
        call terminate
Lover:
[
 [
        call throw_type_match
        if (eq) @{
 ] these lines disappear when there is no catch condition
+Lsregion2:
|       ...1
|       jump Lresume
|Lhandler:      handler for the region Lsregion2-Leregion2
|       rethrow (Lresume, same id, same obj);
+Leregion2
        @}
] there are zero or more of these sections, depending upon how many
  catch clauses there are
----------------------------- expand_end_all_catch --------------------------
                here we have fallen off the end of all catch
                clauses, so we rethrow to outer
        rethrow (Lresume, same id, same obj);
----------------------------- expand_end_all_catch --------------------------
[
L1:     maybe throw routine
] depending upon if we have expanded it or not
Ldone:
        ret

start_all_catch emits labels: Lresume,

@end example

The __unwind_function takes a pointer to the throw handler, and is
expected to pop the stack frame that was built to call it, as well as
the frame underneath, and then jump to the throw handler.  It must
restore all registers to their proper values as well as all other
machine state as determined by the context into which we are
unwinding.  The way I normally start is to compile:

@example
void *g;
void foo(void* a) @{ g = a; @}
@end example

@noindent
with -S, and change the thing that alters the PC (return, or ret
usually) to not alter the PC, making sure to leave all other semantics
(like adjusting the stack pointer, or frame pointers) in.  After that,
replicate the prologue once more at the end, again changing the
PC-altering instructions, and finally, at the very end, jump to `g'.

It takes about a week to write this routine.  If someone wants to
volunteer to write this routine for any architecture, exception support
for that architecture will be added to g++.  Please send in those code
donations.  One other thing that needs to be done is to double check
that __builtin_return_address (0) works.

@subsection Specific Targets

For the alpha, the __unwind_function will be something resembling:

@example
void
__unwind_function(void *ptr)
@{
  /* First frame */
  asm ("ldq $15, 8($30)"); /* get the saved frame ptr; 15 is fp, 30 is sp */
  asm ("bis $15, $15, $30"); /* reload sp with the fp we found */

  /* Second frame */
  asm ("ldq $15, 8($30)"); /* fp */
  asm ("bis $15, $15, $30"); /* reload sp with the fp we found */

  /* Return */
  asm ("ret $31, ($16), 1"); /* return to PTR, stored in a0 */
@}
@end example

@noindent
However, there are a few problems preventing it from working.  First of
all, the gcc-internal function @code{__builtin_return_address} needs to
work given an argument of 0 for the alpha.  As it stands as of August
30th, 1995, the code for @code{BUILT_IN_RETURN_ADDRESS} in @file{expr.c}
will definitely not work on the alpha.  Instead, we need to define
the macros @code{DYNAMIC_CHAIN_ADDRESS} (maybe),
@code{RETURN_ADDR_IN_PREVIOUS_FRAME}, and definitely need a new
definition for @code{RETURN_ADDR_RTX}.

In addition (and more importantly), we need a way to reliably find the
frame pointer on the alpha.  The use of the value 8 above to restore the
frame pointer (register 15) is incorrect.  On many systems, the frame
pointer is consistently offset to a specific point on the stack.  On the
alpha, however, the frame pointer is pushed last.  First the return
address is stored, then any other registers are saved (e.g., @code{s0}),
and finally the frame pointer is put in place.  So @code{fp} could have
an offset of 8, but if the calling function saved any registers at all,
they add to the offset.

The only places the frame size is noted are with the @samp{.frame}
directive, for use by the debugger and the OSF exception handling model
(useless to us), and in the initial computation of the new value for
@code{sp}, the stack pointer.  For example, the function may start with:

@example
lda $30,-32($30)
.frame $15,32,$26,0
@end example

@noindent
The 32 above is exactly the value we need.  With this, we can be sure
that the frame pointer is stored 8 bytes below that, in this case at
24(sp).  The drawback is that there is no way that I (Brendan) have
found to let us discover the size of a previous frame @emph{inside} the
definition of @code{__unwind_function}.

So to accomplish exception handling support on the alpha, we need two
things: first, a way to figure out where the frame pointer was stored,
and second, a functional @code{__builtin_return_address} implementation
for except.c to be able to use it.

@subsection Backend Exception Support

The backend must be extended to fully support exceptions.  Right now
there are a few hooks from the backend into the alpha exception handling
backend that resides in the C++ frontend, which allow exception
handling to work in g++.  An exception region is a segment of generated
code that has a handler associated with it.  The exception regions are
represented in the generated code as address ranges, each given by a
starting PC value and an ending PC value.  Some of the limitations
with this scheme are:

@itemize @bullet
@item
The backend replicates insns for such things as loop unrolling and
function inlining.  Right now, there are no hooks into the frontend's
exception handling backend to handle the replication of insns.  When
replication happens, a new exception region descriptor needs to be
generated for the new region.

@item
The backend expects to be able to rearrange code, for things like jump
optimization.  Any rearranging of the code needs to have exception
region descriptors updated appropriately.

@item
The backend can eliminate dead code.  Any associated exception region
descriptor that refers to fully contained code that has been eliminated
should also be removed, although not doing this is harmless in terms of
semantics.

@end itemize

The above is not meant to be exhaustive, but does include all things I
have thought of so far.  I am sure other limitations exist.

Below are some notes on the migration of the exception handling code
backend from the C++ frontend to the backend.

NOTEs are to be used to denote the start of an exception region, and the
end of the region.  I presume that the interface used to generate these
notes in the backend would be two functions, start_exception_region and
end_exception_region (or something like that).  The frontends are
required to call them in pairs.  When marking the end of a region, an
argument can be passed to indicate the handler for the marked region.
This can be passed in many ways; currently a tree is used.  Another
possibility would be insns for the handler, or a label that denotes a
handler.  I have a feeling insns might be the best way to pass it.
The semantics are: if an exception is thrown inside the region, control
is transferred unconditionally to the handler.  If control passes
through the handler, then the backend is to rethrow the exception, in
the context of the end of the original region.  The handler is protected
by the conventional mechanisms; it is the frontend's responsibility to
protect the handler, if special semantics are required.

This is a very low level view, and it would be nice if the backend
supported a somewhat higher level view in addition to this view.  This
higher level could include source line number, name of the source file,
name of the language that threw the exception and possibly the name of
the exception.  Kenner may want to rope you into doing more than just
the basics required by C++.  You will have to resolve this.  He may want
you to do support for non-local gotos: first scan for an exception
handler, and if none is found, allow the debugger to be entered, without
any cleanups being done.  To do this, the backend would have to know the
difference between a cleanup-rethrower and a real handler; it would also
have to have a way to know if a handler `matches' a thrown exception,
and this is frontend specific.

The UNSAVE_EXPR tree code has to be migrated to the backend.  Exprs such
as TARGET_EXPRs, WITH_CLEANUP_EXPRs, CALL_EXPRs and RTL_EXPRs have to be
changed to support unsaving.  This is meant to be a complete list.
SAVE_EXPRs can be unsaved already.  expand_decl_cleanup should be
changed to unsave its argument, if needed.  See
cp/tree.c:cp_expand_decl_cleanup, unsave_expr_now, unsave_expr, and
cp/expr.c:cplus_expand_expr(case UNSAVE_EXPR:) for the UNSAVE_EXPR code.
Now, as to why...  because kenner already tripped over the exact same
problem in Ada.  We talked about it; he didn't like any of the
solutions, but didn't like having no solution either.  He was willing to
live with the drawbacks of this solution.  The drawback is
unsave_expr_now.  It should have a callback into the frontend, to allow
the unsaving of frontend special codes.  The callback goes in, in place
of the call to my_friendly_abort.

The stack unwinder is one of the hardest parts to do.  It is highly
machine dependent.  The form that kenner seems to like was a couple of
macros, that would do the machine dependent grunt work.  One preexisting
function that might be of some use is __builtin_return_address ().  One
macro he seemed to want was __builtin_return_address, and the other
would do the hard work of fixing up the registers, adjusting the stack
pointer, frame pointer, arg pointer and so on.

The eh archive (~mrs/eh) might be good reading for understanding the Ada
perspective, and some of kenner's mindset, and a detailed explanation
(Message-Id: <9308301130.AA10543@@vlsi1.ultra.nyu.edu>) of the concepts
involved.

Here is a guide to existing backend type code.  It is all in
cp/except.c.  Check out do_unwind and expand_builtin_throw for current
code on how to figure out what handler matches an exception,
emit_exception_table for code on emitting the PC range table that is
built during compilation, expand_exception_blocks for code that emits
all the handlers at the end of a function, end_protect to mark the end
of an exception region, start_protect to mark the start of an exception
region, lang_interim_eh for the master hook used by the backend into the
EH backend that now exists in the frontend, and expand_internal_throw to
raise an exception.

@node Free Store, Concept Index, Exception Handling, Top
@section Free Store

operator new [] adds a magic cookie to the beginning of arrays for which
the number of elements will be needed by operator delete [].  These are
arrays of objects with destructors and arrays of objects that define
operator delete [] with the optional size_t argument.  This cookie can
be examined from a program as follows:

@example
#include <cstddef>
#include <cstdio>

/* Read the element count that operator new [] stored just before
   the array.  The cookie layout is specific to the implementation.  */
std::size_t nelts (void *p)
@{
  struct cookie @{
    std::size_t nelts __attribute__ ((aligned (sizeof (double))));
  @};

  cookie *cp = (cookie *)p;
  --cp;

  return cp->nelts;
@}

struct A @{
  ~A() @{ @}
@};

int main()
@{
  A *ap = new A[3];
  std::printf ("%lu\n", (unsigned long) nelts (ap));
  delete [] ap;
  return 0;
@}
@end example

@section Linkage
The linkage code in g++ is horribly twisted in order to meet two design goals:

1) Avoid unnecessary emission of inlines and vtables.

2) Support pedantic assemblers like the one in AIX.

To meet the first goal, we defer emission of inlines and vtables until
the end of the translation unit, where we can decide whether or not they
are needed, and how to emit them if they are.

@node Concept Index,  , Free Store, Top
@section Concept Index

@printindex cp

@bye