source: trunk/third/gcc/cp/gxxint.texi @ 8834

Revision 8834, 54.5 KB checked in by ghudson, 28 years ago (diff)
This commit was generated by cvs2svn to compensate for changes in r8833, which included commits to RCS files with non-trunk default branches.

1	\input texinfo @c --texinfo--
2	@c %**start of header
3	@setfilename g++int.info
4	@settitle G++ internals
5	@setchapternewpage odd
6	@c %**end of header
7
8	@node Top, Limitations of g++, (dir), (dir)
9	@chapter Internal Architecture of the Compiler
10
11	This is meant to describe the C++ front-end for gcc in detail.
12	Questions and comments to mrs@@cygnus.com.
13
14	@menu
15	* Limitations of g++::
16	* Routines::
17	* Implementation Specifics::
18	* Glossary::
19	* Macros::
20	* Typical Behavior::
21	* Coding Conventions::
22	* Templates::
23	* Access Control::
24	* Error Reporting::
25	* Parser::
26	* Copying Objects::
27	* Exception Handling::
28	* Free Store::
29	* Concept Index::
30	@end menu
31
32	@node Limitations of g++, Routines, Top, Top
33	@section Limitations of g++
34
35	@itemize @bullet
36	@item
37	Limitations on input source code: 240 nesting levels with the parser
38	stacksize (YYSTACKSIZE) set to 500 (the default); each nesting level
39	requires around 16.4k of swap space. The parser needs about 2.09 *
40	number of nesting levels worth of stackspace.
41
42	@cindex pushdecl_class_level
43	@item
44	I suspect there are other uses of pushdecl_class_level that do not call
45	set_identifier_type_value in tandem with the call to
46	pushdecl_class_level. It would seem to be an omission.
47
48	@cindex access checking
49	@item
50	Access checking is unimplemented for nested types.
51
52	@cindex @code{volatile}
53	@item
54	@code{volatile} is not implemented in general.
55
56	@cindex pointers to members
57	@item
58	Pointers to members are only minimally supported, and there are places
59	where the grammar doesn't even properly accept them yet.
60
61	@cindex multiple inheritance
62	@item
63	@code{this} will be wrong in virtual member functions defined in a
64	virtual base class, when they are overridden in a derived class, when
65	called via a non-left-most object.
66
67	An example would be:
68
69	@example
70	extern "C" int printf(const char*, ...);
71	struct A @{ virtual void f() @{ @} @};
72	struct B : virtual A @{ int b; B() : b(0) @{@} void f() @{ b++; @} @};
73	struct C : B @{@};
74	struct D : B @{@};
75	struct E : C, D @{@};
76	int main()
77	@{
78	E e;
79	C& c = e; D& d = e;
80	c.f(); d.f();
81	printf ("C::b = %d, D::b = %d\n", e.C::b, e.D::b);
82	return 0;
83	@}
84	@end example
85
86	This will print out 2, 0 instead of 1, 1.
87
88	@end itemize
89
90	@node Routines, Implementation Specifics, Limitations of g++, Top
91	@section Routines
92
93	This section describes some of the routines used in the C++ front-end.
94
95	@code{build_vtable} and @code{prepare_fresh_vtable} are used only within
96	the @file{cp-class.c} file, and only in @code{finish_struct} and
97	@code{modify_vtable_entries}.
98
99	@code{build_vtable}, @code{prepare_fresh_vtable}, and
100	@code{finish_struct} are the only routines that set @code{DECL_VPARENT}.
101
102	@code{finish_struct} can steal the virtual function table from parents;
103	this prevents @code{related_vslot} from working. When @code{finish_struct}
104	steals, we know that
105
106	@example
107	get_binfo (DECL_FIELD_CONTEXT (CLASSTYPE_VFIELD (t)), t, 0)
108	@end example
109
110	@noindent
111	will get the related binfo.
112
113	@code{layout_basetypes} does something with the VIRTUALS.
114
115	Supposedly (according to Tiemann) most of the breadth first searching
116	done, like in @code{get_base_distance} and in @code{get_binfo}, was not
117	because of any design decision. I have since found out that at least one
118	part of the compiler needs the notion of depth first binfo searching. I
119	am going to try to convert the whole thing; it should just work. The
120	term left-most refers to the depth first left-most node. It uses
121	@code{MAIN_VARIANT == type} as the condition to get left-most, because
122	the things that have @code{BINFO_OFFSET}s of zero are shared and will
123	have themselves as their own @code{MAIN_VARIANT}s. The non-shared right
124	ones are copies of the left-most one; hence if it is its own
125	@code{MAIN_VARIANT}, we know it IS a left-most one, and if it is not, it is
126	a non-left-most one.
127
128	@code{get_base_distance}'s path and distance matter in its use in:
129
130	@itemize @bullet
131	@item
132	@code{prepare_fresh_vtable} (the code is probably wrong)
133	@item
134	@code{init_vfields}, which probably depends upon distance in a safe way.
135	@code{build_offset_ref} might use partial paths to do further lookups;
136	@code{hack_identifier} is probably not properly checking access.
137
138	@item
139	@code{get_first_matching_virtual} probably should check for
140	@code{get_base_distance} returning -2.
141
142	@item
143	@code{resolve_offset_ref} should be called in a more deterministic
144	manner. Right now, it is called in some random contexts, like for
145	arguments at @code{build_method_call} time, @code{default_conversion}
146	time, @code{convert_arguments} time, @code{build_unary_op} time,
147	@code{build_c_cast} time, @code{build_modify_expr} time,
148	@code{convert_for_assignment} time, and
149	@code{convert_for_initialization} time.
150
151	But, there are still more contexts it needs to be called in, one was the
152	ever simple:
153
154	@example
155	if (obj.*pmi != 7)
156	@dots{}
157	@end example
158
159	It seems that the problems were due to the fact that @code{TREE_TYPE} of
160	the @code{OFFSET_REF} was not an @code{OFFSET_TYPE}, but rather the type
161	of the referent (like @code{INTEGER_TYPE}). This problem was fixed by
162	changing @code{default_conversion} to check @code{TREE_CODE (x)},
163	instead of only checking @code{TREE_CODE (TREE_TYPE (x))} to see if it
164	was @code{OFFSET_TYPE}.
165
166	@end itemize
167
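The @code{obj.*pmi} test shown above under @code{resolve_offset_ref} uses a
pointer to data member. A minimal, self-contained sketch of that construct
follows. The names @code{S}, @code{value}, @code{differs_from_seven} and
@code{demo_pointer_to_member} are illustrative assumptions and do not come
from the compiler sources.

```cpp
#include <cassert>

// Illustrative type; not taken from the G++ sources.
struct S {
  int value;
};

// Returns true when the member selected by pmi is not 7.
// `obj.*pmi` is the pointer-to-member access that, per the text above,
// the front end represents as an OFFSET_REF.
bool differs_from_seven(const S& obj, int S::*pmi) {
  if (obj.*pmi != 7)
    return true;
  return false;
}

// Small demonstration of the call.
void demo_pointer_to_member() {
  int S::*pmi = &S::value;  // pointer to the data member `value`
  S a = {7};
  S b = {3};
  assert(!differs_from_seven(a, pmi));
  assert(differs_from_seven(b, pmi));
}
```

The condition inside @code{differs_from_seven} has the same shape as the
case that needed the @code{default_conversion} fix described above.
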
168	@node Implementation Specifics, Glossary, Routines, Top
169	@section Implementation Specifics
170
171	@itemize @bullet
172	@item Explicit Initialization
173
174	The global list @code{current_member_init_list} contains the list of
175	mem-initializers specified in a constructor declaration. For example:
176
177	@example
178	foo::foo() : a(1), b(2) @{@}
179	@end example
180
181	@noindent
182	will initialize @samp{a} with 1 and @samp{b} with 2.
183	@code{expand_member_init} places each initialization (a with 1) on the
184	global list. Then, when the fndecl is being processed,
185	@code{emit_base_init} runs down the list, initializing them. It used to
186	be the case that g++ first ran down @code{current_member_init_list},
187	then ran down the list of members initializing the ones that weren't
188	explicitly initialized. Things were rewritten to perform the
189	initializations in order of declaration in the class. So, for the above
190	example, @samp{a} and @samp{b} will be initialized in the order that
191	they were declared:
192
193	@example
194	class foo @{ public: int b; int a; foo (); @};
195	@end example
196
197	@noindent
198	Thus, @samp{b} will be initialized with 2 first, then @samp{a} will be
199	initialized with 1, regardless of how they're listed in the mem-initializer.
200
201	@item Argument Matching
202
203	In early 1993, the argument matching scheme in @sc{gnu} C++ changed
204	significantly. The original code was completely replaced with a new
205	method that will, hopefully, be easier to understand and make fixing
206	specific cases much easier.
207
208	The @samp{-fansi-overloading} option is used to enable the new code; at
209	some point in the future, it will become the default behavior of the
210	compiler.
211
212	The file @file{cp-call.c} contains all of the new work, in the functions
213	@code{rank_for_overload}, @code{compute_harshness},
214	@code{compute_conversion_costs}, and @code{ideal_candidate}.
215
216	Instead of using obscure numerical values, the quality of an argument
217	match is now represented by clear, individual codes. The new data
218	structure @code{struct harshness} (it used to be an @code{unsigned}
219	number) contains:
220
221	@enumerate a
222	@item the @samp{code} field, to signify what was involved in matching two
223	arguments;
224	@item the @samp{distance} field, used in situations where inheritance
225	decides which function should be called (one is ``closer'' than
226	another);
227	@item and the @samp{int_penalty} field, used by some codes as a tie-breaker.
228	@end enumerate
229
230	The @samp{code} field is a number with a given bit set for each type of
231	code, OR'd together. The new codes are:
232
233	@itemize @bullet
234	@item @code{EVIL_CODE}
235	The argument was not a permissible match.
236
237	@item @code{CONST_CODE}
238	Currently, this is only used by @code{compute_conversion_costs}, to
239	distinguish when a non-@code{const} member function is called from a
240	@code{const} member function.
241
242	@item @code{ELLIPSIS_CODE}
243	A match against an ellipsis @samp{...} is considered worse than all others.
244
245	@item @code{USER_CODE}
246	Used for a match involving a user-defined conversion.
247
248	@item @code{STD_CODE}
249	A match involving a standard conversion.
250
251	@item @code{PROMO_CODE}
252	A match involving an integral promotion. For these, the
253	@code{int_penalty} field is used to handle the ARM's rule (XXX cite)
254	that a smaller @code{unsigned} type should promote to an @code{int}, not
255	to an @code{unsigned int}.
256
257	@item @code{QUAL_CODE}
258	Used to mark use of qualifiers like @code{const} and @code{volatile}.
259
260	@item @code{TRIVIAL_CODE}
261	Used for trivial conversions. The @samp{int_penalty} field is used by
262	@code{convert_harshness} to communicate further penalty information back
263	to @code{build_overload_call_real} when deciding which function should
264	be called.
265	@end itemize
266
267	The functions @code{convert_to_aggr} and @code{build_method_call} use
268	@code{compute_conversion_costs} to rate each argument's suitability for
269	a given candidate function (that's how we get the list of candidates for
270	@code{ideal_candidate}).
271
272	@end itemize
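
The declaration-order rule described under Explicit Initialization can be
observed from user code. The sketch below reuses the class @code{foo} from
the text; the logging helper @code{note}, the variable @code{init_log} and
the function @code{initialization_order} are assumptions added for the
illustration.

```cpp
#include <cassert>
#include <string>

// Records the order in which members are initialized.
static std::string init_log;

// Appends a tag to the log and returns the value to store.
static int note(char tag, int value) {
  init_log += tag;
  return value;
}

// `b` is declared before `a`, mirroring the class in the text.
class foo {
public:
  int b;
  int a;
  foo();
};

// The mem-initializers name `a` first, but declaration order decides.
foo::foo() : a(note('a', 1)), b(note('b', 2)) {}

// Constructs a foo and returns the recorded initialization order.
std::string initialization_order() {
  init_log.clear();
  foo f;
  assert(f.a == 1 && f.b == 2);
  return init_log;
}
```

Calling @code{initialization_order} yields the string @code{"ba"}: @samp{b}
is initialized first because it is declared first. Some compilers warn
about the mismatched mem-initializer order; the warning does not change
the behavior.
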
273
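The @code{struct harshness} layout described under Argument Matching can be
sketched as follows. Only the field names and the code names come from the
text. The bit positions, the field types, the enum name @code{match_code}
and the helpers @code{has_code} and @code{example_std_with_qualifier} are
assumptions, so this is not a copy of what @file{cp-call.c} declares.

```cpp
#include <cassert>

// Hypothetical bit assignments: the text says only that each code has
// its own bit and that the bits are OR'd together.
enum match_code {
  EVIL_CODE     = 1 << 7,
  CONST_CODE    = 1 << 6,
  ELLIPSIS_CODE = 1 << 5,
  USER_CODE     = 1 << 4,
  STD_CODE      = 1 << 3,
  PROMO_CODE    = 1 << 2,
  QUAL_CODE     = 1 << 1,
  TRIVIAL_CODE  = 1 << 0
};

// Field names follow the text; the field types are assumptions.
struct harshness {
  unsigned code;    // OR of match_code bits
  int distance;     // inheritance distance ("closer" wins)
  int int_penalty;  // tie-breaker used by some codes
};

// True when the given code bit is set in h.
bool has_code(const harshness& h, match_code c) {
  return (h.code & static_cast<unsigned>(c)) != 0;
}

// Builds the rating for a standard conversion that also adds a qualifier.
harshness example_std_with_qualifier() {
  harshness h = {0u, 0, 0};
  h.code |= STD_CODE;
  h.code |= QUAL_CODE;
  return h;
}
```

Keeping one bit per code lets a single rating record several properties of
a match at once, which a plain @code{unsigned} ranking number could not
express clearly.
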
274	@node Glossary, Macros, Implementation Specifics, Top
275	@section Glossary
276
277	@table @r
278	@item binfo
279	The main data structure in the compiler used to represent the
280	inheritance relationships between classes. The data in the binfo can be
281	accessed by the BINFO_ accessor macros.
282
283	@item vtable
284	@itemx virtual function table
285
286	The virtual function table holds information used in virtual function
287	dispatching. In the compiler, they are usually referred to as vtables,
288	or vtbls. The first index is not used in the normal way; I believe it
289	is probably used for the virtual destructor.
290
291	@item vfield
292
293	vfields can be thought of as the base information needed to build
294	vtables. For every vtable that exists for a class, there is a vfield.
295	See also vtable and virtual function table pointer. When a type is used
296	as a base class to another type, the virtual function table for the
297	derived class can be based upon the vtable for the base class, just
298	extended to include the additional virtual methods declared in the
299	derived class. The virtual function table from a virtual base class is
300	never reused in a derived class. @code{is_normal} depends upon this.
301
302	@item virtual function table pointer
303
304	These are @code{FIELD_DECL}s that are pointer types that point to
305	vtables. See also vtable and vfield.
306	@end table
307
308	@node Macros, Typical Behavior, Glossary, Top
309	@section Macros
310
311	This section describes some of the macros used on trees. The list
312	should be alphabetical. Eventually all macros should be documented
313	here. There are some postscript drawings that can be used to better
314	understand some of the more complex data structures; contact Mike Stump
315	(@code{mrs@@cygnus.com}) for information about them.
316
317	@table @code
318	@item BINFO_BASETYPES
319	A vector of additional binfos for the types inherited by this basetype.
320	The binfos are fully unshared (except for virtual bases, in which
321	case the binfo structure is shared).
322
323	If this basetype describes type D as inherited in C,
324	and if the basetypes of D are E and F,
325	then this vector contains binfos for inheritance of E and F by C.
326
327	Has values of:
328
329	TREE_VECs
330
331
332	@item BINFO_INHERITANCE_CHAIN
333	Temporarily used to represent specific inheritances. It usually points
334	to the binfo associated with the lesser derived type, but it can be
335	reversed by reverse_path. For example:
336
337	@example
338	Z ZbY	least derived
339	|
340	Y YbX
341	|
342	X Xb	most derived
343
344	TYPE_BINFO (X) == Xb
345	BINFO_INHERITANCE_CHAIN (Xb) == YbX
346	BINFO_INHERITANCE_CHAIN (Yb) == ZbY
347	BINFO_INHERITANCE_CHAIN (Zb) == 0
348	@end example
349
350	Not sure if the above is really true; get_base_distance has it point
351	towards the most derived type, opposite from above.
352
353	Set by build_vbase_path, recursive_bounded_basetype_p,
354	get_base_distance, lookup_field, lookup_fnfields, and reverse_path.
355
356	What things can this be used on:
357
358	TREE_VECs that are binfos
359
360
361	@item BINFO_OFFSET
362	The offset where this basetype appears in its containing type.
363	BINFO_OFFSET slot holds the offset (in bytes) from the base of the
364	complete object to the base of the part of the object that is allocated
365	on behalf of this `type'. This is always 0 except when there is
366	multiple inheritance.
367
368	Used on TREE_VEC_ELTs of the binfos BINFO_BASETYPES (...) for example.
369
370
371	@item BINFO_VIRTUALS
372	A unique list of functions for the virtual function table. See also
373	TYPE_BINFO_VIRTUALS.
374
375	What things can this be used on:
376
377	TREE_VECs that are binfos
378
379
380	@item BINFO_VTABLE
381	Used to find the VAR_DECL that is the virtual function table associated
382	with this binfo. See also TYPE_BINFO_VTABLE. To get the virtual
383	function table pointer, see CLASSTYPE_VFIELD.
384
385	What things can this be used on:
386
387	TREE_VECs that are binfos
388
389	Has values of:
390
391	VAR_DECLs that are virtual function tables
392
393
394	@item BLOCK_SUPERCONTEXT
395	In the outermost scope of each function, it points to the FUNCTION_DECL
396	node. It aids in better DWARF support of inline functions.
397
398
399	@item CLASSTYPE_TAGS
400	CLASSTYPE_TAGS is a linked (via TREE_CHAIN) list of member classes of a
401	class. TREE_PURPOSE is the name, TREE_VALUE is the type (pushclass scans
402	these and calls pushtag on them.)
403
404	finish_struct scans these to produce TYPE_DECLs to add to the
405	TYPE_FIELDS of the type.
406
407	It is expected that the name found in the TREE_PURPOSE slot is unique;
408	resolve_scope_to_name is one such place that depends upon this
409	uniqueness.
410
411
412	@item CLASSTYPE_METHOD_VEC
413	The following is true after finish_struct has been called (on the
414	class?) but not before. Before finish_struct is called, things are
415	different to some extent. Contains a TREE_VEC of methods of the class.
416	The TREE_VEC_LENGTH is the number of differently named methods plus one
417	for the 0th entry. The 0th entry is always allocated, and reserved for
418	ctors and dtors. If there are none, TREE_VEC_ELT(N,0) == NULL_TREE.
419	Each entry of the TREE_VEC is a FUNCTION_DECL. For each FUNCTION_DECL,
420	there is a DECL_CHAIN slot. If the FUNCTION_DECL is the last one with a
421	given name, the DECL_CHAIN slot is NULL_TREE. Otherwise it is the next
422	method that has the same name (but a different signature). It would
423	seem that using the DECL_CHAIN slot in this way does not prevent us
424	from calling pushdecl to put the method in the global scope (which
425	would overwrite the TREE_CHAIN slot), because the two use different
426	_CHAINs. finish_struct_methods sets up one version of the TREE_CHAIN
427	slots on the FUNCTION_DECLs.
428
429	friends are kept in TREE_LISTs, so that there's no need to use their
430	TREE_CHAIN slot for anything.
431
432	Has values of:
433
434	TREE_VECs
435
436
437	@item CLASSTYPE_VFIELD
438	Seems to be in the process of being renamed TYPE_VFIELD. Use on types
439	to get the main virtual function table pointer. To get the virtual
440	function table use BINFO_VTABLE (TYPE_BINFO ()).
441
442	Has values of:
443
444	FIELD_DECLs that are virtual function table pointers
445
446	What things can this be used on:
447
448	RECORD_TYPEs
449
450
451	@item DECL_CLASS_CONTEXT
452	Identifies the context that the _DECL was found in. For virtual function
453	tables, it points to the type associated with the virtual function
454	table. See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_FCONTEXT.
455
456	The difference between this and DECL_CONTEXT is that for virtual
457	functions like:
458
459	@example
460	struct A
461	@{
462	virtual int f ();
463	@};
464
465	struct B : A
466	@{
467	int f ();
468	@};
469
470	DECL_CONTEXT (A::f) == A
471	DECL_CLASS_CONTEXT (A::f) == A
472
473	DECL_CONTEXT (B::f) == A
474	DECL_CLASS_CONTEXT (B::f) == B
475	@end example
476
477	Has values of:
478
479	RECORD_TYPEs, or UNION_TYPEs
480
481	What things can this be used on:
482
483	TYPE_DECLs, _DECLs
484
485
486	@item DECL_CONTEXT
487	Identifies the context that the _DECL was found in. Can be used on
488	virtual function tables to find the type associated with the virtual
489	function table, but since they are FIELD_DECLs, DECL_FIELD_CONTEXT is a
490	better access method. Internally the same as DECL_FIELD_CONTEXT, so
491	don't use both. See also DECL_FIELD_CONTEXT, DECL_FCONTEXT and
492	DECL_CLASS_CONTEXT.
493
494	Has values of:
495
496	RECORD_TYPEs
497
498
499	What things can this be used on:
500
501	@display
502	VAR_DECLs that are virtual function tables
503	_DECLs
504	@end display
505
506
507	@item DECL_FIELD_CONTEXT
508	Identifies the context that the FIELD_DECL was found in. Internally the
509	same as DECL_CONTEXT, so don't use both. See also DECL_CONTEXT,
510	DECL_FCONTEXT and DECL_CLASS_CONTEXT.
511
512	Has values of:
513
514	RECORD_TYPEs
515
516	What things can this be used on:
517
518	@display
519	FIELD_DECLs that are virtual function pointers
520	FIELD_DECLs
521	@end display
522
523
524	@item DECL_NESTED_TYPENAME
525	Holds the fully qualified type name. Example, Base::Derived.
526
527	Has values of:
528
529	IDENTIFIER_NODEs
530
531	What things can this be used on:
532
533	TYPE_DECLs
534
535
536	@item DECL_NAME
537
538	Has values of:
539
540	@display
541	0 for things that don't have names
542	IDENTIFIER_NODEs for TYPE_DECLs
543	@end display
544
545	@item DECL_IGNORED_P
546	A bit that can be set to inform the debug information output routines in
547	the back-end that a certain _DECL node should be totally ignored.
548
549	Used in cases where it is known that the debugging information will be
550	output in another file, or where a sub-type is known not to be needed
551	because the enclosing type is not needed.
552
553	A compiler constructed virtual destructor in derived classes that do not
554	define an explicit destructor that was defined explicitly in a base class
555	has this bit set as well. Also used on __FUNCTION__ and
556	__PRETTY_FUNCTION__ to mark that they are ``compiler generated.'' c-decl.c and
557	c-lex.c both want DECL_IGNORED_P set for ``internally generated vars,''
558	and ``user-invisible variable.''
559
560	Functions built by the C++ front-end such as default destructors,
561	virtual destructors and default constructors want to be marked that
562	they are compiler generated, but unsure why.
563
564	Currently, it is used in an absolute way in the C++ front-end, as an
565	optimization, to tell the debug information output routines to not
566	generate debugging information that will be output by another separately
567	compiled file.
568
569
570	@item DECL_VIRTUAL_P
571	A flag used on FIELD_DECLs and VAR_DECLs. (Documentation in tree.h is
572	wrong.) Used in VAR_DECLs to indicate that the variable is a vtable.
573	It is also used in FIELD_DECLs for vtable pointers.
574
575	What things can this be used on:
576
577	FIELD_DECLs and VAR_DECLs
578
579
580	@item DECL_VPARENT
581	Used to point to the parent type of the vtable if there is one, else it
582	is just the type associated with the vtable. Because of the sharing of
583	virtual function tables that goes on, this slot is not very useful, and
584	is in fact, not used in the compiler at all. It can be removed.
585
586	What things can this be used on:
587
588	VAR_DECLs that are virtual function tables
589
590	Has values of:
591
592	RECORD_TYPEs maybe UNION_TYPEs
593
594
595	@item DECL_FCONTEXT
596	Used to find the first baseclass in which this FIELD_DECL is defined.
597	See also DECL_CONTEXT, DECL_FIELD_CONTEXT and DECL_CLASS_CONTEXT.
598
599	How it is used:
600
601	Used when writing out debugging information about vfield and
602	vbase decls.
603
604	What things can this be used on:
605
606	FIELD_DECLs that are virtual function pointers
607	FIELD_DECLs
608
609
610	@item DECL_REFERENCE_SLOT
611	Used to hold the initializer for the reference.
612
613	What things can this be used on:
614
615	PARM_DECLs and VAR_DECLs that have a reference type
616
617
618	@item DECL_VINDEX
619	Used for FUNCTION_DECLs in two different ways. Before the structure
620	containing the FUNCTION_DECL is laid out, DECL_VINDEX may point to a
621	FUNCTION_DECL in a base class which is the FUNCTION_DECL which this
622	FUNCTION_DECL will replace as a virtual function. When the class is
623	laid out, this pointer is changed to an INTEGER_CST node which is
624	suitable to find an index into the virtual function table. See
625	get_vtable_entry as to how one can find the right index into the virtual
626	function table. The first index, 0, of a virtual function table is not
627	used in the normal way, so the first real index is 1.
628
629	DECL_VINDEX may be a TREE_LIST, that would seem to be a list of
630	overridden FUNCTION_DECLs. add_virtual_function has code to deal with
631	this when it uses the variable base_fndecl_list, but it would seem that
632	somehow, it is possible for the TREE_LIST to persist until method_call,
633	and it should not.
634
635
636	What things can this be used on:
637
638	FUNCTION_DECLs
639
640
641	@item DECL_SOURCE_FILE
642	Identifies what source file a particular declaration was found in.
643
644	Has values of:
645
646	"<built-in>" on TYPE_DECLs to mean the typedef is built in
647
648
649	@item DECL_SOURCE_LINE
650	Identifies what source line number in the source file the declaration
651	was found at.
652
653	Has values of:
654
655	@display
656	0 for an undefined label
657
658	0 for TYPE_DECLs that are internally generated
659
660	0 for FUNCTION_DECLs for functions generated by the compiler
661	(not yet, but should be)
662
663	0 for ``magic'' arguments to functions, that the user has no
664	control over
665	@end display
666
667
668	@item TREE_USED
669
670	Has values of:
671
672	0 for unused labels
673
674
675	@item TREE_ADDRESSABLE
676	A flag that is set for any type that has a constructor.
677
678
679	@item TREE_COMPLEXITY
680	They seem a kludgy way to track recursion, popping, and pushing. They only
681	appear in cp-decl.c and cp-decl2.c, so they are a good candidate for
682	proper fixing, and removal.
683
684
685	@item TREE_PRIVATE
686	Set for FIELD_DECLs by finish_struct. But not uniformly set.
687
688	The following routines do something with PRIVATE access:
689	build_method_call, alter_access, finish_struct_methods,
690	finish_struct, convert_to_aggr, CWriteLanguageDecl, CWriteLanguageType,
691	CWriteUseObject, compute_access, lookup_field, dfs_pushdecl,
692	GNU_xref_member, dbxout_type_fields, dbxout_type_method_1
693
694
695	@item TREE_PROTECTED
696	The following routines do something with PROTECTED access:
697	build_method_call, alter_access, finish_struct, convert_to_aggr,
698	CWriteLanguageDecl, CWriteLanguageType, CWriteUseObject,
699	compute_access, lookup_field, GNU_xref_member, dbxout_type_fields,
700	dbxout_type_method_1
701
702
703	@item TYPE_BINFO
704	Used to get the binfo for the type.
705
706	Has values of:
707
708	TREE_VECs that are binfos
709
710	What things can this be used on:
711
712	RECORD_TYPEs
713
714
715	@item TYPE_BINFO_BASETYPES
716	See also BINFO_BASETYPES.
717
718	@item TYPE_BINFO_VIRTUALS
719	A unique list of functions for the virtual function table. See also
720	BINFO_VIRTUALS.
721
722	What things can this be used on:
723
724	RECORD_TYPEs
725
726
727	@item TYPE_BINFO_VTABLE
728	Points to the virtual function table associated with the given type.
729	See also BINFO_VTABLE.
730
731	What things can this be used on:
732
733	RECORD_TYPEs
734
735	Has values of:
736
737	VAR_DECLs that are virtual function tables
738
739
740	@item TYPE_NAME
741	Names the type.
742
743	Has values of:
744
745	@display
746	0 for things that don't have names.
747	should be IDENTIFIER_NODE for RECORD_TYPEs UNION_TYPEs and
748	ENUM_TYPEs.
749	TYPE_DECL for RECORD_TYPEs, UNION_TYPEs and ENUM_TYPEs, but
750	shouldn't be.
751	TYPE_DECL for typedefs, unsure why.
752	@end display
753
754	What things can one use this on:
755
756	@display
757	TYPE_DECLs
758	RECORD_TYPEs
759	UNION_TYPEs
760	ENUM_TYPEs
761	@end display
762
763	History:
764
765	It currently points to the TYPE_DECL for RECORD_TYPEs,
766	UNION_TYPEs and ENUM_TYPEs, but it should be history soon.
767
768
769	@item TYPE_METHODS
770	Synonym for @code{CLASSTYPE_METHOD_VEC}. Chained together with
771	@code{TREE_CHAIN}. @file{dbxout.c} uses this to get at the methods of a
772	class.
773
774
775	@item TYPE_DECL
776	Used to represent typedefs, and used to represent binding layers.
777
778	Components:
779
780	DECL_NAME is the name of the typedef. For example, foo would
781	be found in the DECL_NAME slot when @code{typedef int foo;} is
782	seen.
783
784	DECL_SOURCE_LINE identifies what source line number in the
785	source file the declaration was found at. A value of 0
786	indicates that this TYPE_DECL is just an internal binding layer
787	marker, and does not correspond to a user supplied typedef.
788
789	DECL_SOURCE_FILE
790
791	@item TYPE_FIELDS
792	A linked list (via @code{TREE_CHAIN}) of member types of a class. The
793	list can contain @code{TYPE_DECL}s, but there can also be other things
794	in the list apparently. See also @code{CLASSTYPE_TAGS}.
795
796
797	@item TYPE_VIRTUAL_P
798	A flag used on a @code{FIELD_DECL} or a @code{VAR_DECL}, indicates it is
799	a virtual function table or a pointer to one. When used on a
800	@code{FUNCTION_DECL}, indicates that it is a virtual function. When
801	used on an @code{IDENTIFIER_NODE}, indicates that a function with this
802	same name exists and has been declared virtual.
803
804	When used on types, it indicates that the type has virtual functions, or
805	is derived from one that does.
806
807	Not sure if the above about virtual function tables is still true. See
808	also info on @code{DECL_VIRTUAL_P}.
809
810	What things can this be used on:
811
812	FIELD_DECLs, VAR_DECLs, FUNCTION_DECLs, IDENTIFIER_NODEs
813
814
815	@item VF_BASETYPE_VALUE
816	Get the associated type from the binfo that caused the given vfield to
817	exist. This is the least derived class (the most parent class) that
818	needed a virtual function table. It is probably the case that all uses
819	of this field are misguided, but they need to be examined on a
820	case-by-case basis. See history for more information on why the
821	previous statement was made.
822
823	Set at @code{finish_base_struct} time.
824
825	What things can this be used on:
826
827	TREE_LISTs that are vfields
828
829	History:
830
831	This field was used to determine if a virtual function table's
832	slot should be filled in with a certain virtual function, by
833	checking to see if the type returned by VF_BASETYPE_VALUE was a
834	parent of the context in which the old virtual function existed.
835	This incorrectly assumes that a given type _could_ not appear as
836	a parent twice in a given inheritance lattice. For single
837	inheritance, this would in fact work, because a type could not
838	possibly appear more than once in an inheritance lattice, but
839	with multiple inheritance, a type can appear more than once.
840
841
842	@item VF_BINFO_VALUE
843	Identifies the binfo that caused this vfield to exist. If this vfield
844	is from the first direct base class that has a virtual function table,
845	then VF_BINFO_VALUE is NULL_TREE, otherwise it will be the binfo of the
846	direct base where the vfield came from. Can use @code{TREE_VIA_VIRTUAL}
847	on result to find out if it is a virtual base class. Related to the
848	binfo found by
849
850	@example
851	get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
852	@end example
853
854	@noindent
855	where @samp{t} is the type that has the given vfield.
856
857	@example
858	get_binfo (VF_BASETYPE_VALUE (vfield), t, 0)
859	@end example
860
861	@noindent
862	will return the binfo for the given vfield.
863
864	May or may not be set at @code{modify_vtable_entries} time. Set at
865	@code{finish_base_struct} time.
866
867	What things can this be used on:
868
869	TREE_LISTs that are vfields
870
871
872	@item VF_DERIVED_VALUE
873	Identifies the type of the most derived class of the vfield, excluding
874	the class this vfield is for.
875
876	Set at @code{finish_base_struct} time.
877
878	What things can this be used on:
879
880	TREE_LISTs that are vfields
881
882
883	@item VF_NORMAL_VALUE
884	Identifies the type of the most derived class of the vfield, including
885	the class this vfield is for.
886
887	Set at @code{finish_base_struct} time.
888
889	What things can this be used on:
890
891	TREE_LISTs that are vfields
892
893
894	@item WRITABLE_VTABLES
895	This is an option that can be defined when building the compiler, that
896	will cause the compiler to output vtables into the data segment so that
897	the vtables may be written. This is undefined by default, because
898	normally the vtables should be unwritable. People that implement object
899	I/O facilities, or people that want to change the dynamic type of
900	objects, may want to have the vtables writable. Another way of achieving
901	this would be to make a copy of the vtable into writable memory, but the
902	drawback there is that that method only changes the type for one object.
903
904	@end table
905
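The @code{BINFO_OFFSET} entry above says the offset of a base within the
complete object is nonzero only under multiple inheritance. The effect can
be seen from user code by measuring where each base subobject starts. This
is a sketch of the observable behavior, not of how the compiler stores the
value; the types @code{Left}, @code{Right}, @code{Both} and the helpers
@code{base_offset} and @code{demo_base_offsets} are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative hierarchy with two non-empty bases.
struct Left  { int l; };
struct Right { int r; };
struct Both : Left, Right { int d; };

// Byte offset of the Base subobject within a complete Derived object.
template <class Base, class Derived>
std::ptrdiff_t base_offset(Derived& obj) {
  Base* base = &obj;  // derived-to-base conversion adjusts the pointer
  return reinterpret_cast<char*>(base) - reinterpret_cast<char*>(&obj);
}

// Two distinct non-empty bases cannot both start at the same offset,
// so at least one of them sits at a nonzero offset.
void demo_base_offsets() {
  Both obj;
  assert(base_offset<Left>(obj) != base_offset<Right>(obj));
}
```

Which of the two bases lands at offset zero depends on the ABI, so the
sketch only checks that the two offsets differ.
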
906	@node Typical Behavior, Coding Conventions, Macros, Top
907	@section Typical Behavior
908
909	@cindex parse errors
910
911	Whenever seemingly normal code fails with errors like
912	@code{syntax error at `\@{'}, it's highly likely that grokdeclarator is
913	returning a NULL_TREE for whatever reason.
914
915	@node Coding Conventions, Templates, Typical Behavior, Top
916	@section Coding Conventions
917
918	It should never be the case that trees are modified in-place by the
919	back-end, @emph{unless} it is guaranteed that the semantics are the same
920	no matter how shared the tree structure is. @file{fold-const.c} still
921	has some cases where this is not true, but rms hypothesizes that this
922	will never be a problem.
923
924	@node Templates, Access Control, Coding Conventions, Top
925	@section Templates
926
927	A template is represented by a @code{TEMPLATE_DECL}. The specific
928	fields used are:
929
930	@table @code
931	@item DECL_TEMPLATE_RESULT
932	The generic decl on which instantiations are based. This looks just
933	like any other decl.
934
935	@item DECL_TEMPLATE_PARMS
936	The parameters to this template.
937	@end table
938
939	The generic decl is parsed as much like any other decl as possible,
940	given the parameterization. The template decl is not built up until the
941	generic decl has been completed. For template classes, a template decl
942	is generated for each member function and static data member, as well.
943
944	Template members of template classes are represented by a TEMPLATE_DECL
945	for the class' parameters around another TEMPLATE_DECL for the member's
946	parameters.
947
948	All declarations that are instantiations or specializations of templates
949	refer to their template and parameters through DECL_TEMPLATE_INFO.
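
As an illustration of the nesting described above, here is a small source-level sketch of a template member of a template class. The names @code{Outer} and @code{combine} are made up for this example. The comments mark which parameter list each level of @code{TEMPLATE_DECL} would presumably carry, assuming the description above applies as written.

```cpp
#include <cassert>

// Outer level: the class template, with parameter T.
template <class T>
struct Outer {
  // Inner level: the member template, with its own parameter U.
  template <class U>
  static long combine(T t, U u) {
    return static_cast<long>(t) + static_cast<long>(u);
  }
};

// An instantiation supplies arguments for both parameter lists.
inline long demo_member_template() {
  return Outer<int>::combine<char>(40, static_cast<char>(2));
}
```

Here @code{Outer<int>::combine<char>} needs arguments for the class' parameters and for the member's parameters, which is why two levels of template decl are involved.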
950
951	How should I handle parsing member functions with the proper param
952	decls? Set them up again or try to use the same ones? Currently we do
953	the former. We can probably do this without any extra machinery in
954	store_pending_inline, by deducing the parameters from the decl in
955	do_pending_inlines. PRE_PARSED_TEMPLATE_DECL?
956
957	If a base is a parm, we can't check anything about it. If a base is not
958	a parm, we need to check it for name binding. Do finish_base_struct if
959	no bases are parameterized (only if none, including indirect, are
960	parms). Nah, don't bother trying to do any of this until instantiation
961	-- we only need to do name binding in advance.
962
963	Always set up method vec and fields, inc. synthesized methods. Really?
964	We can't know the types of the copy folks, or whether we need a
965	destructor, or can have a default ctor, until we know our bases and
966	fields. Otherwise, we can assume and fix ourselves later. Hopefully.
967
968	@node Access Control, Error Reporting, Templates, Top
969	@section Access Control
970	The function compute_access returns one of three values:
971
972	@table @code
973	@item access_public
974	means that the field can be accessed by the current lexical scope.
975
976	@item access_protected
977	means that the field cannot be accessed by the current lexical scope
978	because it is protected.
979
980	@item access_private
981	means that the field cannot be accessed by the current lexical scope
982	because it is private.
983	@end table
984
985	DECL_ACCESS is used for access declarations; alter_access creates a list
986	of types and accesses for a given decl.
987
988	Formerly, DECL_@{PUBLIC,PROTECTED,PRIVATE@} corresponded to the return
989	codes of compute_access and were used as a cache for compute_access.
990	Now they are not used at all.
991
992	TREE_PROTECTED and TREE_PRIVATE are used to record the access levels
993	granted by the containing class. BEWARE: TREE_PUBLIC means something
994	completely unrelated to access control!
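
The three return values can be pictured with the toy model below. It is only a sketch: every name in it (@code{toy_access}, @code{toy_member_level}, @code{toy_scope}, @code{toy_compute_access}) is hypothetical, and it ignores friends, access declarations and the access of the inheritance path, all of which the real @code{compute_access} has to consider.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are not the front-end's declarations.
enum toy_access { toy_access_public, toy_access_protected, toy_access_private };
enum toy_member_level { level_public, level_protected, level_private };
enum toy_scope { scope_same_class, scope_derived_class, scope_unrelated };

// Returns toy_access_public when the access is allowed; otherwise the
// value names the reason for the refusal.
inline toy_access toy_compute_access(toy_member_level level, toy_scope scope) {
  if (level == level_public || scope == scope_same_class)
    return toy_access_public;
  if (level == level_protected)
    return scope == scope_derived_class ? toy_access_public
                                        : toy_access_protected;
  return toy_access_private;
}
```

Note that in this model, as in the table above, the "public" value means "access allowed", not "declared public".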
995
996	@node Error Reporting, Parser, Access Control, Top
997	@section Error Reporting
998
999	The C++ front-end uses a call-back mechanism to allow functions to print
1000	out reasonable strings for types and functions without putting extra
1001	logic in the functions where errors are found. The interface is through
1002	the @code{cp_error} function (or @code{cp_warning}, etc.). The
1003	syntax is exactly like that of @code{error}, except that a few more
1004	conversions are supported:
1005
1006	@itemize @bullet
1007	@item
1008	%C indicates a value of `enum tree_code'.
1009	@item
1010	%D indicates a *_DECL node.
1011	@item
1012	%E indicates a *_EXPR node.
1013	@item
1014	%L indicates a value of `enum languages'.
1015	@item
1016	%P indicates the name of a parameter (i.e. "this", "1", "2", ...)
1017	@item
1018	%T indicates a *_TYPE node.
1019	@item
1020	%O indicates the name of an operator (MODIFY_EXPR -> "operator =").
1021
1022	@end itemize
1023
1024	There is some overlap between these; for instance, any of the node
1025	options can be used for printing an identifier (though only @code{%D}
1026	tries to decipher function names).
1027
1028	For a more verbose message (@code{class foo} as opposed to just @code{foo},
1029	including the return type for functions), use @code{%#c}.
1030	To have the line number on the error message indicate the line of the
1031	DECL, use @code{cp_error_at} and its ilk; to indicate which argument you want,
1032	use @code{%+D}, or it will default to the first.
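
To make the conversion scheme concrete, the following is a minimal sketch of how such a formatter could dispatch on the conversion letter. It is not the front-end's implementation: @code{toy_node} and @code{toy_cp_format} are hypothetical names, only @code{%D}, @code{%T}, @code{%E} and the @code{#} flag are modeled, and the exact output text is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node; real tree nodes carry far more information.
struct toy_node {
  std::string keyword;  // e.g. "class"; printed only in the verbose form
  std::string name;
};

// Expand %D, %T and %E (optionally with '#') by consuming the next node.
// Anything else is copied through unchanged.
inline std::string toy_cp_format(const std::string& fmt,
                                 const std::vector<toy_node>& args) {
  std::string out;
  std::size_t next = 0;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '%') { out += fmt[i]; continue; }
    std::size_t j = i + 1;
    bool verbose = j < fmt.size() && fmt[j] == '#';
    if (verbose) ++j;
    bool known = j < fmt.size() &&
                 (fmt[j] == 'D' || fmt[j] == 'T' || fmt[j] == 'E');
    if (!known || next >= args.size()) { out += fmt[i]; continue; }
    const toy_node& node = args[next++];
    if (verbose && !node.keyword.empty()) out += node.keyword + " ";
    out += node.name;
    i = j;  // skip past the conversion letter
  }
  return out;
}
```

For example, @code{toy_cp_format("%#T and %T", ...)} applied twice to a node for class @code{foo} gives the verbose form first and the short form second.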
1033
1034	@node Parser, Copying Objects, Error Reporting, Top
1035	@section Parser
1036
1037	Some comments on the parser:
1038
1039	The @code{after_type_declarator} / @code{notype_declarator} hack is
1040	necessary in order to allow redeclarations of @code{TYPENAME}s, for
1041	instance
1042
1043	@example
1044	typedef int foo;
1045	class A @{
1046	char *foo;
1047	@};
1048	@end example
1049
1050	In the above, the first @code{foo} is parsed as a @code{notype_declarator},
and the second as an @code{after_type_declarator}.
1052
1053	Ambiguities:
1054
1055	There are currently four reduce/reduce ambiguities in the parser. They are:
1056
1057	1) Between @code{template_parm} and
1058	@code{named_class_head_sans_basetype}, for the tokens @code{aggr
1059	identifier}. This situation occurs in code looking like
1060
1061	@example
1062	template <class T> class A @{ @};
1063	@end example
1064
1065	It is ambiguous whether @code{class T} should be parsed as the
1066	declaration of a template type parameter named @code{T} or an unnamed
1067	constant parameter of type @code{class T}. Section 14.6, paragraph 3 of
1068	the January '94 working paper states that the first interpretation is
1069	the correct one. This ambiguity results in two reduce/reduce conflicts.
1070
1071	2) Between @code{primary} and @code{type_id} for code like @samp{int()}
1072	in places where both can be accepted, such as the argument to
1073	@code{sizeof}. Section 8.1 of the pre-San Diego working paper specifies
1074	that these ambiguous constructs will be interpreted as @code{typename}s.
1075	This ambiguity results in six reduce/reduce conflicts between
1076	@samp{absdcl} and @samp{functional_cast}.
1077
1078	3) Between @code{functional_cast} and
1079	@code{complex_direct_notype_declarator}, for various token strings.
1080	This situation occurs in code looking like
1081
1082	@example
1083	int (*a);
1084	@end example
1085
1086	This code is ambiguous; it could be a declaration of the variable
1087	@samp{a} as a pointer to @samp{int}, or it could be a functional cast of
1088	@samp{*a} to @samp{int}. Section 6.8 specifies that the former
1089	interpretation is correct. This ambiguity results in 7 reduce/reduce
1090	conflicts. Another aspect of this ambiguity is code like 'int (x[2]);',
1091	which is resolved at the '[' and accounts for 6 reduce/reduce conflicts
1092	between @samp{direct_notype_declarator} and
1093	@samp{primary}/@samp{overqualified_id}. Finally, there are 4 r/r
1094	conflicts between @samp{expr_or_declarator} and @samp{primary} over code
1095	like 'int (a);', which could probably be resolved but would also
1096	probably be more trouble than it's worth. In all, this situation
1097	accounts for 17 conflicts. Ack!
1098
1099	The second case above is responsible for the failure to parse 'LinppFile
1100	ppfile (String (argv[1]), &outs, argc, argv);' (from Rogue Wave
1101	Math.h++) as an object declaration, and must be fixed so that it does
1102	not resolve until later.
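
The three statement forms discussed under ambiguity 3 can be tried out with the sketch below. The function name @code{demo_paren_declarators} is made up. Each statement starts out looking like a functional cast, yet each is a declaration, as the rule cited above requires. Some compilers may warn about the redundant parentheses.

```cpp
#include <cassert>

// Each statement begins like a functional cast but is a declaration.
inline int demo_paren_declarators() {
  int (x[2]) = {3, 4};  // array of two ints
  int (*a) = &x[1];     // pointer to int, as in the example in the text
  int (b) = x[0];       // plain int
  return *a + b;
}
```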
1103
1104	4) Indirectly between @code{after_type_declarator} and @code{parm}, for
1105	type names. This occurs in (as one example) code like
1106
1107	@example
1108	typedef int foo, bar;
1109	class A @{
1110	foo (bar);
1111	@};
1112	@end example
1113
1114	What is @code{bar} inside the class definition? We currently interpret
1115	it as a @code{parm}, as does Cfront, but IBM xlC interprets it as an
1116	@code{after_type_declarator}. I believe that xlC is correct, in light
1117	of 7.1p2, which says "The longest sequence of @i{decl-specifiers} that
1118	could possibly be a type name is taken as the @i{decl-specifier-seq} of
1119	a @i{declaration}." However, it seems clear that this rule must be
1120	violated in the case of constructors. This ambiguity accounts for 8
1121	conflicts.
1122
1123	Unlike the others, this ambiguity is not recognized by the Working Paper.
1124
1125	@node Copying Objects, Exception Handling, Parser, Top
1126	@section Copying Objects
1127
1128	The generated copy assignment operator in g++ does not currently do the
1129	right thing for multiple inheritance involving virtual bases; it just
1130	calls the copy assignment operators for its direct bases. What it
1131	should probably do is:
1132
1133	1) Split up the copy assignment operator for all classes that have
1134	vbases into "copy my vbases" and "copy everything else" parts. Or do
1135	the trickiness that the constructors do to ensure that vbases don't get
1136	initialized by intermediate bases.
1137
1138	2) Wander through the class lattice, find all vbases for which no
1139	intermediate base has a user-defined copy assignment operator, and call
1140	their "copy everything else" routines. If not all of my vbases satisfy
1141	this criterion, warn, because this may be surprising behavior.
1142
1143	3) Call the "copy everything else" routine for my direct bases.
1144
1145	If we only have one direct base, we can just foist everything off onto
1146	them.
1147
1148	This issue is currently under discussion in the core reflector
1149	(2/28/94).
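
The problem can be observed with a small diamond hierarchy. In the sketch below (all names are hypothetical), the virtual base counts how often its copy assignment operator runs during one assignment of the most derived class. The count depends on the compiler: the language leaves it unspecified whether a virtual base subobject is assigned more than once by a generated copy assignment operator.

```cpp
#include <cassert>

static int vbase_assignments = 0;

struct V {
  int value = 0;
  V& operator=(const V& other) {
    ++vbase_assignments;  // count every assignment of the virtual base
    value = other.value;
    return *this;
  }
};
struct B1 : virtual V {};
struct B2 : virtual V {};
struct D : B1, B2 {};  // copy assignment is generated by the compiler

// Returns how many times the single V subobject was assigned.
inline int count_vbase_assignments() {
  D from, to;
  from.value = 7;
  vbase_assignments = 0;
  to = from;
  assert(to.value == 7);
  return vbase_assignments;
}
```

A result greater than one means the shared base was copied through more than one intermediate base, which is the behavior the scheme above is meant to avoid.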
1150
1151	@node Exception Handling, Free Store, Copying Objects, Top
1152	@section Exception Handling
1153
1154	Note, exception handling in g++ is still under development.
1155
1156	This section describes the mapping of C++ exceptions in the C++
1157	front-end, into the back-end exception handling framework.
1158
1159	The basic mechanism of exception handling in the back-end is
1160	unwind-protect a la elisp. This is a general, robust, and language
1161	independent representation for exceptions.
1162
The C++ front-end maps C++ exceptions into the unwind-protect
semantics. The mapping is described below.
1165
When -frtti is used, rtti is used to do exception object type checking;
when it isn't used, the encoded name for the type of the object being
thrown is used instead. All code that originates exceptions, even code
that throws exceptions as a side effect, like dynamic casting, and all
code that catches exceptions must be compiled consistently with either
-frtti or -fno-rtti. It is not possible to mix rtti-based exception
handling objects with code that doesn't use rtti. The exceptions to
this are code that doesn't catch or throw exceptions, catch (...), and
code that just rethrows an exception.
1175
Currently we use the normal mangling used in building function names
(int is "i", const char * is "PCc") to build the non-rtti-based type
descriptors for exception handling. These descriptors are just plain
NUL-terminated strings, and internally they are passed around as char
*.
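
Since the descriptors are plain strings, an exact type match comes down to a string comparison. The helper below is a sketch under that assumption. @code{toy_descriptor_match} is a hypothetical name, not the front-end's @code{throw_type_match}, and "Pc" is assumed here to be the encoding of plain char *.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: descriptors are NUL-terminated strings such as
// "i" for int or "PCc" for const char *, so an exact match is strcmp.
inline bool toy_descriptor_match(const char* catch_type,
                                 const char* thrown_type) {
  return std::strcmp(catch_type, thrown_type) == 0;
}
```

Note that a comparison of this kind cannot see that one type converts to another, so a handler for const char * would not match a thrown char *.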
1181
In C++, all cleanups should be protected by exception regions. The
region starts just after the reason why the cleanup is created has
ended. For example, with an automatic variable that has a constructor,
it would be right after the constructor is run. The region ends just
before the finalization is expanded. Since the backend may expand the
cleanup multiple times along different paths (once for normal end of
the region, once for non-local gotos, once for returns, etc.), the
backend must take special care to protect the finalization expansion if
the expansion is for any reason other than normal region end and it is
`inline' (it is inside the exception region). The backend can either
choose to move them out of line, or it can create an exception region
over the finalization to protect it; the handler associated with that
region would not run the finalization as it otherwise would have, but
would rather just rethrow to the outer handler, taking care to skip the
normal handler for the original region.
1197
In Ada, they will use the more runtime intensive approach of having
fewer regions, but at the cost of additional work at run time, to keep a
list of things that need cleanups. When a variable has finished
construction, they add the cleanup to the list; when they come to the
end of the lifetime of the variable, they run the list down. If they
take a hit before the section finishes normally, they examine the list
for actions to perform. I hope they add this logic into the back-end,
as it would be nice to get that alternative approach in C++.
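
The list-based approach can be sketched in a few lines. This is only an illustration: @code{toy_cleanup_list} is a hypothetical name, and running the cleanups most-recent-first is an assumption made here to mirror destruction order, not something stated above.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical run-time cleanup list for one scope.
class toy_cleanup_list {
 public:
  // Called once an object has finished construction.
  void add(std::function<void()> cleanup) {
    pending_.push_back(std::move(cleanup));
  }
  // Called at normal scope exit, or when an exception passes through:
  // run the pending cleanups, most recently added first.
  void run() {
    while (!pending_.empty()) {
      std::function<void()> cleanup = std::move(pending_.back());
      pending_.pop_back();
      cleanup();
    }
  }
 private:
  std::vector<std::function<void()>> pending_;
};

inline std::string demo_cleanup_order() {
  std::string log;
  toy_cleanup_list list;
  list.add([&log] { log += "a"; });
  list.add([&log] { log += "b"; });
  list.run();
  return log;
}
```

The trade-off is the one described above: there is no region per cleanup, but every construction pays for a list update at run time.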
1206
On an rs6000, xlC stores exception objects on the stack, under the try
block. When it unwinds down into a handler, the frame pointer is
adjusted back to the normal value for the frame in which the handler
resides, and the stack pointer is left unchanged from the time at which
the object was thrown. This is so that there is always someplace for
the exception object, and nothing can overwrite it, once we start
throwing. The only bad part is that the stack remains large.
1214
The following points out some things that work in g++'s exception
handling.

All completely constructed temps and local variables are cleaned up in
all unwound scopes. Completely constructed parts of partially
constructed objects are cleaned up. This includes partially built
arrays. Exception specifications are now handled.
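
The partially built array case can be checked with a test along these lines. The names are made up for the example. The third element's constructor throws, and the two elements that were completely constructed must be destroyed before the handler runs.

```cpp
#include <cassert>
#include <stdexcept>

static int live_elements = 0;
static int constructed_total = 0;

struct Element {
  Element() {
    if (constructed_total == 2)
      throw std::runtime_error("third element fails");
    ++constructed_total;
    ++live_elements;
  }
  ~Element() { --live_elements; }
};

// Returns the number of elements still alive after the failed construction.
inline int demo_partial_array() {
  live_elements = 0;
  constructed_total = 0;
  try {
    Element array[4];
    (void)array;
  } catch (const std::runtime_error&) {
    // The two completed elements have been destroyed by now.
  }
  return live_elements;
}
```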
1221
The following points out some flaws in g++'s exception handling, as it
now stands.

Only exact type matching or reference matching of throw types works when
-fno-rtti is used. Exception handling only works on SPARC (like Suns),
i386, arm and rs6000 machines. Partial support is in for all other
machines, but a stack unwinder called __unwind_function has to be
written, and added to libgcc2 for them. See below for details on
__unwind_function. Don't expect exception handling to work right if
you optimize; in fact, the compiler will probably core dump. RTL_EXPRs
for EH cond variables for && and || exprs should probably be wrapped in
UNSAVE_EXPRs, and RTL_EXPRs tweaked so that they can be unsaved, and the
UNSAVE_EXPR code should be in the backend, or alternatively, UNSAVE_EXPR
should be ripped out and exactly one finalization allowed to be expanded
by the backend. I talked with kenner about this, and we have to allow
multiple expansions.
1238
We do pointer conversions on exception matching only when -frtti is
given, a la 15.3 p2 case 3: `A handler with type T, const T, T&, or
const T& is a match for a throw-expression with an object of type E if
[3]T is a pointer type and E is a pointer type that can be converted to
T by a standard pointer conversion (_conv.ptr_) not involving
conversions to pointers to private or protected base classes.'
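
The rule quoted above can be exercised with a test like the following sketch, where @code{Base}, @code{Derived} and @code{demo_pointer_match} are names invented for the example. A @code{Derived*} is thrown and a handler for @code{Base*} is expected to catch it, because the conversion is a standard pointer conversion to a public base.

```cpp
#include <cassert>

struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Throws a Derived* and reports whether a Base* handler caught it.
inline bool demo_pointer_match() {
  static Derived object;
  bool matched = false;
  try {
    throw &object;            // E is Derived*
  } catch (Base* caught) {    // T is Base*
    matched = (caught == &object);
  } catch (...) {
    // Reached only if the pointer conversion was not applied.
  }
  return matched;
}
```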
1245
1246	We don't call delete on new expressions that die because the ctor threw
1247	an exception. See except/18 for a test case.
1248
1249	15.2 para 13: The exception being handled should be rethrown if control
1250	reaches the end of a handler of the function-try-block of a constructor
or destructor; right now, it is not.
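
A test for the required behavior might look like the sketch below. The names are hypothetical. Control reaches the end of the handler of the constructor's function-try-block, so the exception is expected to propagate to the caller.

```cpp
#include <cassert>
#include <stdexcept>

struct Member {
  Member() { throw std::runtime_error("member failed"); }
};

struct Holder {
  Member m;
  Holder() try : m() {
  } catch (const std::runtime_error&) {
    // Falling off the end here should rethrow the exception.
  }
};

// Returns true if the exception escaped the constructor's handler.
inline bool demo_ctor_rethrow() {
  try {
    Holder h;
    (void)h;
  } catch (const std::runtime_error&) {
    return true;
  }
  return false;
}
```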
1252
1253	15.2 para 12: If a return statement appears in a handler of
1254	function-try-block of a constructor, the program is ill-formed, but this
1255	isn't diagnosed.
1256
1257	15.2 para 11: If the handlers of a function-try-block contain a jump
1258	into the body of a constructor or destructor, the program is ill-formed,
1259	but this isn't diagnosed.
1260
1261	15.2 para 9: Check that the fully constructed base classes and members
1262	of an object are destroyed before entering the handler of a
1263	function-try-block of a constructor or destructor for that object.
1264
1265	build_exception_variant should sort the incoming list, so that it
1266	implements set compares, not exact list equality. Type smashing should
1267	smash exception specifications using set union.
1268
1269	Thrown objects are usually allocated on the heap, in the usual way, but
1270	they are never deleted. They should be deleted by the catch clauses.
1271	If one runs out of heap space, throwing an object will probably never
1272	work. This could be relaxed some by passing an __in_chrg parameter to
1273	track who has control over the exception object. Thrown objects are not
1274	allocated on the heap when they are pointer to object types.
1275
1276	When the backend returns a value, it can create new exception regions
1277	that need protecting. The new region should rethrow the object in
1278	context of the last associated cleanup that ran to completion.
1279
The structure of the code that is generated for C++ exception handling
is shown below:
1282
@example
Ln:                                     throw value;
        copy value onto heap
        jump throw (Ln, id, address of copy of value on heap)

                                        try @{
+Lstart:        the start of the main EH region
|...                                    ...
+Lend:          the end of the main EH region
                                        @} catch (T o) @{
                                                ...1
                                        @}
Lresume:
        nop     used to make sure there is something before
                the next region ends, if there is one
...                                     ...

        jump Ldone
[
Lmainhandler:   handler for the region Lstart-Lend
        cleanup
] zero or more, depending upon automatic vars with dtors
+Lpartial:
|       jump Lover
+Lhere:
        rethrow (Lhere, same id, same obj);
Lterm:          handler for the region Lpartial-Lhere
        call terminate
Lover:
[
 [
        call throw_type_match
        if (eq) @{
 ] these lines disappear when there is no catch condition
+Lsregion2:
|       ...1
|       jump Lresume
|Lhandler:      handler for the region Lsregion2-Leregion2
|       rethrow (Lresume, same id, same obj);
+Leregion2
        @}
] there are zero or more of these sections, depending upon how many
  catch clauses there are
----------------------------- expand_end_all_catch --------------------------
                here we have fallen off the end of all catch
                clauses, so we rethrow to outer
        rethrow (Lresume, same id, same obj);
----------------------------- expand_end_all_catch --------------------------
[
L1:     maybe throw routine
] depending upon if we have expanded it or not
Ldone:
        ret

start_all_catch emits labels: Lresume,

@end example
1340
The __unwind_function takes a pointer to the throw handler, and is
expected to pop the stack frame that was built to call it, as well as
the frame underneath and then jump to the throw handler. It must
restore all registers to their proper values as well as all other
machine state as determined by the context into which we are
unwinding. The way I normally start is to compile:

@example
void *g;
foo(void* a) @{ g = a; @}
@end example

@noindent
with -S, and change the thing that alters the PC (return, or ret
usually) to not alter the PC, making sure to leave all other semantics
(like adjusting the stack pointer, or frame pointers) in. After that,
replicate the prologue once more at the end, again changing the PC
altering instructions, and finally, at the very end, jump to `g'.
1356
It takes about a week to write this routine. If someone wants to
volunteer to write it for any architecture, exception support for that
architecture will be added to g++. Please send in those code
donations. One other thing that needs to be done is to double check
that __builtin_return_address (0) works.
1362
1363	@subsection Specific Targets
1364
1365	For the alpha, the __unwind_function will be something resembling:
1366
1367	@example
1368	void
1369	__unwind_function(void *ptr)
1370	@{
1371	/* First frame */
1372	asm ("ldq $15, 8($30)"); /* get the saved frame ptr; 15 is fp, 30 is sp */
1373	asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
1374
1375	/* Second frame */
1376	asm ("ldq $15, 8($30)"); /* fp */
1377	asm ("bis $15, $15, $30"); /* reload sp with the fp we found */
1378
1379	/* Return */
1380	asm ("ret $31, ($16), 1"); /* return to PTR, stored in a0 */
1381	@}
1382	@end example
1383
1384	@noindent
1385	However, there are a few problems preventing it from working. First of
1386	all, the gcc-internal function @code{__builtin_return_address} needs to
1387	work given an argument of 0 for the alpha. As it stands as of August
1388	30th, 1995, the code for @code{BUILT_IN_RETURN_ADDRESS} in @file{expr.c}
1389	will definitely not work on the alpha. Instead, we need to define
1390	the macros @code{DYNAMIC_CHAIN_ADDRESS} (maybe),
1391	@code{RETURN_ADDR_IN_PREVIOUS_FRAME}, and definitely need a new
1392	definition for @code{RETURN_ADDR_RTX}.
1393
1394	In addition (and more importantly), we need a way to reliably find the
1395	frame pointer on the alpha. The use of the value 8 above to restore the
1396	frame pointer (register 15) is incorrect. On many systems, the frame
1397	pointer is consistently offset to a specific point on the stack. On the
1398	alpha, however, the frame pointer is pushed last. First the return
1399	address is stored, then any other registers are saved (e.g., @code{s0}),
1400	and finally the frame pointer is put in place. So @code{fp} could have
1401	an offset of 8, but if the calling function saved any registers at all,
1402	they add to the offset.
1403
1404	The only places the frame size is noted are with the @samp{.frame}
1405	directive, for use by the debugger and the OSF exception handling model
1406	(useless to us), and in the initial computation of the new value for
1407	@code{sp}, the stack pointer. For example, the function may start with:
1408
1409	@example
1410	lda $30,-32($30)
1411	.frame $15,32,$26,0
1412	@end example
1413
1414	@noindent
The 32 above is exactly the value we need. With this, we can be sure
that the frame pointer is stored 8 bytes less, in this case at 24(sp).
1417	The drawback is that there is no way that I (Brendan) have found to let
1418	us discover the size of a previous frame @emph{inside} the definition
1419	of @code{__unwind_function}.
1420
1421	So to accomplish exception handling support on the alpha, we need two
1422	things: first, a way to figure out where the frame pointer was stored,
1423	and second, a functional @code{__builtin_return_address} implementation
1424	for except.c to be able to use it.
1425
1426	@subsection Backend Exception Support
1427
The backend must be extended to fully support exceptions. Right now
there are a few hooks from the backend into the alpha exception
handling backend that resides in the C++ frontend, and these allow
exception handling to work in g++. An exception region is a segment of
generated code that has a handler associated with it. The exception
regions are recorded in the generated code as address ranges, each
given by a starting PC value and an ending PC value. Some of the
limitations with this scheme are:
1436
1437	@itemize @bullet
1438	@item
1439	The backend replicates insns for such things as loop unrolling and
1440	function inlining. Right now, there are no hooks into the frontend's
1441	exception handling backend to handle the replication of insns. When
1442	replication happens, a new exception region descriptor needs to be
1443	generated for the new region.
1444
1445	@item
1446	The backend expects to be able to rearrange code, for things like jump
optimization. Any rearranging of the code needs to have exception region
1448	descriptors updated appropriately.
1449
1450	@item
1451	The backend can eliminate dead code. Any associated exception region
1452	descriptor that refers to fully contained code that has been eliminated
1453	should also be removed, although not doing this is harmless in terms of
1454	semantics.
1455
@end itemize
1457
1458	The above is not meant to be exhaustive, but does include all things I
1459	have thought of so far. I am sure other limitations exist.
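
The address-range representation of exception regions described above can be pictured as a table lookup. The following is a sketch only: @code{toy_region} and @code{toy_find_handler} are hypothetical names, the ranges are assumed to be half-open, and picking the narrowest enclosing range for nested regions is an assumption rather than a description of the code in cp/except.c.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical descriptor: a PC range [start, end) and its handler.
struct toy_region {
  std::uintptr_t start;
  std::uintptr_t end;
  std::uintptr_t handler;
};

// Returns the handler of the narrowest region containing pc, or 0 if
// no region covers pc.
inline std::uintptr_t toy_find_handler(const std::vector<toy_region>& table,
                                       std::uintptr_t pc) {
  const toy_region* best = nullptr;
  for (const toy_region& r : table) {
    if (pc < r.start || pc >= r.end) continue;
    if (best == nullptr || r.end - r.start < best->end - best->start)
      best = &r;
  }
  return best ? best->handler : 0;
}
```

Seen this way, the limitations listed above are about keeping such a table in step with the insns: replicated code needs new entries, moved code needs updated ranges, and deleted code leaves entries that no PC can ever match.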
1460
1461	Below are some notes on the migration of the exception handling code
1462	backend from the C++ frontend to the backend.
1463
NOTEs are to be used to denote the start of an exception region, and the
end of the region. I presume that the interface used to generate these
notes in the backend would be two functions, start_exception_region and
end_exception_region (or something like that). The frontends are
required to call them in pairs. When marking the end of a region, an
argument can be passed to indicate the handler for the marked region.
This can be passed in many ways; currently a tree is used. Another
possibility would be insns for the handler, or a label that denotes a
handler. I have a feeling insns might be the best way to pass it.
The semantics are: if an exception is thrown inside the region, control
is transferred unconditionally to the handler. If control passes
through the handler, then the backend is to rethrow the exception, in
the context of the end of the original region. The handler is protected
by the conventional mechanisms; it is the frontend's responsibility to
protect the handler, if special semantics are required.
1479
This is a very low level view, and it would be nice if the backend
supported a somewhat higher level view in addition to this view. This
higher level could include source line number, name of the source file,
name of the language that threw the exception and possibly the name of
the exception. Kenner may want to rope you into doing more than just
the basics required by C++. You will have to resolve this. He may want
you to do support for non-local gotos, first scan for exception handler,
if none is found, allow the debugger to be entered, without any cleanups
being done. To do this, the backend would have to know the difference
between a cleanup-rethrower and a real handler; it would also have to
have a way to know if a handler `matches' a thrown exception, and this
is frontend specific.
1492
The UNSAVE_EXPR tree code has to be migrated to the backend. Exprs such
as TARGET_EXPRs, WITH_CLEANUP_EXPRs, CALL_EXPRs and RTL_EXPRs have to be
changed to support unsaving. This is meant to be a complete list.
SAVE_EXPRs can be unsaved already. expand_decl_cleanup should be
changed to unsave its argument, if needed. See
cp/tree.c:cp_expand_decl_cleanup, unsave_expr_now, unsave_expr, and
cp/expr.c:cplus_expand_expr(case UNSAVE_EXPR:) for the UNSAVE_EXPR code.
Now, as to why: kenner already tripped over the exact same problem in
Ada. We talked about it; he didn't like any of the solutions, but
didn't like no solution either. He was willing to live with the
drawbacks of this solution. The drawback is unsave_expr_now. It
should have a callback into the frontend, to allow the unsaving of
frontend special codes. The callback goes in place of the call to
my_friendly_abort.
1507
1508	The stack unwinder is one of the hardest parts to do. It is highly
1509	machine dependent. The form that kenner seems to like was a couple of
1510	macros, that would do the machine dependent grunt work. One preexisting
1511	function that might be of some use is __builtin_return_address (). One
1512	macro he seemed to want was __builtin_return_address, and the other
1513	would do the hard work of fixing up the registers, adjusting the stack
1514	pointer, frame pointer, arg pointer and so on.
1515
The eh archive (~mrs/eh) might be good reading for understanding the Ada
perspective, and some of kenner's mindset, and a detailed explanation
(Message-Id: <9308301130.AA10543@@vlsi1.ultra.nyu.edu>) of the concepts
involved.
1520
Here is a guide to existing backend type code. It is all in
cp/except.c.

@table @code
@item do_unwind
@itemx expand_builtin_throw
Current code on how to figure out what handler matches an exception.

@item emit_exception_table
Emits the PC range table that is built during compilation.

@item expand_exception_blocks
Emits all the handlers at the end of a function.

@item end_protect
Marks the end of an exception region.

@item start_protect
Marks the start of an exception region.

@item lang_interim_eh
The master hook used by the backend into the EH backend that now exists
in the frontend.

@item expand_internal_throw
Raises an exception.
@end table
1531
1532
1533	@node Free Store, Concept Index, Exception Handling, Top
1534	@section Free Store
1535
1536	operator new [] adds a magic cookie to the beginning of arrays for which
1537	the number of elements will be needed by operator delete []. These are
1538	arrays of objects with destructors and arrays of objects that define
1539	operator delete [] with the optional size_t argument. This cookie can
1540	be examined from a program as follows:
1541
@example
typedef unsigned long size_t;
extern "C" int printf (const char *, ...);

size_t nelts (void *p)
@{
  struct cookie @{
    size_t nelts __attribute__ ((aligned (sizeof (double))));
  @};

  /* The cookie sits immediately before the first array element.  */
  cookie *cp = (cookie *)p;
  --cp;

  return cp->nelts;
@}

struct A @{
  ~A() @{ @}
@};

int main()
@{
  A *ap = new A[3];
  printf ("%lu\n", nelts (ap));
  delete [] ap;
  return 0;
@}
@end example
1568
1569	@section Linkage
1570	The linkage code in g++ is horribly twisted in order to meet two design goals:
1571
1572	1) Avoid unnecessary emission of inlines and vtables.
1573
1574	2) Support pedantic assemblers like the one in AIX.
1575
1576	To meet the first goal, we defer emission of inlines and vtables until
1577	the end of the translation unit, where we can decide whether or not they
1578	are needed, and how to emit them if they are.
1579
1580	@node Concept Index, , Free Store, Top
1581	@section Concept Index
1582
1583	@printindex cp
1584
1585	@bye
