1 | .\" |
---|
2 | .\" $Id: fields.3,v 1.1.1.1 1997-09-03 21:08:08 ghudson Exp $ |
---|
3 | .\" |
---|
4 | .\" $Log: not supported by cvs2svn $ |
---|
5 | .\" Revision 1.3 1994/01/05 20:13:43 geoff |
---|
6 | .\" Add the maxf parameter |
---|
7 | .\" |
---|
8 | .\" Revision 1.2 1994/01/04 02:40:16 geoff |
---|
9 | .\" Add descriptions of field_line_inc, field_field_inc, and the |
---|
10 | .\" FLD_NOSHRINK flag. |
---|
11 | .\" |
---|
12 | .\" Revision 1.1 1993/09/09 01:09:44 geoff |
---|
13 | .\" Initial revision |
---|
14 | .\" |
---|
15 | .\" |
---|
16 | .TH FIELDS 3 local |
---|
17 | .SH NAME |
---|
18 | fieldread, fieldmake, fieldwrite, fieldfree \- field access package |
---|
19 | .SH SYNTAX |
---|
20 | .nf |
---|
21 | #include "fields.h" |
---|
22 | |
---|
23 | typedef struct { |
---|
24 | int nfields; |
---|
25 | int hadnl; |
---|
26 | char *linebuf; |
---|
27 | char **fields; |
---|
28 | } field_t; |
---|
29 | |
---|
30 | #define FLD_RUNS 0x0001 |
---|
31 | #define FLD_SNGLQUOTES 0x0002 |
---|
32 | #define FLD_BACKQUOTES 0x0004 |
---|
33 | #define FLD_DBLQUOTES 0x0008 |
---|
34 | #define FLD_SHQUOTES 0x0010 |
---|
35 | #define FLD_STRIPQUOTES 0x0020 |
---|
36 | #define FLD_BACKSLASH 0x0040 |
---|
37 | |
---|
38 | extern field_t *fieldread (FILE * file, char * delims, |
---|
39 | int flags, int maxf); |
---|
40 | extern field_t *fieldmake (char * line, int allocated, |
---|
41 | char * delims, int flags, int maxf); |
---|
42 | extern int fieldwrite (FILE * file, field_t * fieldp, |
---|
43 | int delim); |
---|
44 | extern void fieldfree (field_t * fieldp); |
---|
45 | |
---|
46 | extern unsigned int field_line_inc; |
---|
47 | extern unsigned int field_field_inc; |
---|
48 | .fi |
---|
49 | .SH DESCRIPTION |
---|
50 | .PP |
---|
51 | The fields access package eases the common task of parsing and |
---|
52 | accessing information which is separated into fields by whitespace or |
---|
53 | other delimiters. Various options can be specified to handle many |
---|
54 | common cases, including selectable delimiters, runs of delimiters, and |
---|
55 | quoting. |
---|
56 | .PP |
---|
57 | .I fieldread |
---|
58 | reads one line from a file, parses it into fields as specified by the |
---|
59 | parameters, and returns a |
---|
60 | .B field_t |
---|
61 | structure describing the result. |
---|
62 | .I fieldmake |
---|
63 | performs the same process on a buffer already in memory. |
---|
64 | .I fieldwrite |
---|
65 | creates an output line from a |
---|
66 | .B field_t |
---|
67 | structure and writes it to an output file. |
---|
68 | .I fieldfree |
---|
69 | frees a |
---|
70 | .B field_t |
---|
71 | structure and any associated memory allocated by the package. |
---|
72 | .PP |
---|
73 | The |
---|
74 | .B field_t |
---|
75 | structure describes the fields in a parsed line. |
---|
76 | A well-behaved program should only access the |
---|
77 | .BR nfields , |
---|
78 | .BR fields , |
---|
79 | and |
---|
80 | .B hadnl |
---|
81 | elements; |
---|
82 | all other elements are used internally by the package and are not |
---|
83 | guaranteed to remain the same even though they are documented here. |
---|
84 | .B Nfields |
---|
85 | gives the number of fields in the parsed line, just like the |
---|
86 | .B argc |
---|
87 | argument to a C program; |
---|
88 | .B fields |
---|
89 | is a pointer to an array of string pointers, just like the |
---|
90 | .B argv |
---|
91 | argument to a C program. |
---|
92 | As in C, the last field pointer is followed by a null pointer, |
---|
93 | although the field count is the preferred method of accessing fields. |
---|
94 | The user may alter |
---|
95 | .B nfields |
---|
96 | by decreasing it, and may replace any pointer in |
---|
97 | .B fields |
---|
98 | without harm. |
---|
99 | This is often useful in replacing a single field with a calculated |
---|
100 | value preparatory to output. |
---|
101 | The |
---|
102 | .B hadnl |
---|
103 | element is nonzero if the original line was terminated with a newline |
---|
104 | when it was parsed; |
---|
105 | this is used to accurately reproduce the input when |
---|
106 | .I fieldwrite |
---|
107 | is called. |
---|
108 | .PP |
---|
109 | The |
---|
110 | .B linebuf |
---|
111 | element contains a pointer to an internal buffer allocated by |
---|
112 | .I fieldread |
---|
113 | or provided to |
---|
114 | .IR fieldmake . |
---|
115 | This buffer is |
---|
116 | .I not |
---|
117 | guaranteed to contain anything sensible, although in the current |
---|
118 | implementation all of the field contents can be found therein. |
---|
119 | .PP |
---|
120 | .I fieldread |
---|
121 | reads a single line of arbitrary length from |
---|
122 | .BR file , |
---|
123 | allocating as much memory as necessary to hold it, and then parses the |
---|
124 | line according to its remaining arguments. |
---|
125 | A pointer to the parsed |
---|
126 | .B field_t |
---|
127 | structure is returned, with |
---|
128 | .B NULL |
---|
129 | returned if an error occurs or if |
---|
130 | .B EOF |
---|
131 | is reached on the input file. |
---|
132 | Fields in the input line are considered to be separated by any of the |
---|
133 | delimiters in the |
---|
134 | .B delims |
---|
135 | parameter. |
---|
136 | For example, if delimiters of ":.;" are specified, a line containing |
---|
137 | "a:b;c.d" would be considered to have four fields. |
---|
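The delimiter rule above can be illustrated with a small stand-alone sketch. The helper name count_fields_default is hypothetical and is not part of the fields package; it only mimics the documented default behavior (no flags), in which every delimiter character separates two fields. Treating an empty line as one empty field is an assumption of this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the fields package: counts the fields
 * that the documented default parsing (no flags) would find, where every
 * delimiter character ends one field and starts the next. */
size_t count_fields_default(const char *line, const char *delims)
{
    size_t n = 1;                   /* assumption: an empty line is one empty field */
    const char *p;

    for (p = line; *p != '\0'; p++) {
        if (strchr(delims, *p) != NULL)
            n++;                    /* each delimiter separates two fields */
    }
    return n;
}
```

With delimiters ":.;" the line "a:b;c.d" yields four fields, and with ":" the line "a::b" yields three, because the adjacent delimiters enclose a null field.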
138 | .PP |
---|
139 | The default parsing of fields considers each delimiter to indicate a |
---|
140 | separate field, and does not allow any quoting. This is similar to |
---|
141 | the parsing done by |
---|
142 | .IR cut (1). |
---|
143 | This behavior can be modified by specifying flags. |
---|
144 | Multiple flags may be OR'ed together. |
---|
145 | The available flags are: |
---|
146 | .IP \fBFLD_RUNS\fP |
---|
147 | Consider runs of delimiters to be the same as a single delimiter, |
---|
148 | suppressing all null fields. |
---|
149 | This is similar to the way utilities like |
---|
150 | .IR awk (1) |
---|
151 | and |
---|
152 | .IR sort (1) |
---|
153 | treat whitespace, but it is not limited to whitespace. |
---|
154 | A run does not have to consist of a single type of delimiter; if both |
---|
155 | semicolon and colon are delimiters, ";::;" is a run. |
---|
156 | .IP \fBFLD_SNGLQUOTES\fP |
---|
157 | Allow field contents to be quoted with single quotes. |
---|
158 | Delimiters and other quotes appearing within single quotes are ignored. |
---|
159 | This may appear in combination with other quote options. |
---|
160 | .IP \fBFLD_BACKQUOTES\fP |
---|
161 | Allow field contents to be quoted with reverse single quotes. |
---|
162 | Delimiters and other quotes appearing within reverse single quotes are ignored. |
---|
163 | This may appear in combination with other quote options. |
---|
164 | .IP \fBFLD_DBLQUOTES\fP |
---|
165 | Allow field contents to be quoted with double quotes. |
---|
166 | Delimiters and other quotes appearing within double quotes are ignored. |
---|
167 | This may appear in combination with other quote options. |
---|
168 | .IP \fBFLD_SHQUOTES\fP |
---|
169 | Allow shell-style quoting. |
---|
170 | In the absence of this option, quotes are only recognized at the |
---|
171 | beginning of a field, and characters following the close quote are |
---|
172 | removed from the field (and are thus lost from the input line). |
---|
173 | If this option is specified, quotes may appear within a field, in the |
---|
174 | same way as they are handled by |
---|
175 | .IR sh (1). |
---|
176 | Multiple quoting styles may be used in the same field. |
---|
177 | If none of |
---|
178 | .BR FLD_SNGLQUOTES , |
---|
179 | .BR FLD_BACKQUOTES , |
---|
180 | or |
---|
181 | .B FLD_DBLQUOTES |
---|
182 | is specified with |
---|
183 | .BR FLD_SHQUOTES , |
---|
184 | all three options are implied. |
---|
185 | .IP \fBFLD_STRIPQUOTES\fP |
---|
186 | Remove quotes and backslash sequences from the field while parsing, |
---|
187 | converting backslash sequences to their proper ASCII equivalent. |
---|
188 | The C sequences \ea, \eb, \ef, \en, \er, \ev, \ex\fInn\fP, and \e\fInnn\fP are |
---|
189 | supported. |
---|
190 | Any other sequence is simply converted to the backslashed character, |
---|
191 | as in |
---|
192 | .IR sh (1). |
---|
193 | .IP \fBFLD_BACKSLASH\fP |
---|
194 | Accept standard C-style backslash sequences. |
---|
195 | The sequence will be converted to an ASCII equivalent if |
---|
196 | .B FLD_STRIPQUOTES |
---|
197 | is specified (q.v.). |
---|
198 | .IP \fBFLD_NOSHRINK\fP |
---|
199 | Don't shrink allocated memory using |
---|
200 | .IR realloc (3) |
---|
201 | before returning. |
---|
202 | This option can have a significant effect on performance, especially |
---|
203 | when |
---|
204 | .I fieldfree |
---|
205 | is going to be called soon after |
---|
206 | .I fieldread |
---|
207 | or |
---|
208 | .IR fieldmake . |
---|
209 | The disadvantage is that slightly more memory will be occupied until |
---|
210 | the field structure is freed. |
---|
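As a rough illustration of the FLD_RUNS behavior described in the flag list, the following sketch counts fields when any run of delimiters acts as a single separator. The name count_fields_runs is hypothetical and the code is not taken from the package. Ignoring leading and trailing runs is an assumption here, based on the statement that all null fields are suppressed.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the fields package: counts fields when
 * a run of delimiters (possibly of mixed kinds) acts as one separator. */
size_t count_fields_runs(const char *line, const char *delims)
{
    size_t n = 0;
    const char *p = line;

    for (;;) {
        p += strspn(p, delims);     /* skip a run of mixed delimiters */
        if (*p == '\0')
            break;                  /* nothing but delimiters remained */
        n++;                        /* found the start of a non-null field */
        p += strcspn(p, delims);    /* skip the body of that field */
    }
    return n;
}
```

With ";" and ":" as delimiters, "a;::;b" counts as two fields, since ";::;" is a single run.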
211 | .PP |
---|
212 | The |
---|
213 | .I maxf |
---|
214 | parameter, if nonzero, specifies the maximum number of fields to be |
---|
215 | generated. |
---|
216 | This may enhance performance if only the first few fields of a long |
---|
217 | line are of interest to the caller. |
---|
218 | The actual number of fields returned is one greater than |
---|
219 | .IR maxf , |
---|
220 | because the remainder of the line will be returned as a single |
---|
221 | contiguous (and uninterpreted, even if |
---|
222 | .B FLD_STRIPQUOTES |
---|
223 | or |
---|
224 | .B FLD_BACKSLASH |
---|
225 | is specified) field. |
---|
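The effect of maxf can be sketched with a simplified in-place splitter. The function split_maxf and its out/outcap parameters are hypothetical and exist only for this illustration. The sketch handles plain delimiters with no flags and no quoting, and it is not the package's implementation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, NOT part of the fields package.  Splits line in
 * place on any character in delims.  If maxf is nonzero, at most maxf
 * fields are split off and the rest of the line is stored as one further,
 * uninterpreted field.  Stores at most outcap pointers; anything beyond
 * that is dropped.  Returns the number of pointers stored. */
size_t split_maxf(char *line, const char *delims, size_t maxf,
                  char **out, size_t outcap)
{
    size_t n = 0;
    char *p = line;

    while (n < outcap) {
        out[n++] = p;
        if (maxf != 0 && n > maxf)
            break;                  /* this entry is the unsplit remainder */
        p += strcspn(p, delims);    /* advance to the end of the field */
        if (*p == '\0')
            break;                  /* end of line: no more fields */
        *p++ = '\0';                /* terminate the field, step past delimiter */
    }
    return n;
}
```

Splitting "a:b:c:d" on ":" with maxf set to 2 stores three pointers: "a", "b", and the remainder "c:d".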
226 | .PP |
---|
227 | .I fieldmake |
---|
228 | operates exactly like |
---|
229 | .IR fieldread , |
---|
230 | except that the line parsed is provided by the caller rather than |
---|
231 | being read from a file. |
---|
232 | If the |
---|
233 | .I allocated |
---|
234 | parameter is nonzero, the memory pointed to by the |
---|
235 | .I line |
---|
236 | parameter will automatically be freed when |
---|
237 | .I fieldfree |
---|
238 | is called; |
---|
239 | otherwise this memory is the caller's responsibility. |
---|
240 | The memory pointed to by |
---|
241 | .I line |
---|
242 | is destroyed by |
---|
243 | .IR fieldmake . |
---|
244 | All other parameters are the same as for |
---|
245 | .IR fieldread . |
---|
246 | .PP |
---|
247 | .I fieldwrite |
---|
248 | writes a set of fields to the specified |
---|
249 | .IR file , |
---|
250 | separating them with the delimiter character |
---|
251 | .I delim |
---|
252 | (note that this is a character, not a string), and appending a newline |
---|
253 | if specified by the |
---|
254 | .I hadnl |
---|
255 | element of the structure. |
---|
256 | The field structure is not freed. |
---|
257 | .I fieldwrite |
---|
258 | will return nonzero if an I/O error is detected. |
---|
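The output format described above can be approximated by the following sketch. The function write_fields is a hypothetical stand-in that takes the relevant structure members as separate arguments. Whether the real fieldwrite re-quotes field contents is not stated here, so this sketch writes the fields verbatim.

```c
#include <stdio.h>

/* Hypothetical stand-in, NOT the package's fieldwrite: writes nfields
 * strings separated by the single character delim, followed by a newline
 * when hadnl is nonzero.  Returns nonzero on an output error. */
int write_fields(FILE *file, char *const *fields, int nfields,
                 int delim, int hadnl)
{
    int i;

    for (i = 0; i < nfields; i++) {
        if (i > 0 && putc(delim, file) == EOF)
            return 1;               /* error writing the delimiter */
        if (fputs(fields[i], file) == EOF)
            return 1;               /* error writing the field text */
    }
    if (hadnl && putc('\n', file) == EOF)
        return 1;                   /* error writing the trailing newline */
    return 0;
}
```

Writing the fields "a", "b", and "c" with a delim of ':' and a nonzero hadnl produces the line "a:b:c" followed by a newline.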
259 | .PP |
---|
260 | .I fieldfree |
---|
261 | frees the |
---|
262 | .B field_t |
---|
263 | structure passed to it, along with any associated auxiliary memory |
---|
264 | allocated by the package (or passed to |
---|
265 | .IR fieldmake ). |
---|
266 | The structure may not be accessed after |
---|
267 | .I fieldfree |
---|
268 | is called. |
---|
269 | .PP |
---|
270 | .B field_line_inc |
---|
271 | (default 512) and |
---|
272 | .B field_field_inc |
---|
273 | (default 20) describe the increments to use when expanding lines as |
---|
274 | they are read in and parsed. |
---|
275 | .I fieldread |
---|
276 | initially allocates a buffer of |
---|
277 | .B field_line_inc |
---|
278 | bytes and, if the input line is larger than that, expands the buffer |
---|
279 | in increments of the same amount until it is large enough. |
---|
280 | If input lines are known to consistently reach a certain size, |
---|
281 | performance will be improved by setting |
---|
282 | .B field_line_inc |
---|
283 | to a value larger than that size (larger because there must be room |
---|
284 | for a null byte). |
---|
285 | .B field_field_inc |
---|
286 | serves the same purpose in both |
---|
287 | .I fieldread |
---|
288 | and |
---|
289 | .IR fieldmake , |
---|
290 | except that it is related to the number of fields in the line rather |
---|
291 | than to the line length. |
---|
292 | If the number of fields is known, performance will be improved by |
---|
293 | setting |
---|
294 | .B field_field_inc |
---|
295 | to at least one more than that number. |
---|
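The growth strategy described for field_line_inc can be sketched as follows. The function read_line_inc and its cap_out parameter are hypothetical. The sketch takes the increment as an argument instead of reading the global variable, and it keeps the newline in the buffer, which is an assumption and not documented behavior.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical sketch, NOT the package's code: reads one line of any
 * length, starting with a buffer of inc bytes and growing it by inc bytes
 * whenever it fills.  Returns a malloc'ed, null-terminated line, or NULL
 * on EOF or allocation failure.  *cap_out receives the final capacity. */
char *read_line_inc(FILE *file, size_t inc, size_t *cap_out)
{
    size_t cap = inc, len = 0;
    char *buf;
    int c;

    if (inc == 0 || (buf = malloc(cap)) == NULL)
        return NULL;
    while ((c = getc(file)) != EOF) {
        if (len + 1 >= cap) {       /* always keep room for the null byte */
            char *nb = realloc(buf, cap + inc);
            if (nb == NULL) {
                free(buf);
                return NULL;
            }
            buf = nb;
            cap += inc;
        }
        buf[len++] = (char)c;
        if (c == '\n')
            break;                  /* end of line reached */
    }
    if (len == 0 && c == EOF) {     /* nothing read: end of file */
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    if (cap_out != NULL)
        *cap_out = cap;
    return buf;
}
```

A 12-byte line read with an increment of 8 needs one expansion (to 16 bytes) to hold the text plus its null byte. An increment of 13 or more would avoid the reallocation entirely, which is the tuning advice given above.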
296 | .SH RETURN VALUES |
---|
297 | .I fieldread |
---|
298 | and |
---|
299 | .I fieldmake |
---|
300 | return |
---|
301 | .B NULL |
---|
302 | if an error occurs or if |
---|
303 | .B EOF |
---|
304 | is reached on the input file. |
---|
305 | .I fieldwrite |
---|
306 | returns nonzero if an output error occurs. |
---|
307 | .SH BUGS |
---|
308 | Thanks to the vagaries of ANSI C, the |
---|
309 | .B fields.h |
---|
310 | header file defines an auxiliary macro named |
---|
311 | .BR P . |
---|
312 | If the user needs a similarly-named macro, this macro must be |
---|
313 | undefined first, and the user's macro must be defined after |
---|
314 | .B fields.h |
---|
315 | is included. |
---|