.\"
.\" $Id: ispell.1X,v 1.1.1.1 1997-09-03 21:08:11 ghudson Exp $
.\"
.\" Copyright 1992, 1993, Geoff Kuenning, Granada Hills, CA
.\" All rights reserved.
.\"
.\" Redistribution and use in source and binary forms, with or without
.\" modification, are permitted provided that the following conditions
.\" are met:
.\"
.\" 1. Redistributions of source code must retain the above copyright
.\" notice, this list of conditions and the following disclaimer.
.\" 2. Redistributions in binary form must reproduce the above copyright
.\" notice, this list of conditions and the following disclaimer in the
.\" documentation and/or other materials provided with the distribution.
.\" 3. All modifications to the source code must be clearly marked as
.\" such. Binary redistributions based on modified source code
.\" must be clearly marked as modified versions in the documentation
.\" and/or other materials provided with the distribution.
.\" 4. All advertising materials mentioning features or use of this software
.\" must display the following acknowledgment:
.\" This product includes software developed by Geoff Kuenning and
.\" other unpaid contributors.
.\" 5. The name of Geoff Kuenning may not be used to endorse or promote
.\" products derived from this software without specific prior
.\" written permission.
.\"
.\" THIS SOFTWARE IS PROVIDED BY GEOFF KUENNING AND CONTRIBUTORS ``AS IS'' AND
.\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
.\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
.\" ARE DISCLAIMED. IN NO EVENT SHALL GEOFF KUENNING OR CONTRIBUTORS BE LIABLE
.\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
.\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
.\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
.\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
.\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
.\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
.\" SUCH DAMAGE.
.\"
.\" $Log: not supported by cvs2svn $
.\" Revision 1.80 1995/01/08 23:23:31 geoff
.\" Document the new personal-dictionary behavior (dictionary named after
.\" the hash file is preferred).
.\"
.\" Revision 1.79 1994/10/25 05:46:02 geoff
.\" Document the new DICTIONARY variable, and improve the documentation of
.\" the -d flag.
.\"
.\" Revision 1.78 1994/09/16 05:06:58 geoff
.\" Make it clear that the + command doesn't change the string-character
.\" type.
.\"
.\" Revision 1.77 1994/04/27 01:50:35 geoff
.\" Remove the bug about the tex parser getting confused by \endxxx.
.\"
.\" Revision 1.76 1994/03/21 01:54:08 geoff
.\" Document the '&' command in -a mode.
.\"
.\" Revision 1.75 1994/03/15 06:24:26 geoff
.\" Document the changes to the +/-/~ commands and the -T switch.
.\"
.\" Revision 1.74 1994/01/25 07:11:39 geoff
.\" Get rid of all old RCS log lines in preparation for the 3.1 release.
.\"
.\"
.TH ISPELL 1 local
.SH NAME
ispell, buildhash, munchlist, findaffix, tryaffix, icombine, ijoin \- Interactive
spelling checking
.SH SYNOPSIS
.B ispell
.RI [ common-flags ]
.RB [ \-M | \-N ]
.RB [ \-L \fIcontext\fP ]
.RB [ \-V ]
files
.br
.B ispell
.RI [ common-flags ]
.B \-l
.br
.B ispell
.RI [ common-flags ]
.RB [ \-f
.IR file ]
.RB [ \-s ]
.RB { \-a | \-A }
.br
.B ispell
.RB [ \-d
.IR file ]
.RB [ \-w
.IR chars ]
.B \-c
.br
.B ispell
.RB [ \-d
.IR file ]
.RB [ \-w
.IR chars ]
.BR \-e [ e ]
.br
.B ispell
.RB [ \-d
.IR file ]
.B \-D
.br
.B ispell
.BR \-v [ v ]
.IP \fIcommon-flags\fP:
.RB [ \-t ]
.RB [ \-n ]
.RB [ \-b ]
.RB [ \-x ]
.RB [ \-B ]
.RB [ \-C ]
.RB [ \-P ]
.RB [ \-m ]
.RB [ \-S ]
.RB [ \-d
.IR file ]
.RB [ \-p
.IR file ]
.RB [ \-w
.IR chars ]
.RB [ \-W
.IR n ]
.RB [ \-T
.IR type ]
.PP
.B buildhash
.RB [ \-s ]
.I
dict-file affix-file hash-file
.br
.B buildhash
.B \-s
.I
count affix-file
.if n .TP 10
.if t .PP
.B munchlist
.RB [ \-l
.IR aff-file ]
.RB [ \-c
.IR conv-file ]
.RB [ \-T
.IR suffix ]
.if n .br
.RB [ \-s
.IR hash-file ]
.RB [ \-D ]
.RB [ \-v ]
.RB [ \-w
.IR chars ]
.RI [ files ]
.if n .TP 10
.if t .PP
.B findaffix
.RB [ \-p | \-s ]
.RB [ \-f ]
.RB [ \-c ]
.RB [ \-m
.IR min ]
.RB [ \-M
.IR max ]
.RB [ \-e
.IR elim ]
.if n .br
.RB [ \-t
.IR tabchar ]
.RB [ \-l
.IR low ]
.RI [ files ]
.PP
.B tryaffix
.RB [ \-p | \-s ]
.RB [ \-c ]
.I expanded-file
.IR affix [ +addition ]
...
.PP
.B icombine
.RB [ \-T
.IR type ]
.RI [ aff-file ]
.PP
.B ijoin
.RB [ \-s | \-u ]
.I join-options
.I file1
.I file2
.SH DESCRIPTION
.PP
.I Ispell
is fashioned after the
.I spell
program from ITS (called
.I ispell
on Twenex systems.) The most common usage is "ispell filename". In this
case,
.I ispell
will display each word which does not appear in the dictionary at the
top of the screen and allow you to change it. If there are "near
misses" in the dictionary (words which differ by only a single letter, a
missing or extra letter, a pair of transposed letters, or a missing
space or hyphen), then they are
also displayed on following lines.
As well as "near misses", ispell may display other guesses
at ways to make the word from a known root, with each guess preceded
by question marks.
Finally, the line containing the
word and the previous line
are printed at the bottom of the screen. If your terminal can
display in reverse video, the word itself is highlighted. You have the
option of replacing the word completely, or choosing one of the
suggested words. Commands are single characters as follows
(case is ignored):
.PP
.RS
.IP R
Replace the misspelled word completely.
.IP Space
Accept the word this time only.
.IP A
Accept the word for the rest of this
.I ispell
session.
.IP I
Accept the word, capitalized as it is in the
file, and update private dictionary.
.IP U
Accept the word, and add an uncapitalized (actually, all lower-case)
version to the private dictionary.
.IP 0-\fIn\fR
Replace with one of the suggested words.
.IP L
Look up words in system dictionary (controlled by the WORDS
compilation option).
.IP X
Write the rest of this file, ignoring misspellings, and start next file.
.IP Q
Exit immediately and leave the file unchanged.
.IP !
Shell escape.
.IP ^L
Redraw screen.
.IP ^Z
Suspend ispell.
.IP ?
Give help screen.
.RE
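.PP
The "near misses" described above (one wrong letter, one missing or extra
letter, one transposed pair, or a missing space or hyphen) can be
illustrated with a small Python sketch.
This is not the algorithm used by ispell itself; the function name, the
plain set used as a dictionary, and the lower-case-only alphabet are
assumptions made for illustration, and capitalization handling is omitted.

```python
import string


def near_misses(word, dictionary, alphabet=string.ascii_lowercase):
    """Return dictionary entries that are one simple edit away from word."""
    candidates = set()
    for i in range(len(word) + 1):
        head, tail = word[:i], word[i:]
        if tail:
            # The input has an extra letter: drop it.
            candidates.add(head + tail[1:])
            # The input has one wrong letter: substitute it.
            for c in alphabet:
                candidates.add(head + c + tail[1:])
        if len(tail) > 1:
            # Two adjacent letters were transposed.
            candidates.add(head + tail[1] + tail[0] + tail[2:])
        # The input is missing a letter: insert one.
        for c in alphabet:
            candidates.add(head + c + tail)
    candidates.discard(word)
    found = {c for c in candidates if c in dictionary}
    # Missing space or hyphen: both halves must be known words.
    for i in range(1, len(word)):
        head, tail = word[:i], word[i:]
        if head in dictionary and tail in dictionary:
            found.add(head + " " + tail)
            found.add(head + "-" + tail)
    return sorted(found)
```

The result is sorted only to make it reproducible; how ispell orders its
own list is governed by the options described below.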
.PP
If the
.B \-M
switch is specified,
a one-line mini-menu at the bottom of the screen will
summarize these options.
Conversely, the
.B \-N
switch may be used to suppress the mini-menu.
(The mini-menu is displayed by default if
.I ispell
was compiled with the MINIMENU option,
but these two switches will always override the default.)
.PP
If the
.B \-L
flag is given, the specified number is used as the number of
lines of context to be shown at the bottom of the screen.
(The default is to calculate the amount of context as a certain percentage
of the screen size.)
The amount of context is subject to a system-imposed limit.
.PP
If the
.B \-V
flag is given, characters that are not in the 7-bit ANSI printable
character set will always be displayed in the style of "cat -v", even if
.I ispell
thinks that these characters are legal ISO Latin-1 on your system.
This is useful when working with older terminals.
Without this switch,
.I ispell
will display 8-bit characters "as is" if they have been defined as
string characters for the chosen file type.
.PP
"Normal" mode, as well as the
.BR \-l ,
.BR \-a ,
and
.B \-A
options (see below), also
accepts the following "common" flags on the command line:
.RS
.IP \fB\-t\fR
The input file is in TeX or LaTeX format.
.IP \fB\-n\fR
The input file is in nroff/troff format.
.IP \fB\-b\fR
Create a backup file by appending ".bak"
to the name of the input file.
.IP \fB\-x\fR
Don't create a backup file.
.IP \fB\-B\fR
Report run-together words with missing blanks as spelling errors.
.IP \fB\-C\fR
Consider run-together words as legal compounds.
.IP \fB\-P\fR
Don't generate extra root/affix combinations.
.IP \fB\-m\fR
Make possible root/affix combinations that
aren't in the dictionary.
.IP \fB\-S\fR
Sort the list of guesses by probable correctness.
.IP "\fB\-d\fR file"
Specify an alternate dictionary file.
For example, use
.B "\-d deutsch"
to choose a German dictionary in a German installation.
.IP "\fB\-p\fR file"
Specify an alternate personal dictionary.
.IP "\fB\-w\fR chars"
Specify additional characters that can be part of a word.
.IP "\fB\-W\fR n"
Specify length of words that are always legal.
.IP "\fB\-T\fR type"
Assume a given formatter type for all files.
.RE
.PP
The
.B \-n
and
.B \-t
options select whether
.I ispell
runs in nroff/troff
.RB ( \-n )
or TeX/LaTeX
.RB ( \-t )
input mode.
(The default is controlled by the DEFTEXFLAG installation option.)
TeX/LaTeX mode is also automatically selected if an input file has
the extension ".tex", unless overridden by the
.B \-n
switch.
In TeX/LaTeX mode, whenever a backslash ("\e") is found,
.I ispell
will skip to the next whitespace or TeX/LaTeX delimiter. Certain commands
contain arguments which should not be checked, such as labels and reference
keys as are found in the \ecite command, since they contain arbitrary,
non-word arguments. Spell checking is also suppressed when in math mode.
Thus, for example, given
.PP
.RS
\echapter {This is a Ckapter}
\ecite{SCH86}
.RE
.PP
.I ispell
will find "Ckapter" but not "SCH".
The
.B \-t
option does not recognize the TeX comment character "%", so comments are
also spell-checked.
It also assumes
correct LaTeX syntax. Arguments to infrequently used commands and some
optional arguments are sometimes checked unnecessarily.
The bibliography will not be checked if
.I ispell
was compiled with
.B IGNOREBIB
defined. Otherwise, the bibliography will be checked but the reference
key will not.
.PP
References for the
.IR tib (1)
bibliography system, that is, text between a ``[.'' or ``<.'' and
``.]'' or ``.>'' will always be ignored in TeX/LaTeX mode.
.PP
The
.B \-b
and
.B \-x
options control whether
.I ispell
leaves a backup (.bak) file for each input file.
The .bak file contains
the pre-corrected text. If there are file opening / writing errors,
the .bak file may be left for recovery purposes even with the
.B \-x
option.
The default for this option is controlled by the DEFNOBACKUPFLAG
installation option.
.PP
The
.B \-B
and
.B \-C
options control how
.I ispell
handles run-together words, such as "notthe" for "not the".
If
.B \-B
is specified, such words will be considered as errors, and
.I ispell
will list variations with an inserted blank or hyphen as possible
replacements.
If
.B \-C
is specified, run-together words will be considered to be
legal compounds, so long as both components are in the dictionary, and
each component is at least as long as a language-dependent minimum (3 characters, by default).
This is useful for languages such as German and Norwegian, where
many compound words are formed by concatenation.
(Note that compounds formed from three or more root words will still
be considered errors.)
The default for this option is language-dependent;
in a multi-lingual installation the default may vary depending on
which dictionary you choose.
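.PP
The compound rule applied under
.B \-C
can be sketched in Python as follows.
This is only an illustration of the rule as stated above, not ispell's
implementation; the function name and the use of a plain set of words as
the dictionary are assumptions.

```python
def is_legal_compound(word, dictionary, min_len=3):
    """Return True if word splits into exactly two dictionary words,
    each at least min_len characters long (the documented default is 3)."""
    for i in range(min_len, len(word) - min_len + 1):
        if word[:i] in dictionary and word[i:] in dictionary:
            return True
    # No two-way split works; compounds of three or more roots also end up here.
    return False
```

With a dictionary containing "not", "the", "book", and "case", the words
"notthe" and "bookcase" pass, while a three-root word such as
"bookcasenot" does not.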
.PP
The
.B \-P
and
.B \-m
options control when
.I ispell
automatically generates suggested root/affix combinations for possible
addition to your personal dictionary.
(These are the entries in the "guess" list which are preceded by question
marks.)
If
.B \-P
is specified, such guesses are displayed only if
.I ispell
cannot generate any possibilities that match the current dictionary.
If
.B \-m
is specified, such guesses are always displayed.
This can be useful if the dictionary has a limited word list, or a word
list with few suffixes.
However, you should be careful when using this option, as it can
generate guesses that produce illegal words.
The default for this option is controlled by the dictionary file used.
.PP
The
.B \-S
option suppresses
.IR ispell "'s"
normal behavior of sorting the list of possible replacement words.
Some people may prefer this, since it somewhat enhances the probability
that the correct word will be low-numbered.
.PP
The
.B \-d
option is used to specify an alternate hashed dictionary file,
other than the default.
If the filename does not contain a "/",
the library directory for the default dictionary file is prefixed;
thus, to use a dictionary in the local directory "-d ./xxx.hash" must
be used.
This is useful to allow dictionaries for alternate languages.
Unlike previous versions of
.IR ispell ,
a dictionary of
.IR /dev/null
is illegal, because the dictionary contains the affix table.
If you need an effectively empty dictionary, create a one-entry list
with an unlikely string (e.g., "qqqqq").
.PP
The
.B \-p
option is used to specify an alternate personal dictionary file.
If the file name does not begin with "/", $HOME is prefixed. Also, the
shell variable WORDLIST may be set, which renames the personal dictionary
in the same manner. The command line overrides any WORDLIST setting.
If neither the
.B \-p
switch nor the WORDLIST environment variable is given,
.I ispell
will search for a personal dictionary in both the current directory
and $HOME, creating one in $HOME if none is found.
The preferred name is constructed by appending ".ispell_" to the base name
of the hash file.
For example, if you use the English dictionary, your personal
dictionary would be named ".ispell_english".
However, if the file ".ispell_words" exists, it will be used as the
personal dictionary regardless of the language hash file chosen.
This feature is included primarily for backwards compatibility.
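.PP
The file-name rules for
.B \-d
and
.B \-p
can be summarized in a short Python sketch.
The function names and the example library directory are hypothetical,
and the sketch assumes that the "base name" of the hash file means its
file name with any directory and a trailing ".hash" removed; the text
above does not spell that out.

```python
import posixpath


def resolve_hash_file(name, libdir):
    """-d: prefix the library directory unless the name contains a '/'."""
    return name if "/" in name else posixpath.join(libdir, name)


def resolve_personal_dict(name, home):
    """-p or WORDLIST: prefix $HOME unless the name begins with '/'."""
    return name if name.startswith("/") else posixpath.join(home, name)


def preferred_personal_name(hash_file):
    """Build the preferred personal dictionary name for a hash file."""
    base = posixpath.basename(hash_file)
    if base.endswith(".hash"):  # assumption: the suffix is not part of the base name
        base = base[: -len(".hash")]
    return ".ispell_" + base
```

For example, preferred_personal_name("/usr/local/lib/english.hash")
gives ".ispell_english", matching the example in the text; the directory
shown is only a placeholder.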
.PP
If the
.B \-p
option is
.I not
specified,
.I ispell
will look for personal dictionaries in both the current directory and
the home directory.
If dictionaries exist in both places, they will be merged.
If any words are added to the personal dictionary, they will be
written to the current directory if a dictionary already existed in
that place;
otherwise they will be written to the dictionary in the home directory.
.PP
The
.B \-w
option may be used to specify characters other than alphabetics
which may also appear in words. For instance,
.B \-w
"&" will allow "AT&T"
to be picked up. Underscores are useful in many technical documents.
There is an admittedly crude provision in this option for 8-bit international
characters.
Non-printing characters may be specified in the usual way by inserting a
backslash followed by the octal character code;
e.g., "\e014" for a form feed.
Alternatively, if "n" appears in the character string, the (up to)
three characters
following are a DECIMAL code 0 - 255, for the character.
For example, to include bells and form feeds in your words (an admittedly
silly thing to do, but aren't most pedagogical examples):
.PP
.RS
n007n012
.RE
.PP
Numeric digits other than the three following "n" are simply numeric
characters. Use of "n" does not conflict with anything because actual
alphabetics have no meaning - alphabetics are already accepted.
.I Ispell
will typically be used with input from a file, meaning that preserving
parity for possible 8-bit characters from the input text is OK. If you
specify the \fB\-l\fR option, and actually type text from the terminal, this may
create problems if your stty settings preserve parity.
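.PP
The escape rules for the
.B \-w
argument can be illustrated with the Python sketch below.
It follows the description above (a backslash introduces up to three
octal digits, "n" introduces up to three decimal digits) and is not
taken from ispell's source; the function name is hypothetical, and
treating an "n" or backslash with no digits after it as a literal
character is an assumption.

```python
def decode_wchars(spec):
    """Decode a -w style string into a list of single characters."""
    chars = []
    i = 0
    while i < len(spec):
        c = spec[i]
        if c in ("\\", "n"):
            # Octal digits after a backslash, decimal digits after "n".
            allowed = "01234567" if c == "\\" else "0123456789"
            j = i + 1
            while j < len(spec) and j < i + 4 and spec[j] in allowed:
                j += 1
            digits = spec[i + 1:j]
            if digits:
                code = int(digits, 8 if c == "\\" else 10)
                if code > 255:
                    raise ValueError("character code out of range: " + digits)
                chars.append(chr(code))
                i = j
                continue
        # Anything else (including digits outside an escape) stands for itself.
        chars.append(c)
        i += 1
    return chars
```

Applied to the example string "n007n012", the sketch yields the bell
(decimal 7) and form feed (decimal 12) characters.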
.PP
The
.B \-W
option may be used to change the length of words that
.I ispell
always accepts as legal.
Normally,
.I ispell
will accept all 1-character words as legal, which is equivalent to
specifying "\fB\-W 1\fR."
(The default for this switch is actually controlled by the MINWORD
installation option, so it may vary at your installation.)
If you want all words to be checked against the dictionary, regardless
of length, you might want to specify "\fB\-W 0\fR."
On the other hand, if your document specifies a lot of three-letter acronyms,
you would specify "\fB\-W 3\fR" to accept all words of three letters or
less.
Regardless of the setting of this option,
.I ispell
will only generate words that are in the dictionary as suggested replacements
for words;
this prevents the list from becoming too long.
Obviously, this option can be very dangerous, since short misspellings may
be missed.
If you use this option a lot, you should probably make a last pass without it
before you publish your document, to protect yourself against errors.
.PP
The
.B \-T
option is used to specify a default formatter type for use in
generating string characters.
This switch overrides the default type determined from
the file name.
The
.I type
argument may be either one of the unique names defined in the language
affix file (e.g.,
.BR nroff )
or a file suffix including the dot (e.g.,
.BR .tex ).
If no
.B \-T
option appears and no type can be determined from the file name, the default
string character type declared in the
language affix file will be used.
.PP
The
.B \-l
or "list" option to
.I ispell
is used to produce a list of misspelled words from the standard input.
.PP
The
.B \-a
option
is intended to be used from other programs through a pipe. In this
mode,
.I ispell
prints a one-line version identification message, and then begins
reading lines of input. For each input line,
a single line is written to the standard output for each word
checked for spelling on the line. If the word
was found in the main dictionary, or your personal dictionary, then the
line contains only a '*'. If the word was found through affix removal,
then the line contains a '+', a space, and the root word.
If the word was found through compound formation (concatenation of two
words, controlled by the
.B \-C
option), then the line contains only a '\-'.
.PP
If the word
is not in the dictionary, but there are near misses, then the line
contains an '&', a space, the misspelled word, a space, the number of
near misses, a space,
the number of
characters between the beginning of the line and the
beginning of the misspelled word, a colon, another space,
and a list of the near
misses separated by
commas and spaces.
Following the near misses (and identified only by the count of near
misses), if the word could be formed by adding
(illegal) affixes to a known root,
is a list of suggested derivations, again separated by commas and spaces.
If there are no near misses at all, the line format is the same, except
that the '&' is replaced by '?' (and the near-miss count is always zero).
The suggested derivations following the near misses are in the form:
.PP
.RS
[prefix+] root [-prefix] [-suffix] [+suffix]
.RE
.PP
(e.g., "re+fry-y+ies" to get "refries")
where each optional
.I prefix
and
.I suffix
is a string.
Also, each near miss or guess is capitalized the same as the input
word unless such capitalization is illegal;
in the latter case each near miss is capitalized correctly
according to the dictionary.
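.PP
As an illustration of the derivation notation, the following Python
sketch expands a string such as "re+fry-y+ies" into the word it
describes.
It is a simplified reading of the format above, not ispell's own code:
the function name is hypothetical, a "-" part is always treated as a
suffix to strip from the root (prefix stripping is not handled), and
words are assumed to contain no "+" or "-" characters of their own.

```python
import re

# Optional "prefix+", the root, optional "-strip", optional "+add".
_DERIVATION = re.compile(
    r"^(?:(?P<prefix>[^+-]+)\+)?"
    r"(?P<root>[^+-]+)"
    r"(?:-(?P<strip>[^+-]+))?"
    r"(?:\+(?P<add>[^+-]+))?$"
)


def expand_derivation(derivation):
    """Expand e.g. 're+fry-y+ies' into the word it stands for."""
    m = _DERIVATION.match(derivation)
    if m is None:
        raise ValueError("unrecognized derivation: " + derivation)
    root = m.group("root")
    strip = m.group("strip")
    if strip:
        if not root.endswith(strip):
            raise ValueError("root does not end with " + strip)
        root = root[: -len(strip)]
    return (m.group("prefix") or "") + root + (m.group("add") or "")
```

A form such as "walk+ing" is ambiguous between prefix and suffix, but
both readings concatenate to the same word, so the sketch does not need
to distinguish them.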
.PP
Finally, if the word does not appear in the dictionary, and
there are no near misses or guesses, then the line contains a '#', a space,
the misspelled word, a space,
and the character offset from the beginning of the line.
The output for each line of text input is terminated
with an additional blank line, indicating that
.I ispell
has completed processing the input line.
.PP
These output lines can be summarized as follows:
.PP
.RS
.IP OK:
*
.IP Root:
+ <root>
.IP Compound:
\-
.IP Miss:
& <original> <count> <offset>: <miss>, <miss>, ..., <guess>, ...
.IP Guess:
? <original> 0 <offset>: <guess>, <guess>, ...
.IP None:
# <original> <offset>
.RE
.PP
For example, a dummy dictionary containing the words "fray", "Frey",
"fry", and "refried" might produce the following response to the
command "echo 'frqy refries' | ispell -a -m -d ./test.hash":
.RS
.nf
(#) International Ispell Version 3.0.05 (beta), 08/10/91
& frqy 3 0: fray, Frey, fry
& refries 1 5: refried, re+fry-y+ies
.fi
.RE
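.PP
A driving program has to take these output lines apart again.
The Python sketch below shows one way to do so for the line formats
summarized above; the type and function names are hypothetical, and
lines that match none of the formats (the version banner, the blank
separator line) are simply returned as None.

```python
from typing import List, NamedTuple, Optional


class Result(NamedTuple):
    kind: str                # "ok", "root", "compound", "miss", "guess" or "none"
    word: Optional[str]      # the root for "root", else the original word
    offset: Optional[int]    # offset of the word in the input line
    near_misses: List[str]
    guesses: List[str]


def parse_line(line):
    """Parse one line of -a output; return None for anything else."""
    line = line.rstrip("\n")
    if line == "*":
        return Result("ok", None, None, [], [])
    if line == "-":
        return Result("compound", None, None, [], [])
    if line.startswith("+ "):
        return Result("root", line[2:], None, [], [])
    if line.startswith(("& ", "? ")):
        head, _, tail = line.partition(": ")
        _, word, count, offset = head.split(" ")
        items = tail.split(", ") if tail else []
        n = int(count)  # the first n items are near misses, the rest guesses
        kind = "miss" if line[0] == "&" else "guess"
        return Result(kind, word, int(offset), items[:n], items[n:])
    if line.startswith("# "):
        _, word, offset = line.split(" ")
        return Result("none", word, int(offset), [], [])
    return None
```

Applied to the last line of the example above, the sketch reports one
near miss ("refried") and one guess ("re+fry-y+ies") at offset 5.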
673 | .PP |
---|
674 | This mode |
---|
675 | is also suitable for interactive use when you want to figure out the |
---|
676 | spelling of a single word. |
---|
677 | .PP |
---|
678 | The |
---|
679 | .B \-A |
---|
680 | option works just like |
---|
681 | .BR \-a , |
---|
682 | except that if a line begins with the string "&Include_File&", the rest |
---|
683 | of the line is taken as the name of a file to read for further words. |
---|
684 | Input returns to the original file when the include file is exhausted. |
---|
685 | Inclusion may be nested up to five deep. |
---|
686 | The key string may be changed with the environment variable |
---|
687 | .B INCLUDE_STRING |
---|
688 | (the ampersands, if any, must be included). |
---|
689 | .PP |
---|
690 | When in the |
---|
691 | .B \-a |
---|
692 | mode, |
---|
693 | .I ispell |
---|
694 | will also accept lines of single words prefixed with any |
---|
695 | of '*', '&', '@', '+', '-', '~', '#', '!', '%', or '^'. |
---|
696 | A line starting with '*' tells |
---|
697 | .I ispell |
---|
698 | to insert the word into the user's dictionary (similar to the I command). |
---|
699 | A line starting with '&' tells |
---|
700 | .I ispell |
---|
701 | to insert an all-lowercase version of the word into the user's |
---|
702 | dictionary (similar to the U command). |
---|
703 | A line starting with '@' causes |
---|
704 | .I ispell |
---|
705 | to accept this word in the future (similar to the A command). |
---|
706 | A line starting with '+', followed immediately by |
---|
707 | .B tex |
---|
708 | or |
---|
709 | .B nroff |
---|
710 | will cause |
---|
711 | .I ispell |
---|
712 | to parse future input according the syntax of that formatter. |
---|
713 | A line consisting solely of a '+' will place |
---|
714 | .I ispell |
---|
715 | in TeX/LaTeX mode (similar to the |
---|
716 | .B \-t |
---|
717 | option) and '-' returns |
---|
718 | .I ispell |
---|
719 | to nroff/troff mode (but these commands are obsolete). |
---|
720 | However, string character type is |
---|
721 | .I not |
---|
722 | changed; |
---|
723 | the '~' command must be used to do this. |
---|
724 | A line starting with '~' causes |
---|
725 | .I ispell |
---|
726 | to set internal parameters (in particular, the default string |
---|
727 | character type) based on the filename given in the rest of the line. |
---|
728 | (A file suffix is sufficient, but the period must be included. |
---|
729 | Instead of a file name or suffix, a unique name, as listed in the language |
---|
730 | affix file, may be specified.) |
---|
731 | However, the formatter parsing is |
---|
732 | .I not |
---|
733 | changed; the '+' command must be used to change the formatter. |
---|
734 | A line prefixed with '#' will cause the |
---|
735 | personal dictionary to be saved. |
---|
736 | A line prefixed with '!' will turn on |
---|
737 | .I terse |
---|
738 | mode (see below), and a line prefixed with '%' will return |
---|
739 | .I ispell |
---|
740 | to normal (non-terse) mode. |
---|
741 | Any input following the prefix |
---|
742 | characters '+', '-', '#', '!', or '%' is ignored, as is any input |
---|
743 | following the filename on a '~' line. |
---|
744 | To allow spell-checking of lines beginning with these characters, a |
---|
745 | line starting with '^' has that character removed before it is passed |
---|
746 | to the spell-checking code. |
---|
747 | It is recommended that programmatic interfaces prefix every data line |
---|
748 | with an uparrow to protect themselves against future changes in |
---|
749 | .IR ispell . |
---|
750 | .PP |
---|
751 | To summarize these: |
---|
752 | .PP |
---|
753 | .RS |
---|
754 | .IP * |
---|
755 | Add to personal dictionary |
---|
756 | .IP @ |
---|
757 | Accept word, but leave out of dictionary |
---|
758 | .IP # |
---|
759 | Save current personal dictionary |
---|
760 | .IP ~ |
---|
761 | Set parameters based on filename |
---|
762 | .IP + |
---|
763 | Enter TeX mode |
---|
764 | .IP - |
---|
765 | Exit TeX mode |
---|
766 | .IP ! |
---|
767 | Enter terse mode |
---|
768 | .IP % |
---|
769 | Exit terse mode |
---|
770 | .IP ^ |
---|
771 | Spell-check rest of line |
---|
772 | .fi |
---|
773 | .RE |
---|
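The recommendation to prefix every data line with '^' can be sketched as a small helper for a driving program. This is only an illustration: the function name build_pipe_input is hypothetical, and the choice to send '!' first (terse mode) is an assumption about what the driving program wants, not a requirement of ispell.

```python
def build_pipe_input(lines, terse=True):
    """Build text to feed to an ispell pipe-mode process (hypothetical helper).

    Every data line is prefixed with '^' so that a leading '*', '@', '#',
    '~', '+', '-', '!', or '%' is spell-checked rather than being
    interpreted as a command.
    """
    out = []
    if terse:
        # '!' turns on terse mode, so correct words are not reported.
        out.append("!")
    for line in lines:
        out.append("^" + line.rstrip("\n"))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    print(build_pipe_input(["*not a command", "plain text"]), end="")
```

The resulting string could then be written to the standard input of an ispell process started with the \-a option; starting the process is left out here because it depends on the local installation.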
774 | .PP |
---|
775 | In |
---|
776 | .I terse |
---|
777 | mode, |
---|
778 | .I ispell |
---|
779 | will not print lines beginning with '*', '+', or '\-', all of which |
---|
780 | indicate correct words. |
---|
781 | This significantly improves running speed when the driving program is |
---|
782 | going to ignore correct words anyway. |
---|
783 | .PP |
---|
784 | The |
---|
785 | .B \-s |
---|
786 | option is only valid in conjunction with the |
---|
787 | .B \-a |
---|
788 | or |
---|
789 | .B \-A |
---|
790 | options, and only on BSD-derived systems. |
---|
791 | If specified, |
---|
792 | .I ispell |
---|
793 | will stop itself with a |
---|
794 | .B SIGTSTP |
---|
795 | signal after each line of input. |
---|
796 | It will not read more input until it receives a |
---|
797 | .B SIGCONT |
---|
798 | signal. |
---|
799 | This may be useful for handshaking with certain text editors. |
---|
800 | .PP |
---|
801 | The |
---|
802 | .B \-f |
---|
803 | option is only valid in conjunction with the |
---|
804 | .B \-a |
---|
805 | or |
---|
806 | .B \-A |
---|
807 | options. |
---|
808 | If |
---|
809 | .B \-f |
---|
810 | is specified, |
---|
811 | .I ispell |
---|
812 | will write its results to the given file, rather than to standard output. |
---|
813 | .PP |
---|
814 | The |
---|
815 | .B \-v |
---|
816 | option causes |
---|
817 | .I ispell |
---|
818 | to print its current version identification on the standard output |
---|
819 | and exit. |
---|
820 | If the switch is doubled, |
---|
821 | .I ispell |
---|
822 | will also print the options that it was compiled with. |
---|
823 | .PP |
---|
824 | The |
---|
825 | .BR \-c , |
---|
826 | .BR \-e [ 1-4 ], |
---|
827 | and |
---|
828 | .B \-D |
---|
829 | options of |
---|
830 | .I ispell |
---|
831 | are primarily intended for use by the |
---|
832 | .I munchlist |
---|
833 | shell script. |
---|
834 | The |
---|
835 | .B \-c |
---|
836 | switch causes a list of words to be read from the standard input. |
---|
837 | For each word, a list of possible root words and affixes will be |
---|
838 | written to the standard output. |
---|
839 | Some of the root words will be illegal and must be filtered from the |
---|
840 | output by other means; |
---|
841 | the |
---|
842 | .I munchlist |
---|
843 | script does this. |
---|
844 | As an example, the command: |
---|
845 | .PP |
---|
846 | .RS |
---|
847 | echo BOTHER | ispell -c |
---|
848 | .RE |
---|
849 | .PP |
---|
850 | produces: |
---|
851 | .PP |
---|
852 | .RS |
---|
853 | .nf |
---|
854 | BOTHER BOTHE/R BOTH/R |
---|
855 | .fi |
---|
856 | .RE |
---|
857 | .PP |
---|
858 | The |
---|
859 | .B \-e |
---|
860 | switch is the reverse of |
---|
861 | .BR \-c ; |
---|
862 | it expands affix flags to produce a list of words. |
---|
863 | For example, the command: |
---|
864 | .PP |
---|
865 | .RS |
---|
866 | echo BOTH/R | ispell -e |
---|
867 | .RE |
---|
868 | .PP |
---|
869 | produces: |
---|
870 | .PP |
---|
871 | .RS |
---|
872 | .nf |
---|
873 | BOTH BOTHER |
---|
874 | .fi |
---|
875 | .RE |
---|
876 | .PP |
---|
877 | An optional expansion level can also be specified. A level of 1 |
---|
878 | .RB ( \-e1 ) |
---|
879 | is the same as |
---|
880 | .B \-e |
---|
881 | alone. |
---|
882 | A level of 2 causes the original root/affix combination to be |
---|
883 | prepended to the line: |
---|
884 | .PP |
---|
885 | .RS |
---|
886 | .nf |
---|
887 | BOTH/R BOTH BOTHER |
---|
888 | .fi |
---|
889 | .RE |
---|
890 | .PP |
---|
891 | A level of 3 causes multiple lines to be output, one for each |
---|
892 | generated word, with the original root/affix combination followed by |
---|
893 | the word it creates: |
---|
894 | .PP |
---|
895 | .RS |
---|
896 | .nf |
---|
897 | BOTH/R BOTH |
---|
898 | BOTH/R BOTHER |
---|
899 | .fi |
---|
900 | .RE |
---|
901 | .PP |
---|
902 | A level of 4 causes a floating-point number to be appended to each of |
---|
903 | the level-3 lines, giving the ratio of the total length of all generated |
---|
904 | words (including the root) to the length of the root: |
---|
905 | .PP |
---|
906 | .RS |
---|
907 | .nf |
---|
908 | BOTH/R BOTH 2.500000 |
---|
909 | BOTH/R BOTHER 2.500000 |
---|
910 | .fi |
---|
911 | .RE |
---|
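The figure in the level-4 example can be checked by hand. The sketch below assumes, based only on the BOTH/R example above, that the number is the summed length of all generated words divided by the length of the root; the function name expansion_ratio is hypothetical.

```python
def expansion_ratio(root, generated):
    """Total length of the generated words divided by the root length.

    `generated` must include the root itself, as in the example above.
    """
    return sum(len(word) for word in generated) / len(root)


if __name__ == "__main__":
    # BOTH (4 letters) and BOTHER (6 letters): (4 + 6) / 4
    ratio = expansion_ratio("BOTH", ["BOTH", "BOTHER"])
    print(f"{ratio:.6f}")
```

For BOTH/R this gives (4 + 6) / 4, which agrees with the 2.500000 shown in the example output.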
912 | .PP |
---|
913 | Finally, the |
---|
914 | .B \-D |
---|
915 | flag causes the affix tables from the dictionary file |
---|
916 | to be dumped to standard output. |
---|
917 | .PP |
---|
918 | Unless your system administrator has suppressed the feature to save space, |
---|
919 | .I ispell |
---|
920 | is aware of the correct capitalizations of words in the dictionary and |
---|
921 | in your personal dictionary. |
---|
922 | As well as recognizing words that must be capitalized (e.g., George) and |
---|
923 | words that must be all-capitals (e.g., NASA), it can also handle words |
---|
924 | with "unusual" capitalization (e.g., "ITCorp" or "TeX"). |
---|
925 | If a word is capitalized incorrectly, the list of possibilities will |
---|
926 | include all acceptable capitalizations. |
---|
927 | (More than one capitalization may be acceptable; |
---|
928 | for example, my dictionary lists both "ITCorp" and "ITcorp".) |
---|
929 | .PP |
---|
930 | Normally, this feature will not cause you surprises, but there is one |
---|
931 | circumstance you need to be aware of. |
---|
932 | If you use "I" to |
---|
933 | add a word to your dictionary that is at the beginning of a sentence |
---|
934 | (e.g., the first word of this paragraph if "normally" were not in the |
---|
935 | dictionary), it will be marked as "capitalization required". |
---|
936 | A subsequent usage of this word without capitalization (e.g., the quoted word |
---|
937 | in the previous sentence) will be considered a misspelling by |
---|
938 | .IR ispell , |
---|
939 | and it will suggest the capitalized version. |
---|
940 | You must then compare the actual spellings by eye, and then type "I" |
---|
941 | to add the uncapitalized variant to your personal dictionary. |
---|
942 | You can avoid this problem by using "U" to add the original word, |
---|
943 | rather than "I". |
---|
944 | .PP |
---|
945 | The rules for capitalization are as follows: |
---|
946 | .IP (1) |
---|
947 | Any word may appear in all capitals, as in headings. |
---|
948 | .IP (2) |
---|
949 | Any word that is in the dictionary in all-lowercase form may appear |
---|
950 | either in lowercase or capitalized (as at the beginning of a sentence). |
---|
951 | .IP (3) |
---|
952 | Any word that has "funny" capitalization (i.e., it contains both cases |
---|
953 | and there is an uppercase character besides the first) must appear |
---|
954 | exactly as in the dictionary, except as permitted by rule (1). |
---|
955 | If the word is acceptable in all-lowercase, it must appear thus in a |
---|
956 | dictionary entry. |
---|
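The three capitalization rules can be sketched as a small predicate. This is an illustration of the rules as stated here, not the code ispell uses; the function name is_acceptable and the plain list of dictionary entries are assumptions made for the example.

```python
def is_acceptable(word, entries):
    """Return True if `word` is capitalized acceptably for some entry."""
    # Only entries that spell the same word (ignoring case) are relevant.
    candidates = [e for e in entries if e.lower() == word.lower()]
    if not candidates:
        return False
    # Rule (1): any known word may appear in all capitals.
    if word.isupper():
        return True
    for entry in candidates:
        # Rule (2): an all-lowercase entry may also appear capitalized.
        if entry.islower() and word in (entry, entry.capitalize()):
            return True
        # Rule (3), and entries that require capitalization: exact match.
        if entry == word:
            return True
    return False


if __name__ == "__main__":
    print(is_acceptable("Normally", ["normally"]))  # lowercase entry, capitalized use
    print(is_acceptable("normally", ["Normally"]))  # entry marked capitalization required
    print(is_acceptable("Tex", ["TeX"]))            # "funny" capitalization must match
```

The second call shows the surprise described earlier: an entry stored as "Normally" does not accept "normally".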
957 | .SS buildhash |
---|
958 | .PP |
---|
959 | The |
---|
960 | .I buildhash |
---|
961 | program builds hashed dictionary files for later use by |
---|
962 | .IR ispell . |
---|
963 | The raw word list (with affix flags) is given in |
---|
964 | .IR dict-file , |
---|
965 | and the affix flags are defined by |
---|
966 | .IR affix-file . |
---|
967 | The hashed output is written to |
---|
968 | .IR hash-file . |
---|
969 | The formats of the two input files are described in |
---|
970 | .IR ispell (4). |
---|
971 | The |
---|
972 | .B \-s |
---|
973 | (silent) option suppresses the usual status messages that are written |
---|
974 | to the standard error device. |
---|
975 | .SS munchlist |
---|
976 | .PP |
---|
977 | The |
---|
978 | .I munchlist |
---|
979 | shell script is used to reduce the size of dictionary files, |
---|
980 | primarily personal dictionary files. |
---|
981 | It is also capable of combining dictionaries from various sources. |
---|
982 | The given |
---|
983 | .I files |
---|
984 | are read (standard input if no arguments are given), |
---|
985 | reduced to a minimal set of roots and affixes that will match the |
---|
986 | same list of words, and written to standard output. |
---|
987 | .PP |
---|
988 | Input for munchlist consists of raw words (e.g., from your personal |
---|
989 | dictionary files) or root and affix combinations (probably generated |
---|
990 | in earlier munchlist runs). Each word or root/affix combination must |
---|
991 | be on a separate line. |
---|
992 | .PP |
---|
993 | The |
---|
994 | .B \-D |
---|
995 | (debug) option leaves temporary files around under standard names instead |
---|
996 | of deleting them, so that the script can be debugged. |
---|
997 | Warning: |
---|
998 | this option can eat up an enormous amount of temporary file space. |
---|
999 | .PP |
---|
1000 | The |
---|
1001 | .B \-v |
---|
1002 | (verbose) option causes progress messages to be reported to stderr so |
---|
1003 | you won't get nervous that |
---|
1004 | .I munchlist |
---|
1005 | has hung. |
---|
1006 | .PP |
---|
1007 | If the |
---|
1008 | .B \-s |
---|
1009 | (strip) option is specified, words that are in the specified |
---|
1010 | .I hash-file |
---|
1011 | are removed from the word list. |
---|
1012 | This can be useful with personal dictionaries. |
---|
1013 | .PP |
---|
1014 | The |
---|
1015 | .B \-l |
---|
1016 | option can be used to specify an alternate |
---|
1017 | .I affix-file |
---|
1018 | for munching dictionaries in languages other than English. |
---|
1019 | .PP |
---|
1020 | The |
---|
1021 | .B \-c |
---|
1022 | option can be used to convert dictionaries that were built with an |
---|
1023 | older affix file, without risk of accidentally introducing unintended |
---|
1024 | affix combinations into the dictionary. |
---|
1025 | .PP |
---|
1026 | The |
---|
1027 | .B \-T |
---|
1028 | option allows dictionaries to be converted to a canonical |
---|
1029 | string-character format. |
---|
1030 | The suffix specified is looked up in the affix file |
---|
1031 | .RB ( \-l |
---|
1032 | switch) |
---|
1033 | to determine the string-character format used for the input file; |
---|
1034 | the output always uses the canonical string-character format. |
---|
1035 | For example, a dictionary collected from TeX source files might be |
---|
1036 | converted to canonical format by specifying |
---|
1037 | .BR "\-T tex" . |
---|
1038 | .PP |
---|
1039 | The |
---|
1040 | .B \-w |
---|
1041 | option is passed on to |
---|
1042 | .IR ispell . |
---|
1043 | .SS findaffix |
---|
1044 | .PP |
---|
1045 | The |
---|
1046 | .I findaffix |
---|
1047 | shell script is an aid to writers of new language descriptions in choosing |
---|
1048 | affixes. |
---|
1049 | The given dictionary |
---|
1050 | .I files |
---|
1051 | (standard input if none are given) are examined for possible prefixes |
---|
1052 | .RB ( \-p |
---|
1053 | switch) or suffixes |
---|
1054 | .RB ( \-s |
---|
1055 | switch, the default). |
---|
1056 | Each commonly-occurring affix is presented along with |
---|
1057 | a count of the number of times it appears |
---|
1058 | and an estimate of the number of bytes that would be saved in a dictionary |
---|
1059 | hash file if it were added to the language table. |
---|
1060 | Only affixes that generate legal roots (found in the original input) |
---|
1061 | are listed. |
---|
1062 | .PP |
---|
1063 | If the "-c" option is not given, the output lines are in the |
---|
1064 | following format: |
---|
1065 | .IP |
---|
1066 | strip/add/count/bytes |
---|
1067 | .PP |
---|
1068 | where |
---|
1069 | .I strip |
---|
1070 | is the string that should be stripped from a root |
---|
1071 | word before adding the affix, |
---|
1072 | .I add |
---|
1073 | is the affix to be added, |
---|
1074 | .I count |
---|
1075 | is a count of the number of times that this |
---|
1076 | .IR strip / add |
---|
1077 | combination appears, and |
---|
1078 | .I bytes |
---|
1079 | is an estimate of the number of bytes that |
---|
1080 | might be saved in the raw dictionary file if this combination is |
---|
1081 | added to the affix file. |
---|
1082 | The field separator in the output will |
---|
1083 | be the character specified by the |
---|
1084 | .B \-t |
---|
1085 | switch; the default is a slash ("/"). |
---|
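Output in the default strip/add/count/bytes format is easy to post-process. The sketch below parses such lines and orders them either by estimated bytes saved or by count. The sample lines are invented for illustration and are not real findaffix output; the names AffixLine and parse_line are hypothetical.

```python
from typing import NamedTuple


class AffixLine(NamedTuple):
    strip: str        # string stripped from the root
    add: str          # affix added
    count: int        # number of times the combination appears
    bytes_saved: int  # estimated bytes saved


def parse_line(line, sep="/"):
    """Parse one strip/add/count/bytes line; `sep` is the field separator."""
    strip, add, count, nbytes = line.rstrip("\n").split(sep)
    return AffixLine(strip, add, int(count), int(nbytes))


if __name__ == "__main__":
    # Invented sample data, not actual findaffix output.
    sample = ["/s/120/480", "e/ing/45/300", "/ed/60/250"]
    rows = [parse_line(line) for line in sample]
    by_bytes = sorted(rows, key=lambda r: r.bytes_saved, reverse=True)
    by_count = sorted(rows, key=lambda r: r.count, reverse=True)
    print([r.add for r in by_bytes])
    print([r.add for r in by_count])
```

If a different separator was chosen with the \-t switch, pass the same character as sep.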
1086 | .PP |
---|
1087 | If the |
---|
1088 | .B \-c |
---|
1089 | ("clean output") option is given, the appearance of |
---|
1090 | the output is made visually cleaner (but harder to post-process) |
---|
1091 | by changing it to: |
---|
1092 | .IP |
---|
1093 | -strip+add<tab>count<tab>bytes |
---|
1094 | .PP |
---|
1095 | where |
---|
1096 | .IR strip , |
---|
1097 | .IR add , |
---|
1098 | .IR count , |
---|
1099 | and |
---|
1100 | .I bytes |
---|
1101 | are as before, and |
---|
1102 | .I "<tab>" |
---|
1103 | represents the ASCII tab character. |
---|
1104 | .PP |
---|
1105 | The method used to generate possible affixes will also generate |
---|
1106 | longer affixes which have common headers or trailers. For example, |
---|
1107 | the two words "moth" and "mother" will generate not only the obvious |
---|
1108 | substitution "+er" but also "-h+her" and "-th+ther" (and possibly |
---|
1109 | even longer ones, depending on the value of |
---|
1110 | .IR min ). |
---|
1111 | To prevent |
---|
1112 | cluttering the output with such affixes, any affix pair that shares |
---|
1113 | a common header (or, for prefixes, trailer) string longer than |
---|
1114 | .I elim |
---|
1115 | characters (default 1) will be suppressed. |
---|
1116 | You may want to set "elim" to a value greater than 1 if your language has string |
---|
1117 | characters; |
---|
1118 | usually the need for this parameter will become obvious |
---|
1119 | when you examine the output of your |
---|
1120 | .I findaffix |
---|
1121 | run. |
---|
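The way one pair of words yields several candidate affixes, and how elim trims them, can be sketched as follows. This is not the algorithm the script uses; it is a minimal illustration for suffixes only, reading the elim rule literally (a pair is dropped when the shared header is longer than elim). The function names are hypothetical.

```python
def candidate_suffix_pairs(root, word, min_stem=3):
    """List (strip, add) pairs that turn `root` into `word`.

    The stem is the part of `root` left after stripping; stems shorter
    than `min_stem` are not considered.
    """
    common = 0
    while common < min(len(root), len(word)) and root[common] == word[common]:
        common += 1
    return [(root[k:], word[k:]) for k in range(common, min_stem - 1, -1)]


def shared_header_length(strip, add):
    """Length of the common leading string of `strip` and `add`."""
    n = 0
    while n < min(len(strip), len(add)) and strip[n] == add[n]:
        n += 1
    return n


def apply_elim(pairs, elim=1):
    """Drop pairs whose shared header is longer than `elim` characters."""
    return [p for p in pairs if shared_header_length(*p) <= elim]


if __name__ == "__main__":
    pairs = candidate_suffix_pairs("moth", "mother", min_stem=1)
    print(pairs)
    print(apply_elim(pairs, elim=1))
```

With a minimum stem length of 1, "moth" and "mother" yield the "+er", "-h+her", and "-th+ther" substitutions mentioned above, plus one longer pair.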
1122 | .PP |
---|
1123 | Normally, the affixes are sorted according to the estimate of bytes saved. |
---|
1124 | The |
---|
1125 | .B \-f |
---|
1126 | switch may be used to cause the affixes to be sorted by frequency of |
---|
1127 | appearance. |
---|
1128 | .PP |
---|
1129 | To save output file space, |
---|
1130 | affixes which occur fewer than 10 times are eliminated; |
---|
1131 | this limit may be changed with the |
---|
1132 | .B \-l |
---|
1133 | switch. |
---|
1134 | The |
---|
1135 | .B \-M |
---|
1136 | switch specifies a maximum affix length (default 8). |
---|
1137 | Affixes longer than this will not be reported. |
---|
1138 | (This saves on temporary disk space and makes the script run faster.) |
---|
1139 | .PP |
---|
1140 | Affixes which generate stems shorter than 3 characters are suppressed. |
---|
1141 | (A stem is the word after the |
---|
1142 | .I strip |
---|
1143 | string has been removed, and before the |
---|
1144 | .I add |
---|
1145 | string has been added.) |
---|
1146 | This reduces both the running time and the size of the output file. |
---|
1147 | This limit may be changed with the |
---|
1148 | .B \-m |
---|
1149 | switch. |
---|
1150 | The minimum stem length should only be set to 1 if you have a |
---|
1151 | .I lot |
---|
1152 | of free time and disk space (in the range of many days and hundreds of |
---|
1153 | megabytes). |
---|
1154 | .PP |
---|
1155 | The |
---|
1156 | .I findaffix |
---|
1157 | script requires a non-blank field-separator character for internal |
---|
1158 | use. |
---|
1159 | Normally, this character is a slash ("/"), but if the slash |
---|
1160 | appears as a character in the input word list, a different character |
---|
1161 | can be specified with the |
---|
1162 | .B \-t |
---|
1163 | switch. |
---|
1164 | .PP |
---|
1165 | Ispell dictionaries should be expanded before being fed to |
---|
1166 | .IR findaffix ; |
---|
1167 | in addition, characters that are not in the English alphabet (if any) should |
---|
1168 | be translated to lowercase. |
---|
1169 | .SS tryaffix |
---|
1170 | .PP |
---|
1171 | The |
---|
1172 | .I tryaffix |
---|
1173 | shell script is used to estimate the effectiveness of a proposed |
---|
1174 | prefix |
---|
1175 | .RB ( \-p |
---|
1176 | switch) or suffix |
---|
1177 | .RB ( \-s |
---|
1178 | switch, the default) with a given |
---|
1179 | .IR expanded-file . |
---|
1180 | Only one affix can be tried with each execution of |
---|
1181 | .IR tryaffix , |
---|
1182 | although multiple arguments can be used to describe varying forms of the |
---|
1183 | same affix flag (e.g., the |
---|
1184 | .B D |
---|
1185 | flag for English can add either |
---|
1186 | .I D |
---|
1187 | or |
---|
1188 | .I ED |
---|
1189 | depending on whether a trailing E is already present). |
---|
1190 | Each word in the expanded dictionary that ends (or begins) with the chosen |
---|
1191 | suffix (or prefix) has that suffix (prefix) removed; |
---|
1192 | the dictionary is then searched for root words that match the stripped word. |
---|
1193 | Normally, all matching roots are written to standard output, but if the |
---|
1194 | .B \-c |
---|
1195 | (count) flag is given, only a statistical summary of the results is written. |
---|
1196 | The statistics given are a count of words the affix potentially applies to |
---|
1197 | and an estimate of the number of dictionary bytes that a flag using the |
---|
1198 | affix would save. |
---|
1199 | The estimate will be high if the flag generates words |
---|
1200 | that are currently generated by other affix flags |
---|
1201 | (e.g., in English, |
---|
1202 | .I bathers |
---|
1203 | can be generated by either |
---|
1204 | .I bath/X |
---|
1205 | or |
---|
1206 | .IR bather/S ). |
---|
1207 | .P |
---|
1208 | The dictionary file, |
---|
1209 | .IR expanded-file , |
---|
1210 | must already be expanded (using the |
---|
1211 | .B \-e |
---|
1212 | switch of |
---|
1213 | .IR ispell ) |
---|
1214 | and sorted, and things will usually work best if uppercase |
---|
1215 | has been folded to lowercase with 'tr'. |
---|
1216 | .PP |
---|
1217 | The |
---|
1218 | .I affix |
---|
1219 | arguments are things to be stripped from the dictionary |
---|
1220 | file to produce trial roots: |
---|
1221 | for English, |
---|
1222 | .I con |
---|
1223 | (prefix) and |
---|
1224 | .I ing |
---|
1225 | (suffix) are examples. |
---|
1226 | The |
---|
1227 | .I addition |
---|
1228 | parts of the argument are letters that would have |
---|
1229 | been stripped off the root before adding the affix. |
---|
1230 | For example, in English the affix |
---|
1231 | .I ing |
---|
1232 | normally strips |
---|
1233 | .I e |
---|
1234 | for words ending in that letter (e.g., |
---|
1235 | .I like |
---|
1236 | becomes |
---|
1237 | .IR liking ) |
---|
1238 | so we might run: |
---|
1239 | .PP |
---|
1240 | .RS |
---|
1241 | .nf |
---|
1242 | tryaffix ing ing+e |
---|
1243 | .fi |
---|
1244 | .RE |
---|
1245 | .PP |
---|
1246 | to cover both cases. |
---|
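The suffix search described in this section can be sketched in a few lines. The sketch assumes that an argument such as "ing+e" means "strip ing, then put back e to recover the root", as the text above suggests. It does not reproduce the statistics or output format of the script, and the function name try_suffix is hypothetical.

```python
def try_suffix(words, specs):
    """Return (root, word) pairs found by stripping the given suffixes.

    `words` is an expanded word list; each item of `specs` is either
    "affix" or "affix+addition".
    """
    vocabulary = set(words)
    matches = []
    for word in words:
        for spec in specs:
            affix, _, addition = spec.partition("+")
            if word.endswith(affix) and len(word) > len(affix):
                # Remove the suffix and restore any letters it replaced.
                root = word[: -len(affix)] + addition
                if root in vocabulary:
                    matches.append((root, word))
    return matches


if __name__ == "__main__":
    words = ["king", "like", "liking", "walk", "walking"]
    for root, word in try_suffix(words, ["ing", "ing+e"]):
        print(root, word)
```

Here "liking" is matched through the ing+e form and "walking" through the plain ing form, while "king" produces no root because neither "k" nor "ke" is in the list.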
1247 | .PP |
---|
1248 | All of the shell scripts contain documentation as commentary at the |
---|
1249 | beginning; |
---|
1250 | sometimes these comments contain useful information beyond the scope |
---|
1251 | of this manual page. |
---|
1252 | .PP |
---|
1253 | It is possible to install |
---|
1254 | .I ispell |
---|
1255 | in such a way as to only support ASCII range text if desired. |
---|
1256 | .SS icombine |
---|
1257 | The |
---|
1258 | .I icombine |
---|
1259 | program is a helper for |
---|
1260 | .IR munchlist . |
---|
1261 | It reads a list of words in dictionary format (roots plus flags) from |
---|
1262 | the standard input, and produces a reduced list on standard output |
---|
1263 | which combines common roots found on adjacent entries. |
---|
1264 | Identical roots which have differing flags will have their flags |
---|
1265 | combined, and roots which have differing capitalizations will be |
---|
1266 | combined in a way which only preserves important capitalization |
---|
1267 | information. |
---|
1268 | The optional |
---|
1269 | .I aff-file |
---|
1270 | specifies a language file which defines the character sets used and |
---|
1271 | the meanings of the various flags. |
---|
1272 | The |
---|
1273 | .B \-T |
---|
1274 | switch can be used to select among alternative string character types |
---|
1275 | by giving a dummy suffix that can be found in an |
---|
1276 | .B altstringtype |
---|
1277 | statement. |
---|
1278 | .SS ijoin |
---|
1279 | The |
---|
1280 | .I ijoin |
---|
1281 | program is a re-implementation of |
---|
1282 | .IR join (1) |
---|
1283 | which handles long lines and 8-bit characters correctly. |
---|
1284 | The |
---|
1285 | .B \-s |
---|
1286 | switch specifies that the |
---|
1287 | .IR sort (1) |
---|
1288 | program used to prepare the input to |
---|
1289 | .I ijoin |
---|
1290 | uses signed comparisons on 8-bit characters; |
---|
1291 | the |
---|
1292 | .B \-u |
---|
1293 | switch specifies that |
---|
1294 | .IR sort (1) |
---|
1295 | uses unsigned comparisons. |
---|
1296 | All other options and behaviors of |
---|
1297 | .IR join (1) |
---|
1298 | are duplicated as exactly as possible based on the manual page, except |
---|
1299 | that |
---|
1300 | .I ijoin |
---|
1301 | will not handle newline as a field separator. |
---|
1302 | See the |
---|
1303 | .IR join (1) |
---|
1304 | manual page for more information. |
---|
1305 | .SH ENVIRONMENT |
---|
1306 | .IP DICTIONARY |
---|
1307 | Default dictionary to use, if no |
---|
1308 | .B \-d |
---|
1309 | flag is given. |
---|
1310 | .IP WORDLIST |
---|
1311 | Personal dictionary file name |
---|
1312 | .IP INCLUDE_STRING |
---|
1313 | Code for file inclusion under the |
---|
1314 | .B \-A |
---|
1315 | option |
---|
1316 | .IP TMPDIR |
---|
1317 | Directory used for some of munchlist's temporary files |
---|
1318 | .SH FILES |
---|
1319 | .IP !!LIBDIR!!/!!DEFHASH!! |
---|
1320 | Hashed dictionary (may be found in some other local directory, |
---|
1321 | depending on the system). |
---|
1322 | .IP !!LIBDIR!!/!!DEFLANG!! |
---|
1323 | Affix-definition file for |
---|
1324 | .I munchlist |
---|
1325 | .IP "/usr/dict/web2 or /usr/dict/words" |
---|
1326 | For the Lookup function (depending on the WORDS compilation option). |
---|
1327 | .IP $HOME/.ispell_\fIhashfile\fP |
---|
1328 | User's private dictionary |
---|
1329 | .IP .ispell_\fIhashfile\fP |
---|
1330 | Directory-specific private dictionary |
---|
1331 | .SH SEE ALSO |
---|
1332 | .IR spell (1), |
---|
1333 | .IR egrep (1), |
---|
1334 | .IR look (1), |
---|
1335 | .IR join (1), |
---|
1336 | .IR sort (1), |
---|
1337 | .IR sq (1L), |
---|
1338 | .IR tib (1L), |
---|
1339 | .IR ispell (4L), |
---|
1340 | .IR english (4L) |
---|
1341 | .SH BUGS |
---|
1342 | It takes several to many seconds for |
---|
1343 | .I ispell |
---|
1344 | to read in the hash table, depending on size. |
---|
1345 | .sp |
---|
1346 | When all options are enabled, |
---|
1347 | .I ispell |
---|
1348 | may take several seconds to generate all the guesses at corrections for |
---|
1349 | a misspelled word; |
---|
1350 | on slower machines this time is long enough to be annoying. |
---|
1351 | .sp |
---|
1352 | The hash table is stored as a quarter-megabyte (or larger) array, so a PDP-11 |
---|
1353 | or 286 version does not seem likely. |
---|
1354 | .sp |
---|
1355 | .I Ispell |
---|
1356 | should understand more |
---|
1357 | .I troff |
---|
1358 | syntax, and deal more intelligently with contractions. |
---|
1359 | .sp |
---|
1360 | Although small personal dictionaries are sorted before they are written out, |
---|
1361 | the order of capitalizations of the same word is somewhat random. |
---|
1362 | .sp |
---|
1363 | When the |
---|
1364 | .B \-x |
---|
1365 | flag is specified, |
---|
1366 | .I ispell |
---|
1367 | will unlink any existing .bak file. |
---|
1368 | .sp |
---|
1369 | There are too many flags, and many of them have non-mnemonic names. |
---|
1370 | .sp |
---|
1371 | .I Munchlist |
---|
1372 | does not deal very gracefully with dictionaries which contain |
---|
1373 | "non-word" characters. |
---|
1374 | Such characters ought to be deleted from the dictionary with a warning |
---|
1375 | message. |
---|
1376 | .sp |
---|
1377 | .I Findaffix |
---|
1378 | and |
---|
1379 | .I munchlist |
---|
1380 | require tremendous amounts of temporary file space for |
---|
1381 | large dictionaries. |
---|
1382 | They do respect the TMPDIR environment variable, so this space can be |
---|
1383 | redirected. |
---|
1384 | However, a lot of the temporary space needed is for sorting, so TMPDIR |
---|
1385 | is only a partial help on systems with an uncooperative |
---|
1386 | .IR sort (1). |
---|
1387 | ("Cooperative" is defined as accepting the undocumented -T switch). |
---|
1388 | At its peak usage, |
---|
1389 | .I munchlist |
---|
1390 | takes 10 to 40 times the original |
---|
1391 | dictionary's size in Kb. |
---|
1392 | (The larger ratio is for dictionaries that already have heavy affix |
---|
1393 | use, such as the one distributed with |
---|
1394 | .IR ispell ). |
---|
1395 | .I Munchlist |
---|
1396 | is also very slow; |
---|
1397 | munching a normal-sized dictionary (15K roots, 45K expanded words) takes |
---|
1398 | around an hour on a small workstation. |
---|
1399 | (Most of this time is spent in |
---|
1400 | .IR sort (1), |
---|
1401 | and |
---|
1402 | .I munchlist |
---|
1403 | can run much faster on machines that have a more modern |
---|
1404 | .I sort |
---|
1405 | that makes better use of the memory available to it.) |
---|
1406 | .I Findaffix |
---|
1407 | is even worse; |
---|
1408 | the smallest English dictionary cannot be processed with this script in |
---|
1409 | a mere 50Kb of free space, and even after specifying switches to |
---|
1410 | reduce the temporary space required, the script will run for over 24 hours |
---|
1411 | on a small workstation. |
---|
1412 | .SH AUTHOR |
---|
1413 | Pace Willisson (pace@mit-vax), 1983, based on the PDP-10 assembly version. |
---|
1414 | That version was written by |
---|
1415 | R. E. Gorin in 1971, |
---|
1416 | and later revised by W. E. Matson (1974) and W. B. Ackerman (1978). |
---|
1417 | .P |
---|
1418 | Collected, revised, and enhanced for the Usenet by Walt Buehring, 1987. |
---|
1419 | .P |
---|
1420 | Table-driven multi-lingual version by Geoff Kuenning, 1987-88. |
---|
1421 | .P |
---|
1422 | Large dictionaries provided by Bob Devine (vianet!devine). |
---|
1423 | .P |
---|
1424 | A complete list of contributors is too large to list here, but is |
---|
1425 | distributed with the ispell sources in the file "Contributors". |
---|
1426 | .SH VERSION |
---|
1427 | The version of ispell described by this manual page is |
---|
1428 | International Ispell Version 3.1.00, 10/08/93. |
---|