Things remaining to be done to ispell:

- It might be nice to support multiple personal dictionaries. On
  the other hand, it's pretty easy to combine them with "cat".
- A small amount of string space could be saved if buildhash would
  combine strings with common suffixes (e.g., "and" could be stored
  as a pointer to the tail of "bland").
- (Pace Willisson) Pace's latest version of ispell compresses
  common digrams to reduce the size of the hash file, and stores
  shorter words in the dictionary entry itself. (Since the digram
  compression also reduces word size, this is a big win.) He also
  improved startup time, at a slight running-time penalty, by
  eliminating the mass conversion of string indexes to pointers
  and just using the indexes as such whenever a string is accessed.
- (Peter Wan) Ispell should have a "server mode" for large sites, to
  get rid of the time needed to read in the dictionary. On System V,
  this could be accomplished by having the first execution of ispell
  read the dictionary into a shared-memory region. Later incarnations
  would then get the dictionary by just attaching to the region.
  One problem would be that the dictionary gets modified during
  the run, so you might still have to do a memory-to-memory copy
  after the attach. The memory needed for two copies of the dictionary
  might prohibit this on many machines. Another approach is a
  message-based "good.c server", but this too would have to deal
  with the possibility of modifying the dictionary.
- The findaffix script takes ridiculous amounts of time and disk
  space. It desperately needs to be rewritten in C, which would
  also allow it to correctly support string characters and to
  suppress reporting of choices that are already in the affix file.
- Some of the following ideas require more flag bits in the
  dictionary. Since there is only one bit remaining for most
  cases, I plan to use that bit as some sort of an indicator
  that more flag bits reside somewhere else. This will be a
  kludge, but it will save some space. Beware! Don't plan on
  using that last flag bit for something else.
- (Ian Dall) For some applications, it can be handy to allow
  multiple dictionary hash files. This shouldn't be too hard, since
  there's already similar code to support the personal dictionary.
- (Not mine, but I've lost the name of the originator.)
  Some misspellings are common, but their corrections will never
  be found by ispell's algorithm. It would be nice to be able
  to explicitly specify misspelling/correction pairs for such
  words (e.g., "lite->light").
- Ispell has too much knowledge of particular text formatters
  (i.e., nroff and TeX) wired into it. It would be nice if
  ispell were more flexible about this, so that it could be
  used with some of the other file formats (e.g., Rich Text
  Format) that are occasionally used with Unix.
- Several people, notably Peter Mutsaers, have asked if the
  affix file format could be extended to allow limited
  variables, so that you could specify things like
  "[AEIOU][DNL] > \2ING" to handle words like "pad->padding".
- Ispell should be smart enough to ignore hyphenation signs,
  such as the TeX \- hyphenation indicator.
- Since there can be two personal dictionaries, there should
  be a way to specify which dictionary a new word ("I"
  command) should be inserted into.
- For languages that form lots of compound words, such as
  German, munchlist should be smart enough to split compound
  words into their components when appropriate.
- (Jeff Edmonds) The personal dictionary should be able to
  remove certain words from the master dictionary, so that
  obscure words like "wether" wouldn't mask favorite typos.
- (Jeff Edmonds) It would be wonderful if ispell could correct
  inserted spaces such as "th e" for "the" or even "can not"
  for "cannot".
- Since ispell has dictionaries available to it, it is
  conceivable that it could automatically determine the
  language of a particular file by choosing the dictionary
  that produced the fewest spelling errors on the first few
  lines.
- It is long past the time when the ispell.1 manual page
  should have been broken up into components describing the
  various programs in the suite.
- If the -C flag is disabled, ispell should (at least
  optionally) use the "??" form to suggest possible
  compound formations.
- The elisp interface should provide a way for ispell to
  return error messages to emacs, so that users don't get
  inexplicable failures when something like a dictionary open
  fails.
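
As an illustration of the buildhash suffix-sharing item above ("and"
stored as a pointer to the tail of "bland"), here is a minimal sketch
in C. It is not taken from the ispell sources: the strpool structure,
POOL_SIZE, find_tail and pool_intern are all hypothetical names, and
the linear scan is chosen only for clarity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE 4096          /* arbitrary size for this sketch */

/* Hypothetical string pool; NOT ispell's actual string table. */
struct strpool {
    char   data[POOL_SIZE];     /* NUL-terminated strings, back to back */
    size_t used;                /* number of bytes of data[] in use */
};

/*
 * Look for a stored string that ends with `word`.  Return a pointer
 * to that tail, or NULL if no stored string has `word` as a suffix.
 */
static const char *find_tail(const struct strpool *pool, const char *word)
{
    size_t wlen = strlen(word);
    size_t pos = 0;

    while (pos < pool->used) {
        const char *s = pool->data + pos;
        size_t slen = strlen(s);

        if (slen >= wlen && strcmp(s + (slen - wlen), word) == 0)
            return s + (slen - wlen);
        pos += slen + 1;        /* skip the string and its NUL */
    }
    return NULL;
}

/*
 * Store `word` in the pool, reusing the tail of an existing string
 * when possible.  Returns NULL if the pool is full.
 */
const char *pool_intern(struct strpool *pool, const char *word)
{
    const char *hit;
    size_t wlen;
    char *dst;

    assert(pool != NULL && word != NULL);

    hit = find_tail(pool, word);
    if (hit != NULL)
        return hit;             /* share the existing bytes */

    wlen = strlen(word);
    if (pool->used + wlen + 1 > POOL_SIZE)
        return NULL;

    dst = pool->data + pool->used;
    memcpy(dst, word, wlen + 1);    /* copy including the NUL */
    pool->used += wlen + 1;
    return dst;
}
```

Sharing only happens if the longer word is already in the pool, so a
real implementation would probably insert words longest-first (or
sorted by reversed spelling) and would replace the linear scan with
something faster.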