ASCII /as'kee/ n.
[acronym: American Standard Code for Information Interchange] The predominant character set encoding of present-day computers. The modern version uses 7 bits for each character, whereas most earlier codes (including an early version of ASCII) used fewer. This change allowed the inclusion of lowercase letters -- a major win -- but it did not provide for accented letters or any other letterforms not used in English (such as the German sharp-S or the ae-ligature which is a letter in, for example, Norwegian). It could be worse, though. It could be much worse. See EBCDIC to understand how.
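The 7-bit limit is easy to see in practice. The following is a minimal sketch, not part of the original entry: the helper names `is_ascii` and `try_ascii` are hypothetical, and the sample strings are chosen only for illustration. It checks whether every character fits in the 0-127 range and shows that letters such as the sharp-S and the ae-ligature cannot be encoded.

```python
from typing import Optional


def is_ascii(text: str) -> bool:
    """Return True if every character fits in 7 bits (code points 0-127)."""
    return all(ord(ch) < 128 for ch in text)


def try_ascii(text: str) -> Optional[bytes]:
    """Encode text as ASCII, or return None if some character has no ASCII code."""
    try:
        return text.encode("ascii")
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    # Illustrative samples: plain English, German sharp-S, Norwegian ae-ligature.
    for sample in ["hello", "Stra\u00dfe", "\u00e6"]:
        print(repr(sample), is_ascii(sample), try_ascii(sample))
```

Plain English text passes; anything containing a letter outside the English repertoire does not.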
Computers are much pickier and less flexible about spelling than humans; thus, hackers need to be very precise when talking about characters, and have developed a considerable amount of verbal shorthand for them. Every character has one or more names -- some formal, some concise, some silly. Common jargon names for ASCII characters are collected here. See also individual entries for bang, excl, open, ques, semi, shriek, splat, twiddle, and Yu-Shiang Whole Fish.
This list derives from revision 2.3 of the Usenet ASCII pronunciation guide. Single characters are listed in ASCII order; character pairs are sorted by first member. For each character, common names are given in rough order of popularity, followed by names that are reported but rarely seen; official ANSI/CCITT names are surrounded by brokets: <>. Square brackets mark the particularly silly names introduced by INTERCAL. The abbreviations "l/r" and "o/c" stand for "left/right" and "open/close" respectively. Ordinary parentheticals provide some usage information.
sh(1)); pretzel; amp. [INTERCAL called this `ampersand'; what could be sillier?]
The pronunciation of # as `pound' is common in the U.S. but a bad idea; Commonwealth Hackish has its own, rather more apposite use of `pound sign' (confusingly, on British keyboards the pound graphic happens to replace #; thus Britishers sometimes call # on a U.S.-ASCII keyboard `pound', compounding the American error). The U.S. usage derives from an old-fashioned commercial practice of using a # suffix to tag pound weights on bills of lading. The character is usually pronounced `hash' outside the U.S. There are more culture wars over the correct pronunciation of this character than any other, which has led to the ha ha only serious suggestion that it be pronounced `shibboleth' (see Judges 12:6 in an Old Testament or Tanakh).
The `uparrow' name for circumflex and `leftarrow' name for underline are historical relics from archaic ASCII (the 1963 version), which had these graphics in those character positions rather than the modern punctuation characters.
The `swung dash' or `approximation' sign is not quite the same as tilde in typeset material but the ASCII tilde serves for both (compare angle brackets).
Some other common usages cause odd overlaps. The #, $, >, and & characters, for example, are all pronounced "hex" in different communities because various assemblers use them as a prefix tag for hexadecimal constants (in particular, # in many assembler-programming cultures, $ in the 6502 world, > at Texas Instruments, and & on the BBC Micro, Sinclair, and some Z80 machines). See also splat.
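To make the "hex" overlap concrete, here is a small sketch of a reader that accepts any of the four prefix tags named above. The function name `parse_hex_constant` and the `HEX_PREFIXES` set are hypothetical, introduced only for illustration; no real assembler is being modeled, and real assemblers each accept only their own prefix.

```python
import string

# Prefix tags mentioned in the entry; the set itself is an illustrative assumption.
HEX_PREFIXES = {"#", "$", ">", "&"}


def parse_hex_constant(token: str) -> int:
    """Parse a token such as '$FF' or '&c000' into an integer."""
    if len(token) < 2 or token[0] not in HEX_PREFIXES:
        raise ValueError(f"not a prefixed hex constant: {token!r}")
    digits = token[1:]
    # Reject signs, underscores, and whitespace that int() would otherwise accept.
    if not all(ch in string.hexdigits for ch in digits):
        raise ValueError(f"invalid hex digits in {token!r}")
    return int(digits, 16)


if __name__ == "__main__":
    for tok in ["#10", "$FF", ">1F", "&c000"]:
        print(tok, parse_hex_constant(tok))
```

All four spellings denote the same kind of constant, which is how four different characters came to share one spoken name.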
The inability of ASCII text to correctly represent any of the world's other major languages makes the designers' choice of 7 bits look more and more like a serious misfeature as the use of international networks continues to increase (see software rot). Hardware and software from the U.S. still tend to embody the assumption that ASCII is the universal character set and that characters have 7 bits; this is a major irritant to people who want to use a character set suited to their own languages. Perversely, though, efforts to solve this problem by proliferating `national' character sets produce an evolutionary pressure to use a smaller subset common to all those in use.
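A rough sketch of why the 7-bit assumption irritates: software that believes characters have 7 bits may simply clear the eighth bit of every byte. The example below is hypothetical (the helper name `strip_high_bit` and the choice of Latin-1 as the 8-bit `national' set are assumptions for illustration, not taken from the entry), but it shows the kind of silent damage that results.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the top bit of every byte, as 7-bit-only software might."""
    return bytes(b & 0x7F for b in data)


if __name__ == "__main__":
    # Assumption: the text is stored in Latin-1, one 8-bit byte per character.
    original = "caf\u00e9".encode("latin-1")   # the last byte is 0xE9
    mangled = strip_high_bit(original)         # 0xE9 & 0x7F == 0x69, i.e. 'i'
    print(original, mangled)
```

Pure ASCII text passes through unchanged, which is precisely why the assumption goes unnoticed by those who write only English.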