Users of various languages have established many more-or-less standard coding systems for representing them. Emacs does not use these coding systems internally; instead, it converts from various coding systems to its own system when reading data, and converts the internal coding system to other coding systems when writing data. Conversion is possible in reading or writing files, in sending or receiving from the terminal, and in exchanging data with subprocesses.
Emacs assigns a name to each coding system. Most coding systems are used
for one language, and the name of the coding system starts with the language
name. Some coding systems are used for several languages; their names
usually start with ‘iso’. There are also special coding systems,
no-conversion, raw-text and emacs-mule, which do not convert printing
characters at all.
A special class of coding systems, collectively known as codepages, is
designed to support text encoded by MS-Windows and MS-DOS software. The
names of these coding systems are cpnnnn, where nnnn is a 3- or 4-digit
number of the codepage. You can use these encodings just like any other
coding system; for example, to visit a file encoded in codepage 850, type
C-x <RET> c cp850 <RET> C-x C-f filename <RET>(8).
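To see why naming the right codepage matters, here is a minimal sketch outside Emacs. It assumes that Python's built-in cp850 and latin-1 codecs are reasonable stand-ins for the Emacs coding systems of the same kind; the sample bytes are invented for illustration.

```python
# Illustration only: Python codecs stand in for Emacs coding systems.
# The bytes below are how a DOS program using codepage 850 would store
# the word "café" (0x82 is e-acute in cp850).
raw = b"caf\x82"

as_cp850 = raw.decode("cp850")     # decoded with the codepage that wrote it
as_latin1 = raw.decode("latin-1")  # wrong guess: 0x82 is a control character here

# The same bytes give different text depending on the coding system chosen.
same_text = (as_cp850 == as_latin1)
```

Decoding with the wrong coding system does not fail; it silently yields different characters, which is why specifying cp850 before visiting the file is useful.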
In addition to converting various representations of non-ASCII characters, a coding system can perform end-of-line conversion. Emacs handles three different conventions for how to separate lines in a file: newline, carriage-return linefeed, and just carriage-return.
C-h C coding <RET>
Describe coding system coding.
C-h C <RET>
Describe the coding systems currently in use.
M-x list-coding-systems
Display a list of all the supported coding systems.
The command C-h C (describe-coding-system) displays information about
particular coding systems, including the end-of-line conversion specified
by those coding systems. You can specify a coding system name as the
argument; alternatively, with an empty argument, it describes the coding
systems currently selected for various purposes, both in the current buffer
and as the defaults, and the priority list for recognizing coding systems
(see section Recognizing Coding Systems).
To display a list of all the supported coding systems, type M-x list-coding-systems. The list gives information about each coding system, including the letter that stands for it in the mode line (see section The Mode Line).
Each of the coding systems that appear in this list (except for
no-conversion, which means no conversion of any kind) specifies how and
whether to convert printing characters, but leaves the choice of
end-of-line conversion to be decided based on the contents of each file.
For example, if the file appears to use the sequence carriage-return
linefeed to separate lines, DOS end-of-line conversion will be used.
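The idea of choosing the end-of-line conversion from the file contents can be sketched as follows. This is a simplified heuristic written for illustration; it is not the detection algorithm Emacs actually uses, and the function name detect_eol is hypothetical.

```python
def detect_eol(data: bytes) -> str:
    """Guess the end-of-line convention of DATA: 'dos', 'mac' or 'unix'."""
    crlf = data.count(b"\r\n")
    cr = data.count(b"\r") - crlf   # carriage returns not followed by linefeed
    lf = data.count(b"\n") - crlf   # linefeeds not preceded by carriage return
    if crlf and not cr and not lf:
        return "dos"                # every line break is carriage-return linefeed
    if cr and not crlf and not lf:
        return "mac"                # every line break is a lone carriage-return
    return "unix"                   # plain newlines, mixed usage, or no breaks
```

In this sketch a file with inconsistent line endings falls back to "unix", that is, no conversion, so that no bytes are altered on a wrong guess.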
Each of the listed coding systems has three variants which specify exactly what to do for end-of-line conversion:
…-unix
Don't do any end-of-line conversion; assume the file uses newline to separate lines. (This is the convention normally used on Unix and GNU systems.)
…-dos
Assume the file uses carriage-return linefeed to separate lines, and do the appropriate conversion. (This is the convention normally used on Microsoft systems.(9))
…-mac
Assume the file uses carriage-return to separate lines, and do the appropriate conversion. (This is the convention normally used on the Macintosh system.)
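What the three variants do to line separators can be sketched like this. The helper names decode_variant and encode_variant are hypothetical, and Python codec names such as "latin-1" stand in for Emacs coding systems; this is an illustration of the convention, not Emacs code.

```python
# Line separator used in the file for each end-of-line variant.
EOL_SEPARATOR = {"unix": "\n", "dos": "\r\n", "mac": "\r"}

def decode_variant(data: bytes, charset: str, eol: str) -> str:
    """Decode DATA and turn the separator named by EOL into newline."""
    text = data.decode(charset)
    sep = EOL_SEPARATOR[eol]
    return text if sep == "\n" else text.replace(sep, "\n")

def encode_variant(text: str, charset: str, eol: str) -> bytes:
    """Turn newlines into the separator named by EOL, then encode TEXT."""
    return text.replace("\n", EOL_SEPARATOR[eol]).encode(charset)
```

Note that the "unix" variant leaves any carriage-return characters in place, which matches the description above: no end-of-line conversion is done at all.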
These variant coding systems are omitted from the list-coding-systems
display for brevity, since they are entirely predictable. For example, the
coding system iso-latin-1 has variants iso-latin-1-unix, iso-latin-1-dos
and iso-latin-1-mac.
The coding systems unix, dos, and mac are aliases for undecided-unix,
undecided-dos, and undecided-mac, respectively. These coding systems
specify only the end-of-line conversion, and leave the character code
conversion to be deduced from the text itself.
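Because the variant names are predictable, a name can be split mechanically into a base coding system and an end-of-line type. The sketch below assumes only the naming rules described above; split_coding_system is a hypothetical helper, not an Emacs function.

```python
from typing import Optional, Tuple

EOL_SUFFIXES = ("unix", "dos", "mac")

def split_coding_system(name: str) -> Tuple[str, Optional[str]]:
    """Return (base, eol) for NAME; eol is None when it is left to detection."""
    if name in EOL_SUFFIXES:
        # Bare aliases: "dos" stands for "undecided-dos", and so on.
        return ("undecided", name)
    base, sep, suffix = name.rpartition("-")
    if sep and suffix in EOL_SUFFIXES:
        return (base, suffix)
    return (name, None)
```

For example, "iso-latin-1-dos" splits into the base "iso-latin-1" and the type "dos", while plain "iso-latin-1" has no end-of-line type of its own.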
The coding system raw-text is good for a file which is mainly ASCII text,
but may contain byte values above 127 which are not meant to encode
non-ASCII characters. With raw-text, Emacs copies those byte values
unchanged, and sets enable-multibyte-characters to nil in the current
buffer so that they will be interpreted properly. raw-text handles
end-of-line conversion in the usual way, based on the data encountered, and
has the usual three variants to specify the kind of end-of-line conversion
to use.
In contrast, the coding system no-conversion specifies no character code
conversion at all: none for non-ASCII byte values and none for end of
line. This is useful for reading or writing binary files, tar files, and
other files that must be examined verbatim. It, too, sets
enable-multibyte-characters to nil.
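The difference between raw-text and no-conversion can be illustrated on a few bytes. This sketch is an assumption-laden simplification: both function names are hypothetical, and the raw-text stand-in only handles the carriage-return linefeed case rather than detecting the convention from the data.

```python
def read_raw_text(data: bytes) -> bytes:
    """Keep every byte value as is, but turn CR LF line ends into newline."""
    return data.replace(b"\r\n", b"\n")

def read_no_conversion(data: bytes) -> bytes:
    """Return DATA verbatim: no character decoding, no end-of-line conversion."""
    return data

# First bytes of a binary file header that happens to contain CR LF.
sample = b"\x89PNG\r\n\x1a\n"
altered = read_raw_text(sample)        # the CR byte is dropped
verbatim = read_no_conversion(sample)  # identical to the input
```

The high byte 0x89 survives in both cases, but only the verbatim reading preserves the carriage-return, which is why binary files call for no-conversion rather than raw-text.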
The easiest way to edit a file with no conversion of any kind is with the
M-x find-file-literally command. This uses no-conversion, and also
suppresses other Emacs features that might convert the file contents
before you see them. See section Visiting Files.
The coding system emacs-mule means that the file contains non-ASCII
characters stored with the internal Emacs encoding. It handles end-of-line
conversion based on the data encountered, and has the usual three variants
to specify the kind of end-of-line conversion.
The character translation feature can modify the effect of various
coding systems, by changing the internal Emacs codes that decoding
produces. For instance, the command unify-8859-on-decoding-mode enables a
mode that “unifies” the Latin alphabets when decoding text.
This works by converting all non-ASCII Latin-n characters to
either Latin-1 or Unicode characters. This way it is easier to use various
Latin-n alphabets together. (In a future Emacs version we hope to
move towards full Unicode support and complete unification of character
sets.)
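The effect of unification can be pictured with a short sketch outside Emacs. It assumes Python's iso-8859-2 (Latin-2) and iso-8859-15 (Latin-9) codecs as stand-ins; Python always decodes to Unicode, so it shows the unified result rather than the Emacs mechanism that produces it.

```python
# The same letter, S with caron, has a different byte value in two
# Latin-n character sets.
ch = "\u0160"
latin2_bytes = ch.encode("iso-8859-2")
latin9_bytes = ch.encode("iso-8859-15")

# After decoding, both byte values map to one and the same Unicode
# character, so text from the two sources can be compared and searched
# together.
from_latin2 = latin2_bytes.decode("iso-8859-2")
from_latin9 = latin9_bytes.decode("iso-8859-15")
unified = (from_latin2 == from_latin9)
```

Without unification, each Latin-n character set would keep its own internal code for this letter, and the two copies would not compare equal.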
If you set the variable enable-character-translation to nil, that disables
all character translation (including unify-8859-on-decoding-mode).
This document was generated by Eric Reinbold on 23 February 2009 using texi2html 1.78.