Emacs has two text representations—two ways to represent text in a string or buffer. These are called unibyte and multibyte. Each string, and each buffer, uses one of these two representations. For most purposes, you can ignore the issue of representations, because Emacs converts text between them as appropriate. Occasionally in Lisp programming you will need to pay attention to the difference.
In unibyte representation, each character occupies one byte and therefore the possible character codes range from 0 to 255. Codes 0 through 127 are ASCII characters; the codes from 128 through 255 are used for one non-ASCII character set (you can choose which character set by setting the variable nonascii-insert-offset).
In multibyte representation, a character may occupy more than one byte, and as a result, the full range of Emacs character codes can be stored. The first byte of a multibyte character is always in the range 128 through 159 (octal 0200 through 0237). These values are called leading codes. The second and subsequent bytes of a multibyte character are always in the range 160 through 255 (octal 0240 through 0377); these values are trailing codes.
Some sequences of bytes are not valid in multibyte text: for example, a single isolated byte in the range 128 through 159 is not allowed. But character codes 128 through 159 can appear in multibyte text, represented as two-byte sequences. All the character codes 128 through 255 are possible (though slightly abnormal) in multibyte text; they appear in multibyte buffers and strings when you do explicit encoding and decoding (see section Explicit Encoding and Decoding).
In a buffer, the buffer-local value of the variable enable-multibyte-characters specifies the representation used. The representation for a string is determined and recorded in the string when the string is constructed.
Variable: enable-multibyte-characters
This variable specifies the current buffer's text representation. If it is non-nil, the buffer contains multibyte text; otherwise, it contains unibyte text.
You cannot set this variable directly; instead, use the function set-buffer-multibyte to change a buffer's representation.
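As a minimal sketch of how a buffer's representation can be inspected and changed, the following reads enable-multibyte-characters and switches a temporary buffer with set-buffer-multibyte. The helper name my-buffer-representation is hypothetical and chosen only for illustration.

```elisp
;; Hypothetical helper (not part of Emacs): report the current
;; buffer's representation by reading the buffer-local variable.
(defun my-buffer-representation ()
  "Return the symbol `multibyte' or `unibyte' for the current buffer."
  (if enable-multibyte-characters 'multibyte 'unibyte))

(with-temp-buffer
  ;; Change the representation through the function, never by
  ;; setting the variable directly.
  (set-buffer-multibyte nil)
  (my-buffer-representation))   ; => unibyte
```

The variable is only read here; the change itself goes through set-buffer-multibyte, as the text above requires.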
Variable: default-enable-multibyte-characters
This variable's value is entirely equivalent to (default-value 'enable-multibyte-characters), and setting this variable changes that default value. Setting the local binding of enable-multibyte-characters in a specific buffer is not allowed, but changing the default value is supported, and it is a reasonable thing to do, because it has no effect on existing buffers.
The ‘--unibyte’ command line option does its job by setting the default value to nil early in startup.
Function: position-bytes position
Return the byte-position corresponding to buffer position position in the current buffer. This is 1 at the start of the buffer, and counts upward in bytes. If position is out of range, the value is nil.
Function: byte-to-position byte-position
Return the buffer position corresponding to byte-position byte-position in the current buffer. If byte-position is out of range, the value is nil.
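The relationship between buffer positions and byte positions can be sketched with the Emacs functions position-bytes and byte-to-position. The example below uses only ASCII text, where each character occupies one byte, so the two kinds of position coincide. The helper name my-byte-offsets is hypothetical.

```elisp
;; Hypothetical helper (not part of Emacs): compare character and
;; byte positions in a small multibyte buffer holding ASCII text.
(defun my-byte-offsets ()
  "Return (END-BYTE POSITION-FOR-BYTE-2 OUT-OF-RANGE-RESULT)."
  (with-temp-buffer
    (set-buffer-multibyte t)
    (insert "abc")
    (list (position-bytes (point-max))  ; byte position of the buffer end
          (byte-to-position 2)          ; buffer position for byte position 2
          (position-bytes 100))))       ; out of range, so nil

(my-byte-offsets)   ; => (4 2 nil)
```

With non-ASCII text in a multibyte buffer, the byte position of the buffer end can exceed the character position, since a character may occupy more than one byte.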
Function: multibyte-string-p string
Return t if string is a multibyte string.
Function: string-bytes string
This function returns the number of bytes in string. If string is a multibyte string, this can be greater than (length string).
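As a small illustration, the Emacs functions multibyte-string-p, length, and string-bytes can be combined to summarize a string. The helper name my-describe-string is hypothetical; for an ASCII-only string the character count and the byte count are equal.

```elisp
;; Hypothetical helper (not part of Emacs): summarize a string's
;; representation, character count, and byte count.
(defun my-describe-string (string)
  "Return a list (MULTIBYTE-P CHARS BYTES) describing STRING."
  (list (multibyte-string-p string)
        (length string)
        (string-bytes string)))

;; For ASCII text, the second and third elements are both 3.
(my-describe-string "abc")
```

For a multibyte string containing non-ASCII characters, the third element can be larger than the second.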
This document was generated by Eric Reinbold on October 13, 2007 using texi2html 1.78.