However, the maximum Unicode code point is #x10ffff, so a UTF-8 encoding of Unicode requires at most four bytes per code point.
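The four-byte limit follows from the ranges UTF-8 assigns to each encoded length. As a minimal sketch (the function name utf8-octet-count is made up for illustration and isn't part of any library), you could compute the length like this:

```lisp
;; Hypothetical helper: how many octets UTF-8 needs for one code point.
(defun utf8-octet-count (code-point)
  (cond ((< code-point #x80) 1)        ; ASCII range
        ((< code-point #x800) 2)
        ((< code-point #x10000) 3)
        ((<= code-point #x10ffff) 4)   ; up to the largest Unicode code point
        (t (error "#x~X is beyond the Unicode range." code-point))))

(utf8-octet-count #x41)      ; => 1
(utf8-octet-count #x10ffff)  ; => 4
```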
266
If you need to parse a file format that uses other character codes, or if you need to parse files containing arbitrary Unicode strings using a Common Lisp implementation that doesn't support Unicode, you can always represent such strings in memory as vectors of integer code points. They won't be Lisp strings, so you won't be able to manipulate or compare them with the string functions, but you'll still be able to do anything with them that you can with arbitrary vectors.
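As a rough illustration of what that looks like, here's a sketch that stores two "strings" as vectors of code points and works on them with the ordinary sequence functions. The variable names are invented for the example.

```lisp
;; Two "strings" held as vectors of integer code points (hypothetical data).
(defparameter *name* (vector #x48 #x69 #x1F600))
(defparameter *same* (vector #x48 #x69 #x1F600))

(length *name*)          ; => 3
(equalp *name* *same*)   ; => T, element-wise comparison instead of STRING=
(position #x69 *name*)   ; => 1
(subseq *name* 0 2)      ; => #(72 105)
```

STRING= and friends won't accept these vectors, but EQUALP, POSITION, SUBSEQ, MISMATCH, and the other sequence functions cover much of the same ground.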
267
Unfortunately, the language itself doesn't always provide a good model in this respect: the macro DEFSTRUCT, which I don't discuss since it has largely been superseded by DEFCLASS, generates functions whose names are derived from the name of the structure it's given. DEFSTRUCT's bad example leads many new macro writers astray.
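To see the behavior in question, consider this small example. The structure name point is just for illustration.

```lisp
;; One DEFSTRUCT form defines several functions whose names never
;; appear in the source; they're built from the structure name.
(defstruct point x y)

(defparameter *p* (make-point :x 1 :y 2))  ; constructor MAKE-POINT

(point-x *p*)           ; => 1, accessor POINT-X
(point-p *p*)           ; => T, predicate POINT-P
(fboundp 'copy-point)   ; true, copier COPY-POINT
```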
268
Technically there's no possibility of type or object conflicting with slot names; at worst they'd be shadowed within the WITH-SLOTS form. But it doesn't hurt anything to simply GENSYM all local variable names used within a macro template.
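Here's a minimal sketch of the idea, using a made-up macro, with-new-instance, and a made-up class, tag, that deliberately has a slot named object. Because the macro's own variable is a GENSYM, the name object in the body can only refer to the slot.

```lisp
;; Hypothetical class with a slot that happens to be named OBJECT.
(defclass tag ()
  ((size :initform 0)
   (object :initform :initial)))

;; Hypothetical macro: the variable holding the new instance is a GENSYM,
;; so it can't collide with any slot name the caller asks for.
(defmacro with-new-instance ((class &rest slots) &body body)
  (let ((objectvar (gensym "OBJECT")))
    `(let ((,objectvar (make-instance ',class)))
       (with-slots ,slots ,objectvar
         ,@body)
       ,objectvar)))

(defparameter *tag*
  (with-new-instance (tag size object)
    (setf size 10
          object :changed)))

(slot-value *tag* 'object)  ; => :CHANGED
```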
269
Using ASSOC to extract the :reader and :writer elements of spec allows users of define-binary-type to include the elements in either order; if you required the :reader element to always be first, you could then have used (rest (first spec)) to extract the reader and (rest (second spec)) to extract the writer. However, as long as you require the :reader and :writer keywords to improve the readability of define-binary-type forms, you might as well use them to extract the correct data.
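A quick sketch shows why the order doesn't matter. The spec below is a simplified stand-in with the :writer element listed first, and the helper names reader-part and writer-part are invented for the example.

```lisp
;; A simplified spec with the elements in the "wrong" order.
(defparameter *spec*
  '((:writer (out value) (write-byte value out))
    (:reader (in) (read-byte in))))

;; Hypothetical helpers: look each element up by keyword, not position.
(defun reader-part (spec) (rest (assoc :reader spec)))
(defun writer-part (spec) (rest (assoc :writer spec)))

(reader-part *spec*)  ; => ((IN) (READ-BYTE IN))
(writer-part *spec*)  ; => ((OUT VALUE) (WRITE-BYTE VALUE OUT))
```

With the positional approach, (rest (first spec)) would have returned the writer here.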
270
The ID3 format doesn't require the parent-of-type function since it's a relatively flat structure. This function comes into its own when you need to parse a format made up of many deeply nested structures whose parsing depends on information stored in higher-level structures. For example, in the Java class file format, the top-level class file structure contains a constant pool that the substructures within the class file refer to; you could use parent-of-type in the code that reads and writes those substructures to get at the top-level class file object and from there to the constant pool.
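As a rough sketch of how that might look, assume the reading code keeps a stack of the objects currently being processed in a special variable. The classes class-file and method-info, the constant-pool accessor, and lookup-constant are all hypothetical stand-ins, not a real class file parser.

```lisp
;; Assumption: the read/write code pushes each object it's working on
;; onto this list, innermost first.
(defvar *in-progress-objects* nil)

(defun parent-of-type (type)
  "Return the nearest enclosing in-progress object of TYPE, or NIL."
  (find-if (lambda (object) (typep object type)) *in-progress-objects*))

;; Hypothetical stand-ins for class file structures.
(defclass class-file ()
  ((constant-pool :initarg :constant-pool :reader constant-pool)))
(defclass method-info () ())

(defun lookup-constant (index)
  ;; Code for a nested structure reaches up to the top-level object.
  (aref (constant-pool (parent-of-type 'class-file)) index))

(defparameter *demo*
  (let ((*in-progress-objects*
          (list (make-instance 'method-info)
                (make-instance 'class-file
                               :constant-pool (vector "java/lang/Object" "main")))))
    (lookup-constant 1)))  ; => "main"
```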
271
272
Almost all file systems provide the ability to overwrite existing bytes of a file, but few, if any, provide a way to add or remove data at the beginning or middle of a file without having to rewrite the rest of the file. Since ID3 tags are typically stored at the beginning of a file, to rewrite an ID3 tag without disturbing the rest of the file you must replace the old tag with a new tag of exactly the same length. By writing ID3 tags with a certain amount of padding, you have a better chance of being able to do so—if the new tag has more data than the original tag, you use less padding, and if it's shorter, you use more.
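The arithmetic is simple enough to sketch. The function below is hypothetical and works purely in byte counts: given the size of the tag already in the file and the size of the new tag's data, it returns how much padding you'd need, or NIL if the new data won't fit and the file has to be rewritten.

```lisp
;; Hypothetical helper; both sizes are in bytes and exclude nothing,
;; that is, OLD-TAG-SIZE is the whole space the old tag occupied.
(defun padding-for-rewrite (old-tag-size new-data-size)
  (when (<= new-data-size old-tag-size)
    (- old-tag-size new-data-size)))

(padding-for-rewrite 2048 1500)  ; => 548
(padding-for-rewrite 2048 3000)  ; => NIL
```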
273
The frame data following the ID3 header could also potentially contain the illegal sequence. That's prevented using a different scheme that's turned on via one of the flags in the tag header. The code in this chapter doesn't account for the possibility that this flag might be set; in practice it's rarely used.
274
In ID3v2.4, UCS-2 is replaced by the virtually identical UTF-16, and UTF-16BE and UTF-8 are added as additional encodings.
275
The 2.4 version of the ID3 format also supports placing a footer at the end of a tag, which makes it easier to find a tag appended to the end of a file.