Type preservation



December 7, 2004



The time scale


It is difficult to think about the future well past your own lifetime. But a question that has been nagging me is how to preserve type on a time scale of tens to hundreds of years. It intrigues me, because I see how hard it sometimes is to get my hands on a clean type specimen that is barely more than a hundred years old.



Type-eating insects


Type was initially passed on in two forms: cast metal, and specimens printed on paper. Those were the choices until about 1950. Both carried risks. Many a fire has destroyed complete collections of types. Just ask Frederic Goudy. Floods, careless heirs, insects, wars, forced emigration and political upheaval have all taken a toll on the world's type treasures.



Genetic weakness


And then the world exploded into the digital age, where electronic versions of type started proliferating. Literally hundreds of electronic type formats have been invented, only a handful of which are dominant at any given moment. It is preposterous to claim that one format will outlive or outperform any other. None of today's formats will be with us in the year 2100. And we have created a new monster insect, the computer. Computers live about as long as rats, and virtually no computer outlives a cat. In fact, the average lifespan of a computer decreases with every passing year! Genetic weakness is built into the system. Software has a matching life cycle, as computers and operating systems are often bundled. So, what will we do with our favorite TrueType or PostScript files when there is no more software to read or display them? Can we count on a limitless supply of volunteer hackers to keep us afloat in such software once formats are discontinued? Fat chance: out of sight, out of mind!



Beautiful bodies


So, how will we preserve our typefaces? Shall we rely on electronic formats, or shall we return to images? One can forget about fonts in the form of programs, because almost every programming language ever invented has bitten the dust. Sad to say, but Knuth's beautiful Metafont programs for Computer Modern, a family of 72 optically adjusted fonts, will be unreadable in a century. The programs are splendid for generating fonts today, and indeed, the Computer Modern family lives a happy life not in its Metafont skin, but in its Type 1 PostScript dress. That dress will have to change many times, and with each change, a bit of the original design will be lost. For example, in Computer Modern, we have virtually limitless accuracy in positioning the points. The Type 1 version must round each coordinate to the nearest thousandth of the em, so only three digits of accuracy are left. More will be lost with every conversion, as layer upon layer of dresses will hide more and more of the shapely body.
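To make the rounding loss concrete, here is a minimal sketch in Python. It assumes a hypothetical control point given as an exact fraction of the em, and it pushes that point through a made-up chain of conversions between a 1000-unit grid and a 2048-unit grid. The function names and the sample point are illustrations only, not part of any font tool.

```python
from fractions import Fraction

def snap(coord, units_per_em):
    """Round an exact coordinate (a fraction of the em) to the nearest grid unit."""
    return Fraction(round(coord * units_per_em), units_per_em)

def convert_chain(point, grids):
    """Snap a point to each grid in turn and return the point after every step."""
    steps = []
    for grid in grids:
        point = tuple(snap(c, grid) for c in point)
        steps.append(point)
    return steps

# Hypothetical control point, exact in the source design.
original = (Fraction(1, 3), Fraction(2, 7))

for grid, point in zip((1000, 2048, 1000), convert_chain(original, (1000, 2048, 1000))):
    drift = max(abs(p - o) for p, o in zip(point, original))
    print(f"grid {grid:>4}: x={float(point[0]):.6f} y={float(point[1]):.6f} "
          f"drift={float(drift):.6f}")
```

After the first step the point is already off by up to half a grid unit, and no later conversion can bring the exact value back.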



No oxygen please


So, maybe we could print out all 72 fonts right away in the highest possible resolution, on paper that is immediately protected from oxygen by sealing it in a vacuum, repeat this a thousand times, and store the copies in a thousand different museums all over the world. That will ensure their survival, but it is not practical. Electronic formats that are too closely linked to applications, software or hardware are also doomed. The only file format that has survived since the inception of the computer is the text format, where each character corresponds to a letter or number, just as in a book. There is a software-independent one-to-one connection between such files and printed versions. But just to be sure, we should also explicitly state how each sequence of 8 bits in a file corresponds to a letter, because one cannot even be sure that such things will be evident in the future. So, the old text format, files that end with ".txt" in today's lingo, is the medium of choice for preserving data across decades and generations of computers and technological upheaval. Such files are now stored on various media such as hard disks, tape back-ups, CDs, DVDs, and so forth. Undoubtedly, new media will be invented that we cannot even imagine. If the hardware media can no longer be read, they become useless, so we must protect ourselves by parallelism. We must spread those files to many people so that each one survives somewhere, somehow, in one subpopulation.
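The explicit statement of how bit patterns map to letters could itself be generated and printed along with the file. The sketch below is one possible way to do it, assuming the file is plain ASCII; the function name `code_table` and the layout of the table are my own invention.

```python
import unicodedata

def code_table(text):
    """List every distinct byte in an ASCII text with its 8-bit pattern and name."""
    data = text.encode("ascii")  # raises UnicodeEncodeError if the text is not ASCII
    lines = []
    for byte in sorted(set(data)):
        # Control characters have no Unicode name, so a generic label is used.
        name = unicodedata.name(chr(byte), "CONTROL CHARACTER")
        lines.append(f"{byte:08b} = {name}")
    return lines

for line in code_table("Ab"):
    print(line)
```

A table like this, kept on paper next to the data, tells a future reader what each group of 8 bits stands for without any software.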



Text format


The arguments used above thus lead in one direction: the description of each typeface in a simple universal text format, one that is easy to print and interpret. This format can describe outlines, but it does not have to. For example, it is not at all certain that Bezier curves will still be technologically or mathematically relevant a hundred years from now. It does not make sense to have files full of control points without saying what these numbers mean. Future readers might think that these are messages from Mars! A textual explanation, complete with mathematical definitions, could be provided somewhere in each file. This is quite cumbersome, to say the least. Another option is to preserve glyphs as pictures in a format that, unlike JPEG or GIF, is again universally readable: a true bitmap format with rows of binary pixels. It does not matter whether these are rows of 0s and 1s or rows of white and black squares. Storing glyphs this way amounts to keeping pictures of them in a plain text file.
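As a rough illustration of such a text bitmap, the sketch below stores a tiny, made-up glyph as rows of 0s and 1s and reads it back with a few lines of Python. The glyph, the function names and the choice of "#" and "." for display are assumptions for the example; the plain PBM format of Netpbm is similar in spirit.

```python
# Hypothetical 5-by-5 letter T, one text row per pixel row (1 = ink, 0 = paper).
GLYPH_T = """\
11111
00100
00100
00100
00100
"""

def parse_bitmap(text):
    """Turn rows of 0s and 1s into a list of lists of integers."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    width = len(rows[0])
    for row in rows:
        if len(row) != width or set(row) - {"0", "1"}:
            raise ValueError(f"bad bitmap row: {row!r}")
    return [[int(ch) for ch in row] for row in rows]

def show(pixels):
    """Render the bitmap with '#' for ink and '.' for paper."""
    return "\n".join("".join("#" if p else "." for p in row) for row in pixels)

print(show(parse_bitmap(GLYPH_T)))
```

The point is that the file can be read by eye, or printed as is, even when no program survives.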



Monstrous files


And that brings us to my last point, the absolute necessity to take pictures of the highest possible resolution and to make monstrous bitmap files that have maximum accuracy. For what may seem redundant today may be considered old technology or noise from the 20th century in a few decades. Just look at pictures taken with digital cameras five years ago. The sight is almost unbearable. Digital cameras will soon evolve to the point that they surpass the best prints. If you can afford an 8 megapixel digital camera today, buy it. It too will be obsolete soon, but at least you are doing your pictures and your heirs a favor. Similarly, we should describe outlines not with integers on a scale from 0 to 1000, but on integer grids with scales that range from 0 to one million or one billion. We should not impose limits on how complex the description of any character is or how many characters can be held in a font. A font is a possibly unlimited collection of glyphs. If text bitmaps are preferred, then draw each glyph on those oversized grids. Not too long from today, those massive storage requirements will appear trivial.
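The gain from a finer grid is simple arithmetic. The sketch below, with a hypothetical coordinate of one third of the em, compares the rounding error on a 1000-unit grid with that on a grid of one million units. The function name is an assumption for the example.

```python
from fractions import Fraction

def grid_error(coord, units_per_em):
    """Exact distance between a coordinate and its nearest point on the grid."""
    snapped = Fraction(round(coord * units_per_em), units_per_em)
    return abs(snapped - coord)

coord = Fraction(1, 3)  # hypothetical position, as a fraction of the em
for units in (1000, 10**6):
    print(f"{units:>8} units per em: error = {grid_error(coord, units)} em")
```

In general the worst-case error is half a grid unit, so it shrinks in direct proportion to the grid scale.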



Mothers


Note that I am not saying that we should use these overweight formats for communication with printers or screens. They are just a way of storing the essence of a typeface over long time periods. Smaller electronic formats can be derived from that mother monster by programs that run on present-day computers. We preserve the mothers, and do not have to worry about changes in printers, programs, screens, and all other media. We can throw the babies out with the used keyboards.
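Deriving a baby from the mother can be a very small program. The sketch below is one naive way to do it, assuming the mother is a bitmap held as a list of rows of 0s and 1s: it shrinks the bitmap by an integer factor, inking a small pixel when at least half of its block is inked. The function name and the majority rule are my assumptions; a real rasterizer would be more careful.

```python
def downsample(pixels, factor):
    """Shrink a 0/1 bitmap by an integer factor using a majority vote per block."""
    height, width = len(pixels), len(pixels[0])
    if height % factor or width % factor:
        raise ValueError("bitmap size must be a multiple of the factor")
    small = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            ink = sum(pixels[y][x]
                      for y in range(top, top + factor)
                      for x in range(left, left + factor))
            # Ink the small pixel when at least half of the block is inked.
            row.append(1 if 2 * ink >= factor * factor else 0)
        small.append(row)
    return small

# Hypothetical 4-by-4 mother bitmap reduced to 2-by-2.
mother = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
]
print(downsample(mother, 2))
```

The mother is never modified, so a better program can always be run on it later.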



Your job


So, there you have it, some thoughts on how to preserve all those beautiful fonts. For the font technicians: please go back to the drawing board, and give us something that addresses these points. Junk the 1000 by 1000 grid that is standard in Type 1, junk the 2048 by 2048 grid in TrueType, forget about OpenType, take the limits off Unicode, and give us something closer to the ideal.





Luc Devroye, type ecologist
School of Computer Science
McGill University
Montreal, Canada H3A 2K6
luc@cs.mcgill.ca
http://luc.devroye.org/index.html