TYPE DESIGN INFORMATION PAGE last updated on Wed Oct 22 22:09:57 EDT 2014



Extracting fonts from PDF files



[Drawing by Ronald Searle entitled Larches]

Luc Devroye
McGill University
Montreal, Canada
lucdevroye@gmail.com
http://luc.devroye.org





Extract Embedded Fonts from PDF Files

Mirko Scholz explains how to use pdftosrc (from pdfTeX) to extract type 1 fonts from PDF files.

Håvar Ingmund Henriksen

Håvar Ingmund Henriksen (b. 1980) is from Skjervøy, in Nord-Troms, in the northern part of Norway. His interests include technology and comics. In 2009, he used FontStruct to create LCD DotMatrix. He writes: This is the Dot Matrix LCD Font used on the Ricoh Aficio AP3800C, Aficio AP3200 and AP306D printers, among others. He also explains how to use FontForge to extract fonts from PDF files. He says: Basically you just need to select "Extract from PDF" in the filter section of the "Open Font" dialog box used when opening files. When you have selected your PDF file, a "Pick a font" dialog box will open where you can select which font to open. Then you just need to compact the font by selecting "Compact" in the "Encoding" menu. This removes all unused glyphs from the font. Then edit the Font Info and save the font as a font file (usually TrueType is best). Quote from the article: "Beware though, sometimes when a font is embedded into a PDF it will only contain [glyphs for] characters used. So, if the PDF file that you are trying to extract from does not contain the letter "P" [glyph], then that letter will not show up in FontForge." (You can see an example of this in the image above: the PDF file the font was extracted from did not contain glyphs for all the letters in the English alphabet.)


[HOWTO] Extract Fonts from a PDF File using FontForge

A Japanese hacker suggests this method for Linux. Install FontForge:
$ sudo apt-get install fontforge

Once FontForge is installed, start it:
$ fontforge

On the "Open Font" screen, go down to where it says "Filter" and change it to "Extract from PDF".
Select your PDF and a "Pick a font" window will open.
Select the font you want to extract and click OK.

A window with a display of the font will show up. It's not quite ready to turn into a TTF yet. Here's how to prepare it:

Go to the Encoding menu and select "Compact". This will cause FontForge to remove all characters that are not defined in the embedded font. Beware though: a font embedded in a PDF sometimes contains only the characters actually used in the document. So, if the PDF file that you are trying to extract from does not contain the letter "P", then that letter will not show up in FontForge. Check that all the characters you need are displayed, and then head over to the Element menu.

Click on Font Info.
You can update the Fontname, Family Name, and most importantly, "Name for Humans". This field is the name under which the font will appear in your editing program. The font name is usually a little garbled when you extract it, so just make it something readable. If there is a copyright notice displayed at the bottom, you should probably stop what you are doing, since that usually means the font should be purchased.

If there's no copyright, click on "OK". Then go to File > Generate Fonts.

Select the type of font you want to save as (usually TrueType is best), and click on Save. You may encounter some messages about Non-standard Em size and Bad Private Dictionary errors. Just click on Save again and you should be OK.

Then, find your font file and open it up to make sure that it displays properly.
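
The garbled font name mentioned in the Font Info step is often a subset tag: PDF producers conventionally prefix the name of a subsetted font with six uppercase letters and a plus sign (for example ABCDEF+AmasisMT). The following is a minimal shell sketch for spotting and stripping that tag before you type a readable name. The function names are illustrative and are not part of FontForge.

```shell
#!/bin/sh
# Hypothetical helpers (not part of FontForge): detect and strip the
# six-letter subset tag that PDF producers prepend to subsetted font names.

# Succeeds if the name looks like TAGXYZ+RealName.
is_subset_name() {
    printf '%s\n' "$1" | LC_ALL=C grep -q '^[A-Z]\{6\}+.'
}

# Prints the name without the tag; untagged names pass through unchanged.
strip_subset_tag() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/^[A-Z]\{6\}+//'
}
```

For example, strip_subset_tag 'ABCDEF+AmasisMT' prints AmasisMT. A name that carries such a tag is a strong hint that only some glyphs were embedded, so check the glyph table carefully before generating the font.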

MuPDF

MuPDF is a free lightweight PDF viewer and toolkit written in portable C. It includes pdfdraw (PDF to PNM image converter), pdfextract (extract fonts and images), pdfinfo, pdfclean (rewrite PDF files), and pdfshow. By Artifex Software.

PDF extraction and piracy

A discussion on Typophiles about the process of extracting fonts from PDF files. The more noteworthy contributions:

  • Thomas Phinney: Obviously we at Adobe are not very worried, as we make and post PDFs showing every glyph in each font we sell. I have no reason to believe that piracy via ripping fonts out of the PDF is a significant portion of all piracy of our fonts. If it were 10% or more, I'd start to be concerned about doing things differently. But I have no reason to think it's even 1%. It's just so much easier to get the actual font from pirate sources.
  • Haley Fiege: Ripping fonts from pdf is such an archaic way of pirating.

PDF font extraction

Until 2001, PDF font extraction was simple. Most PDF files created after 2001 embed only partial character sets, but most older PDF files contain full type 1 or other font files. Mirko Scholz recommends pdftosrc, part of the pdfTeX package. Alternatively, one can use Acrobat 3 (*not* higher versions) to output a PostScript file from a PDF file. Inspect the PostScript file to find the fonts, usually located between BeginResource and EndResource lines (or ending at the line with "cleartomark"). You may have to add a header line (example: %!PS-AdobeFont-1.0: AmasisMT (001.003)). The PFA file (in the case of type 1) needs to be converted to PFB using t1utils, a free package. Remember that no metrics file (AFM, PFM) can be extracted from a PDF file! Several utilities (e.g., Crossfont) automatically generate a basic PFM file. See also the discussion here.
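
To make the "inspect the PostScript file" step concrete, here is a minimal sketch that lists the font resources declared in a PostScript file. It assumes the fonts are announced by comment lines of the form "%%BeginResource: font NAME"; the function name list_ps_fonts is made up for this example.

```shell
#!/bin/sh
# Sketch: print the name of every font resource declared in a PostScript
# file, one per line. Other resource types (procset, file, ...) are skipped.
list_ps_fonts() {
    awk '$1 == "%%BeginResource:" && $2 == "font" { print $3 }' "$1"
}
```

Run it as list_ps_fonts file.ps, then search the file for each reported name to find the lines to cut out. For the PFA to PFB conversion, the t1utils package provides the t1binary program (t1binary font.pfa font.pfb).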

PDF Unlock

On-line PDF unlocker. Upload your file, and get an unlocked version back.

UNIX shell script for extracting fonts from PDF files

A UNIX (FreeBSD) shell script for extracting type 1 fonts from a PDF file, after a .ps file has been produced as output of the xpdf utility. The fonts are in "pfa" format, so you may want to use "pfa2pfb" or "t1binary" (from t1utils) to make a "pfb" file from them:
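#!/bin/sh
# pdfextract file.ps
#
# Extracts type 1 fonts from a .ps file generated from xpdf output.
# Fonts are written to 1.pfa, 2.pfa, ... in the current directory.

gawk 'BEGIN { m = 0; a = 1 }
# Inside a font: copy the line to the current .pfa file.
m == 1 { print $0 > (a ".pfa") }
# Start of a font resource: write a PFA header line in place of the comment.
$1 == "%%BeginResource:" && $2 == "font" { m = 1; print "%!PS-AdobeFont-1.0: " $3 > (a ".pfa") }
# End of the font: close the file and advance the counter.
m == 1 && $1 ~ /cleartomark/ { m = 0; close(a ".pfa"); a += 1 }' "$1"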
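
Before converting the numbered .pfa files, it can help to check that each one has the landmarks of a type 1 font. The following is a rough sketch only; the function name looks_like_pfa is hypothetical, and passing the check does not prove the font is valid.

```shell
#!/bin/sh
# Sketch: rough sanity check for an extracted .pfa file.
looks_like_pfa() {
    # The first line must be a PostScript header such as
    # "%!PS-AdobeFont-1.0: Name".
    head -n 1 "$1" | grep -q '^%!' || return 1
    # A type 1 font switches to its encrypted section with "eexec" ...
    grep -q 'eexec' "$1" || return 1
    # ... and ends with "cleartomark".
    grep -q 'cleartomark' "$1" || return 1
}
```

A possible use: for f in *.pfa; do looks_like_pfa "$f" || echo "suspicious: $f"; done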



xpdf

View PDF files on the X Window System, and extract images from PDF files. By Derek B. Noonburg. GNU license software, faster and more reliable than Acrobat Reader. To extract fonts from PDF files: start xpdf and print to a file (a PostScript file). That file has all the fonts neatly embedded in pfa format, except for the first line of each font (the BeginResource line should be replaced by the first line of a pfa font). The last line of each font is "cleartomark". Use a pfa to pfb converter, and you are done.
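
The first-line replacement described above can be sketched with sed. This assumes you have already cut one font out of the PostScript file into its own file, starting at its "%%BeginResource: font NAME" line; the function name fix_pfa_header is made up for this example.

```shell
#!/bin/sh
# Sketch: print a cut-out font with its first line,
# "%%BeginResource: font NAME", rewritten as the PFA header
# "%!PS-AdobeFont-1.0: NAME". All other lines pass through unchanged.
fix_pfa_header() {
    sed '1s/^%%BeginResource: font \(.*\)$/%!PS-AdobeFont-1.0: \1/' "$1"
}
```

Usage: fix_pfa_header cut.ps > font.pfa. The header written this way carries only the font name, since a version string cannot be recovered from the BeginResource line.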

X-PDF Browser

Free program to embed files into PDF files and extract files from them.