Compressing PDFs using Ghostscript under Linux

Introduction
I had to email a PDF with several high-resolution images embedded. The original file was 7.31 MB, which was unnecessarily large for a single-page PDF. I did not need the very high resolution of the pictures; I only needed the PDF to look good on-screen and in print on a normal inkjet printer.

The magic of Ghostscript
Googling the terms “compressing pdf” revealed several online services for uploading and compressing PDFs, but since I was sitting in front of a Linux computer and didn’t really trust any of these unknown providers, I ended up using Ghostscript instead. The following command compressed my PDF from 7.31 MB to 674 KB in about a second:

gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -dNOPAUSE -dQUIET -dBATCH -sOutputFile=output.pdf input.pdf
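
For repeated use, the command above could be wrapped in a small shell function. This is only a sketch: the name compress_pdf, its argument order and the size report are my own assumptions and not part of Ghostscript. It uses the same gs flags as above and refuses to write the output over the input file.

```shell
#!/bin/sh
# Hypothetical wrapper around the gs command shown above.
# compress_pdf and its argument order are illustrative assumptions.
compress_pdf() {
    input=$1
    output=$2
    preset=${3:-printer}   # default to the preset used in this post

    # Refuse to write over the source file.
    if [ "$input" = "$output" ]; then
        echo "input and output must differ" >&2
        return 1
    fi

    gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
       "-dPDFSETTINGS=/$preset" -dNOPAUSE -dQUIET -dBATCH \
       "-sOutputFile=$output" "$input" || return 1

    # Report the size in bytes before and after.
    echo "before: $(($(wc -c < "$input"))) bytes"
    echo "after: $(($(wc -c < "$output"))) bytes"
}

# Usage: compress_pdf input.pdf output.pdf printer
```

The third argument is optional, so "compress_pdf input.pdf output.pdf" behaves like the one-line command.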

According to the Ghostscript manual, the -dPDFSETTINGS option accepts several quality presets:

  • /screen – selects low-resolution output similar to the Acrobat Distiller “Screen Optimized” setting.
  • /ebook – selects medium-resolution output similar to the Acrobat Distiller “eBook” setting.
  • /printer – selects output similar to the Acrobat Distiller “Print Optimized” setting.
  • /prepress – selects output similar to Acrobat Distiller “Prepress Optimized” setting.
  • /default – selects output intended to be useful across a wide variety of uses, possibly at the expense of a larger output file.


I tried them all and ended up with the “/printer” quality. Below is a list of the file sizes and my comments:

  • 77KB – output-screen.pdf – I didn’t like this setting; JPEG compression artifacts were very visible and the quality was very low.
  • 167KB – output-ebook.pdf – Decent result, but compression clearly visible when zooming a bit.
  • 674KB – output-printer.pdf – Very good result, even when zoomed it looks respectable.
  • 986KB – output-prepress.pdf – Very good indeed, no visible artifacts when zoomed.
  • 373KB – output-default.pdf – Surprisingly, not as good as expected, with clear artifacts when zoomed.
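
A comparison like the one above could be reproduced with a short loop over the five presets. This is a sketch under assumptions: the function name compress_all_presets is made up for illustration, and the output-<preset>.pdf naming simply mirrors the file names in the list.

```shell
#!/bin/sh
# Sketch: write one output file per -dPDFSETTINGS preset so the results
# can be compared side by side. The function name is an assumption.
compress_all_presets() {
    input=$1
    for preset in screen ebook printer prepress default; do
        gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
           "-dPDFSETTINGS=/$preset" -dNOPAUSE -dQUIET -dBATCH \
           "-sOutputFile=output-$preset.pdf" "$input" || return 1
    done
    # List the results with human-readable sizes.
    ls -lh output-*.pdf
}

# Usage: compress_all_presets input.pdf
```

The sizes will differ from document to document, so the numbers in the list above should not be expected for other inputs.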

References
http://milan.kupcevic.net/ghostscript-ps-pdf/
http://www.ghostscript.com/doc/9.05/Ps2pdf.htm


  1. Cool, this is exactly what I need!
    /screen is cool for books with low-res images
    /ebook is good for texts
    /printer is perfect for a normal inkjet
    /prepress is for photos

  2. Great stuff, thanks for writing this up.
    In my case, another variable to the size/quality tradeoff was the image resolution in DPI (-dColorImageResolution). Originally I had it set to 72 which is pretty aggressive, and even /prepress in that case is full of artifacts. Upping to 150 gave great results while still being way smaller than XSane’s uncompressed output.

  3. Dear Thomas.

    Thank you very much for your wonderful compression tip.

    It seems you’re a script magician, so would it be too much to ask for a little bit of polishing/rounding? I’m sorry I’m on UX and don’t know how to do it (or would take me days), but here is one possible suggested flow.

    1. the user runs “pdf-compress [file-to-compress.pdf]”

    2. the script shows the list of possible compressions with your comments (they are great tips), each one with a number to choose like this:
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    What compression level do you prefer? (please type the number)
    1. screen, low-resolution output similar to the Acrobat Distiller “Screen Optimized” setting.
    2. ebook, medium-resolution output similar to the Acrobat Distiller “eBook” setting.
    3. printer, output similar to the Acrobat Distiller “Print Optimized” setting.
    4. prepress, output similar to Acrobat Distiller “Prepress Optimized” setting.
    5. default, output intended to be useful across a wide variety of uses, possibly at the expense of a larger output file.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    3. After the compression, we choose the image resolution in a similar way.
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    What image resolution do you need? Higher values mean more quality and a bigger size (please type the number)
    1. 72 dpi
    2. 120 dpi
    3. 150 dpi
    4. 200 dpi
    5. 300 dpi
    6. leave as it is in the original
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    4. The script compresses the original pdf taking the 2 selections and outputs the file adding after the original name the compression and resolution (so we never overwrite the original file).

    In our example case, if the user chooses 3 (printer) and 3 (150) it would result in
    file-to-compress-CompressionPrinter-Resolution150dpi.pdf

    This is fast, helps the novice user, and allows for quick exploration of possibilities (size/quality balance) since no file is overwritten and each has the parameters in its name.

    What do you think?

    In any and every case, thank you a lot for sharing your knowledge and time. 🙂

    Best…


    eduardo
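
The flow suggested in comment 3, combined with the -dColorImageResolution option mentioned in comment 2, could be sketched roughly as follows. This is a hypothetical sketch rather than a tested tool: the script name pdf-compress comes from the comment, while the function names are my own. I also assume that passing -dDownsampleColorImages=true together with -dColorImageResolution overrides the preset's colour image resolution, and choice 6 simply adds no override, so the preset's own resolution applies in that case.

```shell
#!/bin/sh
# pdf-compress: hypothetical sketch of the flow proposed in the comments.
# All function names are illustrative assumptions.

# Map a menu number to a -dPDFSETTINGS preset name.
preset_for() {
    case $1 in
        1) echo screen ;;
        2) echo ebook ;;
        3) echo printer ;;
        4) echo prepress ;;
        5) echo default ;;
        *) return 1 ;;
    esac
}

# Map a menu number to a resolution in dpi; 6 means "no override".
dpi_for() {
    case $1 in
        1) echo 72 ;;
        2) echo 120 ;;
        3) echo 150 ;;
        4) echo 200 ;;
        5) echo 300 ;;
        6) echo original ;;
        *) return 1 ;;
    esac
}

# Build the output name from the input name, preset and resolution.
output_name() {
    base=${1%.pdf}
    # Capitalise the first letter of the preset name.
    first=$(printf '%s' "$2" | cut -c1 | tr '[:lower:]' '[:upper:]')
    rest=$(printf '%s' "$2" | cut -c2-)
    if [ "$3" = original ]; then
        echo "$base-Compression$first$rest-ResolutionOriginal.pdf"
    else
        echo "$base-Compression$first$rest-Resolution${3}dpi.pdf"
    fi
}

main() {
    input=$1
    echo "What compression level do you prefer? (please type the number)"
    echo "1. screen  2. ebook  3. printer  4. prepress  5. default"
    read -r choice
    preset=$(preset_for "$choice") || { echo "invalid choice" >&2; return 1; }

    echo "What image resolution do you need? (please type the number)"
    echo "1. 72  2. 120  3. 150  4. 200  5. 300  6. leave as it is"
    read -r choice
    dpi=$(dpi_for "$choice") || { echo "invalid choice" >&2; return 1; }

    output=$(output_name "$input" "$preset" "$dpi")

    # Collect the gs arguments; the resolution override is an assumption.
    set -- -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 \
           "-dPDFSETTINGS=/$preset" -dNOPAUSE -dQUIET -dBATCH
    if [ "$dpi" != original ]; then
        set -- "$@" -dDownsampleColorImages=true "-dColorImageResolution=$dpi"
    fi
    gs "$@" "-sOutputFile=$output" "$input"
}

# Run the menus only when a file name is given on the command line.
if [ "$#" -gt 0 ]; then
    main "$@"
fi
```

Saved as pdf-compress and made executable, it would be run as "pdf-compress file-to-compress.pdf". Choosing 3 and 3 writes file-to-compress-CompressionPrinter-Resolution150dpi.pdf, so the original is never overwritten.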
