Tuesday, May 7, 2013

Split an A3 two-up PDF into A4 pages using Ghostscript and pdftk

What we need

A common question is how to split a PDF with two pages per side into a PDF with a single page per side. To start, we need two free tools:

  • Ghostscript (the gs command) - To extract and split the pages
  • pdftk - To merge the generated documents into one

For the example I will use a document whose page size is 595.22 x 842 pts, as reported by pdfinfo in the last section.

Extracting odd pages

We will use the following command:

gs -o left-sections.pdf -sDEVICE=pdfwrite -sPAPERSIZE=a4 -g5950x8420 -c "<</PageOffset [0 0]>> setpagedevice" -f input.pdf

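where
-o is the output file
-sDEVICE is the type of file we create (pdfwrite produces a PDF)
-sPAPERSIZE is the named paper size of the output
-g is the page geometry (page size) in pixels; pdfwrite works at 720 dpi by default, so 595x842 pts becomes 5950x8420
-c runs a PostScript snippet; here PageOffset [x y] shifts the page content by x and y points, so that together with -g only the wanted half stays on the page
-f is the input file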
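
The relation between the page size in points and the value passed to -g can be sketched as follows. This is only an illustration, assuming the default pdfwrite resolution of 720 dpi (10 pixels per point); pts_to_geometry is a hypothetical helper name, not part of Ghostscript.

```shell
# pts_to_geometry WIDTH_PTS HEIGHT_PTS
# Print a -g option for gs, assuming 720 dpi (10 pixels per point).
# awk is used so that fractional sizes such as 595.22 are rounded.
pts_to_geometry() {
    awk -v w="$1" -v h="$2" 'BEGIN { printf "-g%dx%d\n", w * 10 + 0.5, h * 10 + 0.5 }'
}

pts_to_geometry 595 842   # prints -g5950x8420
```

If another resolution is set with -r, the factor 10 no longer applies.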

Extracting even pages

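We will use the following command:

gs -o right-sections.pdf -sDEVICE=pdfwrite -sPAPERSIZE=a4 -g5950x8420 -c "<</PageOffset [421 0]>> setpagedevice" -f input.pdf

PageOffset moves the page content, so if the output still shows the left half, try a negative offset ([-421 0]) to bring the right half into view.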
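
The value 421 appears to be half of the 842 pt side of the page. A minimal sketch of that arithmetic, assuming the split is always exactly in the middle; half_offset is a hypothetical helper name.

```shell
# half_offset WIDTH_PTS
# Print half of the two-up page width in points, rounded to an integer.
# This is the magnitude to use in PageOffset for the second half.
half_offset() {
    awk -v w="$1" 'BEGIN { printf "%d\n", w / 2 + 0.5 }'
}

half_offset 842   # prints 421
```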

How to extract a range of pages

We can use Ghostscript again as follows:
gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=10 -dLastPage=20 -o left-sections.pdf -f tmpleft-sections.pdf
This extracts pages 10 to 20.

How to merge the two documents into one

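The tool we can use is pdftk. The script below interleaves the pages of both documents; PAGES must hold the number of pages in each of them (the last section shows how to obtain it):

SORT=""
PAGE=1

# Build the page list "A1 B1 A2 B2 ..." so that both halves alternate
while [ "$PAGE" -le "$PAGES" ]
do
        SORT="${SORT}A${PAGE} B${PAGE} "
        PAGE=$((PAGE + 1))
done

echo "$SORT"
# A holds the odd (left) pages and B the even (right) pages.
# $SORT is left unquoted on purpose: each page reference must be a separate argument.
pdftk A=left-sections.pdf B=right-sections.pdf cat $SORT output mergedoc.pdf verbose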
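
The page list passed to pdftk can also be built by a small function, which makes it easy to check before running the merge. This is a sketch only; build_sort is a hypothetical helper name.

```shell
# build_sort PAGES
# Print "A1 B1 A2 B2 ... An Bn", the interleaved page list for pdftk cat.
build_sort() {
    pages=$1
    page=1
    sort=""
    while [ "$page" -le "$pages" ]; do
        sort="${sort}A${page} B${page} "
        page=$((page + 1))
    done
    # Drop the trailing space before printing.
    printf '%s\n' "${sort% }"
}

build_sort 3   # prints A1 B1 A2 B2 A3 B3
```

Recent pdftk versions also provide a shuffle operation that collates the inputs without an explicit list, for example: pdftk A=left-sections.pdf B=right-sections.pdf shuffle A B output mergedoc.pdf. Check that your version supports it before relying on it.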

How to know the geometry of the document (page size)

The easiest way is to use the pdfinfo command. For example:
# pdfinfo tabebak.pdf 
Title:          .indd
Author:         xxxx
Creator:        PScript5.dll Version 5.2.2
Producer:       Acrobat Distiller 8.0.0 (Windows)
CreationDate:   Sat Jan 12 10:21:09 2013
ModDate:        Sat Jan 12 10:21:09 2013
Tagged:         no
Form:           none
Pages:          27
Encrypted:      no
Page size:      595.22 x 842 pts (A4)
Page rot:       90
File size:      7782888 bytes
Optimized:      yes
PDF version:    1.4
We can extract these values into variables, where $1 is the PDF file passed to the script, for example:
WIDTH=$(pdfinfo "$1" | grep -i "page size" | tail -n 1 | awk '{ print $3 }')
HEIGHT=$(pdfinfo "$1" | grep -i "page size" | tail -n 1 | awk '{ print $5 }')
PAGES=$(pdfinfo "$1" | grep -i pages | tail -n 1 | awk '{ print $2 }')
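
The same three values can be read in a single pass over the pdfinfo output. The sketch below assumes the output format shown above; parse_pdfinfo is a hypothetical helper name, and the sample text stands in for a real pdfinfo call.

```shell
# parse_pdfinfo: read pdfinfo output on stdin and print "WIDTH HEIGHT PAGES".
parse_pdfinfo() {
    awk '
        /^Page size:/ { width = $3; height = $5 }
        /^Pages:/     { pages = $2 }
        END           { print width, height, pages }
    '
}

# Two lines copied from the pdfinfo output above.
# With a real file, use: pdfinfo input.pdf | parse_pdfinfo
sample='Pages:          27
Page size:      595.22 x 842 pts (A4)'

read WIDTH HEIGHT PAGES <<EOF
$(printf '%s\n' "$sample" | parse_pdfinfo)
EOF

echo "$WIDTH x $HEIGHT, $PAGES pages"   # prints 595.22 x 842, 27 pages
```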
