Today I discovered an easy way to merge PDF files using Perl.
In my current project, I have to automatically generate technical documentation from the POD of my Perl modules. It's easy to create one PDF file per module, but then I need to merge them all into one.
The very easy-to-use CAM::PDF Perl module provides appendpdf.pl. However, it can only merge two files.
I've copied and modified it for my needs, and renamed it mergepdf.pl.
Download : mergepdf.pl
Usage : mergepdf.pl [options] file1.pdf file2.pdf ... fileN.pdf outfile.pdf
See the help for more options.
Sounds like a great idea. But oughtn't it be called catpdf, as it is rather analogous to the cat command?
Posted by: Leifbk | June 29, 2010 at 04:45 PM
The `gs' utility from the app-text/ghostscript-gpl suite should also be suitable here, if I've understood the point correctly.
gs -dNOPAUSE -sDEVICE=pdfwrite -sOUTPUTFILE=report.pdf -dBATCH title.pdf chapter1.pdf chapter2.pdf ... chapterN.pdf
Thank you for your work, anyway.
Posted by: rtheb | June 29, 2010 at 04:56 PM
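For scripted use, the gs command above could be wrapped in a small shell function. This is only a minimal sketch: the function name merge_pdfs and its argument order (output file first) are hypothetical and not part of any tool mentioned here, the gs flags are copied as-is from the comment above, and gs is assumed to be on the PATH.

```shell
#!/bin/sh
# merge_pdfs OUTFILE IN1.pdf IN2.pdf [...]
# Hypothetical wrapper (not from the original post) around the gs command
# quoted in the comment above. The gs flags are copied from that comment.
merge_pdfs() {
    # One output file plus at least two inputs are required.
    if [ "$#" -lt 3 ]; then
        echo "usage: merge_pdfs outfile.pdf file1.pdf file2.pdf [...]" >&2
        return 2
    fi

    out=$1
    shift

    # Check every input before calling gs, so a typo fails early.
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "merge_pdfs: cannot read $f" >&2
            return 1
        fi
        # Refuse to overwrite one of the inputs with the merged result.
        if [ "$f" = "$out" ]; then
            echo "merge_pdfs: output would overwrite input $f" >&2
            return 1
        fi
    done

    # Inputs are appended in the order given on the command line.
    gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOUTPUTFILE="$out" "$@"
}
```

It would be called as: merge_pdfs report.pdf title.pdf chapter1.pdf chapter2.pdf. Putting the output file first, unlike mergepdf.pl, is just a design choice that makes it easy to pass a shell glob for the inputs.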
There is also PDFSAM ("split and merge"): www.pdfsam.org
Posted by: Jochen Hayek | June 29, 2010 at 05:17 PM
@Leifbk : just rename the file to catpdf.pl if you want :) But the cat command outputs to STDOUT by default, whereas the script outputs to a file (which can be STDOUT if you give it the special "-" name). Anyway, that's why I called it mergepdf and not catpdf.
@rtheb : ah true, I could use ghostscript here, but it's just not installed on my Mac right now.
@Jochen Hayek : I used to use PDFSAM, but as far as I know you have to use it from the graphical interface, and I need to automate (script) the process. Or can you use it from the command line?
Posted by: dams | June 29, 2010 at 05:52 PM
I'm the author of CAM::PDF. You can also do it as an almost-one liner:
perl -MCAM::PDF -e'$p=CAM::PDF->new(shift);while(@ARGV){$p->appendPDF(CAM::PDF->new(shift))}$p->cleanoutput("out.pdf")' *.pdf
Posted by: Chris Dolan | June 30, 2010 at 01:02 AM
As I recently learned, this can also be done with ImageMagick:
convert file1.pdf file2.pdf ... fileN.pdf outfile.pdf
Posted by: Patrick | June 30, 2010 at 08:53 AM
@Chris Dolan : Thanks for the very useful CAM::PDF! Indeed, the one-liner you propose is all it takes, but the help message and the options you originally put in appendpdf.pl are useful.
BTW I proposed the script as a new feature on CPAN. Not sure whether you want it in the distribution.
Posted by: dams | June 30, 2010 at 10:41 AM
If you ever need a more sophisticated arrangement of PDF pages, I would recommend a look at the pdfpages package for pdfTeX:
http://www.ctan.org/tex-archive/macros/latex/contrib/pdfpages/
Posted by: mark | July 02, 2010 at 03:26 PM
What I would like to know is how to remove the background image from PDF pages and change the main font colour. A lot of presentations in PDF format have a solid dark background and white text, which wastes a lot of ink when printed, even at 6 slides per page.
Of course, nowadays a lot of people would just read the PDF on their phone, but I would still like the option of printing them out.
Posted by: Gordon | July 27, 2010 at 03:08 AM