TUTORIALS
Manipulate and work with PDF files
TERMINAL
Shashank Sharma breaks down the different things you can do with PDF files beyond reading and creating them, all from the blackness of a terminal.
OUR EXPERT
Shashank Sharma is a trial lawyer in Delhi and an avid Arch user. He’s always on the hunt for pocket-friendly geeky memorabilia.
QUICK TIP
You can only use uppercase letters to define ‘handles’ in pdftk. A handle can be a single letter or a combination of letters. For instance, you can use A, B, D, Z, TT or ASD as your ‘handles’.
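To see what the tip means in practice, here’s a minimal sketch. The is_handle and join_pdfs function names are ours, invented purely for illustration, as are any file names you pass in. The pdftk call itself relies only on the tool’s usual cat and output operations:

```shell
# Succeeds if $1 is a valid handle as described in the tip:
# one or more uppercase letters (A, TT, ASD) and nothing else.
is_handle() {
  case "$1" in
    "" | *[!ABCDEFGHIJKLMNOPQRSTUVWXYZ]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Join two PDFs into a third, using the handles A and B as
# short labels for the input files. All three names are
# supplied by the caller.
join_pdfs() {
  pdftk A="$1" B="$2" cat A B output "$3"
}
```

Calling join_pdfs with two input files and an output name should give you a single PDF containing the pages of the first file followed by those of the second.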
According to the statistics released by Adobe as part of the 25th anniversary of PDF in 2018, more than 200 billion PDF files were accessed using Adobe products in 2017. In the years since, and taking into account non-Adobe products, the sheer number of PDF files accessed in a year, or even on a daily basis, is practically incalculable.
The trusted and popular document format started its journey as a proprietary format at Adobe in 1993, and was released as an open standard in 2008. Today, there are a large number of graphical and command-line utilities that you can use to not only read, but also manipulate PDF files.
Here, we’ll discuss everything from splitting large PDF files into smaller ones to merging multiple small files into a single PDF file. We’ll also discuss how to churn multiple image files into a PDF document and shrink the size of a PDF file. We’ll reveal how to run OCR on your PDF files, edit their metadata, extract text or images from them and more!
Creating a PDF from image files
It’s a common practice to use your cellphone camera to digitise documents. Unless you’re using dedicated scanner apps, you’ll end up with image files. You can convert these image files into PDFs using the popular ImageMagick suite.
ImageMagick is a suite of command-line applications and utilities you can use to perform various operations on image files. Whether it’s creating GIF images, building collages or otherwise editing your images, all operations can be performed using one of the included utilities. The most popular of these is convert, which can be used to edit, crop, resize, flip, blur and perform myriad other operations on image files.
However, convert can also be used to stitch multiple image files into a PDF:
$ convert image1.png image2.png image3.png output.pdf
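If you have more than a handful of images, typing every file name soon gets tedious. As a rough sketch, and not something taken from the ImageMagick documentation, a small shell function can hand convert every PNG in a directory. The function name images_to_pdf is our own placeholder, and we’re assuming the files are named so that they sort into page order:

```shell
# Stitch every PNG in a directory into one PDF.
# $1 is the directory holding the images, $2 the output file.
images_to_pdf() {
  dir=$1
  out=$2
  # Replace the function's arguments with the matching files;
  # the shell expands the glob in sorted name order.
  set -- "$dir"/*.png
  # If nothing matched, $1 is not an existing file, so bail out.
  [ -e "$1" ] || { echo "no PNG files in $dir" >&2; return 1; }
  convert "$@" "$out"
}
```

You’d call it with a directory and an output name, for instance images_to_pdf scans output.pdf, where scans is a hypothetical folder. The shell sorts names alphabetically, so image10.png lands before image2.png unless you zero-pad the numbers. Underneath, it’s still the same convert call shown above.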
The convert command takes a series of image files followed by the name of the output file, and works out the output format from the file extension. This works well if all your image files are the same size, with the same resolution and DPI. If your images come from various sources, such as screenshots of your desktop, a camera phone or a DSLR, then you’ll end up with a PDF file where each page is a different size. You can overcome this problem by specifying the page size, and other parameters with the