Wednesday, 7 August 2013

The easiest way to split and merge pdf files in linux

I scanned some documents to .pdf format but, I wanted to email them all in one .pdf file.

The easiest way to split, merge or edit pdf files in linux is to use pdftk.

sudo apt-get install pdftk
sudo aptitude install pdftk
sudo yum install pdftk
depending on what flavour of linux you have.

pdftk is quite powerful but, here are a few examples.

Split a multi-page pdf into separate files:

pdftk largepdfile.pdf burst
results in a number files like 0001.pdf, 0002.pdf, etc.

Merge files into one PDF file:

pdftk *.pdf cat output largepdfile.pdf
pdftk file1.pdf file2.pdf file3.pdf cat output largepdfile.pdf

Saturday, 12 January 2013

Converting date formats in a flat-file using Regular expressions

I use two free text editors for preparing flat-files (.csv, etc.) prior to importing into MySQL.

I use Notepad++ in M$ Windows and Kate in Linux.  Both can do Regular expression Search & Replace upon text.

I find regular expressions very useful for preparing date fields prior to import into a database table.  So, converting the UK date format
dd/mm/yyyy (12/01/2013) into a MySQL DATE, DATETIME or, TIMESTAMP datatype (2013-01-12) is easy.


Replace (converts UK date format):

Replace (converts US date format):

So the above UK example searches for dates in the UK date format dd/mm/yyyy and converts them to yyyy-mm-dd.

Click image to enlarge
Click image to enlarge
Unlike M$ Excel, these text-editors can cope with large files and do this relatively quickly.

Nice tutorials on Regular expressions at