Quick and dirty version. I made a version that checks whether a man page exists (but it's not a one-liner). You must have ps2pdf and, of course, Ghostscript installed on your box. Enhancements appreciated :-)
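The command for this entry isn't reproduced here, so what follows is only a sketch of how the man page check might look. `man2pdf` is a made-up helper name, and the sketch assumes a `man` that supports `-w` (print the page's location) and `-t` (emit PostScript).

```shell
# Hypothetical helper: render a man page to <page>.pdf, checking first that it exists.
man2pdf() {
    page=$1
    # `man -w` prints where the page lives and fails if there is no such page.
    if ! man -w "$page" >/dev/null 2>&1; then
        echo "no manual entry for $page" >&2
        return 1
    fi
    # `man -t` writes PostScript to stdout; ps2pdf reads it from stdin ("-").
    man -t "$page" | ps2pdf - "$page.pdf"
}
```

Usage would be something like `man2pdf bash`, which should leave `bash.pdf` in the current directory.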
Remove security from a PDF document with this very simple command on Linux and OS X. You need Ghostscript for this baby to work.
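A minimal sketch of what such a Ghostscript call might look like (the original command isn't shown here). `pdf_unlock` and its argument order are assumptions; the idea is simply to rewrite the file through the `pdfwrite` device.

```shell
# Hypothetical helper: rewrite a PDF through Ghostscript's pdfwrite device.
pdf_unlock() {
    src=$1
    dst=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # -dNOPAUSE/-dBATCH: run non-interactively and exit when done.
    # A file that needs a password just to open would also need -sPDFPassword=...
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -sOutputFile="$dst" "$src"
}
```

Example: `pdf_unlock locked.pdf unlocked.pdf`.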
PDF files are simultaneously wonderful and heinous. They are wonderful in being ubiquitous and mostly cross-platform. They are heinous in being very difficult to work with from the command line: hard to search or grep, hard to get at just the text inside, and hard to use outside of proprietary products. xpdf is a wonderful set of PDF tools. It ships with many Linux distros and can be installed on OS X. While primarily an open PDF viewer for X, xpdf includes the tool "pdftotext", which can extract formatted or unformatted text from a PDF that contains text. This text stream can then be processed further by grep or another tool. The '-' after the file name directs output to stdout rather than to a text file with the same name as the PDF. Make sure you use version 3.02 of pdftotext or later; earlier versions clipped lines. Without the "-layout" option, the lines extracted from a PDF are very long, more like paragraphs, so use that mode just to test whether a pattern exists in the file. With "-layout" the output resembles the original lines, but it is not perfect. xpdf is available as open source at http://www.foolabs.com/xpdf/
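The pipeline described above can be sketched roughly like this. `pdf_grep` is a hypothetical wrapper name; `-layout` and the trailing `-` are the pdftotext options mentioned in the text.

```shell
# Hypothetical helper: case-insensitive grep over the text of a PDF.
pdf_grep() {
    pattern=$1
    file=$2
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
    # "-" sends the extracted text to stdout instead of a .txt file.
    # The exit status is grep's: 0 if the pattern was found, 1 if not.
    pdftotext -layout "$file" - | grep -i -- "$pattern"
}
```

Example: `pdf_grep 'invoice' report.pdf`. Drop `-layout` if you only care whether the pattern occurs at all.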
Given some images (JPEG or other supported formats) as input, you get a single PDF file with one image per page.
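The entry doesn't say which tool it uses, so this sketch assumes ImageMagick's `convert`, which can write one page per input image. `images2pdf` and its output-first argument order are made up for illustration.

```shell
# Hypothetical helper (assumes ImageMagick): images2pdf out.pdf img1.jpg img2.jpg ...
images2pdf() {
    if [ "$#" -lt 2 ]; then
        echo "usage: images2pdf OUTPUT.pdf IMAGE..." >&2
        return 2
    fi
    dst=$1
    shift
    # Each remaining argument becomes one page, in the order given.
    convert "$@" "$dst"
}
```

Example: `images2pdf album.pdf page-*.jpg`.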
Quickly add a remark, comment, stamp text, etc. on top of (each of) the pages of the input PDF file.
Uses htmldoc to perform the conversion
See man wget if you want linked files and not only those hosted on the website.
In this example we extract pages 14-17
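One way to do a page-range extraction like this, as a sketch only: the original command isn't shown, so this assumes Ghostscript's `-dFirstPage` and `-dLastPage` switches. `pdf_pages` is a hypothetical name.

```shell
# Hypothetical helper: pdf_pages FIRST LAST INPUT.pdf OUTPUT.pdf
pdf_pages() {
    first=$1
    last=$2
    src=$3
    dst=$4
    if [ "$first" -gt "$last" ]; then
        echo "first page ($first) is after last page ($last)" >&2
        return 2
    fi
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # Only pages FIRST..LAST (inclusive) are written to the output file.
    gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
       -dFirstPage="$first" -dLastPage="$last" \
       -sOutputFile="$dst" "$src"
}
```

For the range in this example: `pdf_pages 14 17 book.pdf chapter.pdf`.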
Saves to a PDF with title and alt text of comic. As asked for on http://bbs.archlinux.org/viewtopic.php?id=91100 Change xkcd.com to dynamic.xkcd.com/comics/random for a random comic.
More pdftk examples: http://www.pdflabs.com/docs/pdftk-cli-examples/
Joins two pdf documents coming from a simplex document feed scanner. Needs pdftk >1.44 w/ shuffle.
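A sketch of how the shuffle might be invoked. The file roles are assumptions: one PDF holds the odd pages in order, the other holds the even pages in reverse order (as happens when you flip the stack and feed it again). `pdf_interleave` is a hypothetical name.

```shell
# Hypothetical helper: pdf_interleave ODD.pdf EVEN_REVERSED.pdf OUTPUT.pdf
pdf_interleave() {
    odd=$1
    even=$2
    dst=$3
    [ -r "$odd" ]  || { echo "cannot read: $odd" >&2; return 1; }
    [ -r "$even" ] || { echo "cannot read: $even" >&2; return 1; }
    # "shuffle" takes one page from each range in turn;
    # "Bend-1" walks the second file from its last page back to its first.
    pdftk A="$odd" B="$even" shuffle A Bend-1 output "$dst"
}
```

If the even pages were scanned in forward order, plain `shuffle A B` should do instead.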
Turns a PDF into HTML (without images) and prints it to standard output, where it is picked up and rendered by w3m.
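Roughly, that pipeline could look like the following sketch. It assumes poppler's `pdftohtml`; `pdf_w3m` is a made-up wrapper name.

```shell
# Hypothetical helper: read a PDF in the terminal via w3m.
pdf_w3m() {
    file=$1
    [ -r "$file" ] || { echo "cannot read: $file" >&2; return 1; }
    # -i: ignore images; -stdout: write the HTML to standard output.
    # -T text/html tells w3m to treat its stdin as HTML.
    pdftohtml -i -stdout "$file" | w3m -T text/html
}
```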
Without the bashisms and unnecessary sed dependency. Substitutions quoted so that filenames with whitespace will be handled correctly.
Probably will not work very well with scanned documents.
Xsane produces PDFs that are too large - particularly multipage PDFs. This command compresses them. If you do not use A4, remove the -sPAPERSIZE flag.
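A sketch of a compression call along these lines; the exact switches of the original command aren't shown, so the `-dPDFSETTINGS=/screen` preset is an assumption. `pdf_shrink` is a hypothetical name.

```shell
# Hypothetical helper: pdf_shrink INPUT.pdf OUTPUT.pdf
pdf_shrink() {
    src=$1
    dst=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # /screen is the most aggressive preset (low-resolution images);
    # /ebook keeps more detail at the cost of a larger file.
    # Remove -sPAPERSIZE=a4 if your pages are not A4.
    gs -q -dNOPAUSE -dBATCH -dSAFER -sDEVICE=pdfwrite \
       -dPDFSETTINGS=/screen -sPAPERSIZE=a4 \
       -sOutputFile="$dst" "$src"
}
```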
This is an example of the usage of pdfnup (you can find it in the 'pdfjam' package). With this command you can save ink/toner and paper (and thus trees!) when you print a PDF. These tools are very configurable, and you can also make 2x2, 3x2, 2x3 layouts, and more (the limit is your imagination and the resolution of the printer :-). You must have pdfjam, pdflatex, and the LaTeX pdfpages package installed on your box.
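As a sketch, a two-pages-per-sheet layout could be produced like this. It calls `pdfjam` directly rather than the `pdfnup` wrapper, since newer pdfjam releases may not ship the wrapper scripts; `pdf_2up` is a hypothetical name.

```shell
# Hypothetical helper: pdf_2up INPUT.pdf OUTPUT.pdf
pdf_2up() {
    src=$1
    dst=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # --nup COLSxROWS: 2x1 puts two pages side by side on a landscape sheet.
    pdfjam --nup 2x1 --landscape --outfile "$dst" "$src"
}
```

Change `2x1` to `2x2`, `3x2`, and so on for denser layouts.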
#4345 also works under Windows.
This will extract all DCT format images from foo.pdf and save them in JPEG format (option -j) to bar-000.jpg, bar-001.jpg, bar-002.jpg, etc. Inspired by http://stefaanlippens.net/extract-images-from-pdf-documents
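For reference, the invocation described here would look something like the sketch below. `pdf_jpegs` is a hypothetical wrapper; `foo.pdf` and `bar` are the names used in the description.

```shell
# Hypothetical helper: pdf_jpegs INPUT.pdf PREFIX
pdf_jpegs() {
    src=$1
    prefix=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # -j: write DCT-encoded images as .jpg files named PREFIX-000.jpg, PREFIX-001.jpg, ...
    pdfimages -j "$src" "$prefix"
}
```

Example: `pdf_jpegs foo.pdf bar`.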
Adjust the --resolution and --mode as required (if these options are available for your scanner).
The size options (-x, -y, -imageheight, -imagewidth) are for US letter paper. For A4, I think the command would be:
scanimage -p --resolution 250 --mode Gray -x 210 -y 297 | pnmtops -imageheight 11.7 -imagewidth 8.3 | ps2pdf - output.pdf
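To double-check the A4 numbers: scanimage's -x and -y are normally given in millimetres, while pnmtops takes inches, so the two sets of values should be the same size in different units. A small sketch of the conversion (`mm_to_in` is a made-up helper):

```shell
# Hypothetical helper: convert millimetres to inches, two decimal places.
mm_to_in() {
    LC_ALL=C awk -v mm="$1" 'BEGIN { printf "%.2f\n", mm / 25.4 }'
}

mm_to_in 210   # A4 width  -> 8.27
mm_to_in 297   # A4 height -> 11.69
```

Rounded to one decimal place these give 8.3 and 11.7, matching the pnmtops values above.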
pdfunite is part of poppler-utils. The poppler-utils package is only 150KB; the alternative, the pdftk package, is 14MB! Install poppler-utils if you need simple PDF commands such as unite, separate, info, and text/HTML conversion.
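A minimal sketch of a pdfunite call. pdfunite takes the input files first and the output file last; the `pdf_join` wrapper (a hypothetical name) flips that so the output comes first and globs are easier to use.

```shell
# Hypothetical helper: pdf_join OUTPUT.pdf INPUT1.pdf INPUT2.pdf ...
pdf_join() {
    if [ "$#" -lt 3 ]; then
        echo "usage: pdf_join OUTPUT.pdf INPUT1.pdf INPUT2.pdf ..." >&2
        return 2
    fi
    dst=$1
    shift
    # Inputs are concatenated in the order given; the last argument is the output.
    pdfunite "$@" "$dst"
}
```

Example: `pdf_join all.pdf part1.pdf part2.pdf part3.pdf`.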
The PDF is first converted to a bitmap, so change "-density" to match your printer resolution. Also be careful about the RAM required. In this example rgb(0,0,0) is replaced by rgb(255,255,255); change to suit your needs.
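The original command isn't shown, so this is a sketch that assumes ImageMagick's `convert` with `-fill` and `-opaque`. `pdf_recolor` and the 300 dpi default are assumptions.

```shell
# Hypothetical helper: pdf_recolor INPUT.pdf OUTPUT.pdf
pdf_recolor() {
    src=$1
    dst=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # -density must come before the input so the PDF is rasterised at that resolution.
    # -opaque replaces every pixel of the given colour with the -fill colour.
    convert -density 300 "$src" \
        -fill 'rgb(255,255,255)' -opaque 'rgb(0,0,0)' "$dst"
}
```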
If you skip this part:
-density 300x300
you'll get a very lo-res image.
Use ImageMagick's convert.
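Put together, the conversion might look like this sketch. `pdf2img` is a hypothetical name, and the output format is whatever extension you give the second argument.

```shell
# Hypothetical helper: pdf2img INPUT.pdf OUTPUT.png
pdf2img() {
    src=$1
    dst=$2
    [ -r "$src" ] || { echo "cannot read: $src" >&2; return 1; }
    # Rasterise at 300x300 dpi; without -density the result is very low resolution.
    convert -density 300x300 "$src" "$dst"
}
```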
