GREP a PDF file.

grep -i '[^script$]' 1.txt

By: vinod

2010-10-20 12:17:04

grep pipe grep pdf

3 Alternatives

PDF files are simultaneously wonderful and heinous. They are wonderful in being ubiquitous and mostly cross-platform. They are heinous in being very difficult to work with from the command line: hard to search or grep, hard to reduce to just the text inside the PDF, and hard to use outside of proprietary products.

xpdf is a wonderful set of PDF tools. It is packaged by many Linux distros and can be installed on OS X. While primarily an open PDF viewer for X, xpdf includes the tool "pdftotext", which can extract formatted or unformatted text from inside a PDF that has text. This text stream can then be processed further by grep or another tool. The '-' after the file name directs output to stdout rather than to a text file with the same name as the PDF. Make sure you use version 3.02 of pdftotext or later; earlier versions clipped lines.

Without the "-layout" option, the lines extracted from a PDF are very long, more like paragraphs, so use that form just to test that a pattern exists in the file. With "-layout" the output resembles the lines of the original page, but it is not perfect. xpdf is available as open source at http://www.foolabs.com/xpdf/

pdftotext [file] - | grep 'YourPattern'

drewk · 2010-02-14 21:42:35 16
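
The pdftotext pipeline handles one file at a time. As a minimal sketch, assuming pdftotext is on the PATH, it could be wrapped in a small function that searches several PDFs and prefixes each hit with its file name. The name `pdfsearch` is made up for this example and is not part of xpdf:

```shell
# pdfsearch PATTERN FILE...
# Hypothetical helper (not part of xpdf): prints "file: line" for each
# case-insensitive match; exit status is 0 if any file matched, else 1.
# The body runs in a subshell so its variables do not leak to the caller.
pdfsearch() (
    pattern=$1
    shift
    found=1
    for pdf in "$@"; do
        # -layout keeps extracted lines close to the printed lines;
        # the trailing '-' sends the text to stdout instead of a .txt file.
        hits=$(pdftotext -layout "$pdf" - | grep -i -e "$pattern") || continue
        printf '%s\n' "$hits" |
            while IFS= read -r line; do
                printf '%s: %s\n' "$pdf" "$line"
            done
        found=0
    done
    exit "$found"
)
```

For example, `pdfsearch 'YourPattern' *.pdf` would search every PDF in the current directory.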

This is a good alternative to pdftotext on Ubuntu. To install it: sudo apt-get install python-pdfminer

pdf2txt myfile.pdf | grep mypattern

grinob · 2015-11-23 17:46:22 9

grep pdf files easily

pdfgrep pattern /the/path

ees · 2017-02-02 09:27:43 19
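
pdfgrep accepts several of grep's familiar options. A short sketch, assuming pdfgrep 1.3 or later is installed (needed for -r); the wrapper name `pdfgrep_tree` is a placeholder chosen for this example:

```shell
# pdfgrep_tree PATTERN DIR
# Hypothetical wrapper for illustration only.
#   -r  search recursively under DIR
#   -i  ignore case
#   -n  prefix each match with its page number
#   -H  prefix each match with the file name
pdfgrep_tree() {
    pdfgrep -r -i -n -H -- "$1" "$2"
}
```

Calling `pdfgrep_tree pattern /the/path` would then report the file and page number of every match under /the/path.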
