This is a modified version of the OP, wrapped into a bash function. This version handles newlines and other whitespace correctly; the original has problems with the (thankfully rare) case of newlines in file names. It also allows checking an arbitrary number of directories against each other, which is handy when the directories you suspect of holding duplicates don't share a convenient common ancestor directory.
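For reference, a minimal sketch of the shape of such a function (not the author's exact code; md5sum and GNU uniq are assumed):
dupes() {
  # hash every file under the given directories, NUL-delimited so odd
  # file names survive, then show each checksum that appears more than once
  find "$@" -type f -print0 | xargs -0 md5sum | sort | uniq -w32 --all-repeated=separate
}
dupes /backups /photos/old /photos/new   # example invocation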
Executing pfiles will return a list of all descriptors utilized by the process. We are interested in the S_IFREG entries, since they usually point to files. Each such line contains the inode number of the file, which we can use to find the file name. The only drawback is that, to avoid searching from /, you have to guess where the file might be. Improvements are more than welcome; lsof was not available in my case.
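The rough shape of it, on Solaris (the PID, inode number, and search path here are only examples):
pfiles 1234 | grep S_IFREG          # list regular-file descriptors, with inode numbers
find /suspected/path -inum 65514    # map an inode number back to a file name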
Let the shell handle the repetition instead of find :)
Linux users wanting to extract text from PDF files in the current directory and its sub-directories can use this command. It requires "bash", "ps2ascii" and "par", and the PARINIT environment variable sanely set (see man par). WARNING: the file "junk.sh" will be created, run, and destroyed in the current directory, so you _must_ have sufficient rights. Edit the command if you need to avoid using the file name "junk.sh".
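A loose sketch of the same idea that avoids the temporary script, assuming ps2ascii and par are installed and PARINIT is set:
find . -name '*.pdf' -print0 | while IFS= read -r -d '' f; do ps2ascii "$f" | par > "${f%.pdf}.txt"; done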
This checks the JPEG image data and the metadata. The output should be grepped as needed: maybe a -B1 Warning for the first part, and a -E "WARNING|ERROR" for the second.
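The tools aren't named here, so this is only a guess at the shape of the two parts (jpeginfo for the image data, exiv2 for the metadata):
jpeginfo -c ./*.jpg | grep -B1 Warning
exiv2 pr ./*.jpg 2>&1 | grep -E "WARNING|ERROR"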
This command takes the files in a directory and renames them, numbering them from 1 to N. Black belt stuff. Hell of a time saver.
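A minimal sketch of the idea (the extension and the zero-padding are assumptions):
n=1; for f in ./*.jpg; do mv -- "$f" "$(printf '%03d.jpg' "$n")"; n=$((n+1)); done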
Avoids the nested 'find' commands but doesn't seem to run any faster than syssyphus's solution.
The advantage to doing it this way is that you can adjust the max depth to get more recursive results, and it runs on non-GNU systems. It also won't print trailing slashes, which are easy enough to remove but slightly annoying. You could run: # for file in `find * -maxdepth 0 -type d`;do ls -d $file;done and in the ls -d part of the command you can put in whatever parameters you want, to get things like permissions, time stamps, and ownership.
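A spelling of the same idea that also copes with spaces in directory names:
find . -maxdepth 1 -type d ! -name . -exec ls -ld {} +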
While `echo rm * | batch` might seem to work, it might still raise the load of the system, since `rm` will be _started_ when the load is low but may then run for a long time. My proposed command executes a new `rm` once every minute when the load is small. Obviously, the load could be lowered further using `ionice`, but I still think this is a useful example of a sequential batch job.
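Not the author's exact command, but a sketch of the load-aware idea using batch(1), which only starts jobs when the load average is low; one rm is queued per file, so no single job runs long:
for f in ./*; do printf 'rm -- %q\n' "$f" | batch; done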
Make a bunch of files with the same permissions, owner, group, and content as a template file (handy if you have a lot of .php, .html or similar files to create).
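A minimal sketch with GNU coreutils (template.php and the target names are placeholders):
for f in a.php b.php c.php; do
  cp template.php "$f"                   # same content
  chmod --reference=template.php "$f"    # same permissions
  chown --reference=template.php "$f"    # same owner and group
done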
If you have GNU findutils, you can get only the file name with
find /some/path -type f -printf '%f\n'
instead of
find /some/path -type f | gawk -F/ '{print $NF}'
Searched strings: passthru, shell_exec, system, phpinfo, base64_decode, chmod, mkdir, fopen, fclose, readfile. Since some of these strings may occur in normal text or in legitimate code, you will need to adjust the command or the entire regex to suit your needs.
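One possible form of the search (the web root and the --include filter are assumptions):
grep -rnwE 'passthru|shell_exec|system|phpinfo|base64_decode|chmod|mkdir|fopen|fclose|readfile' --include='*.php' /var/www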
I have found that base64 encoded webshells and the like contain lots of data but hardly any newlines due to the formatting of their payloads. Checking the "width" will not catch everything, but then again, this is a fuzzy problem that relies on broad generalizations and heuristics that are never going to be perfect. What I have done is set an arbitrary threshold (200 for example) and compare the values that are produced by this script, only displaying those above the threshold. One webshell I tested this on scored 5000+ so I know it works for at least one piece of malware.
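A sketch of that "width" heuristic: average bytes per line, per file, filtered against the arbitrary threshold (the glob and the threshold are examples):
for f in ./*.php; do awk '{ bytes += length($0) + 1 } END { if (NR) printf "%d %s\n", bytes / NR, FILENAME }' "$f"; done | awk '$1 > 200'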
The number on the far right is the ratio of comments to code, expressed as a percentage. For the rest of the Yardstick documentation, see https://github.com/calmh/yardstick/blob/master/README.md#reported-metrics
List all files greater than 10 MB. Borrowed from: http://www.tippscout.de/linux-grosze-dateien-finden_tipp_1653.html
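With GNU find, for example:
find . -type f -size +10M -exec ls -lh {} +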
Find IP addresses in all files in the /etc directory. Can be used to find any string in any directory, really.
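A sketch of the idea (the IPv4 pattern is deliberately loose):
grep -rEn '([0-9]{1,3}\.){3}[0-9]{1,3}' /etc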
git gc should be run on all git repositories every 100 commits. This will help you do so if you have many git repositories ;-)
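A hedged sketch, assuming all the repositories live under one directory:
find ~/repos -type d -name .git -prune -execdir git gc \;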
The "find $stuff -print0 | xargs -0 $command" pattern causes both find and xargs to use null-delimited paths, greatly reducing the probability of either hiccuping on even the weirdest of file/path names.
It's also not strictly necessary to add the {} at the end of the xargs command line, as it'll put the files there automatically.
Mind, in most environments, you could use find's "-exec" option to bypass xargs entirely:
find . -name '*.jpg' -o -name '*.JPG' -exec mogrify -resize 1024">" -quality 40 {} +
will use xargs-like "make sure the command line isn't too long" logic to run the mogrify command as few times as necessary (to run once per file, use a ';' instead of a '+' - just be sure to escape it properly).
The find command can do this on its own. This is a shorter, faster version; it also uses a more advanced match (it will find .Jpg etc.). find doesn't need a pipe: it can run the command directly.
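For instance, the mogrify example above collapses to something like:
find . -iname '*.jpg' -exec mogrify -resize '1024>' -quality 40 {} +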
This command rm's and then cvs-removes all files under the current directory, recursively.
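A guess at the shape of it (cvs remove -f deletes the file and schedules its removal in one step; the CVS bookkeeping directories are skipped):
find . -name CVS -prune -o -type f -exec cvs remove -f {} \;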
It starts in the current working directory.
It removes each empty directory and its ancestors (unless an ancestor contains elements other than the empty directory itself).
It will print a failure message for every directory that isn't empty.
This command handles correctly directory names containing single or double quotes, spaces or newlines.
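Presumably something along these lines, with rmdir -p pruning the empty ancestors:
find . -empty -type d -print0 | xargs -0 rmdir -p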
If you do not want to remove the ancestors as well, just use:
find . -empty -type d -print0 | xargs -0 rmdir