less file.lst | head -n 50000 > output.txt

Cut a large wordlist into smaller chunks

Useful when your word lists or dictionaries range from hundreds of megabytes to several gigabytes in size. Replace file.lst with your wordlist and 50000 with the number of lines you want in the resulting list. The result is redirected to output.txt in the current working directory. Note that head keeps only the first 50000 lines; the rest of the list is not written anywhere. It may help to run wc -l file.lst first to find out how many lines the word list has, then halve that number to get the value for the head -n part of the command.
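As a rough sketch of the count-then-halve approach described above, the following builds a tiny stand-in wordlist and cuts it into two halves. The temporary directory and the names first_half.txt and second_half.txt are made up for illustration; tail -n +N is used here to recover the lines that head leaves behind, which the original command does not do.

```shell
# Build a small stand-in wordlist so the sketch is self-contained
# (directory and output file names are hypothetical).
workdir=$(mktemp -d)
seq 1 10 > "$workdir/file.lst"

# Count the lines, then halve the total, rounding up so no line is lost.
total=$(wc -l < "$workdir/file.lst")
half=$(( (total + 1) / 2 ))

# First half: the first $half lines.
head -n "$half" "$workdir/file.lst" > "$workdir/first_half.txt"

# Second half: everything from line $half + 1 onward.
tail -n "+$(( half + 1 ))" "$workdir/file.lst" > "$workdir/second_half.txt"

# Show the line count of each half.
wc -l "$workdir/first_half.txt" "$workdir/second_half.txt"
```

For a real wordlist, point the commands at your own file instead of the generated one.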

These Might Interest You

  • This is just a little snippet to split a large file into smaller chunks (4 MB in this example) and then send the chunks off to (e)mail for archival using mutt. I usually encrypt the file before splitting it using openssl:
        openssl des3 -salt -k <password> -in file.tgz -out file.tgz.des3
    To restore, simply save the attachments and rejoin them using:
        cat file.tgz.* > output_name.tgz
    and, if encrypted, decrypt using:
        openssl des3 -d -salt -k <password> -in file.tgz.des3 -out file.tgz
    edit: (changed "g" to "e" for political correctness)


    -1
    split -b4m file.tgz file.tgz. ; for i in file.tgz.*; do SUBJ="Backup Archive"; MSG="Archive File Attached"; echo "$MSG" | mutt -s "$SUBJ" -a "$i" -- "YourEmail@(E)mail.com"; done
    tboulay · 2010-03-20 16:49:19 4
  • Using large wordlists is cumbersome. Using password cracking programs with rules, such as Hashcat or John the Ripper, is much more effective. To do this, we often need to "clean" a wordlist by removing all numbers, special characters, whitespace and other garbage. This command will convert an entire wordlist to all lowercase with no garbage.


    0
    cat dirtyfile.txt | awk '{gsub(/[[:punct:]]/,"")}1' | tr A-Z a-z | sed 's/[0-9]*//g' | sed -e 's/ //g' | strings | tr -cs '[:alpha:]' '\ ' | sed -e 's/ /\n/g' | tr A-Z a-z | sort -u > cleanfile.txt
    purehate · 2011-08-28 01:26:04 0

  • 0
    perl -F'\s+' -anE 'push @w,$F[1];END{$r.=splice @w,rand @w,1 for(1..4);say $r}' diceware.wordlist.asc
    zedlopez · 2011-08-24 18:50:16 0
  • It's common to want to split up large files, and the usual method is to use split(1). If you have a 10GiB file, you'll need 10GiB of free space. Then the OS has to read 10GiB and write 10GiB (usually on the same filesystem). This takes AGES.
    The command uses a set of loop block devices to create fake chunks, but without making any changes to the file. This means the file splitting is nearly instantaneous. The example creates a 1GiB file, then splits it into 16 x 64MiB chunks (/dev/loop0 .. loop15).
    Note: This isn't a drop-in replacement for using split. The results are block devices. tar and zip won't do what you expect when given block devices.
    These commands will work:
        hexdump /dev/loop4
        gzip -9 < /dev/loop6 > part6.gz
        cat /dev/loop10 > /media/usb/part10.bin


    5
    FILE=file_name; CHUNK=$((64*1024*1024)); SIZE=$(stat -c "%s" $FILE); for ((i=0; i < $SIZE; i+=$CHUNK)); do losetup --find --show --offset=$i --sizelimit=$CHUNK $FILE; done
    flatcap · 2014-10-03 13:18:19 2
  • The shortest alternative, without the speed-o-meter, is "xclip large.xml". Use "xclip -o" to get the clipboard content; alternatively, paste with [shift key] + insert or the middle button of your mouse.


    5
    pv large.xml | xclip
    marssi · 2009-07-08 19:26:12 0
  • The command gives the total size of all files smaller than 1024k. This information, together with disk usage, can help determine file system parameters (e.g. block size) or the choice of storage device (e.g. SSD vs. HDD). Note that if you use awk instead of "cut | dc", you can easily exceed the maximum number of records allowed in awk.


    1
    find dir -size -1024k -type f | xargs -d $'\n' -n1 ls -l | cut -d ' ' -f 5 | sed -e '2,$s/$/+/' -e '$ap' | dc
    zhangweiwu · 2009-12-28 04:23:01 1
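The split-and-rejoin round trip from the first entry in the list above can be checked locally before mailing anything. This is a minimal sketch under stated assumptions: the archive is 10,000 bytes of random data standing in for file.tgz, the chunk size is 4,000 bytes rather than 4 MB so the run stays tiny, and the encryption and mutt steps are left out.

```shell
# Work in a scratch directory (hypothetical; any empty directory will do).
archive_dir=$(mktemp -d)
cd "$archive_dir" || exit 1

# 10,000 bytes of random data as a stand-in for a real archive.
head -c 10000 /dev/urandom > file.tgz

# Cut into 4,000-byte chunks: file.tgz.aa, file.tgz.ab, file.tgz.ac.
split -b 4000 file.tgz file.tgz.

# Rejoin. The glob expands in sorted order, which matches the order
# of the suffixes that split generates.
cat file.tgz.* > output_name.tgz

# Byte-for-byte comparison of the original and the rejoined file.
cmp file.tgz output_name.tgz && echo "chunks rejoin cleanly"
```

Comparing with cmp before deleting the original is a cheap safeguard, since a missing or misordered chunk would otherwise only show up at restore time.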

What Others Think

Could skip the 'less'. Not harmful but wastes a process: head -n 50000 file.lst > output.txt Might also consider trying 'split' instead: split -l 50000 file.lst output.txt.
nnutter · 350 weeks and 3 days ago
very true. Never used the split command before, thanks for the heads up tho.
Richie086 · 350 weeks and 3 days ago
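
The split suggestion from the comments can be sketched on a small scale as follows. This is an illustration only: the 25-line stand-in list, the chunk size of 10 and the "chunk." prefix are made-up values, and unlike the head approach it writes every line of the list to some chunk.

```shell
# Build a 25-line stand-in wordlist in a scratch directory
# (directory and prefix names are hypothetical).
listdir=$(mktemp -d)
seq 1 25 > "$listdir/file.lst"

# 10 lines per chunk: chunk.aa (10 lines), chunk.ab (10), chunk.ac (5).
split -l 10 "$listdir/file.lst" "$listdir/chunk."

# Show the line count of each chunk.
wc -l "$listdir"/chunk.*

# Concatenating the chunks in glob order should reproduce the original.
cat "$listdir"/chunk.* | cmp - "$listdir/file.lst" && echo "no lines lost"
```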


What's this?

commandlinefu.com is the place to record those command-line gems that you return to again and again. That way others can gain from your CLI wisdom and you from theirs too. All commands can be commented on, discussed and voted up or down.
