SvennD
The power of find commands
June 14, 2018

GNU find is a useful tool in any Linux user’s toolbox. It handles anything from a simple search to small file-processing jobs. Here are some of the examples I have used in the past. As with anything on the web, test before running (most definitely the ones that use rm, --rm or -delete)!

Simple find

Find all PHP files:

find wordpress/ -iname "*.php"

This searches wordpress/ recursively and matches every name ending in .php, ignoring case (so .PHP matches too).
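To see what the case-insensitive match buys you, here is a minimal sketch in a throwaway directory. The file names are made up for the illustration; only the -name and -iname tests come from find itself.

```shell
# Throwaway sandbox; the file names are hypothetical.
sandbox=$(mktemp -d)
touch "$sandbox/index.php" "$sandbox/LEGACY.PHP" "$sandbox/notes.txt"

# -name is case sensitive: only index.php matches.
strict=$(find "$sandbox" -name "*.php" | wc -l)

# -iname ignores case: index.php and LEGACY.PHP both match.
loose=$(find "$sandbox" -iname "*.php" | wc -l)

echo "-name matched $strict file(s), -iname matched $loose file(s)"
rm -rf "$sandbox"
```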

AND && OR ||
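find . -type f \( -name "*.php" -o -name "*.txt" -o -name "*.csv" \)

This selects several extensions at once; -o is find's logical OR (the || of most programming languages), while tests written side by side are joined by an implicit AND. The escaped parentheses matter because AND binds tighter than -o: without them, -type f would apply only to the *.php test. With them, -type f restricts every alternative to regular files, which excludes directories that happen to end in one of these extensions.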
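The effect of operator precedence is easy to check in a sandbox. This sketch (with made-up names) puts a directory called backup.txt next to two regular files and compares the grouped and ungrouped forms.

```shell
# Throwaway sandbox; the names are hypothetical.
sandbox=$(mktemp -d)
touch "$sandbox/page.php" "$sandbox/readme.txt"
mkdir "$sandbox/backup.txt"      # a directory whose name ends in .txt

# Without parentheses, -type f binds only to the first -name test,
# so the directory backup.txt is matched by the second alternative.
ungrouped=$(find "$sandbox" -type f -name "*.php" -o -name "*.txt" | wc -l)

# With \( ... \), -type f applies to every alternative.
grouped=$(find "$sandbox" -type f \( -name "*.php" -o -name "*.txt" \) | wc -l)

echo "ungrouped: $ungrouped matches, grouped: $grouped matches"
rm -rf "$sandbox"
```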

user files
find . -user alice -exec chown svennd:TeamAlice "{}" \;

This finds everything owned by user alice and changes the ownership to user svennd in group TeamAlice.
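Since chown normally needs root, it can help to preview the commands first. The sketch below is a dry run under a few assumptions: the current user stands in for alice (-user also accepts a numeric ID), the files are made up, and echo prints each chown command instead of running it. It writes the owner and group in the owner:group form.

```shell
# Throwaway sandbox; file names are hypothetical.
sandbox=$(mktemp -d)
touch "$sandbox/a.txt" "$sandbox/b.txt"

# The current user's numeric ID stands in for alice.
# echo shows the chown command for each match without changing anything.
preview=$(find "$sandbox" -type f -user "$(id -u)" -exec echo chown svennd:TeamAlice "{}" \;)

printf '%s\n' "$preview"
rm -rf "$sandbox"
```

Once the preview looks right, dropping the echo runs the real chown.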

Complex examples

Find all *.tsv files larger than 1 MiB, compress them with the very fast lz4 at high compression (level 9), and remove the source file afterwards.

find . -name "*.tsv" -size +1M -exec lz4 -9 --rm '{}' \;

In the same vein: gzip all files ending in .fastq. The ! -name '*.gz' test is redundant here, since a name ending in .fastq cannot also end in .gz, but it is an extra safety net.

find . -type f -name "*.fastq" ! -name '*.gz' -exec gzip "{}" \;
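A quick way to convince yourself the gzip command does what you expect is to run it on a couple of dummy files first. This is only a sketch; the file names and contents are invented.

```shell
# Throwaway sandbox with one uncompressed and one already compressed file.
sandbox=$(mktemp -d)
printf 'ACGT\n' > "$sandbox/sample.fastq"
printf 'ACGT\n' | gzip > "$sandbox/done.fastq.gz"

# Same expression as above, pointed at the sandbox.
find "$sandbox" -type f -name "*.fastq" ! -name "*.gz" -exec gzip "{}" \;

# gzip replaces each source file with <name>.gz.
remaining=$(find "$sandbox" -type f -name "*.fastq" | wc -l)
compressed=$(find "$sandbox" -type f -name "*.gz" | wc -l)

echo "uncompressed left: $remaining, compressed: $compressed"
rm -rf "$sandbox"
```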

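Recursively remove all directories matching *.tsv.index in as few rm invocations as possible (the + batches the matches into one command line). Run the echo variant first as a test: it prints what would be removed, and swapping echo for rm -rf does the real work.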

find . -type d -name "*.tsv.index" -exec echo {} +
find . -type d -name "*.tsv.index" -exec rm -rf {} +
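The echo-then-rm routine can be rehearsed safely in a scratch directory. The directory names below are hypothetical; the two find commands are the ones above, pointed at the sandbox.

```shell
# Throwaway sandbox; directory names are hypothetical.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/run1/a.tsv.index" "$sandbox/run2/b.tsv.index" "$sandbox/run2/keep"

# Dry run: echo prints the paths that would be handed to rm.
planned=$(find "$sandbox" -type d -name "*.tsv.index" -exec echo {} + | wc -w)

# Real run: same expression, echo swapped for rm -rf.
find "$sandbox" -type d -name "*.tsv.index" -exec rm -rf {} +

left=$(find "$sandbox" -type d -name "*.tsv.index" | wc -l)
kept=$(find "$sandbox" -type d -name "keep" | wc -l)

echo "planned: $planned, left after rm: $left, untouched: $kept"
rm -rf "$sandbox"
```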

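A combination of a few commands: calculate the storage used by all files ending in .tsv that are larger than 1 MiB and have no extra hard links.

find . -name "*.tsv" -size +1M -links 1 -print0 | du -hc --files0-from=- | tail -n 1

du does not read file names from standard input on its own, so a plain pipe into du just reports the current directory. GNU du's --files0-from=- option makes it read the NUL-separated list that -print0 produces. To total everything named log_jobs instead, find can hand the matches to du directly: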

find . -name "log_jobs" -exec du -hc {} +
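To check that only large, singly linked .tsv files end up in the total, here is a sandbox sketch. It assumes GNU du (for --files0-from=-) and uses made-up file names; the reported size depends on the filesystem, so only the file count is fixed.

```shell
# Throwaway sandbox; file names are hypothetical.
sandbox=$(mktemp -d)
head -c 2097152 /dev/zero > "$sandbox/big.tsv"        # 2 MiB, one link
head -c 2097152 /dev/zero > "$sandbox/linked.tsv"     # 2 MiB ...
ln "$sandbox/linked.tsv" "$sandbox/linked_copy.tsv"   # ... now with two links
printf 'x\ty\n' > "$sandbox/small.tsv"                # below the size limit

# Only big.tsv is larger than 1 MiB and has a link count of 1.
matches=$(find "$sandbox" -name "*.tsv" -size +1M -links 1 | wc -l)

# du reads the NUL-separated names from stdin; the last line is the grand total.
total=$(find "$sandbox" -name "*.tsv" -size +1M -links 1 -print0 | du -hc --files0-from=- | tail -n 1)

echo "matched files: $matches"
echo "$total"
rm -rf "$sandbox"
```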

 

Find files modified less than 5 minutes ago:

find . -type f -mmin -5

and files modified more than 5 minutes ago:

find . -type f -mmin +5
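The sign in front of the number is easy to mix up, so here is a small sketch that backdates one file and checks which test picks up which. It assumes GNU touch for the '10 minutes ago' date syntax; the file names are invented.

```shell
# Throwaway sandbox; file names are hypothetical.
sandbox=$(mktemp -d)
touch "$sandbox/fresh.log"
touch -d '10 minutes ago' "$sandbox/stale.log"   # GNU touch date syntax

# -mmin -5: modified less than 5 minutes ago.
recent=$(find "$sandbox" -type f -mmin -5)

# -mmin +5: modified more than 5 minutes ago.
older=$(find "$sandbox" -type f -mmin +5)

echo "recent: $(basename "$recent")"
echo "older:  $(basename "$older")"
rm -rf "$sandbox"
```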

Hard links are nice, but also a (enter curse-word) to track. Luckily find can locate them:

find /data -xdev -samefile file.txt

This finds all names that refer to exactly the same file as file.txt (so only hard links, no soft links or copies). Hard links cannot cross file systems, so it is logical to add -xdev, which tells find not to descend into other file systems; it goes before the tests because it applies to the whole search. If you also want soft links that point to the file, drop -xdev and add -L (find -L /data -samefile file.txt).
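The difference between a hard link, a soft link and a copy shows up nicely in a small experiment. The names below are made up; -samefile and -L behave as described above.

```shell
# Throwaway sandbox; file names are hypothetical.
sandbox=$(mktemp -d)
echo "data" > "$sandbox/file.txt"
ln "$sandbox/file.txt" "$sandbox/hard.txt"       # hard link: same inode
ln -s "$sandbox/file.txt" "$sandbox/soft.txt"    # soft (symbolic) link
cp "$sandbox/file.txt" "$sandbox/copy.txt"       # copy: a different inode

# Matches file.txt and hard.txt only.
hard_only=$(find "$sandbox" -xdev -samefile "$sandbox/file.txt" | wc -l)

# With -L, soft links are followed, so soft.txt matches as well.
with_soft=$(find -L "$sandbox" -samefile "$sandbox/file.txt" | wc -l)

echo "hard links only: $hard_only, including soft links: $with_soft"
rm -rf "$sandbox"
```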

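Generate an md5sum for every file under the current directory (recursively), except the files mylog.log and md5.lst. The output file has to be excluded because the shell creates it before find starts.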

find . -type f ! -name "mylog.log" ! -name "md5.lst" -exec md5sum "{}" + > md5.lst
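A checksum list is only useful if it can be verified later, which md5sum -c does. The sketch below builds the list in a scratch directory with made-up files and then checks it. It assumes GNU md5sum for the --quiet option.

```shell
# Throwaway sandbox; file names and contents are hypothetical.
sandbox=$(mktemp -d)
cd "$sandbox" || exit 1
echo "alpha" > a.txt
echo "beta" > b.txt
echo "noise" > mylog.log

# Same command as above: checksum everything except the two excluded names.
find . -type f ! -name "mylog.log" ! -name "md5.lst" -exec md5sum "{}" + > md5.lst

entries=$(wc -l < md5.lst)

# md5sum -c re-reads every listed file and compares the checksums.
if md5sum -c --quiet md5.lst; then verified=yes; else verified=no; fi

echo "entries: $entries, verified: $verified"
cd - > /dev/null && rm -rf "$sandbox"
```

The paths in md5.lst are relative (./a.txt), so the check has to run from the same directory the list was made in.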

 

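A quick and dirty way to find directories (= experiments) that changed in the last 90 days, sorted by date, with the hard-linked .save directories filtered out. Note that -ctime tests the inode change time, not the creation time, and that sort -t_ -k 2 sorts from the second _-separated field of the name onwards, so this assumes the date sits there.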

find . -maxdepth 1 -name "*_machine_ID_*" -type d -ctime -90 | grep -v .save | sort -t_ -k 2
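The sort step is the least obvious part, so here it is on its own. The directory names are hypothetical and assume the date is the second _-separated field; LC_ALL=C is added only to make the ordering independent of the locale.

```shell
# Hypothetical directory names; field 2 (split on _) is a date here.
names='runB_20180105_machine_ID_1
runA_20180301_machine_ID_2
runC_20171224_machine_ID_3
runD_20180210_machine_ID_4.save'

# Drop the .save entries, then sort on everything from field 2 onwards.
sorted=$(printf '%s\n' "$names" | grep -v '\.save$' | LC_ALL=C sort -t_ -k 2)

printf '%s\n' "$sorted"
```

The escaped dot in '\.save$' matches a literal dot at the end of the name; the unescaped .save in the one-liner above is looser and matches any character followed by save.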

 

Certain names can be excluded with ! -name, for example ! -name "*file". This finds all directories starting with 17 and not ending in .save (a hard link for us), and shows the size of each.

find . -maxdepth 1 -name "17*" ! -name "*.save" -type d -exec du -hs '{}' +

 

Count files of a certain type in a single directory (not recursively):

find . -maxdepth 1 -name "*.fastq" | wc -l
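Counting lines works as long as no file name contains a newline. If that is a concern, one alternative (assuming GNU find, which provides -printf) is to print a single character per match and count characters instead. A sketch with made-up names:

```shell
# Throwaway sandbox; one file name deliberately contains a newline.
sandbox=$(mktemp -d)
nl='
'
touch "$sandbox/reads.fastq" "$sandbox/odd${nl}name.fastq"

# Line counting sees the odd name as two lines.
naive=$(find "$sandbox" -maxdepth 1 -name "*.fastq" | wc -l)

# One dot per match, then count the dots.
robust=$(find "$sandbox" -maxdepth 1 -name "*.fastq" -printf '.' | wc -c)

echo "wc -l says $naive, dot counting says $robust"
rm -rf "$sandbox"
```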

 
