Find Gems
05 Nov 2023
Introduction
The find utility is an extremely flexible query tool for your file system. It allows you to navigate and traverse your folder structures.
From the man page:
This manual page documents the GNU version of find. GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name.
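To make the left-to-right evaluation and operator precedence concrete, here is a small sketch. The scratch directory and the file names in it are made up for illustration. Because the implicit "and" between tests binds tighter than -o, alternatives usually need grouping with escaped parentheses:

```shell
# Scratch directory with hypothetical file names, purely for illustration.
demo=$(mktemp -d)
touch "$demo/notes.txt" "$demo/readme.md" "$demo/image.png"
mkdir "$demo/archive.txt"   # a directory whose name also ends in .txt

# Grouped: -type f is tested first. For the directory it is false, so the
# "and" short-circuits and the name tests are never evaluated for it.
grouped=$(find "$demo" -type f \( -name '*.txt' -o -name '*.md' \) | sort)

# Ungrouped: this parses as
#   -name '*.txt'  -o  ( -name '*.md' -a -type f )
# so the directory archive.txt matches on the left-hand side of the "or".
ungrouped=$(find "$demo" -name '*.txt' -o -name '*.md' -type f | sort)

printf 'grouped:\n%s\n\nungrouped:\n%s\n' "$grouped" "$ungrouped"

rm -rf "$demo"
```

The grouped form lists only the two regular files, while the ungrouped form also lists the directory.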
In this article, we’ll go through a few of its uses.
Find by name
You can simply find by name with the -name switch:
find /the/directory -name the-file-name.txt
You can also ignore case by switching out -name for -iname.
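Both switches also accept shell-style glob patterns. The following is a minimal sketch using a scratch directory and invented file names. The pattern is quoted so that the shell hands it to find unexpanded:

```shell
# Scratch directory with hypothetical file names, purely for illustration.
demo=$(mktemp -d)
touch "$demo/Report.TXT" "$demo/summary.txt" "$demo/notes.md"

# -name is case sensitive: only summary.txt matches.
exact=$(find "$demo" -type f -name '*.txt')

# -iname ignores case: Report.TXT matches as well.
nocase=$(find "$demo" -type f -iname '*.txt')

printf 'with -name:\n%s\n\nwith -iname:\n%s\n' "$exact" "$nocase"

rm -rf "$demo"
```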
Find by permissions
To find files with a specific permission mode (in this case 777):
find . -type f -perm 0777 -print
You can take the converse of this with !:
find . -type f ! -perm 0777 -print
Average file size
Let’s say you have a directory of text files, and you’d like to find their average size.
By joining find with awk, you can do just that. Note that wc -w counts words, so the result below is the average number of words per file; swap in wc -c to average the size in bytes instead:
find . -type f -exec wc -w {} \; | awk '{numfiles=numfiles+1;total += $1} END{print total/numfiles}'
Random sampling
You may need to take a random sample of files in a given folder. You can start this process off with find, and then, through clever usage of sort and head, get a random sample.
This will take 500 random files from a directory:
find /the/path -type f -print0 | sort -zR | head -zn 500
find /the/path -type f -print0 prints out the files using \0 as a delimiter thanks to -print0.
sort is told to use \0 as its delimiter with -z, and -R tells it to put the entries in random order.
head then takes the first 500, again using \0 as the delimiter thanks to -z. The -z options on sort and head are GNU extensions.
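To see the pipeline in action on a small scale, here is a sketch that samples 5 files out of 20. It assumes GNU sort and head, and the scratch directories and file names are invented for illustration. Since the sample is NUL-delimited, it is passed to the next command with xargs -0 rather than stored in a shell variable:

```shell
# Scratch directory holding 20 hypothetical files with spaces in their names.
demo=$(mktemp -d)
for i in $(seq 1 20); do touch "$demo/file $i.txt"; done

# Count the NUL terminators to confirm the sample size.
count=$(find "$demo" -type f -print0 | sort -zR | head -zn 5 | tr -cd '\0' | wc -c)
echo "sampled $count files"

# Copy a fresh random sample into a second scratch directory.
sample_dir=$(mktemp -d)
find "$demo" -type f -print0 | sort -zR | head -zn 5 |
  xargs -0 -I {} cp {} "$sample_dir"
copied=$(find "$sample_dir" -type f | wc -l)
echo "copied $copied files"

rm -rf "$demo" "$sample_dir"
```

The NUL delimiter is what keeps file names containing spaces or newlines intact through every stage. With GNU coreutils, shuf -z -n 5 could stand in for the sort and head pair.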