Cogs and Levers A blog full of technical stuff

Find Gems

Introduction

The find utility is an extremely flexible query tool for your file system. It allows you to navigate and traverse your folder structures.

From the man page:

This manual page documents the GNU version of find. GNU find searches the directory tree rooted at each given file name by evaluating the given expression from left to right, according to the rules of precedence (see section OPERATORS), until the outcome is known (the left hand side is false for and operations, true for or), at which point find moves on to the next file name.

In this article, we’ll go through some usages.

Find by name

You can simply find by name with the -name switch:

find /the/directory -name the-file-name.txt

You can also ignore case by switching out -name for -iname.

Find by permissions

If you’d like to find files with a specific permission (in this case 777):

find . -type f -perm 0777 -print

You can take the converse of this with !:

find . -type f ! -perm 0777 -print

Average file size

Let’s say you have a directory of text files, and you’d like to find the average size of them. By joining find with awk, you can use the following to do just that:

find . -type f -exec wc -w {} \; | awk '{numfiles=numfiles+1;total += $1} END{print total/numfiles}'

Random sampling

You may need to take a random sample of files in a given folder. You can start this process off with find, and then through clever usage of sort, and tail you can get a random sample.

This will take 500 random files from a directory:

find /the/path -type f -print0 | sort -zR | head -zn 500

find /the/path -type f -print0 prints out the files using \0 as a delimiter thanks to -print0.

sort is told to use \0 as its delimiter with -z and -R is to sort them randomly.

head now takes the first 500, again using \0 as the delimiter.