thsh:~$

Finding disk space sinners

In search of a way to find disk hogs on a shared Linux system, I came across this neat pipeline:

$ du -h <directory> | grep '[0-9\.]\+G'

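The -h option makes du (disk usage) print sizes in human-readable form, such as 4.0K, 512M or 2.3G. The output is then piped through grep, which keeps only the lines containing one or more digits or dots followed by a G, i.e. sizes in the gigabyte range.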
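
To see what the pattern actually lets through, here is a minimal sketch that feeds grep a few made-up lines shaped like du -h output (size, tab, path). The sizes and directory names are invented for illustration, and the \+ operator assumes GNU grep.

```shell
# Hypothetical du -h style lines: size, a tab, then the path.
# printf reuses the format string for each size/path pair.
matches=$(printf '%s\t%s\n' \
    4.0K ./notes \
    512M ./photos \
    2.3G ./videos \
    12K ./old_2G_backup |
  grep '[0-9\.]\+G')

# Show which lines survived the filter.
printf '%s\n' "$matches"
```

Both ./videos and ./old_2G_backup come through: the first because of its size, the second only because its name happens to contain 2G.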

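The expression can be made more accurate, but, of course, also more complex. For instance, grep '^\s*[0-9\.]\+G' anchors the match to the start of the line, where du prints the size. A G that merely appears in a directory name no longer matches, so only directories of roughly 1GB or more are returned.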
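
A quick way to check the anchored pattern is to run it over a handful of invented lines in du -h format. This is only a sketch: the sample sizes and paths are hypothetical, and \s and \+ assume GNU grep.

```shell
# Hypothetical du -h style lines: size, a tab, then the path.
anchored=$(printf '%s\t%s\n' \
    4.0K ./notes \
    512M ./photos \
    2.3G ./videos \
    12K ./old_2G_backup |
  grep '^\s*[0-9\.]\+G')

# Only lines whose size column ends in G should remain.
printf '%s\n' "$anchored"
```

Here only ./videos is left; the 12K directory with 2G in its name is filtered out because the match has to start at the size column.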

Obviously there’s a trade-off between accuracy and complexity, and a simple pattern like the first one will always let some false positives through.