
A Simple Linux Disk Usage Gadget
A short post about a shell gadget, built from du, sort, and head, that I use to quickly find large, problematic files in a directory.
By: TheHans255
3/28/2025
A bit of a shorter article today. Whenever I'm working on my computer and find myself managing a lot of disk space (such as when syncing files between computers, making backups, or cleaning up a full disk), I like to use the following shell command to quickly find the files that are causing the most trouble:
$ du -a -BM -t 1M | sort -rn | head -n 20
Here's some example output from my personal rips of the first 200 Strong Bad Emails. The whole folder takes about 1.3 GB, and the largest files are (unsurprisingly) the two largest sbemails and the sbemail with the longest segment of live-action video.
$ du -a -BM -t 1M | sort -rn | head -n 20
1306M .
19M ./150 alternate universe.mkv
17M ./159 retirement.mkv
16M ./126 best thing.mkv
15M ./176 hygiene.mkv
14M ./200 email thunder.mkv
13M ./185 nightlife.mkv
13M ./183 yes, wrestling.mkv
13M ./106 dangeresque 3.mkv
12M ./174 mini-golf.mkv
12M ./155 theme song.mkv
12M ./143 technology.mkv
12M ./133 bottom 10.mkv
12M ./130 do over.mkv
12M ./125 rock opera.mkv
11M ./190 licensed.mkv
11M ./164 looking old.mkv
11M ./135 lady...ing.mkv
11M ./100 flashback.mkv
10M ./195 love poems.mkv
This is what it looks like when run on my personal Steam directory (specifically steamapps/common):
$ du -a -BM -t 1M | sort -rn | head -n 20
573147M .
94193M ./Helldivers 2
94109M ./Helldivers 2/data
82108M ./METAPHOR
81778M ./METAPHOR/base.cpk
73630M ./MarvelRivals
73152M ./MarvelRivals/MarvelGame
72740M ./MarvelRivals/MarvelGame/Marvel
72085M ./MarvelRivals/MarvelGame/Marvel/Content
72046M ./MarvelRivals/MarvelGame/Marvel/Content/Paks
65204M ./Titanfall2
49522M ./Titanfall2/r2
42183M ./Titanfall2/r2/paks/Win64
42183M ./Titanfall2/r2/paks
40340M ./P5R
39961M ./P5R/CPK
30116M ./Phasmophobia
29990M ./Phasmophobia/Phasmophobia_Data
25983M ./Satisfactory
25541M ./P5R/CPK/BASE.CPK
This indicates that the whole thing takes up about 573 GB, and the games that take up the most space are Helldivers 2, METAPHOR: ReFantazio, Marvel Rivals, Titanfall 2, and Persona 5 Royal, in that order.
Running this in the METAPHOR: ReFantazio folder indicates that the vast majority of its space is in an 82 GB file, presumably read as some sort of virtual disk image:
$ du -a -BM -t 1M | sort -rn | head -n 20
82108M .
81778M ./base.cpk
319M ./METAPHOR.exe
6M ./libcurl-x64.dll
2M ./PhysXCommon_64.dll
2M ./PhysX_64.dll
2M ./MetaphorFix.asi
So, how does it work, and how can you adjust it to your needs?
$ du -a -BM -t 1M | sort -rn | head -n 20
- du is the "disk usage" utility - the heart of this command. The utility itself is POSIX, but the -B and -t options used here are extensions, so this post assumes GNU du as found on most Linux systems.
  - -a indicates all files - omit this and you'll only see directories.
  - -BM indicates that sizes should be reported in megabytes (units of 1,048,576 bytes). You can change the second letter to K, M, G, T, or further to change the scaling, or replace the option with --block-size=1 to give the size in bytes.
    - At very small file sizes, you might also want to add the flag --apparent-size, which reports the logical size of the file. By default, du reports the space that the file actually takes on disk, which can be much larger since files are rounded up to whole disk blocks.
  - -t 1M indicates that only entries of at least 1 MB will be reported. This is to help with performance - since files smaller than that are unlikely to appear in the top 20, cutting them off prevents sort from having to deal with them. If your expected top sizes are smaller, lower this number (such as to -t 1K) or remove the option entirely.
- sort is the POSIX sort utility, which sorts lines of text input.
  - -r sorts in reverse order (largest number first).
  - -n performs a numerical sort - it indicates that each line starts with a number and that lines should be ordered by the value of that number. Omitting this causes the lines to be sorted alphabetically.
- head is the POSIX utility that prints only the first few lines of input.
  - -n 20 indicates that 20 lines should be printed. You can change this number to see more lines.
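If you find yourself adjusting these options a lot, one way to bundle them up is a small shell function. This is only a sketch: the name biggest and its argument order are made up for illustration, and it assumes GNU du, since -B and -t are not POSIX options.

```shell
# biggest [DIR] [COUNT] [THRESHOLD]
# Hypothetical helper; the defaults mirror the command from this post.
# Assumes GNU du (-B and -t are GNU options).
# The body is wrapped in ( ) so it runs in a subshell and the variables
# below do not leak into the calling shell.
biggest() (
    dir=${1:-.}          # directory to scan, default: current directory
    count=${2:-20}       # number of lines to keep, default: 20
    threshold=${3:-1M}   # skip entries smaller than this, default: 1M

    # Same pipeline as before, with the adjustable parts as parameters.
    du -a -BM -t "$threshold" -- "$dir" | sort -rn | head -n "$count"
)
```

With no arguments, biggest behaves like the original one-liner. Something like biggest . 50 1K would instead show the top 50 entries and include anything of at least 1 KB.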
All of this can also be piped to less to scroll through the output, or redirected to a text file (e.g. > out.txt) for further analysis.
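As one example of what "further analysis" might look like, here is a rough sketch that takes a saved report and keeps only the direct children of the scanned directory, which cuts out the nested entries that pile up in output like the Steam listing. The helper name top_level is made up for illustration, the sample lines are copied from the Steam output in this post, and the sketch assumes file names contain no tabs or newlines.

```shell
# Hypothetical saved report. These lines are copied from the Steam example
# in this post; in practice the file would come from something like:
#   du -a -BM -t 1M | sort -rn > out.txt
report=$(mktemp)
printf '%s\t%s\n' \
    573147M . \
    94193M './Helldivers 2' \
    94109M './Helldivers 2/data' \
    82108M ./METAPHOR \
    81778M ./METAPHOR/base.cpk \
    > "$report"

# top_level FILE (hypothetical helper): keep only direct children of the
# scanned directory. du separates size and path with a tab, so splitting on
# tabs keeps names with spaces intact. A direct child such as "./METAPHOR"
# splits into exactly two pieces around its single slash.
top_level() {
    awk -F '\t' 'split($2, parts, "/") == 2' "$1"
}

top_level "$report"
rm -f "$report"
```

On the sample lines above, only the ./Helldivers 2 and ./METAPHOR entries are printed; the total for . and everything nested deeper is dropped.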
Of course, this sort of analysis is very simple compared to what you might get from a GUI disk analysis utility, such as KDE Filelight or WinDirStat on Windows. It's a nice quick-and-dirty trick, though, especially when you want a fast look at a particular directory.