Server running out of space 😱

Jun 25 - 2 min read


If your job involves maintaining servers, you are likely to stumble onto this issue sooner or later. Unlike CPU load, high traffic, etc., I found this issue harder to track because you need to go down to the OS level and check manually. Moreover, hosting providers don’t raise alarms for “server running out of space” events.

Everything mentioned in this post relates to an AWS EC2 instance (Ubuntu 20) running a Node (Express) server managed by pm2.

This post is written as a future reference for myself and for fellow developers struggling in this area.

Scenario

A few days ago, one of the EC2 servers I was maintaining began to show weird behavior.

The Redis queue stopped working. I couldn’t SSH into the server to diagnose it. Meanwhile, thousands of requests per minute were waiting for the server to respond. It was hell.

Finally, I was able to get into the server and check the pm2 logs, which showed “MISCONF Redis is configured to save RDB snapshots”. After some googling, I found that the EC2 instance was running out of disk space.

If you are stuck in a similar situation, this post may help you get your server running smoothly again.

What to do next

The first thing to do is check how much space is left. Unlike CPU load, disk I/O, etc., which you can view on your cloud provider’s dashboard, you cannot see the remaining disk space there. You need to SSH into the server itself and run the df command.

# view space
df -h

# view partitions
lsblk
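One gotcha worth ruling out here: “No space left on device” can appear even when df -h shows free space, because the filesystem can run out of inodes (file slots) rather than bytes, for example under millions of tiny cache files. A quick check:

```shell
# Show inode usage for the root filesystem; an IUse% of 100%
# means no new files can be created even if free bytes remain
df -i /
```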

The next step would be to identify which files/folders are occupying your server’s space.

Start by running this as root or with sudo:

du -cha --max-depth=1 / | grep -E "M|G"

For example, if /var turned out to be the biggest, you would drill into it next, and so on until you identify the big fish.

du -cha --max-depth=1 /var | grep -E "M|G"
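If grepping for "M|G" feels clumsy, a variant I find handy (assuming GNU coreutils, which Ubuntu ships) sorts the entries by size so the big fish floats to the top:

```shell
# Largest entries directly under /var, biggest first; permission
# errors are silenced so the list stays readable without sudo
du -h --max-depth=1 /var 2>/dev/null | sort -rh | head -n 10
```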

If those big files or folders turn out to be junk that can be cleaned, just go ahead and run:

sudo rm -r ./junk-folder-to-clean

In my case it was more complex: it was the Redis cache plus other OS-related junk.

Luckily, I found this script in a GitHub gist.

Before copy-pasting and executing it, make sure you understand what it does. You can check the comments on GitHub to learn more.

#!/bin/sh
# Check the drive space used by cached packages
du -sh /var/cache/apt/archives
# Truncate all log files
logs=`find /var/log -type f`
for i in $logs
do
  > $i
done
# Get rid of partial packages
apt-get clean && apt-get autoclean
apt-get remove --purge -y software-properties-common
# Get rid of packages that are no longer required
apt-get autoremove -y
# Get rid of orphaned packages
deborphan | xargs sudo apt-get -y remove --purge
# Free up space by clearing out the cached packages
apt-get clean
# Empty the trash
rm -rf /home/*/.local/share/Trash/*/**
rm -rf /root/.local/share/Trash/*/**
# Remove man pages
rm -rf /usr/share/man/??
rm -rf /usr/share/man/??_*
# Delete all .gz and rotated log files
find /var/log -type f -regex ".*\.gz$" | xargs rm -Rf
find /var/log -type f -regex ".*\.[0-9]$" | xargs rm -Rf
# Clean out old kernels
dpkg-query -l | grep linux-im*
apt-get purge $(dpkg -l 'linux-*' | sed '/^ii/!d;/'"$(uname -r | sed "s/\(.*\)-\([^0-9]\+\)/\1/")"'/d;s/^[^ ]* [^ ]* \([^ ]*\).*/\1/;/[0-9]/!d' | head -n -1) --assume-yes
apt-get install linux-headers-`uname -r|cut -d'-' -f3`-`uname -r|cut -d'-' -f4`
# Cleaning is completed
echo "Cleaning is completed"

This approach works once, but the problem will definitely recur. A better way is to put the cleanup in a shell script and run it as a cron job.
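As a sketch of what that cron job could look like — the 90% threshold, the script path, and the specific cleanup commands below are my assumptions, so adapt them to your setup:

```shell
#!/bin/sh
# Hypothetical cleanup script, e.g. saved as /usr/local/bin/disk-clean.sh.
# Runs cleanup only when the root partition crosses 90% usage.
USAGE=$(df / --output=pcent | tail -n 1 | tr -dc '0-9')
if [ "$USAGE" -gt 90 ]; then
    apt-get clean                    # drop cached .deb packages
    journalctl --vacuum-size=100M    # cap the systemd journal size
fi
```

Scheduled nightly as root with a crontab entry such as: 0 3 * * * /usr/local/bin/disk-clean.sh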


Sometimes, simply cleaning up junk isn’t enough: your server may need a storage upgrade. If you are using EC2, there is a nice video tutorial from AWS itself.
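For reference, after you increase the EBS volume size in the AWS console, the remaining work on an Ubuntu instance usually boils down to a couple of commands. Device names vary (NVMe-based instances use /dev/nvme0n1 instead of /dev/xvda), so treat this as a sketch rather than a recipe:

```shell
lsblk                       # confirm the kernel sees the new volume size
sudo growpart /dev/xvda 1   # extend partition 1 to fill the volume
sudo resize2fs /dev/xvda1   # grow the ext4 filesystem into the new space
df -h                       # verify the extra space is available
```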

Also, if you use pm2 to manage your server, make sure you use a tool like pm2-logrotate that automatically rotates and cleans your server logs.
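Setting it up is just a couple of pm2 commands; the size and retention values below are examples I picked, not defaults you must use:

```shell
pm2 install pm2-logrotate            # install the log-rotation module
pm2 set pm2-logrotate:max_size 10M   # rotate once a log reaches 10 MB
pm2 set pm2-logrotate:retain 7       # keep at most 7 rotated files
```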