What's going on with disk space?

After hitting a surprising "No space left on device" error, I ran df -h; its output follows.

Filesystem      Size  Used Avail Use% Mounted on
devtmpfs         63G     0   63G   0% /dev
tmpfs            63G     0   63G   0% /dev/shm
tmpfs            63G   18M   63G   1% /run
tmpfs            63G     0   63G   0% /sys/fs/cgroup
/dev/nvme0n1p5   64Z   64Z   42G 100% /
tmpfs            13G   60K   13G   1% /tmp
/dev/nvme0n1p2   96M   25M   72M  26% /boot/efi
/dev/nvme0n1p6   64Z   64Z   19G 100% /var
tmpfs           9,6M  340K  9,3M   4% /var/log
tmpfs           9,6M     0  9,6M   0% /var/tmp
/dev/nvme0n1p7   64Z   64Z  806G 100% /home
tmpfs            13G   12K   13G   1% /run/user/42
tmpfs            13G  4,0K   13G   1% /run/user/1000

To me the Avail column values are realistic, but the Size and Used ones are driving me nuts.
On the root of the file system I issued the following command.
for d in $(ls) ; do du -sh $d 2>/dev/null; done
The output follows.

[screenshot of the du output; image not preserved]

What am I missing?! :worried:

Thanks in advance!

Andrea

Sometimes a process holds on to disk space until the service or process is restarted. I’ve run into situations like that. Have you tried restarting the server to see if that 42GB becomes available?

Just rebooted. Nope.

[screenshot; image not preserved]

It is frustrating how often I still find myself ignorant after so many Linux installations.

I hope I’ll get out of this nightmare soon.

Thanks anyway!

Andrea

There's something else:
It shows /dev/nvme0n1p5 and /dev/nvme0n1p6 as having 64 *Zetta*bytes.
That can’t be true, so I suspect that the partition table of your NVMe disk is somehow corrupted.

Later:
Maybe not the partition table, but definitely the filesystem (superblock?). So you should
check whether lsblk and fdisk -l report correct partition sizes.
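
A side note on where a figure like 64Z could come from. This is a guess under stated assumptions, not a diagnosis. df -h uses powers of 1024, so Z stands for 2^70 bytes. If the filesystem reported a block count that had wrapped around to roughly 2^64 (for example a small negative number read as an unsigned 64-bit value), and the block size were 4 KiB, the product would be about 2^76 bytes, which df would print as 64Z. Both the wrap-around and the 4 KiB block size are assumptions; nothing in the thread confirms them.

```shell
# Work with exponents only: 2^76 does not fit into the shell's 64-bit integers.
block_count_bits=64   # assumption: block count wrapped around to about 2^64
block_size_bits=12    # assumption: 4 KiB blocks, i.e. 2^12 bytes
zib_bits=70           # 1 ZiB = 2^70 bytes

size_in_zib=$(( 1 << (block_count_bits + block_size_bits - zib_bits) ))
echo "${size_in_zib}Z"   # prints 64Z
```

If that reading is right, it would point at the numbers the filesystem reports about itself rather than at the partition table.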

First, there is a slightly simpler ‘du’ invocation: sudo du -d 1 -hx /

The sizes shown by df look fascinating. What does lsblk show?

PS. Please, no bitmap screenshots.

lsblk gives

NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda           8:0    0   3,7T  0 disk 
├─sda1        8:1    0    16M  0 part 
└─sda2        8:2    0   3,7T  0 part 
sdb           8:16   0   5,5T  0 disk 
└─sdb1        8:17   0   5,5T  0 part 
nvme0n1     259:0    0   1,8T  0 disk 
├─nvme0n1p1 259:1    0   450M  0 part 
├─nvme0n1p2 259:2    0   100M  0 part /boot/efi
├─nvme0n1p3 259:3    0    16M  0 part 
├─nvme0n1p4 259:4    0 930,1G  0 part 
├─nvme0n1p5 259:5    0    50G  0 part /
├─nvme0n1p6 259:6    0    20G  0 part /var
└─nvme0n1p7 259:7    0 862,4G  0 part /home

and fdisk -l gives



Disk /dev/nvme0n1: 1,8 TiB, 2000398934016 bytes, 3907029168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: gpt
Disk identifier: 3F3A7ADB-96DC-4651-82E7-6C2869DAC50F

Device              Start        End    Sectors   Size Type
/dev/nvme0n1p1       2048     923647     921600   450M Windows recovery environment
/dev/nvme0n1p2     923648    1128447     204800   100M EFI System
/dev/nvme0n1p3    1128448    1161215      32768    16M Microsoft reserved
/dev/nvme0n1p4    1161216 1951746047 1950584832 930,1G Microsoft basic data
/dev/nvme0n1p5 1951746048 2056603647  104857600    50G Linux filesystem
/dev/nvme0n1p6 2056603648 2098546687   41943040    20G Linux filesystem
/dev/nvme0n1p7 2098546688 3907029134 1808482447 862,4G Linux filesystem


Disk /dev/sda: 3,7 TiB, 4000787030016 bytes, 7814037168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 1CECEDC5-FE1E-4007-A62B-767B8BA272DB

Device     Start        End    Sectors  Size Type
/dev/sda1     34      32767      32734   16M Microsoft reserved
/dev/sda2  32768 7814033407 7814000640  3,7T Microsoft basic data

Partition 1 does not start on physical sector boundary.


Disk /dev/sdb: 5,5 TiB, 6001174511616 bytes, 11721043968 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: 3958AA31-F16B-4C5A-BCFF-304E97F2E3A3

Device     Start         End     Sectors  Size Type
/dev/sdb1   2048 11721041919 11721039872  5,5T Microsoft basic data
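
As a quick cross-check of the tables above, the sector counts can be multiplied out by hand. A minimal sketch using the nvme0n1p5 and nvme0n1p6 rows and the 512-byte logical sector size that fdisk reports; the function name is only for illustration:

```shell
# Convert a count of 512-byte sectors to whole GiB (integer division).
sectors_to_gib() {
  echo $(( $1 * 512 / 1024 / 1024 / 1024 ))
}

sectors_to_gib 104857600   # nvme0n1p5, prints 50
sectors_to_gib 41943040    # nvme0n1p6, prints 20
```

Both results match the 50G and 20G that lsblk shows, so the partition table and lsblk agree with each other.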

Can you spot any explanation there? So far I’m puzzled…

Andrea

Below is the output of du -d 1 -hx /.


|4,0K|/opt|
|---|---|
|8,0K|/media|
|16K|/lost+found|
|4,0K|/srv|
|47M|/root|
|12K|/mnt|
|206M|/boot|
|4,4G|/usr|
|34M|/etc|
|4,7G|/|

I respect the request to avoid screenshots. However, as you can see, when I paste the output here I get pipes as separator characters, and I don’t know why. That is why I resorted to screenshots.

Thanks!

Andrea

Fascinating.

The forum software thinks the content is a table and adds “table elements”:

4.3G /var
16K /lost+found
4.0K /mnt

Which one should probably edit to:

Size Path
4.3G /var
16K /lost+found
4.0K /mnt

However, if the paste has other than tab-separated lines, you get:
$ sudo du -d 1 -hx /
4.3G /var
16K /lost+found
4.0K /mnt

These communication issues do not help with the actual problem.

So your partitions look ok.
A hunch (hopefully not true):

In the partition table, there is a Microsoft partition on /dev/nvme0n1p4, just before the corrupted
Linux partitions.

So:
Could it be that this partition was bigger in the past, that when you installed Rocky
you resized the partition but forgot to resize the filesystem (NTFS?) in it, and that you
ran Windows before those strange errors occurred?

If yes: Your data is most likely overwritten (lost), and at least you know the reason why.
If no, I’m out of ideas unfortunately.

If that has happened and filesystems are messed up, then fsck should be able to tell that things are not ok?

Normally I would say yes. On the other hand, IF they are really corrupted, predicting how fsck behaves is more of a guessing game :wink:
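
For what it's worth, a check can be run in a mode that only reports and changes nothing. The thread never states which filesystem type is on these partitions, so the sketch below is an assumption on that point: it only prints a suitable read-only command for ext2/3/4 or XFS and executes nothing. The function name is made up for illustration.

```shell
# Print a read-only check command for a filesystem type and a device.
# Nothing is executed here. The printed command should be run as root on an
# unmounted filesystem, for example from a rescue or live system.
readonly_check_cmd() {
  fstype=$1
  dev=$2
  case "$fstype" in
    ext2|ext3|ext4) echo "e2fsck -f -n $dev" ;;   # -n: open read-only, answer "no" to every question
    xfs)            echo "xfs_repair -n $dev" ;;  # -n: no-modify mode, report only
    *)              echo "no read-only check known here for: $fstype" ;;
  esac
}

# The real type could be read with: lsblk -no FSTYPE /dev/nvme0n1p5
readonly_check_cmd xfs /dev/nvme0n1p5
```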


/dev/nvme0n1p4 is simply the partition I made to install Windows there. I partitioned the entire SSD from scratch. Windows was installed first, and then I installed Rocky Linux in the space I had left for it, which was presumably still unpartitioned when I installed Windows.

/dev/nvme0n1p5, /dev/nvme0n1p6 and /dev/nvme0n1p7 do have the sizes I expected after partitioning. As for the "No space left on device" error, it turned out it was probably due to space running out in /var/tmp, which I had configured as a temporary file system in RAM, but with too small a size limit. I could easily fix that once I had discovered the problem. Elsewhere, in spite of the 100% usage indication, I could successfully create huge dummy files with dd, and no error was raised.
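
For anyone hitting the same limit, here is a minimal sketch of how a tmpfs on /var/tmp could be enlarged. The 1G value is a made-up example, not the size I actually chose, and the block only prints the fstab line; the remount command is left as a comment because it needs root.

```shell
# Hypothetical new size; pick a value that suits the available RAM.
new_size=1G

# Resize the live tmpfs without unmounting it (needs root):
#   mount -o remount,size="$new_size" /var/tmp

# Matching /etc/fstab line so that the size survives a reboot:
fstab_line="tmpfs /var/tmp tmpfs size=${new_size},mode=1777 0 0"
echo "$fstab_line"
```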

I’m pretty much ignorant concerning such issues, but could it be just an SSD-related issue? Maybe something related to the way the SSD is controlled? I remember a setting in the BIOS configuration…

Thanks!

Andrea

2022-09-21

For some reason I decided to install the system again, in the same partitions seen before.
Mysteriously now the behavior is normal, as far as sizes are concerned.

First of all, when using ‘du’, use the ‘-x’ switch, so that it stays on one filesystem and does not descend into other mounts.
When a process uses a file, in /proc/{pid}/fd there is a “reference” to that file.

So imagine that an app writes to a log that is never rotated and that grows to 30G. Then an inexperienced user would ‘rm -f’ that log.
Yet the space won’t be released until the process exits or closes the file.

In such scenarios, use ‘lsof’ and search for deleted files. Once you find one, go to /proc/{pid}/fd/ and run ‘truncate -s 0 {number of the fd}’ to release the space without restarting the application.

P.S.: You can also copy deleted files by ‘cp {number of fd} /{new_destination}’.
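
The behaviour described above can be reproduced safely with a throwaway file. This is a minimal sketch, assuming Linux with /proc and GNU coreutils; the current shell plays the role of the application, and fd 3 is an arbitrary choice. On a real system, ‘lsof +L1’ lists open files whose link count has dropped to zero, which is one way to find such candidates.

```shell
tmp=$(mktemp)
exec 3>>"$tmp"              # the shell keeps fd 3 open on the file, like an app with its log
echo "some log data" >&3    # 14 bytes including the newline
rm -f "$tmp"                # the name is gone, but the data is still held through fd 3

readlink "/proc/$$/fd/3"    # the link target now ends in "(deleted)"
size_before=$(stat -L -c %s "/proc/$$/fd/3")

truncate -s 0 "/proc/$$/fd/3"   # release the space while the file is still open
size_after=$(stat -L -c %s "/proc/$$/fd/3")

echo "before: $size_before bytes, after: $size_after bytes"
exec 3>&-                   # close the descriptor
```

The same /proc/{pid}/fd/{number} path also works as the source for ‘cp’, as long as the process still holds the file open.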