very odd nfs behaviour
Mike Scott
2025-01-24 16:56:00 UTC
A very odd situation here.

I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant I
don't know.

On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop box.
An example good listing would be (sorry about wrap):

***@troi ~ $ ls
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png

That corresponds exactly to what's on the server.


However, on my desktop m/c, the same command complains about a missing
file, and triplicates all the lines bar the first, which is duplicated,
and there's an error about not finding a file that has an incorrect name
anyway:

Desktop> ls
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
ls: cannot access
'/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.2':
No such file or directory
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png


If I unmount and remount the file system, I get different results -
always works on the other machines, and fails /differently/ each time on
mine.
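
A listing like the one above can be checked mechanically. What follows is a minimal Python sketch, not something posted in the thread: it counts names that a directory read returns more than once, and names that cannot then be found by lstat (the "No such file or directory" case). The function names `audit_listing` and `audit_directory` are made up for illustration.

```python
import os
from collections import Counter


def audit_listing(names, exists):
    """Return (duplicates, phantoms) for a list of directory entry names.

    names  -- entry names exactly as the directory read returned them
    exists -- callable taking a name, True if the entry can be lstat'ed
    """
    counts = Counter(names)
    # Names the directory read handed back more than once.
    duplicates = {name: n for name, n in counts.items() if n > 1}
    # Names that were listed but cannot actually be found.
    phantoms = sorted(name for name in counts if not exists(name))
    return duplicates, phantoms


def audit_directory(path):
    """Audit a real directory using os.listdir and an lstat-based check."""
    names = os.listdir(path)
    return audit_listing(
        names, lambda name: os.path.lexists(os.path.join(path, name))
    )
```

On a healthy mount, `audit_directory(path)` should return an empty dict and an empty list. Anything else points at the directory read itself rather than at `ls` or `find`.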


I've also seen this happen in a virtual machine running on my box.

It happens whether I hard mount or use the automounter.


The OS versions are different - I'm running mint 21.2, the VM is at
21.3; while the others are both rather older versions (and different
hardware). The machines are all configured the same.


I'm at a loss! Can anyone suggest what's going on here please? I'm sure
this used to work!

Thanks.
--
Mike Scott
Harlow, England
Mike Scott
2025-01-24 17:00:20 UTC
Post by Mike Scott
A very odd situation here.
....


I should have mentioned that things do look OK in caja. The problem
affects ls, find, as well as a perl test script using File::Find.
--
Mike Scott
Harlow, England
Edmund
2025-01-24 17:01:45 UTC
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant I
don't know.
On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop box.
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
That corresponds exactly to what's on the server.
However, on my desktop m/c, the same command complains about a missing
file, and triplicates all the lines bar the first, which is duplicated,
and there's an error about not finding a file that has an incorrect name
Desktop> ls /nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
ls: cannot access '/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.2': No such file or
directory
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
If I unmount and remount the file system, I get different results -
always works on the other machines, and fails /differently/ each time on
mine.
I've also seen this happen in a virtual machine running on my box.
It happens whether I hard mount or use the automounter.
The OS versions are different - I'm running mint 21.2, the VM is at
21.3; while the others are both rather older versions (and different
hardware). The machines are all configured the same.
I'm at a loss! Can anyone suggest what's going on here please? I'm sure
this used to work!
Thanks.
Wild guess, running out of disk space?
Mike Scott
2025-01-24 17:30:30 UTC
Post by Edmund
Wild guess, running out of disk space?
Would it were so simple! No, but nice idea, thanks.
--
Mike Scott
Harlow, England
Richard Kettlewell
2025-01-24 17:55:37 UTC
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant
I don't know.
How many files in the directory?
Post by Mike Scott
On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
That corresponds exactly to what's on the server.
However, on my desktop m/c, the same command complains about a missing
file, and triplicates all the lines bar the first, which is
duplicated, and there's an error about not finding a file that has an
Desktop> ls
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
ls: cannot access
No such file or directory
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
/nfs/mmedia/pictures/originals-index4/mike/camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
If I unmount and remount the file system, I get different results -
always works on the other machines, and fails /differently/ each time
on mine.
I've also seen this happen in a virtual machine running on my box.
It happens whether I hard mount or use the automounter.
The OS versions are different - I'm running mint 21.2, the VM is at
21.3; while the others are both rather older versions (and different
hardware). The machines are all configured the same.
I'm at a loss! Can anyone suggest what's going on here please? I'm
sure this used to work!
Formerly, Linux NFS servers could get confused by large directories.
https://lwn.net/Articles/544520/ is the best writeup I’ve found.

In your case the server is FreeBSD, so Linux’s historical bugs aren’t
directly relevant, beyond highlighting that merely listing a directory
is more complex than you might initially imagine. I’m not sure why a
hypothetical similar bug in FreeBSD would only be visible on a subset of
clients either.
--
https://www.greenend.org.uk/rjk/
Mike Scott
2025-01-27 15:41:37 UTC
[ comp.unix.bsd.freebsd.misc added ]
Post by Richard Kettlewell
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant
I don't know.
[ screed about file names being truncated when read over nfs ]
.......
Post by Richard Kettlewell
Post by Mike Scott
I'm at a loss! Can anyone suggest what's going on here please? I'm
sure this used to work!
Formerly, Linux NFS servers could get confused by large directories.
https://lwn.net/Articles/544520/ is the best writeup I’ve found.
In your case the server is FreeBSD, so Linux’s historical bugs aren’t
directly relevant, beyond highlighting that merely listing a directory
is more complex than you might initially imagine. I’m not sure why a
hypothetical similar bug in FreeBSD would only be visible on a subset of
clients either.
OK, I've at least found what's happened, if not the root issue. Sort of
mea culpa, for which I apologise.

In spite of my assertion (which I should have checked and didn't), the
mount options differed. The working machines all specified rsize=8192.
My box was using a much larger figure, of 131072 (ie 32 * 4096).

It seems anything over 8192 causes this issue - that filenames get
truncated.
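
For anyone wanting to compare the options in effect on each client rather than trusting the configuration files, here is a minimal Python sketch. It assumes a Linux client where `/proc/mounts` lists NFS mounts with `rsize=` and `wsize=` among the options; the function names and the sample line in the usage note are hypothetical, not taken from any of the machines in this thread.

```python
def nfs_transfer_sizes(mounts_text):
    """Map each NFS mount point to its rsize/wsize from /proc/mounts-style text."""
    sizes = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Expected fields: device, mount point, fs type, options, dump, pass.
        if len(fields) < 4 or fields[2] not in ("nfs", "nfs4"):
            continue
        entry = {}
        for option in fields[3].split(","):
            key, _, value = option.partition("=")
            if key in ("rsize", "wsize") and value.isdigit():
                entry[key] = int(value)
        sizes[fields[1]] = entry
    return sizes


def read_local_nfs_sizes(path="/proc/mounts"):
    """Read the local mount table (Linux only) and extract NFS transfer sizes."""
    with open(path) as handle:
        return nfs_transfer_sizes(handle.read())
```

Given a line such as `server:/export /nfs/mmedia nfs rw,vers=3,rsize=131072,wsize=131072 0 0`, the parser reports both sizes for `/nfs/mmedia`. Running `read_local_nfs_sizes()` on each client and comparing the results would show a difference like 8192 versus 131072 directly.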

Whether that's a linux client issue or a freebsd server issue, or the
result of interworking, I've no idea. Nor can I imagine why it should
happen without errors being flagged up somewhere (I checked the logs at
both ends) -- which is nasty, because I had a system that met the specs
and mostly worked but very occasionally (< about 1 in 100k times, I
reckon) failed silently. Ouch.


Anyway, thanks to all for comments and advice offered. I'm back 'on the
road'; maybe if someone else hits the same issue they'll find this thread.
--
Mike Scott
Harlow, England
Lawrence D'Oliveiro
2025-01-27 23:24:38 UTC
Post by Mike Scott
In spite of my assertion (which I should have checked and didn't), the
mount options differed. The working machines all specified rsize=8192.
My box was using a much larger figure, of 131072 (ie 32 * 4096).
It seems anything over 8192 causes this issue - that filenames get
truncated.
I don’t understand why increasing rsize on its own would have any
effect: according to the docs, that only controls the maximum size of
packets that this end can receive; the maximum size the other end can
send is limited by that end’s wsize value. So increasing rsize on its
own should have no effect.

Looking up NFS mount options online, this page
<https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/4/html/reference_guide/s2-nfs-client-config-options#s2-nfs-client-config-options>
does say “be careful when changing these values; some older Linux
kernels and network cards do not work well with larger block sizes”.
Post by Mike Scott
Whether that's a linux client issue or a freebsd server issue, or the
result of interworking, I've no idea. Nor can I imagine why it should
happen without errors being flagged up somewhere (I checked the logs at
both ends) -- which is nasty, because I had a system that met the specs
and mostly worked but very occasionally (< about 1 in 100k times, I
reckon) failed silently. Ouch.
That really baffles me, that you don’t see any errors indicating there was
a problem.
Mike Scott
2025-01-28 08:06:20 UTC
Post by Lawrence D'Oliveiro
Post by Mike Scott
In spite of my assertion (which I should have checked and didn't), the
mount options differed. The working machines all specified rsize=8192.
My box was using a much larger figure, of 131072 (ie 32 * 4096).
It seems anything over 8192 causes this issue - that filenames get
truncated.
I don’t understand why increasing rsize on its own would have any
effect: according to the docs, that only controls the maximum size of
packets that this end can receive; the maximum size the other end can
send is limited by that end’s wsize value. So increasing rsize on its
own should have no effect.
Looking up NFS mount options online, this page
<https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/4/html/reference_guide/s2-nfs-client-config-options#s2-nfs-client-config-options>
does say “be careful when changing these values; some older Linux
kernels and network cards do not work well with larger block sizes”.
Post by Mike Scott
Whether that's a linux client issue or a freebsd server issue, or the
result of interworking, I've no idea. Nor can I imagine why it should
happen without errors being flagged up somewhere (I checked the logs at
both ends) -- which is nasty, because I had a system that met the specs
and mostly worked but very occasionally (< about 1 in 100k times, I
reckon) failed silently. Ouch.
That really baffles me, that you don’t see any errors indicating there was
a problem.
Yes, it's an odd one in many ways. Not least because rsize/wsize are
supposed to be irrelevant for tcp mounts (which is all the server
provides anyway)

I've just tried a loopback NFS mount on the server (it's the only fbsd
box I have to hand) and can't force the problem to show. So presumably
it's something to do with the inter-system working, but I don't have the
knowledge to delve further :-{

So I'll have to settle for 'it works now'. But as I noted, I'm very
discomforted that such a problem is even possible without errors being
flagged somewhere.

Thanks again to all who've responded.
--
Mike Scott
Harlow, England
Carlos E.R.
2025-01-28 11:34:42 UTC
...
Post by Mike Scott
So I'll have to settle for 'it works now'. But as I noted, I'm very
discomforted that such a problem is even possible without errors being
flagged somewhere.
Maybe report to some bug tracker at the "distributions" involved.
--
Cheers, Carlos.
pinnerite
2025-01-28 22:58:45 UTC
On Tue, 28 Jan 2025 08:06:20 +0000
Post by Mike Scott
Post by Lawrence D'Oliveiro
Post by Mike Scott
In spite of my assertion (which I should have checked and didn't), the
mount options differed. The working machines all specified rsize=8192.
My box was using a much larger figure, of 131072 (ie 32 * 4096).
It seems anything over 8192 causes this issue - that filenames get
truncated.
I don’t understand why increasing rsize on its own would have any
effect: according to the docs, that only controls the maximum size of
packets that this end can receive; the maximum size the other end can
send is limited by that end’s wsize value. So increasing rsize on its
own should have no effect.
Looking up NFS mount options online, this page
<https://docs.redhat.com/en/documentation/red_hat_enterprise_linux/4/html/reference_guide/s2-nfs-client-config-options#s2-nfs-client-config-options>
does say “be careful when changing these values; some older Linux
kernels and network cards do not work well with larger block sizes”.
Post by Mike Scott
Whether that's a linux client issue or a freebsd server issue, or the
result of interworking, I've no idea. Nor can I imagine why it should
happen without errors being flagged up somewhere (I checked the logs at
both ends) -- which is nasty, because I had a system that met the specs
and mostly worked but very occasionally (< about 1 in 100k times, I
reckon) failed silently. Ouch.
That really baffles me, that you don’t see any errors indicating there was
a problem.
Yes, it's an odd one in many ways. Not least because rsize/wsize are
supposed to be irrelevant for tcp mounts (which is all the server
provides anyway)
I've just tried a loopback NFS mount on the server (it's the only fbsd
box I have to hand) and can't force the problem to show. So presumably
it's something to do with the inter-system working, but I don't have the
knowledge to delve further :-{
So I'll have to settle for 'it works now'. But as I noted, I'm very
discomforted that such a problem is even possible without errors being
flagged somewhere.
Thanks again to all who've responded.
--
Mike Scott
Harlow, England
I had a similar situation several months ago.
I tested the drive and tried repairs but in the end it was clear the drive had reached its "sell-by" date.
Then a second one went too.
Both were made by Seagate (in different countries) and manufactured five years apart.

Regards,

Alan
--
Linux Mint 21.3 kernel version 5.15.0-127-generic Cinnamon 6.0.4
AMD Ryzen 7 7700, Radeon RX 6600, 32GB DDR5, 1TB SSD, 2TB Barracuda
Carlos E.R.
2025-01-24 21:56:30 UTC
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant I
don't know.
If you are using nfs version 3, perhaps try version 4.
If version 4, what's in the exports file?

All client machines have the same fstab nfs entry?
--
Cheers, Carlos.
Paul
2025-01-24 22:34:19 UTC
Post by Carlos E.R.
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information files.
It's large, and the paths quite long - whether that's relevant I don't know.
If you are using nfs version 3, perhaps try version 4.
If version 4, what's in the exports file?
All client machines have the same fstab nfs entry?
I would be just a bit curious about the software versions myself.

Back when I was using nfs at work, that topic came up quite often.
What is the version at each end?

The FreeBSD folks have their own taste in software, so there's no reason
for anything to particularly match Linux.

I would be examining the versions on the cases that work,
and checking the versions in the non-working case.

In modern times, some of the computers have "power management"
and that could influence whether things like "stale mounts"
are showing up. You would want to find a log and see if
there is any sign of behaviors like that (mount malfunctions
because the disk could not be accessed in time, like a stat()
check).

Even your NIC can be set to power down when not in use.

Paul
Carlos E.R.
2025-01-25 00:45:25 UTC
Post by Paul
Post by Mike Scott
A very odd situation here.
...
Post by Paul
The FreeBSD folks have their own taste in software, so there's no reason
for anything to particularly match Linux.
I would be examining the versions on the cases that work,
and checking the versions in the non-working case.
In modern times, some of the computers have "power management"
and that could influence whether things like "stale mounts"
are showing up. You would want to find a log and see if
there is any sign of behaviors like that (mount malfunctions
because the disk could not be accessed in time, like a stat()
check).
Even your NIC can be set to power down when not in use.
I have seen nfs survive hibernation of the machines. It is quite resilient.

Then we typically forget about the "fsid= " number.
--
Cheers, Carlos.
Mike Scott
2025-01-25 17:02:35 UTC
Post by Carlos E.R.
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant
I don't know.
If you are using nfs version 3, perhaps try version 4.
If version 4, what's in the exports file?
All client machines have the same fstab nfs entry?
Thanks to all for commenting.

To clarify a few points:

The clients in question all use autofs, and the tables are copies of a
central master. So nfs options should be the same.

On my own machine, it made no difference whether the fs was automounted
or mounted manually.

The precise point of error changes when the fs is remounted, manually or
by reboot.

BTW NFSv4 isn't really an option. A very different beast, and doesn't
seem to offer anything I need.

It used to work, I'm (nearly) sure - it affects my software to make a
photo index which dates back years. I'm sure I'd have noticed an issue.

The directories can each have several hundred files.



Currently, I see:

Desktop> find
/nfs/mmedia/pictures/originals-index4/mike/master_digital_camera/2008_0418/
-ls >/dev/null
find:
‘/nfs/mmedia/pictures/originals-index4/mike/master_digital_camera/2008_0418/2008_040’:
No such file or directory

ls /nfs/mmedia/pictures/originals-index4 [linewrapped]
/mike/master_digital_camera/2008_0418/
<skip>
2008_0216_102631.jpg--slide.png
2008_0216_102631.jpg--thumb.png
2008_040 <<<<< odd extra entry
2008_0406_070443.jpg.exif
etc


I copied just that folder to within /tmp, so local: the initial copy
failed because of that bad entry, so I did a copy-and-paste of the
originals. A 'diff -r' on the two directories moaned that 2008_040 only
existed on the nfs folder, so at least at the moment, it's a spurious
extra entry rather than a mangled real one.
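
The two checks described here can be sketched in Python. This is an illustrative sketch only, with made-up function names: `only_in_first` mirrors the "Only in ..." part of `diff -r` for a single directory, and `suspicious_prefix_entries` flags names that are a proper prefix of another name in the same listing, as `2008_040` is of `2008_0406_070443.jpg.exif`. The prefix test is a heuristic: a tree that legitimately holds both `foo.jpg` and `foo.jpg.exif` would also be flagged, so hits need confirming against the server.

```python
def only_in_first(first_names, second_names):
    """Names present in the first listing but absent from the second."""
    return sorted(set(first_names) - set(second_names))


def suspicious_prefix_entries(names):
    """Names that are a proper prefix of another name in the same listing."""
    unique = sorted(set(names))
    flagged = []
    for index, name in enumerate(unique[:-1]):
        # In sorted order, a name that extends `name` follows it directly.
        if unique[index + 1].startswith(name):
            flagged.append(name)
    return flagged
```

Feeding in `os.listdir()` output for the NFS directory and for the local copy gives the same answer as the `diff -r` run above, without having to copy the files a second time.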


FWIW my machine is on Linux Mint 21.2 Victoria; one of the working ones
is on Linux Mint 21.1 Vera, so not too different. The lappy I can't
check ATM.


......

Oh, I've just rebooted after an abortive attempt to run a DVD live
system. Now I get

~> find /nfs/mmedia/pictures/ -ls >/dev/null
find:
‘/nfs/mmedia/pictures/originals-index4/mike/camera2018/20181210/2018-12-10/’:
No such file or directory
find:
‘/nfs/mmedia/pictures/originals-nokeys-index/mike/master_digital_camera/2008_1219/2008_121’:
No such file or directory
find:
‘/nfs/mmedia/pictures/originals-nokeys-index/mike/camera2018/20181210/2018-12-10/’:
No such file or directory
~>

I'm pretty sure I had a "filename too long" yesterday as well.


I'm at a loss as to what to try next - the live DVD seemed a good idea,
but wouldn't boot: I'll have to think about a fresh thread for that one.
--
Mike Scott
Harlow, England
Arti F. Idiot
2025-01-25 00:13:23 UTC
Post by Mike Scott
A very odd situation here.
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant I
don't know.
On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop box.
<snip>
Post by Mike Scott
The OS versions are different - I'm running mint 21.2, the VM is at
21.3; while the others are both rather older versions (and different
hardware). The machines are all configured the same.
I'm at a loss! Can anyone suggest what's going on here please? I'm sure
this used to work!
I'm sure you've already checked for aliased commands but if not..

Any chance the problem machine has a different filesystem, i.e. BTRFS ?
Delayed CoW processing of large NFS mounts could cause some weirdness.
Lawrence D'Oliveiro
2025-01-26 00:01:47 UTC
Post by Mike Scott
On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop box.
Have you checked the system logs, on both client and server side, to see
if any interesting messages appear when you are doing these listings?
Grant Taylor
2025-01-26 02:08:05 UTC
Post by Mike Scott
A very odd situation here.
Yes, it seems that way.

I don't have an answer, or even a hint. But I do have some additional
questions:
Post by Mike Scott
I have a (freebsd) server serving a tree of photos and information
files. It's large, and the paths quite long - whether that's relevant I
don't know.
How long is "quite long"? Are you tickling any sort of limits?
Post by Mike Scott
On two of three machines all running mint at various versions all is
well; I have problems on the third, which happens to be my desktop box.
What versions (kernel, OS, etc.) are the three machines?
The line wrap actually came through well on my end.
Post by Mike Scott
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.*
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.exif
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--info.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG.sha
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.html
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--slide.png
/nfs/mmedia/pictures/originals-index4/mike/
camera2014/20140531b/2014-04-24/2014-04-15_10-21-47.JPG--thumb.png
That corresponds exactly to what's on the server.
Okay.
Post by Mike Scott
However, on my desktop m/c, the same command complains about a missing
file, and triplicates all the lines bar the first, which is duplicated,
and there's an error about not finding a file that has an incorrect name
This feels like NFS losing state and/or synchronization between the
client and server when listing directories.

The duplication -> triplication and the wild name seem like something
has failed somewhere at the underlying RPC layer.
Post by Mike Scott
If I unmount and remount the file system, I get different results -
always works on the other machines, and fails /differently/ each time on
mine.
Different network / RPC / NFS mismatches would likely happen with
underlying protocol problems.
Post by Mike Scott
I've also seen this happen in a virtual machine running on my box.
Is the VM running on the same box that has the problem? Or is it
running on a different system?
Post by Mike Scott
It happens whether I hard mount or use the automounter.
I'm not surprised by that. IMHO the auto-mounter's only role is to
automatically mount (and unmount when idle) the NFS export using
standard mount methods.
Post by Mike Scott
The OS versions are different - I'm running mint 21.2, the VM is at
21.3; while the others are both rather older versions (and different
hardware). The machines are all configured the same.
So not exactly the same versions, but close to each other.
Post by Mike Scott
I'm at a loss! Can anyone suggest what's going on here please? I'm sure
this used to work!
I'd reach for a packet capture and feed it into Wireshark or something
similar that can analyze the underlying UDP / TCP, RPC, and NFS protocol
and call out any oddities.
--
Grant. . . .