Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

View: New views
11 Messages — Rating Filter:   Alert me  

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Fri, 4 Sep 2009 12:27:35 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> This is on 32 bit x86 on a Dell 1950
>
> After mouting a cifs share we have 5 second hiccups. Typical log output
> when doing a simple "ls /mnt":
>
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: For smb_command 50
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: Sending smb:
> total_len 118
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving
> cifs_revalidate (xid = 258) rc = 0
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: CIFS VFS: in cifs_lookup
> as Xid: 263 with uid: 0
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: parent inode = 0xf58d2e60
> name is: AutoWire.bmp and dentry = 0xf5adb63c
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: NULL inode in lookup
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/dir.c: Full path: \AutoWire.bmp
> inode = 0x(null)
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/inode.c: Getting info on \AutoWire.bmp
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: For smb_command 50
> Sep  4 16:21:43 rd-spare kernel:  fs/cifs/transport.c: Sending smb:  total_len 104
>
>  ---- 5 second hiccup
>
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/connect.c: rfc1002 length 0xce
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/connect.c: rfc1002 length 0xc0

(adding linux-cifs-client mailing list)

It looks like it's just taking 5s for the server to respond here. Do
you happen to have a wire capture of one of these events? That may tell
us more than cifsFYI info...

> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: inode 0xf5876518 old_time=26000 new_time=32751
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: cifs_revalidate - inode unchanged
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/file.c: CIFS VFS: in cifs_writepages as Xid: 264 with uid: 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/file.c: CIFS VFS: leaving cifs_writepages (xid = 264) rc = 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 262) rc = 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 265 with uid: 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 260
> 00 jiffies 32751
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: leaving cifs_revalidate (xid = 265) rc = 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: CIFS VFS: in cifs_revalidate as Xid: 266 with uid: 0
> Sep  4 16:21:48 rd-spare kernel:  fs/cifs/inode.c: Revalidate: \Akamai Headsets.doc inode 0xf5876518 count 2 dentry: 0xf5ada8d0 d_time 260
> 00 jiffies 32751
>
>
> This is happening intermittently on a variety of hosts.
>
> cat /proc/fs/cifs/DebugData
>
> Display Internal CIFS Data Structures for Debugging
> ---------------------------------------------------
> CIFS Version 1.60
> Active VFS Requests: 2
> Servers:
> 1) Name: 10.2.4.64  Domain: W2K Uses: 1 OS: Windows Server 2003 R2 3790
> Service Pack 2
> NOS: Windows Server 2003 R2 5.2 Capability: 0x1f3fd
> SMB session status: 1 TCP status: 1
> Local Users To Server: 1 SecMode: 0x3 Req On Wire: 2
> Shares:
> 1) \\chiprodfs2\company Mounts: 1 Type: NTFS DevInfo: 0x20
> Attributes: 0x700ff
> PathComponentMax: 255 Status: 0x1 type: DISK
>
> MIDs:
> State: 2 com: 50 pid: 5951 tsk: f756d1b0 mid 277
> State: 2 com: 50 pid: 6044 tsk: f69d4760 mid 278
>
> cat /proc/fs/cifs/Stats
>
> Resources in use
> CIFS Session: 1
> Share (unique mount targets): 1
> SMB Request/Response Buffer: 5 Pool size: 5
> SMB Small Req/Resp Buffer: 1 Pool size: 30
> Operations (MIDs): 2
>
> 0 session 0 share reconnects
> Total vfs operations: 525 maximum at one time: 3
>
> 1) \\chiprodfs2\company
> SMBs: 305 Oplock Breaks: 0
> Reads:  0 Bytes: 0
> Writes: 0 Bytes: 0
> Flushes: 0
> Locks: 0 HardLinks: 0 Symlinks: 0
> Opens: 0 Closes: 0 Deletes: 0
> Posix Opens: 0 Posix Mkdirs: 0
> Mkdirs: 0 Rmdirs: 0
> Renames: 0 T2 Renames 0
> FindFirst: 2 FNext 0 FClose 0
>
>
> What is this ???
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@...
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>


--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 9 Sep 2009 12:33:21 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Sat, 5 Sep 2009, Jeff Layton wrote:
>
> > It looks like it's just taking 5s for the server to respond here. Do
> > you happen to have a wire capture of one of these events? That may tell
> > us more than cifsFYI info...
>
> I did a tcpdump and nothing stands out. Server acks the "cmd 50" and then
> waits 5 seconds before sending the data.
>
> 16:23:34.336373 IP (tos 0x0, ttl  64, id 20616, offset 0, flags [DF], proto 6, length: 118) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 2801206064:2801206142(78) ack 468207120 win 190
> 16:23:34.336624 IP (tos 0x0, ttl 125, id 19869, offset 0, flags [DF], proto 6, length: 206) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: P 1:167(166) ack 78 win 64548
> 16:23:34.336636 IP (tos 0x0, ttl  64, id 20617, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 78:78(0) ack 167 win 190
> 16:23:34.336669 IP (tos 0x0, ttl  64, id 20618, offset 0, flags [DF], proto 6, length: 128) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: P 78:166(88) ack 167 win 190
> 16:23:34.456343 IP (tos 0x0, ttl 125, id 20045, offset 0, flags [DF], proto 6, length: 40) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . [tcp sum ok] 167:167(0) ack 166 win 64460
>
> hiccup
>
> 16:23:39.284930 IP (tos 0x0, ttl 125, id 27544, offset 0, flags [DF], proto 6, length: 230) dogmeat.jules.org.microsoft-ds > fawkes.jules.org.43355: . 167:357(190) ack 166 win 64460
> 16:23:39.324060 IP (tos 0x0, ttl  64, id 20619, offset 0, flags [DF], proto 6, length: 40) fawkes.jules.org.43355 > dogmeat.jules.org.microsoft-ds: . [tcp sum ok] 166:166(0) ack 357 win 190


A binary capture would probably be easier to infer something from --
we'd be able to open it up in wireshark and get a little more info
about what sort of call the client is doing.

My suspicion would be that the server needs to perform an oplock break
to another client before it can send the response. The only way I know
how to tell that is to sniff all SMB traffic on the server and watch
for oplock break calls to other clients when these stalls occur.

--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 9 Sep 2009 13:07:52 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > My suspicion would be that the server needs to perform an oplock break
> > to another client before it can send the response. The only way I know
> > how to tell that is to sniff all SMB traffic on the server and watch
> > for oplock break calls to other clients when these stalls occur.
>
> That could be tested by switching them off right? If I do
>
> echo 0 >/proc/fs/cifs/OplockEnabled
>
> and then remount the volume it should switch off oplocks?
>
> This has no effect on the stalls.
>

That'll stop your client from requesting oplocks, but that won't
prevent others from doing so. If my suspicion is correct, then another
client is holding an oplock and the server needs to break it before it
can reply to yours.

Unfortunately I doubt there's much you can do from your client to
prevent that (if that is the case). There may be a way to turn off
oplocks on the server side, but that may very well be even worse for
performance.

--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 9 Sep 2009 13:33:33 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > Unfortunately I doubt there's much you can do from your client to
> > prevent that (if that is the case). There may be a way to turn off
> > oplocks on the server side, but that may very well be even worse for
> > performance.
>
> Also note that these hiccups occur when simply doing an
>
> ls
>
> we are not accessing or writing files.
>

Hmm...

The hiccups you posted in the original email happened during a
QPathInfo call (somewhat similar to a NFS GETATTR). I wouldn't think
that would cause an oplock break, but I suppose it might. The server
might decide that it needs to revoke the oplock in order to retrieve
accurate size, LastWriteTime (aka mtime), etc. It could also be a
windows bug...

Here's an excerpt from an IRC conversation on this in #samba-technical, that might give a little info:

13:42 < jlayton> would a QPathInfo call cause an oplock break?
13:42 < jlayton> (typically)?
13:47 < sdann> jlayton, no it shouldn't, as it's path based and could be done with a stat() call.  Only an open() or brl()
               operation should break an oplock.
13:48 < jlayton> ok, good to know -- thx
13:49 < jlayton> sdann: actually though, I'm asking about win2k3 server...
13:49 < jlayton> do you know whether it might break the oplock on a qpathinfo?
13:49 < jlayton> i.e. to get accurate size info, for instance
13:50 < sdann> well in general, only opens, writes (truncate included), and byte-range-lock ops break oplocks
13:50 < sdann> so any kind of meta-data request should not
13:51 < jlayton> hmm ok, one of the linux-kernel guys is seeing QPathInfo calls go out to win2k3 server and the server waits
                 5s before responding
13:51 < jlayton> my initial thought was oplock break to another client is causing the stall, but maybe it's something else
13:51 < coffeedude> sdann, SetFileInfo (allocationInfo and EndofFile) will as well.
13:51 < jlayton> I'm pretty sure this is QPathInfo call
13:52 < sdann> a quick torture test in source4/torture/raw/oplock.c would solve the issue :)
13:52 < coffeedude> jlayton, internally in Windows, the NTFS interface is handle based so I assume the server does a
                    NtCreateFile(), QueryInformationFile(), CloseFile().
13:52 < jlayton> ahhh maybe so
13:52 < coffeedude> jlayton, the internal opens should done with FILE_READ_ATTRIBUTES so they don't cause a break but it
                    could be a Windows bug.
13:53 < jlayton> sounds plausible
13:53 < jlayton> coffeedude, sdann: thanks!
13:53 < coffeedude> jlayton, any open with nothing other than FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES or SYNCHRONIZE
                    should nto cause an oplock break either.
13:53 < sdann> coffeedude, yeah that's certainly possible
13:53 < coffeedude> jlayton, any open with nothing other than FILE_READ_ATTRIBUTES, FILE_WRITE_ATTRIBUTES or SYNCHRONIZE
                    should nto cause an oplock break either.
13:53 < sdann> coffeedude, yeah that's certainly possible
13:53 < coffeedude> sdann, only know cause I've done it :)

I'd probably start with sniffing traffic at the server side and see if
you can correlate the stalls with traffic to other hosts (oplock breaks
in particular).

If so then maybe consider patching the server or testing with a
different flavor of windows.

--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 9 Sep 2009 13:28:24 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > That'll stop your client from requesting oplocks, but that won't
> > prevent others from doing so. If my suspicion is correct, then another
> > client is holding an oplock and the server needs to break it before it
> > can reply to yours.
> >
> > Unfortunately I doubt there's much you can do from your client to
> > prevent that (if that is the case). There may be a way to turn off
> > oplocks on the server side, but that may very well be even worse for
> > performance.
>
> Hmmm... We can look at that.
>
> Another interesting tidbit is that I have never seen this from a 64 bit
> Linux kernel. Only occurs with 32 bit kernels it seems.
>

That sounds rather strange. Maybe we do have a bug of some sort? The
thing to do might be to get a binary capture of the 32-bit traffic
around the time of the stalls. We could then inspect the packets and
see whether we have something wrong in there.

--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 9 Sep 2009 17:27:57 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > That sounds rather strange. Maybe we do have a bug of some sort? The
> > thing to do might be to get a binary capture of the 32-bit traffic
> > around the time of the stalls. We could then inspect the packets and
> > see whether we have something wrong in there.
>
> Capture attached.

Well, I can see the delays in the capture, but the snarflen for the
capture is a little too small to tell much else. Can you redo the
capture with a larger snarflen (maybe -s 512 or so)?

Also, were you able to tell anything from a server-side capture? Is the
server issuing oplock breaks at those times?

Cheers,
Jeff
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 10 Sep 2009 14:53:12 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Wed, 9 Sep 2009, Jeff Layton wrote:
>
> > Well, I can see the delays in the capture, but the snarflen for the
> > capture is a little too small to tell much else. Can you redo the
> > capture with a larger snarflen (maybe -s 512 or so)?
>
> -s 1000 version attached.
>
> > Also, were you able to tell anything from a server-side capture? Is the
> > server issuing oplock breaks at those times?
>
> Thats a pretty busy system. They have not gotten around to do any logging
> on that end.

Ok. I had a look at the capture. The stalls seem to be occurring on
FIND_FILE requests. Those are similar to READDIRPLUS requests in NFS,
it returns a list of files that match a particular set of criteria and
their attributes.

Each time the client is making one of these calls to the server, it
requests a set of up to 150 files. The server grinds for 5s each time
and then responds.

The calls themselves seem to be sane AFAICT. I don't see any problems
with the parameters we're sending for the search. I also had a look
over the FIND_FIRST code and it doesn't seem to have any obvious
word size related problems.

I assume that the 32 and 64 bit clients you have are calling "ls" in
the same dir. If so, maybe a similar capture from a 64-bit client might
help us see the difference?

Thanks,
--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 10 Sep 2009 15:42:28 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Thu, 10 Sep 2009, Jeff Layton wrote:
>
> > I assume that the 32 and 64 bit clients you have are calling "ls" in
> > the same dir. If so, maybe a similar capture from a 64-bit client might
> > help us see the difference?
>
> 64 bit trace attached.

A couple of differences. First, the "ls's" were done in different
directories since they had different search patterns:

32 == \*
64 == \clameter\*

...did they also mount different shares from the server?

The 64-bit capture was done in a directory with only 50 files,
whereas the other one had at least 600-700 files (capture ends before
it finished listing the files). That may make quite a bit of difference
on the server (not sure how windows works internally in this case).

The only other substantive difference I see is that the Level of
Interest that the client is requesting is different:

32 == SMB_FIND_FILE_DIRECTORY_INFO
64 == SMB_FIND_FILE_ID_FULL_DIR_INFO

That probably means that the 32 bit client has disabled
CIFS_MOUNT_SERVER_INUM for some reason. That means that it's not asking
the server for the windows equivalent of inode numbers. We typically
disable that flag automatically if a query for the inode number of a
path fails.

Since these are the same server, that may be an indicator that the
server is serving out info from two different filesystem types (maybe
FAT vs. NTFS, or maybe even a CDROM or something). If so, then that may
help explain some of the performance delta there. I'd be more
interested to see how the 64 bit client behaves when it mounts the
exact same share and does an ls in the same directory as the 32 bit
client.

Cheers,
--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Thu, 10 Sep 2009 17:27:53 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> Right. 32 bit cannot mount the clameter directory for strange reasons. I
> have to go one level higher.
[...]
> One other issue that may be important: The mounting operation is very slow
> on 32 bit. Could it be that the handshake does not work out?

Ok, looks like the 64 bit client is using a different level of interest
than the 32 bit on the FIND_FIRST call. I suspect that that difference
may account for the difference in response time.

It's not completely clear to me why that would be. Maybe a windows bug
that causes a slowdown with that LOI?

In any case, I think we need to look closely at what's happening at
mount time. First, I'll need some other info:

1) output of "/sbin/mount.cifs -V" from both machines

2) mount options that you're using on both boxes

3) wire captures from mount attempts on both machines. Try to mount the
"clameter" dir on both boxes and do captures of each attempt. Maybe
this time use -s 0 with tcpdump so we get all of the traffic.

There may be crackable password hashes in the captures, so you may want
to send them to me privately and not cc the list.

Thanks,
--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Mon, 14 Sep 2009 16:10:47 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Thu, 10 Sep 2009, Jeff Layton wrote:
>
> > In any case, I think we need to look closely at what's happening at
> > mount time. First, I'll need some other info:
> >
> > 1) output of "/sbin/mount.cifs -V" from both machines
>
> The 32 bit machine
>
> #/sbin/mount.cifs -V
> mount.cifs version: 1.5
> //chiprodfs2/company /mnt -ouser=clameter,domain=xxx
> mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx
>

Yow, that version of mount.cifs is really old. I wonder if it may be
passing bad mount options to the kernel? Might be interesting to strace
that. Something like:

# strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx

...it'll probably have a cleartext password in it so you might want to
doctor the options a bit before sending along if you do.

Alternately, you might just want to try a newer version of mount.cifs
and see whether that fixes this.

> 64 bit machine
>
> $ /sbin/mount.cifs -V
> mount.cifs version: 1.12-3.4.0
>
> mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=w2k
>
> > 3) wire captures from mount attempts on both machines. Try to mount the
> > "clameter" dir on both boxes and do captures of each attempt. Maybe
> > this time use -s 0 with tcpdump so we get all of the traffic.
>
> I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
> /company.
>

Actually, the trace of a hanging mount would probably be interesting.

Does the 32-bit capture that you sent represent a mount attempt that
hung? Or was it successful?

> > There may be crackable password hashes in the captures, so you may want
> > to send them to me privately and not cc the list.
>
> Ok will follow.
>

Thanks for the info, I had a look at the captures. They both look
fairly similar. The main difference is that the 32-bit box doesn't seem
to have sent any more calls after sending a QPathInfo call to the
server for the root inode of the mount.

What's the "devname" that you're giving to the mount command for the
"clameter" dir? If there's more than 1 path component after the
hostname, then the problem may be in the old version of mount.cifs.
Some of them had broken handling for path prefixes.

--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client

Parent Message unknown Re: 2.6.31-rc8: CIFS with 5 seconds hiccups

by Jeff Layton-2 :: Rate this Message:

Reply to Author | View Threaded | Show Only this Message

On Wed, 16 Sep 2009 12:26:04 -0400 (EDT)
Christoph Lameter <cl@...> wrote:

> On Tue, 15 Sep 2009, Jeff Layton wrote:
>
> > Yow, that version of mount.cifs is really old. I wonder if it may be
> > passing bad mount options to the kernel? Might be interesting to strace
> > that. Something like:
> >
> > # strace -f -s 256 -e mount mount -t cifs //chiprodfs2/company /mnt -ouser=clameter,domain=xxx
> >
> > ...it'll probably have a cleartext password in it so you might want to
> > doctor the options a bit before sending along if you do.
> >
> > Alternately, you might just want to try a newer version of mount.cifs
> > and see whether that fixes this.
>
> Tried a newer version of mount.cifs without any change.
>

Ok, good to rule that out then.

> > > I cannot mount the clameter dir on the 32 bit box. Hangs. So I will mount
> > > /company.
> > >
> >
> > Actually, the trace of a hanging mount would probably be interesting.
> >
> > Does the 32-bit capture that you sent represent a mount attempt that
> > hung? Or was it successful?
>
> No it was successful.
>

Hmm, ok. That isn't going to tell me as much as a mount that fails. For
now, I suggest that we focus on determining why these mounts hang/fail.
After that we can see whether the solution there has any bearing on why
the server is so slow to respond to this particular client.

> > What's the "devname" that you're giving to the mount command for the
> > "clameter" dir? If there's more than 1 path component after the
> > hostname, then the problem may be in the old version of mount.cifs.
> > Some of them had broken handling for path prefixes.
>
> its //machinename/company/clameter
>
> So two components.
>

Also good to know.

What we should probably do at this point is track down why the 32-bit
client has such a hard time mounting the clameter dir. Here's what
would be most helpful:

1) some debug log info of the mount attempt:

# modprobe cifs
# echo 7 > /proc/fs/cifs/cifsFYI

...then attempt the mount. After it hangs for a few seconds, ^c the
mount to kill it. Collect the output from dmesg and send it to me. That
should give me some idea of what the client is doing during this phase.

If you can simultaneously capture wire traffic during the same mount
attempt that would also be helpful.

Cheers,
--
Jeff Layton <jlayton@...>
_______________________________________________
linux-cifs-client mailing list
linux-cifs-client@...
https://lists.samba.org/mailman/listinfo/linux-cifs-client