socket woes

I have been getting lots of socket-related errors for a long time (months/years). Here is one example of one message:

HHCLG009S Syslog message pipe creation failed: The requested address is not valid in its context.

I don't know the last thing about sockets, but I decided to start putting some debug info in after a particularly persistent spate of errors (normally I just need to rerun it a couple of times and it goes away).

I did a search to find out about the socketpair function, and after a lot of stuffing around with combinations, I decided the problem was sort of elsewhere.

Anyway, I haven't had an error since this change went in, despite running it 10 times or something, but since it's a random (ie presumably time sensitive) error anyway, who knows.

BFN. Paul.

Index: hercules/hmacros.h
diff -c hercules/hmacros.h:1.6 hercules/hmacros.h:1.7
*** hercules/hmacros.h:1.6 Sun Jan 11 13:02:43 2009
--- hercules/hmacros.h Sat Jun 27 17:58:38 2009
***************
*** 78,84 ****
  #endif
  #ifdef _MSVC_
! #define create_pipe(a) socketpair(AF_INET,IPPROTO_IP,SOCK_STREAM,a)
  #define read_pipe(f,b,n) recv(f,b,n,0)
  #define write_pipe(f,b,n) send(f,b,(int)n,0)
  #define close_pipe(f) closesocket(f)
--- 78,84 ----
  #endif
  #ifdef _MSVC_
! #define create_pipe(a) socketpair(AF_INET,SOCK_STREAM,IPPROTO_IP,a)
  #define read_pipe(f,b,n) recv(f,b,n,0)
  #define write_pipe(f,b,n) send(f,b,(int)n,0)
  #define close_pipe(f) closesocket(f)

Index: hercules/w32util.c
diff -c hercules/w32util.c:1.1.1.2 hercules/w32util.c:1.2
*** hercules/w32util.c:1.1.1.2 Sun Jan 11 12:52:04 2009
--- hercules/w32util.c Sat Jun 27 17:58:38 2009
***************
*** 1864,1871 ****
  // -1 shall be returned and errno set to indicate the error."
  if ( AF_INET != domain ) { errno = WSAEAFNOSUPPORT; return -1; }
! if ( SOCK_STREAM != protocol ) { errno = WSAEPROTONOSUPPORT; return -1; }
! if ( IPPROTO_IP != type ) { errno = WSAEPROTOTYPE; return -1; }
  socket_vector[0] = socket_vector[1] = INVALID_SOCKET;
--- 1864,1871 ----
  // -1 shall be returned and errno set to indicate the error."
  if ( AF_INET != domain ) { errno = WSAEAFNOSUPPORT; return -1; }
! if ( IPPROTO_IP != protocol ) { errno = WSAEPROTONOSUPPORT; return -1; }
! if ( SOCK_STREAM != type ) { errno = WSAEPROTOTYPE; return -1; }
  socket_vector[0] = socket_vector[1] = INVALID_SOCKET;
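For reference, POSIX declares the call as socketpair(domain, type, protocol, socket_vector), which is the order the corrected macro uses. Below is a minimal sketch of that argument order on a POSIX system. It assumes AF_UNIX, so it is not the Windows code path in w32util.c (which only accepts AF_INET), and make_pair_and_echo is a hypothetical helper name used only for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a connected pair and pass one byte through it.
 * Returns the byte received, or -1 on any failure. */
int make_pair_and_echo(unsigned char byte)
{
    int sv[2];
    unsigned char got = 0;
    int result = -1;

    /* Argument order per POSIX: domain, type, protocol, socket vector. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    /* Whatever is written to one end can be read from the other. */
    if (write(sv[1], &byte, 1) == 1 && read(sv[0], &got, 1) == 1)
        result = got;

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling make_pair_and_echo(0x5A) should hand back the same byte when the pair was created successfully.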
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
>
> HHCLG009S Syslog message pipe creation failed: The
> requested address is not valid in its context.

Problem still exists.

So does:

** Win32 porting error: invalid call to 'w32_select' from sockdev.c(437): mixed set(s)

and

HHCSD021E select failed; errno=0: No error

to which I have started adding debug info:

bSocketFound 1, bNonSocketFound 1

BFN. Paul.
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
>
> > HHCLG009S Syslog message pipe creation failed: The
> > requested address is not valid in its context.
>
> Problem still exists.
>
> So does:
>
> ** Win32 porting error: invalid call to 'w32_select' from sockdev.c(437): mixed
> set(s)
>
> and
>
> HHCSD021E select failed; errno=0: No error
>
> to which I have started adding debug info:
>
> bSocketFound 1, bNonSocketFound 1

I added more debug info (tracking down that porting error) and got this:

maxfd is 764
fd 0 764 is socket
fd 1 -1 is not socket

so you can see that a -1 is in the array.

That maxfd is very large for here:

    /* Do the select and save results */
    rc = select ( maxfd+1, &sockset, NULL, NULL, NULL );

I think Windows only allows this:

/usr/include/w32api/winsock.h:#define FD_SETSIZE 64

So passing 0 or FD_SETSIZE to select might be more appropriate? Regardless, I don't think the problem is directly there.

Now I could add code like this:

    static void SelectSet ( ...
        for (i=0; i < pSet->fd_count && i < FD_SETSIZE; i++)
        {
            /* protect against empty slots */
            if ( pSet->fd_array[i] == -1 ) continue;

to stop it from falling over on the -1, but I don't think the -1 should be there in the first place.

There isn't a lot of places for it to be here:

    void* socket_thread( void* arg )
    ...
        for (;;)
        {
            /* Set the file descriptors for select */
            FD_ZERO ( &sockset );
            maxfd = add_socket_devices_to_fd_set ( 0, &sockset );
            SUPPORT_WAKEUP_SOCKDEV_SELECT_VIA_PIPE( maxfd, &sockset );

            /* Do the select and save results */
            rc = select ( maxfd+1, &sockset, NULL, NULL, NULL );

I believe it was added by this:

    #define SUPPORT_WAKEUP_SOCKDEV_SELECT_VIA_PIPE( maxfd, prset ) SUPPORT_WAKEUP_SELECT_VIA_PIPE( sysblk.sockrpipe, (maxfd), (prset) )

If that sockrpipe is -1, it should flow through this logic:

    #define SUPPORT_WAKEUP_SELECT_VIA_PIPE( pipe_rfd, maxfd, prset ) \
        FD_SET((pipe_rfd),(prset)); \
        (maxfd)=(maxfd)>(pipe_rfd)?(maxfd):(pipe_rfd)

And get through here too:

    DLL_EXPORT void w32_FD_SET( int fd, fd_set* pSet )
    {
        SOCKET hSocket;
        if (0
            || socket_is_socket( fd )
            || (SOCKET) -1 == ( hSocket = (SOCKET) _get_osfhandle( fd ) )
        )
            hSocket = (SOCKET) fd;
        ORIGINAL_FD_SET( hSocket, pSet ); // (add HANDLE/SOCKET to specified set)
    }

And the -1 would go in here:

    #define ORIGINAL_FD_SET( fd, pSet ) \
        do \
        { \
            unsigned int i; \
            for (i=0; i < ((fd_set*)(pSet))->fd_count; i++) \
                if (((fd_set*)(pSet))->fd_array[i] == (fd)) \
                    break; \
            if (i == ((fd_set*)(pSet))->fd_count \
                && ((fd_set*)(pSet))->fd_count < FD_SETSIZE) \
            { \
                ((fd_set*)(pSet))->fd_array[i] = (fd); \
                ((fd_set*)(pSet))->fd_count++; \
            } \
        } \
        while (0)

And I'd say there's a race condition with this code in impl.c:

    #if defined( OPTION_WAKEUP_SELECT_VIA_PIPE )
    {
        int fds[2];
        initialize_lock(&sysblk.cnslpipe_lock);
        initialize_lock(&sysblk.sockpipe_lock);
        sysblk.cnslpipe_flag=0;
        sysblk.sockpipe_flag=0;
        VERIFY( create_pipe(fds) >= 0 );
        sysblk.cnslwpipe=fds[1];
        sysblk.cnslrpipe=fds[0];
        VERIFY( create_pipe(fds) >= 0 );
        sysblk.sockwpipe=fds[1];
        sysblk.sockrpipe=fds[0];
    }
    #endif // defined( OPTION_WAKEUP_SELECT_VIA_PIPE )

And as a first step towards validation of this theory, I've added code to inspect the sockrpipe just above and exit if it's -1.

BFN. Paul.
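The idea of keeping a -1 out of the set in the first place can be sketched as a small guard around FD_SET. This is only an illustration under POSIX fd_set semantics (where descriptors are small integers below FD_SETSIZE); the Winsock fd_set shown above is an array of handles, so the range check would differ there. fd_set_checked is a hypothetical name, not something in Hercules.

```c
#include <sys/select.h>

/* Hypothetical guard: add fd to the set only if it is a usable descriptor.
 * Returns 0 on success, -1 if fd is negative or too large for an fd_set. */
int fd_set_checked(int fd, fd_set *set, int *maxfd)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;              /* refuse -1 and out-of-range values */

    FD_SET(fd, set);
    if (fd > *maxfd)
        *maxfd = fd;            /* keep the running maximum for select() */
    return 0;
}
```

A caller that gets -1 back knows straight away that the pipe descriptor was never set up, instead of finding out later from a failed select.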
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
> While wrapping up for the day/night/morning, I got this variation: 02:50:39 console: DBG028: select: An operation was attempted on something that Which points to a similar problem: /* VERIFY that the file descriptor is valid. If it's NOT, then IGNORE this console device since it's thus obvious that SOMETHING has gone wrong SOMEWHERE at some point! (some sort of race condition SOMEWHERE, obviously) */ if (dev->fd < 0) { // Ah-HA! We may have FINALLY found (or at // least have gotten a little bit closer to // finding) the ROOT CAUSE of our problematic // "DBG028 select: Bad FIle Number" problem! logmsg ( "\n" "*********** DBG028 CONSOLE BUG ***********\n" "device %4.4X: 'connected', but dev->fd = -1\n" "\n" ,dev->devnum ); dev->connected = 0; // (since it's not connected!) } Once again, it's hitting similar code: FD_ZERO ( &readset ); maxfd=INT_MIN; FD_SET ( lsock, &readset ); maxfd = lsock; SUPPORT_WAKEUP_CONSOLE_SELECT_VIA_PIPE( maxfd, &readset ); However, my code didn't get hit: sysblk.sockrpipe=fds[0]; if (sysblk.sockrpipe == -1) { fprintf(stderr, "gotcha\n"); exit(0); } on that occasion, nor a second run, but a third run, it did: Running on PAUL-LAPTOP Windows_NT-6.0 i686 UP gotcha Which basically points the finger back to socketpair, which I still had debug info in for, which wasn't hit. So either: 1. debug info (via logmsg) isn't working. 2. Check for INVALID_SOCKET isn't working. 3. the parameters were passed incorrectly. I can see that the "VERIFY" macro does nothing. I put in some intensive debug and got this: fred1 fred2 fred3 fred4 fred5 fred5 fred6 HHCLG009S Syslog message pipe creation failed: No error The freds were put in after I wasn't getting any reported error at all, just the "gotcha". I suspect that the errors are occurring before the logger thread is ready, so my log debugging wasn't working. Anyway, the logging error seems to show a general problem my system has creating sockets. 
But a message like the above, with an exit, should be sufficient safeguard. The problem is that socketpair is reporting that sort of error by returning a negative return code, but it is unchecked, so it's not until much later that you start getting weird errors. Solution? Change the VERIFY macro? More debugging: fred1 fred2 fred3 fred4 fred5 fred5 fred6 stage3 yaya3 if (0 || SOCKET_ERROR == connect( socket_vector[1], (SOCKADDR*) &loca || INVALID_SOCKET == (SOCKET)( socket_vector[0] = accept( temp_li ) { int nLastError = (int)WSAGetLastError(); closesocket( socket_vector[1] ); socket_vector[1] = INVALID_SOCKET; closesocket( temp_listen_socket ); errno = nLastError; logmsg("stage3\n"); fprintf(stderr, "yaya3\n"); exit(0); return -1; Another run: fred1 fred2 fred3 fred4 fred5 fred5 fred6 yaya3 No error and you can see the log not working. if (0 || SOCKET_ERROR == connect( socket_vector[1], (SOCKADDR*) &loc || INVALID_SOCKET == (SOCKET)( socket_vector[0] = accept( temp_l ) { int nLastError = (int)WSAGetLastError(); closesocket( socket_vector[1] ); socket_vector[1] = INVALID_SOCKET; closesocket( temp_listen_socket ); errno = nLastError; logmsg("stage3\n"); fprintf(stderr, "yaya3 %s\n", strerror(errno)); exit(0); return -1; } More debug: fred1 fred2 fred3 fred4 fred5 fred5 fred6 fred7 fred8 fred9 fred1 fred2 fred3 fred4 fred5 fred5 fred6 yaya2.5 No error myflag 0 last 10049 yaya3 No error 1 file(s) copied. 
if (0 || SOCKET_ERROR == connect( socket_vector[1], (SOCKADDR*) &localho || ((myflag = 7) != 7) || INVALID_SOCKET == (SOCKET)( socket_vector[0] = accept( temp_liste ) { int nLastError; fprintf(stderr, "yaya2.5 %s\n", strerror(errno)); fprintf(stderr, "myflag %d\n", myflag); nLastError = (int)WSAGetLastError(); fprintf(stderr, "last %d\n", nLastError); closesocket( socket_vector[1] ); socket_vector[1] = INVALID_SOCKET; closesocket( temp_listen_socket ); errno = nLastError; logmsg("stage3\n"); fprintf(stderr, "yaya3 %s\n", strerror(errno)); showing that the error is on the connect, and it's equal to 10049. /usr/include/w32api/winerror.h:#define WSAEADDRNOTAVAIL 10049L which according to this documentation I found online: http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx is suggesting that the name is invalid. I think it would be odd that my machine is randomly not having an address available. Is another thread responsible for creating that IP address? But the address was available shortly before, right? Could another thread be shutting it down? More debugging info needed? BFN. Paul. |
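The point about the unchecked return code can be sketched as a wrapper that reports a failure at the moment it happens, rather than letting a -1 descriptor travel onwards. This is a rough illustration only: it uses the POSIX pipe() call as a stand-in for create_pipe, and create_pipe_checked is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: create a pipe and report failure immediately on the
 * given stream. Returns 0 on success, -1 on failure (both fds are then -1). */
int create_pipe_checked(int fds[2], const char *what, FILE *err)
{
    if (pipe(fds) < 0) {
        int saved = errno;          /* save before any other call changes it */
        fds[0] = fds[1] = -1;
        fprintf(err, "%s: pipe creation failed: %s\n", what, strerror(saved));
        return -1;
    }
    return 0;
}
```

Writing to stderr rather than through the logger matters here, since the failure may happen before the logger thread is ready.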
|
|
Re: Re: socket woes

Paul,

First, the correction to create_pipe/socketpair is correct (the 2nd & 3rd parameters are reversed).... But irrelevant to your problem ! The parameters are NOT actually used besides to check for validity.

I believe your problem is a race condition.

On Windows, it seems we took the route to use TCP sockets to emulate the BSD 'socketpair' system call (I don't know why we didn't go for named pipes, but this might not have been available on Win9X).

Anyway, the way it's done is this :

- A socket of type AF_INET/SOCK_STREAM is created, bound to 127.0.0.1 port 0 and put into listen
- A connect is issued to 127.0.0.1 port 0
- An accept is issued to grab the other side
- The listening socket is closed.

On a Posix system, doing this would lead to an EADDRINUSE on the 2nd try unless you do a setsockopt/SO_REUSEADDR.. Windows seems to not have this problem. We don't have that problem on linux because under linux socketpair() does not involve binding sockets for listening.

However, if multiple threads attempt to simultaneously call the socketpair function in w32util.c, there is a time window for one or more of simultaneous attempts to fail if the 1st attempt hasn't completed its work. Apparently, I'd say a lock is in order here to serialize access to w32util.c[socketpair()].

In hercules, you have a good chance of this happening if the connect/accept sequence is taking an unusual amount of time to complete (a firewall may cause this for example).

The 3270 shoulder tap socket and the console logging pipes are created consecutively in the same thread.. So they can't be directly responsible.. (or rather it's not those 2 conflicting).. But the printer emulation code and the BSC emulation code may be doing those in separate threads..

Anyway.. Could you try this :

===================================================================
--- w32util.c (revision 5421)
+++ w32util.c (working copy)
@@ -1881,6 +1881,7 @@
     localhost_addr.sin_port = htons( 0 );
     localhost_addr.sin_addr.s_addr = htonl( INADDR_LOOPBACK );
 
+    obtain_lock(&sysblk.sockpipe_lock);
     if (0
         || SOCKET_ERROR == bind( temp_listen_socket, (SOCKADDR*) &localhost_addr, len )
         || SOCKET_ERROR == listen( temp_listen_socket, 1 )
@@ -1890,6 +1891,7 @@
     {
         int nLastError = (int)WSAGetLastError();
         closesocket( temp_listen_socket );
+        release_lock(&sysblk.sockpipe_lock);
         errno = nLastError;
         return -1;
     }
@@ -1903,11 +1905,13 @@
         closesocket( socket_vector[1] );
         socket_vector[1] = INVALID_SOCKET;
         closesocket( temp_listen_socket );
+        release_lock(&sysblk.sockpipe_lock);
         errno = nLastError;
         return -1;
     }
 
     closesocket( temp_listen_socket );
+    release_lock(&sysblk.sockpipe_lock);
     return 0;
 }

[Non-text portions of this message have been removed]
|
|
Re: socket woes
--- In hercules-390@..., Ivan Warren <ivan@...> wrote:
> > Paul, > > First, the correction to create_pipe/socketpair is correct (the 2nd & > 3rd parameters are reversed).... But irrelevant to your problem ! Ok. > In hercules, you have a good chance of this happening if the > connect/accept sequence is taking an unusual amount of time > to complete (a firewall may cause this for example). Ok, perhaps combined with single-threading on my PC. > Anyway.. Could you try this : Thanks Warren. Unfortunately I'm still getting: 10:51:07 HHCSD004I Device 000C bound to socket localhost:3505 10:51:07 HHCSD020I Socketdevice listener thread started: tid=00001A04, pid=6684 10:51:07 ** Win32 porting error: invalid call to 'w32_select' from sockdev.c(43 10:51:07 HHCSD021E select failed; errno=0: No error 10:51:07 ** Win32 porting error: invalid call to 'w32_select' from sockdev.c(43 sockdev.c(437): mixed set(s) (ie almost certainly -1) on some runs, and: HHCLG009S Syslog message pipe creation failed: The requested address is not vali d in its context. on other runs. and working on 50% of runs. I'll put back my debug info to see what's happening. BFN. Paul. |
|
|
Re: socket woes
--- In hercules-390@..., Ivan Warren <ivan@...> wrote:
> > However, if multiple threads attempt to simultaneously call the > socketpair function in w32util.c, there is a time window for one or more > of simultaneous attempts to fail if the 1st attempt hasn't completed its > work. Apparently, I'd say a lock is in order here to serialize access to > w32util.c[socketpair()]. > > In hercules, you have a good chance of this happening if the > connect/accept sequence is taking an unusual amount of time to complete > (a firewall may cause this for example). I tried forcing an unusual amount of time in order to get a hard error. But despite putting a 3 second pause in at the start of socketpair, and a 3 second pause before the "connect", it still has the same symptoms - ie fails some of the time, works some of the time. I don't have a theory that would explain that. With the first 3 second pause, all other threads should have either completed or blocked by then (unless they're being triggered by timers?). With the second 3 second pause, every socket/connect sequence takes a long time, but Windows sometimes doesn't seem to care. Also, judging by my debug statements, I don't see socketpair being called from multiple threads simultaneously, even without your new lock. BFN. Paul. P.S. Here's 2 10-second delays, no locking: socket_vector[0] = socket_vector[1] = INVALID_SOCKET; sleep(10); plus sleep(10); ... || SOCKET_ERROR == connect( socket_vector[1], C:\mvs380\jcl>runmvs mvsendec.jcl temp.txt 1 file(s) copied. 1 file(s) copied. Hercules HET IEHINITT program Version 3.06:380-4.x (c)Copyright 1999-2007 by Roger Bowler, Jan Jaeger, and others Hercules HET IEHINITT program Version 3.06:380-4.x (c)Copyright 1999-2007 by Roger Bowler, Jan Jaeger, and others mvsendec.jcl c:\mvs380\jcl\termherc.jcl 1 file(s) copied. 
fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 fred7 fred8 fred9 Hercules Version 3.06:380-4.x (c)Copyright 1999-2007 by Roger Bowler, Jan Jaeger, and others Built on Jun 28 2009 at 11:53:05 Build information: Win32 (MSVC) build Modes: S/370 S/380 ESA/390 z/Arch Max CPU Engines: 8 Using fthreads instead of pthreads Dynamic loading support Loadable module default base directory is . Using shared libraries HTTP Server support No SIGABEND handler Regular Expressions support Automatic Operator support Machine dependent assists: cmpxchg1 cmpxchg4 cmpxchg8 fetch_dw store_dw Running on PAUL-LAPTOP Windows_NT-6.0 i686 UP Crypto module loaded (c) Copyright Bernard van der Helm, 2003-2008 Active: Message Security Assist Message Security Assist Extension 1 Message Security Assist Extension 2 fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 yaya2.5 No error myflag 0 last 10049 stage3 yaya3 No error 1 file(s) copied. |
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
> > I tried forcing an unusual amount of time in order > to get a hard error. New idea - I tried doing the same operation in a loop. DLL_EXPORT int socketpair( int domain, int type, int protocol, int socket_vect { int i; for (i = 0; i < 100; i++) { socketpair2(domain, type, protocol, socket_vector); closesocket(socket_vector[0]); closesocket(socket_vector[1]); } exit(0); } C:\mvs380\jcl>runmvs mvsendec.jcl temp.txt 2>temp2.txt It seems it goes into batches of errors: fred3 fred4 fred5 fred5.5 fred6 fred7 fred8 fred9 fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 yaya2.5 No error myflag 0 last 10049 yaya3 No error fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 yaya2.5 No error myflag 0 last 10049 yaya3 No error fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 yaya2.5 No error myflag 0 last 10049 yaya3 No error fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 yaya2.5 No error myflag 0 last 10049 yaya3 No error fred1 fred2 fred3 fred4 fred5 fred5.5 fred6 fred7 With 21 failures in total out of 100 runs. Of course, this lends itself to a "solution" of sorts. Just retry the operation 300 times before giving up. Let me experiment some more. BFN. Paul. |
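The "retry before giving up" idea could look roughly like the bounded loop below. It is a sketch of the stopgap only, not a fix for whatever makes the call fail, and both function names are hypothetical; fail_n_times exists purely so the loop can be exercised without real sockets.

```c
/* Hypothetical retry helper: call op(ctx) until it returns 0 or the attempt
 * budget is used up. Returns the number of attempts made on success, or -1
 * if every attempt failed. */
int retry_until_ok(int (*op)(void *ctx), void *ctx, int max_attempts)
{
    int attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++) {
        if (op(ctx) == 0)
            return attempt;
    }
    return -1;
}

/* Demonstration operation: fails while the counter pointed to by ctx is
 * still positive, decrementing it on each failure. */
int fail_n_times(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

With a counter of 3 and a budget of 300, the helper succeeds on the fourth attempt; with a budget smaller than the number of failures it returns -1, which the caller still has to check.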
|
|
Re: Re: socket woes
kerravon86 wrote:
> --- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
>
>> I tried forcing an unusual amount of time in order
>> to get a hard error.
>>
>
> New idea - I tried doing the same operation in a loop.
>
> last 10049
>
unnecessary. I misinterpreted what a bind to port 0 meant. A bind to port 0 on a unicast address (that is not INADDR_ANY) assigns a random available port, so we shouldn't have any concurrency problem after all.

Now, I'm having a hard time figuring from your debug messages WHICH of the calls is failing (bind, listen, connect or accept) with the WSAEADDRNOTAVAIL error..

What could be helpful would be to obtain the raw contents of the localhost_addr structure and the actual call that failed when you hit a 10049 error.

--Ivan

[Non-text portions of this message have been removed]
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
>
> With 21 failures in total out of 100 runs.

Even with a 10 second pause after a failure, I still got 16 out of 100 failures. The failures are less batched, but still sometimes batched. It's difficult to tell at this level. Could just be coincidence.

Not sure if it's relevant, but I have two non-Hercules errors that started a few months ago:

1. Type in www.google.com.au
It will wait and wait. Even if I type in another address it will still wait. Open another screen and go to the same address, and it usually works. Original IE window never comes back, even after hours.

2. Type in "cvs diff" or any other command to a sourceforge project. 50% of the time it hangs and never comes back. ctrl-c, try again, usually works.

I did an internet search and of course the world is full of bugs:

http://forums.sun.com/thread.jspa?threadID=670368

and it's difficult to tell what's relevant. All IE plugins are disabled on my system (deliberately). IE also crashes regularly on my system. I always seem to have so many PC problems.

Anyway, I figured it was probably an error on the Windows side, but decided to inspect the parameters that were being passed to connect(). To my surprise, the port number was no longer 0. It turns out that getsockname() is changing the port number. I tried forcing it back to 0, but that failed.

It took all day, most spent on a wild goose chase, but I finally found out where it's going wrong.

The address sometimes stays correct:

byte 4c is 7f
byte 4d is 7f
byte 4e is 7f

and sometimes gets clobbered:

byte 4c is 7f
byte 4d is 7f
byte 4e is 0

    if (0
        || SOCKET_ERROR == bind( temp_listen_socket, (SOCKADDR*) &localhost_a
        || (fprintf(stderr, "byte 4c is %x\n", ((unsigned char *)&localhost_add
        || SOCKET_ERROR == listen( temp_listen_socket, 1 )
        || (fprintf(stderr, "byte 4d is %x\n", ((unsigned char *)&localhost_add
        || SOCKET_ERROR == getsockname( temp_listen_socket, (SOCKADDR*) &loca
        || (fprintf(stderr, "byte 4e is %x\n", ((unsigned char *)&localhost_add
        || INVALID_SOCKET == (SOCKET)( socket_vector[1] = socket( AF_INET, SOCK

That byte is the first byte of the address.

    || (fprintf(stderr, "byte 4e is %x\n", ((unsigned char *)&localhost_addr)[4]) == 999)

And it's being wiped out by getsockname. getsockname returns the port number, which I think we need, but wipes out the address randomly.

The documentation:

http://msdn.microsoft.com/en-us/library/ms738543(VS.85).aspx

says:

The getsockname function does not always return information about the host address when the socket has been bound to an unspecified address, unless the socket has been connected with connect or accept (for example, using ADDR_ANY). A Windows Sockets application must not assume that the address will be specified unless the socket is connected. The address that will be used for the socket is unknown unless the socket is connected when used in a multihomed host. If the socket is using a connectionless protocol, the address may not be available until I/O occurs on the socket.

Not sure whether that says it or the app is misbehaving.

A simple fix then would seem to be to stop reusing the same variable, and then just copy the port number across. Or is there some other intention?

BFN. Paul.
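A clobbered address like the one in the byte dumps can also be detected directly rather than by printing single bytes. The following is a small sketch of such a check, assuming the structure is meant to hold 127.0.0.1; is_ipv4_loopback is a hypothetical helper name, and the headers shown are the POSIX ones (on Windows the equivalent declarations come from the Winsock headers).

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical check: returns 1 if addr holds the IPv4 address 127.0.0.1,
 * 0 otherwise (for example when the address field has been zeroed). */
int is_ipv4_loopback(const struct sockaddr_in *addr)
{
    return addr->sin_family == AF_INET
        && addr->sin_addr.s_addr == htonl(INADDR_LOOPBACK);
}
```

Calling this on the structure just before connect() would separate "the address was wiped" from other causes of a failed connect.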
|
|
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
> > yaya2.5 No error > myflag 0 > last 10049 > yaya3 No error > 1 file(s) copied. > > if (0 > || SOCKET_ERROR == connect( socket_vector[1], (SOCKADDR*) &localho > || ((myflag = 7) != 7) > || INVALID_SOCKET == (SOCKET)( socket_vector[0] = accept( temp_liste > ) > { > int nLastError; > fprintf(stderr, "yaya2.5 %s\n", strerror(errno)); > fprintf(stderr, "myflag %d\n", myflag); > > showing that the error is on the connect, and it's > equal to 10049. Warren, the error is on that connect above. > /usr/include/w32api/winerror.h:#define WSAEADDRNOTAVAIL 10049L As per documentation: http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx If the address member of the structure specified by the name parameter is all zeroes, connect will return the error WSAEADDRNOTAVAIL. The reason it's zero, as per previous email, is the getsockname() function appears to be allowed to wipe out the address. I'm not familiar with sockets so I don't know if I'm misreading the doco. I do know how to put in debug statements to track down where it's falling over though. :-) BFN. Paul. |
|
|
Re: Re: socket woes
kerravon86 wrote:
> Warren, the error is on that connect above.
>
Ok.. (btw, Warren is my last name.. Call me Ivan.. please :) )

>
>> /usr/include/w32api/winerror.h:#define WSAEADDRNOTAVAIL 10049L
>>
>
> As per documentation:
>
> http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx
>
> If the address member of the structure specified by the name parameter is all zeroes, connect will return the error WSAEADDRNOTAVAIL.
>
>
> The reason it's zero, as per previous email, is
> the getsockname() function appears to be allowed
> to wipe out the address.
>
>
getsockname is supposed to fill the structure with the complete address (IP address and port number) to which the socket is actually bound.

After the bind on 127.0.0.1 Port 0, the IP address should still be 127.0.0.1 and the Port number should be one of the automatically usable ports.

And WSAEADDRNOTAVAIL will be returned for other reasons than the whole structure being zeroes !..

Could you do something like this when the error occurs ? :

    unsigned char *lptr;
    int i;

    /* dump the raw bytes of the address structure, 16 per line */
    lptr=(unsigned char *)&localhost_addr;
    for(i=0;i<len;i++)
    {
        if(i%16==0) {
            if(i) printf("\n");
            printf("%4.4X : ",i);
        }
        if(i%4==0) printf(" ");
        printf("%2.2X",lptr[i]);
    }
    if(i%16) printf("\n");

--Ivan

[Non-text portions of this message have been removed]
|
|
Re: socket woes
--- In hercules-390@..., Ivan Warren <ivan@...> wrote:
> > kerravon86 wrote: > > Warren, the error is on that connect above. > > > Ok.. (btw, Warren is my last name.. Call me Ivan.. please :) ) Sorry, I'm totally zonked out after a marathon debugging. It appears that my previous email failed to make it so far. > getsockname is suppose to fill the structure with the > complete address (IP address and port number) to which > the socket is actually bound. The documentation quibbles. "The getsockname function does not always return information about the host address when the socket has been bound to an unspecified address, unless the socket has been connected with connect or accept (for example, using ADDR_ANY). A Windows Sockets application must not assume that the address will be specified unless the socket is connected." > After the bind on 127.0.0.1 Port 0, the Ip address should still be > 127.0.0.1 and the Port number should be one of the automatically > usable ports. Wouldn't that be nice? > And WSAEADDRNOTAVAIL will be returned for other reasons than > the whole structure being zeroes !.. Yep, wiped out by getsockname. > Could you do something like this when the error occurs ? 
: > > unsigned char *lptr; > int i; > > lptr=&localhost_addr; > for(i=0;i<len;i++) > { > if(i%16==0) { > if(i) printf("\n"); > printf("%4.4X : ",i); > } > if(i%4==0) printf(" "); > printf("%2.2X",lptr[i]); > } > if(i%16) printf("\n"); In the possibly lost email, I showed this: A good one: check parms ready for connect byte 0 is 2 byte 1 is 0 byte 2 is e8 byte 3 is 7c byte 4 is 7f byte 5 is 0 byte 6 is 0 byte 7 is 1 byte 8 is 0 byte 9 is 0 byte 10 is 0 byte 11 is 0 byte 12 is 0 byte 13 is 0 byte 14 is 0 byte 15 is 0 A bad one: check parms ready for connect byte 0 is 2 byte 1 is 0 byte 2 is e8 byte 3 is 82 byte 4 is 0 byte 5 is 0 byte 6 is 0 byte 7 is 0 byte 8 is 0 byte 9 is 0 byte 10 is 0 byte 11 is 0 byte 12 is 0 byte 13 is 0 byte 14 is 0 byte 15 is 0 showing wipeout: byte 4c is 7f byte 4d is 7f byte 4e is 0 if (0 || SOCKET_ERROR == bind( temp_listen_socket, (SOCKADDR*) &localhost_a || (fprintf(stderr, "byte 4c is %x\n", ((unsigned char *)&localhost_add || SOCKET_ERROR == listen( temp_listen_socket, 1 ) || (fprintf(stderr, "byte 4d is %x\n", ((unsigned char *)&localhost_add || SOCKET_ERROR == getsockname( temp_listen_socket, (SOCKADDR*) &loca || (fprintf(stderr, "byte 4e is %x\n", ((unsigned char *)&localhost_addr)[4]) == 999) BFN. Paul. |
|
|
Re: socket woes
--- In hercules-390@..., Ivan Warren <ivan@...> wrote:
>

Ok Ivan, here's a working "fix". I tested this in my driver, to make it get called 100 times. No error. I then put it in my proper version and started it manually 20 odd times. No error.

I don't know whether this is a technically good fix though - ideally there would be a socket option or something that said "don't let getsockname produce random results".

Speaking of random results - now that we know where it is, can we put in some sort of return code checking so that if a failure should ever occur again, we get a log (is this logger even available?) or an fprintf to stderr, and perhaps an immediate exit (same as when the logger thread fails to start). Or at least, checking of the return code in the caller?

Thanks. Paul.

Index: hercules/w32util.c
diff -c hercules/w32util.c:1.2 hercules/w32util.c:1.3
*** hercules/w32util.c:1.2 Sat Jun 27 17:58:38 2009
--- hercules/w32util.c Sun Jun 28 22:20:47 2009
***************
*** 1855,1860 ****
--- 1855,1862 ----
  SOCKET temp_listen_socket;
  struct sockaddr_in localhost_addr;
  int len = sizeof(localhost_addr);
+ struct sockaddr_in temp_addr;
+ int templen = sizeof(temp_addr);
  // Technique: create a pair of sockets bound to each other by first creating a
  // temporary listening socket bound to the localhost loopback address (127.0.0.1)
***************
*** 1884,1890 ****
  if (0
  || SOCKET_ERROR == bind( temp_listen_socket, (SOCKADDR*) &localhost_addr, len )
  || SOCKET_ERROR == listen( temp_listen_socket, 1 )
! || SOCKET_ERROR == getsockname( temp_listen_socket, (SOCKADDR*) &localhost_addr, &len )
  || INVALID_SOCKET == (SOCKET)( socket_vector[1] = socket( AF_INET, SOCK_STREAM, 0 ) )
  )
  {
--- 1886,1892 ----
  if (0
  || SOCKET_ERROR == bind( temp_listen_socket, (SOCKADDR*) &localhost_addr, len )
  || SOCKET_ERROR == listen( temp_listen_socket, 1 )
! || SOCKET_ERROR == getsockname( temp_listen_socket, (SOCKADDR*) &temp_addr, &templen )
  || INVALID_SOCKET == (SOCKET)( socket_vector[1] = socket( AF_INET, SOCK_STREAM, 0 ) )
  )
  {
***************
*** 1894,1899 ****
--- 1896,1903 ----
  return -1;
  }
+ localhost_addr.sin_port = temp_addr.sin_port;
+
  if (0
  || SOCKET_ERROR == connect( socket_vector[1], (SOCKADDR*) &localhost_addr, len )
  || INVALID_SOCKET == (SOCKET)( socket_vector[0] = accept( temp_listen_socket, (SOCKADDR*) &localhost_addr, &len ) )
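For readers who want to see the whole technique in one place, here is a sketch of a loopback socketpair emulation written against the portable BSD socket calls, with getsockname() writing into a separate structure and only the port copied across. It is an illustration under those assumptions and not the Hercules w32util.c code; loopback_socketpair and its variable names are hypothetical, and the Winsock version would use SOCKET, closesocket() and WSAGetLastError() instead.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <unistd.h>

/* Hypothetical loopback emulation of socketpair(). Returns 0 on success,
 * -1 on failure (both entries of sv are then -1). */
int loopback_socketpair(int sv[2])
{
    struct sockaddr_in connect_addr, bound_addr;
    socklen_t bound_len = sizeof(bound_addr);
    int listener;

    sv[0] = sv[1] = -1;

    memset(&connect_addr, 0, sizeof(connect_addr));
    connect_addr.sin_family = AF_INET;
    connect_addr.sin_port = htons(0);               /* let the stack pick a port */
    connect_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    if (bind(listener, (struct sockaddr *)&connect_addr, sizeof(connect_addr)) < 0
        || listen(listener, 1) < 0
        /* Query into a separate structure so our address cannot be clobbered. */
        || getsockname(listener, (struct sockaddr *)&bound_addr, &bound_len) < 0)
        goto fail;

    /* Take only the assigned port; keep the 127.0.0.1 we set ourselves. */
    connect_addr.sin_port = bound_addr.sin_port;

    sv[1] = socket(AF_INET, SOCK_STREAM, 0);
    if (sv[1] < 0)
        goto fail;
    if (connect(sv[1], (struct sockaddr *)&connect_addr, sizeof(connect_addr)) < 0)
        goto fail;
    sv[0] = accept(listener, NULL, NULL);
    if (sv[0] < 0)
        goto fail;

    close(listener);                                /* listener no longer needed */
    return 0;

fail:
    if (sv[1] >= 0)
        close(sv[1]);
    sv[0] = sv[1] = -1;
    close(listener);
    return -1;
}
```

The caller is still expected to check the return value; on failure both descriptors are left at -1 so they cannot be mistaken for usable sockets.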
|
|
Re: Re: socket woes
kerravon86 wrote:
> --- In hercules-390@..., Ivan Warren <ivan@...> wrote:
>
> Ok Ivan, here's a working "fix".
>
>
Fair enough.

A fix has been committed to SVN (with some styling changes and some comments).

The fix has been marked as a workaround since we haven't yet been able to determine the actual root cause of the issue.

Also committed a fix to match socketpair semantics with POSIX/BSD semantics

--Ivan

[Non-text portions of this message have been removed]
|
|
Re: socket woes
--- In hercules-390@..., Ivan Warren <ivan@...> wrote:
>
> A fix has been committed to SVN (with some styling changes
> and some comments).
> The fix has been marked as a workaround since we haven't
> yet been able to determine the actual root cause of the issue.

Thanks Ivan, looks good. Nitpicking - the memset should have used templen.

On to more serious matters - you presumably can't tell from the documentation what's going on with getsockname. I did a Google search and found this:

http://lists.freebsd.org/pipermail/freebsd-stable/2004-January/005486.html

(not totally relevant)

After fixing a bug fix to the bug fix to the bug, plus deciding that "close" was a substitute for "closesocket", I put the program into a loop and got basically the same results - using Cygwin gcc:

listening on 127.0.0.1:62684
0
listening on 127.0.0.1:62685
0
listening on 127.0.0.1:62686
0
listening on 0.0.0.0:62687
0
listening on 127.0.0.1:62688
0
listening on 127.0.0.1:62689
0
listening on 127.0.0.1:62690
0
listening on 127.0.0.1:62691

Note the 0.0.0.0. It wasn't alone, and it turned up randomly.

#include <stdio.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>      /* close() */
#include <netinet/in.h>
#include <arpa/inet.h>

int foo()
{
    int sock;
    socklen_t len;       /* getsockname() takes a socklen_t pointer */
    struct sockaddr_in addr, foo;

    len = sizeof foo;
    if((sock=socket(AF_INET, SOCK_STREAM, 0))<0)
    {
        exit(0);
    }
    memset(&addr, 0, sizeof(struct sockaddr_in));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);    /* let the system choose the port */
    if(bind(sock, (struct sockaddr *) &addr, sizeof(struct sockaddr_in))<0)
    {
        perror("bind");
        exit(0);
    }
    if(listen(sock, 5)<0)
    {
        perror("listen");
        exit(0);
    }
    /* print the getsockname() return code, then the address it reported */
    fprintf(stderr, "%d\n", getsockname(sock, (struct sockaddr *) &foo, &len));
    fprintf(stderr, "listening on %s:%d\n", inet_ntoa(foo.sin_addr), ntohs(foo.sin_port));
    close(sock);
    return 0;
}

int main(void)
{
    int x;

    for (x = 0; x < 100; x++)
    {
        foo();
    }
    return (0);
}

So, is there some place where we can get a good answer on Windows socket questions? I'm using Windows Vista.

> Also committed a fix to match socketpair semantics with
> POSIX/BSD semantics

Thanks. Any chance of protection for -1 results for any other reason?

BFN. Paul.

P.S. I see my lost message turned up after 5 hours or something!
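The two points raised above (the stray 0.0.0.0 from getsockname, and protection against -1 results) could be handled together roughly as follows. This is only a sketch under assumptions: it uses plain POSIX sockets like the test program rather than Winsock, the helper name listen_on_loopback is made up for illustration, and it is not the fix that was committed to SVN. The idea is to check every return value and, since the socket was bound to 127.0.0.1, to distrust a reported address of 0.0.0.0 and put the loopback address back.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not Hercules code): listen on 127.0.0.1 with a
 * system-chosen port and report the address in *out.
 * Returns the listening socket, or -1 on any failure. */
int listen_on_loopback(struct sockaddr_in *out)
{
    struct sockaddr_in want;
    socklen_t len = sizeof *out;
    int sock;

    assert(out != NULL);

    sock = socket(AF_INET, SOCK_STREAM, 0);
    if (sock < 0)
        return -1;

    memset(&want, 0, sizeof want);
    want.sin_family = AF_INET;
    want.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    want.sin_port = htons(0);            /* let the stack pick a port */

    if (bind(sock, (struct sockaddr *)&want, sizeof want) < 0 ||
        listen(sock, 5) < 0) {
        close(sock);
        return -1;
    }

    /* Treat every kind of failure the same way: an error return,
     * a short result, or a port that was never filled in. */
    memset(out, 0, sizeof *out);
    if (getsockname(sock, (struct sockaddr *)out, &len) < 0 ||
        len < (socklen_t)sizeof *out || out->sin_port == 0) {
        close(sock);
        return -1;
    }

    /* Assumed workaround: we bound to loopback ourselves, so if the
     * reported address is 0.0.0.0, restore the address we asked for. */
    if (out->sin_addr.s_addr == htonl(INADDR_ANY))
        out->sin_addr.s_addr = want.sin_addr.s_addr;

    return sock;
}
```

A caller would pass a struct sockaddr_in, check for -1, and only then use the port in out->sin_port. Whether papering over the 0.0.0.0 result is acceptable depends on the root cause, which is still unknown at this point in the thread.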
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
- - - snipped - - -
> Not sure if it's relevant, but I have two
> non-Hercules errors that started a few months
> ago:
- - - snipped - - -

If you haven't put on Vista SP2, you can check for root-kits with:
http://www.exterminate-it.com/downloads/rku.exe
Knowing how to read it helps.

Some good free trojan-scanning software:

http://www.SuperAntiSpyware.com
Preference/Repair also fixes damage caused by some infections.

http://www.MalwareBytes.org
Several software packages are on the page. Make certain that you get MalwareBytes. Check the release number and size. Currently it's Malwarebytes Anti-Malware 1.38 with a 3.4MB install file. The software unlocks files in use for deletion.

http://www.exterminate-it.com
The free one will only scan but you can manually delete. I bought several licenses just to save my time doing manual deletes.

Other people may also recommend LavaSoft's Ad-aware. I don't use it but have heard good things. It does more than blocking ads now.

Trend's Housecall may be better for viruses?

Many other good products are available, and so are many ransomware and other infections. Watch out for what you download and install.

Always check for and apply updates often on scanning products.
Re: socket woes
--- In hercules-390@..., "somitcw" <somitcw@...> wrote:
>
> --- In hercules-390@...,
> "kerravon86" <kerravon86@> wrote:
> - - - snipped - - -
> > Not sure if it's relevant, but I have two
> > non-Hercules errors that started a few months
> > ago:
> - - - snipped - - -
>
> If you haven't put on Vista SP2, you can
> check for root-kits with:

I have Vista automatic updates on, and I run AVG Free, and I don't install much software. I don't think it's likely to be a virus etc.

BFN. Paul.
Re: socket woes
--- In hercules-390@..., "kerravon86" <kerravon86@...> wrote:
- - - snipped - - -
> I have Vista automatic updates on,

That is good but don't expect full protection from trojans, worms, viruses, root-kits, ad-ware, ransomware, and other threats.

> and I run AVG Free,

Free Grisoft, Avira, and the less restrictive Avast Anti-Virus should take care of almost all viruses, a few trojans, some ransomware, some worms, and many other threats. Anti-Adware would be expected to deal with pop-up ads and some other stuff. Anti-Spyware might not get any virus but should get many trojans and worms.

> and I don't install much software. I don't
> think it's likely to be a virus etc.
> BFN. Paul.

I didn't think that it was a virus either, or I would have suggested more anti-virus software.
RE: Re: socket woes
kerravon86 wrote:
> --- In hercules-390@..., Ivan Warren <ivan@...> wrote:
> > Ok Ivan, here's a working "fix".
> >

Fair enough. A fix has been committed to SVN (with some styling changes and some comments). The fix has been marked as a workaround since we haven't yet been able to determine the actual root cause of the issue. Also committed a fix to match socketpair semantics with POSIX/BSD semantics

--Ivan

[Non-text portions of this message have been removed]

Well, I think it is not a work-around (the root cause is very clearly indicated by Paul's great debug efforts)!!

Here is a note from the bind documentation >>

For TCP/IP, if the port is specified as zero, the service provider assigns a unique port to the application with a value between 1024 and 5000. The application can use <http://msdn.microsoft.com/en-us/library/ms738543(VS.85).aspx> getsockname after calling bind to learn the address and the port that has been assigned to it. If the Internet address is equal to INADDR_ANY, getsockname cannot necessarily supply the address until the socket is connected, since several addresses can be valid if the host is multihomed. Binding to a specific port number other than port 0 is discouraged for client applications, since there is a danger of conflicting with another socket already using that port number.

Note: When using bind with the SO_EXCLUSIVEADDR or SO_REUSEADDR socket option, the socket option must be set prior to executing bind to have any affect.

Here is the MSDN doc verbatim >>

getsockname Function

The getsockname function retrieves the local name for a socket.

Syntax (C++)

    int getsockname(
        __in     SOCKET s,
        __out    struct sockaddr *name,
        __inout  int *namelen
    );

Parameters

s [in]
    Descriptor identifying a socket.

name [out]
    Pointer to a <http://msdn.microsoft.com/en-us/library/ms740496(VS.85).aspx> SOCKADDR structure that receives the address (name) of the socket.

namelen [in, out]
    Size of the name buffer, in bytes.
Return Value

If no error occurs, getsockname returns zero. Otherwise, a value of SOCKET_ERROR is returned, and a specific error code can be retrieved by calling <http://msdn.microsoft.com/en-us/library/ms741580(VS.85).aspx> WSAGetLastError.

Error code / Meaning

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsanotinitialised_2> WSANOTINITIALISED
    A successful <http://msdn.microsoft.com/en-us/library/ms742213(VS.85).aspx> WSAStartup call must occur before using this API.

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsaenetdown_2> WSAENETDOWN
    The network subsystem has failed.

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsaefault_2> WSAEFAULT
    The name or the namelen parameter is not a valid part of the user address space, or the namelen parameter is too small.

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsaeinprogress_2> WSAEINPROGRESS
    A blocking Windows Sockets 1.1 call is in progress, or the service provider is still processing a callback function.

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsaenotsock_2> WSAENOTSOCK
    The descriptor is not a socket.

<http://msdn.microsoft.com/en-us/library/ms740668(VS.85).aspx#winsock.wsaeinval_2> WSAEINVAL
    The socket has not been bound to an address with <http://msdn.microsoft.com/en-us/library/ms737550(VS.85).aspx> bind, or ADDR_ANY is specified in bind but connection has not yet occurred.

Remarks

The getsockname function retrieves the current name for the specified socket descriptor in name. It is used on the bound or connected socket specified by the s parameter. The local association is returned.
This call is especially useful when a <http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx> connect call has been made without doing a <http://msdn.microsoft.com/en-us/library/ms737550(VS.85).aspx> bind first; the getsockname function provides the only way to determine the local association that has been set by the system.

On call, the namelen parameter contains the size of the name buffer, in bytes. On return, the namelen parameter contains the actual size in bytes of the name parameter.

The getsockname function does not always return information about the host address when the socket has been bound to an unspecified address, unless the socket has been connected with <http://msdn.microsoft.com/en-us/library/ms737625(VS.85).aspx> connect or <http://msdn.microsoft.com/en-us/library/ms737526(VS.85).aspx> accept (for example, using ADDR_ANY). A Windows Sockets application must not assume that the address will be specified unless the socket is connected. The address that will be used for the socket is unknown unless the socket is connected when used in a multihomed host. If the socket is using a connectionless protocol, the address may not be available until I/O occurs on the socket.
Requirements

Minimum supported client: Windows 2000 Professional
Minimum supported server: Windows 2000 Server
Header: Winsock2.h
Library: Ws2_32.lib
DLL: Ws2_32.dll

See Also

<http://msdn.microsoft.com/en-us/library/ms741416(VS.85).aspx> Winsock Reference
<http://msdn.microsoft.com/en-us/library/ms741394(VS.85).aspx> Winsock Functions
<http://msdn.microsoft.com/en-us/library/ms737550(VS.85).aspx> bind
<http://msdn.microsoft.com/en-us/library/ms738533(VS.85).aspx> getpeername
<http://msdn.microsoft.com/en-us/library/ms740496(VS.85).aspx> SOCKADDR
<http://msdn.microsoft.com/en-us/library/ms740506(VS.85).aspx> socket

BTW >>> Are we sure WSAStartup has been called to init the Windows socket env? If we are using process-threads, it must be done for each process.

Here is some more >>

PRB: Getsockname() Returns IP Address 0.0.0.0 for UDP

This article was previously published under Q129065

SYMPTOMS

By following the steps listed below, you might think you should get back the interface address over which the connection was made. However, it actually returns the address 0.0.0.0.

1. Open a UDP socket.
2. Bind it to INADDR_ANY.
3. Call connect() to make a UDP connection.
4. Call getsockname() on your socket.

However, if it was a TCP socket, you would get back the IP address of the interface.
CAUSE

UDP

This is the behaviour expected from some flavors of UNIX, notably those derived from BSD. When an application calls connect() on a UDP socket that is bound to INADDR_ANY, the operating system associates the remote address with the local socket. This saves the programmer from having to specify the remote IP address in each sendto() or recvfrom(). Instead they may use send() and recv(). Note that this is just a convenience provided by the operating system; there is no network traffic associated with this call.

At this point, the underlying IP software determines the interface over which packets will be sent. As described earlier, under BSD UNIX, calling getsockname() will return the IP address of the interface to the application. This however, is not expected behaviour under Windows NT, Windows 95, or Microsoft TCP IP/32 for Windows for Workgroups version 3.11. Calling getsockname() will return the IP address 0.0.0.0 (INADDR_ANY). Applications should not assume that they can get the IP address of the interface.

TCP

The behaviour is different if it was a TCP socket. In this case, calling getsockname() on a connected socket that was bound to INADDR_ANY will return the IP address of the interface over which the connection was made. The state of the connection can also be observed by typing 'netstat' at a command prompt.
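The UDP steps described in the article can be tried out with something like the following. It is a sketch only, written against POSIX sockets rather than Winsock, and the helper name udp_local_after_connect is hypothetical. It binds a UDP socket to INADDR_ANY, calls connect() (which sends no traffic), and then asks getsockname() what the local address is. What comes back depends on the platform, which is the article's point, so the sketch reports the result instead of assuming it.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: follow the four steps from the article and store
 * whatever getsockname() reports in *local.
 * Returns 0 on success, -1 if any call fails. */
int udp_local_after_connect(const struct sockaddr_in *peer,
                            struct sockaddr_in *local)
{
    struct sockaddr_in any;
    socklen_t len = sizeof *local;
    int rc = -1;
    int sock;

    assert(peer != NULL && local != NULL);
    memset(local, 0, sizeof *local);

    sock = socket(AF_INET, SOCK_DGRAM, 0);          /* 1. open a UDP socket */
    if (sock < 0)
        return -1;

    memset(&any, 0, sizeof any);
    any.sin_family = AF_INET;
    any.sin_addr.s_addr = htonl(INADDR_ANY);        /* 2. bind to INADDR_ANY */
    any.sin_port = htons(0);

    if (bind(sock, (struct sockaddr *)&any, sizeof any) == 0 &&
        connect(sock, (const struct sockaddr *)peer,
                sizeof *peer) == 0 &&               /* 3. connect() */
        getsockname(sock, (struct sockaddr *)local,
                    &len) == 0)                     /* 4. getsockname() */
        rc = 0;

    close(sock);
    return rc;
}
```

Printing inet_ntoa(local->sin_addr) afterwards shows whether the platform filled in an interface address or left it as 0.0.0.0. Either way, code that needs a definite local address should not rely on this call alone.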
NOTE: To enumerate all the IP addresses on an IP host, the application should call gethostname(), call gethostbyname(), and then iterate through the h_addr_list[] member of the hostent struct returned by gethostbyname() as in this example:

    char Hostname[100];
    HOSTENT *pHostEnt;
    int nAdapter = 0;

    gethostname( Hostname, sizeof( Hostname ));
    pHostEnt = gethostbyname( Hostname );
    while ( pHostEnt->h_addr_list[nAdapter] )
    {
        // pHostEnt->h_addr_list[nAdapter] -the current address in host order
        nAdapter++;
    }

STATUS

This behavior is by design.

APPLIES TO
* Microsoft Windows Software Development Kit 3.11
* Microsoft Platform Software Development Kit-January 2000 Edition
* Microsoft Win32 Software Development Kit (SDK) for Windows NT 3.5

Keywords: kbwinsock kbnetwork kbip kbprb KB129065

getsockname() returns error 10038 (WSAENOTSOCK) on a duplicated socket

This article was previously published under Q319952

SYMPTOMS

When you call Getsockname() on a duplicated socket, the call may not succeed and you may receive error 10038 (WSAENOTSOCK). This problem occurs when the following conditions exist:

* If the socket handle is duplicated using DuplicateHandle().
* The socket is bound and listening on the loopback address 127.0.0.1.
* The network cable is unplugged.
* Microsoft Proxy Client is installed.

RESOLUTION
To resolve the problem, use the WSADuplicateSocket() call to share a socket between processes.

APPLIES TO
* Microsoft Windows XP Professional
* Microsoft Windows XP Home Edition
* Microsoft Windows 2000 Standard Edition

Keywords: kbdswnet2003swept kbapi kbfix kbnetwork kbprb kbwinsock KB319952
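For context, here is a rough sketch of how a socketpair can be emulated with a loopback TCP connection, which is a common technique where no native socketpair exists. It is an illustration in POSIX C under stated assumptions, not the code in w32util.c, and the name loopback_socketpair is hypothetical. It shows why the address reported by getsockname matters in this thread: the connecting side needs a usable destination, so the sketch takes only the port from getsockname and always connects to 127.0.0.1.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: build a connected pair of TCP sockets over
 * loopback. On success returns 0 and fills sv[0] and sv[1];
 * on failure returns -1 and leaves no sockets open. */
int loopback_socketpair(int sv[2])
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    int listener, client = -1, server = -1;

    assert(sv != NULL);

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);            /* system-chosen port */

    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) < 0)
        goto fail;

    /* Keep only the port from getsockname(). The address is reset to
     * loopback in case it was reported as 0.0.0.0. */
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0)
        goto fail;
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) < 0)
        goto fail;

    server = accept(listener, NULL, NULL);
    if (server < 0)
        goto fail;

    close(listener);                     /* the listener is no longer needed */
    sv[0] = client;
    sv[1] = server;
    return 0;

fail:
    if (client >= 0)
        close(client);
    close(listener);
    return -1;
}
```

Data written with send() on one socket can then be read with recv() on the other, which is how the read_pipe and write_pipe macros use the pair. A production version would probably also want to confirm that the accepted connection really came from its own client socket, since any local process could connect to the listening port in the meantime.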