relayd http check connection failures; hoststated operates correctly


by Ben Lovett

hello,

perhaps it's something that i'm doing wrong here, or a difference
in the way that relayd works compared to hoststated. but here
goes.. i'm attempting to get relayd configured to replace my existing
hoststated setup, doing layer 7 load balancing of web servers.

what's happening is with every http check done, relayd returns a
connect failure. in doing a tcpdump i see the session is
brought up by relayd to the destination servers, the server responds
with a syn/ack, and then a rst is sent by the system running relayd.

...

i have a similar hoststated configuration running on the very same
system, load balancing the very same hosts. it operates as expected,
with the hosts being seen as up and available.

i have attached relayd debug log output, my relayd configuration
file, as well as hoststated debug and the hoststated config.

could someone perhaps shed some light on what i'm doing wrong, if
anything? perhaps a bug in the http check/tcp check code?

if i could be cc'd on any replies, i'd appreciate it. i'm not
currently subscribed to misc@.

cheers,

-ben

relayd debug output:

startup
init_filter: filter init done
tcp_write: connect timed out
relay_privinit: adding relay www
init_tables: created 0 tables
hce_notify_done: aa.bb.cc.209 (tcp_write: connect failed)
protocol 0: name http
host aa.bb.cc.209, check http code (3ms), state unknown -> down, availability 0.00%
        flags: 0x0004
tcp_write: connect timed out
        type: hce_notify_done: aa.bb.cc.211 (tcp_write: connect failed)
http
host aa.bb.cc.211, check http code (4ms), state unknown -> down, availability 0.00%
                pfe_dispatch_imsg: state -1 for host 3 aa.bb.cc.209
request pfe_dispatch_imsg: state -1 for host 2 aa.bb.cc.211
append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
                request append "$REMOTE_ADDR" to "X-Forwarded-For"
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
tcp_write: connect timed out
hce_notify_done: aa.bb.cc.209 (tcp_write: connect failed)
tcp_write: connect timed out
hce_notify_done: aa.bb.cc.211 (tcp_write: connect failed)
^Chost check engine exiting
kill_tables: deleted 0 tables
flush_rulesets: flushed rules
pf update engine exiting
socket relay engine exiting
socket relay engine exiting
terminating
root@nlb2-lax1$ socket relay engine exiting
socket relay engine exiting
socket relay engine exiting

hoststated debug output:

startup
decremented the demote state of group 'carp'
init_filter: filter init done
relay_privinit: adding relay www
init_tables: created 0 tables
protocol 0: name http
        flags: 0x0004
        type: http
                request append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
                request append "$REMOTE_ADDR" to "X-Forwarded-For"
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
adding 2 hosts from table http_hosts
adding 2 hosts from table http_hosts
adding 2 hosts from table http_hosts
adding 2 hosts from table http_hosts
adding 2 hosts from table http_hosts
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
hce_notify_done: aa.bb.cc.209 (tcp_read_buf: check succeeded)
host aa.bb.cc.209, check http code (115ms), state unknown -> up, availability 100.00%
pfe_dispatch_imsg: state 1 for host 1 aa.bb.cc.209
hce_notify_done: aa.bb.cc.209 (tcp_read_buf: check succeeded)
host aa.bb.cc.209, check http code (116ms), state unknown -> up, availability 100.00%
pfe_dispatch_imsg: state 1 for host 0 aa.bb.cc.209
hce_notify_done: aa.bb.cc.209 (tcp_read_buf: check succeeded)
hce_notify_done: aa.bb.cc.209 (tcp_read_buf: check succeeded)
^Chost check engine exiting
kill_tables: deleted 0 tables
flush_rulesets: flushed rules
pf update engine exiting
socket relay engine exiting
socket relay engine exiting
socket relay engine exiting
socket relay engine exiting
incremented the demote state of group 'carp'
terminating
socket relay engine exiting

hoststated configuration:

ext_addr="10.10.10.52"
webhost1="aa.bb.cc.209"
webhost2="aa.bb.cc.209"
timeout 800
prefork 5
log updates
demote carp
table http_hosts {
        real port http
        check http "/" host www.mysite.com code 200
        host $webhost1 retry 2
        host $webhost2 retry 2
}
protocol http {
        protocol http
        header append "$REMOTE_ADDR" to "X-Forwarded-For"
        header append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
        # Various TCP performance options
        tcp { nodelay, sack, socket buffer 65536, backlog 128 }
}
relay www {
        listen on $ext_addr port http
        protocol http
        table http_hosts loadbalance
}

relayd configuration:

ext_addr="10.10.10.52"
webhost1="aa.bb.cc.209"
webhost2="aa.bb.cc.211"
timeout 800
table <webhosts> { $webhost1 $webhost2 }
http protocol http {
        header append "$REMOTE_ADDR" to "X-Forwarded-For"
        header append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
        # header change "Connection" to "close"
        # Various TCP performance options
        tcp { nodelay, sack, socket buffer 65536, backlog 128 }
}
relay www {
        listen on $ext_addr port 80
        protocol http
        # Forward to hosts in the webhosts table using a src/dst hash
        forward to <webhosts> port http mode loadbalance \
                check http "/" host www.mysite.com code 200
}


Re: relayd http check connection failures; hoststated operates correctly

by Ben Lovett

forgot to include system details..

this is:
kern.version=OpenBSD 4.3-beta (GENERIC) #661: Thu Feb 21 15:39:36 MST 2008
    pvalchev@...:/usr/src/sys/arch/i386/compile/GENERIC

-ben


Re: relayd http check connection failures; hoststated operates correctly

by Brad Arrington

Hi,

I ran into the same problem you did, I thought it was something I
was doing wrong until I read your email...

Here is the fix I came up with.

--- check_tcp.c-current Mon Feb 25 15:11:40 2008
+++ check_tcp.c Mon Feb 25 23:48:45 2008
@@ -82,6 +82,7 @@
        if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
                goto bad;

+       gettimeofday(&cte->table->conf.timeout, NULL);
        bcopy(&cte->table->conf.timeout, &tv, sizeof(tv));
        if (connect(s, (struct sockaddr *)&cte->host->conf.ss, len) == -1) {
                if (errno != EINPROGRESS)

I should check the return code of gettimeofday, but here it is anyway...
I submitted a bug report too.

-Brad




Re: relayd http check connection failures; hoststated operates correctly

by Pierre-Yves Ritschard

Brad Arrington <bradla@...> wrote:

> Hi,
>
> I ran into the same problem you did, I thought it was something I
> was doing wrong until I read your email...
>
> Here is the fix I came up with.
>
> --- check_tcp.c-current Mon Feb 25 15:11:40 2008
> +++ check_tcp.c Mon Feb 25 23:48:45 2008
> @@ -82,6 +82,7 @@
>         if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
>                 goto bad;
>
> +       gettimeofday(&cte->table->conf.timeout, NULL);
>         bcopy(&cte->table->conf.timeout, &tv, sizeof(tv));
>         if (connect(s, (struct sockaddr *)&cte->host->conf.ss, len) == -1) {
>                 if (errno != EINPROGRESS)
>
> I should check for return codes on gettimeofday but here it is
> anyway... I submitted a bug report too.
>

I'll handle the bug report, thanks for reporting.


Re: relayd http check connection failures; hoststated operates correctly

by Pierre-Yves Ritschard

Brad Arrington <bradla@...> wrote:

> Hi,
>
> I ran into the same problem you did, I thought it was something I
> was doing wrong until I read your email...
>
> Here is the fix I came up with.
>
> --- check_tcp.c-current Mon Feb 25 15:11:40 2008
> +++ check_tcp.c Mon Feb 25 23:48:45 2008
> @@ -82,6 +82,7 @@
>         if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
>                 goto bad;
>
> +       gettimeofday(&cte->table->conf.timeout, NULL);
>         bcopy(&cte->table->conf.timeout, &tv, sizeof(tv));
>         if (connect(s, (struct sockaddr *)&cte->host->conf.ss, len) == -1) {
>                 if (errno != EINPROGRESS)
>
> I should check for return codes on gettimeofday but here it is
> anyway... I submitted a bug report too.
>
> -Brad
>

Hi Brad,

Your fix is wrong. You are running into a timeout, which happens because the
default relayd configuration assumes you are in the same broadcast
domain as your relayed host and uses a 200ms timeout.

The error reporting is a bit confusing and should just mention that a
timeout occurred; I will fix that. The gettimeofday call you added did
work around your issue, but it is really wrong since it overwrites the
timeout value you specify in the configuration file.

A simple fix for you would be to specify:

timeout 1000 # (or any appropriate timeout value for your application)

in your configuration file.
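
To make the objection concrete, here is a minimal C sketch of what calling gettimeofday() on the configured timeout does to it. The struct check_conf type and both helper names are hypothetical stand-ins and not relayd code; only gettimeofday() itself is the call used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/time.h>

/* Hypothetical stand-in for a table's configured check timeout. */
struct check_conf {
	struct timeval timeout;
};

/* Store a millisecond value the way a config parser might (assumption). */
static void
conf_set_timeout_ms(struct check_conf *conf, long ms)
{
	assert(conf != NULL);
	conf->timeout.tv_sec = ms / 1000;
	conf->timeout.tv_usec = (ms % 1000) * 1000;
}

/*
 * What the proposed patch effectively does before each check:
 * the configured timeout is replaced by the current wall-clock time.
 */
static int
conf_clobber_with_wallclock(struct check_conf *conf)
{
	assert(conf != NULL);
	return gettimeofday(&conf->timeout, NULL);
}
```

After conf_set_timeout_ms(&c, 200) followed by conf_clobber_with_wallclock(&c), tv_sec holds seconds since the epoch instead of 0. A check using that value would in practice never time out, which would explain why the patch appeared to help while discarding the configured value.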

> > startup
> > init_filter: filter init done
> > tcp_write: connect timed out
> > relay_privinit: adding relay www
> > init_tables: created 0 tables
> > hce_notify_done: aa.bb.cc.209 (tcp_write: connect failed)
> > protocol 0: name http
> > host aa.bb.cc.209, check http code (3ms), state unknown -> down,
> > availability 0.00%
> >         flags: 0x0004
> > tcp_write: connect timed out

The timeout is mentioned here.

> >         type: hce_notify_done: aa.bb.cc.211 (tcp_write: connect
> > failed) http

And then a connect failed error happens which might have confused you.

pyr.


Re: relayd http check connection failures; hoststated operates correctly

by Brad Arrington

Hi Pierre-Yves,

I guess we are both wrong...
I tried a few different timeout values, including 1000, before
changing any code. I just checked the unpatched relayd again and I
get the same results.

These web servers just serve the default Apache index page.
I can connect to them instantly from the load balancer (using lynx) or
from any other client machine I have tested.

So either the timeout value is not read or set correctly, or it is
something else.

-Brad



Re: relayd http check connection failures; hoststated operates correctly

by Pierre-Yves Ritschard

Brad Arrington <bradla@...> wrote:

> Hi Pierre-Yves,
>
> I guess we are both wrong...
> I tried a few different timeout values, including 1000, before
> changing any code. I just checked the unpatched relayd again
> and I get the same results.
>
> These web servers just serve the default Apache index page.
> I can connect to them instantly from the load balancer (using lynx)
> or from any other client machine I have tested.
>
> So either the timeout value is not read/set correctly or it is
> something else.
>
Please try with an insanely high value (10 seconds) and see if you still
get a connection timeout message.

To make logging more meaningful you can try with this diff and send me
the relayd -dv output:

Index: check_tcp.c
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/check_tcp.c,v
retrieving revision 1.31
diff -u -p -r1.31 check_tcp.c
--- check_tcp.c 7 Dec 2007 17:17:00 -0000 1.31
+++ check_tcp.c 27 Feb 2008 13:40:45 -0000
@@ -109,21 +109,24 @@ tcp_write(int s, short event, void *arg)
         if (event == EV_TIMEOUT) {
                 log_debug("tcp_write: connect timed out");
                 cte->host->up = HOST_DOWN;
-        } else {
-                len = sizeof(err);
-                if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len))
-                        fatal("tcp_write: getsockopt");
-                if (err != 0)
-                        cte->host->up = HOST_DOWN;
-                else
-                        cte->host->up = HOST_UP;
+                close(s);
+                hce_notify_done(cte->host, "tcp_write: connect timed out");
+                return;
         }
 
+        len = sizeof(err);
+        if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len))
+                fatal("tcp_write: getsockopt");
+        if (err != 0)
+                cte->host->up = HOST_DOWN;
+        else
+                cte->host->up = HOST_UP;
+
         if (cte->host->up == HOST_UP)
                 tcp_host_up(s, cte);
         else {
                 close(s);
-                hce_notify_done(cte->host, "tcp_write: connect failed");
+                hce_notify_done(cte->host, "tcp_write: connection refused");
         }
 }


Re: relayd http check connection failures; hoststated operates correctly

by Ben Lovett

On Wed, Feb 27, 2008 at 11:53:03AM +0100, Pierre-Yves Ritschard wrote:
> Your fix is wrong. You are running into a timeout, which happens because the
> default relayd configuration assumes you are in the same broadcast
> domain as your relayed host and uses a 200ms timeout.

While my relay server isn't in the same broadcast domain as my
backend servers, there is on average 2ms rtt between the systems.
Average response time from the HTTP servers is about 300ms.

> The error reporting is a bit confusing and should just mention that a
> timeout occurred; I will fix that. The gettimeofday call you added did
> work around your issue, but it is really wrong since it overwrites the
> timeout value you specify in the configuration file.
>
> A simple fix for you would be to specify:
>
> timeout 1000 # (or any appropriate timeout value for your application)
>
> in your configuration file.

I hate to say this Pierre-Yves, but this occurs even with a timeout
of 5000ms in my configuration file. The *very* same system, polling
the *very same* hosts with hoststated does not have this problem.

> > > startup
> > > init_filter: filter init done
> > > tcp_write: connect timed out
> > > relay_privinit: adding relay www
> > > init_tables: created 0 tables
> > > hce_notify_done: aa.bb.cc.209 (tcp_write: connect failed)
> > > protocol 0: name http
> > > host aa.bb.cc.209, check http code (3ms), state unknown -> down,
> > > availability 0.00%
> > >         flags: 0x0004
> > > tcp_write: connect timed out
>
> The timeout is mentioned here.

# grep timeout /root/relayd.conf
timeout 5000

>
> > >         type: hce_notify_done: aa.bb.cc.211 (tcp_write: connect
> > > failed) http
>
> And then a connect failed error happens which might have confused you.

If you look here, the connect succeeds..

The initial SYN:
11:07:56.249025 aa.bb.cc.140.43847 > dd.ee.ff.209.80: S [tcp sum ok] 1292907170:1292907170(0) win 16384 <mss 1460,nop,nop,sackOK,nop,wscale 0,nop,nop,timestamp 3626625731 0> (DF) (ttl 64, id 10238, len 64)

The SYN/ACK:
11:07:56.250782 dd.ee.ff.209.80 > aa.bb.cc.140.43847: S [tcp sum ok] 394683021:394683021(0) ack 1292907171 win 5792 <mss 1460,sackOK,timestamp 1366160992 3626625731,nop,wscale 2> (DF) (ttl 54, id 0, len 60)

The RST (by the host initiating the session in the first place):
11:07:56.250814 aa.bb.cc.140.43847 > dd.ee.ff.209.80: R [tcp sum ok] 1292907171:1292907171(0) win 0 (DF) (ttl 64, id 17473, len 40)

Ben
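
The SYN, SYN/ACK, RST sequence above would be consistent with a checker that gives up and closes its socket before the handshake completes, so that the SYN/ACK arrives for a socket that no longer exists. Below is a minimal sketch of a non-blocking connect check that reports a timeout separately from a refused connection. The names check_connect, classify and enum check_result are hypothetical, and poll() is used here for brevity where relayd uses libevent callbacks.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <poll.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

enum check_result { CHECK_UP, CHECK_TIMEOUT, CHECK_REFUSED, CHECK_ERROR };

/* Map a socket error number to a check result. */
static enum check_result
classify(int err)
{
	if (err == 0)
		return CHECK_UP;
	return err == ECONNREFUSED ? CHECK_REFUSED : CHECK_ERROR;
}

/* Try a TCP connect to ip:port, waiting at most timeout_ms milliseconds. */
static enum check_result
check_connect(const char *ip, uint16_t port, int timeout_ms)
{
	struct sockaddr_in sin;
	struct pollfd pfd;
	socklen_t len;
	int s, err, n;

	assert(ip != NULL);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(port);
	if (inet_pton(AF_INET, ip, &sin.sin_addr) != 1)
		return CHECK_ERROR;

	if ((s = socket(AF_INET, SOCK_STREAM, 0)) == -1)
		return CHECK_ERROR;
	if (fcntl(s, F_SETFL, O_NONBLOCK) == -1)
		goto fail;

	if (connect(s, (struct sockaddr *)&sin, sizeof(sin)) == -1 &&
	    errno != EINPROGRESS) {
		err = errno;
		close(s);
		return classify(err);
	}

	/* Wait until the socket becomes writable or the timeout expires. */
	pfd.fd = s;
	pfd.events = POLLOUT;
	n = poll(&pfd, 1, timeout_ms);
	if (n == 0) {
		/* Handshake not finished in time: close and report a timeout. */
		close(s);
		return CHECK_TIMEOUT;
	}
	if (n == -1)
		goto fail;

	/* The connect finished; SO_ERROR says whether it succeeded. */
	len = sizeof(err);
	if (getsockopt(s, SOL_SOCKET, SO_ERROR, &err, &len) == -1)
		goto fail;
	close(s);
	return classify(err);
fail:
	close(s);
	return CHECK_ERROR;
}
```

With a timeout of 0 the poll() call returns at once, so a remote host a couple of milliseconds away would typically be reported as CHECK_TIMEOUT no matter how healthy it is. That is the kind of behavior the trace above suggests.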


Re: relayd http check connection failures; hoststated operates correctly

by Ben Lovett

On Wed, Feb 27, 2008 at 06:28:40PM +0100, Pierre-Yves Ritschard wrote:
> Please try with an insanely high value (10 seconds) and see if you still
> get a connection timeout message.
>
> To make logging more meaningful you can try with this diff and send me
> the relayd -dv output:

I can't set the timeout to 10s (relayd complains of "global timeout exceeds
interval").
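
A 10 second timeout is rejected while 9999 is accepted, which suggests the parser requires the timeout to be strictly below the check interval. A minimal sketch of such a rule follows. The function name, the 10000ms interval and the strict comparison are assumptions inferred from the behavior reported in this thread, not taken from the relayd source.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed check interval in milliseconds (not confirmed by the thread). */
#define ASSUMED_INTERVAL_MS 10000

/*
 * Hypothetical validation rule: a check must be able to finish before
 * the next one is scheduled, so the timeout has to stay below the interval.
 */
static bool
timeout_fits_interval(long timeout_ms, long interval_ms)
{
	assert(timeout_ms >= 0 && interval_ms > 0);
	return timeout_ms < interval_ms;
}
```

Under these assumptions timeout_fits_interval(9999, ASSUMED_INTERVAL_MS) holds and timeout_fits_interval(10000, ASSUMED_INTERVAL_MS) does not, so testing a larger timeout would require raising the interval first.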

Here are the results with your diff:

# obj/relayd -dv -f /root/relayd.conf        
startup
init_filter: filter init done
tcp_write: connect timed out
relay_privinit: adding relay www
init_tables: created 0 tables
hce_notify_done: dd.ee.ff.209 (tcp_write: connect timed out)
protocol 0: name http
host dd.ee.ff.209, check http code (2ms), state unknown -> down, availability 0.00%
        flags: 0x0004
tcp_write: connect timed out
        type: hce_notify_done: dd.ee.ff.211 (tcp_write: connect timed out)
http
host dd.ee.ff.211, check http code (3ms), state unknown -> down, availability 0.00%
                pfe_dispatch_imsg: state -1 for host 3 dd.ee.ff.209
request pfe_dispatch_imsg: state -1 for host 2 dd.ee.ff.211
append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
                request append "$REMOTE_ADDR" to "X-Forwarded-For"
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
relay_init: max open files 1024
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
adding 2 hosts from table webhosts:80
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
relay_launch: running relay www
tcp_write: connect timed out
hce_notify_done: dd.ee.ff.209 (tcp_write: connect timed out)
tcp_write: connect timed out
hce_notify_done: dd.ee.ff.211 (tcp_write: connect timed out)
^Chost check engine exiting
kill_tables: deleted 0 tables
flush_rulesets: flushed rules
pf update engine exiting
socket relay engine exiting
socket relay engine exiting
socket relay engine exiting
socket relay engine exiting
socket relay engine exiting
terminating

The configuration file I'm using:

# cat /root/relayd.conf                                                    
ext_addr="aa.bb.cc.114"
webhost1="dd.ee.ff.209"
webhost2="dd.ee.ff.211"

timeout 9999

table <webhosts> { $webhost1 $webhost2 }

http protocol http {
        header append "$REMOTE_ADDR" to "X-Forwarded-For"
        header append "$SERVER_ADDR:$SERVER_PORT" to "X-Forwarded-By"
        tcp { nodelay, sack, socket buffer 65536, backlog 128 }
}

relay www {
        listen on $ext_addr port 80
        protocol http

        forward to <webhosts> port http mode loadbalance \
                check http "/" host www.mysite.com code 200
}

Ben


Re: relayd http check connection failures; hoststated operates correctly

by Brad Arrington

Hi Ben,

Try raising the interval value.
I tested it and the results are the same (with the timeout set to 10 seconds).

-Brad



Re: relayd http check connection failures; hoststated operates correctly

by Armin Wolfermann

* Ben Lovett <ben@...> [23.02.2008 01:22]:
> could someone perhaps shed some light on what i'm doing wrong, if
> anything? perhaps a bug in the http check/tcp check code?

Looks like a bug in the parser. Table options are not copied to derived
tables. Suggested fix:

Index: parse.y
===================================================================
RCS file: /cvs/src/usr.sbin/relayd/parse.y,v
retrieving revision 1.109
diff -u -r1.109 parse.y
--- parse.y 27 Feb 2008 15:36:42 -0000 1.109
+++ parse.y 28 Feb 2008 16:21:20 -0000
@@ -2052,6 +2052,10 @@
  }
  tb->conf.flags |= dsttb->conf.flags;
 
+ /* Inherit table options */
+ tb->conf.timeout = dsttb->conf.timeout;
+ strlcpy(tb->conf.demote_group, dsttb->conf.demote_group, sizeof(tb->conf.demote_group));
+
  /* Copy the associated hosts */
  bzero(&tb->hosts, sizeof(tb->hosts));
  TAILQ_FOREACH(dsth, &dsttb->hosts, entry) {

If you need a quick workaround, duplicate your global timeout in every
forward statement.

Regards,
Armin Wolfermann
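
The effect of the missing inheritance can be modeled in a few lines of C. This is a sketch under stated assumptions: struct table_conf, DEMOTE_GROUP_LEN and both function names are hypothetical simplifications of the parser state, and snprintf() stands in for strlcpy() so the example builds on systems without it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

#define DEMOTE_GROUP_LEN 16	/* hypothetical buffer size */

/* Simplified, hypothetical view of a table's configuration. */
struct table_conf {
	unsigned int	flags;
	struct timeval	timeout;
	char		demote_group[DEMOTE_GROUP_LEN];
};

/* Model of the reported bug: the derived table only inherits the flags. */
static void
derive_flags_only(struct table_conf *tb, const struct table_conf *dsttb)
{
	assert(tb != NULL && dsttb != NULL);
	memset(tb, 0, sizeof(*tb));	/* assumed: derived table starts zeroed */
	tb->flags |= dsttb->flags;
}

/* Model of the suggested fix: timeout and demote group are copied too. */
static void
derive_with_options(struct table_conf *tb, const struct table_conf *dsttb)
{
	derive_flags_only(tb, dsttb);
	tb->timeout = dsttb->timeout;
	snprintf(tb->demote_group, sizeof(tb->demote_group), "%s",
	    dsttb->demote_group);
}
```

In this model a table derived with derive_flags_only() ends up with a zero timeout even when the source table was configured with 9999ms. That would make every check time out immediately, regardless of the global setting, and would match the 2-3ms failures in the logs.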


Re: relayd http check connection failures; hoststated operates correctly

by Wijnand Wiersma

Armin Wolfermann wrote:
> If you need a quick workaround duplicate your global timeout in every
> forward statement.
>  
That is indeed a working workaround.

However, it seems that nothing is actually loaded. For example,
 pfctl -a relayd -s Tables
returns nothing.
So maybe there are more things broken in the parser?

Wijnand