
Re: Reconnect causes slowdown in 7.2.1

Bryan Stansell bryan@conserver.com
Sun, 12 Jan 2003 12:21:28 -0800 (PST)


On Tue, Jan 07, 2003 at 03:49:33PM -0800, Aaron Burt wrote:
> From 1+ up to about 18 seconds to connect, depending on how many consoles
> are in reinit.  Keystroke-to-host response times are typically about half
> the connect time.  I timed by counting seconds, so the accuracy leaves
> something to be desired.
>
> Strangely, Conserver commands and responses took around 2 seconds
> consistently whether 1 or 7 consoles were in reinit.  This was true for
> ^Ec commands and for "console down" messages when sending keystrokes to
> downed consoles.

ok...this is very bizarre (well, unexpected in my mind), but there has
to be a good reason for it.  what, i don't know, but maybe we can track
it down.

> Indeed.  The ability to turn off auto-reinit, for one.  I'll have to see
> if I can find a way to dump a list of consoles in reinit and to force a
> console down/up.  With that, I should be able to automagically find and
> fix blocked ports, which is a common problem after network outages and
> suchlike.

the 'console -i' output shows lots of data and is there for just this
purpose.  hopefully it has enough for what you need.

> > (although a second can seem like a long time too).  if you're seeing
> > very long pauses, it's more likely the call to connect() that's
> > hanging.  was your terminal server actively rejecting the
> > reverse-telnet connections, or is it just half-opening the socket?
> 
> It was sending "port in use" or some such and then dropping the
> connection.

that helps.  then, yeah, you're hitting just about every sleep call and
probably in rapid succession.  it still would be interesting to see
truss/strace/whatever output of a child process that was busily trying
to bring up a port.  and you would benefit from the new code in that
you could turn off the immediate auto-bringup of the console and let it
kick in only every minute or longer.

so, i *think*, now that i'm at the end of your message, that i
understand what's going on.  at least, i have a good guess (your note
of it getting a 'port in use' and being dropped was the key).  if
you're logging all those ports, you should have a heck of a set of
large logfiles, huh?

aside from the "work-around" possibilities (like the new-for-you
auto-retry options), i don't have much else that can help.  but it is
making me think of ways to redo the code so i can get rid of the sleep
statements and, hopefully, reduce or remove the noticeable delays.
dunno if or when it'll become code, so try the work-arounds for now.
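
to make the "get rid of the sleep statements" idea concrete, here is a
rough sketch in python.  it is not conserver code (conserver is C), and
the names RetryQueue, schedule, timeout and pop_due are made up for
illustration.  the assumption is that each downed console gets a "next
retry" deadline, and the main loop hands the time until the earliest
deadline to select() as its timeout instead of sleeping, so client
keystrokes keep being serviced while ports wait to be retried.

```python
import heapq


class RetryQueue:
    """Tracks when each downed console is next due for a reconnect attempt.

    Hypothetical helper for illustration only; not part of conserver.
    """

    def __init__(self, interval=60.0):
        self.interval = interval  # seconds between attempts (assumed value)
        self._heap = []           # min-heap of (due_time, console_name)

    def schedule(self, name, now):
        """Queue a console for another attempt `interval` seconds after `now`."""
        heapq.heappush(self._heap, (now + self.interval, name))

    def timeout(self, now):
        """Seconds select() may block before a retry is due, or None if idle."""
        if not self._heap:
            return None
        return max(0.0, self._heap[0][0] - now)

    def pop_due(self, now):
        """Remove and return the consoles whose retry time has arrived."""
        due = []
        while self._heap and self._heap[0][0] <= now:
            due.append(heapq.heappop(self._heap)[1])
        return due
```

in a real loop, something like select.select(fds, [], [],
queue.timeout(time.monotonic())) would wait on client and console
sockets, and the names returned by pop_due() would get a fresh
connect() attempt afterwards.  a port that answers "port in use" is
simply scheduled again, so nothing blocks in between.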

Bryan