Re: console -u stops, gets Connection timed out

Mark Wedel Mark.Wedel@Sun.Com
Fri, 5 Jan 2007 14:24:29 -0800 (PST)

Bryan Stansell wrote:

52044 is the port number that the master processes expected a
sub-process to be listening on (one that actually handles console
connections).  for whatever reason, that sub-process is either not
picking up the connection or the master process hasn't realized
something was wrong and taken it out of the list of sub-processes (many
possibilities here - bug dealing with SIGHUP, bug dealing with reaping
children, etc).  if any of the conserver processes is still lingering in
a bad state (say, you find the one that has that socket open but it's
wedged or looping), killing it off should clear things up (the master
would reap it, clean up its list, respawn another, etc).  it would be
interesting to know if any consoles are missing from the -u output...it
could help narrow the possibilities of how it gets into the broken state.

OK, I found some more details, including the process that is responsible for that port:

8211:   conserver -d
 ff21fe5c write    (21, 10c428, 200)
 00030220 FileWrite (11d688, 3a400, 10c428, 400, 1, 0) + 2e0
 0001ee74 FlushConsole (c7c80, ffbffbf8, ffbffb78, ffffffff, ffbffbf8, 0) + 728
 0001fee0 Kiddie   (b0e08, 4cde8, 4c354, 4c2d4, 3, 4cc00) + dec
 00020660 Spawn    (11d4d0, ffffffff, 11d4d0, cb37, 0, 4d90d) + 3e4
 00022d94 main     (4b400, ffbffdec, ffbffdf8, 4ce04, 0, 0) + db8
 000152e4 _start   (0, 0, 0, 0, 0, 0) + 5c
# ksh -o vi
# truss -f -p 8211
8211:   write(33, 0x0010C428, 512)      (sleeping...)
8211:       Received signal #1, SIGHUP, in write() [caught]
8211:   write(33, "1B [ 2 5 ; 7 5 H1B [ 2 5".., 512)    Err#4 EINTR
8211:   setcontext(0xFFBFF768)
8211:   write(33, 0x0010C428, 512)      (sleeping...)

The SIGHUP is most likely there because I have an automated script that generates the conserver console database (pulling the information from another database).
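
For what it's worth, the pattern in the truss output (write sleeping, signal caught, EINTR, same write issued again) can be reproduced in miniature. This is only a sketch in Python, not conserver code, and it rests on assumptions: a pipe stands in for the pty on FD 33, SIGALRM stands in for SIGHUP, and `interrupted_write_demo` is a made-up name. Unlike the wedged process, the handler here drains the pipe so that the retried write can finish and the demo terminates.

```python
import os
import signal


def interrupted_write_demo():
    """Block in write() on a full pipe, get interrupted by a signal, then retry."""
    read_fd, write_fd = os.pipe()
    events = []

    # Fill the pipe without blocking so that the next blocking write must sleep.
    os.set_blocking(write_fd, False)
    try:
        while True:
            os.write(write_fd, b"x" * 4096)
    except BlockingIOError:
        pass  # pipe is full
    os.set_blocking(write_fd, True)

    def on_alarm(signum, frame):
        # Stand-in for the SIGHUP handler. It also empties the pipe, which is
        # what never happens for the wedged process (nothing drains the pty).
        events.append("signal caught")
        os.set_blocking(read_fd, False)
        try:
            while os.read(read_fd, 65536):
                pass
        except BlockingIOError:
            pass  # pipe is empty

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, 0.2)
    try:
        # Sleeps like write(33, ...) in the trace. The signal interrupts the
        # system call with EINTR; Python runs the handler and then reissues
        # the write itself, which now succeeds because the pipe was drained.
        written = os.write(write_fd, b"y" * 512)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)
        os.close(read_fd)
        os.close(write_fd)
    return events, written
```

If the handler did not drain the pipe, the reissued write would simply go back to sleep, which matches the last line of the truss output.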

FD 33:
conserver 8211 root 33u VCHR 23,159 0t27975504 641133 /devices/pseudo/clone@0:ptmx->ptm

I am running conserver 8.1.14 on SPARC Solaris 9.

I can see which consoles are being served by that process, and console -u <host> on them also times out. I'm presuming they are all missing from the console -u (no console specified) output.

It sounds like just killing 8211 should fix the problem (the master process will see it died and restart it anew). I don't know whether this is a problem you want further debugging data on.
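
To make the expected recovery concrete, here is a rough Python sketch of the reap, clean up, and respawn cycle described above. It is not how conserver implements it; `spawn_worker`, `reap_and_respawn`, and `kill_and_recover_demo` are hypothetical names, and the child just sleeps instead of serving consoles.

```python
import os
import signal
import time


def spawn_worker():
    """Fork a child that stands in for a console-handling sub-process."""
    pid = os.fork()
    if pid == 0:
        try:
            signal.signal(signal.SIGTERM, signal.SIG_DFL)
            while True:
                signal.pause()  # idle until a signal terminates the child
        finally:
            os._exit(0)
    return pid


def reap_and_respawn(workers):
    """Reap exited children, drop them from the list, start replacements."""
    respawned = []
    for pid in list(workers):
        done, _status = os.waitpid(pid, os.WNOHANG)
        if done == pid:  # this child has exited and is now reaped
            workers.remove(pid)
            new_pid = spawn_worker()
            workers.append(new_pid)
            respawned.append((pid, new_pid))
    return respawned


def kill_and_recover_demo():
    workers = [spawn_worker()]
    wedged = workers[0]
    os.kill(wedged, signal.SIGTERM)  # the manual "kill 8211" step

    respawned = []
    deadline = time.time() + 5
    while not respawned and time.time() < deadline:
        respawned = reap_and_respawn(workers)
        time.sleep(0.05)

    # Clean up whatever is still running so the demo leaves nothing behind.
    for pid in workers:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)
    return wedged, respawned
```

Calling `kill_and_recover_demo()` returns the PID that was killed and a list of (old PID, replacement PID) pairs, which should have exactly one entry.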