Re: down consoles fail to reconnect automatically

Lonni J Friedman netllama@gmail.com
Fri, 31 Jul 2015 17:41:54 GMT

Hi Bryan,
I was wondering if you had any ideas about this issue?  Or if you
needed any more info from me to investigate further?
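
In case it helps narrow this down, here is a minimal sketch (plain Python,
not part of conserver) of how the server log could be scanned for consoles
that time out on an automatic attempt but come up after a manual login.
It assumes one log entry per line in the format of the excerpt quoted
below. The function names are hypothetical, and the "other" label for
attempts with no logged trigger (for example after 'console -z bringup')
is my own guess at how those would appear.

```python
import re
from collections import Counter, defaultdict

# Assumed entry format (from the log excerpt in this thread), e.g.:
# [Thu Jul 23 14:05:12 2015] conserver (13867): ERROR: [c042.ytr001.ix] connect timeout: forcing down
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] conserver \(\d+\): (?:ERROR: )?"
    r"\[(?P<name>[^\]]+)\] (?P<msg>.*)$"
)


def classify(msg):
    """Map a log message to a coarse event kind, or None if irrelevant."""
    if msg.startswith("automatic reinitialization"):
        return "auto"
    if msg.startswith("login "):
        return "login"
    if msg == "console initializing":
        return "init"
    if msg.startswith("connect timeout"):
        return "timeout"
    if msg == "console up":
        return "up"
    return None


def summarize(lines):
    """Count (trigger, outcome) pairs per console name."""
    trigger = {}  # console name -> what started the pending attempt
    counts = defaultdict(Counter)
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        name, kind = m.group("name"), classify(m.group("msg"))
        if kind in ("auto", "login"):
            trigger[name] = kind
        elif kind == "init":
            # No logged trigger before the init: label it "other" (assumption).
            trigger.setdefault(name, "other")
        elif kind in ("timeout", "up") and name in trigger:
            counts[name][(trigger.pop(name), kind)] += 1
    return counts


def stuck_until_manual(counts):
    """Consoles that timed out without a login but came up after one."""
    return sorted(
        name
        for name, c in counts.items()
        if c[("auto", "timeout")] + c[("other", "timeout")] > 0
        and c[("login", "up")] > 0
    )
```

Passing an open log file to summarize() and the result to
stuck_until_manual() would list the consoles showing this pattern.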


On Tue, Jul 28, 2015 at 7:02 AM, Lonni J Friedman <netllama@gmail.com> wrote:
> On Sun, Jul 26, 2015 at 2:43 AM, Bryan Stansell <bryan@conserver.com> wrote:
>> I keep looking at code and thinking about this, and the only thing that makes sense is a bug somewhere.  When a SIGUSR1 or '-z bringup' happens, it just walks the consoles and performs a ConsInit() on them.  The exact same thing happens when you connect to a console and "open" it (with some extra stuff for feedback to the client).  So, my only explanation is that it's a bug somewhere.
>> And just to clarify, if you run 'console -z bringup' multiple times, they continue to get "connect timeout: forcing down" messages? But as soon as you connect to one, it'll come up on the first try?  I just want to make sure the situation is correct so I can, hopefully, think about how a bug might produce the situation and try and find a fix.  Right now, though, I'm just scratching my head.
> Yes, that's exactly the behavior I've seen.
>> Bryan
>>> On Jul 24, 2015, at 10:36 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>>> Hi Bryan,
>>> When I run "console -v -z bringup", I see a lot of "console
>>> initializing" for every session that is currently down.  Then 10
>>> seconds later, I see:
>>> connect timeout: forcing down
>>> for every console that was previously listed as initializing.
>>> For a console which was down (c042.ytr001.ix), and where I manually
>>> connected and brought it up immediately, I see:
>>> [Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] automatic reinitialization
>>> [Thu Jul 23 14:05:02 2015] conserver (13867): [c042.ytr001.ix] console initializing
>>> [Thu Jul 23 14:05:12 2015] conserver (13867): ERROR: [c042.ytr001.ix] connect timeout: forcing down
>>> [Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] login ncconserverprod@localhost
>>> [Thu Jul 23 14:05:23 2015] conserver (13867): [c042.ytr001.ix] console initializing
>>> [Thu Jul 23 14:05:26 2015] conserver (13867): [c042.ytr001.ix] console up
>>> Unfortunately, I don't currently have any consoles in the weird state
>>> of failing to re-initialize automatically, yet coming up immediately
>>> with a manual console session, so I can only look at what was logged
>>> yesterday.
>>> Let me know if you need any other info.
>>> thanks
>>> On Fri, Jul 24, 2015 at 12:23 AM, Bryan Stansell <bryan@conserver.com> wrote:
>>>> What you’re doing sounds all correct, as are your expectations (it should attempt to bring up any downed consoles).  My simple test setup shows that it works for me, but with lots of consoles, there could be a bug or some side-effect that happens with more.  Or possibly some config settings that aren’t playing well together.  Do you have any “interesting” messages in the conserver log file that appear when you run the command?
>>>> Bryan
>>>>> On Jul 23, 2015, at 10:59 AM, Lonni J Friedman <netllama@gmail.com> wrote:
>>>>> Greetings,
>>>>> I'm running conserver-8.2.1 on an Ubuntu-14.04.2 server, with several
>>>>> thousand clients, connected over IPMI.  Most of the time it works
>>>>> fine, however occasionally we lose a VPN concentrator that maintains a
>>>>> VPN tunnel between remote sites and the console server, and a large
>>>>> number of console sessions go into the 'down' state.  Usually when the
>>>>> tunnel comes back up, the console sessions come back up on their own,
>>>>> however there are times when they do not come back up for hours, or
>>>>> not at all for no obvious reason.  In nearly 100% of those cases, if
>>>>> someone manually runs 'console $consoleName' (where $consoleName is
>>>>> the name of the console session that is listed as 'down'), it will
>>>>> immediately come back up.
>>>>> According to the 'console' man page (
>>>>> http://www.conserver.com/docs/console.man.html ), if I invoke
>>>>> 'console' with:
>>>>> -z bringup
>>>>> it should "Try to connect all consoles marked as down (this is
>>>>> equivalent to sending the server a SIGUSR1)".  I've tried that:
>>>>> ####
>>>>> $ console -v -z bringup
>>>>> console: interface address (lo)
>>>>> console: interface address (eth0)
>>>>> ok -- bringing up consoles
>>>>> ####
>>>>> However it doesn't seem to do anything at all.  None of the down
>>>>> consoles come up ever.  Yet I can still force them up manually if I
>>>>> connect to them one at a time.
>>>>> I'm unclear whether I'm misunderstanding how the 'bringup' command is
>>>>> intended to work, or if there's a bug somewhere.
>>>>> Can someone comment?
>>>>> thanks!