
Re: what is normal conserver hang during reconfig

Bryan Stansell bryan@conserver.com
Wed, 19 Oct 2016 03:39:20 GMT


Off the top of my head, I agree: I can't think of anything fixed in the newer code that would address this.  The code does block all activity while it processes a HUP signal, but that's supposed to be "quick".  :-|

Each process (the main and children) rereads the config file and figures out if there's anything to do.  The main process is in charge of spawning new (or reconfigured) consoles, and the children are responsible for letting go of old (or reconfigured) ones.

With that in mind, how many consoles is each child managing?  The compile-time default can be seen with a "conserver -V", but it can be overridden with -m.  I'm honestly not sure if having more or fewer would be better or even change things (more processes would use more cores, but would also "slam" the system with that many processes reading and processing the config at once).
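
As a rough illustration of the -m trade-off, here is a small sketch that estimates how many children would each reread the config on a HUP.  It assumes a simplified model in which each child takes up to -m consoles and all 262 consoles from the original report are managed locally; the -m values tried are arbitrary examples, not conserver defaults.

```python
import math
import sys


def child_count(total_consoles: int, per_child: int) -> int:
    """Estimate the number of child processes under a simplified model
    where every child manages at most `per_child` consoles."""
    if per_child < 1:
        raise ValueError("per_child must be at least 1")
    return math.ceil(total_consoles / per_child)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python children.py 262
    total = int(sys.argv[1])
    for m in (8, 16, 32, 64):  # example -m values only
        print(f"-m {m}: about {child_count(total, m)} children")
```

Fewer, larger children means fewer simultaneous config rereads but more consoles stalled behind each one; the numbers only show the shape of that trade-off.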

Conserver tries very hard to multiplex across all the consoles, even when bringing up and tearing down things.  The reread of the config puts all that on hold, so the hang probably has to do with that.

One issue I've seen before is the magnitude of DNS lookups done when a config is loaded.  It all depends on the config, of course, but you could end up generating a lot of requests.  Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble.
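
To get a feel for whether DNS is a factor, one option is to time a lookup for every name the config would resolve.  The sketch below is only an approximation: it assumes the config lists one keyword per line and only looks at names following a `host` keyword, it resolves them with Python's resolver rather than the way conserver itself does, and the helper names are made up for this example.

```python
import re
import socket
import sys
import time

# Assumption: one keyword per line, e.g. "    host ts1.example.com;"
HOST_RE = re.compile(r"^\s*host\s+([^;\s]+)\s*;", re.MULTILINE)


def extract_hosts(config_text):
    """Return unique names following a `host` keyword, in first-seen order."""
    seen = {}
    for name in HOST_RE.findall(config_text):
        seen.setdefault(name, None)
    return list(seen)


def time_lookups(hosts, resolver=socket.getaddrinfo):
    """Resolve each host once; return (host, seconds, ok) tuples, slowest first."""
    results = []
    for host in hosts:
        start = time.perf_counter()
        try:
            resolver(host, None)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((host, time.perf_counter() - start, ok))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python dnscheck.py /path/to/conserver.cf
    with open(sys.argv[1]) as fh:
        names = extract_hosts(fh.read())
    for host, secs, ok in time_lookups(names):
        print(f"{secs:8.3f}s  {'ok' if ok else 'FAILED'}  {host}")
```

Slow or failing names at the top of the output would be the first things to check; a resolver cache on the machine can hide the problem on a second run.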

Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice).  It would be interesting to look at an strace (assuming linux) of a process when it gets a HUP (even without any changes to configs).  Just send one of the children a HUP so it minimizes the impact.  With timestamps, it might highlight what is causing the issue (like the DNS query case, but could be anything).
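
For the strace suggestion, something like "strace -tt -T -p PID -o hup.trace" (the output file name is just an example) records a wall-clock timestamp and the time spent in each system call.  A minimal sketch for summarizing such a trace follows; it assumes the usual -tt/-T line layout, skips signal and unfinished/resumed lines, and uses function names invented for this example.

```python
import re
import sys
from collections import defaultdict

LINE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional pid prefix (seen with -f)
    r"(\d{2}:\d{2}:\d{2}\.\d{6})\s+"   # -tt timestamp
    r"(\w+)\(.*?<(\d+\.\d+)>\s*$"      # syscall name ... -T duration
)


def parse_trace(lines):
    """Yield (timestamp, syscall, seconds) for lines carrying a -T duration."""
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            yield m.group(1), m.group(2), float(m.group(3))


def slowest_calls(lines, top=10):
    """Return the `top` individual calls that took the longest."""
    calls = sorted(parse_trace(lines), key=lambda c: c[2], reverse=True)
    return calls[:top]


def time_by_syscall(lines):
    """Return total seconds spent per syscall name."""
    totals = defaultdict(float)
    for _, name, secs in parse_trace(lines):
        totals[name] += secs
    return dict(totals)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical script name): python hupstat.py hup.trace
    with open(sys.argv[1]) as fh:
        trace = fh.read().splitlines()
    for ts, name, secs in slowest_calls(trace):
        print(f"{secs:10.6f}s  {ts}  {name}")
```

If the slowest entries turn out to be poll or recvfrom calls on a socket to port 53, that points at the DNS case; anything else near the top is worth a closer look in the raw trace.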

Bryan 

> On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> 
> Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
> still thinking I've configured something wrong.  SIGHUP says conserver
> rereads the config files and then adds/deletes consoles as needed and
> only touches running consoles if they have changed.  If that's true I
> wouldn't expect a 30s buffer of input/output on a console that hasn't
> changed, should I?
> I also don't see anything in CHANGES that sounds like this is a bug
> that has been fixed.
> 
> -denis
> 
> On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
>> I love conserver.  I have  a minor issue and I was curious what options
>> there might be.
>> 
>> So I have a conserver setup running against 262 servers (mostly digis or
>> ser2net machines).  It works great.  However when we need to update due
>> to a config change we run "kill -HUP" against the parent.  With the
>> number of consoles (I think) this causes about a 30s "hang" when
>> interacting with any console which corresponds to the reconfig time.
>> 
>> Does this make sense and is per the current design?  Any chance there is
>> a clever way to make it block for less time?  Barring that I intend to
>> spin up a new server to share the load of my current server and reduce
>> the reconfig time.
>> 
>> I was mostly curious if there was a config issue or if this description
>> doesn't make any sense to folks and it means I have something else going
>> on like too many down consoles or something.
>> -denis
>> 
>> -- 
>> __________________________
>> Denis Alan Hainsworth     
>> denis.hainsworth@gmail.com
> 
> -- 
> __________________________
> Denis Alan Hainsworth     
> denis.hainsworth@gmail.com