Re: what is normal conserver hang during reconfig

Denis Hainsworth denis.hainsworth@gmail.com
Mon, 24 Oct 2016 00:46:39 GMT

So I think I have a solution which avoids the issue rather than fixing
anything :)

I had to try to recall why we have things the way we do, since it's
going on 10 years or more (have I said I love conserver?)

So back in the day we set up the single conserver instance which was
obviously going to be the master.  We later started populating a couple
other servers at a few different sites.  To keep things simple and
robust every server got the same config set and could be the master.
However, in reality I'm pretty sure no one ever uses anything but the
default master.  So if I trim the configs on all the slave servers down
to only the configs they actually own (especially on the ones that
reconfig the most and cause users the most grief when it takes 40s),
my reconfig times drop to 3s and 13s respectively on the two servers
I've been comparing.
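
For anyone wanting to check numbers like these on their own setup, here
is a rough sketch (not something from the conserver distribution) that
summarizes an strace capture taken across a SIGHUP.  It assumes the
capture was made with "strace -t" against a single pid, so every line
starts with an HH:MM:SS stamp and has no pid prefix; the function name
summarize is made up for illustration.

```python
import re
from datetime import datetime

# Matches syscall lines as written by `strace -t` on one pid, e.g.
#   10:00:00 open("/etc/hosts", O_RDONLY) = 3
LINE_RE = re.compile(r'^(\d{2}:\d{2}:\d{2}) (\w+)\(')

def summarize(lines):
    """Return (elapsed_seconds, {syscall_name: count}) for the capture."""
    first = last = None
    counts = {}
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip signal markers, exit notices, blank lines
        stamp = datetime.strptime(m.group(1), "%H:%M:%S")
        if first is None:
            first = stamp
        last = stamp
        name = m.group(2)
        counts[name] = counts.get(name, 0) + 1
    if first is None:
        return 0, counts
    elapsed = int((last - first).total_seconds())
    if elapsed < 0:
        elapsed += 86400  # capture crossed midnight
    return elapsed, counts
```

Something like summarize(open("strace.out.2")) then gives the wall-clock
span of the capture plus how many open/read/close calls it contained,
which is enough to compare a full config against a trimmed one.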

Not a perfect solution, but it should require a minimum of changes for
everyone involved.  Hopefully no one will remind me tomorrow of
something I forgot :)

-denis (purveyor of good enough solutions)

On Sun, Oct 23, 2016 at 01:57:35PM -0400, Denis Hainsworth wrote:
> Dang it, my theory didn't pan out.  While the slower of the two did in
> fact have slower disks my IT was able to move the VM to some ultra fast
> storage and my reconfig loop wasn't any faster.  :(   And it was such a
> lovely theory too.
> So I'm still digging to see if I can come up with a second clever idea
> but I have a feeling that to reduce the reconfig time I'll just have to
> spread the load over more systems.
> -denis
> On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> > I'm glad you were able to find the source of "most" of your troubles.  I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring.  The code that does that never got folded into the loop that handles I/O, but could...and really should.  No one has ever called it out as a serious enough problem before.  :-)
> > 
> > I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
> > 
> > Bryan
> > 
> > > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> > > 
> > > Finally got time to look at things.  strace is perfect, thanks for
> > > suggesting that.
> > > 
> > > So running something like
> > > strace -t -o strace.out.2 -p 3198  
> > > and sending a SIGHUP to the parent process showed the issue.
> > > 
> > > So the way we've always set things up was to automatically generate one
> > > config file per console server from our equipment database.  This means
> > > there are 264 files that are #included into the main config file.
> > > The first 30s of "hang" is each process opening each file, reading it
> > > in, and closing it.  I'm wondering if we need to block I/O during this,
> > > or perhaps that could be done before we start blocking?
> > > Once that is done there is another 10s of hang while we do the DNS
> > > lookup for each console host, as you thought (open /etc/hosts, make a
> > > DNS query, resolve it).
> > > 
> > > I tried putting all the configs into one file but that didn't change
> > > anything.  So then I started wondering.  Our IT had long ago made the
> > > console servers VMs.  It's never seemed like an issue, but I compared
> > > some basic dd commands and found my problem server has terrible I/O
> > > throughput ... sigh.  To compare, one of my good servers has about
> > > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.
> > > 
> > > So I'm going to look into moving the VM or getting the disk perf up,
> > > which should solve most of my issues, but I also wonder if the
> > > conserver code could be re-organized without too much trouble to avoid
> > > blocking when the disk is slow?  It's possible what I'm asking is dumb,
> > > just throwing it out there.
> > > 
> > > -denis
> > 
> > 
> -- 
> __________________________
> Denis Alan Hainsworth     
> denis.hainsworth@gmail.com
