Re: what is normal conserver hang during reconfig

Denis Hainsworth denis.hainsworth@gmail.com
Sun, 23 Oct 2016 17:57:41 GMT


Dang it, my theory didn't pan out.  While the slower of the two did in
fact have slower disks, my IT was able to move the VM to some ultra fast
storage and my reconfig loop wasn't any faster.  :(   And it was such a
lovely theory too.

So I'm still digging to see if I can come up with a second clever idea,
but I have a feeling that to reduce the reconfig time I'll just have to
spread the load over more systems.
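
For anyone who wants to check where the time goes without strace, here
is a rough sketch that times the two phases from the trace separately:
opening/reading/closing a set of config files, and resolving a list of
console hostnames.  The function names and the file and host lists you
feed it are made up for illustration.  This is not conserver code and it
only approximates what the trace showed, so treat the numbers as a hint.

```python
import socket
import time


def time_file_reads(paths):
    """Open, read, and close each file in turn.

    Returns (elapsed_seconds, total_bytes_read).
    `paths` is a hypothetical list of config file paths.
    """
    start = time.monotonic()
    total = 0
    for path in paths:
        with open(path, "rb") as fh:
            total += len(fh.read())
    return time.monotonic() - start, total


def time_lookups(hosts, resolve=socket.gethostbyname):
    """Resolve each hostname one after another.

    Returns (elapsed_seconds, list_of_hosts_that_failed).
    `hosts` is a hypothetical list of console host names; `resolve`
    can be swapped for a stub when testing.
    """
    start = time.monotonic()
    failed = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:  # socket.gaierror is a subclass of OSError
            failed.append(host)
    return time.monotonic() - start, failed
```

Note that a second run over the same files will mostly be served from
the page cache, so only the first run says much about the disk.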

-denis

On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> I'm glad you were able to find the source of "most" of your troubles.  I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring.  The code that does that never got folded into the loop that handles I/O, but could...and really should.  No one has ever called it out as a serious enough problem before.  :-)
> 
> I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
> 
> Bryan
> 
> > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> > 
> > Finally got time to look at things.  strace is perfect, thanks for
> > suggesting that.
> > 
> > So running something like
> > strace -t -o strace.out.2 -p 3198  
> > and sending a SIGHUP to the parent process showed the issue.
> > 
> > So the way we've always set things up was to automatically generate one
> > config file per console server from our equipment database.   This means
> > there are 264 files that are #included into the main config file. 
> > The first 30s of "hang" is each process opening each file, reading it in,
> > and closing it.  I'm wondering if we need to block I/O during this or
> > perhaps that could be done before we start blocking?
> > Once that is done there is another 10s of hang while we do the dns
> > lookup for each console host as you thought (open /etc/hosts, make a dns
> > query, resolve it).
> > 
> > I tried putting all the configs into one file but that didn't change
> > anything.  So then I started wondering.  Our IT had long ago made the
> > console servers VMs.   It's never seemed like an issue but I compared
> > some basic dd commands and found my problem server has terrible IO
> > throughput ... sigh.   To compare, one of my good servers has about
> > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.  
> > 
> > So I'm going to look into moving the VM or get the disk perf up which
> > should solve most of my issues but I also wonder if the conserver code
> > could be re-organized without too much trouble to avoid issues of
> > blocking when there is a slow disk?  It's possible what I'm asking is dumb,
> > just throwing it out there.
> > 
> > -denis
> 
> 
> _______________________________________________
> users mailing list
> users@conserver.com
> https://www.conserver.com/mailman/listinfo/users

-- 
__________________________
Denis Alan Hainsworth     
denis.hainsworth@gmail.com