From denis.hainsworth@gmail.com Wed Oct 19 01:04:27 2016
Received: from mail-qk0-f169.google.com (mail-qk0-f169.google.com
 [209.85.220.169])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J14PqL029793
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK)
 for <users@conserver.com>; Wed, 19 Oct 2016 01:04:27 GMT
Received: by mail-qk0-f169.google.com with SMTP id z190so13890320qkc.2
 for <users@conserver.com>; Tue, 18 Oct 2016 18:04:26 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=Q2Qg0XO7oHeyUbF2HwSPBvbIKM5trKyNpi+H0HIJc+4=;
 b=wsezaV+CfWcW4eaNRRijgygEcXxeWg5BSPij3yoeFcRJe/WgEmFKLRqIyWLOZgqGzR
 6o27iq7mAdN4hqnqU0eWOpgSYWr9z8TJWxzxXDJC2/JbG4HyBr/Tlh+ozseZF3vjQRtZ
 pviWhk+m68Cjwh/Md39VKjEnfXrZe1ihTrHBrp3so28Ed4rMCIgGaQ99GzATjFA4SidG
 IFyEYB+SPP4Rtc9jCC96Q+Wi4ZFPFp6fnf3KrKitq+D+UbLeNauFo6vauR/NXMhNh2Jh
 rpJQbQuj6PL3FArKFrClJ4JYGozJIhEKjJK9VlHmTuJXEzNJpwcD1mQmmZVUWAOqr0XC
 VqXw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=Q2Qg0XO7oHeyUbF2HwSPBvbIKM5trKyNpi+H0HIJc+4=;
 b=GU07QVTZOD9aheROeuF4oCV4MoAHNxQDxNxuQqQEZBAU+2V7mOUA9wsxUoa+8S0prh
 N5g5XZDm9lgb6OdyvgvpI8VRuHajJe8x8FKGdVbNYc1crLN1NyxfU/PLWe2m0uy2JewC
 mDNRKFFPQQnUMxKWTFrIjsSu4OoFofe6mN21KozbqLnc8qXZFRcygwtm05xG7gjL9Y8f
 a42P+1nw3r8pbbXCiwzqMqTz7iMDuOos1zjtbmhppkBSvI9PISpXw2CeIBN98ejJP7Qb
 DQFul4F8ErZb5SQ+589ow1xSvNTj40YOI5Xr9IjpNF9FokxGOuKZ/aNMdxPzwWB00Fpj
 ilsw==
X-Gm-Message-State: AA6/9RkEnNe87BUIYR1uZSHP3yiEvW9A5oysCH+gF+DWFgw8EkDb8b/7RT/XZSmnVbr9uw==
X-Received: by 10.55.207.12 with SMTP id e12mr3444036qkj.206.1476839064346;
 Tue, 18 Oct 2016 18:04:24 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id i207sm19522847qke.40.2016.10.18.18.04.23
 for <users@conserver.com>
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Tue, 18 Oct 2016 18:04:23 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id 9C311BFA090; Tue, 18 Oct 2016 21:04:22 -0400 (EDT)
Date: Tue, 18 Oct 2016 21:04:22 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161019010422.GQ27007@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161014160544.GZ27007@cs.brandeis.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 01:04:28 -0000

Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
still thinking I've configured something wrong.  SIGHUP says conserver
rereads the config files and then adds/deletes consoles as needed and
only touches running consoles if they have changed.  If that's true I
wouldn't expect a 30s buffer of input/output on a console that hasn't
changed, should I?
I also don't see anything in CHANGES that sounds like this is a bug
that has been fixed.

-denis

On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
> I love conserver.  I have  a minor issue and I was curious what options
> there might be.
> 
> So I have a conserver setup running against 262 servers (mostly digis or
> ser2net machines).  It works great.  However when we need to update due
> to a config change we run "kill -HUP" against the parent.  With the
> number of consoles (I think) this causes about a 30s "hang" when
> interacting with any console which corresponds to the reconfig time.
> 
> Does this make sense and is per the current design?  Any chance there is
> a clever way to make it block for less time?  Barring that I intend to
> spin up a new server to share the load of my current server and reduce
> the reconfig time.
> 
> I was mostly curious if there was a config issue or if this description
> doesn't make any sense to folks and it means I have something else going
> on like too many down consoles or something.
> -denis
> 
> -- 
> __________________________
> Denis Alan Hainsworth     
> denis.hainsworth@gmail.com

-- 
__________________________
Denis Alan Hainsworth     
denis.hainsworth@gmail.com
From bryan@conserver.com Wed Oct 19 03:39:20 2016
Received: from [192.168.0.133] (c-98-207-6-47.hsd1.ca.comcast.net
 [98.207.6.47]) (authenticated bits=0)
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9J3dIF2004878
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <users@conserver.com>; Wed, 19 Oct 2016 03:39:20 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com;
 s=s0001; t=1476848360;
 bh=EVhbaCdNZN+EogH1RjQEc+XOa5HfA1Hzg05V+lSI8is=;
 h=From:Subject:Date:References:To:In-Reply-To;
 b=cdvxnRoUa2uNi36owjIocA+rFhfsbCuNMGbszum92z8XjrzMvBQ8FJQHPX//VLOX5
 qVyfWQyb2q258ydSYBfmZs2yOlEO94OC1El4ghEWyPvtzFzCBYeVCt0XT0cASV4jAa
 kiAk9zkMUVgMN1ulBNcg6WVzUG3GkYP2uEFMI//xiGJFO9sWNMYx51IPE14n42muIX
 1XiuskNPukRMIAFqxEpESGeDNqgJOcoFqzFCMJi4tzLlwg3avjJIFJsfFkNfRGGrW3
 EnYj67jXrWyzQzcCDYJpg2QVfQPQV7JI1RpnngKchVKVjsVi/+BAfC4E5J2p6BNNNE
 ikGF8FSWBKEfw==
From: Bryan Stansell <bryan@conserver.com>
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\))
Subject: Re: what is normal conserver hang during reconfig
Date: Tue, 18 Oct 2016 20:39:19 -0700
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
To: users@conserver.com
In-Reply-To: <20161019010422.GQ27007@cs.brandeis.edu>
Message-Id: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
X-Mailer: Apple Mail (2.3226)
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org
 id u9J3dIF2004878
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 03:39:21 -0000

Off the top of my head, I agree that there shouldn't be anything fixed in the newer code to address this.  The code does block all activity when it processes a HUP signal, but that's supposed to be "quick".  :-|

Each process (the main and children) rereads the config file and figures out if there's anything to do.  The main process is in charge of spawning new (or reconfigured) consoles, and the children are responsible for letting go of old (or reconfigured) ones.

With that in mind, how many consoles are each child managing?  The compile time default can be seen with a "conserver -V", but it can be overridden with -m.  I'm honestly not sure if having more or less would be better or even change things (more processes would use more cores, but also "slam" the system with that many things reading and processing the config).
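
To get a feel for the numbers, here is a minimal sketch of the arithmetic, assuming consoles are simply split into groups of at most -m consoles with one child per group (the real grouping may be less even).  The 262 comes from the original report; the per-child values tried below are only illustrative, not known defaults.

```python
import math


def children_needed(total_consoles: int, per_child: int) -> int:
    """Child processes required if each one manages at most per_child consoles."""
    if per_child < 1:
        raise ValueError("per_child must be at least 1")
    return math.ceil(total_consoles / per_child)


if __name__ == "__main__":
    # Illustrative -m values only; check "conserver -V" for the real default.
    for per_child in (8, 16, 32, 64):
        print(f"-m {per_child}: {children_needed(262, per_child)} children")
```

Each of those children rereads and processes the whole config on a HUP, so a smaller -m means more processes doing that work at once.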

Conserver tries very hard to multiplex across all the consoles, even when bringing things up and tearing them down.  The reread of the config puts all that on hold, so the hang probably has to do with that.

One issue I've seen before is the magnitude of DNS lookups done when a config is loaded.  It all depends on the config, of course, but you could end up generating a lot of requests.  Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble.
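
One rough way to check whether name resolution is slow in a given environment is to time the lookups outside of conserver.  This is only a sketch: time_lookups is a hypothetical helper, the host names have to be supplied by hand (it does not parse conserver.cf), and it uses Python's standard resolver call rather than whatever conserver does internally.

```python
import socket
import sys
import time


def time_lookups(hosts, resolve=socket.getaddrinfo):
    """Return (host, seconds, error) for one blocking lookup per host."""
    results = []
    for host in hosts:
        start = time.monotonic()
        error = None
        try:
            resolve(host, None)
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            error = str(exc)
        results.append((host, time.monotonic() - start, error))
    return results


if __name__ == "__main__":
    # Pass the console host names on the command line.
    timings = time_lookups(sys.argv[1:])
    for host, seconds, error in sorted(timings, key=lambda r: r[1], reverse=True):
        print(f"{seconds:8.3f}s  {host}  {error or 'ok'}")
    print(f"total: {sum(seconds for _, seconds, _ in timings):.3f}s")
```

If the total comes anywhere near the length of the hang, DNS is worth a closer look; a handful of names that time out can dominate the sum.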

Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice).  It would be interesting to look at an strace (assuming Linux) of a process when it gets a HUP (even without any changes to the configs).  Just send the HUP to one of the children to minimize the impact.  With timestamps, it might highlight what is causing the issue (like the DNS query case, but it could be anything).
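
As a sketch of how such a trace could be summarized: assuming it was captured with something like "strace -f -tt -T -p <pid> -o hup.trace" (-T appends the time spent in each call as <seconds>), the script below lists the slowest system calls.  The file name and the slowest_calls helper are made up for illustration, and lines it cannot parse (signals, unfinished calls) are skipped.

```python
import re
import sys

# Matches e.g.: 12345 13:04:22.100000 read(5, "x", 1) = 1 <0.000045>
LINE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?"   # optional pid prefix (from -f)
    r"(?:\d[\d:.]*\s+)?"               # optional timestamp (from -t/-tt/-ttt)
    r"(?:(?P<call>\w+)\(|<\.\.\. (?P<resumed>\w+) resumed>)"
    r".*<(?P<secs>\d+\.\d+)>\s*$"      # duration appended by -T
)


def slowest_calls(lines, top=5):
    """Return the top (seconds, syscall, line) tuples, slowest first."""
    calls = []
    for line in lines:
        match = LINE.match(line)
        if match:
            name = match.group("call") or match.group("resumed")
            calls.append((float(match.group("secs")), name, line.rstrip()))
    calls.sort(key=lambda c: c[0], reverse=True)
    return calls[:top]


if __name__ == "__main__":
    if len(sys.argv) > 1:
        with open(sys.argv[1]) as trace:
            for secs, name, _ in slowest_calls(trace, top=10):
                print(f"{secs:10.6f}s  {name}")
```

Slow lookups would most likely show up as long waits in calls such as poll or recvfrom on a socket talking to port 53, but that is a guess until a real trace is in hand.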

Bryan 

> On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> 
> Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
> still thinking I've configured something wrong.  SIGHUP says conserver
> rereads the config files and then adds/deletes consoles as needed and
> only touches running consoles if they have changed.  If that's true I
> wouldn't expect a 30s buffer of input/output on a console that hasn't
> changed, should I?
> I also don't see anything in CHANGES that sounds like this is a bug
> that has been fixed.
> 
> -denis
> 
> On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
>> I love conserver.  I have  a minor issue and I was curious what options
>> there might be.
>> 
>> So I have a conserver setup running against 262 servers (mostly digis or
>> ser2net machines).  It works great.  However when we need to update due
>> to a config change we run "kill -HUP" against the parent.  With the
>> number of consoles (I think) this causes about a 30s "hang" when
>> interacting with any console which corresponds to the reconfig time.
>> 
>> Does this make sense and is per the current design?  Any chance there is
>> a clever way to make it block for less time?  Barring that I intend to
>> spin up a new server to share the load of my current server and reduce
>> the reconfig time.
>> 
>> I was mostly curious if there was a config issue or if this description
>> doesn't make any sense to folks and it means I have something else going
>> on like too many down consoles or something.
>> -denis
>> 
>> -- 
>> __________________________
>> Denis Alan Hainsworth     
>> denis.hainsworth@gmail.com
> 
> -- 
> __________________________
> Denis Alan Hainsworth     
> denis.hainsworth@gmail.com
> _______________________________________________
> users mailing list
> users@conserver.com
> https://www.conserver.com/mailman/listinfo/users


From cfowler@outpostsentinel.com Wed Oct 19 04:16:13 2016
Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J4GApl006512
 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO);
 Wed, 19 Oct 2016 04:16:12 GMT
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 88A7181776F2;
 Wed, 19 Oct 2016 00:16:09 -0400 (EDT)
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id Wb-Bes1my-pT; Wed, 19 Oct 2016 00:16:07 -0400 (EDT)
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 71C5481776F1;
 Wed, 19 Oct 2016 00:16:07 -0400 (EDT)
X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id 77_R5iRzTf0q; Wed, 19 Oct 2016 00:16:07 -0400 (EDT)
Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 506A48176AAE;
 Wed, 19 Oct 2016 00:16:07 -0400 (EDT)
Date: Wed, 19 Oct 2016 00:16:07 -0400 (EDT)
From: Chris Fowler <cfowler@outpostsentinel.com>
To: Bryan Stansell <bryan@conserver.com>
Cc: users@conserver.com
Message-ID: <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com>
In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
Subject: Re: what is normal conserver hang during reconfig
MIME-Version: 1.0
Content-Type: multipart/alternative; 
 boundary="----=_Part_9805267_1130606666.1476850567098"
X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194)
Thread-Topic: what is normal conserver hang during reconfig
Thread-Index: kiVqs4mkBxZhlOj7ESKvTomJUPGaMQ==
X-Spam-Score: -0.277 () BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 04:16:14 -0000

------=_Part_9805267_1130606666.1476850567098
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

> From: "Bryan Stansell via users" <users@conserver.com>
> To: users@conserver.com
> Sent: Tuesday, October 18, 2016 11:39:19 PM
> Subject: Re: what is normal conserver hang during reconfig

> With that in mind, how many consoles are each child managing? The compile time
> default can be seen with a "conserver -V", but it can be overridden with -m.
> I'm honestly not sure if having more or less would be better or even change
> things (more processes would use more cores, but also "slam" the system with
> that many things reading and processing the config).

> Conserver tries very hard to be multiplex across all the consoles, even when
> bringing up and tearing down things. The reread of the config puts all that on
> hold, so it probably has to do with that.

> One issue I've seen before is the magnitude of DNS lookups done when a config is
> loaded. It all depends on the config, of course, but you could end up
> generating a lot of requests. Maybe it doesn't apply in your environment, but
> it can be an unexpected source of trouble.

Does a HUP close and open consoles? Does a HUP open consoles that are down? If it is going after consoles that are down and blocking, that could be what is going on.

On a local device I can manage 100 consoles with ease. Just a couple serial, the rest are programs or log file tails. 

Chris

------=_Part_9805267_1130606666.1476850567098--
From bryan@conserver.com Wed Oct 19 05:10:46 2016
Received: from [192.168.0.133] (c-98-207-6-47.hsd1.ca.comcast.net
 [98.207.6.47]) (authenticated bits=0)
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9J5AiiK008388
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <users@conserver.com>; Wed, 19 Oct 2016 05:10:45 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com;
 s=s0001; t=1476853846;
 bh=WEN0UKPRdwKWLUWj/EytUnYsrEQLD5Zo7eBjroCK6xU=;
 h=From:Subject:Date:References:To:In-Reply-To;
 b=jmXxwHF+SXlluetyz7e3Vi79cUSBi71nxWJ9D/HQwIvRvd6KPKkhLUCBgKRGCvqFR
 g8RquJxadfrFCY9eE2OfihR0+AxJ27oNhc6ILRU6itl9LzF+cLpAxBoSH8df0peobQ
 S0KYBnay0rmwVu7UmbdVa2/Cohp/ZwNrjTcw+WaSUSD1PRXbZWeh7wvPUxPDx60+4R
 eU4t08w7Hp1P7cGqo+CVPlWICm7d1Kc2QIqYDU2/WpUfibbFbSeF0f670mDYIFdE6u
 O3ah5uVZxr/qureAoz8PfZmZFOoxYPCOP4exw0fFNJNnmCuneINiiPUl9NP+TlOCcl
 OLY+iM+o8iTTA==
From: Bryan Stansell <bryan@conserver.com>
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\))
Subject: Re: what is normal conserver hang during reconfig
Date: Tue, 18 Oct 2016 22:10:43 -0700
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com>
To: users@conserver.com
In-Reply-To: <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com>
Message-Id: <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com>
X-Mailer: Apple Mail (2.3226)
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org
 id u9J5AiiK008388
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 05:10:46 -0000


> On Oct 18, 2016, at 9:16 PM, Chris Fowler via users <users@conserver.com> wrote:
> 
> Does a HUP close and open consoles?  Does a HUP open consoles that are down?   If it is going after consoles that are down and blocking that could be what is going on.
> 

A HUP doesn't close and open consoles (it will reopen log files though).  It will try and open anything down (socket connections are set up to be non-blocking).
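
For anyone unfamiliar with the term, here is a generic illustration of a non-blocking connect.  It is not conserver's code (conserver is written in C) and the function names are invented for the example.  The point is that the connect call returns right away and completion is checked later with select.

```python
import errno
import select
import socket


def start_connect(address):
    """Begin a TCP connect to (ip, port) without waiting for it to finish."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)
    status = sock.connect_ex(address)  # returns immediately
    pending = status in (0, errno.EINPROGRESS, errno.EWOULDBLOCK)
    return sock, pending


def finish_connect(sock, timeout):
    """Return 0 once connected, an errno on failure, or None if still pending."""
    _, writable, _ = select.select([], [sock], [], timeout)
    if not writable:
        return None
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)


if __name__ == "__main__":
    # Demo against a throwaway local listener so no real console is needed.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    sock, pending = start_connect(listener.getsockname())
    print("pending:", pending, "result:", finish_connect(sock, 2.0))
    sock.close()
    listener.close()
```

Note that only the connect itself is non-blocking here.  If a host name has to be resolved first, that resolution is normally a separate blocking call, which is why the DNS question still matters.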

Bryan


From cfowler@outpostsentinel.com Wed Oct 19 05:15:06 2016
Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J5F2Ys008528
 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO);
 Wed, 19 Oct 2016 05:15:05 GMT
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 46BAA813C214;
 Wed, 19 Oct 2016 01:15:02 -0400 (EDT)
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id oFYWxUx0L3X7; Wed, 19 Oct 2016 01:15:01 -0400 (EDT)
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id CCD33813C230;
 Wed, 19 Oct 2016 01:15:01 -0400 (EDT)
X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id FACf4IKm6oY1; Wed, 19 Oct 2016 01:15:01 -0400 (EDT)
Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id AB1E9813C20A;
 Wed, 19 Oct 2016 01:15:01 -0400 (EDT)
Date: Wed, 19 Oct 2016 01:15:01 -0400 (EDT)
From: Chris Fowler <cfowler@outpostsentinel.com>
To: Bryan Stansell <bryan@conserver.com>
Cc: users@conserver.com
Message-ID: <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com>
In-Reply-To: <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com>
 <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com>
Subject: Re: what is normal conserver hang during reconfig
MIME-Version: 1.0
Content-Type: multipart/alternative; 
 boundary="----=_Part_9807223_892677169.1476854101545"
X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194)
Thread-Topic: what is normal conserver hang during reconfig
Thread-Index: gmUmuYlIxXi+2O0yVaRDg2c7hAoX+A==
X-Spam-Score: -0.277 () BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 05:15:06 -0000

------=_Part_9807223_892677169.1476854101545
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

> From: "Bryan Stansell via users" <users@conserver.com>
> To: users@conserver.com
> Sent: Wednesday, October 19, 2016 1:10:43 AM
> Subject: Re: what is normal conserver hang during reconfig

> > On Oct 18, 2016, at 9:16 PM, Chris Fowler via users <users@conserver.com> wrote:

>> Does a HUP close and open consoles? Does a HUP open consoles that are down? If
>> it is going after consoles that are down and blocking that could be what is
>> going on.


> A HUP doesn't close and open consoles (it will reopen log files though). It will
> try and open anything down (socket connections are set up to be non-blocking).

A slowdown in DNS lookups could be the culprit. At this point I'd strace it to see where it is spending time.

------=_Part_9807223_892677169.1476854101545--
From denis.hainsworth@gmail.com Wed Oct 19 13:35:09 2016
Received: from mail-qk0-f175.google.com (mail-qk0-f175.google.com
 [209.85.220.175])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9JDZ6Rb014391
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Wed, 19 Oct 2016 13:35:08 GMT
Received: by mail-qk0-f175.google.com with SMTP id n189so35565847qke.0;
 Wed, 19 Oct 2016 06:35:07 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=DCGXnpfEygyzU1Pi3T7mCny0Be8nw/tr3yajPHolK40=;
 b=GeDbOe0cO94Nsux3OaywyUY6ICqpFK3BaAC3Xl3rhE58hC1GpWkAyosDk+iIr/4G96
 6ghCIXBy5xFztzDxwTNmA+0LPi8oXMshvv+Z2a28iWXPYLTXYKZyGqEzf4ryA7aouRt3
 I9O9vCaItG59TVaQ5Mh/PXmXLeaNNobV8q3xuRsJdKELMHyvjD0qQ63nA96coHDmhwb7
 fS8H226TQ/iDT/gRLvLXorIRAd10WvwpAwl78Xyb/gCPpaf4VmYT9Tw6n1Jst/PXHUZK
 HLbXzGvrOtjffOrC9P8eruDdxdvLnwkp0BH6QCfxlzh8oXyrHoOtqWADU/24WoW4zms8
 V9fA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=DCGXnpfEygyzU1Pi3T7mCny0Be8nw/tr3yajPHolK40=;
 b=SyJJA3CqHELqUP4PoUMEGNWBtVOoYZhCwpuJyYL66E5VFCRrXCB9t5tLooINFJxuWU
 JImSq3cx071WELf+p6i8FBXU514WU2zWXoEfyFK95QtPn62GhrV5qxhWPmo8xDP+MQm8
 XOBsGUtkzmXmurPUJSd8bEsEtVJyaYQXlEfxR8hnLZEfbyhx+7hpE5GQWwhUAgJSiQGL
 O1rrRgZRatgdN8nwQZbGlofpQ7fiAif6MbCJK7/rqKS68gBYCaJI3f81BYSrapnlXPA5
 5dRSyXJYPBPiWFm4eAL6jl7ZfZZpUD7TXzmwX7gdUmnBt1Y3d/zAO/Wel5YBVVQF6Hcp
 JUUQ==
X-Gm-Message-State: ABUngvcggZZWidtYb4kpM8CV66byA+9c0awepHiolS74mqjaDyjg//nR/jpSLeCSg5ixHw==
X-Received: by 10.55.212.195 with SMTP id s64mr5944879qks.216.1476884103597;
 Wed, 19 Oct 2016 06:35:03 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id w72sm20862186qkb.33.2016.10.19.06.35.02
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Wed, 19 Oct 2016 06:35:02 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id B0A9BBFA090; Wed, 19 Oct 2016 09:35:00 -0400 (EDT)
Date: Wed, 19 Oct 2016 09:35:00 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: Chris Fowler <cfowler@outpostsentinel.com>
Cc: Bryan Stansell <bryan@conserver.com>, users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161019133500.GR27007@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com>
 <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com>
 <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Wed, 19 Oct 2016 13:35:10 -0000

Thanks for the ideas guys, I'll see what I can dig up.  I only realized
last night my first email was sent before I updated my subscription
address so the list just quietly ignored it :)
-denis
From denis.hainsworth@gmail.com Sun Oct 23 06:42:18 2016
Received: from mail-qk0-f175.google.com (mail-qk0-f175.google.com
 [209.85.220.175])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9N6gDsB001310
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Sun, 23 Oct 2016 06:42:17 GMT
Received: by mail-qk0-f175.google.com with SMTP id o68so201059808qkf.3;
 Sat, 22 Oct 2016 23:42:15 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=tjxkh0QchxcBAlu45AsQxoGRZIkEUPAlLe24VrsCn68=;
 b=skUUkwboAENNyMxHI3oCLw+CWJP4d2UEEj1Vc8lFzhAbdSMMeaGyaD3Jk7XAh/KO9w
 CeIcmmYxXUR0rssuBBw/MAHEdnKX2BUDAd9uwRRA+Ded3bvDUXXeV2covPOmTjH0Rmuq
 i/CCMb/FAeJcm07Dcww/XHf7wuHrgiEwivqpfGfdSF3X/f8nGKt6GEMeREnMIA/z9i9Z
 D9kG/cr1h7hEAzDJ+KOHloYiXCTM45qFGKbZlsCpBtvb/FdfyoSJt3gUyWj970G9HA6X
 KOaKXxbbGdA1/TjtoIU8VXeDQk72nsX0CqWHAnrB23r7FhYlJ+3oGFcs7Rk2mD2cFpk+
 MdxQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=tjxkh0QchxcBAlu45AsQxoGRZIkEUPAlLe24VrsCn68=;
 b=eu51WEcLwEQxz3bELJKdOoWfGedTf9sDwBrUISJCJ9IPJaR0+azvHWpetIA+V5LNwB
 EGsuYDdzUwUMzhAU8AYv7LUrsSONmo0nRe5trLlyA2pXeAeq3bYYVoKr6Vse91KHWV2K
 5oOi25pXAYHG6xBEpBTlPYVDLTjLWFv75fC8eGc939mcaVtwYdDMkkZcTHoyyf0dyBFZ
 MZ01WLfrTUX+cDGc+dmP+mYOAQTRNqCANhx+hK9/1HIQSg1PQTCRiamsW7FqLbkJpHWY
 WIHjYISgfHDUtvgA84GabfMFFxcpUs5SNbndVcEL652yODltEQGy8a2+XL+xEryKyJhu
 A66g==
X-Gm-Message-State: ABUngvcfeILp2bKKR2fLx2k26hiQkVHXjeVtDaD7bB9jVdxH/wkrfYrb3dWS72JQGusQdw==
X-Received: by 10.55.209.147 with SMTP id o19mr10217645qkl.125.1477204932796; 
 Sat, 22 Oct 2016 23:42:12 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id f62sm5595163qka.3.2016.10.22.23.42.11
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Sat, 22 Oct 2016 23:42:12 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id 3A19D8C21B9; Sun, 23 Oct 2016 02:42:10 -0400 (EDT)
Date: Sun, 23 Oct 2016 02:42:10 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: Bryan Stansell <bryan@conserver.com>
Cc: users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161023064210.GF6698@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Sun, 23 Oct 2016 06:42:18 -0000

Finally got time to look at things.  strace is perfect, thanks for
suggesting that.

So running something like
strace -t -o strace.out.2 -p 3198  
and sending a SIGHUP to the parent process showed the issue.
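
(A minimal sketch, not from the original thread, of one way to see where
the time goes in such a trace.  It assumes each line of the strace -t
output starts with an HH:MM:SS timestamp followed by the syscall name,
optionally preceded by a PID as with -f.  The function names are
hypothetical.  The gap between two consecutive lines is charged to the
earlier call, which is only an approximation, and -t has one-second
resolution, so the totals are rough.)

```python
import re
from collections import Counter

# Optional PID, then HH:MM:SS, then the syscall name up to "(".
LINE_RE = re.compile(r"^(?:\d+\s+)?(\d{2}):(\d{2}):(\d{2})\s+(\w+)\(")


def parse_line(line):
    """Return (seconds_since_midnight, syscall_name), or None for other lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None  # signal markers, "<... resumed>" lines, etc.
    hours, minutes, seconds, name = match.groups()
    return int(hours) * 3600 + int(minutes) * 60 + int(seconds), name


def seconds_by_syscall(lines):
    """Sum the gap after each syscall line, grouped by syscall name."""
    totals = Counter()
    previous = None
    for line in lines:
        current = parse_line(line)
        if current is None:
            continue
        if previous is not None:
            # Modulo handles a trace that runs past midnight.
            totals[previous[1]] += (current[0] - previous[0]) % 86400
        previous = current
    return totals
```

For example, seconds_by_syscall(open("strace.out.2")).most_common(5)
would list the five syscalls followed by the longest waits.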

So the way we've always set things up was to automatically generate one
config file per console server from our equipment database.  This means
there are 264 files that are #included into the main config file.
The first 30s of "hang" is each process opening each file, reading it
in, and closing it.  I'm wondering if we need to block I/O during this,
or perhaps that could be done before we start blocking?
Once that is done there is another 10s of hang while we do the DNS
lookup for each console host, as you thought (open /etc/hosts, make a
DNS query, resolve it).
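
(A rough sketch, not from the original thread, of how the lookup cost
could be measured outside conserver.  It times one name resolution per
console host.  The function name and the host list are hypothetical, and
socket.gethostbyname is only a stand-in for whatever resolver call
conserver itself makes.)

```python
import socket
import time


def time_lookups(hosts, resolve=socket.gethostbyname, clock=time.monotonic):
    """Return a list of (host, seconds_taken, resolved_ok) tuples."""
    results = []
    for host in hosts:
        start = clock()
        try:
            resolve(host)
            ok = True
        except OSError:  # socket.gaierror is a subclass of OSError
            ok = False
        results.append((host, clock() - start, ok))
    return results
```

Summing the second field over all console hosts gives a figure to
compare against the 10s seen in the trace.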

I tried putting all the configs into one file but that didn't change
anything.  So then I started wondering.  Our IT had long ago made the
console servers VMs.  It's never seemed like an issue, but I compared
some basic dd commands and found my problem server has terrible I/O
throughput ... sigh.  To compare, one of my good servers has about
80Mbp/s read/write and the bad one has around 15Mbp/s read/write.  

So I'm going to look into moving the VM or getting the disk perf up,
which should solve most of my issues, but I also wonder if the conserver
code could be reorganized without too much trouble to avoid blocking
when the disk is slow?  It's possible what I'm asking is dumb, just
throwing it out there.

-denis

On Tue, Oct 18, 2016 at 08:39:19PM -0700, Bryan Stansell via users wrote:
> Off the top of my head, I agree that there shouldn't be anything fixed in the newer code to address this.  The code does block all activity when it processes a HUP signal, but that's supposed to be "quick".  :-|
> 
> Each process (the main and children) rereads the config file and figures out if there's anything to do.  The main process is in charge of spawning new consoles (or reconfigured), and the children are responsible for letting go of old ones (or reconfigured).
> 
> With that in mind, how many consoles are each child managing?  The compile time default can be seen with a "conserver -V", but it can be overridden with -m.  I'm honestly not sure if having more or less would be better or even change things (more processes would use more cores, but also "slam" the system with that many things reading and processing the config).
> 
> Conserver tries very hard to multiplex across all the consoles, even when bringing up and tearing down things.  The reread of the config puts all that on hold, so it probably has to do with that.
> 
> One issue I've seen before is the magnitude of DNS lookups done when a config is loaded.  It all depends on the config, of course, but you could end up generating a lot of requests.  Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble.
> 
> Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice).  It would be interesting to look at an strace (assuming linux) of a process when it gets a HUP (even without any changes to configs).  Just send one of the children a HUP so it minimizes the impact.  With timestamps, it might highlight what is causing the issue (like the DNS query case, but could be anything).
> 
> Bryan 
> 
> > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> > 
> > Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
> > still thinking I've configured something wrong.  SIGHUP says conserver
> > rereads the config files and then adds/deletes consoles as needed and
> > only touches running consoles if they have changed.  If that's true I
> > wouldn't expect a 30s buffer of input/output on a console that hasn't
> > changed, should I?
> > I also don't see anything in CHANGES that sounds like this is a bug
> > that has been fixed.
> > 
> > -denis
> > 
> > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
> >> I love conserver.  I have  a minor issue and I was curious what options
> >> there might be.
> >> 
> >> So I have a conserver setup running against 262 servers (mostly digis or
> >> ser2net machines).  It works great.  However when we need to update due
> >> to a config change we run "kill -HUP" against the parent.  With the
> >> number of consoles (I think) this causes about a 30s "hang" when
> >> interacting with any console which corresponds to the reconfig time.
> >> 
> >> Does this make sense and is per the current design?  Any chance there is
> >> a clever way to make it block for less time?  Barring that I intend to
> >> spin up a new server to share the load of my current server and reduce
> >> the reconfig time.
> >> 
> >> I was mostly curious if there was a config issue or if this description
> >> doesn't make any sense to folks and it means I have something else going
> >> on like too many down consoles or something.
> >> -denis
> >> 
> >> -- 
> >> __________________________
> >> Denis Alan Hainsworth     
> >> denis.hainsworth@gmail.com
> > 
> > -- 
> > __________________________
> > Denis Alan Hainsworth     
> > denis.hainsworth@gmail.com
> > _______________________________________________
> > users mailing list
> > users@conserver.com
> > https://www.conserver.com/mailman/listinfo/users
> 
> 

-- 
__________________________
Denis Alan Hainsworth     
denis.hainsworth@gmail.com
From bryan@conserver.com Sun Oct 23 17:34:36 2016
Received: from [192.168.0.132] (c-98-207-6-47.hsd1.ca.comcast.net
 [98.207.6.47]) (authenticated bits=0)
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9NHYY2g012565
 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO)
 for <users@conserver.com>; Sun, 23 Oct 2016 17:34:36 GMT
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com;
 s=s0001; t=1477244076;
 bh=xXEDm1btcvPLVErj9fGuwzUMqrksitFGhAAlNxInC7k=;
 h=From:Subject:Date:References:To:In-Reply-To;
 b=AzALwaUFdFG8VuRPsUevase76KOR/H0lygxjMofuTH8mTiUiHWoFCszjE+TjQSsHh
 X2UYvjSPIuH/0XZ1FMbw295dBE7GV8DoywTta3Ropk0o13bcAv1QdtqPcG4dpAKs8p
 QrY2/vS1ltb7YMxg3pFHAe4uoPRhYs8rnfmgHblYr0drfhzPKzHdPDh6RU4Q7x9xDI
 xzZjOQn30n2JgnmIdNoHVNCLVTx6Sy0JqWYvwfgNTEoza0z7O9Iffmba9RVZ2SVI5B
 +byfLqY6X7g/iXwhOfAvVlagfBASwaNVY5nNSq4IBgEtBjrVSy2oeX7ale05G8TQwh
 AFJi641BtfbCQ==
From: Bryan Stansell <bryan@conserver.com>
Content-Type: text/plain; charset=us-ascii
Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\))
Subject: Re: what is normal conserver hang during reconfig
Date: Sun, 23 Oct 2016 10:34:36 -0700
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <20161023064210.GF6698@cs.brandeis.edu>
To: users@conserver.com
In-Reply-To: <20161023064210.GF6698@cs.brandeis.edu>
Message-Id: <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com>
X-Mailer: Apple Mail (2.3226)
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org
 id u9NHYY2g012565
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Sun, 23 Oct 2016 17:34:37 -0000

I'm glad you were able to find the source of "most" of your troubles.  I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring.  The code that does that never got folded into the loop that handles I/O, but could...and really should.  No one has ever called it out as a serious enough problem before.  :-)

I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.

Bryan

> On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> 
> Finally got time to look at things.  strace is perfect, thanks for
> suggesting that.
> 
> So running something like
> strace -t -o strace.out.2 -p 3198  
> and sending a SIGHUP to the parent process showed the issue.
> 
> So the way we've always set things up was to automatically generate one
> config file per console server from our equipment database.  This means
> there are 264 files that are #included into the main config file.
> The first 30s of "hang" is each process opening each file, reading it
> in, and closing it.  I'm wondering if we need to block I/O during this,
> or perhaps that could be done before we start blocking?
> Once that is done there is another 10s of hang while we do the DNS
> lookup for each console host, as you thought (open /etc/hosts, make a
> DNS query, resolve it).
> 
> I tried putting all the configs into one file but that didn't change
> anything.  So then I started wondering.  Our IT had long ago made the
> console servers VMs.  It's never seemed like an issue, but I compared
> some basic dd commands and found my problem server has terrible I/O
> throughput ... sigh.  To compare, one of my good servers has about
> 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.  
> 
> So I'm going to look into moving the VM or getting the disk perf up,
> which should solve most of my issues, but I also wonder if the conserver
> code could be reorganized without too much trouble to avoid blocking
> when the disk is slow?  It's possible what I'm asking is dumb, just
> throwing it out there.
> 
> -denis


From denis.hainsworth@gmail.com Sun Oct 23 17:57:42 2016
Received: from mail-qk0-f182.google.com (mail-qk0-f182.google.com
 [209.85.220.182])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9NHvd4I013079
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Sun, 23 Oct 2016 17:57:41 GMT
Received: by mail-qk0-f182.google.com with SMTP id o68so210298078qkf.3;
 Sun, 23 Oct 2016 10:57:41 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=CUn74pvi+GDRhFo+Cwsq3reRmKxpDRH1bdxitb22rig=;
 b=zAlkh4GiL+/mctxh74qjFPQ20YBCHRHmSjXEvq/OCI7z1a4CFRBuAuUI0Bey3RWm3v
 lvYZss7D06PjygNQu3vvTf+OlWTt53V+lX8mWY/vNmwBKIiZKPUtUJH+r+dlw+dlYrRW
 z8lBFJv7PWqZvXVQaod6Z2ykhGe0Ca85Cjn7uqfe5FrR+ds4YsoRIHbzNOp4TPiwxM6y
 br7Zsa9XeAqqsZruqQgq9JIt/Kjw4hmBZ0tXonkg9ziSsQWRlwNUdg8WH7rym2/9N/Pn
 fN9t4BBtkPqQaxh9+d0vhtA/EIgw29DVdOLR5Z9mZkLhOEZ/K7fvRKMTdwwsscvQfhLp
 aTIA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=CUn74pvi+GDRhFo+Cwsq3reRmKxpDRH1bdxitb22rig=;
 b=NnWnp2SOrghtFICqQzYFC7OJas4yfsX3yRCt3iAiwxOECEb5eKofHDV8v0yEYvL74I
 70LhcsE5xbyE0foygluqJO/+z1T7tobOPzjNDbtTPpeD+YnDUJzvP/z1g7N/s9upUfbz
 IfHd4g53Ag6wE2I3P9IYOKNPmshJjuGrciBQvE73BZFkf7Fx3/xsmBO4LTJtctLSShY4
 TEquXV6oDRT7bbHQpfj9NxoCG65jl2ro0k2zjj8eCf1MGEoYb1JylCRoa0GRP0AkCak3
 ABYxMTgSi84M8yLjLkOz9ZOZ0c0ZgPLwHNqOeOoDGfDqWcSgt3AzEFg4xA4YOdQHwVo2
 MO2g==
X-Gm-Message-State: ABUngvdHjD63JN/8YH4CQmVM8HUN/BeGms6fgH574HX4gq1YQEzdP3OjM8RCCms8SstTPA==
X-Received: by 10.55.200.152 with SMTP id t24mr12622482qkl.205.1477245457769; 
 Sun, 23 Oct 2016 10:57:37 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id n77sm6580252qkn.28.2016.10.23.10.57.37
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Sun, 23 Oct 2016 10:57:37 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id F07F68C21B9; Sun, 23 Oct 2016 13:57:35 -0400 (EDT)
Date: Sun, 23 Oct 2016 13:57:35 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: Bryan Stansell <bryan@conserver.com>
Cc: users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161023175735.GH6698@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <20161023064210.GF6698@cs.brandeis.edu>
 <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Sun, 23 Oct 2016 17:57:43 -0000

Dang it, my theory didn't pan out.  While the slower of the two did in
fact have slower disks, my IT was able to move the VM to some ultra fast
storage and my reconfig loop wasn't any faster.  :(   And it was such a
lovely theory too.

So I'm still digging to see if I can come up with a second clever idea,
but I have a feeling that to reduce the reconfig time I'll just have to
spread the load over more systems.

-denis

On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> I'm glad you were able to find the source of "most" of your troubles.  I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring.  The code that does that never got folded into the loop that handles I/O, but could...and really should.  No one has ever called it out as a serious enough problem before.  :-)
> 
> I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
> 
> Bryan
> 
> > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> > 
> > Finally got time to look at things.  strace is perfect, thanks for
> > suggesting that.
> > 
> > So running something like
> > strace -t -o strace.out.2 -p 3198  
> > and sending a SIGHUP to the parent process showed the issue.
> > 
> > So the way we've always set things up was to automatically generate one
> > config file per console server from our equipment database.  This means
> > there are 264 files that are #included into the main config file.
> > The first 30s of "hang" is each process opening each file, reading it
> > in, and closing it.  I'm wondering if we need to block I/O during this,
> > or perhaps that could be done before we start blocking?
> > Once that is done there is another 10s of hang while we do the DNS
> > lookup for each console host, as you thought (open /etc/hosts, make a
> > DNS query, resolve it).
> > 
> > I tried putting all the configs into one file but that didn't change
> > anything.  So then I started wondering.  Our IT had long ago made the
> > console servers VMs.  It's never seemed like an issue, but I compared
> > some basic dd commands and found my problem server has terrible I/O
> > throughput ... sigh.  To compare, one of my good servers has about
> > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.  
> > 
> > So I'm going to look into moving the VM or getting the disk perf up,
> > which should solve most of my issues, but I also wonder if the conserver
> > code could be reorganized without too much trouble to avoid blocking
> > when the disk is slow?  It's possible what I'm asking is dumb, just
> > throwing it out there.
> > 
> > -denis
> 
> 

-- 
__________________________
Denis Alan Hainsworth     
denis.hainsworth@gmail.com
From denis.hainsworth@gmail.com Mon Oct 24 00:46:40 2016
Received: from mail-qk0-f180.google.com (mail-qk0-f180.google.com
 [209.85.220.180])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9O0kbQq026657
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Mon, 24 Oct 2016 00:46:39 GMT
Received: by mail-qk0-f180.google.com with SMTP id o68so216548837qkf.3;
 Sun, 23 Oct 2016 17:46:39 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=HFq0OvgiMayf6CVAM6zlwpLFFbmGD8a9pywWjEXAlcU=;
 b=FnBXNc4zojsSQcdzBU347qOqngvRvaVwXQPcBjuYj6Vtfgq3LT+bz9l477glSJtfwJ
 VMk9+WE5OQTHBuoMZ6B8n6/8qEkuD2tuzALAcn90h7uSI8lfKW3fxKfWl/fsZ31cEfPq
 pN4MszvOVjzAXdY4pM9F1NF+76V8Iu46BvueBl49iBVzwQ/cJQYW6f/YMnC84L+mmO4p
 sMhXt91OBxK77sSYcYX33P3p5SIhOpBZplvHM3J97s5LWsDTRA64hb8fJ+ohZ5hCsZpL
 NrBpLb4rmF6qGuDWWtCG4I/5X2+f+nVaDtArR35y9ivheg0u7tW1LoP5pzTccsysy+/B
 J0mg==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=HFq0OvgiMayf6CVAM6zlwpLFFbmGD8a9pywWjEXAlcU=;
 b=C/wEzEp1x81hT7uoOtJvDA6XwU2vmnWJDPd5bmTZEuChULoG8mwS8iUGfyfU1QeLvB
 3ohIX2JfnsdRgEeFxmCbQ1lt5yzdcM5nwS/ELgK/i9xD+H7fwErTq/T9VN1lnhPa7BiK
 Ar3OoTfBe/7fgW+pSOXhnAJVqGiZEanirkCTnN0KyHbeyweQyoQLkMtyMz8zNmhvdR8z
 xiHaNQme5eb9lQHaJ0QmHw4Y3v5C9Og2FoaXJusysjrte0cgAaX7zaZbM/a0wb0Sj3mN
 6AGfz0AAjNqD+NzSgHWJ9fTm96CQYkMMK3Ft/A2w5ioaynsST38rwoG+rvoqNIFB+8ah
 FLEA==
X-Gm-Message-State: ABUngvd9g0d4qcjqwl+vaV9kvjOt3fGHbDw47gr6aQAtbNvwozmIxiKxTijD1w1bWJS3JA==
X-Received: by 10.55.167.201 with SMTP id q192mr9515523qke.61.1477269996431;
 Sun, 23 Oct 2016 17:46:36 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id 21sm7247104qkg.27.2016.10.23.17.46.35
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Sun, 23 Oct 2016 17:46:35 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id 17EB68C21B9; Sun, 23 Oct 2016 20:46:34 -0400 (EDT)
Date: Sun, 23 Oct 2016 20:46:33 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: Bryan Stansell <bryan@conserver.com>
Cc: users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161024004633.GK6698@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <20161023064210.GF6698@cs.brandeis.edu>
 <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com>
 <20161023175735.GH6698@cs.brandeis.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161023175735.GH6698@cs.brandeis.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Mon, 24 Oct 2016 00:46:42 -0000

So I think I have a solution which avoids the issue rather than fixing
anything :) 

I had to try to recall why we have things the way we do since it's going
on like 10 years or more (have I said I love conserver?)

So back in the day we set up the single conserver instance which was
obviously going to be the master.  We later started populating a couple
other servers at a few different sites.  To keep things simple and
robust every server got the same config set and could be the master.
However, in reality I'm pretty sure no one ever uses anything but the
default master.  So if I reduce the configs on all the slave servers
(especially the ones that reconfig the most and cause the most grief to
the users when it takes 40s) to only the configs they actually own,
then my times drop down to 3s and 13s respectively.
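
(A minimal sketch, not from the original thread, of how the reduced
include list might be generated.  The ownership mapping, the file names,
and the function name are all hypothetical, and the exact "#include"
line syntax should be checked against conserver.cf(5).)

```python
def render_includes(owner_by_file, this_server):
    """Return '#include' lines for the config files owned by this_server.

    owner_by_file maps a per-console-server config path to the name of
    the conserver host that owns it (hypothetical data, e.g. pulled from
    an equipment database).
    """
    owned = sorted(
        path for path, owner in owner_by_file.items() if owner == this_server
    )
    return "".join(f"#include {path}\n" for path in owned)
```

Each server would then write only its own lines into its main config
file before being sent a HUP, instead of including every file.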

Not a perfect solution but it should require a minimum of changes to
everyone involved.  Hopefully no one will remind me tomorrow of
something I forgot :)

-denis (purveyor of good enough solutions)

On Sun, Oct 23, 2016 at 01:57:35PM -0400, Denis Hainsworth wrote:
> Dang it, my theory didn't pan out.  While the slower of the two did in
> fact have slower disks, my IT was able to move the VM to some ultra fast
> storage and my reconfig loop wasn't any faster.  :(   And it was such a
> lovely theory too.
> 
> So I'm still digging to see if I can come up with a second clever idea,
> but I have a feeling that to reduce the reconfig time I'll just have to
> spread the load over more systems.
> 
> -denis
> 
> On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> > I'm glad you were able to find the source of "most" of your troubles.  I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring.  The code that does that never got folded into the loop that handles I/O, but could...and really should.  No one has ever called it out as a serious enough problem before.  :-)
> > 
> > I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
> > 
> > Bryan
> > 
> > > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users <users@conserver.com> wrote:
> > > 
> > > Finally got time to look at things.  strace is perfect, thanks for
> > > suggesting that.
> > > 
> > > So running something like
> > > strace -t -o strace.out.2 -p 3198  
> > > and sending a SIGHUP to the parent process showed the issue.
> > > 
> > > So the way we've always set things up was to automatically generate one
> > > config file per console server from our equipment database.  This means
> > > there are 264 files that are #included into the main config file.
> > > The first 30s of "hang" is each process opening each file, reading it
> > > in, and closing it.  I'm wondering if we need to block I/O during this,
> > > or perhaps that could be done before we start blocking?
> > > Once that is done there is another 10s of hang while we do the DNS
> > > lookup for each console host, as you thought (open /etc/hosts, make a
> > > DNS query, resolve it).
> > > 
> > > I tried putting all the configs into one file but that didn't change
> > > anything.  So then I started wondering.  Our IT had long ago made the
> > > console servers VMs.  It's never seemed like an issue, but I compared
> > > some basic dd commands and found my problem server has terrible I/O
> > > throughput ... sigh.  To compare, one of my good servers has about
> > > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write.  
> > > 
> > > So I'm going to look into moving the VM or getting the disk perf up,
> > > which should solve most of my issues, but I also wonder if the conserver
> > > code could be reorganized without too much trouble to avoid blocking
> > > when the disk is slow?  It's possible what I'm asking is dumb, just
> > > throwing it out there.
> > > 
> > > -denis
> > 
> > 
> 
> -- 
> __________________________
> Denis Alan Hainsworth     
> denis.hainsworth@gmail.com

-- 
__________________________
Denis Alan Hainsworth     
denis.hainsworth@gmail.com
From consoleteam@gmail.com Tue Oct 25 15:00:23 2016
Received: from mail-wm0-f47.google.com (mail-wm0-f47.google.com [74.125.82.47])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PF0J4s023352
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Tue, 25 Oct 2016 15:00:22 GMT
Received: by mail-wm0-f47.google.com with SMTP id d128so32157426wmf.1;
 Tue, 25 Oct 2016 08:00:21 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=ItVS39PAg2Kzd5ByyQ2hEoCbg+xH6+ttk0SW+iKQbuY=;
 b=au53wTlp7+Kj5GZ7Y+YcRSN26F5Uits4zysGiLa53cM6Lx5lbp01PhYbl5DRbLvqOa
 L09k1ZhlyU4dZt2sAPCLvHS7iAidCeYuOXv1++OqA9+CbZlqlANkfy6VAZFeraE7grI0
 PdvpYm7yRp3LJ0lu99BtNnQg0vKe/Kk/0mKoImSPnPIQZ9F0qrXZxBd8qv0V6BY5/kDA
 G47euD9NqOmEVJKJi6VLVgv3UeJEvP1uCoo+zVMjfkGrRSGJj5Hk9hh4V9lEfjAc4/yG
 DyTSnQ6AscLb/eNcFCfb16Mh599cLWlfvFKM5QourYdyYZVFK2CRfMkJAnld2SXOEqJ3
 TuhA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=ItVS39PAg2Kzd5ByyQ2hEoCbg+xH6+ttk0SW+iKQbuY=;
 b=lLvr2Z0I3cLL9/vQJUdNq3DqCJ5n0/aQnTmhDh12WwdBdSr7QFEKv3f72aDpY0ZGTJ
 3s+7jvwr5agCePb+D8jfCcx5UGW2mQQQpggqbdWfVlz1RgeQWGEUp9Xb6Gs2/hLoirgD
 GigBfCeMMVIpdl+hVR0Np6jA5TmHINE+1FFGgJF9AK6ZVbHvxtIsk+Ic/6CZwhar0650
 CKf8ZMHLk+0vhxKzAGj6HN2AZw7bdQ3WnfUBdN633BWY0PKZEaJynSR/qhza7muWbxPf
 l/FgCoTjv2vnMZbdehzLOEuTrvhV1Az3xl7mMs/trYWpUkC0zNMLP8f8v84Woy3Tju0M
 l2rw==
X-Gm-Message-State: ABUngve7F3HcQd9sEOBw8Fd7b8reYyIQMGH1oJd8zZrIt6VC1zcu34S1h3XaFdd19UgjcpDsRxT9llwE/n7trQ==
X-Received: by 10.28.154.150 with SMTP id c144mr3799370wme.25.1477407617219;
 Tue, 25 Oct 2016 08:00:17 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.194.118.73 with HTTP; Tue, 25 Oct 2016 08:00:16 -0700 (PDT)
In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
From: Zonker <consoleteam@gmail.com>
Date: Tue, 25 Oct 2016 08:00:16 -0700
Message-ID: <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
Subject: Re: what is normal conserver hang during reconfig
To: Bryan Stansell <bryan@conserver.com>
Cc: "users@conserver.com" <users@conserver.com>
Content-Type: multipart/alternative; boundary=001a114b303836821c053fb1c44b
X-Spam-Score: 0.624 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, FREEMAIL_REPLY, HTML_MESSAGE, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Tue, 25 Oct 2016 15:00:25 -0000

--001a114b303836821c053fb1c44b
Content-Type: text/plain; charset=UTF-8

To Bryan's point about DNS lookups, I expect that my main conserver will
be the first thing online (and then the network comes up...) and it will
be the last thing down. As a result, I have all of my console servers
listed in the /etc/hosts file, and I look at the file first. I have 67
conserver child processes with 16 ports under each, and my HUP is just a
few seconds.
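
(A small sketch, not from the original message, of how such /etc/hosts
entries could be produced from a list of console server names.  The
function name and host names are hypothetical.  On many Linux systems,
"look at the file first" corresponds to a line such as "hosts: files dns"
in /etc/nsswitch.conf, which is an assumption about this setup.)

```python
import socket


def hosts_entries(names, resolve=socket.gethostbyname):
    """Return 'address<TAB>name' lines for every name that resolves."""
    lines = []
    for name in names:
        try:
            address = resolve(name)
        except OSError:
            continue  # skip names that do not resolve right now
        lines.append(f"{address}\t{name}")
    return lines
```

The returned lines can be reviewed and appended to /etc/hosts by hand,
so later reconfigs resolve the console servers without a DNS query.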

    Best regards,

                -Z-


On Tue, Oct 18, 2016 at 8:39 PM, Bryan Stansell via users <
users@conserver.com> wrote:

> Off the top of my head, I agree that there shouldn't be anything fixed in
> the newer code to address this.  The code does block all activity when it
> processes a HUP signal, but that's supposed to be "quick".  :-|
>
> Each process (the main and children) rereads the config file and figures
> out if there's anything to do.  The main process is in charge of spawning
> new consoles (or reconfigured), and the children are responsible for
> letting go of old ones (or reconfigured).
>
> With that in mind, how many consoles are each child managing?  The compile
> time default can be seen with a "conserver -V", but it can be overridden
> with -m.  I'm honestly not sure if having more or less would be better or
> even change things (more processes would use more cores, but also "slam"
> the system with that many things reading and processing the config).
>
> Conserver tries very hard to be multiplex across all the consoles, even
> when bringing up and tearing down things.  The reread of the config puts
> all that on hold, so it probably has to do with that.
>
> One issue I've seen before is the magnitude of DNS lookups done when a
> config is loaded.  It all depends on the config, of course, but you could
> end up generating a lot of requests.  Maybe it doesn't apply in your
> environment, but it can be an unexpected source of trouble.
>
> Aside from that, another server will certainly share the load (and, set up
> right, the end users won't even notice).  It would be interesting to look
> at an strace (assuming linux) of a process when it gets a HUP (even without
> any changes to configs).  Just send one of the children a HUP so it
> minimizes the impact.  With timestamps, it might highlight what is causing
> the issue (like the DNS query case, but could be anything).
>
> Bryan
>
> > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <
> users@conserver.com> wrote:
> >
> > Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
> > still thinking I've configured something wrong.  SIGHUP says conserver
> > rereads the config files and then adds/deletes consoles as needed and
> > only touches running consoles if they have changed.  If thats true I
> > wouldn't expect a 30s buffer of input/output on a console that hasn't
> > changed, should I?
> > I also don't see anything in CHANGES that sounds like this is a bug
> > that has been fixed.
> >
> > -denis
> >
> > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
> >> I love conserver.  I have  a minor issue and I was curious what options
> >> there might be.
> >>
> >> So I have a conserver setup running against 262 servers (mostly digis or
> >> ser2net machines).  It works great.  However when we need to update due
> >> to a config change we run "kill -HUP" against the parent.  With the
> >> number of consoles (I think) this causes about a 30s "hang" when
> >> interacting with any console which corresponds to the reconfig time.
> >>
> >> Does this make sense and is per the current design?  Any chance there is
> >> a clever way to make it block for less time?  Barring that I intend to
> >> spin up a new server to share the load of my current server and reduce
> >> the reconfig time.
> >>
> >> I was mostly curious if there was a config issue or if this description
> >> doesn't make any sense to folks and it means I have something else going
> >> on like too many down consoles or something.
> >> -denis
> >>
> >> --
> >> __________________________
> >> Denis Alan Hainsworth
> >> denis.hainsworth@gmail.com
> >
> > --
> > __________________________
> > Denis Alan Hainsworth
> > denis.hainsworth@gmail.com
> > _______________________________________________
> > users mailing list
> > users@conserver.com
> > https://www.conserver.com/mailman/listinfo/users
>
>
> _______________________________________________
> users mailing list
> users@conserver.com
> https://www.conserver.com/mailman/listinfo/users
>


-- 
Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce

--001a114b303836821c053fb1c44b--
From consoleteam@gmail.com Tue Oct 25 15:06:56 2016
Received: from mail-wm0-f54.google.com (mail-wm0-f54.google.com [74.125.82.54])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PF6oB7024257
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Tue, 25 Oct 2016 15:06:52 GMT
Received: by mail-wm0-f54.google.com with SMTP id c78so168496658wme.0;
 Tue, 25 Oct 2016 08:06:52 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:from:date:message-id:subject:to
 :cc; bh=XDe5s7Tg7lBxw/LPnT+iv0Vw9TM37h5XiSOD7Fddx5M=;
 b=hFb50ffS7LTA3ggw82k6Yl3UjH5TdfyYl8cwTn9kiMElk2KD2lr5P0SazdwZw7t6qF
 ercSRDRYZXYmwvJbXm5mS6XWtNyK4ma/AsDeiarznv+UYFimxvsk0tiRPs1llcYXej2A
 tOTke4wTo0kqNbRubo93fMWQz78LYuQ3Gwu3UG9Kx8KYPP+sGc6VODnaxQ97McDa7od0
 iPSmqR0q6RyTZ0bI0kfTXmJ5lEjfoYnauJqW8IoRIoHCsZQGMCkLeJI723SuiOMqJyic
 PEUkdzB2guA9V2cwZpuofBCrzQOPJ47V/KcfVsaymxNplH3SnRkOsU6KMyVP7wR4aqEv
 lL3g==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:mime-version:in-reply-to:references:from:date
 :message-id:subject:to:cc;
 bh=XDe5s7Tg7lBxw/LPnT+iv0Vw9TM37h5XiSOD7Fddx5M=;
 b=MWw2uikcN7Uh2G6WBMJfufuXcF+NuoxQbQEg22lxBHjamBYm4g3MX56vUFyn4+ghCL
 mi6UmLVU2gagSZ2jWdTIu2981IrXG4oSNGStk4bbvOcFdI43nhdJp0PlFW7w9+JeePMO
 tcxJX20Jt0DSI+Sf90dR20+O2ulcN6ZFuMarG/C+rITi8zP/owlkWt2tYVUgI/YpffjH
 tSrEv0V3TtNrTKrNu4eAg/pejo1wmJ4y5iosEyKVYpe/1NpHQyBdIIYYuEenEIx3n7x+
 HpFvGWvqkE9urCUlWj5SlnR6PlCm0XhJLxBo7vy0z0676jgVDfU46W2nIrncKGI2XuPq
 rCOg==
X-Gm-Message-State: ABUngvfWh8GPJgPvVGggkGJ9423ZlsxA7lbyl/BI9zdTd0ASWiRdB/zYDPaFDuyXgvFsvW1AXJCcvfAaDPTg5w==
X-Received: by 10.28.154.150 with SMTP id c144mr3832647wme.25.1477408009493;
 Tue, 25 Oct 2016 08:06:49 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.194.118.73 with HTTP; Tue, 25 Oct 2016 08:06:48 -0700 (PDT)
In-Reply-To: <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
From: Zonker <consoleteam@gmail.com>
Date: Tue, 25 Oct 2016 08:06:48 -0700
Message-ID: <CAD7Ezq1mhx1m6Gr_0P+oFfb2BmQiX66FRQKp5dqiWrx+6Qw=Cg@mail.gmail.com>
Subject: Re: what is normal conserver hang during reconfig
To: Bryan Stansell <bryan@conserver.com>
Cc: "users@conserver.com" <users@conserver.com>
Content-Type: multipart/alternative; boundary=001a114b3038982246053fb1db0e
X-Spam-Score: 0.624 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, FREEMAIL_REPLY, HTML_MESSAGE, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Tue, 25 Oct 2016 15:06:56 -0000

--001a114b3038982246053fb1db0e
Content-Type: text/plain; charset=UTF-8

Also, my main conserver is a dedicated host, on its own UPS, which is
powered by the data center UPS... maybe it's overkill for some, but it
lets me tell whether there were any problems shutting down the rest of
the world, or bringing it back online again. (This was a really important
feature after our campus lost PG&E mains power TWICE in 14 hours last
Monday. :-)

All the log files go into /var/consoles/current, and we rotate timestamped
files into /var/consoles/archive.
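
For anyone wanting to copy that layout, here is a minimal sketch of the
rotation step. It is not the actual script: the timestamp format and the
function names are assumptions, and conserver would still have to be told
to reopen its logfiles afterwards (the man page documents the signals for
that).

```python
import shutil
import time
from pathlib import Path


def archive_name(log_path, when):
    """Build '<logname>.<YYYYMMDD-HHMMSS>' (an assumed naming scheme)."""
    return f"{log_path.name}.{time.strftime('%Y%m%d-%H%M%S', when)}"


def rotate_logs(current_dir, archive_dir, when=None):
    """Move every regular file in current_dir into archive_dir, timestamped."""
    when = when or time.localtime()
    current_dir, archive_dir = Path(current_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for log in sorted(current_dir.iterdir()):
        if log.is_file():
            dest = archive_dir / archive_name(log, when)
            shutil.move(str(log), str(dest))
            moved.append(dest)
    return moved


if __name__ == "__main__":
    # Directories as described above; run only where they exist.
    for path in rotate_logs("/var/consoles/current", "/var/consoles/archive"):
        print(path)
```

Moving rather than copying keeps each file's contents intact while the
server still holds it open; new output only lands in a fresh file once the
logs are reopened.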

    Best regards,

            -Z-


On Tue, Oct 25, 2016 at 8:00 AM, Zonker <consoleteam@gmail.com> wrote:

> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and lookups check that file first. I have
> 67 conserver child processes with 16 ports under each, and my HUP takes
> just a few seconds.
>
>     Best regards,
>
>                 -Z-
>
>
> On Tue, Oct 18, 2016 at 8:39 PM, Bryan Stansell via users <
> users@conserver.com> wrote:
>
>> Off the top of my head, I agree that there shouldn't be anything fixed in
>> the newer code to address this.  The code does block all activity when it
>> processes a HUP signal, but that's supposed to be "quick".  :-|
>>
>> Each process (the main and children) rereads the config file and figures
>> out if there's anything to do.  The main process is in charge of spawning
>> new consoles (or reconfigured), and the children are responsible for
>> letting go of old ones (or reconfigured).
>>
>> With that in mind, how many consoles are each child managing?  The
>> compile time default can be seen with a "conserver -V", but it can be
>> overridden with -m.  I'm honestly not sure if having more or less would be
>> better or even change things (more processes would use more cores, but also
>> "slam" the system with that many things reading and processing the config).
>>
>> Conserver tries very hard to be multiplex across all the consoles, even
>> when bringing up and tearing down things.  The reread of the config puts
>> all that on hold, so it probably has to do with that.
>>
>> One issue I've seen before is the magnitude of DNS lookups done when a
>> config is loaded.  It all depends on the config, of course, but you could
>> end up generating a lot of requests.  Maybe it doesn't apply in your
>> environment, but it can be an unexpected source of trouble.
>>
>> Aside from that, another server will certainly share the load (and, set
>> up right, the end users won't even notice).  It would be interesting to
>> look at an strace (assuming linux) of a process when it gets a HUP (even
>> without any changes to configs).  Just send one of the children a HUP so it
>> minimizes the impact.  With timestamps, it might highlight what is causing
>> the issue (like the DNS query case, but could be anything).
>>
>> Bryan
>>
>> > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <
>> users@conserver.com> wrote:
>> >
>> > Running v 8.1.18.  Rereading the SIGHUP section of the man page I'm
>> > still thinking I've configured something wrong.  SIGHUP says conserver
>> > rereads the config files and then adds/deletes consoles as needed and
>> > only touches running consoles if they have changed.  If thats true I
>> > wouldn't expect a 30s buffer of input/output on a console that hasn't
>> > changed, should I?
>> > I also don't see anything in CHANGES that sounds like this is a bug
>> > that has been fixed.
>> >
>> > -denis
>> >
>> > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
>> >> I love conserver.  I have  a minor issue and I was curious what options
>> >> there might be.
>> >>
>> >> So I have a conserver setup running against 262 servers (mostly digis
>> or
>> >> ser2net machines).  It works great.  However when we need to update due
>> >> to a config change we run "kill -HUP" against the parent.  With the
>> >> number of consoles (I think) this causes about a 30s "hang" when
>> >> interacting with any console which corresponds to the reconfig time.
>> >>
>> >> Does this make sense and is per the current design?  Any chance there
>> is
>> >> a clever way to make it block for less time?  Barring that I intend to
>> >> spin up a new server to share the load of my current server and reduce
>> >> the reconfig time.
>> >>
>> >> I was mostly curious if there was a config issue or if this description
>> >> doesn't make any sense to folks and it means I have something else
>> going
>> >> on like too many down consoles or something.
>> >> -denis
>> >>
>> >> --
>> >> __________________________
>> >> Denis Alan Hainsworth
>> >> denis.hainsworth@gmail.com
>> >
>> > --
>> > __________________________
>> > Denis Alan Hainsworth
>> > denis.hainsworth@gmail.com
>> > _______________________________________________
>> > users mailing list
>> > users@conserver.com
>> > https://www.conserver.com/mailman/listinfo/users
>>
>>
>> _______________________________________________
>> users mailing list
>> users@conserver.com
>> https://www.conserver.com/mailman/listinfo/users
>>
>
>
>
> --
> Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce
>


-- 
Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce

--001a114b3038982246053fb1db0e--
From cfowler@outpostsentinel.com Tue Oct 25 15:27:29 2016
Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PFRPuR024854
 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO);
 Tue, 25 Oct 2016 15:27:27 GMT
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 704E8815EECF;
 Tue, 25 Oct 2016 11:27:24 -0400 (EDT)
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032)
 with ESMTP id c7Iq1ttqS5_R; Tue, 25 Oct 2016 11:27:23 -0400 (EDT)
Received: from localhost (localhost.localdomain [127.0.0.1])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id B6650814D27D;
 Tue, 25 Oct 2016 11:27:23 -0400 (EDT)
X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net
Received: from zcs-mta.vps-host.net ([127.0.0.1])
 by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026)
 with ESMTP id kwMNpQuYiY1P; Tue, 25 Oct 2016 11:27:23 -0400 (EDT)
Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109])
 by zcs-mta.vps-host.net (Postfix) with ESMTP id 8DDFA815EECF;
 Tue, 25 Oct 2016 11:27:23 -0400 (EDT)
Date: Tue, 25 Oct 2016 11:27:23 -0400 (EDT)
From: Chris Fowler <cfowler@outpostsentinel.com>
To: Zonker <consoleteam@gmail.com>
Cc: Bryan Stansell <bryan@conserver.com>, users@conserver.com
Message-ID: <1127348975.10548952.1477409243375.JavaMail.zimbra@outpostsentinel.com>
In-Reply-To: <CAD7Ezq1mhx1m6Gr_0P+oFfb2BmQiX66FRQKp5dqiWrx+6Qw=Cg@mail.gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
 <CAD7Ezq1mhx1m6Gr_0P+oFfb2BmQiX66FRQKp5dqiWrx+6Qw=Cg@mail.gmail.com>
Subject: Re: what is normal conserver hang during reconfig
MIME-Version: 1.0
Content-Type: multipart/alternative; 
 boundary="----=_Part_10548951_1316098443.1477409243375"
X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194)
Thread-Topic: what is normal conserver hang during reconfig
Thread-Index: zymskSU+uV4zfZCXFUUeWryOlS9Lfg==
X-Spam-Score: -0.277 () BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Tue, 25 Oct 2016 15:27:29 -0000

------=_Part_10548951_1316098443.1477409243375
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

> From: "Zonker via users" <users@conserver.com>
> To: "Bryan Stansell" <bryan@conserver.com>
> Cc: users@conserver.com
> Sent: Tuesday, October 25, 2016 11:06:48 AM
> Subject: Re: what is normal conserver hang during reconfig

> Also, my main conserver is a dedicated host, on its own UPS, which is powered
> by the data center UPS... maybe it's overkill for some, but I can tell you if
> there were any problems shutting down the rest of the world, or bringing it
> back online again. (This was a really important feature after our campus lost
> PG&E Mains power TWICE in 14 hours last Monday. :-)

I'm different. 

All mine are independent. They have their own configs. Console output is stored on their local storage. 

The main location runs a program I wrote that pulls the list of what to connect to from a database. It stores output on its local disk. To connect from the main location I have two programs: one uses the console protocol, and the other uses SSH to the target host and then executes console on it.
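
A minimal sketch of the SSH variant described above, not Chris's actual program: it only builds the command line that would log in to the target host and run the `console` client there. The function name, host name, and console name are hypothetical, and it assumes `ssh` and `console` are on the PATH of the respective machines.

```python
import shlex


def build_ssh_console_command(target_host, console_name, user=None):
    """Return the argv list for running `console <name>` on target_host via SSH.

    All names here are illustrative assumptions, not part of conserver itself.
    """
    destination = f"{user}@{target_host}" if user else target_host
    # Quote the console name so the remote shell treats it as a single argument.
    remote_command = "console " + shlex.quote(console_name)
    # -t asks ssh to allocate a pseudo-terminal, which an interactive
    # console session needs.
    return ["ssh", "-t", destination, remote_command]


if __name__ == "__main__":
    # Hypothetical host and console names, for illustration only.
    print(build_ssh_console_command("site-a.example.com", "web01"))
```

The returned list could be handed to `subprocess.call` to start the session; building an argv list rather than a single shell string avoids a second round of local shell quoting.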

Chris 

------=_Part_10548951_1316098443.1477409243375--
From denis.hainsworth@gmail.com Tue Oct 25 18:16:50 2016
Received: from mail-qt0-f180.google.com (mail-qt0-f180.google.com
 [209.85.216.180])
 by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PIGlKe017837
 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK);
 Tue, 25 Oct 2016 18:16:49 GMT
Received: by mail-qt0-f180.google.com with SMTP id q20so324552qtc.0;
 Tue, 25 Oct 2016 11:16:49 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=date:from:to:cc:subject:message-id:reply-to:references:mime-version
 :content-disposition:in-reply-to:user-agent;
 bh=wAIv868ibrwAKVxfDKAVvtzOlLO/Nr3TE6tEqq7Nnsw=;
 b=jviE5T8Y0oN+jctElUJeTMJ75exdNsQWgR/vVVLLq/zYdQueDCc7JOgMnZBmCVenDq
 IFLG47GQQSNB2wJyq63D6/U/2QZlk5hzVBgyJp8LrGb2x3vp//oxJnfTjauW/8kRPI0B
 z+rwkEnVoq6teILWDBda1OTRWlarJfkv/UNmtL9cdaWu4u+Ja5e3lrzOuMPnZPDVMHx0
 rbIRa6/IG5U4Sx6F/bdwRjX+v/AwAaV5SXzVxRDhRjMyzFCtvfnOdmwCX2t+puPhH1CH
 Wo6jYCBS7Tn71+v4GEfdPFVXYtoUC0Bc8tVnZb7EG12d8RS6bofGZbKxTZYd+tomZbiH
 PiNQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to
 :references:mime-version:content-disposition:in-reply-to:user-agent;
 bh=wAIv868ibrwAKVxfDKAVvtzOlLO/Nr3TE6tEqq7Nnsw=;
 b=YW/PDW8ftGRUN1RYTMt+r26tHSJha6sPpeGMQua+blUb96gOCuG8ZbetCsG3nBWqFG
 6WE6ylV8lBlNuvEaDyfvS6+B+hjU7rtdU+i7rOnAvigv2j73JirnbJG+By+YtAPk1V6e
 rAGbBaGgI2aj4OS5BNcfbEyc8DK0G9gyDKk4tQ2ZNwviGNNaDNcCdA+/1dfQkK2wI/bQ
 8zOFfOAo0LUbskXL161NincI/5gcYTsmw6n8hWKvm7uLZuzT95+jH/X+seA74P1QJkGp
 Dza9w8PWhPJJP/GcnGJz/ZVcTtteSGl503ILeCMuoF9sE7slY+z57m9xJCBBSS9aS4Wt
 6WkA==
X-Gm-Message-State: ABUngvc60opU8UPU2gJOHQCifl7hAcq44MlkOp0maim/QvAkGeX+pk1n0WNEbCPLWXb6Og==
X-Received: by 10.200.39.125 with SMTP id h58mr21462866qth.142.1477419407160; 
 Tue, 25 Oct 2016 11:16:47 -0700 (PDT)
Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net.
 [2001:4978:f:379::2])
 by smtp.gmail.com with ESMTPSA id t34sm11645855qtc.28.2016.10.25.11.16.46
 (version=TLS1 cipher=AES128-SHA bits=128/128);
 Tue, 25 Oct 2016 11:16:46 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501)
 id 6812C8C21B9; Tue, 25 Oct 2016 14:16:44 -0400 (EDT)
Date: Tue, 25 Oct 2016 14:16:44 -0400
From: Denis Hainsworth <denis.hainsworth@gmail.com>
To: Zonker <consoleteam@gmail.com>
Cc: Bryan Stansell <bryan@conserver.com>,
 "users@conserver.com" <users@conserver.com>
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161025181644.GO6698@cs.brandeis.edu>
Reply-To: Denis Hainsworth <denis.hainsworth@gmail.com>
References: <20161014160544.GZ27007@cs.brandeis.edu>
 <20161019010422.GQ27007@cs.brandeis.edu>
 <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
 <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CAD7Ezq2dFyJgGANOAmzBSUysYb1vMX2J6MtgdA3cU=gqD9Ay9Q@mail.gmail.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU,
 FREEMAIL_FROM, SPF_PASS
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users <users.conserver.com>
List-Unsubscribe: <https://www.conserver.com/mailman/options/users>,
 <mailto:users-request@conserver.com?subject=unsubscribe>
List-Archive: <https://www.conserver.com/pipermail/users/>
List-Post: <mailto:users@conserver.com>
List-Help: <mailto:users-request@conserver.com?subject=help>
List-Subscribe: <https://www.conserver.com/mailman/listinfo/users>,
 <mailto:users-request@conserver.com?subject=subscribe>
X-List-Received-Date: Tue, 25 Oct 2016 18:16:51 -0000

On Tue, Oct 25, 2016 at 08:00:16AM -0700, Zonker via users wrote:
> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and I look at the file first. I have 67
> conserver child processes with 16 ports under each, and my hup is just a
> few seconds.
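
As an illustration of the /etc/hosts approach quoted above, here is a rough sketch that renders hosts-file lines from a name-to-address mapping. The function name, host names, and addresses are made up for the example. Consulting the file before DNS is a separate resolver setting (on glibc-based Linux systems, typically `hosts: files dns` in /etc/nsswitch.conf) and is not handled here.

```python
import ipaddress


def render_hosts_entries(console_servers):
    """Render /etc/hosts-style lines from a {hostname: address} mapping.

    The mapping and its contents are hypothetical; nothing here reads or
    writes the real /etc/hosts file.
    """
    lines = []
    for name, address in sorted(console_servers.items()):
        # Raises ValueError if the address is not a valid IPv4/IPv6 literal.
        ipaddress.ip_address(address)
        lines.append(f"{address}\t{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Example addresses come from the 192.0.2.0/24 documentation range.
    print(render_hosts_entries({
        "ts2.example.com": "192.0.2.11",
        "ts1.example.com": "192.0.2.10",
    }), end="")
```

Generating the entries from one source list keeps the hosts file in step with the set of console servers the conserver config refers to.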

Yeah, DNS is a hit, but it's dwarfed by what I was seeing.  Understand my
"small" site has 72 children and HUPs in 3s now that I'm reading in
only the configs it manages (aka it's no longer a master).
My large site has 126 children :) even after paring it
down to only the stuff it manages.  That's now HUPing in 13s.
We have other boxes that can continue to serve as masters.
-denis