From denis.hainsworth@gmail.com Wed Oct 19 01:04:27 2016 Received: from mail-qk0-f169.google.com (mail-qk0-f169.google.com [209.85.220.169]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J14PqL029793 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK) for ; Wed, 19 Oct 2016 01:04:27 GMT Received: by mail-qk0-f169.google.com with SMTP id z190so13890320qkc.2 for ; Tue, 18 Oct 2016 18:04:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=Q2Qg0XO7oHeyUbF2HwSPBvbIKM5trKyNpi+H0HIJc+4=; b=wsezaV+CfWcW4eaNRRijgygEcXxeWg5BSPij3yoeFcRJe/WgEmFKLRqIyWLOZgqGzR 6o27iq7mAdN4hqnqU0eWOpgSYWr9z8TJWxzxXDJC2/JbG4HyBr/Tlh+ozseZF3vjQRtZ pviWhk+m68Cjwh/Md39VKjEnfXrZe1ihTrHBrp3so28Ed4rMCIgGaQ99GzATjFA4SidG IFyEYB+SPP4Rtc9jCC96Q+Wi4ZFPFp6fnf3KrKitq+D+UbLeNauFo6vauR/NXMhNh2Jh rpJQbQuj6PL3FArKFrClJ4JYGozJIhEKjJK9VlHmTuJXEzNJpwcD1mQmmZVUWAOqr0XC VqXw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=Q2Qg0XO7oHeyUbF2HwSPBvbIKM5trKyNpi+H0HIJc+4=; b=GU07QVTZOD9aheROeuF4oCV4MoAHNxQDxNxuQqQEZBAU+2V7mOUA9wsxUoa+8S0prh N5g5XZDm9lgb6OdyvgvpI8VRuHajJe8x8FKGdVbNYc1crLN1NyxfU/PLWe2m0uy2JewC mDNRKFFPQQnUMxKWTFrIjsSu4OoFofe6mN21KozbqLnc8qXZFRcygwtm05xG7gjL9Y8f a42P+1nw3r8pbbXCiwzqMqTz7iMDuOos1zjtbmhppkBSvI9PISpXw2CeIBN98ejJP7Qb DQFul4F8ErZb5SQ+589ow1xSvNTj40YOI5Xr9IjpNF9FokxGOuKZ/aNMdxPzwWB00Fpj ilsw== X-Gm-Message-State: AA6/9RkEnNe87BUIYR1uZSHP3yiEvW9A5oysCH+gF+DWFgw8EkDb8b/7RT/XZSmnVbr9uw== X-Received: by 10.55.207.12 with SMTP id e12mr3444036qkj.206.1476839064346; Tue, 18 Oct 2016 18:04:24 -0700 (PDT) Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. 
[2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id i207sm19522847qke.40.2016.10.18.18.04.23 for (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 18 Oct 2016 18:04:23 -0700 (PDT) Received: by xmas.dyndns.org (Postfix, from userid 501) id 9C311BFA090; Tue, 18 Oct 2016 21:04:22 -0400 (EDT) Date: Tue, 18 Oct 2016 21:04:22 -0400 From: Denis Hainsworth To: users@conserver.com Subject: Re: what is normal conserver hang during reconfig Message-ID: <20161019010422.GQ27007@cs.brandeis.edu> Reply-To: Denis Hainsworth References: <20161014160544.GZ27007@cs.brandeis.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20161014160544.GZ27007@cs.brandeis.edu> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 01:04:28 -0000 Running v8.1.18. Rereading the SIGHUP section of the man page, I'm still thinking I've configured something wrong. The SIGHUP section says conserver rereads the config files and then adds/deletes consoles as needed and only touches running consoles if they have changed. If that's true, I wouldn't expect a 30s buffer of input/output on a console that hasn't changed, should I? I also don't see anything in CHANGES that sounds like this is a bug that has been fixed. -denis On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote: > I love conserver. I have a minor issue and I was curious what options > there might be. > > So I have a conserver setup running against 262 servers (mostly digis or > ser2net machines). It works great. However when we need to update due > to a config change we run "kill -HUP" against the parent. 
With the > number of consoles (I think) this causes about a 30s "hang" when > interacting with any console which corresponds to the reconfig time. > > Does this make sense and is per the current design? Any chance there is > a clever way to make it block for less time? Barring that I intend to > spin up a new server to share the load of my current server and reduce > the reconfig time. > > I was mostly curious if there was a config issue or if this description > doesn't make any sense to folks and it means I have something else going > on like too many down consoles or something. > -denis > > -- > __________________________ > Denis Alan Hainsworth > denis.hainsworth@gmail.com -- __________________________ Denis Alan Hainsworth denis.hainsworth@gmail.com From bryan@conserver.com Wed Oct 19 03:39:20 2016 Received: from [192.168.0.133] (c-98-207-6-47.hsd1.ca.comcast.net [98.207.6.47]) (authenticated bits=0) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9J3dIF2004878 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Wed, 19 Oct 2016 03:39:20 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com; s=s0001; t=1476848360; bh=EVhbaCdNZN+EogH1RjQEc+XOa5HfA1Hzg05V+lSI8is=; h=From:Subject:Date:References:To:In-Reply-To; b=cdvxnRoUa2uNi36owjIocA+rFhfsbCuNMGbszum92z8XjrzMvBQ8FJQHPX//VLOX5 qVyfWQyb2q258ydSYBfmZs2yOlEO94OC1El4ghEWyPvtzFzCBYeVCt0XT0cASV4jAa kiAk9zkMUVgMN1ulBNcg6WVzUG3GkYP2uEFMI//xiGJFO9sWNMYx51IPE14n42muIX 1XiuskNPukRMIAFqxEpESGeDNqgJOcoFqzFCMJi4tzLlwg3avjJIFJsfFkNfRGGrW3 EnYj67jXrWyzQzcCDYJpg2QVfQPQV7JI1RpnngKchVKVjsVi/+BAfC4E5J2p6BNNNE ikGF8FSWBKEfw== From: Bryan Stansell Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\)) Subject: Re: what is normal conserver hang during reconfig Date: Tue, 18 Oct 2016 20:39:19 -0700 References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> To: users@conserver.com In-Reply-To: 
<20161019010422.GQ27007@cs.brandeis.edu> Message-Id: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> X-Mailer: Apple Mail (2.3226) X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org id u9J3dIF2004878 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 03:39:21 -0000 Off the top of my head, I agree that there shouldn't be anything fixed in the newer code to address this. The code does block all activity when it processes a HUP signal, but that's supposed to be "quick". :-| Each process (the main and children) rereads the config file and figures out if there's anything to do. The main process is in charge of spawning new consoles (or reconfigured), and the children are responsible for letting go of old ones (or reconfigured). With that in mind, how many consoles are each child managing? The compile time default can be seen with a "conserver -V", but it can be overridden with -m. I'm honestly not sure if having more or less would be better or even change things (more processes would use more cores, but also "slam" the system with that many things reading and processing the config). Conserver tries very hard to be multiplex across all the consoles, even when bringing up and tearing down things. The reread of the config puts all that on hold, so it probably has to do with that. One issue I've seen before is the magnitude of DNS lookups done when a config is loaded. It all depends on the config, of course, but you could end up generating a lot of requests. Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble. Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice). 
It would be interesting to look at an strace (assuming linux) of a process when it gets a HUP (even without any changes to configs). Just send one of the children a HUP so it minimizes the impact. With timestamps, it might highlight what is causing the issue (like the DNS query case, but could be anything). Bryan > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users wrote: > > Running v 8.1.18. Rereading the SIGHUP section of the man page I'm > still thinking I've configured something wrong. SIGHUP says conserver > rereads the config files and then adds/deletes consoles as needed and > only touches running consoles if they have changed. If thats true I > wouldn't expect a 30s buffer of input/output on a console that hasn't > changed, should I? > I also don't see anything in CHANGES that sounds like this is a bug > that has been fixed. > > -denis > > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote: >> I love conserver. I have a minor issue and I was curious what options >> there might be. >> >> So I have a conserver setup running against 262 servers (mostly digis or >> ser2net machines). It works great. However when we need to update due >> to a config change we run "kill -HUP" against the parent. With the >> number of consoles (I think) this causes about a 30s "hang" when >> interacting with any console which corresponds to the reconfig time. >> >> Does this make sense and is per the current design? Any chance there is >> a clever way to make it block for less time? Barring that I intend to >> spin up a new server to share the load of my current server and reduce >> the reconfig time. >> >> I was mostly curious if there was a config issue or if this description >> doesn't make any sense to folks and it means I have something else going >> on like too many down consoles or something. 
>> -denis >> >> -- >> __________________________ >> Denis Alan Hainsworth >> denis.hainsworth@gmail.com > > -- > __________________________ > Denis Alan Hainsworth > denis.hainsworth@gmail.com > _______________________________________________ > users mailing list > users@conserver.com > https://www.conserver.com/mailman/listinfo/users From cfowler@outpostsentinel.com Wed Oct 19 04:16:13 2016 Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J4GApl006512 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 19 Oct 2016 04:16:12 GMT Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 88A7181776F2; Wed, 19 Oct 2016 00:16:09 -0400 (EDT) Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id Wb-Bes1my-pT; Wed, 19 Oct 2016 00:16:07 -0400 (EDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 71C5481776F1; Wed, 19 Oct 2016 00:16:07 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id 77_R5iRzTf0q; Wed, 19 Oct 2016 00:16:07 -0400 (EDT) Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 506A48176AAE; Wed, 19 Oct 2016 00:16:07 -0400 (EDT) Date: Wed, 19 Oct 2016 00:16:07 -0400 (EDT) From: Chris Fowler To: Bryan Stansell Cc: users@conserver.com Message-ID: <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com> In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> Subject: Re: what is normal conserver hang 
during reconfig MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_9805267_1130606666.1476850567098" X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194) Thread-Topic: what is normal conserver hang during reconfig Thread-Index: kiVqs4mkBxZhlOj7ESKvTomJUPGaMQ== X-Spam-Score: -0.277 () BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 04:16:14 -0000 ------=_Part_9805267_1130606666.1476850567098 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit > From: "Bryan Stansell via users" > To: users@conserver.com > Sent: Tuesday, October 18, 2016 11:39:19 PM > Subject: Re: what is normal conserver hang during reconfig > With that in mind, how many consoles are each child managing? The compile time > default can be seen with a "conserver -V", but it can be overridden with -m. > I'm honestly not sure if having more or less would be better or even change > things (more processes would use more cores, but also "slam" the system with > that many things reading and processing the config). > Conserver tries very hard to be multiplex across all the consoles, even when > bringing up and tearing down things. The reread of the config puts all that on > hold, so it probably has to do with that. > One issue I've seen before is the magnitude of DNS lookups done when a config is > loaded. It all depends on the config, of course, but you could end up > generating a lot of requests. Maybe it doesn't apply in your environment, but > it can be an unexpected source of trouble. Does a HUP close and open consoles? Does a HUP open consoles that are down? If it is going after consoles that are down and blocking, that could be what is going on. 
On a local device I can manage 100 consoles with ease. Only a couple are serial; the rest are programs or log file tails. Chris ------=_Part_9805267_1130606666.1476850567098 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable



------=_Part_9805267_1130606666.1476850567098-- From bryan@conserver.com Wed Oct 19 05:10:46 2016 Received: from [192.168.0.133] (c-98-207-6-47.hsd1.ca.comcast.net [98.207.6.47]) (authenticated bits=0) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9J5AiiK008388 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Wed, 19 Oct 2016 05:10:45 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com; s=s0001; t=1476853846; bh=WEN0UKPRdwKWLUWj/EytUnYsrEQLD5Zo7eBjroCK6xU=; h=From:Subject:Date:References:To:In-Reply-To; b=jmXxwHF+SXlluetyz7e3Vi79cUSBi71nxWJ9D/HQwIvRvd6KPKkhLUCBgKRGCvqFR g8RquJxadfrFCY9eE2OfihR0+AxJ27oNhc6ILRU6itl9LzF+cLpAxBoSH8df0peobQ S0KYBnay0rmwVu7UmbdVa2/Cohp/ZwNrjTcw+WaSUSD1PRXbZWeh7wvPUxPDx60+4R eU4t08w7Hp1P7cGqo+CVPlWICm7d1Kc2QIqYDU2/WpUfibbFbSeF0f670mDYIFdE6u O3ah5uVZxr/qureAoz8PfZmZFOoxYPCOP4exw0fFNJNnmCuneINiiPUl9NP+TlOCcl OLY+iM+o8iTTA== From: Bryan Stansell Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\)) Subject: Re: what is normal conserver hang during reconfig Date: Tue, 18 Oct 2016 22:10:43 -0700 References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com> To: users@conserver.com In-Reply-To: <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com> Message-Id: <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com> X-Mailer: Apple Mail (2.3226) X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org id u9J5AiiK008388 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 05:10:46 -0000 > On Oct 18, 2016, at 9:16 PM, Chris Fowler via 
users wrote: > > Does a HUP close and open consoles? Does a HUP open consoles that are down? If it si going after consoles that are down and blocking that could be what is going on. > A HUP doesn't close and open consoles (it will reopen log files though). It will try and open anything down (socket connections are set up to be non-blocking). Bryan From cfowler@outpostsentinel.com Wed Oct 19 05:15:06 2016 Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9J5F2Ys008528 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO); Wed, 19 Oct 2016 05:15:05 GMT Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 46BAA813C214; Wed, 19 Oct 2016 01:15:02 -0400 (EDT) Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id oFYWxUx0L3X7; Wed, 19 Oct 2016 01:15:01 -0400 (EDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id CCD33813C230; Wed, 19 Oct 2016 01:15:01 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id FACf4IKm6oY1; Wed, 19 Oct 2016 01:15:01 -0400 (EDT) Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109]) by zcs-mta.vps-host.net (Postfix) with ESMTP id AB1E9813C20A; Wed, 19 Oct 2016 01:15:01 -0400 (EDT) Date: Wed, 19 Oct 2016 01:15:01 -0400 (EDT) From: Chris Fowler To: Bryan Stansell Cc: users@conserver.com Message-ID: <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com> In-Reply-To: <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com> References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> 
<82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com> <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com> Subject: Re: what is normal conserver hang during reconfig MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_9807223_892677169.1476854101545" X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194) Thread-Topic: what is normal conserver hang during reconfig Thread-Index: gmUmuYlIxXi+2O0yVaRDg2c7hAoX+A== X-Spam-Score: -0.277 () BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 05:15:06 -0000 ------=_Part_9807223_892677169.1476854101545 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit > From: "Bryan Stansell via users" > To: users@conserver.com > Sent: Wednesday, October 19, 2016 1:10:43 AM > Subject: Re: what is normal conserver hang during reconfig > > On Oct 18, 2016, at 9:16 PM, Chris Fowler via users wrote: >> Does a HUP close and open consoles? Does a HUP open consoles that are down? If >> it si going after consoles that are down and blocking that could be what is > > going on. > A HUP doesn't close and open consoles (it will reopen log files though). It will > try and open anything down (socket connections are set up to be non-blocking). A slowdown with DNS lookups could be the culprit. At this point I'd strace it to see where it is spending time. ------=_Part_9807223_892677169.1476854101545 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable



= ------=_Part_9807223_892677169.1476854101545-- From denis.hainsworth@gmail.com Wed Oct 19 13:35:09 2016 Received: from mail-qk0-f175.google.com (mail-qk0-f175.google.com [209.85.220.175]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9JDZ6Rb014391 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Wed, 19 Oct 2016 13:35:08 GMT Received: by mail-qk0-f175.google.com with SMTP id n189so35565847qke.0; Wed, 19 Oct 2016 06:35:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=DCGXnpfEygyzU1Pi3T7mCny0Be8nw/tr3yajPHolK40=; b=GeDbOe0cO94Nsux3OaywyUY6ICqpFK3BaAC3Xl3rhE58hC1GpWkAyosDk+iIr/4G96 6ghCIXBy5xFztzDxwTNmA+0LPi8oXMshvv+Z2a28iWXPYLTXYKZyGqEzf4ryA7aouRt3 I9O9vCaItG59TVaQ5Mh/PXmXLeaNNobV8q3xuRsJdKELMHyvjD0qQ63nA96coHDmhwb7 fS8H226TQ/iDT/gRLvLXorIRAd10WvwpAwl78Xyb/gCPpaf4VmYT9Tw6n1Jst/PXHUZK HLbXzGvrOtjffOrC9P8eruDdxdvLnwkp0BH6QCfxlzh8oXyrHoOtqWADU/24WoW4zms8 V9fA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=DCGXnpfEygyzU1Pi3T7mCny0Be8nw/tr3yajPHolK40=; b=SyJJA3CqHELqUP4PoUMEGNWBtVOoYZhCwpuJyYL66E5VFCRrXCB9t5tLooINFJxuWU JImSq3cx071WELf+p6i8FBXU514WU2zWXoEfyFK95QtPn62GhrV5qxhWPmo8xDP+MQm8 XOBsGUtkzmXmurPUJSd8bEsEtVJyaYQXlEfxR8hnLZEfbyhx+7hpE5GQWwhUAgJSiQGL O1rrRgZRatgdN8nwQZbGlofpQ7fiAif6MbCJK7/rqKS68gBYCaJI3f81BYSrapnlXPA5 5dRSyXJYPBPiWFm4eAL6jl7ZfZZpUD7TXzmwX7gdUmnBt1Y3d/zAO/Wel5YBVVQF6Hcp JUUQ== X-Gm-Message-State: ABUngvcggZZWidtYb4kpM8CV66byA+9c0awepHiolS74mqjaDyjg//nR/jpSLeCSg5ixHw== X-Received: by 10.55.212.195 with SMTP id s64mr5944879qks.216.1476884103597; Wed, 19 Oct 2016 06:35:03 -0700 (PDT) Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. 
[2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id w72sm20862186qkb.33.2016.10.19.06.35.02 (version=TLS1 cipher=AES128-SHA bits=128/128); Wed, 19 Oct 2016 06:35:02 -0700 (PDT) Received: by xmas.dyndns.org (Postfix, from userid 501) id B0A9BBFA090; Wed, 19 Oct 2016 09:35:00 -0400 (EDT) Date: Wed, 19 Oct 2016 09:35:00 -0400 From: Denis Hainsworth To: Chris Fowler Cc: Bryan Stansell , users@conserver.com Subject: Re: what is normal conserver hang during reconfig Message-ID: <20161019133500.GR27007@cs.brandeis.edu> Reply-To: Denis Hainsworth References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> <82546220.9805268.1476850567099.JavaMail.zimbra@outpostsentinel.com> <3F2E8C78-12B9-462D-8017-9068A84A674A@conserver.com> <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1361003158.9807224.1476854101546.JavaMail.zimbra@outpostsentinel.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 19 Oct 2016 13:35:10 -0000 Thanks for the ideas guys, I'll see what I can dig up. 
I only realized last night my first email was sent before I updated my subscription address so the list just quietly ignored it :) -denis From denis.hainsworth@gmail.com Sun Oct 23 06:42:18 2016 Received: from mail-qk0-f175.google.com (mail-qk0-f175.google.com [209.85.220.175]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9N6gDsB001310 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Sun, 23 Oct 2016 06:42:17 GMT Received: by mail-qk0-f175.google.com with SMTP id o68so201059808qkf.3; Sat, 22 Oct 2016 23:42:15 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=tjxkh0QchxcBAlu45AsQxoGRZIkEUPAlLe24VrsCn68=; b=skUUkwboAENNyMxHI3oCLw+CWJP4d2UEEj1Vc8lFzhAbdSMMeaGyaD3Jk7XAh/KO9w CeIcmmYxXUR0rssuBBw/MAHEdnKX2BUDAd9uwRRA+Ded3bvDUXXeV2covPOmTjH0Rmuq i/CCMb/FAeJcm07Dcww/XHf7wuHrgiEwivqpfGfdSF3X/f8nGKt6GEMeREnMIA/z9i9Z D9kG/cr1h7hEAzDJ+KOHloYiXCTM45qFGKbZlsCpBtvb/FdfyoSJt3gUyWj970G9HA6X KOaKXxbbGdA1/TjtoIU8VXeDQk72nsX0CqWHAnrB23r7FhYlJ+3oGFcs7Rk2mD2cFpk+ MdxQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=tjxkh0QchxcBAlu45AsQxoGRZIkEUPAlLe24VrsCn68=; b=eu51WEcLwEQxz3bELJKdOoWfGedTf9sDwBrUISJCJ9IPJaR0+azvHWpetIA+V5LNwB EGsuYDdzUwUMzhAU8AYv7LUrsSONmo0nRe5trLlyA2pXeAeq3bYYVoKr6Vse91KHWV2K 5oOi25pXAYHG6xBEpBTlPYVDLTjLWFv75fC8eGc939mcaVtwYdDMkkZcTHoyyf0dyBFZ MZ01WLfrTUX+cDGc+dmP+mYOAQTRNqCANhx+hK9/1HIQSg1PQTCRiamsW7FqLbkJpHWY WIHjYISgfHDUtvgA84GabfMFFxcpUs5SNbndVcEL652yODltEQGy8a2+XL+xEryKyJhu A66g== X-Gm-Message-State: ABUngvcfeILp2bKKR2fLx2k26hiQkVHXjeVtDaD7bB9jVdxH/wkrfYrb3dWS72JQGusQdw== X-Received: by 10.55.209.147 with SMTP id o19mr10217645qkl.125.1477204932796; Sat, 22 Oct 2016 23:42:12 -0700 (PDT) Received: from 
xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. [2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id f62sm5595163qka.3.2016.10.22.23.42.11 (version=TLS1 cipher=AES128-SHA bits=128/128); Sat, 22 Oct 2016 23:42:12 -0700 (PDT) Received: by xmas.dyndns.org (Postfix, from userid 501) id 3A19D8C21B9; Sun, 23 Oct 2016 02:42:10 -0400 (EDT) Date: Sun, 23 Oct 2016 02:42:10 -0400 From: Denis Hainsworth To: Bryan Stansell Cc: users@conserver.com Subject: Re: what is normal conserver hang during reconfig Message-ID: <20161023064210.GF6698@cs.brandeis.edu> Reply-To: Denis Hainsworth References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS, URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 23 Oct 2016 06:42:18 -0000 Finally got time to look at things. strace is perfect, thanks for suggesting that. So running something like strace -t -o strace.out.2 -p 3198 and sending a SIGHUP to the parent process showed the issue. So the way we've always set things up was to automatically generate one config file per console server from our equipment database. This means there are 264 files that are #included into the main config file. The first 30s of "hang" is each process opening each file, reading it in, and closing it. I'm wondering if we need to block I/O during this, or perhaps that could be done before we start blocking? 
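A trace captured this way can be summarized with a short script to see how long the reconfig took and which system calls dominated it. The following is only a rough sketch, not something from the thread: it assumes the plain `strace -t` line layout ("HH:MM:SS syscall(args) = result", no `-f` PID prefix), it ignores traces that cross midnight, and the function name `summarize` and the sample paths are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout of a `strace -t` line: "HH:MM:SS syscall(args...) = result".
LINE_RE = re.compile(r"^(\d{2}):(\d{2}):(\d{2}) (\w+)\(")


def summarize(lines):
    """Return (elapsed_seconds, Counter of syscall names) for a trace."""
    first = last = None
    calls = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m is None:
            continue  # signal notices and anything else that is not a syscall line
        hours, minutes, seconds, name = m.groups()
        stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        if first is None:
            first = stamp
        last = stamp
        calls[name] += 1
    elapsed = 0 if first is None else last - first
    return elapsed, calls


# Hypothetical sample lines; the file names are invented for this example.
SAMPLE = [
    '02:10:00 --- SIGHUP {si_signo=SIGHUP, si_code=SI_USER} ---',
    '02:10:00 open("/etc/conserver/host001.cf", O_RDONLY) = 5',
    '02:10:00 read(5, "console host001 {", 4096) = 120',
    '02:10:01 close(5) = 0',
    '02:10:29 open("/etc/conserver/host264.cf", O_RDONLY) = 5',
    '02:10:30 close(5) = 0',
]

elapsed, calls = summarize(SAMPLE)
print("elapsed seconds:", elapsed)
for name, count in calls.most_common():
    print(name, count)
```

To run it against a real capture, pass the open file instead of the sample, for example `summarize(open("strace.out.2"))`. A large count of open/read/close calls spread over many seconds would point at config-file I/O; a long stretch dominated by socket calls would point at DNS.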
Once that is done there is another 10s of hang while we do the DNS lookup for each console host, as you thought (open /etc/hosts, make a DNS query, resolve it). I tried putting all the configs into one file but that didn't change anything. So then I started wondering. Our IT had long ago made the console servers VMs. It's never seemed like an issue, but I compared some basic dd commands and found my problem server has terrible I/O throughput ... sigh. To compare, one of my good servers has about 80 MB/s read/write and the bad one has around 15 MB/s read/write. So I'm going to look into moving the VM or getting the disk perf up, which should solve most of my issues, but I also wonder if the conserver code could be reorganized without too much trouble to avoid blocking when the disk is slow? It's possible what I'm asking is dumb, just throwing it out there. -denis On Tue, Oct 18, 2016 at 08:39:19PM -0700, Bryan Stansell via users wrote: > Off the top of my head, I agree that there shouldn't be anything fixed in the newer code to address this. The code does block all activity when it processes a HUP signal, but that's supposed to be "quick". :-| > > Each process (the main and children) rereads the config file and figures out if there's anything to do. The main process is in charge of spawning new consoles (or reconfigured), and the children are responsible for letting go of old ones (or reconfigured). > > With that in mind, how many consoles are each child managing? The compile time default can be seen with a "conserver -V", but it can be overridden with -m. I'm honestly not sure if having more or less would be better or even change things (more processes would use more cores, but also "slam" the system with that many things reading and processing the config). > > Conserver tries very hard to be multiplex across all the consoles, even when bringing up and tearing down things. The reread of the config puts all that on hold, so it probably has to do with that. 
> > One issue I've seen before is the magnitude of DNS lookups done when a config is loaded. It all depends on the config, of course, but you could end up generating a lot of requests. Maybe it doesn't apply in your environment, but it can be an unexpected source of trouble. > > Aside from that, another server will certainly share the load (and, set up right, the end users won't even notice). It would be interesting to look at an strace (assuming linux) of a process when it gets a HUP (even without any changes to configs). Just send one of the children a HUP so it minimizes the impact. With timestamps, it might highlight what is causing the issue (like the DNS query case, but could be anything). > > Bryan > > > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users wrote: > > > > Running v 8.1.18. Rereading the SIGHUP section of the man page I'm > > still thinking I've configured something wrong. SIGHUP says conserver > > rereads the config files and then adds/deletes consoles as needed and > > only touches running consoles if they have changed. If thats true I > > wouldn't expect a 30s buffer of input/output on a console that hasn't > > changed, should I? > > I also don't see anything in CHANGES that sounds like this is a bug > > that has been fixed. > > > > -denis > > > > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote: > >> I love conserver. I have a minor issue and I was curious what options > >> there might be. > >> > >> So I have a conserver setup running against 262 servers (mostly digis or > >> ser2net machines). It works great. However when we need to update due > >> to a config change we run "kill -HUP" against the parent. With the > >> number of consoles (I think) this causes about a 30s "hang" when > >> interacting with any console which corresponds to the reconfig time. > >> > >> Does this make sense and is per the current design? Any chance there is > >> a clever way to make it block for less time? 
Barring that I intend to > >> spin up a new server to share the load of my current server and reduce > >> the reconfig time. > >> > >> I was mostly curious if there was a config issue or if this description > >> doesn't make any sense to folks and it means I have something else going > >> on like too many down consoles or something. > >> -denis > >> > >> -- > >> __________________________ > >> Denis Alan Hainsworth > >> denis.hainsworth@gmail.com > > > > -- > > __________________________ > > Denis Alan Hainsworth > > denis.hainsworth@gmail.com > > _______________________________________________ > > users mailing list > > users@conserver.com > > https://www.conserver.com/mailman/listinfo/users > > > _______________________________________________ > users mailing list > users@conserver.com > https://www.conserver.com/mailman/listinfo/users -- __________________________ Denis Alan Hainsworth denis.hainsworth@gmail.com From bryan@conserver.com Sun Oct 23 17:34:36 2016 Received: from [192.168.0.132] (c-98-207-6-47.hsd1.ca.comcast.net [98.207.6.47]) (authenticated bits=0) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPSA id u9NHYY2g012565 (version=TLSv1.2 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO) for ; Sun, 23 Oct 2016 17:34:36 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=conserver.com; s=s0001; t=1477244076; bh=xXEDm1btcvPLVErj9fGuwzUMqrksitFGhAAlNxInC7k=; h=From:Subject:Date:References:To:In-Reply-To; b=AzALwaUFdFG8VuRPsUevase76KOR/H0lygxjMofuTH8mTiUiHWoFCszjE+TjQSsHh X2UYvjSPIuH/0XZ1FMbw295dBE7GV8DoywTta3Ropk0o13bcAv1QdtqPcG4dpAKs8p QrY2/vS1ltb7YMxg3pFHAe4uoPRhYs8rnfmgHblYr0drfhzPKzHdPDh6RU4Q7x9xDI xzZjOQn30n2JgnmIdNoHVNCLVTx6Sy0JqWYvwfgNTEoza0z7O9Iffmba9RVZ2SVI5B +byfLqY6X7g/iXwhOfAvVlagfBASwaNVY5nNSq4IBgEtBjrVSy2oeX7ale05G8TQwh AFJi641BtfbCQ== From: Bryan Stansell Content-Type: text/plain; charset=us-ascii Mime-Version: 1.0 (Mac OS X Mail 10.0 \(3226\)) Subject: Re: what is normal conserver hang during reconfig Date: Sun, 23 Oct 2016 
10:34:36 -0700 References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> <20161023064210.GF6698@cs.brandeis.edu> To: users@conserver.com In-Reply-To: <20161023064210.GF6698@cs.brandeis.edu> Message-Id: <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com> X-Mailer: Apple Mail (2.3226) X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 Content-Transfer-Encoding: 8bit X-MIME-Autoconverted: from quoted-printable to 8bit by underdog.stansell.org id u9NHYY2g012565 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 23 Oct 2016 17:34:37 -0000 I'm glad you were able to find the source of "most" of your troubles. I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring. The code that does that never got folded into the loop that handles I/O, but could...and really should. No one has ever called it out as a serious enough problem before. :-) I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure. Bryan > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users wrote: > > Finally got time to look at things. strace is perfect, thanks for > suggesting that. > > So running something like > strace -t -o strace.out.2 -p 3198 > and sending a SIGHUP to the parent process showed the issue. > > So the way we've always set things up was to automatically generate one > config file per console server from our equipment database. This means > There are 264 files that are #included into the main config file. > The first 30s of "hang" is each process opening each file reading it in > and closing it, I'm wondering if we need to block I/O during this or > perhaps that could be done before we start blocking? 
> Once that is done there is another 10s of hang while we do the dns > lookup for each console host as you thought (open /etc/hosts, make a dns > query, resolve it). > > I tried putting all the configs into one file but that didnt change > anything. So then I started wondering. Our IT had long ago made the > console servers VMs. Its never seemed like an issue but I compared > some basic dd commands and found my problem server has terrible IO > throughput ... sigh. To compare one of my good servers has about > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write. > > So I'm going to look into moving the VM or get the disk perf up which > should solve most of my issues but I also wonder if the conserver code > could be re-organized without too much trouble to avoid issues of > blocking when there is slow disk? Its possible what I'm asking is dumb, > just throwing it out there. > > -denis From denis.hainsworth@gmail.com Sun Oct 23 17:57:42 2016 Received: from mail-qk0-f182.google.com (mail-qk0-f182.google.com [209.85.220.182]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9NHvd4I013079 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Sun, 23 Oct 2016 17:57:41 GMT Received: by mail-qk0-f182.google.com with SMTP id o68so210298078qkf.3; Sun, 23 Oct 2016 10:57:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=CUn74pvi+GDRhFo+Cwsq3reRmKxpDRH1bdxitb22rig=; b=zAlkh4GiL+/mctxh74qjFPQ20YBCHRHmSjXEvq/OCI7z1a4CFRBuAuUI0Bey3RWm3v lvYZss7D06PjygNQu3vvTf+OlWTt53V+lX8mWY/vNmwBKIiZKPUtUJH+r+dlw+dlYrRW z8lBFJv7PWqZvXVQaod6Z2ykhGe0Ca85Cjn7uqfe5FrR+ds4YsoRIHbzNOp4TPiwxM6y br7Zsa9XeAqqsZruqQgq9JIt/Kjw4hmBZ0tXonkg9ziSsQWRlwNUdg8WH7rym2/9N/Pn fN9t4BBtkPqQaxh9+d0vhtA/EIgw29DVdOLR5Z9mZkLhOEZ/K7fvRKMTdwwsscvQfhLp aTIA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; 
s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=CUn74pvi+GDRhFo+Cwsq3reRmKxpDRH1bdxitb22rig=; b=NnWnp2SOrghtFICqQzYFC7OJas4yfsX3yRCt3iAiwxOECEb5eKofHDV8v0yEYvL74I 70LhcsE5xbyE0foygluqJO/+z1T7tobOPzjNDbtTPpeD+YnDUJzvP/z1g7N/s9upUfbz IfHd4g53Ag6wE2I3P9IYOKNPmshJjuGrciBQvE73BZFkf7Fx3/xsmBO4LTJtctLSShY4 TEquXV6oDRT7bbHQpfj9NxoCG65jl2ro0k2zjj8eCf1MGEoYb1JylCRoa0GRP0AkCak3 ABYxMTgSi84M8yLjLkOz9ZOZ0c0ZgPLwHNqOeOoDGfDqWcSgt3AzEFg4xA4YOdQHwVo2 MO2g== X-Gm-Message-State: ABUngvdHjD63JN/8YH4CQmVM8HUN/BeGms6fgH574HX4gq1YQEzdP3OjM8RCCms8SstTPA== X-Received: by 10.55.200.152 with SMTP id t24mr12622482qkl.205.1477245457769; Sun, 23 Oct 2016 10:57:37 -0700 (PDT) Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. [2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id n77sm6580252qkn.28.2016.10.23.10.57.37 (version=TLS1 cipher=AES128-SHA bits=128/128); Sun, 23 Oct 2016 10:57:37 -0700 (PDT) Received: by xmas.dyndns.org (Postfix, from userid 501) id F07F68C21B9; Sun, 23 Oct 2016 13:57:35 -0400 (EDT) Date: Sun, 23 Oct 2016 13:57:35 -0400 From: Denis Hainsworth To: Bryan Stansell Cc: users@conserver.com Subject: Re: what is normal conserver hang during reconfig Message-ID: <20161023175735.GH6698@cs.brandeis.edu> Reply-To: Denis Hainsworth References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> <20161023064210.GF6698@cs.brandeis.edu> <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS, URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 
Precedence: list
List-Id: Conserver Users
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 23 Oct 2016 17:57:43 -0000

Dang it, my theory didn't pan out. While the slower of the two did in
fact have slower disks, my IT was able to move the VM to some ultra fast
storage and my reconfig loop wasn't any faster. :( And it was such a
lovely theory too.

So I'm still digging to see if I can come up with a second clever idea
but I have a feeling that to reduce the reconfig time I'll just have to
spread the load over more systems.

-denis

On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> I'm glad you were able to find the source of "most" of your troubles. I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring. The code that does that never got folded into the loop that handles I/O, but could...and really should. No one has ever called it out as a serious enough problem before. :-)
>
> I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
>
> Bryan
>
> > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users wrote:
> >
> > Finally got time to look at things. strace is perfect, thanks for
> > suggesting that.
> >
> > So running something like
> > strace -t -o strace.out.2 -p 3198
> > and sending a SIGHUP to the parent process showed the issue.
> >
> > So the way we've always set things up was to automatically generate one
> > config file per console server from our equipment database. This means
> > There are 264 files that are #included into the main config file.
> > The first 30s of "hang" is each process opening each file reading it in
> > and closing it, I'm wondering if we need to block I/O during this or
> > perhaps that could be done before we start blocking?
> > Once that is done there is another 10s of hang while we do the dns > > lookup for each console host as you thought (open /etc/hosts, make a dns > > query, resolve it). > > > > I tried putting all the configs into one file but that didnt change > > anything. So then I started wondering. Our IT had long ago made the > > console servers VMs. Its never seemed like an issue but I compared > > some basic dd commands and found my problem server has terrible IO > > throughput ... sigh. To compare one of my good servers has about > > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write. > > > > So I'm going to look into moving the VM or get the disk perf up which > > should solve most of my issues but I also wonder if the conserver code > > could be re-organized without too much trouble to avoid issues of > > blocking when there is slow disk? Its possible what I'm asking is dumb, > > just throwing it out there. > > > > -denis > > > _______________________________________________ > users mailing list > users@conserver.com > https://www.conserver.com/mailman/listinfo/users -- __________________________ Denis Alan Hainsworth denis.hainsworth@gmail.com From denis.hainsworth@gmail.com Mon Oct 24 00:46:40 2016 Received: from mail-qk0-f180.google.com (mail-qk0-f180.google.com [209.85.220.180]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9O0kbQq026657 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Mon, 24 Oct 2016 00:46:39 GMT Received: by mail-qk0-f180.google.com with SMTP id o68so216548837qkf.3; Sun, 23 Oct 2016 17:46:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=HFq0OvgiMayf6CVAM6zlwpLFFbmGD8a9pywWjEXAlcU=; b=FnBXNc4zojsSQcdzBU347qOqngvRvaVwXQPcBjuYj6Vtfgq3LT+bz9l477glSJtfwJ VMk9+WE5OQTHBuoMZ6B8n6/8qEkuD2tuzALAcn90h7uSI8lfKW3fxKfWl/fsZ31cEfPq 
pN4MszvOVjzAXdY4pM9F1NF+76V8Iu46BvueBl49iBVzwQ/cJQYW6f/YMnC84L+mmO4p sMhXt91OBxK77sSYcYX33P3p5SIhOpBZplvHM3J97s5LWsDTRA64hb8fJ+ohZ5hCsZpL NrBpLb4rmF6qGuDWWtCG4I/5X2+f+nVaDtArR35y9ivheg0u7tW1LoP5pzTccsysy+/B J0mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=HFq0OvgiMayf6CVAM6zlwpLFFbmGD8a9pywWjEXAlcU=; b=C/wEzEp1x81hT7uoOtJvDA6XwU2vmnWJDPd5bmTZEuChULoG8mwS8iUGfyfU1QeLvB 3ohIX2JfnsdRgEeFxmCbQ1lt5yzdcM5nwS/ELgK/i9xD+H7fwErTq/T9VN1lnhPa7BiK Ar3OoTfBe/7fgW+pSOXhnAJVqGiZEanirkCTnN0KyHbeyweQyoQLkMtyMz8zNmhvdR8z xiHaNQme5eb9lQHaJ0QmHw4Y3v5C9Og2FoaXJusysjrte0cgAaX7zaZbM/a0wb0Sj3mN 6AGfz0AAjNqD+NzSgHWJ9fTm96CQYkMMK3Ft/A2w5ioaynsST38rwoG+rvoqNIFB+8ah FLEA== X-Gm-Message-State: ABUngvd9g0d4qcjqwl+vaV9kvjOt3fGHbDw47gr6aQAtbNvwozmIxiKxTijD1w1bWJS3JA== X-Received: by 10.55.167.201 with SMTP id q192mr9515523qke.61.1477269996431; Sun, 23 Oct 2016 17:46:36 -0700 (PDT) Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. 
[2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id 21sm7247104qkg.27.2016.10.23.17.46.35 (version=TLS1 cipher=AES128-SHA bits=128/128); Sun, 23 Oct 2016 17:46:35 -0700 (PDT)
Received: by xmas.dyndns.org (Postfix, from userid 501) id 17EB68C21B9; Sun, 23 Oct 2016 20:46:34 -0400 (EDT)
Date: Sun, 23 Oct 2016 20:46:33 -0400
From: Denis Hainsworth
To: Bryan Stansell
Cc: users@conserver.com
Subject: Re: what is normal conserver hang during reconfig
Message-ID: <20161024004633.GK6698@cs.brandeis.edu>
Reply-To: Denis Hainsworth
References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> <20161023064210.GF6698@cs.brandeis.edu> <4863A133-4C31-424B-9837-5777A6EA321F@conserver.com> <20161023175735.GH6698@cs.brandeis.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20161023175735.GH6698@cs.brandeis.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Spam-Score: -0.377 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 24 Oct 2016 00:46:42 -0000

So I think I have a solution which avoids the issue rather than fixing
anything :)

I had to try to recall why we have things the way we do, since it's
going on 10 years or more (have I said I love conserver?)

So back in the day we set up the single conserver instance which was
obviously going to be the master. We later started populating a couple
other servers at a few different sites. To keep things simple and
robust every server got the same config set and could be the master.
However in reality I'm pretty sure no one uses anything but the default
master ever.

So if I reduce the configs on all the slave servers to only the configs
they actually own (especially on the ones that reconfig the most and
cause the most grief to the users when it takes 40s), then my times
drop down to 3s and 13s respectively.

Not a perfect solution but it should require a minimum of changes for
everyone involved. Hopefully no one will remind me tomorrow of
something I forgot :)

-denis (purveyor of good enough solutions)

On Sun, Oct 23, 2016 at 01:57:35PM -0400, Denis Hainsworth wrote:
> Dang it, my theory didn't pan out. While the slower of the two did in
> fact have slower disks my IT was able to move the VM to some ultra fast
> storage and my reconfig loop wasn't any faster. :( And it was such a
> lovely theory too.
>
> So I'm still digging to see if I can come up with a second clever idea
> but I have a feeling to reduce to reconfig time I'll just have to spread
> the load over more systems.
>
> -denis
>
> On Sun, Oct 23, 2016 at 10:34:36AM -0700, Bryan Stansell via users wrote:
> > I'm glad you were able to find the source of "most" of your troubles. I quote that because, yes, theoretically the code could be a lot nicer and not block while reconfiguring. The code that does that never got folded into the loop that handles I/O, but could...and really should. No one has ever called it out as a serious enough problem before. :-)
> >
> > I'll certainly put it on the list to look at...but it's not a "simple" change, that's for sure.
> >
> > Bryan
> >
> > > On Oct 22, 2016, at 11:42 PM, Denis Hainsworth via users wrote:
> > >
> > > Finally got time to look at things. strace is perfect, thanks for
> > > suggesting that.
> > >
> > > So running something like
> > > strace -t -o strace.out.2 -p 3198
> > > and sending a SIGHUP to the parent process showed the issue.
> > >
> > > So the way we've always set things up was to automatically generate one
> > > config file per console server from our equipment database.
This means > > > There are 264 files that are #included into the main config file. > > > The first 30s of "hang" is each process opening each file reading it in > > > and closing it, I'm wondering if we need to block I/O during this or > > > perhaps that could be done before we start blocking? > > > Once that is done there is another 10s of hang while we do the dns > > > lookup for each console host as you thought (open /etc/hosts, make a dns > > > query, resolve it). > > > > > > I tried putting all the configs into one file but that didnt change > > > anything. So then I started wondering. Our IT had long ago made the > > > console servers VMs. Its never seemed like an issue but I compared > > > some basic dd commands and found my problem server has terrible IO > > > throughput ... sigh. To compare one of my good servers has about > > > 80Mbp/s read/write and the bad one has around 15Mbp/s read/write. > > > > > > So I'm going to look into moving the VM or get the disk perf up which > > > should solve most of my issues but I also wonder if the conserver code > > > could be re-organized without too much trouble to avoid issues of > > > blocking when there is slow disk? Its possible what I'm asking is dumb, > > > just throwing it out there. 
> > > > > > -denis > > > > > > _______________________________________________ > > users mailing list > > users@conserver.com > > https://www.conserver.com/mailman/listinfo/users > > -- > __________________________ > Denis Alan Hainsworth > denis.hainsworth@gmail.com -- __________________________ Denis Alan Hainsworth denis.hainsworth@gmail.com From consoleteam@gmail.com Tue Oct 25 15:00:23 2016 Received: from mail-wm0-f47.google.com (mail-wm0-f47.google.com [74.125.82.47]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PF0J4s023352 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Tue, 25 Oct 2016 15:00:22 GMT Received: by mail-wm0-f47.google.com with SMTP id d128so32157426wmf.1; Tue, 25 Oct 2016 08:00:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=ItVS39PAg2Kzd5ByyQ2hEoCbg+xH6+ttk0SW+iKQbuY=; b=au53wTlp7+Kj5GZ7Y+YcRSN26F5Uits4zysGiLa53cM6Lx5lbp01PhYbl5DRbLvqOa L09k1ZhlyU4dZt2sAPCLvHS7iAidCeYuOXv1++OqA9+CbZlqlANkfy6VAZFeraE7grI0 PdvpYm7yRp3LJ0lu99BtNnQg0vKe/Kk/0mKoImSPnPIQZ9F0qrXZxBd8qv0V6BY5/kDA G47euD9NqOmEVJKJi6VLVgv3UeJEvP1uCoo+zVMjfkGrRSGJj5Hk9hh4V9lEfjAc4/yG DyTSnQ6AscLb/eNcFCfb16Mh599cLWlfvFKM5QourYdyYZVFK2CRfMkJAnld2SXOEqJ3 TuhA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=ItVS39PAg2Kzd5ByyQ2hEoCbg+xH6+ttk0SW+iKQbuY=; b=lLvr2Z0I3cLL9/vQJUdNq3DqCJ5n0/aQnTmhDh12WwdBdSr7QFEKv3f72aDpY0ZGTJ 3s+7jvwr5agCePb+D8jfCcx5UGW2mQQQpggqbdWfVlz1RgeQWGEUp9Xb6Gs2/hLoirgD GigBfCeMMVIpdl+hVR0Np6jA5TmHINE+1FFGgJF9AK6ZVbHvxtIsk+Ic/6CZwhar0650 CKf8ZMHLk+0vhxKzAGj6HN2AZw7bdQ3WnfUBdN633BWY0PKZEaJynSR/qhza7muWbxPf l/FgCoTjv2vnMZbdehzLOEuTrvhV1Az3xl7mMs/trYWpUkC0zNMLP8f8v84Woy3Tju0M l2rw== X-Gm-Message-State: ABUngve7F3HcQd9sEOBw8Fd7b8reYyIQMGH1oJd8zZrIt6VC1zcu34S1h3XaFdd19UgjcpDsRxT9llwE/n7trQ== 
X-Received: by 10.28.154.150 with SMTP id c144mr3799370wme.25.1477407617219; Tue, 25 Oct 2016 08:00:17 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.194.118.73 with HTTP; Tue, 25 Oct 2016 08:00:16 -0700 (PDT)
In-Reply-To: <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com>
From: Zonker
Date: Tue, 25 Oct 2016 08:00:16 -0700
Message-ID:
Subject: Re: what is normal conserver hang during reconfig
To: Bryan Stansell
Cc: "users@conserver.com"
Content-Type: multipart/alternative; boundary=001a114b303836821c053fb1c44b
X-Spam-Score: 0.624 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, FREEMAIL_REPLY, HTML_MESSAGE, SPF_PASS, URIBL_SBL
X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21
X-BeenThere: users@conserver.com
X-Mailman-Version: 2.1.23
Precedence: list
List-Id: Conserver Users
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 25 Oct 2016 15:00:25 -0000

--001a114b303836821c053fb1c44b
Content-Type: text/plain; charset=UTF-8

To Bryan's point about DNS lookups, I expect that my main conserver will
be the first thing online (and then the network comes up...) and it will
be the last thing down. As a result, I have all of my console servers
listed in the /etc/hosts file, and I look at the file first. I have 67
conserver child processes with 16 ports under each, and my HUP is just a
few seconds.

Best regards,

-Z-

On Tue, Oct 18, 2016 at 8:39 PM, Bryan Stansell via users <
users@conserver.com> wrote:

> Off the top of my head, I agree that there shouldn't be anything fixed in
> the newer code to address this. The code does block all activity when it
> processes a HUP signal, but that's supposed to be "quick". :-|
>
> Each process (the main and children) rereads the config file and figures
> out if there's anything to do.
The main process is in charge of spawning > new consoles (or reconfigured), and the children are responsible for > letting go of old ones (or reconfigured). > > With that in mind, how many consoles are each child managing? The compile > time default can be seen with a "conserver -V", but it can be overridden > with -m. I'm honestly not sure if having more or less would be better or > even change things (more processes would use more cores, but also "slam" > the system with that many things reading and processing the config). > > Conserver tries very hard to be multiplex across all the consoles, even > when bringing up and tearing down things. The reread of the config puts > all that on hold, so it probably has to do with that. > > One issue I've seen before is the magnitude of DNS lookups done when a > config is loaded. It all depends on the config, of course, but you could > end up generating a lot of requests. Maybe it doesn't apply in your > environment, but it can be an unexpected source of trouble. > > Aside from that, another server will certainly share the load (and, set up > right, the end users won't even notice). It would be interesting to look > at an strace (assuming linux) of a process when it gets a HUP (even without > any changes to configs). Just send one of the children a HUP so it > minimizes the impact. With timestamps, it might highlight what is causing > the issue (like the DNS query case, but could be anything). > > Bryan > > > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users < > users@conserver.com> wrote: > > > > Running v 8.1.18. Rereading the SIGHUP section of the man page I'm > > still thinking I've configured something wrong. SIGHUP says conserver > > rereads the config files and then adds/deletes consoles as needed and > > only touches running consoles if they have changed. If thats true I > > wouldn't expect a 30s buffer of input/output on a console that hasn't > > changed, should I? 
> > I also don't see anything in CHANGES that sounds like this is a bug > > that has been fixed. > > > > -denis > > > > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote: > >> I love conserver. I have a minor issue and I was curious what options > >> there might be. > >> > >> So I have a conserver setup running against 262 servers (mostly digis or > >> ser2net machines). It works great. However when we need to update due > >> to a config change we run "kill -HUP" against the parent. With the > >> number of consoles (I think) this causes about a 30s "hang" when > >> interacting with any console which corresponds to the reconfig time. > >> > >> Does this make sense and is per the current design? Any chance there is > >> a clever way to make it block for less time? Barring that I intend to > >> spin up a new server to share the load of my current server and reduce > >> the reconfig time. > >> > >> I was mostly curious if there was a config issue or if this description > >> doesn't make any sense to folks and it means I have something else going > >> on like too many down consoles or something. > >> -denis > >> > >> -- > >> __________________________ > >> Denis Alan Hainsworth > >> denis.hainsworth@gmail.com > > > > -- > > __________________________ > > Denis Alan Hainsworth > > denis.hainsworth@gmail.com > > _______________________________________________ > > users mailing list > > users@conserver.com > > https://www.conserver.com/mailman/listinfo/users > > > _______________________________________________ > users mailing list > users@conserver.com > https://www.conserver.com/mailman/listinfo/users > -- Train of Lights reminder email list signup - http://tinyurl.com/ncry -announce --001a114b303836821c053fb1c44b Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114b303836821c053fb1c44b-- From consoleteam@gmail.com Tue Oct 25 15:06:56 2016 Received: from mail-wm0-f54.google.com (mail-wm0-f54.google.com [74.125.82.54]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PF6oB7024257 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Tue, 25 Oct 2016 15:06:52 GMT Received: by mail-wm0-f54.google.com with SMTP id c78so168496658wme.0; Tue, 25 Oct 2016 08:06:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=XDe5s7Tg7lBxw/LPnT+iv0Vw9TM37h5XiSOD7Fddx5M=; b=hFb50ffS7LTA3ggw82k6Yl3UjH5TdfyYl8cwTn9kiMElk2KD2lr5P0SazdwZw7t6qF ercSRDRYZXYmwvJbXm5mS6XWtNyK4ma/AsDeiarznv+UYFimxvsk0tiRPs1llcYXej2A tOTke4wTo0kqNbRubo93fMWQz78LYuQ3Gwu3UG9Kx8KYPP+sGc6VODnaxQ97McDa7od0 iPSmqR0q6RyTZ0bI0kfTXmJ5lEjfoYnauJqW8IoRIoHCsZQGMCkLeJI723SuiOMqJyic PEUkdzB2guA9V2cwZpuofBCrzQOPJ47V/KcfVsaymxNplH3SnRkOsU6KMyVP7wR4aqEv lL3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=XDe5s7Tg7lBxw/LPnT+iv0Vw9TM37h5XiSOD7Fddx5M=; b=MWw2uikcN7Uh2G6WBMJfufuXcF+NuoxQbQEg22lxBHjamBYm4g3MX56vUFyn4+ghCL mi6UmLVU2gagSZ2jWdTIu2981IrXG4oSNGStk4bbvOcFdI43nhdJp0PlFW7w9+JeePMO tcxJX20Jt0DSI+Sf90dR20+O2ulcN6ZFuMarG/C+rITi8zP/owlkWt2tYVUgI/YpffjH tSrEv0V3TtNrTKrNu4eAg/pejo1wmJ4y5iosEyKVYpe/1NpHQyBdIIYYuEenEIx3n7x+ HpFvGWvqkE9urCUlWj5SlnR6PlCm0XhJLxBo7vy0z0676jgVDfU46W2nIrncKGI2XuPq rCOg== X-Gm-Message-State: ABUngvfWh8GPJgPvVGggkGJ9423ZlsxA7lbyl/BI9zdTd0ASWiRdB/zYDPaFDuyXgvFsvW1AXJCcvfAaDPTg5w== X-Received: by 10.28.154.150 with SMTP id c144mr3832647wme.25.1477408009493; Tue, 25 Oct 2016 08:06:49 -0700 (PDT) MIME-Version: 1.0 Received: by 10.194.118.73 with HTTP; Tue, 25 Oct 2016 08:06:48 -0700 (PDT) In-Reply-To: References: <20161014160544.GZ27007@cs.brandeis.edu> 
<20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> From: Zonker Date: Tue, 25 Oct 2016 08:06:48 -0700 Message-ID: Subject: Re: what is normal conserver hang during reconfig To: Bryan Stansell Cc: "users@conserver.com" Content-Type: multipart/alternative; boundary=001a114b3038982246053fb1db0e X-Spam-Score: 0.624 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, FREEMAIL_REPLY, HTML_MESSAGE, SPF_PASS, URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Oct 2016 15:06:56 -0000 --001a114b3038982246053fb1db0e Content-Type: text/plain; charset=UTF-8

Also, my main conserver is a dedicated host, on its own UPS, which is powered by the data center UPS... maybe it's overkill for some, but I can tell you if there were any problems shutting down the rest of the world, or bringing it back online again. (This was a really important feature after our campus lost PG&E Mains power TWICE in 14 hours last Monday. :-)

All the log files go into /var/consoles/current, and we rotate timestamped files into /var/consoles/archive.

Best regards,

-Z-

On Tue, Oct 25, 2016 at 8:00 AM, Zonker wrote:

> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and I look at the file first. I have 67
> conserver child processes with 16 ports under each, and my hup is just a
> few seconds.
>
> Best regards,
>
> -Z-
>
>
> On Tue, Oct 18, 2016 at 8:39 PM, Bryan Stansell via users <
> users@conserver.com> wrote:
>
>> Off the top of my head, I agree that there shouldn't be anything fixed in
>> the newer code to address this. The code does block all activity when it
>> processes a HUP signal, but that's supposed to be "quick". :-|
>>
>> Each process (the main and children) rereads the config file and figures
>> out if there's anything to do. The main process is in charge of spawning
>> new consoles (or reconfigured), and the children are responsible for
>> letting go of old ones (or reconfigured).
>>
>> With that in mind, how many consoles is each child managing? The
>> compile time default can be seen with a "conserver -V", but it can be
>> overridden with -m. I'm honestly not sure if having more or less would be
>> better or even change things (more processes would use more cores, but also
>> "slam" the system with that many things reading and processing the config).
>>
>> Conserver tries very hard to multiplex across all the consoles, even
>> when bringing up and tearing down things. The reread of the config puts
>> all that on hold, so it probably has to do with that.
>>
>> One issue I've seen before is the magnitude of DNS lookups done when a
>> config is loaded. It all depends on the config, of course, but you could
>> end up generating a lot of requests. Maybe it doesn't apply in your
>> environment, but it can be an unexpected source of trouble.
>>
>> Aside from that, another server will certainly share the load (and, set
>> up right, the end users won't even notice). It would be interesting to
>> look at an strace (assuming linux) of a process when it gets a HUP (even
>> without any changes to configs). Just send one of the children a HUP so it
>> minimizes the impact. With timestamps, it might highlight what is causing
>> the issue (like the DNS query case, but could be anything).
>>
>> Bryan
>>
>> > On Oct 18, 2016, at 6:04 PM, Denis Hainsworth via users <
>> users@conserver.com> wrote:
>> >
>> > Running v 8.1.18. Rereading the SIGHUP section of the man page I'm
>> > still thinking I've configured something wrong. SIGHUP says conserver
>> > rereads the config files and then adds/deletes consoles as needed and
>> > only touches running consoles if they have changed. If that's true I
>> > wouldn't expect a 30s buffer of input/output on a console that hasn't
>> > changed, should I?
>> > I also don't see anything in CHANGES that sounds like this is a bug
>> > that has been fixed.
>> >
>> > -denis
>> >
>> > On Fri, Oct 14, 2016 at 12:05:44PM -0400, Denis Hainsworth wrote:
>> >> I love conserver. I have a minor issue and I was curious what options
>> >> there might be.
>> >>
>> >> So I have a conserver setup running against 262 servers (mostly digis or
>> >> ser2net machines). It works great. However when we need to update due
>> >> to a config change we run "kill -HUP" against the parent. With the
>> >> number of consoles (I think) this causes about a 30s "hang" when
>> >> interacting with any console which corresponds to the reconfig time.
>> >>
>> >> Does this make sense and is per the current design? Any chance there is
>> >> a clever way to make it block for less time? Barring that I intend to
>> >> spin up a new server to share the load of my current server and reduce
>> >> the reconfig time.
>> >>
>> >> I was mostly curious if there was a config issue or if this description
>> >> doesn't make any sense to folks and it means I have something else going
>> >> on like too many down consoles or something.
>> >> -denis
>> >>
>> >> --
>> >> __________________________
>> >> Denis Alan Hainsworth
>> >> denis.hainsworth@gmail.com
>> >
>> > --
>> > __________________________
>> > Denis Alan Hainsworth
>> > denis.hainsworth@gmail.com
>> > _______________________________________________
>> > users mailing list
>> > users@conserver.com
>> > https://www.conserver.com/mailman/listinfo/users
>>
>>
>> _______________________________________________
>> users mailing list
>> users@conserver.com
>> https://www.conserver.com/mailman/listinfo/users
>>
>
>
>
> --
> Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce
>

--
Train of Lights reminder email list signup - http://tinyurl.com/ncry-announce

--001a114b3038982246053fb1db0e Content-Type: text/html; charset=UTF-8 Content-Transfer-Encoding: quoted-printable
--001a114b3038982246053fb1db0e-- From cfowler@outpostsentinel.com Tue Oct 25 15:27:29 2016 Received: from zcs-mta.vps-host.net (zcs-mta.vps-host.net [69.89.1.77]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PFRPuR024854 (version=TLSv1.2 cipher=ADH-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 25 Oct 2016 15:27:27 GMT Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 704E8815EECF; Tue, 25 Oct 2016 11:27:24 -0400 (EDT) Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10032) with ESMTP id c7Iq1ttqS5_R; Tue, 25 Oct 2016 11:27:23 -0400 (EDT) Received: from localhost (localhost.localdomain [127.0.0.1]) by zcs-mta.vps-host.net (Postfix) with ESMTP id B6650814D27D; Tue, 25 Oct 2016 11:27:23 -0400 (EDT) X-Virus-Scanned: amavisd-new at zcs-mta.vps-host.net Received: from zcs-mta.vps-host.net ([127.0.0.1]) by localhost (zcs-mta.vps-host.net [127.0.0.1]) (amavisd-new, port 10026) with ESMTP id kwMNpQuYiY1P; Tue, 25 Oct 2016 11:27:23 -0400 (EDT) Received: from enterprisemail2.vps-host.net (unknown [10.0.6.109]) by zcs-mta.vps-host.net (Postfix) with ESMTP id 8DDFA815EECF; Tue, 25 Oct 2016 11:27:23 -0400 (EDT) Date: Tue, 25 Oct 2016 11:27:23 -0400 (EDT) From: Chris Fowler To: Zonker Cc: Bryan Stansell , users@conserver.com Message-ID: <1127348975.10548952.1477409243375.JavaMail.zimbra@outpostsentinel.com> In-Reply-To: References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> Subject: Re: what is normal conserver hang during reconfig MIME-Version: 1.0 Content-Type: multipart/alternative; boundary="----=_Part_10548951_1316098443.1477409243375" X-Mailer: Zimbra 8.6.0_GA_1194 (ZimbraWebClient - GC52 (Linux)/8.6.0_GA_1194) Thread-Topic: what is normal conserver hang during reconfig Thread-Index: zymskSU+uV4zfZCXFUUeWryOlS9Lfg== X-Spam-Score: -0.277 () 
BAYES_00,HTML_MESSAGE,SPF_HELO_PASS,URIBL_SBL X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Oct 2016 15:27:29 -0000 ------=_Part_10548951_1316098443.1477409243375 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit

> From: "Zonker via users"
> To: "Bryan Stansell"
> Cc: users@conserver.com
> Sent: Tuesday, October 25, 2016 11:06:48 AM
> Subject: Re: what is normal conserver hang during reconfig

> Also, my main conserver is a dedicated host, on its own UPS, which is powered
> by the data center UPS... maybe it's overkill for some, but I can tell you if
> there were any problems shutting down the rest of the world, or bringing it
> back online again. (This was a really important feature after our campus lost
> PG&E Mains power TWICE in 14 hours last Monday. :-)

I'm different.

All mine are independent. They have their own configs. Console output is stored on their local storage.

Main location is a program I wrote that pulls its info from a database on what to connect to. It stores output to its local disk. To connect from the main I have 2 programs. One uses the console protocol. The other uses SSH to the target host and then executes console on it.

Chris

------=_Part_10548951_1316098443.1477409243375 Content-Type: text/html; charset=utf-8 Content-Transfer-Encoding: quoted-printable
------=_Part_10548951_1316098443.1477409243375-- From denis.hainsworth@gmail.com Tue Oct 25 18:16:50 2016 Received: from mail-qt0-f180.google.com (mail-qt0-f180.google.com [209.85.216.180]) by underdog.stansell.org (8.15.2/8.15.2) with ESMTPS id u9PIGlKe017837 (version=TLSv1.2 cipher=AES128-GCM-SHA256 bits=128 verify=OK); Tue, 25 Oct 2016 18:16:49 GMT Received: by mail-qt0-f180.google.com with SMTP id q20so324552qtc.0; Tue, 25 Oct 2016 11:16:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=date:from:to:cc:subject:message-id:reply-to:references:mime-version :content-disposition:in-reply-to:user-agent; bh=wAIv868ibrwAKVxfDKAVvtzOlLO/Nr3TE6tEqq7Nnsw=; b=jviE5T8Y0oN+jctElUJeTMJ75exdNsQWgR/vVVLLq/zYdQueDCc7JOgMnZBmCVenDq IFLG47GQQSNB2wJyq63D6/U/2QZlk5hzVBgyJp8LrGb2x3vp//oxJnfTjauW/8kRPI0B z+rwkEnVoq6teILWDBda1OTRWlarJfkv/UNmtL9cdaWu4u+Ja5e3lrzOuMPnZPDVMHx0 rbIRa6/IG5U4Sx6F/bdwRjX+v/AwAaV5SXzVxRDhRjMyzFCtvfnOdmwCX2t+puPhH1CH Wo6jYCBS7Tn71+v4GEfdPFVXYtoUC0Bc8tVnZb7EG12d8RS6bofGZbKxTZYd+tomZbiH PiNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:date:from:to:cc:subject:message-id:reply-to :references:mime-version:content-disposition:in-reply-to:user-agent; bh=wAIv868ibrwAKVxfDKAVvtzOlLO/Nr3TE6tEqq7Nnsw=; b=YW/PDW8ftGRUN1RYTMt+r26tHSJha6sPpeGMQua+blUb96gOCuG8ZbetCsG3nBWqFG 6WE6ylV8lBlNuvEaDyfvS6+B+hjU7rtdU+i7rOnAvigv2j73JirnbJG+By+YtAPk1V6e rAGbBaGgI2aj4OS5BNcfbEyc8DK0G9gyDKk4tQ2ZNwviGNNaDNcCdA+/1dfQkK2wI/bQ 8zOFfOAo0LUbskXL161NincI/5gcYTsmw6n8hWKvm7uLZuzT95+jH/X+seA74P1QJkGp Dza9w8PWhPJJP/GcnGJz/ZVcTtteSGl503ILeCMuoF9sE7slY+z57m9xJCBBSS9aS4Wt 6WkA== X-Gm-Message-State: ABUngvc60opU8UPU2gJOHQCifl7hAcq44MlkOp0maim/QvAkGeX+pk1n0WNEbCPLWXb6Og== X-Received: by 10.200.39.125 with SMTP id h58mr21462866qth.142.1477419407160; Tue, 25 Oct 2016 11:16:47 -0700 (PDT) Received: from xmas.dyndns.org (cl-890.chi-02.us.sixxs.net. 
[2001:4978:f:379::2]) by smtp.gmail.com with ESMTPSA id t34sm11645855qtc.28.2016.10.25.11.16.46 (version=TLS1 cipher=AES128-SHA bits=128/128); Tue, 25 Oct 2016 11:16:46 -0700 (PDT) Received: by xmas.dyndns.org (Postfix, from userid 501) id 6812C8C21B9; Tue, 25 Oct 2016 14:16:44 -0400 (EDT) Date: Tue, 25 Oct 2016 14:16:44 -0400 From: Denis Hainsworth To: Zonker Cc: Bryan Stansell , "users@conserver.com" Subject: Re: what is normal conserver hang during reconfig Message-ID: <20161025181644.GO6698@cs.brandeis.edu> Reply-To: Denis Hainsworth References: <20161014160544.GZ27007@cs.brandeis.edu> <20161019010422.GQ27007@cs.brandeis.edu> <3FF6DA5D-E802-4C5B-A37A-B85CF2EF737D@conserver.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Spam-Score: -2 () BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, SPF_PASS X-Scanned-By: MIMEDefang 2.72 on 198.151.248.21 X-BeenThere: users@conserver.com X-Mailman-Version: 2.1.23 Precedence: list List-Id: Conserver Users List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 25 Oct 2016 18:16:51 -0000

On Tue, Oct 25, 2016 at 08:00:16AM -0700, Zonker via users wrote:
> To Bryan's point about DNS lookups, I expect that my main conserver will
> be the first thing online (and then the network comes up...) and it will
> be the last thing down. As a result, I have all of my console servers
> listed in the /etc/hosts file, and I look at the file first. I have 67
> conserver child processes with 16 ports under each, and my hup is just a
> few seconds.

Yeah, DNS is a hit but it's dwarfed by what I was seeing. Understand my "small" site has 72 children and HUPs in 3s now that I'm reading in only the configs it manages (aka it's now no longer a master). My large site has 126 children :) even with paring it down to only the stuff it manages. That's now HUPing in 13s. We have other boxes that can continue to serve as masters.

-denis
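
For anyone who wants to follow up on Bryan's strace suggestion from earlier in the thread, here is a minimal sketch of how the output could be summarized. It assumes the trace of one child was captured with wall-clock timestamps (for example with strace's -tt option, written to a file with -o) while that child was sent a HUP. The function names parse_line and largest_gaps are made up for illustration and are not part of conserver or strace. The script only ranks the pauses between consecutive syscalls, so a slow DNS lookup or similar stall should float to the top; it does not handle a trace that crosses midnight.

```python
import re

# Matches lines such as "14:16:44.123456 connect(3, ...) = 0". An optional
# leading PID (as written when following forks into an output file) is skipped.
TS_LINE = re.compile(r"^(?:\d+\s+)?(\d{2}):(\d{2}):(\d{2})\.(\d{6})\s+(.*)$")


def parse_line(line):
    """Return (microseconds since midnight, syscall text), or None if the
    line does not start with an HH:MM:SS.uuuuuu timestamp."""
    match = TS_LINE.match(line)
    if match is None:
        return None
    hours, minutes, seconds, micros, rest = match.groups()
    total = (int(hours) * 3600 + int(minutes) * 60 + int(seconds)) * 1_000_000
    return total + int(micros), rest.strip()


def largest_gaps(lines, top=5):
    """Return up to `top` (gap in microseconds, syscall text) pairs, largest
    gap first. Each gap is the time from the start of one syscall to the
    start of the next, so it covers the call itself plus any work the
    process did before issuing the following call."""
    events = [event for event in map(parse_line, lines) if event is not None]
    gaps = [
        (later_time - earlier_time, earlier_call)
        for (earlier_time, earlier_call), (later_time, _) in zip(events, events[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]
```

To use it, read the trace file into a list of lines, pass that to largest_gaps(), and print each gap divided by 1,000,000 to get seconds next to the syscall that preceded the pause.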