Updating software while accounting for platform differences, backups, incident response, and so on becomes a time-consuming process.
A configuration management system such as Chef, Puppet, or SaltStack is the ideal way to address these challenges. If your company does not use one yet, start.
But suppose the infrastructure has been growing for several years and the number of servers has gradually increased, or for some other reason you do not use a configuration management system. This article describes how to add structure to your fleet of servers, network equipment, and workstations.
Structure
Think about the structure of your company. Which groups of servers perform the same tasks? By what criteria can virtual machines be grouped? By environment: Dev, Test, Prod. By function: VPN servers, web servers, DB servers. By location: BY, Client-name, Amazon. And so on.
[dev ]     +----+ +------+ +---+
[test] <=> |asia| |docker| |ec2|
[prod]     +----+ +------+ +---+
Then describe the structure and purpose of your servers using DNS.
Examples:
vpn1.mts.devel.aptinfo.net
balancer2.us.dev.aptinfo.net
db3.azure.prod.aptinfo.net
app4.vmware.stage.aptinfo.net
sql5.ec2-west.preprod.aptinfo.net
www.aptinfo.net
Try to make the domain name convey as much information about the server as possible.
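A name that follows this scheme can be split back into its parts mechanically. The sketch below is only an illustration: it assumes the label order name.location.environment seen in the examples above, and `parse_fqdn` and `HostInfo` are hypothetical names, not part of any tool mentioned in this article.

```python
from typing import NamedTuple

BASE_DOMAIN = "aptinfo.net"  # domain used in the examples above


class HostInfo(NamedTuple):
    name: str         # leftmost label, e.g. "db3"
    location: str     # e.g. "azure"; empty string if absent
    environment: str  # e.g. "prod"; empty string if absent


def parse_fqdn(fqdn: str, base_domain: str = BASE_DOMAIN) -> HostInfo:
    """Split 'name.location.environment.<base_domain>' into its parts.

    The label order is an assumption taken from the example names.
    """
    suffix = "." + base_domain
    if not fqdn.endswith(suffix):
        raise ValueError(f"{fqdn!r} is not under {base_domain!r}")
    labels = fqdn[: -len(suffix)].split(".")
    # Pad with empty strings so short names such as "www" still parse.
    labels += [""] * (3 - len(labels))
    name, location, environment = labels[:3]
    return HostInfo(name, location, environment)


print(parse_fqdn("db3.azure.prod.aptinfo.net"))
# HostInfo(name='db3', location='azure', environment='prod')
```

A name such as www.aptinfo.net yields empty location and environment fields, which is a quick way to spot hosts that do not follow the scheme yet.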
Convert the DNS structure
From a set of DNS records it is easy to get a hierarchical structure.
{
  "domain": "aptinfo.net",
  "records": [
    {
      "content": "10.0.101.223",
      "fqdn": "app1.nl.stage.aptinfo.net",
      "subdomain": "app1.nl.stage",
      "type": "A"
    },
    {
      "content": "10.0.101.224",
      "fqdn": "app2.nl.stage.aptinfo.net",
      "subdomain": "app2.nl.stage",
      "type": "A"
    },
Transform the list of A, CNAME, and other records into an associative array: strip the domain part and group the records by the common part of the subdomain.
'nl.stage': ['app1.nl.stage.aptinfo.net', 'app2.nl.stage.aptinfo.net', 'db1.nl.stage.aptinfo.net']
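The grouping step can be sketched in a few lines of Python. This is a minimal illustration rather than the script actually used here: it assumes the record layout from the JSON sample above, and `group_by_subdomain` is a hypothetical helper name. Only the two records shown in the sample are used as input.

```python
from collections import defaultdict


def group_by_subdomain(zone):
    """Map each group key (subdomain minus the host label) to its FQDNs."""
    groups = defaultdict(list)
    for record in zone["records"]:
        # "app1.nl.stage" -> host label "app1", group key "nl.stage"
        _, _, group = record["subdomain"].partition(".")
        if group:  # records such as "www" have no group part
            groups[group].append(record["fqdn"])
    return dict(groups)


zone = {
    "domain": "aptinfo.net",
    "records": [
        {"content": "10.0.101.223", "fqdn": "app1.nl.stage.aptinfo.net",
         "subdomain": "app1.nl.stage", "type": "A"},
        {"content": "10.0.101.224", "fqdn": "app2.nl.stage.aptinfo.net",
         "subdomain": "app2.nl.stage", "type": "A"},
    ],
}

print(group_by_subdomain(zone))
# {'nl.stage': ['app1.nl.stage.aptinfo.net', 'app2.nl.stage.aptinfo.net']}
```

Dropping only the leftmost label keeps the key identical to the role name used on the command line, for example nl.stage.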
After this it is easy to get information about a group of servers using bash commands. It is also easy to manage groups of servers with tools such as Fabric.
$ fab -R nl.stage -- who
[app1.nl.stage.aptinfo.net] Executing task ''
[app1.nl.stage.aptinfo.net] run: who
[app1.nl.stage.aptinfo.net] out: root pts/1 Jul 23 21:30 (10.50.124.15)
[app1.nl.stage.aptinfo.net] out:
[app2.nl.stage.aptinfo.net] Executing task ' '
[app2.nl.stage.aptinfo.net] run: who
[app2.nl.stage.aptinfo.net] out: root pts/3 Jul 23 21:30 (10.50.124.15)
[app2.nl.stage.aptinfo.net] out:
[db1.nl.stage.aptinfo.net] Executing task ' '
[db1.nl.stage.aptinfo.net] run: who
No handlers could be found for logger "paramiko.transport"
Fatal error: Error reading SSH protocol banner
Underlying exception:
    Error reading SSH protocol banner
Aborting.
Disconnecting from root@app1.nl.stage.aptinfo.net... done.
Disconnecting from root@app2.nl.stage.aptinfo.net... done.
Error reading SSH protocol banner
Underlying exception:
    Error reading SSH protocol banner
But Fabric has disadvantages. First, the run is interrupted if a command's return code is not 0. Second, it aborts when one of the hosts is unavailable, as in the output above. Ansible is better suited for this purpose.
$ ansible us.prod -i ~/ansible/dynamic.py -m shell -a uptime
db2.us.prod.aptinfo.net | success | rc=0 >>
 15:33:38 up 1116 days, 21:55, 2 users, load average: 0.04, 0.05, 0.07

app2.us.prod.aptinfo.net | success | rc=0 >>
 15:33:46 up 1117 days, 27 min, 2 users, load average: 0.42, 0.53, 0.53

bckp.us.prod.aptinfo.net | success | rc=0 >>
 15:33:51 up 1152 days, 27 min, 1 user, load average: 0.08, 0.43, 0.49

app1.us.prod.aptinfo.net | success | rc=0 >>
 15:33:52 up 1006 days, 7:07, 2 users, load average: 2.19, 2.25, 2.20
Quick and convenient.
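The dynamic.py script passed to -i is not shown in this article. As a rough sketch of what such a script might look like, the example below follows Ansible's dynamic inventory convention: when called with --list, it prints a JSON object that maps group names to host lists. The file name records.json and the function `build_inventory` are assumptions made for illustration; the record layout is the one from the JSON sample earlier.

```python
#!/usr/bin/env python3
import json
import sys
from collections import defaultdict

# Hypothetical path to the exported DNS records (an assumption of this sketch).
RECORDS_FILE = "records.json"


def build_inventory(zone):
    """Turn DNS records into an Ansible inventory dict, one group per subdomain."""
    groups = defaultdict(lambda: {"hosts": []})
    for record in zone["records"]:
        if record["type"] not in ("A", "CNAME"):
            continue  # only address-like records name a host
        # "app1.nl.stage" -> group "nl.stage"
        _, _, group = record["subdomain"].partition(".")
        if group:
            groups[group]["hosts"].append(record["fqdn"])
    inventory = dict(groups)
    # An empty _meta.hostvars tells Ansible not to call --host for every host.
    inventory["_meta"] = {"hostvars": {}}
    return inventory


def main(argv):
    if "--list" in argv:
        with open(RECORDS_FILE) as fh:
            zone = json.load(fh)
        print(json.dumps(build_inventory(zone), indent=2))
    else:
        # --host <name>: this sketch defines no per-host variables.
        print(json.dumps({}))


if __name__ == "__main__":
    main(sys.argv[1:])
```

The script has to be executable for Ansible to run it as an inventory source. Newer Ansible releases may warn about characters such as dots in group names, so the group keys might need adjusting there.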
I hope this article will be useful.