The days of DIY system administration are rapidly coming to a close. Why? Because the open source tools available are just too good not to use. Presenting Bcfg2, Cfengine, Chef and Puppet.
This summer the USENIX 2010 conference in Boston hosted the first Configuration Management Summit on automating system administration using open source configuration management tools. The summit brought together developers, power users and new adopters.
Why Configuration Management?
Internet use is growing and new services are appearing hourly. The number of servers (both physical and virtual) is becoming uncountable. Automating system administration is a must to handle the deluge; otherwise swarms of sysadmins would be needed to manage all these systems.
Drivers for automating system administration:
- In companies with multiple sysadmins working the old way, in interactive root sessions, sysadmins making changes at the same time can step on each other’s toes (and on the config!);
- System administration is a relatively new profession, without a standard curriculum, so practitioners have different philosophies and practices. Going from organization to organization, it is a challenge for a new sysadmin to learn:
  - how the system is set up,
  - why it was set up that way,
  - how it needs to be set up to keep operating,
  - how to set it up that way again in case of disaster or normal growth.
Automating system administration addresses all the above and makes new things possible.
For example, a CM tool can detect and remedy a deviation from configuration policy faster than a human sysadmin can, or it can automatically instantiate, configure and bring online a new virtual server instance if an old one dies.
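To make the idea of remedying a deviation concrete, here is a minimal sketch in Python. It is not how any of the four tools is implemented; the function names (`plan_changes`, `converge`) and the settings are made up for illustration. It compares a desired state with the current state, applies only the differences, and does nothing on a second run once the system has converged.

```python
def plan_changes(desired, current):
    """Return only the settings whose current value differs from policy."""
    return {key: value for key, value in desired.items()
            if current.get(key) != value}


def converge(desired, current):
    """Apply the planned changes to `current` and report what was changed."""
    changes = plan_changes(desired, current)
    current.update(changes)  # stand-in for real actions (restart a service, edit a file)
    return changes


# Hypothetical policy and node state, for illustration only.
desired = {"ntp": "running", "sshd_root_login": "no"}
current = {"ntp": "stopped", "sshd_root_login": "no"}

first_run = converge(desired, current)   # fixes the one deviation
second_run = converge(desired, current)  # nothing left to do
print(first_run, second_run)
```

Running the same policy repeatedly is safe in this sketch because a converged system produces an empty change set.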
There are over a dozen different CM tools actively used in production.
So many choices can bewilder a sysadmin searching for a CM tool.
The summit included representatives for four tools: Bcfg2 (pronounced “bee-config 2”), Cfengine, Chef and Puppet.
The summit had three parts: 4 presentations; a panel session; and a mini BarCamp with 6 presentations. The panel session was quite lively.
I will attempt to compare and contrast the four tools; however, using any robust configuration management tool, with discipline, is better than administering systems manually.
Bcfg2: Came out of Argonne National Lab. Lightweight on the node. Each server can easily handle 1000 nodes. Relies on centralization. Uses a complete model of each node’s configuration, both desired and current.
Strengths: Reporting system and debugging.
Weaknesses: Documentation. (New set of documentation is coming out now, but still weak in examples.) Sharing policies between sites is not easy; group names need to be standardized first.
Cfengine: Came out of Oslo University. Strong philosophy of allowing decentralization and potential local autonomy. Oriented toward consensus building as opposed to top-down policy dictation. Underlying philosophies are promise theory, convergence and self-healing. Also has a healthy paranoid streak and an impressive security record (only 3 serious vulnerabilities in 17 years).
Strengths: Highly multi-platform (it even runs on underwater unmanned vehicles!). Lightweight. Largest userbase: more companies using it than all the other tools combined! Able to continue operating under degraded conditions (network down, for example).
Weaknesses: It’s hard to get started because there is a lot to learn.
Chef: Has its origins in the Ruby-on-Rails world in the cloud. Grew out of dissatisfaction with Puppet’s non-deterministic ordering. Resilient (each node can run stand-alone if the server disappears). Sequence of execution is tightly ordered.
Strengths: Cloud integration (automating provisioning and configuration of new instances in one fell swoop). Multi-node orchestration (more below). Reusable policy cookbooks and the highest degree of recipe reuse among users of any of the four tools.
Weaknesses: Attributes have nine different levels of precedence (role, node, etc.) and this can be daunting.
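To illustrate what layered attribute precedence means in general, here is a small Python sketch. It is not Chef's actual algorithm, and it does not reproduce Chef's real nine levels or their order: the list `PRECEDENCE` and the level name `default` are assumptions made for this example (only "role" and "node" are mentioned above). The idea is simply that the same attribute can be set in several places and the highest-precedence level wins.

```python
# Hypothetical precedence levels, lowest to highest. Illustrative only;
# this is NOT Chef's real list of nine levels.
PRECEDENCE = ["default", "role", "node"]


def resolve(attributes_by_level):
    """Merge attribute dictionaries so that higher levels override lower ones."""
    resolved = {}
    for level in PRECEDENCE:
        resolved.update(attributes_by_level.get(level, {}))
    return resolved


# Hypothetical attribute values for a single node.
attributes = {
    "default": {"port": 80, "workers": 2},
    "role": {"workers": 8},
    "node": {"port": 8080},
}

resolved = resolve(attributes)
print(resolved)
```

With only three levels the outcome is easy to predict; with nine, working out which setting actually takes effect requires more care, which is the difficulty described above.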
Puppet: Grew out of dissatisfaction with Cfengine 2. Centralized model; however, if the server is unreachable, node agents will still run, applying the cached configuration. Simple and human-readable DSL gives safety at the cost of flexibility. Determines and runs delta changes only.
Strengths: Large community of users (over 2000 users on the Puppet mailing list).
Weaknesses: The Puppet server is currently a potential bottleneck (which is solved by going to multiple servers). Execution ordering can be non-deterministic. (But reports will always tell you what succeeded and what failed, and order can be mandated where it is required.)
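The difference between unordered resources and mandated order can be sketched with a dependency graph. The Python example below is a generic illustration, not Puppet's implementation or syntax; the resource names are hypothetical. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9 or later) to compute a run order that respects every declared dependency.

```python
from graphlib import TopologicalSorter

# Each hypothetical resource maps to the set of resources it depends on.
dependencies = {
    "service:web": {"package:web", "file:web.conf"},
    "file:web.conf": {"package:web"},
    "package:web": set(),
}

# static_order() yields each resource only after all of its dependencies.
run_order = list(TopologicalSorter(dependencies).static_order())
print(run_order)
```

Resources with no declared relationship to each other may come out in any relative order, which is why explicit dependencies are needed wherever order matters.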