Distributed Automation with Masterless Salt
Author: Nicholas M. Hughes
I recently spoke with someone (Hi Tom!) who was excited about Salt Project, but installing a central server didn’t fit with their deployment scheme. They also were not able to find any really good information on running Salt without a central server. The Salt Project documentation is pretty basic for this type of implementation. In Salt terms, the server component is called the “master” and the distributed nodes with a client agent installed are “minions”. The default implementation of Salt will assume that a central master is installed, and minions will attempt to contact it for jobs to accomplish and files to pull down for execution or configuration purposes. But while this is the default (and most widely used) configuration, there are many ways to implement Salt. This post will show some basic concepts for deploying Salt minions in a distributed and disconnected manner, also known as a “masterless” deployment.
Installing and Configuring the Salt Minion
Feel free to follow along on any system you have available, but I provided some really basic Terraform in the GitHub repository listed at the end of this post. It’s really basic… Don’t judge me… This isn’t a Terraform post. It’ll provision a small Ubuntu server in the default VPC for your AWS account. The main thing to understand is that we’re going to use a user data script to lay the groundwork for Salt to come in and manage everything else.
Once we have a machine to work with, the first thing we need is Salt. The absolute easiest way to install the Salt open source software is by using the Salt Bootstrap Script. There are many different options you can pass to the script, but the main thing we care about is passing in the minion configuration that will put us in masterless mode. The Terraform code provides user data which runs a wrapper script from the GitHub repository, but the following Salt Bootstrap call is made under the hood.
curl -L https://bootstrap.saltproject.io \
  | sudo sh -s -- -x python3 \
  -j '{
        "master_type": "disable",
        "file_roots": {
          "top": ["/srv/local/top"],
          "base": ["/srv/local/salt", "/srv/remote/salt"]
        },
        "startup_states": "highstate",
        "pub_ret": false,
        "mine_enabled": false,
        "return": "rawfile_json",
        "top_file_merging_strategy": "merge_all",
        "file_client": "local"
      }' \
  stable 3004
This will install Salt version 3004 and pass in the following configuration:
# /etc/salt/minion

# Disable master communication
master_type: disable
pub_ret: false
mine_enabled: false

# Local directories for the Salt file server
file_client: local
file_roots:
  top:
    - /srv/local/top
  base:
    - /srv/local/salt
    - /srv/remote/salt

# Merge top files from other envs
top_file_merging_strategy: merge_all

# Run highstate on service start
startup_states: highstate

# Returner configuration
return: rawfile_json
There is a lot going on here, so here’s a high-level overview of what this configuration accomplishes:
We disable attempts by the minion to find a Salt master and communicate with it. This is important because the minion would otherwise waste resources on that operation, and it also ensures that rogue Salt masters aren’t able to hijack your system.
We set the minion to look locally for the Salt filesystem. Otherwise, we’d have a disconnected minion with no way to find configuration code… which isn’t terribly useful. We also set up specific locations for local and “remote” code to be found.
We ensure that we’re going to merge our “top file” from multiple environments. This is important as a mechanism to define both a local and “remote” top file for highstate applications. More on this in a bit.
The system is configured to run a highstate on restart of the salt-minion service. A “highstate” is a way for the minion to compile a top file and run all states defined inside of it. It’s a great way to enforce the configuration of an entire system.
We set up a local returner that captures jobs as JSON lines in a file. The value of this will become apparent later in the post.
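If you’re following along, a quick sanity check (purely optional, and not part of the configuration above) is to call an execution module locally and confirm the minion responds without any master involved:

# Run an execution module function entirely on this host,
# without attempting to contact a master
sudo salt-call --local test.ping

# Expected output:
# local:
#     True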
Environments and State Trees
Based upon this configuration, we’re going to house Salt code in two locations:
/srv/local/salt
/srv/remote/salt
The local directory will hold the basic states we need in order to pull down and run the remote code from our GitHub repository. The remote directory is just going to be the cloned repository.
You’ll notice that I also defined two environments in the minion configuration. This is due to the way that top files and the Salt file system are merged. If I just had my local and remote directories under the same base environment, then only the first top.sls file found would be used.
/srv/local/salt/top.sls     <-- we found this one first, so it's used
/srv/remote/salt/top.sls
However, top files in different environments are merged and used in each environment. This means that I can define a top environment and have a top.sls in it that defines state application for the base environment! This is a non-standard use for this mechanism, but it totally works for our purposes.
/srv/local/top/top.sls      <-- top env
/srv/remote/salt/top.sls    <-- base env
So, our local top file sits in our top environment and really only points to the state file that performs the synchronization of the remote code.
# /srv/local/top/top.sls
base:
  "*":
    - entrypoint
Notice that the base environment is defined in that file despite the top.sls file being in the top environment. If it assigned states to the top environment instead, our minion, which runs in the default base environment, would never be able to access them. Our entrypoint state lives in the local portion of the base environment (/srv/local/salt) and contains the information we need to synchronize remote states to the local file system.
# /srv/local/salt/entrypoint.sls
sync_states:
  git.latest:
    - name: https://github.com/eitrtechnologies/salt-masterless-example.git
    - target: /srv/remote
    - force_checkout: True
    - force_clone: True
    - force_fetch: True
    - force_reset: True
    - submodules: True
    - order: 1

sync_all_modules:
  saltutil.sync_all:
    - refresh: True
    - order: 1
    - onchanges:
      - git: sync_states

highstate_schedule:
  schedule.present:
    - function: state.apply
    - cron: "*/30 * * * *"
The first state block pulls our remote code down into the /srv/remote directory. The second state block will synchronize any custom or overridden modules defined in our remote repository. Finally, we put a schedule in place to run a highstate on an interval. The first time we run a state.apply (or restart the salt-minion service to initiate startup_states), only the local code is available, so we’ll run the entrypoint state and exit. Afterward, the remote code will be accessible on the minion and any states defined in the top file there will be run.
# /srv/remote/salt/top.sls
base:
  "*":
    - core
    - docker
    - security
So, a top file like the one in our example will result in four states being run during the next highstate: entrypoint from the local tree, plus core, docker, and security from the remote repository.
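If you want to kick all of that off by hand instead of waiting for a service restart or the scheduled run, either of the following will do it. These are standard Salt commands rather than anything specific to this example:

# Restart the minion service to trigger startup_states...
sudo systemctl restart salt-minion

# ...or apply the highstate directly on the box.
# On the very first pass only the entrypoint state runs (it clones the
# remote repository); running it again applies the remote states too.
sudo salt-call --local state.apply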
What Next?
This gets us in a pretty good spot to start playing around with Salt in a masterless environment, but this probably isn’t good enough for a production deployment. What are some other things we’d need?
Test Environments
All of our code is in a single place. Awesome, right? Well… What if we push some code that breaks all of our hosts? Not so awesome. You can implement some checks in your CI to prevent broken code from being merged to your main branch, but you’ll probably want a subset of systems pointing at another branch so that some non-critical systems test out your changes before they’re pushed everywhere. Think about how that concept could be applied to your use case.
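As one hedged sketch of that idea, you could parameterize the branch that the entrypoint state checks out using a custom grain. The deployment_branch grain below is made up for illustration and isn’t part of the example repository:

# /srv/local/salt/entrypoint.sls (excerpt, hypothetical branch selection)
# Pull the branch name from a custom grain; default to the main branch
{% set branch = salt['grains.get']('deployment_branch', 'main') %}

sync_states:
  git.latest:
    - name: https://github.com/eitrtechnologies/salt-masterless-example.git
    - target: /srv/remote
    - rev: {{ branch }}
    - force_checkout: True
    - force_clone: True
    - force_fetch: True
    - force_reset: True
    - submodules: True
    - order: 1

Set the grain to a feature branch on a handful of canary hosts and leave everything else pointed at your main branch.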
Monitoring Completion
Without the benefit of a Salt master, the minions won’t be sending job status anywhere. So, we won’t have any idea if we push code that breaks when it is applied to our deployed systems. To solve this, you could use Salt Returners. The returner interface allows the return data to be sent to any system that can receive data. This means that return data can be sent to a Redis server, a MongoDB server, a MySQL server, or any system for analysis and archival. In our example, I configured the rawfile_json returner in order to send job returns to a local file on the minion as single-line JSON structures. This file could then be parsed by an agent such as Elastic Filebeat to send those events to a central dashboard. I also recently opened a pull request to add the ability to export state run data as a file readable by the Prometheus Node Exporter agent. Either (or both) of these mechanisms could be used to visualize and alert on failures in Kibana and/or Grafana, or another returner could be used with a tool of your choice. The sky is the limit!
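For reference, the output location for that returner can be set in the minion configuration. The path below is only an example, and you should double-check the option name against the rawfile_json returner documentation for your Salt version:

# /etc/salt/minion.d/returner.conf (example drop-in file)
# Send job returns to the rawfile_json returner...
return: rawfile_json

# ...and choose where the JSON lines are written
rawfile_json.filename: /var/log/salt/returns.jsonl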
Remote Execution
Now that we have our minion using Salt for configuration management, what about remote execution? In “normal” deployments, Salt can be used to run ad-hoc commands across your entire fleet. However, masterless Salt minions don’t have a central point of communication in order to receive those types of jobs. Instead, you’ll need to use another mechanism to send commands to hosts. For our example, we’re in Amazon Web Services, so we could use AWS Systems Manager for that. One could also use PSSH from a bastion host for parallel execution of commands. The bottom line is that there are a lot of ways to accomplish this, but it’s something that you’ll probably want to solve sooner rather than later.
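To make the SSM route a little more concrete, something along these lines would run an ad-hoc highstate across a set of tagged instances. The Role=salt-masterless tag is an assumption for illustration; adjust the targeting to your environment and verify the flags against the AWS CLI documentation:

# Run a highstate on every instance tagged Role=salt-masterless
# using AWS Systems Manager Run Command
aws ssm send-command \
  --document-name "AWS-RunShellScript" \
  --targets "Key=tag:Role,Values=salt-masterless" \
  --parameters 'commands=["salt-call --local state.apply"]' \
  --comment "Ad-hoc highstate via SSM"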
Secrets
Salt’s “normal” mechanism for distributing secrets is called Pillar. Unfortunately, our situation doesn’t work with Pillar because it is a master-only component. However, Salt has many execution modules which can be used to retrieve sensitive items from a secure secrets management platform. Additionally, Salt’s pluggable architecture allows you to extend its capabilities for any platform not currently supported. If your organization uses AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, Pinterest’s Knox, or any other platform, you should be able to integrate a secrets management workflow into your Salt code.
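As a sketch of what that can look like with HashiCorp Vault, a state file could pull a secret at render time instead of relying on Pillar. This assumes the Vault execution module is configured for your minions; the secret path, key, and state below are invented for illustration, and the exact return structure depends on your Vault setup:

# /srv/remote/salt/app_secrets.sls (hypothetical example)
# Fetch a secret from Vault when the state is rendered
{% set db_password = salt['vault.read_secret']('secret/myapp/db')['password'] %}

myapp_db_password:
  file.managed:
    - name: /etc/myapp/db_password
    - mode: '0600'
    - contents: "{{ db_password }}"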
Conclusion
I hope this information was helpful for anyone starting down the path with Salt in a distributed environment. All of the artifacts to test this out are up in an EITR Technologies GitHub repository. Feel free to reach out to us on GitHub with any issues in the example, or send me a message via email, on LinkedIn, or in the Salt Project Community Slack. Always happy to chat with other Salt users!
Nicholas Hughes helps businesses integrate cloud and cybersecurity automation into their IT lifecycle processes in order to amplify the efforts of the existing workforce, prevent employee burnout, and free them to do the more important tasks that might be currently neglected. As part of his daily duties as a founding partner and CEO of EITR Technologies LLC., he’s responsible for all of those super awesome elements of the CEO job that you read about as a kid, like setting the strategic direction of the company and modeling corporate values. Additionally, Nick still performs technical consulting work with specializations in Automation & Orchestration, Cloud Infrastructure, Cloud Security, and Systems Architecture. He has nearly 20 years of experience in a wide breadth of roles within Information Technology, which is invaluable to clients seeking comprehensive technical solutions to business problems. Nick highly values pragmatism, logical thinking, and integrity in both his business and personal life… which is a decidedly boring set of core values that reap great results when applied to the task at hand. He also has a wonderful wife and two boys who keep him on his toes.