
Environments And Cost Management

Best practices for keeping costs in check when working with Kubernetes environments

/images/blog/cover-images/environments-cost-management.png
environments-cost-management

by Shipyard Team on Jul 20, 2021

Managing Kubernetes costs can be a daunting task, especially now with the prevalence of multi- and hybrid-cloud computing environments. Implementing Kubernetes environments correctly can make this process smoother and easier, especially when it comes to managing your ephemeral environments.

Managing clusters

One of the best ways to limit Kubernetes costs is to understand how to manage physical clusters and ephemeral environments.

Although it’s considered a best practice to organize a cluster using namespaces, doing so incorrectly can incur additional costs. While namespaces alone will not increase cost, poor namespace usage makes it harder to keep track of where costs are coming from.
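One low-effort way to keep namespace costs traceable is to label each namespace with ownership metadata, which most cost-monitoring tools can group spend by. A minimal sketch, where the namespace name and label keys/values are illustrative rather than any Kubernetes standard:

```yaml
# Namespace labeled for cost attribution; "team", "cost-center",
# and "environment" are example label keys, not built-in conventions.
apiVersion: v1
kind: Namespace
metadata:
  name: checkout-dev
  labels:
    team: payments
    cost-center: engineering
    environment: ephemeral
```

With consistent labels in place, per-team or per-environment spend can be broken out instead of showing up as one undifferentiated cluster bill.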

Shipyard believes it is best practice to use one cluster exclusively for production and keep all other environments in a separate cluster. This gives your team room to experiment and make mistakes, while ensuring your production environments will not be affected by lower-priority ones. While this sounds less cost-effective, separating two clusters with very different tolerances and usage profiles allows for more targeted cost management in each.

What is the best way to manage your Kubernetes clusters? Well, that depends on who you ask. In our opinion, a good starting point is looking at the cluster’s size. In general, the number of microservices running in a cluster is a relatively straightforward way to get an idea of your deployment’s size.

  • Small clusters. Deployments with 2-10 microservices need only a single production cluster. These small clusters can handle both production and ephemeral workloads, which allows more cost-effective bin-packing of microservices onto their nodes. This also saves you the monthly fee of running an extra managed Kubernetes cluster.

  • Medium clusters. When a deployment has more than 10 microservices, it becomes more difficult and time-consuming for developers to manage multiple running copies of an application. In these cases, the most cost-effective option is to have developers manually shut down or destroy any ephemeral environments when not in use.

  • Large clusters. When it comes to enterprise-level clusters, removing ephemeral environments may not be a good option. First, it’s time-consuming to restore each environment ad hoc when needed, particularly since more users in the company will mean more usage. Companies can keep costs down by implementing auto-scaling and scaling ephemeral environments to zero when not in use.
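The scale-to-zero idea above can be automated with a scheduled job. A hedged sketch using a Kubernetes CronJob: the namespace name, schedule, and service account are illustrative, and the `scaler` ServiceAccount would need RBAC permission to patch Deployments (omitted here for brevity).

```yaml
# Scales every Deployment in the "preview" namespace to zero each
# weekday evening; a mirror-image job can scale them back up in the
# morning. All names and the schedule are example values.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scale-previews-to-zero
  namespace: preview
spec:
  schedule: "0 20 * * 1-5"   # 20:00, Monday through Friday
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: scaler   # needs RBAC to scale deployments
          restartPolicy: OnFailure
          containers:
            - name: kubectl
              image: bitnami/kubectl:latest
              command:
                - /bin/sh
                - -c
                - kubectl scale deployment --all --replicas=0 -n preview
```

Because the Deployment objects themselves are kept, scaling back up restores each environment without re-creating it from scratch.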

Balancing cluster costs

Another important aspect to consider is the cost ratio between the production cluster and the other ephemeral environments. Since larger companies tend to run significantly more environments, the non-production share of spend often grows with a company’s size.

For example, a smaller company might use one cluster for production and another for development. However, a larger organization may require a higher level of isolation between departments, primarily for security reasons. In this case, using namespaces to isolate deployments can help strengthen security, because namespaces are commonly supported as a scope for access rules and policies.

The type of organization also determines the ratio of production vs. other environments. For example, a single-site SaaS startup will often have fewer ephemeral environments than a large multi-department enterprise software company.

Considering these scenarios, how can we optimize the cost ratio between the production environment and other environments? Here are some ideas:

  • Perform a scheduled prune of your Kubernetes cluster/environments. Performing scheduled cluster maintenance can help identify environments that are no longer in use, which can then be destroyed.

  • Create ephemeral environments with limited resources. When destroying low-use environments isn’t an option, we can combine the strategies discussed above (auto-scaling, scaling to zero) with the allocation of limited resources at the namespace level (via ResourceQuota).
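A ResourceQuota is the standard Kubernetes object for the namespace-level limits mentioned above. A minimal sketch, where the namespace name and the specific limits are illustrative values to tune for your workloads:

```yaml
# Caps the total resources an ephemeral namespace can consume, so a
# forgotten preview environment cannot quietly grow its footprint.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: ephemeral-quota
  namespace: preview
spec:
  hard:
    requests.cpu: "4"        # total CPU requested across all pods
    requests.memory: 8Gi
    limits.cpu: "8"          # total CPU limit across all pods
    limits.memory: 16Gi
    pods: "20"               # hard cap on pod count in the namespace
```

Pairing a quota like this with auto-scaling keeps each ephemeral environment’s worst-case cost bounded, without requiring developers to manage limits per workload.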

