OpenTofu - Infrastructure configuration management

OpenTofu is a great tool for managing infrastructure. In this series of notes I am going to share a few tips I have found useful as I use it for more than just infrastructure management.

Workspace Management

OpenTofu workspaces provide a mechanism for managing multiple environments of your infrastructure using the same OpenTofu code files. By default, OpenTofu operates in a 'default' workspace, but you can create additional workspaces to maintain separate states for different deployments.

Think of it this way: when you write OpenTofu code to manage your resources, that code is like a blueprint. Workspaces allow you to use that same blueprint to manage multiple "instances" of your infrastructure, each with its own state file that tracks what's been managed.

This is particularly valuable for scenarios like:

developing and testing new OpenTofu code in isolation, without disrupting production systems
managing infrastructure in different regions just by using different variable files (e.g. using different encryption keys)
separating different projects that require deploying similar infrastructure (e.g. deploying the same app for multiple clients)

OpenTofu records its operations in a state file, which acts like a ledger that keeps track of all the resources OpenTofu manages. When you run OpenTofu commands, it first checks this state file to understand what exists and what needs to change. After making changes, OpenTofu updates the state file to reflect the new reality of the managed resources.

While all this state management happens automatically in the default workspace, creating additional workspaces provides a way to maintain separate states for independent operations, all while using the same OpenTofu code. This approach provides isolation without requiring duplicated code or complicated workflows.

Let's take as an example using OpenTofu to manage a MikroTik RouterOS configuration.

The resources file would look like this:

# main.tf
...
########## configure VLANs
resource "routeros_interface_vlan" "interface_vlan" {
  for_each = var.vlans

  interface = each.value.interface
  name      = each.value.name
  vlan_id   = each.value.vlan_id
}
########## end configure VLANs

While the code remains the same, the variables differ between environments.

For the testing environment, or for developing the OpenTofu code, we would have this:

# env/testing.tfvars
vlans = {
  "VNET_TEST_1" = {
    interface = "ether1"
    name      = "VNET_TEST_1"
    vlan_id   = 10
  }
  "VNET_TEST_2" = {
    interface = "ether2"
    name      = "VNET_TEST_2"
    vlan_id   = 11
  }
}

For the production systems, the variables would look similar to this:

# env/production.tfvars
vlans = {
  "PRIVNET" = {
    interface = "ether1"
    name      = "PRIVNET"
    vlan_id   = 100
  }
  "DMZ" = {
    interface = "ether2"
    name      = "DMZ"
    vlan_id   = 101
  }
}
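
Both variable files assume that the root module declares a matching vlans variable. That declaration is not shown in these notes, so the following is only a minimal sketch: the type is inferred from the attributes used in main.tf, and the validation rule (VLAN IDs between 1 and 4094) is an assumption added for illustration.

```hcl
# variables.tf (sketch; not part of the original project files shown above)
variable "vlans" {
  description = "VLANs to configure, keyed by VLAN name"
  type = map(object({
    interface = string
    name      = string
    vlan_id   = number
  }))
  default = {}

  # Assumption: reject IDs outside the usable 802.1Q range.
  validation {
    condition = alltrue([
      for vlan in values(var.vlans) : vlan.vlan_id >= 1 && vlan.vlan_id <= 4094
    ])
    error_message = "Each vlan_id must be between 1 and 4094."
  }
}
```

With a typed declaration like this, a typo in either .tfvars file (a missing attribute, a string where a number is expected) should be reported at plan time instead of reaching the router.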

Creating a new workspace requires the workspace command:

$ tofu workspace new testing # where testing is the workspace name

The new workspace is selected automatically, so you can then plan and apply using the testing variables:

$ tofu plan -out=plan.out -var-file=env/$(tofu workspace show).tfvars
$ tofu apply plan.out

If testing is successful, the new RouterOS config can be applied by switching to the production workspace (tofu workspace select production, or tofu workspace new production the first time) and issuing the exact same commands.

When using workspaces, it's important to parameterize your OpenTofu configuration to handle differences between environments. Use OpenTofu variables to define values that may change across workspaces. This allows you to maintain a single source of truth while still customizing each environment as needed.

In this way you can safely develop and test your OpenTofu code in isolation and "promote" the code to production just by using a different variables file.
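
One risk with this workflow is applying a variable file in the wrong workspace. A possible guard, sketched below, compares the selected workspace with a value set in the variable file. The environment variable and the workspace_guard resource are hypothetical names that do not appear in the code above; each env/*.tfvars file would need a line such as environment = "testing".

```hcl
# Hypothetical variable: each env/*.tfvars file would set it,
# e.g. environment = "testing" in env/testing.tfvars.
variable "environment" {
  description = "Name of the environment the variable file was written for"
  type        = string
}

# terraform_data manages no real infrastructure; it is used here only
# to carry the precondition, which is checked during plan.
resource "terraform_data" "workspace_guard" {
  lifecycle {
    precondition {
      condition     = var.environment == terraform.workspace
      error_message = "Variable file is for '${var.environment}' but the selected workspace is '${terraform.workspace}'."
    }
  }
}
```

The trade-off is one extra (empty) resource in each workspace's state, in exchange for a plan that fails early when the two names disagree.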

Organize Your Modules Like a Pro

As OpenTofu codebases grow, managing infrastructure code can become a challenge, especially when the same resource blocks are used across multiple files and projects.

This is where OpenTofu modules come in handy – they're reusable building blocks that encapsulate groups of resource blocks dedicated to specific tasks, making the infrastructure code easier to maintain and consistent.

When organizing modules, it's helpful to group them based on functionality or architectural components – for example, separate modules for networking, compute, and storage components.

This modular approach will reduce code duplication while making the OpenTofu code more readable and maintainable.

Root and child modules

An OpenTofu module is basically a directory containing OpenTofu files. Interestingly, every OpenTofu project is itself a module, called the "root module". This root module includes all resources and variables defined in the .tf files of the main OpenTofu code, with variable values typically supplied through .tfvars files.

For OpenTofu, the concept of modules is recursive, meaning that any module can call other modules, which are known as "child modules" of the calling module. These child modules can even be reused multiple times within the same module code. Think of it like nesting building blocks, where each block (module) can contain other blocks to complete a task.

When using a module block in the OpenTofu code to reference another module, OpenTofu will include all the resources defined in that module as part of the current OpenTofu code.
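
As a minimal sketch of that reuse, the root module below calls the same child module twice with different inputs. The ./modules/vlans path and the office_vlans and lab_vlans variables are hypothetical and assumed to be declared elsewhere in the root module.

```hcl
# Hypothetical child module at ./modules/vlans that accepts a "vlans" map.
module "office_vlans" {
  source = "./modules/vlans"
  vlans  = var.office_vlans
}

# Same module, second instance, different input.
module "lab_vlans" {
  source = "./modules/vlans"
  vlans  = var.lab_vlans
}
```

Each module block gets its own address (module.office_vlans, module.lab_vlans), so the resources created by the two calls are tracked separately in the state.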

Using this approach, the code can be organized hierarchically, which tends to reduce bugs and improve code quality and reusability.

There are several ways to share modules, the more common ones being:

local paths - where the modules are stored in a folder inside the main OpenTofu project
module registry - a remote registry for distributing modules as well as providers
Git - any accessible Git repository can be used as a module source (e.g. GitHub, GitLab, etc.)
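
The three options differ only in the source argument of the module block. The sketch below shows the general shape of each; all module names, the example-org namespace, and the example.com URL are placeholders, not real modules.

```hcl
# Local path: must start with ./ or ../
module "network_local" {
  source = "./modules/network"
}

# Module registry: <namespace>/<name>/<target system>, optionally pinned with version
module "network_registry" {
  source  = "example-org/network/aws"
  version = "~> 1.0"
}

# Git: git:: prefix, // for a subfolder, ?ref= for a tag, branch, or commit
module "network_git" {
  source = "git::https://example.com/example-org/infra-modules.git//network?ref=v1.0.0"
}
```

Pinning a version or a ref is optional, but it keeps a module from changing underneath you between two runs of tofu init.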

Building a module

To create a module, start by creating a new folder and adding the following files:

netbox-customization
├── LICENSE.md
├── README.md
├── example
│   └── data.tfvars
├── main.tf
├── outputs.tf
└── variables.tf

There are no constraints on the file names within an OpenTofu module beyond the extension. OpenTofu loads every .tf file in the module directory, irrespective of its name, and combines them into a single configuration, so the order of the files doesn't matter either. Variable files (.tfvars) are different: they are only read when passed with -var-file, unless they are named terraform.tfvars or end in .auto.tfvars.

However, there are common conventions that I recommend following:

main.tf - primary configuration file
variables.tf - variable declarations
outputs.tf - output declarations
LICENSE.md and README.md - license and readme documentation
example/ - folder containing usage examples and example variables

That said, there are still a few important points to note:

files must have the .tf extension (or .tf.json for JSON format)
override files must be named override.tf or end in _override.tf (or the .tf.json equivalents)
files beginning with . are ignored, as are editor backup files ending in ~
variable files have the .tfvars extension (or .tfvars.json for JSON format)
files ending in .auto.tfvars or .auto.tfvars.json are automatically loaded by OpenTofu to populate variable values without explicitly specifying variable files on the command line

While it's not mandatory, using consistent file names across modules helps fellow colleagues understand the code faster.

# main.tf
...
resource "netbox_custom_field_choice_set" "custom_field_choices" {
  for_each = var.custom_field_choices

  name                 = each.value.name
  extra_choices        = local.custom_field_choices[each.key].choices
  description          = try(each.value.description, null)
  order_alphabetically = try(each.value.order_alphabetically, false)
}

# variables.tf
variable "custom_field_choices" {
  description = "Map of custom field choice sets"
  type = map(object({
    name                 = string
    choices              = list(string)
    description          = optional(string)
    order_alphabetically = optional(bool)
  }))
  default = {}
}
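
The folder structure also lists an outputs.tf, which is not shown here. A minimal sketch of what it could contain follows; the output name is an assumption, and it relies only on the id attribute that every managed resource exposes.

```hcl
# outputs.tf (sketch; the output name is an assumption)
output "custom_field_choice_set_ids" {
  description = "Map of choice set keys to the IDs of the created choice sets"
  value = {
    for key, choice_set in netbox_custom_field_choice_set.custom_field_choices :
    key => choice_set.id
  }
}
```

Exposing IDs this way lets the calling module pass them on to other modules without knowing how the choice sets were created.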

Using the module

as a local module:

# environments/dev/main.tf
module "netbox_customization" {
  source = "./modules/netbox-customization"

  # assumes a root-level variable named "customizations"
  custom_field_choices = var.customizations.custom_field_choices
  ...
}

as a Git stored module:

# environments/dev/main.tf
module "netbox_customization" {
  source = "github.com/rendler-denis/tf-mod-netbox//netbox-customization"

  # assumes a root-level variable named "customizations"
  custom_field_choices = var.customizations.custom_field_choices
  ...
}

Notice the double slashes in the path - that is not a typo. It is needed when the module is residing in a subfolder inside a Git repository.

Now, to download the module and prepare the working directory, run the following command:

$ tofu init

Dynamic Configuration with Data Sources

Data sources in OpenTofu allow querying and fetching information from existing infrastructure or external services, making this data available to the OpenTofu code during plan and apply.

Unlike resources that create and manage infrastructure, data sources are read-only and provide a way to reference existing resources, whether they're managed by OpenTofu or not. For example, OpenTofu can use data sources to look up an existing VPC, query an IPAM system for the first available IP, and so on.

The key advantage of data sources is that they enable dynamic and flexible infrastructure configurations. Instead of hardcoding values that might change over time (like subnet IDs), data sources can be used to automatically fetch the most current information. This approach reduces maintenance overhead, prevents errors from outdated values, and allows OpenTofu code to adapt to changes in the infrastructure automatically.

Let's look at an example:

# Query existing VPC
data "aws_vpc" "existing" {
  tags = {
    Environment = terraform.workspace # name of the currently selected workspace
  }
}

# Query available AZs
data "aws_availability_zones" "available" {
  state = "available"
}

# Create subnets dynamically
resource "aws_subnet" "private" {
  count             = length(data.aws_availability_zones.available.names)
  vpc_id            = data.aws_vpc.existing.id
  cidr_block        = cidrsubnet(data.aws_vpc.existing.cidr_block, 4, count.index)
  availability_zone = data.aws_availability_zones.available.names[count.index]

  tags = {
    Name = "private-subnet-${count.index + 1}"
    Type = "private"
  }
}
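
The cidrsubnet call is what carves a separate range for each availability zone. As a worked example, assuming a VPC range of 10.0.0.0/16 (an illustrative value, not taken from a real VPC), adding 4 bits to the prefix yields /20 subnets, and the index selects which /20 block is returned:

```hcl
locals {
  # Assumed VPC range, for illustration only.
  vpc_cidr = "10.0.0.0/16"

  # 16 + 4 new bits = /20 subnets; the index selects which /20 block.
  subnet_examples = [
    cidrsubnet(local.vpc_cidr, 4, 0), # "10.0.0.0/20"
    cidrsubnet(local.vpc_cidr, 4, 1), # "10.0.16.0/20"
    cidrsubnet(local.vpc_cidr, 4, 2), # "10.0.32.0/20"
  ]
}
```

Four new bits allow at most 16 subnets, which is enough as long as the region has no more than 16 availability zones.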

Efficient Data Handling with Data-Only Modules

Data-only modules in OpenTofu are specialized modules that focus solely on computing and providing data without creating any actual infrastructure resources. They serve as a way to encapsulate complex data transformations, lookups, or business logic that can be reused across different parts of an OpenTofu project.

Using this concept, modules that manage infrastructure can remain single-purpose, decoupled, and overall easier to maintain and share. It also reduces execution time, because calculations and data retrieval are done only once and the results can be reused by all the code executed in the same run.

Let's quickly look at an example data-only module from my Netbox modules.

A data-only module is defined the same way as any other module:

the folder structure

netbox-data-org
├── LICENSE.md
├── README.md
├── main.tf
├── outputs.tf
└── variables.tf

the module code

# main.tf

data "netbox_site" "sites" {
  for_each = toset(var.sites)
  name     = each.value
}

data "netbox_tenant" "tenants" {
  for_each = toset(var.tenants)
  name     = each.value
}
...


# variables.tf
variable "sites" {
  description = "List of site names to retrieve from Netbox"
  type        = list(string)
  default     = []
}

variable "tenants" {
  description = "List of tenant names to retrieve from Netbox"
  type        = list(string)
  default     = []
}
...


# outputs.tf
output "sites_map" {
  description = "Map of site names to their IDs"
  value       = {
    for name, site in data.netbox_site.sites : name => site.id
  }
}

output "tenants_map" {
  description = "Map of tenant names to their IDs"
  value       = {
    for name, tenant in data.netbox_tenant.tenants : name => tenant.id
  }
}
...

using the data-only module in a root module

# main.tf - the main OpenTofu project file (i.e. the root module)
module "netbox-org-data" {
  source = "./lib/netbox/netbox-data-org"

  # assumes a root-level variable named "data" holding the name lists
  sites   = var.data.sites
  tenants = var.data.tenants
  ...
}

# using the fetched data
module "netbox-racks" {
  source = "./lib/netbox/netbox-racks"

  depends_on = [module.netbox-org, module.netbox-org-data]

  ...

  site_id_map     = module.netbox-org-data.sites_map
  tenant_id_map   = module.netbox-org-data.tenants_map
}

# using the same data in other modules
module "netbox-devices" {
  source = "./lib/netbox/netbox-devices"

  depends_on = [module.netbox-org, module.netbox-org-data]

  ...

  site_id_map     = module.netbox-org-data.sites_map
  tenant_id_map   = module.netbox-org-data.tenants_map
}

To access data from a data-only module, simply reference its outputs using the standard module output syntax: module.[module_name].[output_name]. This follows the same pattern you'd use to reference outputs from any other type of module.
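
Inside a consuming module such as netbox-racks, the map can then be used to translate names into IDs. The internals of that module are not shown in these notes, so the following is only a sketch: the racks variable and the racks_with_site_id local are hypothetical, and only site_id_map matches an argument used above.

```hcl
# variables.tf of a hypothetical consumer module
variable "site_id_map" {
  description = "Map of site names to their Netbox IDs"
  type        = map(string)
}

# Hypothetical input: racks refer to their site by name.
variable "racks" {
  description = "Racks to create, keyed by rack name"
  type = map(object({
    name = string
    site = string
  }))
  default = {}
}

locals {
  # Add the resolved site ID to each rack definition.
  racks_with_site_id = {
    for key, rack in var.racks : key => merge(rack, {
      site_id = var.site_id_map[rack.site]
    })
  }
}
```

A rack that names a site missing from the map makes the plan fail with an invalid index error, which is usually preferable to silently creating a rack without a site.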

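The depends_on is not strictly necessary here, because referencing the outputs of module.netbox-org-data already creates an implicit dependency. It does, however, make the dependency explicit for other readers.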

Final thoughts

The combination of OpenTofu's workspaces, modules, and data-only modules creates a robust foundation for maintainable, scalable, and efficient infrastructure code. Workspaces provide the necessary isolation for testing and development, modules reduce code duplication and improve reusability, and data-only modules offer a clean approach to handling complex data operations.

While these concepts can introduce additional complexity, mastering them will help create more robust infrastructure-as-code. These tools represent essential components in the modern infrastructure management toolkit, whether applied to small deployments or large-scale infrastructure projects.

P.S.: The concepts discussed in this note are also applicable to any Terraform code. For brevity, I only mentioned OpenTofu, but feel free to apply these principles to your Terraform projects as well.