How can I load input data from a file in Terraform?

I would use the template_file data source, like so:

data "template_file" "my_file" {
  # Terraform 0.12+ syntax: the "${...}" wrappers are no longer needed.
  template = file("${path.module}/my_file.json")
  vars = {
    var_to_use_in_file = var.my_value
  }
}

Then in your resource block:

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = data.template_file.my_file.rendered
}

The answer here depends on a few different questions:

  • Is this file a static part of your configuration, checked in to version control alongside your .tf files, or is it dynamically generated as part of the apply process?
  • Do you want to use the file contents literally, or do you need to substitute values into it from elsewhere in the Terraform configuration?

These two questions form a matrix of four different answers:

             | Literal Content         | Include Values from Elsewhere
-------------|-------------------------|------------------------------
Static File  | file(...) function      | templatefile(...) function
Dynamic File | local_file data source  | template_file data source

I'll describe each of these four options in more detail below.

A common theme in all of these examples will be references to path.module, which evaluates to the path where the current module is loaded from. Another way to think about that is that it is the directory containing the current .tf file. Accessing files in other directories is allowed, but in most cases it's appropriate to keep things self-contained in your module by keeping the data files and the configuration files together.
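To make that concrete, here is a minimal sketch of building a module-relative path. The local value name input_path and the file name input.json are placeholders chosen for illustration; only path.module itself is a built-in reference.

```hcl
locals {
  # Points into the directory that contains this module's .tf files.
  # "input.json" is a placeholder file name.
  input_path = "${path.module}/input.json"
}

output "input_path" {
  value = local.input_path
}
```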

Terraform strings are sequences of Unicode characters, so the options described here can only read files containing valid UTF-8 encoded text. For JSON that's no problem, but it's worth keeping in mind for other file formats that might not conventionally be UTF-8 encoded.
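For files that aren't valid UTF-8, one possible workaround is the built-in filebase64 function, which returns the file's raw bytes as a base64-encoded string. The following is only a sketch: logo.png is a hypothetical file, and whether this helps depends on the target argument accepting base64 input.

```hcl
locals {
  # Hypothetical binary file stored next to the .tf files.
  # filebase64 does not require the content to be valid UTF-8.
  logo_base64 = filebase64("${path.module}/logo.png")
}

output "logo_base64" {
  value = local.logo_base64
}
```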

The file function

The file function reads the literal content of a file from disk as part of the initial evaluation of the configuration. The content of the file is treated as if it were a literal string value for validation purposes, and so the file must exist on disk (and usually, in your version control) as a static part of your configuration, as opposed to being generated dynamically during terraform apply.

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = file("${path.module}/input.json")
}

This is the most common and simplest option. If the file function is sufficient for your needs then it's the best option to use as a default choice.
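If you also need individual values from a static JSON file elsewhere in the configuration, one approach is to wrap the result of file in the built-in jsondecode function. This is a sketch that assumes input.json contains a JSON object with an instance_id property; the local value and output names are made up for illustration.

```hcl
locals {
  # Raw text of the file, exactly as it would be passed to "input".
  input_raw = file("${path.module}/input.json")

  # Decoded into a Terraform value so that attributes can be referenced.
  input_data = jsondecode(local.input_raw)
}

output "input_instance_id" {
  # Assumes the JSON document has a top-level "instance_id" property.
  value = local.input_data.instance_id
}
```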

The templatefile function

The templatefile function is similar to the file function, but rather than just returning the file contents literally it instead parses the file contents as a string template and then evaluates it using a set of local variables given in its second argument. This is useful if you need to pass some data from elsewhere in the Terraform configuration, as in this example:

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = templatefile("${path.module}/input.json.tmpl", {
    instance_id = aws_instance.example.id
  })
}

In input.json.tmpl you can use the Terraform template syntax to substitute that variable value:

{"instance_id":${jsonencode(instance_id)}}

In cases like this where the whole result is JSON, I'd suggest generating the whole result using jsonencode, since then you can let Terraform worry about the JSON escaping and just write the data structure in Terraform's object syntax:

${jsonencode({
  instance_id = instance_id
})}

As with file, because templatefile is a function it gets evaluated during initial decoding of the configuration and its result is validated as a literal value. The template file must therefore also be a static file that is distributed as part of the configuration, rather than a dynamically-generated file.
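The same jsonencode approach extends to templates that receive more than one value, including collections. The sketch below assumes a hypothetical template file named targets.json.tmpl and a hypothetical input variable named instance_ids; neither appears in the original question.

```hcl
# Hypothetical input variable, declared here so the example is self-contained.
variable "instance_ids" {
  type = list(string)
}

locals {
  targets_json = templatefile("${path.module}/targets.json.tmpl", {
    environment  = "production"
    instance_ids = var.instance_ids
  })
}

# targets.json.tmpl would contain a single interpolation sequence:
#
#   ${jsonencode({
#     environment  = environment
#     instance_ids = instance_ids
#   })}
```

Inside the template, only the names given in the second argument (environment and instance_ids here) are in scope, which is why the template refers to them without a var. prefix.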

The local_file data source

Data sources are special resource types that read an existing object or compute a result, rather than creating and managing a new object. Because they are resources, they can participate in the dependency graph and can thus make use of objects (including local files) that are created by other resources in the same Terraform configuration during terraform apply.

The local_file data source belongs to the local provider and is essentially the data source equivalent of the file function.

In the following example, I'm using var.input_file as a placeholder for any reference to a file path that is created by some other resource in the same configuration. In a real example, that is most likely to be a direct reference to an attribute of a resource.

data "local_file" "input" {
  filename = var.input_file
}

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = data.local_file.input.content
}
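As one illustration of a dynamically-generated file, the sketch below uses a null_resource (from the hashicorp/null provider) with a local-exec provisioner to run a hypothetical script, generate-input.sh, that writes the file during terraform apply. The script name and output path are assumptions. The explicit depends_on is there because the file path alone doesn't tell Terraform that the data source must wait for the provisioner.

```hcl
resource "null_resource" "generate_input" {
  # Hypothetical script that writes JSON to standard output.
  provisioner "local-exec" {
    command = "${path.module}/generate-input.sh > ${path.module}/generated-input.json"
  }
}

data "local_file" "input" {
  filename = "${path.module}/generated-input.json"

  # Defer reading until the file has been generated.
  depends_on = [null_resource.generate_input]
}
```

Note that the provisioner runs only when the null_resource is created, so this sketch generates the file once rather than on every apply.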

The template_file data source

NOTE: Since I originally wrote this answer, the provider where template_file was implemented has been declared obsolete and is no longer maintained, and there is no replacement. In particular, the provider was archived prior to the release of Apple Silicon, so there is no build of it available for macOS on that architecture.

The Terraform team does not recommend rendering dynamically-loaded templates, because it defers various errors that could normally be detected at plan time until apply time.

I've retained this content as I originally wrote it in case it's useful, but I would suggest treating this option as a last resort.

The template_file data source is the data source equivalent of the templatefile function. Its usage is similar to local_file, except that the template itself is passed in as a string, obtained using either the file function or the local_file data source as described above, depending on whether the template is a static file or a dynamically-generated one. For a static file we'd prefer the templatefile function, so this example uses the local_file data source:

data "local_file" "input_template" {
  filename = var.input_template_file
}

data "template_file" "input" {
  template = data.local_file.input_template.content
  vars = {
    instance_id = aws_instance.example.id
  }
}

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = data.template_file.input.rendered
}

The templatefile function was added in Terraform 0.12.0, so you may see examples elsewhere of using the template_file data source to render static template files. That is an old pattern, now deprecated in Terraform 0.12, because the templatefile function makes for a more direct and readable configuration in most cases.
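As a rough before-and-after sketch of that migration, assuming a static template named input.json.tmpl next to the .tf files (the local value name input_json is made up for illustration):

```hcl
# Old pattern, shown only for comparison:
#
#   data "template_file" "input" {
#     template = file("${path.module}/input.json.tmpl")
#     vars = {
#       instance_id = aws_instance.example.id
#     }
#   }
#
# Equivalent using the built-in function, with no extra provider needed:
locals {
  input_json = templatefile("${path.module}/input.json.tmpl", {
    instance_id = aws_instance.example.id
  })
}
```

References to data.template_file.input.rendered would then become local.input_json.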

One quirk of the template_file data source as opposed to the templatefile function is that the data source belongs to the template provider rather than to Terraform Core, and so which template features are available in it will depend on which version of the provider is installed rather than which version of Terraform CLI is installed. The template provider is likely to lag behind Terraform Core in terms of which template language features are available, which is another reason to prefer the templatefile function where possible.
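If you do have to keep using the template provider, it may help to pin its version explicitly so that the available template features don't change unexpectedly. A minimal sketch, assuming Terraform 0.13 or later (for the source argument) and the 2.2 release series of hashicorp/template:

```hcl
terraform {
  required_providers {
    template = {
      source  = "hashicorp/template"
      version = "~> 2.2"
    }
  }
}
```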

Other Possibilities

This question was specifically about reading data from a file, but for completeness I also want to note that for small JSON payloads it can sometimes be preferable to inline them directly in the configuration as a Terraform data structure and convert to JSON using jsonencode, like this:

resource "aws_cloudwatch_event_target" "data" {
  rule      = aws_cloudwatch_event_rule.scheduler.name
  target_id = "finance_producer_cloudwatch"
  arn       = aws_lambda_function.finance_data_producer.arn
  input     = jsonencode({
    instance_id = aws_instance.example.id
  })
}

Writing the data structure inline as a Terraform expression means that a future reader can see directly what will be sent without needing to refer to a separate file. However, if the data structure is very large and complicated then it can hurt overall readability to include it inline because it could overwhelm the other configuration in the same file.

Which option to choose will therefore depend a lot on the specific circumstances, but it's always worth considering whether the indirection of a separate file is the best choice for readability.

Terraform also has a yamlencode function (experimental at the time of writing) which can do the same for YAML-formatted data structures, either directly inside a .tf file or in an interpolation sequence in an external template.
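A minimal sketch of the inline form, reusing the aws_instance.example reference from the earlier examples; the schedule attribute and the local value name are invented for illustration:

```hcl
locals {
  # Renders the object as a YAML document string.
  input_yaml = yamlencode({
    instance_id = aws_instance.example.id
    schedule    = "daily"
  })
}
```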
