Snakemake using a rule in a loop

I think this is a nice opportunity to use recursion. Rather than writing out a conditional for every iteration, write a single rule that takes iteration n-1 to iteration n, with the first iteration reading from the original input. Something along these lines:

SAMPLES = ["SampleA", "SampleB"]

rule all:
    input:
        expand("loop3/{sample}.txt", sample=SAMPLES)

def recurse_sample(wcs):
    n = int(wcs.n)
    if n == 1:
        return "test/%s.txt" % wcs.sample
    elif n > 1:
        return "loop%d/%s.txt" % (n-1, wcs.sample)
    else:
        raise ValueError("loop numbers must be 1 or greater: received %s" % wcs.n)

rule loop_n:
    input: recurse_sample
    output: "loop{n}/{sample}.txt"
    wildcard_constraints:
        sample="[^/]+",
        n="[0-9]+"
    shell:
        """
        awk -v loop='loop{wildcards.n}' '{{print $0, loop}}' {input} > {output}
        """

As @RussHyde said, you need to be proactive about ensuring no infinite loops are triggered. To this end, recurse_sample covers every case (n == 1, n > 1, and an error for anything else), and wildcard_constraints restricts n to digits and sample to a single path component, so the rule only matches the paths it is meant to.
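
To see why the recursion terminates, the chain of inputs can be traced in plain Python, outside Snakemake. This is only a sketch: recurse_sample is copied from above, but trace_chain and OUTPUT_PATTERN are hypothetical helpers, and the regular expression merely approximates how the output pattern is matched under the wildcard constraints.

```python
import re
from types import SimpleNamespace


def recurse_sample(wcs):
    # Same logic as the input function in the Snakefile above.
    n = int(wcs.n)
    if n == 1:
        return "test/%s.txt" % wcs.sample
    elif n > 1:
        return "loop%d/%s.txt" % (n - 1, wcs.sample)
    else:
        raise ValueError("loop numbers must be 1 or greater: received %s" % wcs.n)


# Assumption: a stand-in for "loop{n}/{sample}.txt" with the constraints
# n="[0-9]+" and sample="[^/]+".
OUTPUT_PATTERN = re.compile(r"loop(?P<n>[0-9]+)/(?P<sample>[^/]+)\.txt")


def trace_chain(target):
    """Follow inputs backwards until a file no longer matches the rule."""
    chain = [target]
    match = OUTPUT_PATTERN.fullmatch(target)
    while match:
        wcs = SimpleNamespace(**match.groupdict())
        target = recurse_sample(wcs)
        chain.append(target)
        match = OUTPUT_PATTERN.fullmatch(target)
    return chain


print(trace_chain("loop3/SampleA.txt"))
```

Each step lowers n by one, and test/SampleA.txt does not match the pattern, so the chain stops there. Asking for loop0/SampleA.txt raises the ValueError instead of recursing further.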


My understanding is that your rules are converted to Python code before they are run, and that all the raw Python code present in your Snakefile is run sequentially during this process. Think of it as your Snakemake rules being evaluated as Python functions.

But there's a constraint that any rule can only be evaluated to a function once.

You can have if/else expressions and differentially evaluate a rule (once) based on config values etc, but you can't evaluate a rule multiple times.

I'm not really sure how to rewrite your Snakefile to achieve what you want. Is there a real example that you could give where looping constructs appear to be required?

--- Edit

For a fixed number of iterations, it may be possible to use an input function to run the rule several times. (I would caution against doing this, though: be extremely careful to disallow infinite loops.)

SAMPLES = ["SampleA", "SampleB"]

rule all:
    input:
        # Output of the final loop
        expand("loop3/{sample}.txt", sample = SAMPLES)

def looper_input(wildcards):
    # could be written more cleanly with a dictionary
    if (wildcards["prefix"] == "loop0"):
        input = "test/{}.txt".format(wildcards["sample"])
    elif (wildcards["prefix"] == "loop1"):
        input = "loop0/{}.txt".format(wildcards["sample"])
    ...
    return input


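rule looper:
    input:
        looper_input
    output:
        "{prefix}/{sample}.txt"
    params:
        # Snakemake fills "{prefix}" from the wildcards; a bare `prefix` is an undefined Python name
        add="{prefix}"
    # Pass the label in with -v so awk prints it as a string rather than as an unset awk variable
    shell:
        "awk -v add='{params.add}' '{{print $0, add}}' {input} > {output}"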
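
The comment in looper_input mentions that a dictionary would be cleaner. A minimal sketch of that idea follows, assuming the loops run from loop0 to loop3 as implied by the target in rule all. N_LOOPS and PREVIOUS_DIR are hypothetical names, and a plain dict stands in for the wildcards object here.

```python
# Assumption: four iterations, loop0 .. loop3, matching the loop3 target in rule all.
N_LOOPS = 4

# Map each loop directory to the directory its input comes from.
PREVIOUS_DIR = {"loop0": "test"}
PREVIOUS_DIR.update({"loop%d" % i: "loop%d" % (i - 1) for i in range(1, N_LOOPS)})


def looper_input(wildcards):
    prefix = wildcards["prefix"]
    if prefix not in PREVIOUS_DIR:
        # Fail loudly rather than risk an unintended chain of rule applications.
        raise ValueError("unexpected prefix: %s" % prefix)
    return "%s/%s.txt" % (PREVIOUS_DIR[prefix], wildcards["sample"])


print(looper_input({"prefix": "loop3", "sample": "SampleA"}))
```

Because the output pattern "{prefix}/{sample}.txt" can also match test/SampleA.txt, it is probably worth pairing this with a wildcard_constraints entry that limits prefix to the loop directories, so the input function is never called for the original files.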