Do you know what I think about most while working?

Why am I working?

You see, my job isn’t about creating wonderful pieces of art. My job isn’t about solving the world’s problems. My job isn’t about entertaining people. My job is about finishing my PhD.

Quit the whining and take me to the tutorial.

If you are thinking about doing a PhD for any other reason than having a PhD, reconsider. Actually, no I’m just kidding. We just don’t like people like you. With your god damn ambition and your barbie doll outlook on life. And your happiness.

But I digress.

A PhD such as mine involves two kinds of activities:

  1. Ask insightful scientific questions that drive the discovery of amazing new phenomena that may change the world. Try to answer the questions. You don’t need to be a scientist to do the former, and the latter is impossible (or at least somewhat unlikely). Some people think being in academia has made me a cynic. I can’t reasonably dispute this[1].

  2. Running 10,000 simulations. Luckily for me, I’m not an experimentalist. Unluckily for me, I’m a computationalist[2]. And being a computationalist involves running countless simulations using weird esoteric software that was written by other computationalists. Or if you’re really stupid, software that you wrote yourself. And this means writing 10,000 input files.

At the moment these are Gromacs input files. This is what my normal day looks like:

  1. Write input files 1 to 29. Use a template file because I’m smart. Write changes to each file manually because I’m not smart.
  2. Launch simulations 3, 7, 13 on server3c
  3. Launch simulations 5, 8, 1 on server1a
  4. Procrastinate
  5. Launch simulations 2, 9 on server13
  6. All simulations crash because of wrong input
  7. Shoot self with gun
     1. Gun doesn’t fire because of wrong input
     2. :’(
     3. Manually go through 29 input files and fix mistake
     4. goto 2

Running 10,000 simulations means writing 10,000 input files. So when I’m working, I’m writing 10,000 input files. Do you know what I think about most while I’m working?

When you procrastinate: self intoxicate, generously lubricate, deeply aspirate

And automate![3]

Here, I’ll show you how to automate the creation of 10,000 gromacs .mdp files, though the method works for any input file that is plain text. It can also be used to generate any sort of repetitive code.

We’ll be using a template engine. What is a template engine you may ask? I’m not entirely sure. But we can use one to automate the creation of many input files. And that is all that matters for now.

There are a few out there, but we’ll be using ibis because it’s lightweight, it’s written in python, it’s in the public domain, and its website is so gosh darn sexy.

It was written for creating static websites lazily (mine is one of these). But it should work the same for .mdp files, or any kind of text files.

First we should create a template. Here is a gromacs template file for some simulations. I want to run about 100 simulations, one per kelvin from 200 K to 300 K (101 files, if you insist on counting). My .mdp file template looks like this:

; template.mdp
; ...
; Crap I don't care about right now
; ...

ref_t = {{temp}} ; This is template markup. Note the double curly braces.

; ...
; More crap I don't care about right now
; ...

gen_temp = {{temp}} ; what we can do it twice!?!?

{{temp}} will hold the value for the temperature, and we’ll script creating the files with python and ibis. The name inside the braces, temp, is a key in a dictionary that we will give Ibis. Ibis will then replace each {{temp}} with the value stored under that key. Of course, we will be sneaky and change the value on the fly.
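If the key-and-value business sounds abstract, here is a toy stand-in written with nothing but the standard library. To be clear, this is not Ibis and I make no claim about how Ibis works inside; the function name render_toy and the regular expression are invented for illustration, and the sketch only understands plain {{name}} placeholders.

```python
import re

# Toy stand-in for a template engine: NOT Ibis, just the same idea in miniature.
# Matches {{name}}, allowing optional spaces inside the braces.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")

def render_toy(text, values):
    """Replace every {{key}} in text with str(values[key])."""
    def substitute(match):
        key = match.group(1)
        return str(values[key])  # raises KeyError if the key is missing
    return PLACEHOLDER.sub(substitute, text)

template_text = "ref_t = {{temp}}\ngen_temp = {{temp}}\n"
print(render_toy(template_text, {"temp": 200}))
```

Both occurrences of {{temp}} get the same value because they look up the same key, which is the whole trick.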

Let’s start.

$ pip install ibis
$ mkdir input_files
$ cp template.mdp input_files/
$ cd input_files

# run.py
import ibis

# read the whole template file into a string and wrap it in an ibis template
with open("template.mdp", "r") as tempFile:
    template = ibis.Template(tempFile.read())

# now for the fun bit:
for t in range(200, 301):  # 200 to 300 inclusive, so 101 files
    with open("temp_{}.mdp".format(t), "w") as outFile:
        d = {"temp": t}  # replace {{temp}} with the value in the variable t
        outFile.write(template.render(d))  # render() returns a single string

Now to generate the files. Kiss your SSD goodbye.

$ python run.py
$ ls -v temp_*
temp_200.mdp  temp_215.mdp  temp_230.mdp  temp_245.mdp  temp_260.mdp  temp_275.mdp  temp_290.mdp
temp_201.mdp  temp_216.mdp  temp_231.mdp  temp_246.mdp  temp_261.mdp  temp_276.mdp  temp_291.mdp
temp_202.mdp  temp_217.mdp  temp_232.mdp  temp_247.mdp  temp_262.mdp  temp_277.mdp  temp_292.mdp
temp_203.mdp  temp_218.mdp  temp_233.mdp  temp_248.mdp  temp_263.mdp  temp_278.mdp  temp_293.mdp
temp_204.mdp  temp_219.mdp  temp_234.mdp  temp_249.mdp  temp_264.mdp  temp_279.mdp  temp_294.mdp
temp_205.mdp  temp_220.mdp  temp_235.mdp  temp_250.mdp  temp_265.mdp  temp_280.mdp  temp_295.mdp
temp_206.mdp  temp_221.mdp  temp_236.mdp  temp_251.mdp  temp_266.mdp  temp_281.mdp  temp_296.mdp
temp_207.mdp  temp_222.mdp  temp_237.mdp  temp_252.mdp  temp_267.mdp  temp_282.mdp  temp_297.mdp
temp_208.mdp  temp_223.mdp  temp_238.mdp  temp_253.mdp  temp_268.mdp  temp_283.mdp  temp_298.mdp
temp_209.mdp  temp_224.mdp  temp_239.mdp  temp_254.mdp  temp_269.mdp  temp_284.mdp  temp_299.mdp
temp_210.mdp  temp_225.mdp  temp_240.mdp  temp_255.mdp  temp_270.mdp  temp_285.mdp  temp_300.mdp
temp_211.mdp  temp_226.mdp  temp_241.mdp  temp_256.mdp  temp_271.mdp  temp_286.mdp
temp_212.mdp  temp_227.mdp  temp_242.mdp  temp_257.mdp  temp_272.mdp  temp_287.mdp
temp_213.mdp  temp_228.mdp  temp_243.mdp  temp_258.mdp  temp_273.mdp  temp_288.mdp
temp_214.mdp  temp_229.mdp  temp_244.mdp  temp_259.mdp  temp_274.mdp  temp_289.mdp

Each with the proper value of temperature.

$ diff -y -d --suppress-common-lines temp_200.mdp temp_201.mdp
ref_t = 200 ; note the double curly braces.		      |	ref_t = 201 ; note the double curly braces.
gen_temp = 200 ; what we can do it twice!?!?		      |	gen_temp = 201 ; what we can do it twi

“But wait!”, you say. I am Smarty McNotLazyPants. I don’t need this. Why not just do

template.replace("{{temp}}", str(t))

Well, I don’t know. You could just do that I suppose. But then you’re not Maximizing Laziness. And with a template engine you can do some more interesting things.

Let’s make things a little more complicated. I have a protein that I want to simulate in a variety of conditions. I’m going to do this because I don’t know what is going on. And maybe something interesting will happen if I run enough simulations. What else am I going to do?

I want to run simulations at a variety of temperatures and pressures with the Berendsen barostat, which has a number of parameters of its own that I will tweak if the pressure is higher than 1.2 bar. I suspect that the most interesting case is at 285 K, so at that temperature I will write the trajectory more frequently and run the simulation with a smaller time step. I could just use a string replace (or even sed, shudder), but this way the script is slightly easier to read. And I can generate all the input files with one template file and one script.

Again, the template input file (I’ve cut the irrelevant bits out):

dt                       = {{dt}}
nstxtcout                = {{out_freq}}
ref_t                    = {{temp}}

ref_p                    = {{pressure}}

gen_vel                  = yes
gen_temp                 = {{temp}}
compressibility          = {{comp}}

# awesome.py
import ibis
# placeholder: stands in for whatever picks the compressibility at high pressure
def SomeFunction(*args):
    return 5e-5

# these parameters are for a Coarse-Grained simulation
pressures = [0.5, 1.0, 1.5] # in bar
temperatures = range(200, 301)
out_freq1 = 25000
out_freq2 = 5000
dt1 = 0.02
dt2 = 0.01
comp1 = 4.5e-5

# read the whole template file into a string and wrap it in an ibis template
with open("template.mdp", "r") as tempFile:
    template = ibis.Template(tempFile.read())

# now for the fun bit:
for t in temperatures:
    for p in pressures:
        paramDict = {"temp":t, "pressure": p, "dt": dt1,
                     "out_freq": out_freq1, "comp":comp1}

        if t == 285:
            paramDict["out_freq"] = out_freq2
            paramDict["dt"] = dt2

        if p >= 1.2:
            paramDict["comp"] = SomeFunction(paramDict["temp"], paramDict["dt"], p) # I can let someone else make this decision.

        with open("input_{}_{}.mdp".format(t, p), "w") as outFile:
            outFile.write(template.render(paramDict))  # render() returns a single string

$ python awesome.py
$ ls -v input*.mdp
input_200_0.5.mdp
input_200_1.0.mdp
input_200_1.5.mdp
...
$ cat input_285_1.5.mdp
dt                       = 0.01
nstxtcout                = 5000
ref_t                    = 285

ref_p                    = 1.5

gen_vel                  = yes
gen_temp                 = 285
compressibility          = 5e-05

Needless to say, the files you generate are still subject to the rules of the program you intend to use them with. Don’t blame me if something breaks.
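Before blaming anyone, a cheap sanity check might help. The sketch below is my own addition, not something Ibis provides, and the name check_rendered is made up. I haven’t verified what Ibis does with a key that is missing from the dictionary, so the check looks for both plausible symptoms: a marker that was never replaced, and a parameter with nothing after the equals sign. It will not catch values that are merely physically silly.

```python
import re

# A {{...}} marker that survived rendering.
LEFTOVER = re.compile(r"\{\{.*?\}\}")
# "key =" followed by nothing (or only a ; comment).
EMPTY_VALUE = re.compile(r"^\s*[\w-]+\s*=\s*(;.*)?$")

def check_rendered(text):
    """Return a list of (line_number, problem) for suspicious lines."""
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if LEFTOVER.search(line):
            problems.append((number, "unrendered placeholder"))
        elif EMPTY_VALUE.match(line):
            problems.append((number, "empty value"))
    return problems

good = "ref_t = 285\nref_p = 1.5\n"
bad = "ref_t = {{temp}}\nref_p = \n"

print(check_rendered(good))
print(check_rendered(bad))
```

Running it over each rendered string before writing the file would turn a crashed simulation into an error message you see immediately.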

The most interesting thing about this is that, with a combination of different templates and Ibis template markup, we can write fairly complicated scripts that produce a diverse set of input files. And because the whole thing is in python, it is possible to automate setting up and running many different simulations related to a certain system. But be careful, too much automation makes hair grow on your palms.
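As a rough sketch of where that could go: itertools.product flattens the nested temperature and pressure loops, and the same loop can build the preprocessing command for each input file. This assumes a Gromacs version that has the gmx wrapper; the grompp flags -f, -c, -p and -o are real, but conf.gro and topol.top are placeholder file names I made up, and nothing is actually executed here.

```python
import itertools

temperatures = range(200, 301)   # K, same grid as in awesome.py
pressures = [0.5, 1.0, 1.5]      # bar

def grompp_command(t, p):
    """Build (but do not run) a grompp command for one generated input file."""
    stem = "input_{}_{}".format(t, p)
    return ["gmx", "grompp",
            "-f", stem + ".mdp",   # run parameters generated from the template
            "-c", "conf.gro",      # hypothetical starting structure
            "-p", "topol.top",     # hypothetical topology
            "-o", stem + ".tpr"]   # run input file that grompp writes

commands = [grompp_command(t, p)
            for t, p in itertools.product(temperatures, pressures)]

print(len(commands))
print(" ".join(commands[0]))
```

Keeping each command as a list means it could later be handed to subprocess.run without any shell quoting headaches, should you trust yourself that far.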

Ibis is extremely powerful and you can do all kinds of amazing things with it. It even supports complicated programming done in the template itself. Though I suspect I won’t be needing the complicated bits. I recommend visiting its beautiful website. Drink plenty of fluids.

For some resources on learning python see A Whirlwind Tour of Python and, especially for data scientists (basically all scientists), Python Data Science Handbook. These books are free, open source and focused on data processing and analysis. Why are there so many online resources for unimportant bullshit? WTF is Django? By the way, Django itself ships with a template engine.


  1. This sentence is very useful if you ever write a paper that gets to peer review. 

  2. Noun: Someone who runs computational experiments (simulations). Was that too hard, Oxford dictionary? How dare you call me a Psychologist? Ew

  3. It took me only 90 minutes to come up with this rhyming joke. Good thing I did that at my office, where I don’t have anything important to do.