What should you do if your long-running simulation gets accidentally killed? Have it automatically recover from a checkpoint, of course!

Attached is a Python script that counts slowly from zero to nine:

$ python2.5 checkpoint.py
0
1
2
3
4
5
6
7
8
9

However, the state of the counter is saved at every step, so if the process dies, the next run picks up where the last one left off:

$ python2.5 checkpoint.py
0
1
2
^CTraceback (most recent call last):
  File "checkpoint.py", line 41, in <module>
    for i in checkpointed_range(10):
  File "checkpoint.py", line 22, in next
    sleep(1);
KeyboardInterrupt
$ python2.5 checkpoint.py
3
4
5
6
7
8
9
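
The attached script is not reproduced here, so the following is only a minimal sketch of how a generator like checkpointed_range might behave, not the attachment itself. It assumes Python 3 rather than the python2.5 shown in the transcripts. The file name checkpoint.pkl, the delay parameter, the use of pickle, and the choice to delete the checkpoint after a clean finish are all assumptions made for illustration.

```python
import os
import pickle
import time


def checkpointed_range(n, path="checkpoint.pkl", delay=1.0):
    """Yield integers up to n - 1, resuming from the value saved in `path`.

    Hypothetical sketch: `path` and `delay` are illustrative parameters,
    not taken from the attached script.
    """
    # Restore the next value to produce, if an earlier run left a checkpoint.
    start = 0
    if os.path.exists(path):
        with open(path, "rb") as f:
            start = pickle.load(f)

    for i in range(start, n):
        yield i  # the caller's loop body runs while we are paused here

        # The caller has finished step i, so record i + 1 as the next step.
        # Write to a temporary file first, then rename it over the old
        # checkpoint, so a kill mid-write does not corrupt the saved state.
        tmp = path + ".tmp"
        with open(tmp, "wb") as f:
            pickle.dump(i + 1, f)
        os.replace(tmp, path)

        time.sleep(delay)  # count slowly, as in the transcripts

    # Clean finish: remove the checkpoint so the next run starts from 0.
    if os.path.exists(path):
        os.remove(path)
```

Under these assumptions, a loop such as "for i in checkpointed_range(10): print(i)" would behave like the transcripts above: interrupting it during the sleep after 2 is printed leaves 3 in the checkpoint, so the next run starts at 3. If the process dies while the loop body is still working on a step, that step is not yet recorded and is repeated on restart, so the body should be safe to run twice.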

Attachments:

  1. checkpoint.py