Integrating Pylearn2 and Jobman
Jan 13, 2014
This post is adapted from an IPython notebook I wrote as part of a pull request to the Pylearn2 documentation. I assume the reader is familiar with Pylearn2 (mostly its YAML file framework for describing experiments) and with Jobman, a tool to launch and manage experiments.
The problem
Suppose you have a YAML file describing an experiment which looks like this:
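The exact contents don't matter much; for concreteness, here's a minimal, illustrative sketch (an MLP trained on MNIST, with placeholder values for the two hyperparameters we'll tune):

```yaml
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        stop: 50000,
    },
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 784,
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 500,
                sparse_init: 15,
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 10,
                irange: 0.,
            },
        ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        batch_size: 100,
        learning_rate: 1e-3,
        learning_rule: !obj:pylearn2.training_algorithms.learning_rule.Momentum {
            init_momentum: 0.5,
        },
        monitoring_dataset: *train,
        termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
}
```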
You’re not sure if the learning rate and the momentum coefficient are optimal, though, and you’d like to try different hyperparameter values to see if you can come up with something better.
One (painful) way to do it would be to create multiple copies of the YAML file
and, for each copy, manually change the value of the learning rate and the
momentum coefficient. You’d then call the train
script on each of these
copies. This solution is not satisfying for multiple reasons:
- This is long and tedious
- There’s lot of code duplication going on
- You’d better be sure there are no errors in the original YAML file, or else you’re in for a nice editing ride (been there)
Ideally, the solution should involve a single YAML file and some way of specifying how hyperparameters should be handled. One such solution exists, thanks to Pylearn2 and Jobman.
Solution overview
Pylearn2 can instantiate a Train
object specified by a YAML string via the
pylearn2.config.yaml_parse.load
method; using this method and Python’s string
substitution syntax, we can “fill the blanks” of a template YAML string based
on our original YAML file and run the experiment described by that string.
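In miniature, the mechanism looks like this (a single MLP layer instead of a full Train object, to keep the sketch short):

```python
from pylearn2.config import yaml_parse

# A YAML template whose blank is a Python string-substitution statement.
template = "!obj:pylearn2.models.mlp.Sigmoid " \
           "{ layer_name: 'h0', dim: %(dim)i, irange: 0.05 }"

# Filling the blank with a hyperparameter dictionary yields concrete YAML,
# which yaml_parse.load turns into a live Python object.
layer = yaml_parse.load(template % {'dim': 500})
```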
In order to do that, we'll need a dictionary mapping hyperparameter names to their values. This is where Jobman proves useful: Jobman accepts configuration files describing a job's parameters, and its syntax allows parameters to be initialized by calling an external Python method. This way, we can randomly sample hyperparameters for our experiment.
To summarize it all, we will
- Adapt the YAML file by replacing hyperparameter values with string substitution statements
- Write a configuration file specifying how to initialize the hyperparameter dictionary
- Read the YAML file into a string
- Fill in hyperparameter values using string substitution with the hyperparameter dictionary
- Instantiate a Train object with the YAML string by calling pylearn2.config.yaml_parse.load
- Call the Train object's main_loop method
- Extract results from the trained model
Let’s break it down.
Adapting the YAML file
This step is pretty straightforward. Looking back at our example sketch, the only lines we have to replace are
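```yaml
learning_rate: 1e-3,
init_momentum: 0.5,
```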
Using string substitution syntax, they become
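```yaml
learning_rate: %(learning_rate)f,
init_momentum: %(init_momentum)f,
```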
String substitution and training logic
The next step, assuming we already have a dictionary mapping hyperparameters to their values, would be to build a method which
- takes the YAML string and the hyperparameter dictionary as inputs,
- does string substitution on the YAML string,
- calls the pylearn2.config.yaml_parse.load method to instantiate a Train object and calls its main_loop method, and
- extracts and returns results after the model is trained.
Luckily for us, one such method already exists: pylearn2.scripts.jobman.experiment.train_experiment. This method integrates with Jobman: it expects state and channel arguments as input and returns channel.COMPLETE at the end of training.
Here’s the method’s full implementation:
As you can see, it builds a dictionary out of state.hyper_parameters and uses it to do string substitution on state.yaml_template.
It then instantiates the Train
object as described in the YAML string and
calls its main_loop
method.
Finally, once training is done, it calls the method referenced in the state.extract_results string, passing it the Train object as argument. This method is responsible for extracting any relevant results from the Train object and returning them, either as-is or in a DD object. The return value is stored in state.results.
Writing the extraction method
Your extraction method should accept a Train object instance and return either a single value (float, int, str, etc.) or a DD object containing your values.
For the purpose of this tutorial, let's write a simple method which extracts the misclassification rate and the NLL from the model's monitor. A minimal version could look like this (the y_misclass and y_nll channel names match the softmax layer named 'y' in our example sketch; adapt them to your model):
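```python
from jobman.tools import DD


def results_extractor(train_obj):
    # The monitor keeps one channel per monitored quantity; val_record
    # holds one value per training epoch, so [-1] is the last epoch.
    channels = train_obj.model.monitor.channels
    return DD(misclassification_rate=float(channels['y_misclass'].val_record[-1]),
              nll=float(channels['y_nll'].val_record[-1]))
```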
Here we extract misclassification rate and NLL values at the last training
epoch from their respective channels of the model’s monitor and return a DD
object containing those values.
Building the hyperparameter dictionary
Let’s now focus on the last piece of the puzzle: the Jobman configuration file. Your configuration file should contain
- yaml_template: a YAML string representing your experiment
- hyper_parameters.[name]: the value of the [name] hyperparameter. You must have at least one such item, but you can have as many as you want.
- extract_results: a string written in module.method form representing the result extraction method which is to be used
Here’s how a configuration file could look for our experiment:
Notice how we’re using the key:=@method
statement. This serves two purposes:
- We don’t have to copy the yaml file to the configuration file as a long, hard to edit string.
- We don’t have to hard-code hyperparameter values, which means every time Jobman is called with this configuration file, it’ll receive different hyperparameters.
For reference, here's what utils.log_uniform's implementation could look like:
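```python
import numpy


def log_uniform(low, high):
    """Return a float drawn uniformly in log-space between low and high."""
    log_rval = numpy.random.uniform(numpy.log(low), numpy.log(high))
    return float(numpy.exp(log_rval))
```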
Running the whole thing
Here’s how you would train your model:
Alternatively, you can chain jobs using jobdispatch; for example, to run ten such jobs locally (the exact flags depend on your jobdispatch back-end):
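```
jobdispatch --local --repeat_jobs=10 jobman cmdline \
    pylearn2.scripts.jobman.experiment.train_experiment mlp.conf
```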