This post is adapted from an IPython notebook I wrote, which is part of a pull request to be added to the Pylearn2 documentation. I assume the reader is familiar with Pylearn2 (mostly its YAML file framework for describing experiments) and with Jobman, a tool to launch and manage experiments.
Suppose you have a YAML file describing an experiment which looks like this:
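For illustration, a minimal file in that spirit could describe an MLP trained with SGD on MNIST. The class paths below are real Pylearn2 ones, but every value is illustrative:

```yaml
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
        stop: 50000,
    },
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 784,
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 500,
                irange: 0.05,
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 10,
                irange: 0.05,
            },
        ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        batch_size: 100,
        learning_rate: 0.05,
        learning_rule: !obj:pylearn2.training_algorithms.learning_rule.Momentum {
            init_momentum: 0.5,
        },
        monitoring_dataset: *train,
        termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 10,
        },
    },
}
```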
You’re not sure if the learning rate and the momentum coefficient are optimal, though, and you’d like to try different hyperparameter values to see if you can come up with something better.
One (painful) way to do it would be to create multiple copies of the YAML file
and, for each copy, manually change the value of the learning rate and the
momentum coefficient. You’d then call the
train script on each of these
copies. This solution is not satisfying for multiple reasons:
- This is long and tedious
- There’s lot of code duplication going on
- You’d better be sure there are no errors in the original YAML file, or else you’re in for a nice editing ride (been there)
Ideally, the solution should involve a single YAML file and some way of specifying how hyperparameters should be handled. One such solution exists, thanks to Pylearn2 and Jobman.
Pylearn2 can instantiate a
Train object specified by a YAML string via the
pylearn2.config.yaml_parse.load method; using this method and Python’s string
substitution syntax, we can “fill the blanks” of a template YAML string based
on our original YAML file and run the experiment described by that string.
In order to do that, we’ll need a dictionary mapping hyperparameter names to their values. This is where Jobman will prove useful: Jobman accepts configuration files describing a job’s parameters, and its syntax allows parameters to be initialized by calling an external Python method. This way, we can randomly sample hyperparameters for our experiment.
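For example, Python’s `%`-style string substitution fills named placeholders in a template from a dictionary:

```python
# A named placeholder in the template is filled from the dictionary:
yaml_template = "learning_rate: %(learning_rate)f"
hyper_parameters = {"learning_rate": 0.05}
print(yaml_template % hyper_parameters)  # learning_rate: 0.050000
```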
To summarize it all, we will
- Adapt the YAML file by replacing hyperparameter values with string substitution statements
- Write a configuration file specifying how to initialize the hyperparameter dictionary
- Read the YAML file into a string
- Fill in hyperparameter values using string substitution with the hyperparameter dictionary
- Instantiate a `Train` object from the YAML string by calling `pylearn2.config.yaml_parse.load`
- Call the `Train` object’s `main_loop` method
- Extract results from the trained model
Let’s break it down.
Adapting the YAML file
This step is pretty straightforward. Looking back at our example, the only lines we have to replace are
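Assuming the original file sets the learning rate and momentum coefficient as in the illustrative values above (shown side by side here, though they sit in different places in the file):

```yaml
learning_rate: 0.05,
init_momentum: 0.5,
```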
Using string substitution syntax, they become
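With Python’s named string-substitution placeholders in place of the hard-coded values:

```yaml
learning_rate: %(learning_rate)f,
init_momentum: %(init_momentum)f,
```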
String substitution and training logic
The next step, assuming we already have a dictionary mapping hyperparameters to their values, would be to build a method which
- takes the YAML string and the hyperparameter dictionary as inputs,
- does string substitution on the YAML string,
- calls the `pylearn2.config.yaml_parse.load` method to instantiate a `Train` object and calls its `main_loop` method,
- extracts and returns results after the model is trained.
Luckily for us, one such method already exists: `train_experiment`, found in `pylearn2.scripts.jobman.experiment`. This method integrates with Jobman: it expects `state` and `channel` arguments as input and returns `channel.COMPLETE` at the end of training.
Here’s the method’s full implementation:
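A simplified, runnable sketch of the method’s logic is shown below, using a plain dict in place of Jobman’s `state` object and a stand-in for the actual Pylearn2 calls (illustrative only, not the real implementation):

```python
def train_experiment(state, channel):
    # Fill the blanks of the YAML template with the hyperparameter values.
    yaml_string = state["yaml_template"] % state["hyper_parameters"]

    # The real method would now build and run the experiment:
    #     train_obj = pylearn2.config.yaml_parse.load(yaml_string)
    #     train_obj.main_loop()
    train_obj = {"yaml": yaml_string}  # stand-in for the Train object

    # Call the results-extraction method on the trained Train object and
    # store its return value in the state. (In Jobman, extract_results is
    # a 'module.method' string that gets resolved to a callable first;
    # here we pass the callable directly to keep the sketch runnable.)
    state["results"] = state["extract_results"](train_obj)

    # The real method returns channel.COMPLETE to signal Jobman.
    return "COMPLETE"
```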
As you can see, it builds a dictionary out of `state.hyper_parameters` and uses it to do string substitution on `state.yaml_template`.
It then instantiates the `Train` object described in the YAML string and calls its `main_loop` method. Finally, when `main_loop` returns, it calls the method referenced in the `state.extract_results` string, passing it the `Train` object as argument. This method is responsible for extracting any relevant results from the `Train` object and returning them, either as-is or in a `DD` object. The return value is stored in `state.results`.
Writing the extraction method
Your extraction method should accept a `Train` object instance and return either a single value (`str`, etc.) or a `DD` object containing multiple values.
For the purpose of this tutorial, let’s write a simple method which extracts the misclassification rate and the NLL from the model’s monitor:
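A sketch of such a method is shown below, with a plain dict standing in for the `DD` return value so it can run without Jobman, and assuming the monitor exposes `y_misclass` and `y_nll` channels (channel names depend on your model and cost):

```python
def results_extractor(train_obj):
    """Pull final-epoch metrics out of the model's monitor.

    Each channel's val_record holds one value per monitoring step,
    so index -1 is the last epoch. The real extractor would wrap the
    values in a jobman.tools.DD instead of a plain dict.
    """
    channels = train_obj.model.monitor.channels
    return {
        'y_misclass': float(channels['y_misclass'].val_record[-1]),
        'y_nll': float(channels['y_nll'].val_record[-1]),
    }
```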
Here we extract the misclassification rate and NLL values at the last training epoch from their respective channels of the model’s monitor and return a `DD` object containing those values.
Building the hyperparameter dictionary
Let’s now focus on the last piece of the puzzle: the Jobman configuration file. Your configuration file should contain
- `yaml_template`: a YAML string representing your experiment
- `hyper_parameters.[name]`: the value of the `[name]` hyperparameter. You must have at least one such item, but you can have as many as you want.
- `extract_results`: a string in `module.method` form referencing the result extraction method which is to be used
Here’s how a configuration file could look for our experiment:
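A sketch, assuming the sampling and template-loading helpers live in a `utils` module you write yourself (all helper names here are illustrative; `load_yaml_template` would read the template file from disk):

```
yaml_template:=@utils.load_yaml_template()
hyper_parameters.learning_rate:=@utils.log_uniform(1.e-5, 1.e-1)
hyper_parameters.init_momentum:=@utils.log_uniform(0.5, 1.0)
extract_results = "utils.results_extractor"
```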
Notice how we’re using the `key:=@method` statement. This serves two purposes:
- We don’t have to copy the YAML file into the configuration file as a long, hard-to-edit string.
- We don’t have to hard-code hyperparameter values, which means every time Jobman is called with this configuration file, it’ll receive different hyperparameters.
For reference, here’s one way such a random sampling method could be implemented:
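Sampling in log-space is a common choice for scale-type hyperparameters such as the learning rate, since they span several orders of magnitude. A minimal sketch (the name `log_uniform` and its signature are illustrative):

```python
import math
import random

def log_uniform(low, high):
    """Sample a value uniformly in log-space between low and high.

    Useful for hyperparameters spanning orders of magnitude,
    e.g. a learning rate drawn from [1e-5, 1e-1].
    """
    return math.exp(random.uniform(math.log(low), math.log(high)))
```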
Running the whole thing
Here’s how you would train your model:
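Assuming the configuration above was saved as `mlp.conf` (the filename is illustrative), the launch would look something like this, with the experiment referenced by its Python path:

```shell
jobman cmdline pylearn2.scripts.jobman.experiment.train_experiment mlp.conf
```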
Alternatively, you can chain jobs using