This post is adapted from an IPython notebook I wrote as part of a pull request to the Pylearn2 documentation. I assume the reader is familiar with Pylearn2 (mostly its YAML file framework for describing experiments) and with Jobman, a tool to launch and manage experiments.

## The problem

Suppose you have a YAML file describing an experiment that looks like this:
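The original notebook's YAML file is not reproduced here; a minimal stand-in in the same spirit, assuming an MLP trained on MNIST with SGD and a momentum learning rule (layer sizes, dataset, and termination criterion are illustrative), might look like this:

```yaml
!obj:pylearn2.train.Train {
    dataset: !obj:pylearn2.datasets.mnist.MNIST {
        which_set: 'train',
    },
    model: !obj:pylearn2.models.mlp.MLP {
        nvis: 784,
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 500,
                irange: 0.05,
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 10,
                irange: 0.05,
            },
        ],
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 0.05,
        learning_rule: !obj:pylearn2.training_algorithms.learning_rule.Momentum {
            init_momentum: 0.5,
        },
        termination_criterion: !obj:pylearn2.termination_criteria.EpochCounter {
            max_epochs: 50,
        },
    },
}
```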

You’re not sure if the learning rate and the momentum coefficient are optimal, though, and you’d like to try different hyperparameter values to see if you can come up with something better.

One (painful) way to do this would be to create multiple copies of the YAML file and, for each copy, manually change the values of the learning rate and the momentum coefficient. You'd then call the train script on each of these copies. This solution is unsatisfactory for multiple reasons:

• It's long and tedious
• There's a lot of code duplication going on
• You’d better be sure there are no errors in the original YAML file, or else you’re in for a nice editing ride (been there)

Ideally, the solution should involve a single YAML file and some way of specifying how hyperparameters should be handled. One such solution exists, thanks to Pylearn2 and Jobman.

## Solution overview

Pylearn2 can instantiate a Train object specified by a YAML string via the pylearn2.config.yaml_parse.load method; using this method and Python’s string substitution syntax, we can “fill in the blanks” of a template YAML string based on our original YAML file and run the experiment described by the resulting string.

In order to do that, we’ll need a dictionary mapping hyperparameter names to their values. This is where Jobman proves useful: Jobman accepts configuration files describing a job’s parameters, and its syntax allows parameters to be initialized by calling an external Python method. This way, we can randomly sample hyperparameters for our experiment.

To summarize, we will:

1. Adapt the YAML file by replacing hyperparameter values with string substitution statements
2. Write a configuration file specifying how to initialize the hyperparameter dictionary
3. Read the YAML file into a string
4. Fill in hyperparameter values using string substitution with the hyperparameter dictionary
5. Instantiate a Train object with the YAML string by calling pylearn2.config.yaml_parse.load
6. Call the Train object’s main_loop method
7. Extract results from the trained model

Let’s break it down.

## Adapting the YAML file

This step is pretty straightforward. Looking back at our example, the only lines we have to replace are
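The exact lines depend on your file; assuming the experiment uses SGD with a Momentum learning rule, they might look like:

```yaml
        learning_rate: 0.05,
            init_momentum: 0.5,
```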

Using string substitution syntax, they become
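Assuming hyperparameter keys named learning_rate and init_momentum (the names are illustrative, but each key must match an entry in the hyperparameter dictionary), the substituted lines would read:

```yaml
        learning_rate: %(learning_rate)f,
            init_momentum: %(init_momentum)f,
```

The `%(name)f` placeholders are Python's printf-style substitution: `"%(learning_rate)f" % {'learning_rate': 0.05}` yields `"0.050000"`.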

## String substitution and training logic

The next step, assuming we already have a dictionary mapping hyperparameters to their values, would be to build a method which

1. takes the YAML string and the hyperparameter dictionary as inputs,
2. does string substitution on the YAML string,
3. calls the pylearn2.config.yaml_parse.load method to instantiate a Train object and calls its main_loop method and
4. extracts and returns results after the model is trained.

Luckily for us, one such method already exists: pylearn2.scripts.jobman.experiment.train_experiment.

This method integrates with Jobman: it expects state and channel arguments as input and returns channel.COMPLETE at the end of training. Here’s the method’s full implementation:
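Here is a sketch of that implementation, matching the behavior described below (it is lightly abridged, not a verbatim copy of the Pylearn2 source):

```python
from jobman.tools import expand, flatten, resolve
from pylearn2.config import yaml_parse


def train_experiment(state, channel):
    """Train the model described by state.yaml_template, then extract results."""
    # Flatten the (possibly nested) DD of hyperparameters into a plain
    # dictionary usable for string substitution.
    hyper_parameters = expand(flatten(state.hyper_parameters), dict_type=dict)

    # Fill in the blanks of the YAML template and instantiate the
    # Train object it describes.
    final_yaml_str = state.yaml_template % hyper_parameters
    train_obj = yaml_parse.load(final_yaml_str)
    train_obj.main_loop()

    # Resolve the user-provided extraction method from its dotted name
    # and store whatever it returns in state.results.
    if state.extract_results:
        state.results = resolve(state.extract_results)(train_obj)

    return channel.COMPLETE
```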

As you can see, it builds a dictionary out of state.hyper_parameters and uses it to do string substitution on state.yaml_template.

It then instantiates the Train object as described in the YAML string and calls its main_loop method.

Finally, once training completes, it calls the method referenced by the state.extract_results string, passing it the Train object as argument. That method is responsible for extracting any relevant results from the Train object and returning them, either as-is or wrapped in a DD object. The return value is stored in state.results.

## Writing the extraction method

Your extraction method should accept a Train object instance and return either a single value (float, int, str, etc.) or a DD object containing your values.

For the purpose of this tutorial, let’s write a simple method which extracts the misclassification rate and the NLL from the model’s monitor:
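A possible implementation follows. The channel names 'y_misclass' and 'y_nll' are assumptions based on a softmax output layer named 'y'; use whatever names appear in your own model's monitor:

```python
from jobman.tools import DD


def results_extractor(train_obj):
    """Extract misclassification rate and NLL from the model's monitor."""
    channels = train_obj.model.monitor.channels

    # val_record holds one value per monitoring epoch; take the last one.
    misclass = float(channels['y_misclass'].val_record[-1])
    nll = float(channels['y_nll'].val_record[-1])

    return DD(misclass=misclass, nll=nll)
```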

Here we extract misclassification rate and NLL values at the last training epoch from their respective channels of the model’s monitor and return a DD object containing those values.

## Building the hyperparameter dictionary

Let’s now focus on the last piece of the puzzle: the Jobman configuration file. Your configuration file should contain

• yaml_template: a YAML string representing your experiment
• hyper_parameters.[name]: the value of the [name] hyperparameter. You must have at least one such item, but you can have as many as you want.
• extract_results: a string written in module.method form representing the result extraction method which is to be used

Here’s how a configuration file could look for our experiment:
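A sketch of such a file follows; the utils module, its helper names, the file name mlp.yaml, and the sampling ranges are all illustrative (you would point the entries at your own module and files):

```
yaml_template:=@utils.load_yaml_template('mlp.yaml')
hyper_parameters.learning_rate:=@utils.log_uniform(1e-5, 1e-1)
hyper_parameters.init_momentum:=@utils.log_uniform(0.5, 1.0)
extract_results = "utils.results_extractor"
```

Here utils.load_yaml_template would simply read the YAML file and return its contents as a string.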

Notice how we’re using the key:=@method statement. This serves two purposes:

1. We don’t have to copy the YAML file into the configuration file as a long, hard-to-edit string.
2. We don’t have to hard-code hyperparameter values, which means every time Jobman is called with this configuration file, it’ll receive different hyperparameters.

For reference, here’s utils.log_uniform’s implementation:
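The original helper is not reproduced here; a standard-library-only version of the usual log-uniform sampling recipe looks like this:

```python
import math
import random


def log_uniform(low, high):
    """Sample a value uniformly in log-space between low and high.

    Sampling in log-space gives equal weight to every order of
    magnitude in the range, which suits hyperparameters like the
    learning rate.
    """
    return math.exp(random.uniform(math.log(low), math.log(high)))
```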

## Running the whole thing

Here’s how you would train your model:
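Assuming the configuration file is named mlp.conf (the names here are illustrative), a single run would be launched with jobman's cmdline mode:

```shell
jobman cmdline pylearn2.scripts.jobman.experiment.train_experiment mlp.conf
```

Because the `:=@` entries are re-evaluated on each invocation, every run samples a fresh set of hyperparameters.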

Alternatively, you can chain jobs using jobdispatch:
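For example, to launch several runs in a row (the options and file names are illustrative):

```shell
jobdispatch --local --repeat_jobs=10 jobman cmdline \
    pylearn2.scripts.jobman.experiment.train_experiment mlp.conf
```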