Another update on the state of the TIMIT dataset in Pylearn2

Feb 18, 2014

A (pretty much) finished business

Last week I continued working on the Pylearn2 implementation of the TIMIT dataset, so I figured now would be a good time to write a quick progress report.

More data integration

Thanks to Laurent Dinh’s invaluable help, more data is now available:

  • Phones
  • Phonemes
  • Words

Later this week I’d like to write a blog post showing how this information can be used.

Data standardization

Audio sequences are now normalized using a mean and standard deviation computed across all sequences of all sets (train, valid and test). Those two values are saved to help with generative tasks.
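To make the idea concrete, here is a minimal NumPy sketch of that kind of global standardization. It is not the actual Pylearn2 code: the function names (`global_mean_std`, `standardize`, `unstandardize`) and the toy arrays are made up for illustration. It also shows one plausible reason to keep the statistics around, namely mapping normalized (or generated) audio back to the original scale.

```python
import numpy as np

def global_mean_std(splits):
    """Mean and std over every sample of every sequence of every split."""
    all_samples = np.concatenate([seq for split in splits for seq in split])
    return all_samples.mean(), all_samples.std()

def standardize(seq, mean, std):
    """Shift and scale a sequence with the global statistics."""
    return (seq - mean) / std

def unstandardize(seq, mean, std):
    """Inverse of standardize: go back to the original scale."""
    return seq * std + mean

# Toy stand-ins for the train, valid and test sets (hypothetical data)
train = [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])]
valid = [np.array([6.0, 7.0])]
test = [np.array([8.0])]

mean, std = global_mean_std([train, valid, test])
normalized_train = [standardize(seq, mean, std) for seq in train]

# With the saved statistics, a normalized sequence can be mapped back
restored = unstandardize(normalized_train[0], mean, std)
```

Because the sequences have different lengths, the sketch concatenates all samples before computing the statistics rather than averaging per-sequence means, which would weight short sequences too heavily.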

Better memory footprint

With Jean-Philippe Raymond’s help, the number of arrays needed to store information necessary to generate batches of examples on the fly has been reduced.

The batches returned by the iterator are now written in place into a reusable buffer, which reduces the number of memory allocations during the lifetime of the dataset.
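The buffer idea can be sketched as follows. This is a simplified illustration rather than the real iterator: the class `BufferedBatchIterator`, its arguments and the toy data are all hypothetical. The point is only that the batch array is allocated once and overwritten on every iteration.

```python
import numpy as np

class BufferedBatchIterator:
    """Yields batches of fixed-length frames, reusing a single buffer."""

    def __init__(self, data, starts, frame_length, batch_size):
        self.data = data                  # one long array of audio samples
        self.starts = starts              # start offset of each example
        self.frame_length = frame_length
        self.batch_size = batch_size
        # Allocated once, then overwritten by every batch
        self.buffer = np.empty((batch_size, frame_length), dtype=data.dtype)

    def __iter__(self):
        # Only full batches are produced; leftover examples are dropped
        n_full = len(self.starts) // self.batch_size * self.batch_size
        for first in range(0, n_full, self.batch_size):
            batch_starts = self.starts[first:first + self.batch_size]
            for row, start in enumerate(batch_starts):
                # Copy into the existing buffer instead of building a new array
                self.buffer[row, :] = self.data[start:start + self.frame_length]
            yield self.buffer

iterator = BufferedBatchIterator(
    data=np.arange(10.0),
    starts=np.array([0, 1, 2, 3]),
    frame_length=3,
    batch_size=2,
)
for batch in iterator:
    print(batch.sum())
```

The trade-off is that every yielded batch is the same array object, so a caller that wants to keep a batch past the next iteration has to copy it first.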

What remains to be done

There’s still room for improvement in terms of memory usage. For instance, the array that maps example indexes to their locations in the data arrays can get quite big, especially if the length of a frame is very small.
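A toy sketch shows why the frame length matters so much here. It assumes non-overlapping frames and one map row per example, which may not match the real layout; `build_example_map` and its parameters are hypothetical names used only for this illustration.

```python
import numpy as np

def build_example_map(sequence_lengths, frame_length, frames_per_example):
    """One (sequence index, first frame index) row per example.

    Assumes each sequence is cut into non-overlapping frames and that an
    example is a run of `frames_per_example` consecutive frames.
    """
    rows = []
    for seq_idx, n_samples in enumerate(sequence_lengths):
        n_frames = n_samples // frame_length
        n_examples = max(n_frames - frames_per_example + 1, 0)
        rows.extend((seq_idx, frame_idx) for frame_idx in range(n_examples))
    return np.array(rows, dtype=np.int64).reshape(-1, 2)

lengths = [1000, 1500]  # two toy sequences, lengths in samples
coarse = build_example_map(lengths, frame_length=100, frames_per_example=2)
fine = build_example_map(lengths, frame_length=10, frames_per_example=2)

print(coarse.shape[0], coarse.nbytes)
print(fine.shape[0], fine.nbytes)
```

Under these assumptions the number of rows grows roughly as the total number of samples divided by the frame length, so dividing the frame length by ten makes the map about ten times bigger.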