Adding nnet-latgen-faster-parallel program for multi-threaded decoding with nnet3; refactoring the nnet3 decodable code.