Skip to content
Commit 7c7a1767 authored by Justin Luitjens's avatar Justin Luitjens Committed by Daniel Povey
Browse files

[src] Implemented CUDA acclerated online cmvn. (#3370)

This patch is part of a larger effort to implement the entire online feature pipeline in CUDA so that wav data is transfered to the device and never copied back to the host.
This patch includes a new binary cudafeatbin/apply-cmvn-online.cc which for the most part matches online2bin/apply-cmvn-online.
This binary is primarily for correctness testing and debugging as it makes no effort to compute multiple features in parallel on the device.
The CUDA performance is dominiated by the cost of copying the feature to and from the device. While there is a small speedup I do not expect this binary to be used in production.
Instead users will use the upcomming online-pipeline which will take features directly from the mfcc computation on the device and pass results to the next part of the pipeline.

Summary of changes:

Makefile:
   Added online2 dependencies to cudafeat, cudafeatbin, cudadecoder, and cudadecoderbin.
cudafeat\:
   Makefile:  added online2 dependency, added new .cu/.h files
   feature-online-cmvn-cuda.cu/h:  implements online-cmvn in cuda.
cudafeatbin\:
   Makefile:  added new binary, added online2 dependency
   apply-cmvn-online-cuda.cc:  binary which mimics online2bin/apply-cmvn-online

Correctness testing:

The correctness was tested by generating set of 20000 features and then running the CPU binary and GPU binary and comparing results using featbin/compare-feats.

../online2bin/apply-cmvn-online /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn.ark,cmvn.scp"
./apply-cmvn-online-cuda /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn-cuda.ark,cmvn-cuda.scp"

../featbin/compare-feats ark:cmvn-cuda.ark ark:cmvn.ark
LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:105) self-product of 1st features for each column dimension:  [ 5.52221e+09 9.1134e+09 5.92818e+09 7.42173e+09 7.48633e+09 7.21316e+09 6.9515e+09 7.03883e+09 6.40267e+09 5.83088e+09 5.01438e+09 5.1575e+09 4.28688e+09 3.529e+09 3.12182e+09 2.28721e+09 1.76343e+09 1.35117e+09 8.72517e+08 5.31836e+08 2.65112e+08 9.20308e+07 1.24084e+07 3.56008e+06 4.25283e+07 1.09786e+08 1.88937e+08 2.60207e+08 3.23115e+08 3.56371e+08 3.69035e+08 3.65216e+08 3.89125e+08 4.07064e+08 3.40407e+08 2.65444e+08 2.50244e+08 2.05726e+08 1.60606e+08 1.07217e+08 ]

LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:106) self-product of 2nd features for each column dimension:  [ 5.5223e+09 9.11355e+09 5.92812e+09 7.4218e+09 7.48666e+09 7.21338e+09 6.95174e+09 7.03895e+09 6.40254e+09 5.83113e+09 5.01411e+09 5.15774e+09 4.28692e+09 3.52918e+09 3.122e+09 2.28693e+09 1.76326e+09 1.3513e+09 8.72521e+08 5.31802e+08 2.65137e+08 9.20296e+07 1.2408e+07 3.5604e+06 4.25301e+07 1.09793e+08 1.88933e+08 2.60217e+08 3.23124e+08 3.56371e+08 3.69007e+08 3.65176e+08 3.89104e+08 4.07067e+08 3.40416e+08 2.65498e+08 2.50196e+08 2.057e+08 1.60612e+08 1.07192e+08 ]

LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:107) cross-product for each column dimension:  [ 5.52209e+09 9.11229e+09 5.92538e+09 7.41665e+09 7.47877e+09 7.20269e+09 6.93785e+09 7.02284e+09 6.38411e+09 5.81143e+09 4.99389e+09 5.13753e+09 4.26792e+09 3.51154e+09 3.10676e+09 2.27436e+09 1.75322e+09 1.34367e+09 8.67367e+08 5.28672e+08 2.63516e+08 9.14194e+07 1.23215e+07 3.53409e+06 4.21905e+07 1.08872e+08 1.87238e+08 2.57779e+08 3.19827e+08 3.5252e+08 3.64691e+08 3.60529e+08 3.84482e+08 4.02396e+08 3.36136e+08 2.61631e+08 2.46931e+08 2.03079e+08 1.5856e+08 1.05738e+08 ]

LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:111) Similarity metric for each dimension  [ 0.99997 0.999871 0.999532 0.999311 0.998968 0.998533 0.998019 0.997719 0.997111 0.996644 0.995941 0.996104 0.995572 0.995028 0.995147 0.994445 0.994258 0.994402 0.994095 0.994084 0.993934 0.993363 0.993015 0.992655 0.992037 0.991645 0.991017 0.990649 0.98981 0.989195 0.988267 0.987222 0.988093 0.98853 0.987442 0.985534 0.986858 0.987196 0.987242 0.986318 ]
 (1.0 means identical, the smaller the more different)
LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:116) Overall similarity for the two feats is:0.993119 (1.0 means identical, the smaller the more different)
LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:119) Processed 20960 feature files, 0 had errors.
LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:126) Features are considered similar since 0.993119 >= 0.99
parent 63c54e2d
Loading
Loading
Loading
Loading
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment