  1. Jun 11, 2019
    • [src] Implemented CUDA accelerated online cmvn. (#3370) · 7c7a1767
      Justin Luitjens authored
      This patch is part of a larger effort to implement the entire online feature pipeline in CUDA so that wav data is transferred to the device and never copied back to the host.
      This patch includes a new binary, cudafeatbin/apply-cmvn-online-cuda.cc, which for the most part matches online2bin/apply-cmvn-online.
      This binary is primarily for correctness testing and debugging, as it makes no effort to compute multiple features in parallel on the device.
      The CUDA performance is dominated by the cost of copying the features to and from the device. While there is a small speedup, I do not expect this binary to be used in production.
      Instead, users will use the upcoming online pipeline, which will take features directly from the MFCC computation on the device and pass results to the next part of the pipeline.
      
      Summary of changes:
      
      Makefile:
         Added online2 dependencies to cudafeat, cudafeatbin, cudadecoder, and cudadecoderbin.
      cudafeat/:
         Makefile:  added online2 dependency, added new .cu/.h files
         feature-online-cmvn-cuda.cu/h:  implements online CMVN in CUDA.
      cudafeatbin/:
         Makefile:  added new binary, added online2 dependency
         apply-cmvn-online-cuda.cc:  binary which mimics online2bin/apply-cmvn-online
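For readers unfamiliar with online CMVN, the following is a minimal CPU sketch of sliding-window cepstral mean normalization. It is not the code from this patch: the function name `online_cmn`, its parameters, and the default values are illustrative assumptions, and it covers only mean subtraction with a simple fallback to global statistics while the window is still filling.

```python
def online_cmn(frames, global_sum, global_count, window=600, global_frames=200):
    """Subtract a sliding-window mean from each frame (illustrative sketch).

    frames        list of per-frame feature lists, all of the same dimension
    global_sum    per-dimension sum of features from precomputed global stats
    global_count  number of frames the global stats were accumulated over
    window        maximum number of recent frames used for the mean (assumed)
    global_frames maximum number of "virtual" frames borrowed from the
                  global stats while the window is not yet full (assumed)
    """
    dim = len(global_sum)
    running = [0.0] * dim  # sum over the frames currently inside the window
    normalized = []
    for t, frame in enumerate(frames):
        for d in range(dim):
            running[d] += frame[d]
        if t >= window:
            # Drop the frame that just fell out of the window.
            expired = frames[t - window]
            for d in range(dim):
                running[d] -= expired[d]
        count = min(t + 1, window)
        # Early in the utterance, pad the estimate with global statistics.
        borrowed = min(global_frames, window - count) if global_count > 0 else 0
        out = []
        for d in range(dim):
            prior = borrowed * global_sum[d] / global_count if borrowed else 0.0
            mean = (running[d] + prior) / (count + borrowed)
            out.append(frame[d] - mean)
        normalized.append(out)
    return normalized


if __name__ == "__main__":
    # Tiny one-dimensional example with a window of two frames and no prior.
    print(online_cmn([[1.0], [3.0], [5.0]], [0.0], 0, window=2, global_frames=0))
```

Each output frame depends only on frames seen so far, which is what makes the normalization usable in a streaming pipeline.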
      
      Correctness testing:
      
      Correctness was tested by generating a set of 20000 features, running the CPU binary and the GPU binary on them, and comparing the results using featbin/compare-feats.
      
      ../online2bin/apply-cmvn-online /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn.ark,cmvn.scp"
      ./apply-cmvn-online-cuda /workspace/models/LibriSpeech/ivector_extractor/global_cmvn.stats "scp:mfcc.scp" "ark,scp:cmvn-cuda.ark,cmvn-cuda.scp"
      
      ../featbin/compare-feats ark:cmvn-cuda.ark ark:cmvn.ark
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:105) self-product of 1st features for each column dimension:  [ 5.52221e+09 9.1134e+09 5.92818e+09 7.42173e+09 7.48633e+09 7.21316e+09 6.9515e+09 7.03883e+09 6.40267e+09 5.83088e+09 5.01438e+09 5.1575e+09 4.28688e+09 3.529e+09 3.12182e+09 2.28721e+09 1.76343e+09 1.35117e+09 8.72517e+08 5.31836e+08 2.65112e+08 9.20308e+07 1.24084e+07 3.56008e+06 4.25283e+07 1.09786e+08 1.88937e+08 2.60207e+08 3.23115e+08 3.56371e+08 3.69035e+08 3.65216e+08 3.89125e+08 4.07064e+08 3.40407e+08 2.65444e+08 2.50244e+08 2.05726e+08 1.60606e+08 1.07217e+08 ]
      
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:106) self-product of 2nd features for each column dimension:  [ 5.5223e+09 9.11355e+09 5.92812e+09 7.4218e+09 7.48666e+09 7.21338e+09 6.95174e+09 7.03895e+09 6.40254e+09 5.83113e+09 5.01411e+09 5.15774e+09 4.28692e+09 3.52918e+09 3.122e+09 2.28693e+09 1.76326e+09 1.3513e+09 8.72521e+08 5.31802e+08 2.65137e+08 9.20296e+07 1.2408e+07 3.5604e+06 4.25301e+07 1.09793e+08 1.88933e+08 2.60217e+08 3.23124e+08 3.56371e+08 3.69007e+08 3.65176e+08 3.89104e+08 4.07067e+08 3.40416e+08 2.65498e+08 2.50196e+08 2.057e+08 1.60612e+08 1.07192e+08 ]
      
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:107) cross-product for each column dimension:  [ 5.52209e+09 9.11229e+09 5.92538e+09 7.41665e+09 7.47877e+09 7.20269e+09 6.93785e+09 7.02284e+09 6.38411e+09 5.81143e+09 4.99389e+09 5.13753e+09 4.26792e+09 3.51154e+09 3.10676e+09 2.27436e+09 1.75322e+09 1.34367e+09 8.67367e+08 5.28672e+08 2.63516e+08 9.14194e+07 1.23215e+07 3.53409e+06 4.21905e+07 1.08872e+08 1.87238e+08 2.57779e+08 3.19827e+08 3.5252e+08 3.64691e+08 3.60529e+08 3.84482e+08 4.02396e+08 3.36136e+08 2.61631e+08 2.46931e+08 2.03079e+08 1.5856e+08 1.05738e+08 ]
      
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:111) Similarity metric for each dimension  [ 0.99997 0.999871 0.999532 0.999311 0.998968 0.998533 0.998019 0.997719 0.997111 0.996644 0.995941 0.996104 0.995572 0.995028 0.995147 0.994445 0.994258 0.994402 0.994095 0.994084 0.993934 0.993363 0.993015 0.992655 0.992037 0.991645 0.991017 0.990649 0.98981 0.989195 0.988267 0.987222 0.988093 0.98853 0.987442 0.985534 0.986858 0.987196 0.987242 0.986318 ]
       (1.0 means identical, the smaller the more different)
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:116) Overall similarity for the two feats is:0.993119 (1.0 means identical, the smaller the more different)
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:119) Processed 20960 feature files, 0 had errors.
      LOG (compare-feats[5.5.1301~3-17818]:main():compare-feats.cc:126) Features are considered similar since 0.993119 >= 0.99
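The similarity values in the log above are consistent with the ratio 2 * cross-product / (self-product of 1st + self-product of 2nd), computed per column and then over the column totals. A small sketch that recomputes the first three per-dimension values from the logged products (the function name `similarity` is an illustrative assumption, not a Kaldi API):

```python
def similarity(self1, self2, cross):
    """Return (per-dimension similarities, overall similarity).

    Each argument holds per-column dot products accumulated over all frames:
    self1 and self2 are the self-products of the two feature sets, and cross
    is their cross-product.
    """
    per_dim = [2.0 * c / (a + b) for a, b, c in zip(self1, self2, cross)]
    overall = 2.0 * sum(cross) / (sum(self1) + sum(self2))
    return per_dim, overall


# First three column dimensions copied from the log output above.
self1 = [5.52221e+09, 9.1134e+09, 5.92818e+09]
self2 = [5.5223e+09, 9.11355e+09, 5.92812e+09]
cross = [5.52209e+09, 9.11229e+09, 5.92538e+09]

per_dim, overall = similarity(self1, self2, cross)

if __name__ == "__main__":
    print(per_dim)
```

Because the logged products are rounded to six significant digits, the recomputed values agree with the logged similarities only to within that rounding. The overall value here covers just these three columns, so it is not expected to equal the 0.993119 reported for all 40.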
  2. Jun 03, 2019
    • [src] Add CUDA accelerated MFCC computation. (#3348) · eedd9fa9
      Justin Luitjens authored
      * Add CUDA accelerated MFCC computation.
      
      Creates a new directory 'cudafeat' for placing CUDA feature extraction
      components as they are developed.  Adds a directory 'cudafeatbin' for
      placing CUDA-accelerated binaries that mirror binaries
      elsewhere.
      
      This commit implements:
        feature-window-cuda.h/cu which implements a feature window on the device
          by copying it from a host feature window.
        feature-mfcc-cuda.h/cu which implements the cuda mfcc feature
          extractor.
        compute-mfcc-feats-cuda.cc which mirrors compute-mfcc-feats.cc
      
        There were also minor changes to other files.
      
      * Only build cuda binaries if cuda is enabled
  3. Apr 26, 2019
  4. Mar 31, 2019
  5. Jan 14, 2019
  6. Dec 31, 2018
  7. Nov 24, 2017
  8. Aug 15, 2017
  9. May 26, 2017
  10. May 24, 2017
    • [src,scripts,egs] Merge master into kaldi_52 (#1628) · ec8dec6f
      Daniel Povey authored
      * [scripts] nnet1: minor update  i-vector and mpe scripts (#1607)
      
      - mpe: backward compatibility is provided
      - ivec: the ivectors get stored in binary format (saves space)
      
      * [src] cosmetic change to const-arpa-lm-building code; remove too-general template. (#1610)
      
      * [src,scripts,egs] Segmenting long erroneous recordings (#1167)
      
      This is a solution for creating ASR training data from long recordings with transcription but without segmentation information.
      
      * [egs] thchs30 cmd and stage bug fix (#1619)
      
      * [src] Change to GPU synchronization, for speed (disables GPU stats by default) (#1617)
      
      * [src] Fix template instantiation bug causing failure if DOUBLEPRECISION=1
      
      * [egs,scripts] Updates to BUT-specific cmd.sh settings (affects only Brno team); changes RE verbose level in nnet1 scripts.
      
      * [src] fix a small bug: logging cuda elapsed time (#1623)
      
      * [src,scripts,egs]  Add capability for multilingual training with nnet3; babel_multilang example.
      
      * [scripts] Fix some merge problems I noticed on github review.
      
      * [src] fix problem in test code.
      
      * fixed some issues to merge kaldi_52 into master.
      
      * removed add_lda parameter and its dependency.
  11. May 17, 2017
  12. Feb 09, 2017
    • Resolve merge conflicts and add "make ext" to travis build (#1407) · 0d5e4b1d
      Dogan Can authored
      
      * [build]: resolving OpenFst compilation issue with  gcc-6.x (#1392)
      
      * [egs] Add new graphemic system for Gale Arabic, with newer nnet scripts (#1298)
      
      * [build] Windows build: generate missing base/version.h; cosmetic changes (#1397)
      
      * [build]: Enable cross compilation, including to android. (#726)
      
      If a user has a number of tool chains installed and they do not want to
      use the default, they must currently edit the kaldi.mk file after
      running configure to change the CC, CXX, AR, AS, and RANLIB variables.
      This is something that should be exposed via the configure script. This
      patch exposes an option to set the host triple for the desired tool
      chain in the configure script.
      
      Building Kaldi on my Raspberry Pi boards is not particularly fast.  I
      have been using the following patch to build kaldi executables for use
      on the Pi boards for the better part of a year.  A typical invocation
      for me is something like:
      
      $ ./configure --static --atlas-root=/opt/cross/armv8hf \
      --fst-root=/opt/cross/armv8hf --host=armv8-rpi3-linux-gnueabihf \
      --fst-version=1.4.1
      
      This way I can build on my much faster x86 desktop, but still run
      experiments on ARM.
      
      I have included support for cross compiling for ppc64le and it works for
      me (at least it produces binaries for ppc64le; I don't have a ppc64
      machine to test them on).
      
      Signed-off-by: Eric B Munson <eric@cobaltspeech.com>
      
      * Add mk file and configure options for building for Android
      
      Building for Android requires a toolchain that can be built using the
      Android NDK.  It works similarly to the Linux build except that it only
      uses clang, only supports the openBLAS math library, and requires an
      additional include directory for the system C++ headers.
      
      A typical configure invocation looks like:
      
      ./configure --static --openblas-root=/opt/cross/arm-linux-androideabi \
      --fst-root=/opt/cross/arm-linux-androideabi \
      --host=arm-linux-androideabi --fst-version=1.4.1 \
      --android-includes=/opt/cross/arm-linux-androideabi/sysroot/usr/include
      
      Signed-off-by: Eric B Munson <eric@cobaltspeech.com>
      
      * Make pthread cancel symbols noops for Android
      
      The Android C library does not support cancelling pthreads so the
      symbols PTHREAD_CANCEL_STATE and pthread_setcancelstate are undefined.
      Because a pthread cannot be cancelled in Android, it is reasonable to
      make the pthread_setcancelstate() call a noop.
      
      Signed-off-by: Eric B Munson <eric@cobaltspeech.com>
      
      * [build] fixing issue introduced in the previous win commit (#1399)
      
      * [egs] Fix to HKUST nnet2/3 scripts. (#1401)
      
      When training the UBM, we should use only the 40-dimensional MFCCs,
      so change the training directory to avoid a dimension mismatch.
      With this change the script no longer fails when run after the nnet2 scripts.
      
      * [egs,scripts,src] Add BABEL s5d recipe; various associated fixes (#1356)
      
      * Creating a new recipe directory
      
      * adding lists
      
      * Improvements in the pipeline, fixes, syllab search
      
      * Transplanting the diff to s5d
      
      * added TDNN, LSTM and BLSTM scripts.
      added Telugu conf files.
      
      * added blstm script and top level commands
      
      * improved keyword search, new lang  configs
      
      * removing not needed scripts
      
      * added blstm results
      
      * some keyword-search optimization binaries
      
      * removing some extra files + kwsearch pipeline improvement
      
      * adding configs for the OP3 langs
      
      * configs for the rest of the OP3 langs
      
      * Added updated configs for IndusDB.20151208.Babel.tar.bz2
      
      * fixes of the pipeline, added langp (re)estimation
      
      * adding the kaldi-native search pipeline and a bunch of changes related to this
      
      * removing extra files
      
      * A couple of fixes
      
      * KWS improvements and fixes
      
      * Fixes of a couple of issues reported by Fred Richardson <frichard@ll.mit.edu>
      
      * A separate script for lexicon expansion
      
      * A couple of fixes and tweaks. Added checks for tools, especially sox.
      
      * adding a couple of changes -- new style options and results for BP langs
      
      * adding new results(still will need to be updated)
      
      * added langp and some details tweaked
      
      * updated STT results, new KWS results and a couple of small fixes all around
      
      * adding file lists for dev languages
      
      * miniature fixes and cleanups
      
      * one more batch of small fixes -- mostly whitespace cleanup
      
      * small fixes -- location of files and removal of trailing slash in the pathname
      
      * enabling stage-2 KWS pipeline
      
      * adding some directories to .gitignore
      
      * some quick fixes
      
      * latest fixes
      
      * making the script split_compound_set to conform to the naming
      
      * some last minute fixes for the combination scoring
      
      * do not attempt to score when the scoring data is not available
      
      * bug fixes and --ntrue-from option
      
      * another batch of fixes
      
      * adding +x permission to split_compound_set.sh
      
      * fixing whitespaces
      
      * fixing whitespaces
      
      * a couple of fixes
      
      * adding the cleanup script and chain models training
      
      * adding the graphemic/unicode lexicon feature
      
      * adding the graphemic/unicode lexicon feature
      
      * fixing the cc files headers, adding c info
      
      * use the user-provided kwset id, not the filename
      
      * use _cleaned affix
      
      * fixes w.r.t. getting chain models independent on other systems
      
      * small fixes as reported by Fred Richardson and Yenda
      
      * another issue reported by Fred Richardson
      
      * fixing KWS for the chain systems
      
      * fixes in the KWS hitlist combination
      
      * adding 40hrs pashto config and fixes for the unicode system
      
      * fixing some bugs as reported by Ni Chongjia (I2R)
      
      * fixing some bugs as reported by Fred Richardson
      
      * adding 40hrs Pashto OP3 setup
      
      * addressing Dan's comments, some further cleanup
      
      * improving the make_index script
      
      * remove  fsts-scale
      
      * adding 'see also' to some of the fst tools
      
      * adding back accidentally removed svn check
      
      * [egs] removing empty files in BABEL recipe (#1406)
      
      These caused a problem on MacOS, as reported by @dogancan.
      
      * Add online extension to travis build.
      
      * Fix parallel online extension build. Randomly choose between single and double precision BaseFloats in travis build.
      
      * Remove parentheses that were unintentionally added to the travis script in the previous commit.
    • Jan "yenda" Trmal's avatar
      42114e64
  13. Jan 18, 2017
    • Further changes to configure. · d8fd0d9c
      Dogan Can authored
    • Update src/Makefile to enforce OpenFst >= 1.5.3. · b23f7205
      Dogan Can authored
      OpenFst-1.5.3 adds support for minimization of non-deterministic FSTs
      over idempotent semirings which is a feature used throughout Kaldi.
      Along with the requirement for a C++ compiler with C++11 support, we
      are also removing support for older OpenFst releases so that we can
      build against an un-patched OpenFst installation.
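The check itself lives in src/Makefile and is not reproduced in this log. As an illustration only, a minimum-version test of this kind compares version components numerically, not as strings. The helper names below are hypothetical:

```python
def parse_version(text):
    """Turn a dotted version string such as '1.5.3' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def openfst_is_supported(version, minimum="1.5.3"):
    """Return True if `version` is at least `minimum`.

    Tuples compare element by element, so '1.10.0' correctly ranks above
    '1.5.3', which a plain string comparison would get wrong.
    """
    return parse_version(version) >= parse_version(minimum)


if __name__ == "__main__":
    for candidate in ("1.4.1", "1.5.3", "1.10.0"):
        print(candidate, openfst_is_supported(candidate))
```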
  14. Jan 11, 2017
  15. Nov 01, 2016
  16. Sep 20, 2016
  17. Jul 27, 2016
  18. Jul 16, 2016
  19. Jun 13, 2016
  20. Jun 04, 2016
  21. May 30, 2016
  22. May 20, 2016
  23. Apr 28, 2016
  24. Mar 16, 2016
  25. Feb 28, 2016
  26. Jan 08, 2016
  27. Dec 10, 2015
  28. Dec 09, 2015
  29. Nov 19, 2015
  30. Oct 25, 2015
  31. Oct 09, 2015
  32. Oct 06, 2015
  33. Sep 28, 2015
  34. Sep 24, 2015
  35. Sep 22, 2015
  36. Sep 20, 2015
  37. Aug 05, 2015