  1. Jul 25, 2016
  2. Jul 24, 2016
  3. Jul 16, 2016
    • Daniel Povey's avatar
      Fix to the reorder_addlibs.sh script (was not handling library names with... · c3074101
      Daniel Povey authored
      Fix to the reorder_addlibs.sh script (was not handling library names with numbers in them correctly) and the corresponding fix to Makefiles.
      c3074101
    • Shiyin Kang's avatar
      speed test and unit test for diff group pnorm · 1fc49b12
      Shiyin Kang authored
bench result (columns: dim, old time, new time, speedup):
      CuMatrix::DiffGroupPnorm<float>,    16   0.019   0.009  2.11x
      CuMatrix::DiffGroupPnorm<float>,    32   0.074   0.036  2.06x
      CuMatrix::DiffGroupPnorm<float>,    64   0.297   0.142  2.10x
      CuMatrix::DiffGroupPnorm<float>,   128   1.142   0.520  2.20x
      CuMatrix::DiffGroupPnorm<float>,   256   3.442   1.553  2.22x
      CuMatrix::DiffGroupPnorm<float>,   512   6.856   2.943  2.33x
      CuMatrix::DiffGroupPnorm<float>,  1024  11.653   3.915  2.98x
      CuMatrix::DiffGroupPnorm<float>,  2048  13.812   4.263  3.24x
      CuMatrix::DiffGroupPnorm<float>,  4096  14.431   4.381  3.29x
      CuMatrix::DiffGroupPnorm<double>,    16   0.019   0.009  2.17x
      CuMatrix::DiffGroupPnorm<double>,    32   0.073   0.033  2.20x
      CuMatrix::DiffGroupPnorm<double>,    64   0.296   0.133  2.22x
      CuMatrix::DiffGroupPnorm<double>,   128   1.068   0.457  2.34x
      CuMatrix::DiffGroupPnorm<double>,   256   2.999   1.159  2.59x
      CuMatrix::DiffGroupPnorm<double>,   512   4.921   1.705  2.89x
      CuMatrix::DiffGroupPnorm<double>,  1024   6.932   1.993  3.48x
      CuMatrix::DiffGroupPnorm<double>,  2048   7.499   2.087  3.59x
      CuMatrix::DiffGroupPnorm<double>,  4096   7.684   2.104  3.65x
      
      fix bug
      
      unit test for diff group pnorm
      
      easy test for now
      
      back to full test
      
fix p=inf for MatrixBase::GroupPnormDeriv
      1fc49b12
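For reference, the quantity these DiffGroupPnorm commits differentiate has a simple closed form: for y = (Σ_i |x_i|^p)^(1/p), dy/dx_i = sign(x_i)·|x_i|^(p−1)/y^(p−1), and at p = inf the derivative is sign(x_i) wherever |x_i| attains the group max. The sketch below is an illustrative one-group CPU reference, not Kaldi's actual MatrixBase/CuMatrix code; the name `group_pnorm_deriv` is made up here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative CPU sketch of the group p-norm derivative for a single group.
// For y = (sum_i |x_i|^p)^(1/p):  dy/dx_i = sign(x_i) * |x_i|^(p-1) / y^(p-1).
// For p = inf, y = max_i |x_i| and dy/dx_i = sign(x_i) where |x_i| == y, else 0
// (this is the p=inf case the commit message above fixes).
std::vector<double> group_pnorm_deriv(const std::vector<double>& x, double p) {
  double y;
  if (std::isinf(p)) {
    y = 0.0;
    for (double v : x) y = std::max(y, std::fabs(v));
  } else {
    double s = 0.0;
    for (double v : x) s += std::pow(std::fabs(v), p);
    y = std::pow(s, 1.0 / p);
  }
  std::vector<double> deriv(x.size(), 0.0);
  if (y == 0.0) return deriv;  // subgradient convention: zero at the origin
  for (std::size_t i = 0; i < x.size(); ++i) {
    double sign = (x[i] > 0.0) - (x[i] < 0.0);
    if (std::isinf(p)) {
      if (std::fabs(x[i]) == y) deriv[i] = sign;  // every tied max gets credit
    } else {
      deriv[i] = sign * std::pow(std::fabs(x[i]), p - 1.0) / std::pow(y, p - 1.0);
    }
  }
  return deriv;
}
```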
    • Shiyin Kang's avatar
      new kernel: _diff_group_pnorm · 58c8f0f4
      Shiyin Kang authored
      standard inf
      
      del TODO
      58c8f0f4
    • Shiyin Kang's avatar
      move pnorm back prop to cumatrix · a338e533
      Shiyin Kang authored
      fix bug
      a338e533
    • Daniel Povey's avatar
      Add script to automatically put the Kaldi libraries we link with in the right... · dbae7fa1
      Daniel Povey authored
      Add script to automatically put the Kaldi libraries we link with in the right order; use it to modify the Makefiles.  Minor top-level Makefile fix.
      dbae7fa1
    • Daniel Povey's avatar
      Some minor changes: replace '*1.0e-03' with '/1000.0f' for more consistent... · da465d84
      Daniel Povey authored
      Some minor changes: replace '*1.0e-03' with '/1000.0f' for more consistent rounding; minor documentation and cosmetic changes
      da465d84
  4. Jul 15, 2016
  5. Jul 08, 2016
    • Shiyin Kang's avatar
      re-impl softmax: less __syncthreads() / arithmetic op / global mem access · 42352b63
      Shiyin Kang authored
      New: For CuMatrix::Softmax<float>, for dim = 16, speed was 0.0153621 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 16, speed was 0.0138999 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 32, speed was 0.0614275 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 32, speed was 0.0507328 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 64, speed was 0.235765 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 64, speed was 0.203548 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 128, speed was 0.729239 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 128, speed was 0.725481 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 256, speed was 2.30126 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 256, speed was 1.71863 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 512, speed was 5.0565 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 512, speed was 3.69659 gigaflops.
      New: For CuMatrix::Softmax<float>, for dim = 1024, speed was 10.2482 gigaflops.
      Old: For CuMatrix::Softmax<float>, for dim = 1024, speed was 6.38335 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 16, speed was 0.0143354 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 16, speed was 0.013143 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 32, speed was 0.0590478 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 32, speed was 0.0495458 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 64, speed was 0.228611 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 64, speed was 0.193465 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 128, speed was 0.668961 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 128, speed was 0.676449 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 256, speed was 2.1013 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 256, speed was 1.51862 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 512, speed was 4.13055 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 512, speed was 3.1547 gigaflops.
      New: For CuMatrix::Softmax<double>, for dim = 1024, speed was 6.43429 gigaflops.
      Old: For CuMatrix::Softmax<double>, for dim = 1024, speed was 5.02974 gigaflops.
      
      minor changes
      42352b63
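The operation being tuned above is, per row, the numerically stable softmax. A plain CPU reference of the math (illustrative only; the commit's CUDA kernel fuses the max, sum, and scale passes with fewer `__syncthreads()` calls, which this scalar loop does not show):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative CPU reference for one row of a softmax: subtract the row max
// before exponentiating so exp() cannot overflow, then normalize by the sum.
std::vector<double> softmax_row(const std::vector<double>& x) {
  double m = *std::max_element(x.begin(), x.end());
  std::vector<double> y(x.size());
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = std::exp(x[i] - m);
    sum += y[i];
  }
  for (double& v : y) v /= sum;
  return y;
}
```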
    • scinart's avatar
      60757995
  6. Jul 07, 2016
  7. Jul 06, 2016
  8. Jul 04, 2016
  9. Jun 29, 2016
  10. Jun 28, 2016
  11. Jun 26, 2016
  12. Jun 25, 2016
    • Shiyin Kang's avatar
      _diff_softmax kernel: 4 reads and 1 write. · 6b8eefbb
      Shiyin Kang authored
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 16, speed was 0.0165568 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 16, speed was 0.00355242 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 32, speed was 0.0678791 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 32, speed was 0.0145515 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 64, speed was 0.24739 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 64, speed was 0.0583246 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 128, speed was 0.898427 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 128, speed was 0.225076 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 256, speed was 2.89009 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 256, speed was 0.834096 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 512, speed was 6.72164 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 512, speed was 1.92722 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 1024, speed was 10.4916 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<float>, for dim = 1024, speed was 2.78281 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 16, speed was 0.0148584 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 16, speed was 0.00260567 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 32, speed was 0.0586865 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 32, speed was 0.0121077 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 64, speed was 0.22893 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 64, speed was 0.0527767 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 128, speed was 0.763462 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 128, speed was 0.175736 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 256, speed was 2.40457 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 256, speed was 0.58351 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 512, speed was 4.55165 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 512, speed was 1.42464 gigaflops.
      New: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 1024, speed was 4.36421 gigaflops.
      Old: For CuMatrix::DiffSoftmaxPerRow<double>, for dim = 1024, speed was 1.94971 gigaflops.
      6b8eefbb
    • Shiyin Kang's avatar
      add speed test and unit test · 619889a1
      Shiyin Kang authored
      619889a1
    • Shiyin Kang's avatar
      mv diffsoftmax to cumatrix · 69ccd5ce
      Shiyin Kang authored
      69ccd5ce
  13. Jun 23, 2016
  14. Jun 22, 2016
    • Blaise Potard's avatar
pitch_functions: Corrected the behaviour of snip_edges=false · 4a2647bc
      Blaise Potard authored
 * Fixed snip_edges=false behaviour to have an expected feature delay of frame_shift/2 instead of being identical to the delay of snip_edges=true. This makes the behaviour of snip_edges=false consistent with other feature extraction tools.
 * Added a unit test that ensures snip_edges=false behaves as expected, by using cross-correlation to measure the actual delay between outputs that differ only in snip_edges.
 * Refactored ExtractFrame to treat both edge cases symmetrically.
 * Calculate window size / shift by dividing by 1000 instead of multiplying by 0.001, to fix unexpected rounding with non-integer window sizes / shifts.
 * Made sure all changes pass Google code validation.
 * Changed all feature window functions to use a precalculated factor: in my opinion it makes the functions easier to compare, and it may be a bit faster.
      
Note that the feature delay of snip_edges=true (the default) is rather difficult to predict, and almost certainly not what people assume, which makes the use of pitch features problematic at the moment.
I would suggest switching to snip_edges=false for new recipes.
      4a2647bc
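On the divide-by-1000 point above: the float constant 1.0e-03f is not exactly representable, so multiplying by it can land just below an integer sample count and truncate wrong, whereas dividing by 1000.0f is a single correctly-rounded operation that stays exact whenever the true quotient is representable. A minimal sketch of the safer formulation (hypothetical helper name, not the pitch code's actual API):

```cpp
#include <cassert>

// Hypothetical helper illustrating the fix: durations arrive in milliseconds,
// so compute sample counts by dividing by 1000.0f rather than multiplying by
// the inexact float constant 1.0e-03f.
int window_size_in_samples(float samp_freq_hz, float window_ms) {
  // samp_freq_hz * window_ms is exact for typical configs (product < 2^24),
  // and IEEE float division by 1000.0f is correctly rounded, so the
  // truncation below hits the intended integer.
  return static_cast<int>(samp_freq_hz * window_ms / 1000.0f);
}
```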
  15. Jun 20, 2016
  16. Jun 17, 2016
  17. Jun 14, 2016
  18. Jun 13, 2016
  19. Jun 12, 2016
  20. Jun 11, 2016
  21. Jun 10, 2016
  22. Jun 08, 2016
    • Shiyin Kang's avatar
      full unit test for group_spec_norm with special p · 1bad1143
      Shiyin Kang authored
          stronger unit test
      1bad1143
    • Shiyin Kang's avatar
      fast GroupPnorm for p=0,1,2,inf with group transform reduce kernel template · af843b6e
      Shiyin Kang authored
      New: For CuMatrix::GroupPnorm<float>, for dim = 16, speed was 0.014416 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 16, speed was 0.0138561 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 32, speed was 0.0616648 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 32, speed was 0.0542906 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 64, speed was 0.241291 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 64, speed was 0.213442 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 128, speed was 0.869675 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 128, speed was 0.821949 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 256, speed was 3.07193 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 256, speed was 2.90466 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 512, speed was 8.8404 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 512, speed was 6.48644 gigaflops.
      New: For CuMatrix::GroupPnorm<float>, for dim = 1024, speed was 16.7489 gigaflops.
      Old: For CuMatrix::GroupPnorm<float>, for dim = 1024, speed was 9.3791 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 16, speed was 0.0159731 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 16, speed was 0.0101083 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 32, speed was 0.0605624 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 32, speed was 0.0393037 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 64, speed was 0.249944 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 64, speed was 0.153672 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 128, speed was 0.840825 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 128, speed was 0.598191 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 256, speed was 3.13722 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 256, speed was 1.78274 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 512, speed was 6.86864 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 512, speed was 2.96384 gigaflops.
      New: For CuMatrix::GroupPnorm<double>, for dim = 1024, speed was 12.5614 gigaflops.
      Old: For CuMatrix::GroupPnorm<double>, for dim = 1024, speed was 3.79237 gigaflops.
      af843b6e
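The special cases in this commit have cheap closed forms: p=1 is a sum of absolute values, p=2 a square root of a sum of squares, and p=inf a max, so only general p needs pow(). A scalar sketch for one group (illustrative only; it omits the p=0 case the commit title also mentions, and none of the kernel-template machinery):

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative one-group CPU reference for a group p-norm with fast special
// cases; the general-p branch is the only one that needs pow().
double group_pnorm(const std::vector<double>& x, double p) {
  if (std::isinf(p)) {                       // p = inf: max of |x_i|
    double m = 0.0;
    for (double v : x) m = std::max(m, std::fabs(v));
    return m;
  }
  if (p == 1.0) {                            // p = 1: sum of |x_i|
    double s = 0.0;
    for (double v : x) s += std::fabs(v);
    return s;
  }
  if (p == 2.0) {                            // p = 2: Euclidean norm
    double s = 0.0;
    for (double v : x) s += v * v;
    return std::sqrt(s);
  }
  double s = 0.0;                            // general p: pow() per element
  for (double v : x) s += std::pow(std::fabs(v), p);
  return std::pow(s, 1.0 / p);
}
```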
    • Shiyin Kang's avatar
      generalize group_max vec_reduce mat_col_reduce to *_transform_reduce · 0f9625f2
      Shiyin Kang authored
          loop unroll by template
      
          generalize to group transform reduce
      
          _transform_reduce for vec, mat-col and group
      
          fix min bug
      
          fix bug
      
          fix template param bug
      0f9625f2
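The generalization in this commit is the classic transform-reduce pattern: max, sum, sum-of-squares and so on differ only in the per-element transform and the binary reduction operator. A CPU sketch of the idea (illustrative; the real code is a CUDA kernel template with template-driven loop unrolling, which this does not show):

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Illustrative CPU analogue of a *_transform_reduce kernel template:
// one generic loop, specialized by a per-element transform t and a
// binary reduction operator r.
template <typename T, typename Transform, typename Reduce>
T transform_reduce_vec(const std::vector<T>& v, T init, Transform t, Reduce r) {
  T acc = init;
  for (const T& x : v) acc = r(acc, t(x));
  return acc;
}
```

Instantiated with identity/max it is a max-reduction, with square/plus a sum-of-squares, and so on, mirroring how one template can subsume separate vec_reduce, mat_col_reduce, and group_max kernels.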
  23. Jun 04, 2016
    • Daniel Povey's avatar
      various small unrelated fixes · 61156c8c
      Daniel Povey authored
      61156c8c
    • Shiyin Kang's avatar
      Parallel group max using multiple threads per group. · a1b2f2bd
      Shiyin Kang authored
      Good performance on large group sizes (>10).
      New: For CuMatrix::GroupMax<float>, for dim = 16, speed was 0.0190836 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 16, speed was 0.0193129 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 32, speed was 0.0791846 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 32, speed was 0.0768508 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 64, speed was 0.311131 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 64, speed was 0.299519 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 128, speed was 1.13589 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 128, speed was 1.14847 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 256, speed was 4.22264 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 256, speed was 3.92072 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 512, speed was 12.2629 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 512, speed was 10.0812 gigaflops.
      New: For CuMatrix::GroupMax<float>, for dim = 1024, speed was 21.6979 gigaflops.
      Old: For CuMatrix::GroupMax<float>, for dim = 1024, speed was 16.5123 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 16, speed was 0.0188551 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 16, speed was 0.0163827 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 32, speed was 0.0701613 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 32, speed was 0.0620238 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 64, speed was 0.271106 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 64, speed was 0.215268 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 128, speed was 0.931745 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 128, speed was 0.723582 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 256, speed was 3.53189 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 256, speed was 1.9751 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 512, speed was 9.95109 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 512, speed was 3.91183 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 1024, speed was 17.2099 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<float>, for dim = 1024, speed was 4.92671 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 16, speed was 0.0199497 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 16, speed was 0.0148693 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 32, speed was 0.079538 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 32, speed was 0.0718237 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 64, speed was 0.314509 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 64, speed was 0.237838 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 128, speed was 1.08104 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 128, speed was 0.788395 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 256, speed was 3.7741 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 256, speed was 2.87856 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 512, speed was 8.65988 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 512, speed was 5.87111 gigaflops.
      New: For CuMatrix::GroupMax<double>, for dim = 1024, speed was 14.0373 gigaflops.
      Old: For CuMatrix::GroupMax<double>, for dim = 1024, speed was 8.88655 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 16, speed was 0.0174585 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 16, speed was 0.0136057 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 32, speed was 0.0694617 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 32, speed was 0.0500527 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 64, speed was 0.265809 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 64, speed was 0.177945 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 128, speed was 0.973417 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 128, speed was 0.588654 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 256, speed was 3.43166 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 256, speed was 1.57864 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 512, speed was 8.26032 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 512, speed was 3.14173 gigaflops.
      New: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 1024, speed was 12.1338 gigaflops.
      Old: For CuMatrix::GroupMax (all group sizes)<double>, for dim = 1024, speed was 3.05406 gigaflops.
      
      fix typo; rename and comment
      a1b2f2bd
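For reference, the operation being parallelized above is simple: the row is split into consecutive groups, and each output entry is the max over one group. A CPU sketch (illustrative; the commit's actual contribution is mapping multiple CUDA threads to each group, which a scalar loop cannot show):

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Illustrative CPU reference for GroupMax on one row: consecutive groups of
// group_size elements, each reduced to its maximum.
std::vector<double> group_max(const std::vector<double>& x,
                              std::size_t group_size) {
  assert(group_size > 0 && x.size() % group_size == 0);
  std::vector<double> y(x.size() / group_size,
                        -std::numeric_limits<double>::infinity());
  for (std::size_t i = 0; i < x.size(); ++i)
    y[i / group_size] = std::max(y[i / group_size], x[i]);
  return y;
}
```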
    • Shiyin Kang's avatar
Speed test on all possible group sizes given the dim. · 763b27a5
      Shiyin Kang authored
      fix space
      763b27a5
  24. Jun 03, 2016
  25. Jun 01, 2016