[src] Add interfaces to nnet-batch-compute that expects device input. (#3311) (0e5e07b2) · Commits · Simon Will / kaldi-commonvoice

Commit 0e5e07b2 authored May 23, 2019 by Justin Luitjens Committed by Daniel Povey May 23, 2019

[src] Add interfaces to nnet-batch-compute that expects device input. (#3311)

This avoids ping-ponging memory between device and host.

The implementation now assumes device memory. The interfaces that take
host input allocate device memory and copy the data to it first.

Add a CUDA matrix copy function that clamps rows. This is much
faster than copying one row at a time, and the kernel can handle the
clamping for free.
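
As a rough illustration of what a row-clamping copy means, here is a minimal
host-side C++ sketch. It is not the kernel added by this commit: the name
CopyRowsClamped, its signature, and the row-major layout are assumptions made
for the example. Requested row indices that fall outside the source matrix are
clamped to the first or last valid row, so the edge rows get repeated.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper (not Kaldi's API): copies rows [row_begin, row_end) of a
// row-major matrix `src` with num_rows x num_cols elements into a new buffer.
// Row indices outside [0, num_rows - 1] are clamped, so out-of-range requests
// repeat the first or the last row of `src`.
std::vector<float> CopyRowsClamped(const std::vector<float>& src,
                                   int num_rows, int num_cols,
                                   int row_begin, int row_end) {
  std::vector<float> dst;
  if (num_rows <= 0 || num_cols <= 0 || row_end <= row_begin) return dst;

  const std::size_t cols = static_cast<std::size_t>(num_cols);
  dst.resize(static_cast<std::size_t>(row_end - row_begin) * cols);

  for (int r = row_begin; r < row_end; ++r) {
    // Clamp the requested row into the valid range of the source matrix.
    const int src_row = std::min(std::max(r, 0), num_rows - 1);
    // A GPU kernel would compute the same clamped index per thread, which is
    // why the clamping adds essentially no cost there.
    std::copy_n(src.begin() + static_cast<std::size_t>(src_row) * cols, cols,
                dst.begin() + static_cast<std::size_t>(r - row_begin) * cols);
  }
  return dst;
}
```

For a 3 x 2 source, requesting rows -1 through 3 yields five rows: the first
row twice, the middle row once, and the last row twice. On the device the whole
range would be handled by a single kernel launch instead of one copy per row.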

parent 9e0a7f60
