-
【12】Deep speech 2 End-to-end speech recognition in english and mandarin.pdf下载
资源介绍
We show that an end-to-end deep learning approach can be used to recognize
either English or Mandarin Chinese speech—two vastly different languages. Because
it replaces entire pipelines of hand-engineered components with neural networks,
end-to-end learning allows us to handle a diverse variety of speech including
noisy environments, accents and different languages. Key to our approach is
our application of HPC techniques, resulting in a 7x speedup over our previous
system [26]. Because of this efficiency, experiments that previously took weeks
now run in days. This enables us to iterate more quickly to identify superior architectures
and algorithms. As a result, in several cases, our system is competitive
with the transcription of human workers when benchmarked on standard datasets.
Finally, using a technique called Batch Dispatch with GPUs in the data center, we
show that our system can be inexpensively deployed in an online setting, delivering
low latency when serving users at scale.