mobvoi / wenet
- Saturday, February 6, 2021, 00:25:45
Python
Production First and Production Ready End-to-End Speech Recognition Toolkit
We share neural Net together.
The main motivation of WeNet is to close the gap between research and production end-to-end (E2E) speech recognition models, to reduce the effort of productionizing E2E models, and to explore better E2E models for production.
Please see examples/$dataset/s0/README.md for WeNet benchmarks on different speech datasets.
You can visit Docs for the WeNet Sphinx documentation, or follow the installation steps below:
git clone https://github.com/mobvoi/wenet.git
conda create -n wenet python=3.8
conda activate wenet
pip install -r requirements.txt
conda install pytorch==1.6.0 cudatoolkit=10.1 torchaudio -c pytorch
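Once the environment is set up, a quick check like the following (a minimal sketch of my own, not part of the official WeNet instructions) confirms that PyTorch and torchaudio import correctly and that CUDA is visible to the installed build:

```python
# Sanity check for the freshly created "wenet" conda environment.
# Illustrative only; this snippet is not part of the WeNet recipes.
import torch
import torchaudio

print("torch:", torch.__version__)            # expected: 1.6.0
print("torchaudio:", torchaudio.__version__)
print("CUDA available:", torch.cuda.is_available())
```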
In addition to discussion in GitHub Issues, we have created a WeChat group for better discussion and quicker response. Please scan the following QR code in WeChat to join the chat group. If that fails, please scan the personal QR code on the right and add the contact with a note such as "wenet", and we will invite you to the chat group.
[QR code of the WeChat group] | [Personal QR code]
We borrowed a lot of code from ESPnet, and we referred to OpenTransformer for batch inference.
@article{zhang2020unified,
title={Unified Streaming and Non-streaming Two-pass End-to-end Model for Speech Recognition},
author={Zhang, Binbin and Wu, Di and Yao, Zhuoyuan and Wang, Xiong and Yu, Fan and Yang, Chao and Guo, Liyong and Hu, Yaguang and Xie, Lei and Lei, Xin},
journal={arXiv preprint arXiv:2012.05481},
year={2020}
}