k2-fsa / sherpa-onnx
- Sunday, June 9, 2024 at 00:00:02
Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift
This repository supports running these functions locally, on the following platforms and operating systems:

x86_64, 32-bit ARM, 64-bit ARM (arm64, aarch64), RISC-V (riscv64)

with the following APIs:

C#

| Description | URL | Users in China |
|---|---|---|
| Streaming speech recognition | Address | Click here |
| Text-to-speech | Address | Click here |
| Voice activity detection (VAD) | Address | Click here |
| VAD + non-streaming speech recognition | Address | Click here |
| Two-pass speech recognition | Address | Click here |
| Audio tagging | Address | Click here |
| Audio tagging (WearOS) | Address | Click here |
| Speaker identification | Address | Click here |
| Spoken language identification | Address | Click here |
| Keyword spotting | Address | Click here |

| Description | URL |
|---|---|
| Speech recognition (speech to text, ASR) | Address |
| Text-to-speech (TTS) | Address |
| VAD | Address |
| Keyword spotting | Address |
| Audio tagging | Address |
| Speaker identification (Speaker ID) | Address |
| Spoken language identification (Language ID) | See multi-lingual Whisper ASR models from Speech recognition |
| Punctuation | Address |
Please see https://k2-fsa.github.io/sherpa/social-groups.html for the next-gen Kaldi WeChat and QQ discussion groups.