Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime, without an Internet connection. Supports embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift. Introduction: This repository supports running the following functions locally: speech-to-text (i.e., ASR), both streaming and non-streaming; text-to-speech (i.e., TTS); speaker identification; speaker verificatio…
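🏠 Connect your Xiao Ai speaker (小爱音箱) to ChatGPT and Doubao (豆包) and turn it into your own personal voice assistant. MiGPT: a smart home that has never been this thoughtful ❤️ In this digital world, home is no longer just a place to live; it is an extension of our digital lives. MiGPT combines the Xiao Ai speaker and Mijia (米家) smart devices with ChatGPT's ability to understand, so that your smart home understands you better. MiGPT is not only about device automation; it is about building a home that understands you, has warmth, and evolves together with you. In the future, each of your smart home devices, from light bulbs and sockets to robot vacuums and TVs, can act as an independent agent that responds to your commands more intelligently and more thoughtfully. These independent agents can also sense and cooperate with one another, forming a more powerful collaborative network. The Xiao Ai speaker acts as the dedicated butler of your smart home, serving you wholeheartedly and unlocking the smart home's real potential. ⚡️ Project preview 👉 Watch the full demo video: "Just for fun! Connecting the Xiao Ai speaker to ChatGPT and Doubao to turn it into your own personal voice assistant" mi-gpt-demo1.mp4 ✨ Project highlights 🎓 LLM answers. Imagine your Xiao Ai speaker turning into a skilled conversationalist that can use ChatGP…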
Mesop: Build delightful web apps quickly in Python 🚀 Used at Google for rapid internal app development. Mesop is a Python-based UI framework that allows you to rapidly build web apps like demos and internal apps. Intuitive for UI novices ✨: write UI in idiomatic Python code; easy-to-understand reactive UI paradigm; ready-to-use components. Frictionless developer workflows 🏎️: hot reload so the browser automatically reloads and preserves state; rich IDE support with strong type safety. Flexible for …
End-to-end stack for WebRTC. SFU media server and SDKs. LiveKit: Real-time video, audio and data for developers. LiveKit is an open source project that provides scalable, multi-user conferencing based on WebRTC. It is designed to provide everything you need to build real-time video, audio, and data capabilities into your applications. LiveKit's server is written in Go, using the awesome Pion WebRTC implementation. Features: Scalable, distributed WebRTC SFU (Selective Forwarding …
Generative models for conditional audio generation. stable-audio-tools: Training and inference code for audio generation models. Install: The library can be installed from PyPI with: $ pip install stable-audio-tools To run the training scripts or inference code, you'll want to clone this repository, navigate to the root, and run: $ pip install . Requirements: Requires PyTorch 2.0 or later for Flash Attention support. Development for the repo is done in Python 3.8.10. Interface: A basic Gradio interf…
Powerful menu bar manager for macOS. Ice: Ice is a powerful menu bar management tool. While its primary function is hiding and showing menu bar items, it aims to cover a wide variety of additional features to make it one of the most versatile menu bar tools available. Note: Ice is currently in active development. Some features have not yet been implemented. Download the latest release here and see the roadmap below for upcoming features. Usage: Simply Command + drag your menu bar i…
Official Implementation of Self-Supervised Street Gaussians for Autonomous Driving. S3Gaussian: Self-Supervised Street Gaussians for Autonomous Driving. Nan Huang*, Xiaobao Wei, Wenzhao Zheng$^\dagger$, Pengju An, Ming Lu, Wei Zhan, Masayoshi Tomizuka, Kurt Keutzer, Shanghang Zhang$^\ddagger$ (* Work done while interning at UC Berkeley; $\dagger$ Project leader; $\ddagger$ Corresponding author). S3Gaussian emp…
Generate Go client and server boilerplate from OpenAPI 3 specifications. oapi-codegen: oapi-codegen is a command-line tool and library to convert OpenAPI specifications to Go code, be it server-side implementations, API clients, or simply HTTP models. Using oapi-codegen allows you to reduce the boilerplate required to create or integrate with services based on OpenAPI 3.0, and instead focus on writing your business logic, and working on the real value-add for your organisation. With oapi-codegen, …
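🚀 One-click deployment (including an offline all-in-one package)! Based on ChatTTS, with support for voice "card drawing" (randomly sampling voice timbres), long-audio generation, and multi-role reading. Simple to use, with no complicated installation. ChatTTS_colab: A Colab project based on ChatTTS that can be deployed on Colab or downloaded for offline local deployment. After either deployment, you can open a web page to access the ChatTTS service. Demo video: follow the 氪学家 channel. Versions (version, address, description): Online Colab version: runs on Google Colab with one click; requires a Google account; each deployment can be used for 72h; Colab comes with a 15GB GPU. Offline all-in-one version: Baidu Netdisk (百度网盘), extraction code: ut3a; download and run locally; recommended if your machine is well equipped; for Windows 10 and above. Features: Colab one-click run: no complicated environment setup; just click the Colab button above to run the project directly in the browser. Voice card drawing: batch-generate multiple voice timbres and save the ones you like. Long-audio generation: suited to producing longer speech content. Character handling: preliminary handling of digits and of punctuation that is read aloud incorrectly. Multi-role reading: 🚀 read text aloud with a different voice for each role, with one-click script generation by a large language model. Feature showcase: Multi-role reading…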