niedev / RTranslator
- Sunday, June 23, 2024 at 00:00:01
RTranslator is the world's first open source real-time translation app.
RTranslator is an (almost) open-source, free, and offline real-time translation app for Android.
Connect to someone who has the app, connect Bluetooth headphones, put the phone in your pocket and you can have a conversation as if the other person spoke your language.
The Conversation mode is the main feature of RTranslator. In this mode, you can connect with another phone that uses this app. If the user accepts your connection request:
When you talk, your phone (or the Bluetooth headset, if connected) will capture the audio.
The captured audio will be converted into text and sent to the interlocutor's phone.
The interlocutor's phone will translate the received text into its user's language.
The interlocutor's phone will convert the translated text into audio and play it through its speaker (or through the interlocutor's Bluetooth headset, if one is connected to their phone).
All this in both directions.
Each user can have more than one connected phone so that you can translate conversations between more than two people and in any combination.
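The conversation flow described above can be sketched in a few lines. This is only an illustrative model, not the app's actual code: the `Phone` class, its method names, and the `transcribe`, `translate`, and `speak` callables are hypothetical stand-ins for the roles played by speech recognition, translation, and TTS.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Phone:
    """One participant's phone. Every callable here is a hypothetical stand-in."""
    language: str
    transcribe: Callable[[bytes], str]         # speech -> text
    translate: Callable[[str, str, str], str]  # (text, source, target) -> text
    speak: Callable[[str], None]               # text -> audio output
    peers: List["Phone"] = field(default_factory=list)

    def on_speech(self, audio: bytes) -> None:
        # The speaker's phone converts audio to text and sends the text to each peer.
        text = self.transcribe(audio)
        for peer in self.peers:
            peer.on_text_received(text, self.language)

    def on_text_received(self, text: str, source_language: str) -> None:
        # The receiving phone translates into its own language, then plays the result.
        translated = self.translate(text, source_language, self.language)
        self.speak(translated)


def demo() -> List[str]:
    """Wire two phones together with fake components and return what was 'spoken'."""
    spoken: List[str] = []
    fake_transcribe = lambda audio: audio.decode("utf-8")
    fake_translate = lambda text, src, dst: f"[{src}->{dst}] {text}"
    alice = Phone("en", fake_transcribe, fake_translate, spoken.append)
    bob = Phone("it", fake_transcribe, fake_translate, spoken.append)
    alice.peers.append(bob)
    bob.peers.append(alice)
    alice.on_speech(b"hello")
    bob.on_speech(b"ciao")
    return spoken
```

Because each phone keeps a list of peers, the same sketch covers conversations with more than two participants: text is sent once per peer, and each receiver translates into its own language.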
While Conversation mode is useful for having a long conversation with someone, WalkieTalkie mode is designed for quick conversations, such as asking for information on the street or talking to a shop assistant.
This mode only translates conversations between two people, does not work with Bluetooth headsets, and requires you to talk in turns. It is not true simultaneous translation, but it works with only one phone.
In this mode, the smartphone microphone listens in two languages simultaneously (both are selectable on the WalkieTalkie mode screen).
The app detects which language the interlocutor is speaking, translates the speech into the other language, converts the translated text into audio, and plays it through the phone speaker. When the TTS has finished, the app automatically resumes listening.
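The turn-taking logic of this mode can be sketched as follows. This is a simplified illustration that assumes the utterance has already been transcribed to text; the function name `walkie_talkie_turn` and the `identify_language` and `translate` callables are hypothetical and do not correspond to the app's real interfaces.

```python
from typing import Callable, Tuple


def walkie_talkie_turn(
    utterance: str,
    languages: Tuple[str, str],
    identify_language: Callable[[str], str],
    translate: Callable[[str, str, str], str],
) -> Tuple[str, str]:
    """Handle one turn: return (target_language, translated_text)."""
    first, second = languages
    detected = identify_language(utterance)
    if detected not in languages:
        # Only the two selected languages are expected in this mode.
        raise ValueError(f"unexpected language: {detected}")
    # Whatever language was spoken, translate into the other one.
    target = second if detected == first else first
    return target, translate(utterance, detected, target)
```

A caller would pass the result to a TTS engine and resume listening only once playback has finished, which is why the speakers have to take turns.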
The text translation mode is just a classic text translator, but always useful.
RTranslator uses Meta's NLLB for translation and OpenAI's Whisper for speech recognition. Both are (almost) open-source, state-of-the-art AI models with excellent quality, and both run directly on the phone, ensuring absolute privacy and making it possible to use RTranslator offline without loss of quality.
Also, RTranslator works in the background, with the phone on standby or while you use other apps (only in Conversation or WalkieTalkie mode). However, some phones limit power in the background; in that case it is better to keep the app open with the screen on.
Compared to version 1.0, the Google APIs have been replaced by Meta's NLLB for translation and OpenAI's Whisper for speech recognition. These AI models run directly on your phone, so the app is now totally free and requires no configuration!
A classic text translation mode has been added.
Bluetooth LE device search has been improved.
Fixed some bugs.
I have optimized the AI models a lot to minimize RAM consumption and execution time. Even so, to use the app without the risk of crashing you need a phone with at least 6GB of RAM, and to get a good enough execution time you need a phone with a fast enough CPU.
If you have a pretty crappy phone (or if you want maximum speed) you can always use version 1.0 of RTranslator (but since it uses Google APIs it's not free and needs some initial setup).
To install the app, download the latest APK file from https://github.com/niedev/RTranslator/releases/ and install it (ignore the other files; the app downloads them automatically on first start).
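If you prefer to locate the APK programmatically, a minimal sketch is shown below. It assumes you have already fetched the JSON for the latest release from the GitHub REST API (the `releases/latest` endpoint) and parsed it into a dictionary; the helper name `find_apk_url` is hypothetical, and the asset file names used for testing are made up, since the actual names vary by release.

```python
from typing import Optional

# GitHub REST API endpoint that describes the latest release of the repository.
LATEST_RELEASE_API = "https://api.github.com/repos/niedev/RTranslator/releases/latest"


def find_apk_url(release: dict) -> Optional[str]:
    """Return the download URL of the first .apk asset in a release, if any."""
    for asset in release.get("assets", []):
        # The other release files are skipped; the app fetches those itself.
        if asset.get("name", "").lower().endswith(".apk"):
            return asset.get("browser_download_url")
    return None
```

The function only inspects the `assets` list, so it can be checked offline against a hand-written dictionary before being pointed at a real API response.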
On the first launch, you will need to download the models for translation and speech recognition (1.2GB) and once done you can start translating.
The languages supported are as follows:
Arabic, Bulgarian, Catalan, Chinese, Czech, Danish, German, Greek, English, Spanish, Finnish, French, Croatian, Italian, Japanese, Korean, Dutch, Polish, Portuguese, Romanian, Russian, Slovak, Swedish, Tamil, Thai, Turkish, Ukrainian, Urdu, Vietnamese.
Privacy is a fundamental right. That's why RTranslator does not collect any personal data (I don't even have a server). For more information, read the privacy policy (for now it is the same as the privacy policy of RTranslator 1.0, but I will update it in the future).
RTranslator's code is completely open-source, but some of the external libraries it uses have less permissive licenses. These are all the external libraries used by the app (with their license status):
BluetoothCommunicator (open-source): Used for Bluetooth LE communication between devices.
GalleryImageSelector (open-source): Used for selecting and cropping the profile image from the gallery.
OnnxRuntime (open-source): Used as an accelerator engine for the AI models.
SentencePiece (open-source): Used for tokenization of the input text for NLLB.
ML Kit (closed-source): Used to identify the language in WalkieTalkie mode.
And the following AI models:
NLLB (open-source, but only for non-commercial use): The model used is NLLB-Distilled-600M with KV cache.
Whisper (open-source): The model used is Whisper-Small-244M with KV cache.
I converted both models to ONNX format and quantized them to int8 (excluding some weights to ensure almost zero quality loss). I also separated some parts of the models to reduce RAM consumption (without this separation, some weights were duplicated at runtime, consuming more RAM than expected).
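To give an idea of what int8 quantization means, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is a generic textbook scheme chosen for illustration, not necessarily the one used for RTranslator's models; the function names are hypothetical.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max((abs(w) for w in weights), default=0.0)
    # The largest magnitude maps to 127; an all-zero tensor gets a dummy scale.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize_int8(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float values from the int8 representation."""
    return [q * scale for q in quantized]
```

Each weight is stored in one byte instead of four (for 32-bit floats), at the cost of a rounding error of at most half the scale per weight. Weights that are too sensitive to this error can be left unquantized, which is the kind of trade-off described above.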
This is an open-source and completely ad-free app, and I don't make any money from it.
So, if you like the app and want to say thank you and support the project, you can make a donation via PayPal by clicking on the button below (any amount is welcome).
If you donate, or just leave a star, thank you ❤️
If you have found a bug, please report it by opening an issue or by sending an email to contact.niedev@gmail.com
Enjoy your simultaneous translator.