A Python package containing ML modules for OM1, including speech processing, utilities, and vision-language capabilities.
This project uses Poetry for dependency management.
If you haven’t already, install Poetry by following the official installation guide.
Run the following command to install the dependencies and set up the environment:
poetry install
Before running any code, activate the Poetry environment:
# Activate virtual environment
poetry shell
Note: In Poetry 2.x, the poetry shell command has been moved to a plugin, poetry-plugin-shell. Follow the plugin's installation instructions before using the command.
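Whether the plugin is needed depends on the installed Poetry version. The following is a minimal sketch that inspects the text printed by `poetry --version` and reports whether `poetry shell` requires the plugin. The helper name `needs_shell_plugin` is hypothetical and not part of this package, and the sketch assumes the version output looks like `Poetry (version X.Y.Z)`.

```python
import re


def needs_shell_plugin(version_output: str) -> bool:
    """Return True if `poetry shell` needs poetry-plugin-shell.

    Assumption: `version_output` is the text printed by `poetry --version`,
    shaped like "Poetry (version 2.1.1)".
    """
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"No version number found in: {version_output!r}")
    major = int(match.group(1))
    # From Poetry 2 onward, the shell command is provided by the plugin.
    return major >= 2
```

To use it, pass in the captured output of `poetry --version`, for example from `subprocess.run(["poetry", "--version"], capture_output=True, text=True).stdout`.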
To start the ASR module, run the following command:
python3 -m om1_speech --remote-url=wss://api-asr.openmind.org
You can find the example code in ASR.py. Run it using:
python3 ./examples/ASR.py
To start the TTS module, run the following command:
poetry run om1_tts --tts-url=https://api-dev.openmind.org/api/core/tts --device=<optional> --rate=<optional>
You can find the example code in audio_output_stream.py. You can run it with the above command or using Python directly:
python3 -m om1_speech.audio.audio_output_stream --tts-url=https://api-dev.openmind.org/api/core/tts --device=<optional> --rate=<optional>
Speech processing module providing the following features:
- Audio input/output handling (audio_input_stream.py)
- NVIDIA Riva ASR/TTS integration
- WebSocket streaming support
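The TTS launch command shown earlier takes a required `--tts-url` and optional `--device` and `--rate` flags. As a rough sketch, the command line can be assembled in Python so that the optional flags are only included when set. The helper `build_tts_command` is hypothetical, and the value types for `--device` (string) and `--rate` (integer), as well as the sample rate value used below, are assumptions rather than documented behavior.

```python
import shlex
from typing import List, Optional


def build_tts_command(
    tts_url: str,
    device: Optional[str] = None,  # assumed type; the README only marks it optional
    rate: Optional[int] = None,    # assumed type; the README only marks it optional
) -> List[str]:
    """Build the argument list for `poetry run om1_tts`."""
    command = ["poetry", "run", "om1_tts", f"--tts-url={tts_url}"]
    # Optional flags are omitted entirely when no value is given.
    if device is not None:
        command.append(f"--device={device}")
    if rate is not None:
        command.append(f"--rate={rate}")
    return command


# Example: the rate value 16000 is illustrative only.
command = build_tts_command("https://api-dev.openmind.org/api/core/tts", rate=16000)
print(shlex.join(command))
```

The returned list can be passed directly to `subprocess.run` without a shell, which avoids quoting problems in the URL.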
Common utility functions for:
- HTTP client/server
- WebSocket handling
- Message formatting
Vision-Language Module offering:
- Video stream processing
- Language model integration
- NVIDIA Nano LLM support
- Clone the repository:
  git clone https://github.com/openmind-org/om1-modules.git
  cd om1-modules
- Install dependencies using Poetry:
  poetry install
To add a new dependency, use:
poetry add <package_name>
- om1_speech: Contains speech processing modules.
- om1_utils: Utility functions for HTTP, WebSocket, and message handling.
- om1_vlm: Vision-language module with video stream and LLM support.
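After installing, a quick way to confirm that the three packages listed above are visible to the active environment is to look them up without importing them. This is a minimal sketch using only the standard library; the helper name `check_packages` is hypothetical.

```python
import importlib.util
from typing import Dict, Iterable

# Top-level package names as listed in this README.
PACKAGES = ("om1_speech", "om1_utils", "om1_vlm")


def check_packages(names: Iterable[str] = PACKAGES) -> Dict[str, bool]:
    """Map each package name to whether it can be found in this environment."""
    # find_spec returns None for a top-level module that cannot be located.
    return {name: importlib.util.find_spec(name) is not None for name in names}


for name, found in check_packages().items():
    print(f"{name}: {'found' if found else 'missing'}")
```

Run it inside the Poetry environment (for example with `poetry run python`); a package reported as missing usually means the dependencies were installed into a different environment.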
We welcome contributions! To contribute, follow these steps:
- Fork the repository.
- Create your feature branch (git checkout -b feature/your-feature).
- Commit your changes (git commit -m "Add your feature").
- Push to the branch (git push origin feature/your-feature).
- Submit a Pull Request.
This project is licensed under the MIT License. See the LICENSE file for details.
Developed and maintained by openmind.org.