OM1 Modules

A Python package containing ML modules for OM1, including speech processing, utilities, and vision-language capabilities.

Installation

This project uses Poetry for dependency management.

Step 1: Install Poetry

If you haven’t already, install Poetry by following the official installation guide.

Step 2: Install the Package

Run the following command to install the dependencies and set up the environment:

poetry install

Poetry Environment

Activate Environment

Before running any code, activate the Poetry environment:

# Activate virtual environment
poetry shell

Note

As of Poetry 2.0, the poetry shell command is no longer built in; it is provided by the poetry-plugin-shell plugin. Install the plugin by following its installation instructions before running poetry shell on Poetry 2.x.
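
Whether the plugin is needed depends on the installed Poetry version. The following is a minimal sketch, not part of this package, that reads the major version from the output of poetry --version and reports whether the plugin is required. The helper names (poetry_major_version, needs_shell_plugin, read_poetry_version) are hypothetical, and the parser assumes the output contains a dotted version number such as "Poetry (version 2.1.1)".

```python
import re
import subprocess


def poetry_major_version(version_output: str) -> int:
    """Extract the major version from `poetry --version` output.

    Assumes the output contains a dotted version such as "2.1.1".
    """
    match = re.search(r"(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number found in {version_output!r}")
    return int(match.group(1))


def needs_shell_plugin(version_output: str) -> bool:
    """Return True when `poetry shell` requires poetry-plugin-shell (Poetry 2.x and later)."""
    return poetry_major_version(version_output) >= 2


def read_poetry_version() -> str:
    """Run the real CLI; requires Poetry to be installed and on PATH."""
    result = subprocess.run(
        ["poetry", "--version"], capture_output=True, text=True, check=True
    )
    return result.stdout.strip()
```

Calling needs_shell_plugin(read_poetry_version()) on a machine with Poetry installed indicates whether to install the plugin before using poetry shell.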

Quick Start Examples

Automatic Speech Recognition (ASR)

To start the ASR module, run the following command:

python3 -m om1_speech --remote-url=wss://api-asr.openmind.org

Example code

The example code is in examples/ASR.py. Run it with:

python3 ./examples/ASR.py
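
When the ASR module is launched from another Python program, the start command shown earlier can be assembled as an argument list. This is a rough sketch under stated assumptions: build_asr_command is a hypothetical helper, not part of om1_speech, and the check that the URL uses a ws or wss scheme is an assumption based only on the wss:// endpoint in the command above.

```python
import sys
from urllib.parse import urlparse


def build_asr_command(remote_url: str) -> list[str]:
    """Build the argv for `python3 -m om1_speech --remote-url=...`.

    Assumption: the remote endpoint is a WebSocket URL (ws:// or wss://).
    """
    scheme = urlparse(remote_url).scheme
    if scheme not in ("ws", "wss"):
        raise ValueError(f"expected a ws:// or wss:// URL, got {remote_url!r}")
    # sys.executable keeps the launch inside the current (Poetry) interpreter.
    return [sys.executable, "-m", "om1_speech", f"--remote-url={remote_url}"]
```

The returned list can be passed to subprocess.run from inside the activated Poetry environment, which avoids shell quoting issues.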

Text-to-Speech (TTS)

To start the TTS module, run the following command:

poetry run om1_tts --tts-url=https://api-dev.openmind.org/api/core/tts --device=<optional> --rate=<optional>

Example code

The example code is in audio_output_stream.py. Run it with the command above, or invoke the module directly with Python:

python3 -m om1_speech.audio.audio_output_stream --tts-url=https://api-dev.openmind.org/api/core/tts --device=<optional> --rate=<optional>
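
Because --device and --rate are optional, a caller has to leave them out entirely when no value is supplied. The sketch below illustrates that with a hypothetical helper, build_tts_command, which is not part of the package. Only the flag names come from the commands above; the accepted value types for device and rate are not documented here, so the sketch passes them through as opaque values.

```python
from typing import Optional, Union


def build_tts_command(
    tts_url: str,
    device: Optional[Union[int, str]] = None,
    rate: Optional[Union[int, str]] = None,
) -> list[str]:
    """Build the argv for `poetry run om1_tts`, omitting unset optional flags."""
    cmd = ["poetry", "run", "om1_tts", f"--tts-url={tts_url}"]
    if device is not None:
        # Value format is an assumption; it is forwarded unchanged.
        cmd.append(f"--device={device}")
    if rate is not None:
        cmd.append(f"--rate={rate}")
    return cmd
```

With only the URL supplied, the result contains no --device or --rate entries, so the module falls back to whatever defaults it defines.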

Modules

om1 Speech

Speech processing module providing the following features:

Audio input/output handling (audio_input_stream.py)
NVIDIA Riva ASR/TTS integration
WebSocket streaming support

om1 Utils

Common utility functions for:

HTTP client/server
WebSocket handling
Message formatting

om1 VLM

Vision-Language Module offering:

Video stream processing
Language model integration
NVIDIA Nano LLM support

Development

Set Up the Development Environment

Clone the repository:

git clone https://github.com/openmind-org/om1-modules.git
cd om1-modules

Install dependencies using Poetry:
```
poetry install
```

Adding Dependencies

To add a new dependency, use:

poetry add <package_name>

Code Structure

om1_speech: Contains speech processing modules.
om1_utils: Utility functions for HTTP, WebSocket, and message handling.
om1_vlm: Vision-language module with video stream and LLM support.
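
After installation, a quick check can confirm that the three packages listed above are importable in the active environment. This is a minimal sketch: check_packages is a hypothetical helper, and it only tests whether each top-level package can be located, without importing it or exercising any of its functionality.

```python
from importlib.util import find_spec

# Top-level package names taken from the list above.
OM1_PACKAGES = ("om1_speech", "om1_utils", "om1_vlm")


def check_packages(names: tuple[str, ...] = OM1_PACKAGES) -> dict[str, bool]:
    """Map each package name to True if it can be found on the import path."""
    return {name: find_spec(name) is not None for name in names}
```

Running check_packages() inside the Poetry environment should report True for every entry if poetry install completed successfully.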

Contributing

We welcome contributions! To contribute, follow these steps:

1. Fork the repository.
2. Create your feature branch (git checkout -b feature/your-feature).
3. Commit your changes (git commit -m "Add your feature").
4. Push to the branch (git push origin feature/your-feature).
5. Submit a Pull Request.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Authors

Developed and maintained by openmind.org.
