-
Notifications
You must be signed in to change notification settings - Fork 26
Cli Guide
A comprehensive command-line interface for the Advanced RVC Inference framework. This CLI provides access to all the powerful features of RVC including voice conversion, model training, audio separation, and more - all from the terminal.
- Installation
- Quick Start
-
Command Reference
- infer - Voice Conversion
- uvr - Audio Separation
- create-dataset - Create Training Data
- create-index - Create Model Index
- extract - Feature Extraction
- preprocess - Data Preprocessing
- train - Model Training
- create-ref - Create Reference Set
- download - Download Models/Audio
- serve - Web Interface
- info - System Information
- version - Version Info
- list-models - List Available Models
- list-f0-methods - List F0 Methods
- Configuration
- Examples
- Troubleshooting
# Clone the repository
git clone https://github.com/ArkanDash/Advanced-RVC-Inference.git
cd Advanced-RVC-Inference
# Install the package
pip install -e .
# Or install with all dependencies
pip install -e .[all]# Run directly with Python
python rvc_cli.py --help
# Or use the wrapper script (Linux/macOS)
./rvc-cli --help
# Make the wrapper executable (Linux/macOS)
chmod +x rvc-cli# Convert voice with default settings
rvc-cli infer -m mymodel.pth -i input.wav -o output.wav
# Convert with pitch shift (12 semitones = octave up)
rvc-cli infer -m mymodel.pth -i input.wav -p 12
# Convert with custom F0 method
rvc-cli infer -m mymodel.pth -i input.wav --f0_method harvest# Separate vocals from music
rvc-cli uvr -i music.wav
# Separate with specific model
rvc-cli uvr -i music.wav --model MDXNET_Main --output ./separated# Start web interface on default port
rvc-cli serve
# Start on custom port with public share
rvc-cli serve --port 8080 --shareConvert voice in an audio file using an RVC model.
rvc-cli infer -m <model> -i <input> [options]Required Arguments:
-
-m, --model: Path to RVC model file (.pth or .onnx) -
-i, --input: Path to input audio file
Optional Arguments:
-
-o, --output: Output file path (auto-generated if not specified) -
-p, --pitch: Pitch shift in semitones (default: 0) -
-f, --format: Output format - wav, mp3, flac, ogg (default: wav) -
--index: Path to .index file for better quality -
--f0_method: F0 extraction method (default: rmvpe) -
--filter_radius: Filter radius for F0 smoothing (default: 3) -
--index_rate: Index strength 0.0-1.0 (default: 0.5) -
--rms_mix_rate: RMS mix rate (default: 1.0) -
--protect: Protect consonants 0.0-1.0 (default: 0.33) -
--hop_length: Hop length for processing (default: 64) -
--embedder_model: Embedding model (default: hubert_base) -
--resample_sr: Resample rate (0 = original, default: 0) -
--split_audio: Split audio before processing -
--checkpointing: Enable memory checkpointing -
--f0_autotune: Enable F0 autotune -
--f0_autotune_strength: Autotune strength (default: 1.0) -
--formant_shifting: Enable formant shifting -
--formant_qfrency: Formant frequency coefficient (default: 0.8) -
--formant_timbre: Formant timbre coefficient (default: 0.8) -
--clean_audio: Apply audio cleaning -
--clean_strength: Cleaning strength (default: 0.7)
Example:
rvc-cli infer -m artist_model.pth -i speech.wav -o converted.wav -p 2 \
--index_rate 0.75 --f0_method rmvpe --clean_audioSeparate vocals from instrumentals using UVR5.
rvc-cli uvr -i <input> [options]Required Arguments:
-
-i, --input: Path to input audio file
Optional Arguments:
-
-o, --output: Output directory (default: ./audios/uvr) -
-f, --format: Output format (default: wav) -
--model: Separation model (default: MDXNET_Main) -
--karaoke_model: Karaoke model (default: MDX-Version-1) -
--reverb_model: Reverb model (default: MDX-Reverb) -
--denoise_model: Denoise model (default: Normal) -
--sample_rate: Output sample rate (default: 44100) -
--shifts: Number of predictions (default: 2) -
--batch_size: Batch size (default: 1) -
--overlap: Overlap between segments (default: 0.25) -
--aggression: Extraction intensity (default: 5) -
--hop_length: Hop length (default: 1024) -
--window_size: Window size (default: 512) -
--enable_tta: Enable test-time augmentation -
--enable_denoise: Enable denoising -
--separate_backing: Separate backing vocals -
--separate_reverb: Separate reverb
Available Models:
- MDXNET_Main, MDXNET_9482
- HP-Vocal-1, HP-Vocal-2
- Inst_HQ_1 through Inst_HQ_5
- Kim_Vocal_1, Kim_Vocal_2
Example:
rvc-cli uvr -i song.wav --model HP-Vocal-2 --aggression 10 \
--enable_denoise --output ./vocalsCreate training dataset from YouTube videos or local audio files.
rvc-cli create-dataset -u <url> [options]
# or
rvc-cli create-dataset -i <directory> [options]Required Arguments (one of):
-
-u, --url: YouTube URL (separate multiple with commas) -
-i, --input: Input directory with audio files
Optional Arguments:
-
-o, --output: Output directory (default: ./dataset) -
--sample_rate: Sample rate (default: 48000) -
--clean_dataset: Apply data cleaning -
--clean_strength: Cleaning strength (default: 0.7) -
--separate: Separate vocals (default: True) -
--separator_model: Separation model (default: MDXNET_Main) -
--skip_start: Seconds to skip at start (default: 0) -
--skip_end: Seconds to skip at end (default: 0)
Example:
rvc-cli create-dataset -u "https://youtube.com/watch?v=xxx" \
--sample_rate 48000 --separate --output ./my_datasetCreate .index file for voice retrieval.
rvc-cli create-index <model_name> [options]Arguments:
-
model_name: Name of the model
Optional Arguments:
-
--version: RVC version - v1 or v2 (default: v2) -
--algorithm: Index algorithm - Auto, Faiss, or KMeans (default: Auto)
Example:
rvc-cli create-index mymodel --version v2 --algorithm FaissExtract embeddings and F0 from training data.
rvc-cli extract <model_name> --sample_rate <rate> [options]Required Arguments:
-
model_name: Name of the model -
--sample_rate: Sample rate of input audio
Optional Arguments:
-
--version: RVC version - v1 or v2 (default: v2) -
--f0_method: F0 extraction method (default: rmvpe) -
--f0_onnx: Use ONNX F0 predictor -
--pitch_guidance: Use pitch guidance (default: True) -
--hop_length: Hop length (default: 128) -
--cpu_cores: CPU cores (default: 2) -
--gpu: GPU index (default: - for CPU) -
--embedder_model: Embedder model (default: hubert_base) -
--rms_extract: Extract RMS energy
Example:
rvc-cli extract mymodel --sample_rate 48000 --f0_method rmvpe \
--gpu 0 --pitch_guidanceSlice and normalize training audio.
rvc-cli preprocess <model_name> --sample_rate <rate> [options]Required Arguments:
-
model_name: Name of the model -
--sample_rate: Sample rate
Optional Arguments:
-
--dataset_path: Dataset path (default: ./dataset) -
--cpu_cores: CPU cores (default: 2) -
--cut_method: Cutting method - Automatic, Simple, or Skip (default: Automatic) -
--process_effects: Apply preprocessing effects -
--clean_dataset: Clean dataset -
--chunk_len: Chunk length for Simple method (default: 3.0) -
--overlap_len: Overlap length (default: 0.3) -
--normalization: Normalization mode - none, pre, or post (default: none)
Example:
rvc-cli preprocess mymodel --sample_rate 48000 --cut_method Automatic \
--process_effects --normalization preTrain a new RVC voice model.
rvc-cli train <model_name> [options]Required Arguments:
-
model_name: Name of the model
Optional Arguments:
-
--version: RVC version - v1 or v2 (default: v2) -
--author: Model author name -
--epochs: Total training epochs (default: 300) -
--batch_size: Batch size (default: 8) -
--save_every: Save checkpoint every N epochs (default: 50) -
--save_latest: Save only latest checkpoint (default: True) -
--save_weights: Save all model weights (default: True) -
--gpu: GPU index (default: 0) -
--cache_gpu: Cache data in GPU -
--pitch_guidance: Use pitch guidance (default: True) -
--pretrained_g: Path to pre-trained G weights -
--pretrained_d: Path to pre-trained D weights -
--vocoder: Vocoder - Default, MRF-HiFi-GAN, or RefineGAN (default: Default) -
--energy: Use RMS energy -
--overtrain_detect: Enable overtraining detection -
--optimizer: Optimizer - AdamW, RAdam, or AnyPrecisionAdamW (default: AdamW) -
--multiscale_loss: Use multi-scale mel loss -
--use_reference: Use custom reference set -
--reference_path: Path to reference set
Example:
rvc-cli train mymodel --version v2 --epochs 500 --batch_size 8 \
--gpu 0 --save_every 100 --vocoder "MRF-HiFi-GAN"Create reference audio for better inference quality.
rvc-cli create-ref <audio_file> [options]Required Arguments:
-
audio_file: Path to audio file
Optional Arguments:
-
-n, --name: Reference name (default: reference) -
--version: RVC version - v1 or v2 (default: v2) -
--pitch_guidance: Use pitch guidance (default: True) -
--energy: Use RMS energy -
--embedder_model: Embedder model (default: hubert_base) -
--f0_method: F0 extraction method (default: rmvpe) -
--pitch_shift: Pitch shift (default: 0) -
--filter_radius: Filter radius (default: 3) -
--f0_autotune: Enable F0 autotune -
--alpha: Alpha blending (default: 0.5)
Example:
rvc-cli create-ref reference_audio.wav -n myref --f0_method rmvpeDownload models from HuggingFace or audio from YouTube.
rvc-cli download -l <link> [options]Required Arguments:
-
-l, --link: Download link (HuggingFace or YouTube URL)
Optional Arguments:
-
-t, --type: Download type - model, audio, or index (default: model) -
-n, --name: Name to save as
Example:
rvc-cli download -l "https://huggingface.co/user/model/resolve/main/model.pth"Launch the Gradio web UI.
rvc-cli serve [options]Optional Arguments:
-
--host: Host to bind (default: 0.0.0.0) -
--port: Port to bind (default: 7860) -
--share: Create public share URL
Example:
rvc-cli serve --port 7860 --shareShow system and environment information.
rvc-cli infoDisplays:
- Operating system and version
- CPU information
- Memory and disk space
- GPU information (if available)
- Python and package versions
Show version and dependency information.
rvc-cli versionList installed models in the weights folder.
rvc-cli list-modelsShow all available pitch extraction methods.
rvc-cli list-f0-methods| Variable | Description | Default |
|---|---|---|
ARVC_ASSETS_PATH |
Path to assets directory | assets |
ARVC_CONFIGS_PATH |
Path to configs directory | configs |
ARVC_WEIGHTS_PATH |
Path to weights directory | assets/weights |
ARVC_LOGS_PATH |
Path to logs directory | assets/logs |
Place your model files (.pth or .onnx) in:
advanced_rvc_inference/assets/weights/
Place index files (.index) in:
advanced_rvc_inference/assets/logs/<model_name>/
# 1. Create dataset from YouTube
rvc-cli create-dataset -u "https://youtube.com/watch?v=xxx" \
--output ./dataset --sample_rate 48000 --separate
# 2. Create index for the model (after training)
rvc-cli create-index mymodel --version v2
# 3. Extract features
rvc-cli extract mymodel --sample_rate 48000 --f0_method rmvpe --gpu 0
# 4. Preprocess data
rvc-cli preprocess mymodel --sample_rate 48000 --cut_method Automatic
# 5. Train model
rvc-cli train mymodel --version v2 --epochs 300 --batch_size 8 --gpu 0# Using RMVPE (recommended)
rvc-cli infer -m model.pth -i input.wav -o output_rmvpe.wav --f0_method rmvpe
# Using Harvest (faster)
rvc-cli infer -m model.pth -i input.wav -o output_harvest.wav --f0_method harvest
# Using Crepe (most accurate but slow)
rvc-cli infer -m model.pth -i input.wav -o output_crepe.wav --f0_method crepe-medium# Process all audio files in a directory
for file in ./inputs/*.wav; do
rvc-cli infer -m model.pth -i "$file" -o "./outputs/$(basename $file)"
done# Convert with different models
rvc-cli infer -m model_a.pth -i voice.wav -o voice_model_a.wav -p 0
rvc-cli infer -m model_b.pth -i voice.wav -o voice_model_b.wav -p 2
rvc-cli infer -m model_c.pth -i voice.wav -o voice_model_c.wav -p -2- Ensure the model path is correct
- Check that the model file has
.pthor.onnxextension - Verify file permissions
- Reduce batch size:
--batch_size 4 - Enable checkpointing:
--checkpointing - Use CPU:
--gpu -
- Convert audio to WAV first:
ffmpeg -i input.mp3 output.wav - Supported formats: wav, mp3, flac, ogg, opus, m4a, aac
- Install ONNX runtime for some methods
- Some methods require specific embedders
# General help
rvc-cli --help
# Command-specific help
rvc-cli infer --help
rvc-cli uvr --help
rvc-cli train --help