An AI-powered tool that automatically generates engaging short-form videos from longer YouTube content. It is optimized for platforms like YouTube Shorts, Instagram Reels, and TikTok, and for static videos with a single person speaking.
- Smart Video Download:
  - Downloads videos from YouTube URLs with quality selection
  - Supports both progressive and adaptive streams
  - Automatically merges video and audio for best quality
  - Handles local video files as input
- Advanced Transcription:
  - Uses faster-whisper (base.en model) for efficient transcription
  - Provides both segment-level and word-level timestamps
  - CPU-optimized processing with int8 quantization
  - Multi-threaded performance for faster processing
- AI-Powered Highlight Detection:
  - Leverages Google's Gemini-2.0-flash model for content analysis
  - Identifies the most engaging segments from transcriptions
  - Generates relevant hashtags and captions
  - Smart content selection based on engagement potential
- Intelligent Video Processing:
  - Multiple vertical cropping strategies:
    - Static centered crop
    - Face-detection based dynamic cropping
    - Average face position based cropping
  - Maintains optimal 9:16 aspect ratio for shorts
  - Automatic bottom margin cropping for better framing
  - Supports both static and animated captions
- Robust Caching System:
  - SQLite database for efficient data management
  - Caches processed videos, audio, and transcriptions
  - Prevents redundant processing of previously handled content
  - Easy cache management and cleanup
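
The static and average-face-position cropping strategies listed above both come down to choosing a horizontal center and cutting a full-height 9:16 window around it. Here is a minimal sketch of that geometry. `crop_box` and `average_center` are hypothetical helpers for illustration, not functions taken from Components/FaceCrop.py, and rounding the width down to an even number is an assumption made because many video encoders expect even dimensions.

```python
def crop_box(frame_w, frame_h, center_x=None):
    """Return (x, y, w, h) of a full-height 9:16 crop window.

    center_x is the desired horizontal center in pixels. None means the
    frame center, which corresponds to the static centered strategy.
    """
    crop_w = int(frame_h * 9 / 16)
    crop_w -= crop_w % 2            # assumption: keep the width even
    crop_w = min(crop_w, frame_w)   # never wider than the source frame
    if center_x is None:
        center_x = frame_w / 2
    x = int(center_x - crop_w / 2)
    x = max(0, min(x, frame_w - crop_w))  # clamp the window inside the frame
    return x, 0, crop_w, frame_h


def average_center(face_centers, frame_w):
    """Average the detected face x-centers; fall back to the frame center."""
    if not face_centers:
        return frame_w / 2
    return sum(face_centers) / len(face_centers)


print(crop_box(1920, 1080))  # (657, 0, 606, 1080)
```

A face near the edge of the frame simply pins the window to that edge, so `crop_box(1920, 1080, 1900)` returns an `x` of 1314 rather than running past the right border.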
- Python 3.10 or higher
- FFmpeg (latest version recommended)
- CUDA-compatible GPU (optional, for faster processing)
- 4GB+ RAM recommended
- Clone the repository:
      git clone https://github.com/yourusername/AI-Youtube-Shorts-Generator.git
      cd AI-Youtube-Shorts-Generator
- Create and activate a virtual environment:
      # Windows
      python -m venv venv
      venv\Scripts\activate
      # Linux/MacOS
      python3 -m venv venv
      source venv/bin/activate
- Install dependencies:
      pip install -r requirements.txt
- Set up environment variables: create a .env file in the project root:
      GOOGLE_API_KEY=your_google_ai_studio_key_here
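
Before the first run it can help to confirm that the prerequisites and the API key are actually in place. The following is a small sketch, not part of the project: `preflight` is a hypothetical helper, and it only inspects the process environment, so it assumes the contents of .env have already been loaded or exported by whatever mechanism you use.

```python
import os
import shutil
import sys


def preflight(env=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    problems = []
    if sys.version_info < (3, 10):
        problems.append("Python 3.10 or higher is required")
    if shutil.which("ffmpeg") is None:
        problems.append("FFmpeg was not found on PATH")
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("!", problem)
```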
- Start the tool:
      python main.py
- Input either:
  - A YouTube URL
  - A path to a local video file
- Select video quality when prompted (for YouTube downloads)
- The tool will process your video through several stages:
  - Download/import video
  - Extract and transcribe audio
  - Identify engaging segments
  - Create vertical crops
  - Add captions
  - Generate final shorts
- Find your processed shorts in the shorts directory
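
Of the stages listed above, segment identification depends on a language model's reply, so its output is worth validating before any video is cut. The sketch below assumes the model is asked to return a JSON list of objects with `start`, `end`, and `caption` fields. That shape, the `parse_highlights` helper, and the length limits are illustrative assumptions; the actual prompt and response format in Components/LanguageTasks.py may differ.

```python
import json


def parse_highlights(raw, video_duration, min_len=5.0, max_len=60.0):
    """Parse a JSON list of {"start", "end", "caption"} and keep valid items.

    Items are dropped when timestamps are missing, out of range, or when
    the clip length falls outside [min_len, max_len] seconds (assumed limits).
    """
    try:
        items = json.loads(raw)
    except json.JSONDecodeError:
        return []
    if not isinstance(items, list):
        return []
    highlights = []
    for item in items:
        try:
            start = float(item["start"])
            end = float(item["end"])
        except (KeyError, TypeError, ValueError):
            continue  # malformed entry, skip it
        if not (0 <= start < end <= video_duration):
            continue  # timestamps fall outside the video
        if not (min_len <= end - start <= max_len):
            continue  # clip too short or too long
        highlights.append(
            {"start": start, "end": end, "caption": str(item.get("caption", ""))}
        )
    return highlights
```

Returning an empty list on unparseable input lets the caller decide whether to retry the request or stop, instead of failing partway through processing.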
- USE_ANIMATED_CAPTIONS: Toggle between static and animated captions in main.py (recommended)
- SHORTS_DIR: Customize output directory for processed videos
- CPU thread optimization in Components/Transcription.py
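
To make the thread setting concrete, here is a rough sketch of CPU transcription with faster-whisper using the base.en model, int8 quantization, and word-level timestamps. It is not the code from Components/Transcription.py: `pick_cpu_threads`, `transcribe`, and `segment_to_dict` are hypothetical names, and leaving one core free is an assumption rather than the project's policy.

```python
import os


def pick_cpu_threads(reserve=1):
    """Use all cores except `reserve` (assumption), but never fewer than one."""
    return max(1, (os.cpu_count() or 1) - reserve)


def segment_to_dict(segment):
    """Convert a faster-whisper segment into plain, serializable data."""
    words = [
        {"word": w.word, "start": w.start, "end": w.end}
        for w in (segment.words or [])
    ]
    return {
        "start": segment.start,
        "end": segment.end,
        "text": segment.text,
        "words": words,
    }


def transcribe(audio_path, cpu_threads=None):
    """Transcribe an audio file on CPU and return a list of segment dicts."""
    from faster_whisper import WhisperModel  # third-party, imported lazily

    model = WhisperModel(
        "base.en",
        device="cpu",
        compute_type="int8",
        cpu_threads=cpu_threads or pick_cpu_threads(),
    )
    # transcribe() yields segments lazily; the list comprehension drives it.
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    return [segment_to_dict(s) for s in segments]
```

Plain dictionaries are convenient here because they can be serialized to JSON and stored in the cache without depending on library objects.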
AI-Youtube-Shorts-Generator/
├── Components/
│ ├── Captions.py # Caption generation and rendering
│ ├── Database.py # SQLite database management
│ ├── Edit.py # Video editing and processing
│ ├── FaceCrop.py # Vertical cropping algorithms
│ ├── LanguageTasks.py # AI content analysis
│ ├── Speaker.py # Speaker detection (experimental)
│ ├── Transcription.py # Audio transcription
│ └── YoutubeDownloader.py # Video download handling
├── main.py # Main execution script
├── requirements.txt # Python dependencies
└── .env # Environment variables
The SQLite database (video_processing.db) contains three main tables:
- videos:
  - id (PRIMARY KEY)
  - youtube_url
  - local_path
  - audio_path
  - created_at
- transcriptions:
  - id (PRIMARY KEY)
  - video_id (FOREIGN KEY)
  - transcription_data
  - created_at
- highlights:
  - id (PRIMARY KEY)
  - video_id (FOREIGN KEY)
  - start_time
  - end_time
  - output_path
  - segment_text
  - caption_with_hashtags
  - created_at
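
The tables above could be created and queried roughly as follows. Table and column names come from the list above. The column types, the timestamp defaults, and the `open_db` and `cached_video` helpers are assumptions for illustration, since the actual definitions live in Components/Database.py. The sketch defaults to an in-memory database so that it does not touch a real video_processing.db.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS videos (
    id INTEGER PRIMARY KEY,
    youtube_url TEXT,
    local_path TEXT,
    audio_path TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS transcriptions (
    id INTEGER PRIMARY KEY,
    video_id INTEGER REFERENCES videos(id),
    transcription_data TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS highlights (
    id INTEGER PRIMARY KEY,
    video_id INTEGER REFERENCES videos(id),
    start_time REAL,
    end_time REAL,
    output_path TEXT,
    segment_text TEXT,
    caption_with_hashtags TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP
);
"""


def open_db(path=":memory:"):
    """Open the database and create the three tables if they are missing."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def cached_video(conn, youtube_url):
    """Return (id, local_path, audio_path) for a cached URL, or None."""
    return conn.execute(
        "SELECT id, local_path, audio_path FROM videos WHERE youtube_url = ?",
        (youtube_url,),
    ).fetchone()
```

A lookup like `cached_video` is what lets a second run on the same URL skip the download and audio extraction steps.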
- Face Detection:
  - The face-based cropping can be inconsistent with multiple faces
  - May need manual adjustment for optimal framing in some cases
- Speaker Detection:
  - Current implementation uses basic voice activity detection
  - Full speaker diarization not yet implemented
- Resource Usage:
  - Processing long videos can be memory-intensive
  - GPU acceleration limited to specific components
- If facing cache-related issues:
  - Delete video_processing.db to clear the cache
  - Remove temporary files in the videos directory
- For video processing errors:
  - Ensure FFmpeg is properly installed and accessible
  - Check available disk space for temporary files
  - Verify input video format compatibility
- For AI-related issues:
  - Confirm Google API key is valid and has sufficient quota
  - Check internet connectivity for API calls
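
The cache reset described above can be scripted. This is a sketch rather than a tool shipped with the project: `clear_cache` is a hypothetical helper, and it assumes every file directly inside the videos directory is temporary, so check that assumption before running it on a folder that holds anything you want to keep.

```python
from pathlib import Path


def clear_cache(db_path="video_processing.db", videos_dir="videos"):
    """Delete the cache database and the files directly inside videos_dir.

    Returns the list of removed paths. Subdirectories are left untouched.
    """
    removed = []
    db = Path(db_path)
    if db.is_file():
        db.unlink()
        removed.append(db)
    folder = Path(videos_dir)
    if folder.is_dir():
        # Materialize the listing first so deletion does not disturb iteration.
        for entry in list(folder.iterdir()):
            if entry.is_file():
                entry.unlink()
                removed.append(entry)
    return removed
```

Calling it a second time is harmless: with nothing left to delete it returns an empty list.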
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Commit your changes
- Push to your branch
- Create a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.
- SQL integration made by YassineKADER
- Original project by SamurAIGPT
- Uses Google's Gemini AI for content analysis
- Powered by faster-whisper for transcription