AI-Youtube-Shorts-Generator-Gemini

An AI-powered tool that automatically generates engaging short-form videos from longer YouTube content, optimized for platforms like YouTube Shorts, Instagram Reels, and TikTok. It works best on static videos with a single person speaking.

Key Features

Smart Video Download:
- Downloads videos from YouTube URLs with quality selection
- Supports both progressive and adaptive streams
- Automatically merges video and audio for best quality
- Handles local video files as input
Advanced Transcription:
- Uses faster-whisper (base.en model) for efficient transcription
- Provides both segment-level and word-level timestamps
- CPU-optimized processing with int8 quantization
- Multi-threaded performance for faster processing
AI-Powered Highlight Detection:
- Leverages Google's Gemini-2.0-flash model for content analysis
- Identifies the most engaging segments from transcriptions
- Generates relevant hashtags and captions
- Smart content selection based on engagement potential
Intelligent Video Processing:
- Multiple vertical cropping strategies:
  - Static centered crop
  - Face-detection based dynamic cropping
  - Average face position based cropping
- Maintains optimal 9:16 aspect ratio for shorts
- Automatic bottom margin cropping for better framing
- Supports both static and animated captions
Robust Caching System:
- SQLite database for efficient data management
- Caches processed videos, audio, and transcriptions
- Prevents redundant processing of previously handled content
- Easy cache management and cleanup
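
To make the cropping strategies listed above concrete, here is a minimal sketch of how a 9:16 window could be computed for the static centered crop and the average-face-position crop. The function name `crop_window_9x16` and the `face_centers_x` input are hypothetical; the project's `Components/FaceCrop.py` may compute its crop differently.

```python
def crop_window_9x16(frame_w, frame_h, face_centers_x=None):
    """Return (x, y, w, h) of a 9:16 crop window inside a frame.

    face_centers_x is an optional list of face-centre x coordinates
    (a hypothetical input, e.g. collected from a face detector).
    With no faces, the window is centred horizontally.
    """
    # Tallest 9:16 window that fits the frame.
    crop_w = int(frame_h * 9 / 16)
    crop_h = frame_h
    if crop_w > frame_w:
        # Frame is already narrower than 9:16, so limit by width instead.
        crop_w = frame_w
        crop_h = int(frame_w * 16 / 9)

    # Many video encoders expect even dimensions.
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2

    # Centre on the average face position if faces were given.
    if face_centers_x:
        center_x = sum(face_centers_x) / len(face_centers_x)
    else:
        center_x = frame_w / 2

    # Clamp so the window never leaves the frame.
    x = int(round(center_x - crop_w / 2))
    x = max(0, min(x, frame_w - crop_w))
    y = (frame_h - crop_h) // 2
    return x, y, crop_w, crop_h


if __name__ == "__main__":
    print(crop_window_9x16(1920, 1080))                # centred crop
    print(crop_window_9x16(1920, 1080, [1800, 1900]))  # faces near the right edge
```

Averaging the face position over the whole clip gives one fixed window, which avoids the jitter that per-frame face tracking can introduce.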

Prerequisites

Python 3.10 or higher
FFmpeg (latest version recommended)
CUDA-compatible GPU (optional, for faster processing)
4GB+ RAM recommended

Installation

Clone the repository:

git clone https://github.com/yourusername/AI-Youtube-Shorts-Generator.git
cd AI-Youtube-Shorts-Generator

Create and activate a virtual environment:

# Windows
python -m venv venv
venv\Scripts\activate

# Linux/MacOS
python3 -m venv venv
source venv/bin/activate

Install dependencies:
```
pip install -r requirements.txt
```
Set up environment variables: Create a .env file in the project root:
```
GOOGLE_API_KEY=your_google_ai_studio_key_here
```
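
As a rough illustration of how the `.env` file above can be read, the sketch below parses `KEY=VALUE` lines with the standard library only and checks that `GOOGLE_API_KEY` is set. The helper names `load_env_file` and `require_google_api_key` are hypothetical; the project itself may load the file with a different mechanism.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse KEY=VALUE lines and copy them into os.environ.

    Existing environment variables are not overwritten.
    Returns the dict of values found in the file.
    """
    loaded = {}
    env_path = Path(path)
    if not env_path.is_file():
        return loaded
    for raw in env_path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        value = value.strip().strip("'\"")
        loaded[key] = value
        os.environ.setdefault(key, value)
    return loaded


def require_google_api_key():
    """Return the API key, or raise if it is missing or still the placeholder."""
    key = os.environ.get("GOOGLE_API_KEY", "")
    if not key or key == "your_google_ai_studio_key_here":
        raise RuntimeError("GOOGLE_API_KEY is missing or still the placeholder value")
    return key
```

Calling `load_env_file()` followed by `require_google_api_key()` at startup surfaces a missing key before any video is downloaded or transcribed.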

Usage

Start the tool:
```
python main.py
```
Input either:
- A YouTube URL
- A path to a local video file
Select video quality when prompted (for YouTube downloads)
The tool will process your video through several stages:
- Download/import video
- Extract and transcribe audio
- Identify engaging segments
- Create vertical crops
- Add captions
- Generate final shorts
Find your processed shorts in the shorts directory
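
The first step above accepts either a YouTube URL or a local file path. One way to tell them apart is sketched below; `classify_input` and the `YOUTUBE_HOSTS` set are illustrative assumptions, not the logic used in `main.py`.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hostnames treated as YouTube here (an assumption for this sketch).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def classify_input(user_input):
    """Return 'youtube', 'local', or 'unknown' for a user-supplied string."""
    text = user_input.strip().strip('"')
    parsed = urlparse(text)
    host = parsed.hostname or ""
    if parsed.scheme in ("http", "https") and host in YOUTUBE_HOSTS:
        return "youtube"
    # Anything else is only accepted if it points at an existing file.
    if Path(text).is_file():
        return "local"
    return "unknown"
```

Checking the hostname rather than searching for the substring "youtube" avoids misclassifying a local file such as `youtube_talk.mp4`.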

Configuration Options

USE_ANIMATED_CAPTIONS: Toggle between static and animated captions in main.py (recommended)
SHORTS_DIR: Customize output directory for processed videos
CPU threads: Tune the thread settings used for transcription in Components/Transcription.py
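
For orientation, the sketch below shows how a CPU thread count can be passed to faster-whisper together with the `base.en` model, int8 quantization, and word-level timestamps mentioned under Key Features. The helpers `default_cpu_threads`, `segments_to_dicts`, and `transcribe` are hypothetical names for illustration; the actual code in `Components/Transcription.py` may be organized differently.

```python
import os


def default_cpu_threads(reserve=1):
    """Use all logical cores except `reserve`, but never fewer than one."""
    return max(1, (os.cpu_count() or 1) - reserve)


def segments_to_dicts(segments):
    """Convert faster-whisper segments into plain dicts with word timestamps."""
    result = []
    for seg in segments:
        words = [
            {"word": w.word, "start": w.start, "end": w.end}
            for w in (seg.words or [])
        ]
        result.append(
            {"start": seg.start, "end": seg.end, "text": seg.text.strip(), "words": words}
        )
    return result


def transcribe(audio_path, cpu_threads=None):
    """Transcribe an audio file on the CPU with int8 quantization."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    model = WhisperModel(
        "base.en",
        device="cpu",
        compute_type="int8",
        cpu_threads=cpu_threads or default_cpu_threads(),
    )
    segments, _info = model.transcribe(audio_path, word_timestamps=True)
    return segments_to_dicts(segments)
```

Leaving one core free is a judgment call in this sketch; on a dedicated machine, using every core may be faster.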

Project Structure

AI-Youtube-Shorts-Generator/
├── Components/
│   ├── Captions.py          # Caption generation and rendering
│   ├── Database.py          # SQLite database management
│   ├── Edit.py              # Video editing and processing
│   ├── FaceCrop.py          # Vertical cropping algorithms
│   ├── LanguageTasks.py     # AI content analysis
│   ├── Speaker.py           # Speaker detection (experimental)
│   ├── Transcription.py     # Audio transcription
│   └── YoutubeDownloader.py # Video download handling
├── main.py                  # Main execution script
├── requirements.txt         # Python dependencies
└── .env                     # Environment variables

Database Schema

The SQLite database (video_processing.db) contains three main tables:

videos:
- id (PRIMARY KEY)
- youtube_url
- local_path
- audio_path
- created_at
transcriptions:
- id (PRIMARY KEY)
- video_id (FOREIGN KEY)
- transcription_data
- created_at
highlights:
- id (PRIMARY KEY)
- video_id (FOREIGN KEY)
- start_time
- end_time
- output_path
- segment_text
- caption_with_hashtags
- created_at
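
The three tables above could be created with SQL along the following lines. The column names come from the lists above, but the column types, defaults, and the `init_db` helper are assumptions made for this sketch; the authoritative definitions live in `Components/Database.py`.

```python
import sqlite3

# Column types and defaults are assumed; only the names come from the README.
SCHEMA = """
CREATE TABLE IF NOT EXISTS videos (
    id INTEGER PRIMARY KEY,
    youtube_url TEXT,
    local_path TEXT,
    audio_path TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS transcriptions (
    id INTEGER PRIMARY KEY,
    video_id INTEGER REFERENCES videos(id),
    transcription_data TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS highlights (
    id INTEGER PRIMARY KEY,
    video_id INTEGER REFERENCES videos(id),
    start_time REAL,
    end_time REAL,
    output_path TEXT,
    segment_text TEXT,
    caption_with_hashtags TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path="video_processing.db"):
    """Open the database and create the tables if they do not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn
```

Passing `":memory:"` as the path gives a throwaway database, which is convenient for experimenting with queries without touching the real cache.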

Known Issues & Limitations

Face Detection:
- Face-based cropping can be inconsistent when multiple faces are in the frame
- May need manual adjustment for optimal framing in some cases
Speaker Detection:
- Current implementation uses basic voice activity detection
- Full speaker diarization not yet implemented
Resource Usage:
- Processing long videos can be memory-intensive
- GPU acceleration limited to specific components

Troubleshooting

If facing cache-related issues:
- Delete video_processing.db to clear the cache
- Remove temporary files in the videos directory
For video processing errors:
- Ensure FFmpeg is properly installed and accessible
- Check available disk space for temporary files
- Verify input video format compatibility
For AI-related issues:
- Confirm Google API key is valid and has sufficient quota
- Check internet connectivity for API calls
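
Several of the checks above can be automated before a run. The sketch below uses only the standard library; `check_environment` and its 1 GB default threshold are hypothetical choices, not part of the project.

```python
import os
import shutil


def check_environment(work_dir=".", min_free_gb=1.0):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []

    # FFmpeg must be discoverable on PATH.
    if shutil.which("ffmpeg") is None:
        problems.append("ffmpeg not found on PATH")

    # Temporary video and audio files need free disk space.
    free_gb = shutil.disk_usage(work_dir).free / 1024**3
    if free_gb < min_free_gb:
        problems.append(f"low disk space: {free_gb:.1f} GB free, {min_free_gb} GB wanted")

    # The Gemini calls need an API key in the environment.
    if not os.environ.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_environment():
        print("WARNING:", problem)
```

This only verifies that the key is present; whether it is valid and has quota left can only be confirmed by an actual API call.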

Contributing

Contributions are welcome! Please:

Fork the repository
Create a feature branch
Commit your changes
Push to your branch
Create a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

SQL integration made by YassineKADER
Original project by SamurAIGPT
Uses Google's Gemini AI for content analysis
Powered by faster-whisper for transcription
