🚀 ParaCodex: A Profiling-Guided Autonomous Coding Agent for Reliable Parallel Code Generation and Translation
A comprehensive framework for translating benchmark code between serial and parallel implementations (OpenMP, CUDA) and between different parallel programming models using AI agents, with automated performance testing and correctness verification. Supports Rodinia, NAS, HeCBench, and custom benchmarks.
This repository implements a complete pipeline for:
- 🔄 Code Translation: Converting between serial C/C++ code and parallel implementations (OpenMP, CUDA) using AI agents (see the sketch after this list)
- 🔀 Cross-Parallel Translation: Translating between different parallel programming models (e.g., CUDA to OpenMP)
- ⚡ Performance Optimization: Multi-stage optimization with GPU offloading and profiling
- ✅ Correctness Verification: Automated testing to ensure numerical equivalence
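To make the translation direction concrete, here is a minimal hand-written sketch of the kind of serial-to-OpenMP transformation the pipeline targets; the `saxpy` kernel and all names in it are illustrative assumptions, not taken from any of the supported benchmarks:

```c
#include <stdio.h>
#include <stdlib.h>

/* Serial version: a plain loop, as it might appear in a reference kernel. */
void saxpy_serial(float a, const float *x, float *y, int n) {
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Translated version: the same loop offloaded to the GPU with OpenMP.
 * The map() clauses move the arrays between host and device memory. */
void saxpy_omp(float a, const float *x, float *y, int n) {
    #pragma omp target teams distribute parallel for \
        map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

int main(void) {
    enum { N = 1 << 20 };
    float *x  = malloc(N * sizeof *x);
    float *y1 = malloc(N * sizeof *y1);
    float *y2 = malloc(N * sizeof *y2);
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y1[i] = y2[i] = 2.0f; }
    saxpy_serial(2.0f, x, y1, N);
    saxpy_omp(2.0f, x, y2, N);     /* build with, e.g., nvc -mp=gpu saxpy.c */
    printf("serial: %f, omp: %f\n", y1[0], y2[0]);  /* both should be 4.0 */
    free(x); free(y1); free(y2);
    return 0;
}
```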
```
paracodex/
├── pipeline/ # Core translation and optimization pipeline
│ ├── initial_translation_codex.py # Initial code translation using AI
│ ├── optimize_codex.py # Multi-stage optimization pipeline
│ ├── supervisor_codex.py # Correctness verification agent
│ ├── path_config.py # Path configuration and Codex CLI helpers
│ ├── SERIAL_OMP_PROMPTS.md # AI prompts for serial-to-OpenMP translation
│ ├── CUDA_PROMPTS.md # AI prompts for CUDA translation
│ ├── combined_serial_filenames.jsonl # Serial kernel listings
│ ├── combined_omp_filenames.jsonl # OpenMP kernel listings
│ ├── combined_cuda_filenames.jsonl # CUDA kernel listings
│ └── combined_omp_pareval_filenames.jsonl # ParEval benchmark listings
├── performance_testers/ # Performance testing and benchmarking tools
│ └── performance_comparison.py # Performance comparison utilities
├── utils/ # Utility scripts
│ └── clean_kernel_dirs.py # Cleanup utilities
├── workdirs/ # Working directories for different benchmarks
│ ├── serial_omp_rodinia_workdir/ # Rodinia benchmark workspace
│ │ ├── data/ # Source code and benchmarks (parallel versions)
│ │ │ └── src/ # Kernel directories (e.g., nw-omp, lud-omp)
│ │ ├── gate_sdk/ # GATE SDK for correctness verification
│ │ ├── golden_labels/ # Reference serial implementations
│ │ │ └── src/ # Serial kernel directories
│ │ └── serial_kernels_changedVars/ # Transformed serial kernels
│ │     └── src/ # Modified serial kernels for translation
│ ├── serial_omp_nas_workdir/ # NAS benchmark workspace
│ ├── serial_omp_hecbench_workdir/ # HeCBench workspace
│ └── cuda_omp_pareval_workdir/ # ParEval CUDA/OpenMP workspace
├── results/ # Results and performance data
├── setup_environment.sh # Main environment setup script
├── install_nvidia_hpc_sdk.sh # Automated NVIDIA HPC SDK installer
├── verify_environment.sh # Environment verification tool
├── kill_gpu_processes.py/sh # GPU process management utilities
└── requirements.txt # Python dependencies
```
- Multi-Agent Pipeline: Specialized AI agents for translation, optimization, and verification
- Serial-to-Parallel Translation: Converting serial code to OpenMP and CUDA implementations
- Cross-Parallel Translation: Translating between different parallel programming models
- Intelligent Analysis: Automatic hotspot identification and offload target selection
- GPU Offloading: Automatic translation to OpenMP with GPU acceleration
- CUDA Implementation: Direct CUDA kernel generation and optimization
- 2-Stage Process: Systematic optimization from correctness to performance (GPU offload + performance tuning)
- GPU Profiling: Integration with NVIDIA Nsight Systems (nsys) for detailed analysis
- Retry Mechanisms: Robust error handling with automatic retry logic
- Performance Tracking: Continuous monitoring of optimization progress
- Cyclic Optimization: Iterative refinement until target performance is achieved
- GATE SDK Integration: Automated numerical correctness checking
- Reference Comparison: Validation against golden reference implementations
- Supervisor Agent: AI-powered code repair and correctness enforcement
- Numerical Equivalence: Ensures translated code produces numerically equivalent results (see the sketch after this feature list)
- Comprehensive Benchmarking: CPU and GPU performance testing
- Performance Comparison: Side-by-side analysis of different implementations
- Results Visualization: JSON output for easy integration with analysis tools
- Automated Testing: Batch processing of multiple kernels and configurations
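The GATE SDK's actual interface is not reproduced here; the following hand-written sketch only illustrates the kind of element-wise tolerance comparison that numerical-equivalence checking performs between a translated kernel's output and the golden reference (the function name and tolerance scheme are assumptions, not GATE SDK defaults):

```c
#include <math.h>
#include <stdio.h>

/* Illustrative only: report whether a candidate array matches the golden
 * reference within relative/absolute tolerances. Returns 1 on success,
 * 0 on the first mismatch. */
int check_equivalence(const double *candidate, const double *golden,
                      int n, double rtol, double atol) {
    for (int i = 0; i < n; i++) {
        double diff = fabs(candidate[i] - golden[i]);
        if (diff > atol + rtol * fabs(golden[i])) {
            fprintf(stderr, "mismatch at index %d: %g vs %g\n",
                    i, candidate[i], golden[i]);
            return 0;
        }
    }
    return 1;
}
```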
- Node.js 22+ and npm: For Codex CLI
- Python 3.8+: For the pipeline scripts
- OpenAI Codex CLI: For AI agent interactions (`npm install -g @openai/codex`)
- OpenAI API Access:
  - Pro/Plus users: Log in directly with `codex login` (recommended)
  - API key users: Set the `OPENAI_API_KEY` environment variable
- NVIDIA HPC SDK: For OpenMP GPU offloading (nvc++ compiler)
- CUDA Toolkit: For CUDA development
- NVIDIA Nsight Systems (nsys): For GPU profiling
- NVIDIA GPU: With OpenMP offloading support
We provide automated setup scripts to streamline the installation process:
```bash
# Clone the repository
git clone <repository-url>
cd paracodex

# Make scripts executable
chmod +x setup_environment.sh install_nvidia_hpc_sdk.sh verify_environment.sh

# Step 1: Install NVIDIA HPC SDK (if not already installed)
sudo ./install_nvidia_hpc_sdk.sh

# Step 2: Run the main environment setup
./setup_environment.sh

# Step 3: Verify your installation
./verify_environment.sh
```

**1. install_nvidia_hpc_sdk.sh - NVIDIA HPC SDK Automated Installer**
This script automates the download and installation of NVIDIA HPC SDK v25.7:
- ✅ Downloads NVIDIA HPC SDK (~3GB)
- ✅ Verifies system requirements (10GB disk space, x86_64 architecture)
- ✅ Installs compilers (nvc++, nvfortran) with OpenMP GPU offload support
- ✅ Configures PATH and environment variables in `~/.bashrc`
- ✅ Tests OpenMP CPU and GPU offload support
⚠️ Requires sudo privileges
```bash
sudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrc  # Activate the new environment
```

**2. setup_environment.sh - Main Environment Setup Script**
Checks and installs all ParaCodex dependencies:
- ✅ Check for Node.js v22+ and npm
- ✅ Install Codex CLI (`@openai/codex`) if missing
- ✅ Verify Python 3.8+ installation
- ✅ Install Python dependencies from `requirements.txt`
- ✅ Check for NVIDIA HPC SDK (nvc++)
- ✅ Check for Nsight Systems (nsys)
- ✅ Verify GPU accessibility
- ✅ Confirm OpenAI API key is set
- 📊 Provides summary of missing dependencies
```bash
./setup_environment.sh
```

**3. verify_environment.sh - Environment Verification Tool**
Comprehensive check of your ParaCodex environment:
- ✅ Verifies all required tools are installed and accessible
- ✅ Shows version information for each component
- ✅ Tests OpenMP support
- ✅ Checks GPU accessibility with detailed info
- ✅ Validates Python package installations
- 📊 Provides a summary of any missing dependencies
```bash
./verify_environment.sh
```

If you prefer manual installation or need to install specific components:

**Option A: Automated (Recommended)**

```bash
sudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrc
```

**Option B: Manual**
- Download from: https://developer.nvidia.com/hpc-sdk
- Install version 25.7 for CUDA 12.6 support
- Add to PATH:

```bash
export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin:$PATH
export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/lib:$LD_LIBRARY_PATH
export MANPATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/man:$MANPATH
```

- Add to `~/.bashrc` for persistence
```bash
# Using nvm (recommended)
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.0/install.sh | bash
source ~/.bashrc
nvm install 22

# Or using apt
curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
sudo apt-get install -y nodejs
```

Then install the Codex CLI:

```bash
npm install -g @openai/codex
```

**Option 1: Login with OpenAI Pro Account (Recommended)**
```bash
# Interactive login (opens browser for authentication)
codex login

# Or login with API key from environment
echo $OPENAI_API_KEY | codex login --with-api-key

# Verify login status
codex login status
```

**Option 2: Use API Key Environment Variable**
```bash
export OPENAI_API_KEY='your-api-key-here'

# Add to ~/.bashrc for persistence
echo "export OPENAI_API_KEY='your-api-key-here'" >> ~/.bashrc
```

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

For NVIDIA Nsight Systems (nsys):

- Download from: https://developer.nvidia.com/nsight-systems
- Or use a package manager if available
After setup, verify your environment is correctly configured:
```bash
./verify_environment.sh
```

This will check and display:
- ✅ Node.js and npm versions
- ✅ Codex CLI installation
- ✅ Python version and key packages (openai, numpy, torch, matplotlib)
- ✅ NVIDIA HPC SDK (nvc++) version
- ✅ Nsight Systems (nsys) installation
- ✅ CUDA toolkit (optional)
- ✅ GPU information (name, driver, memory)
- ✅ OpenMP support
- ✅ OpenAI API key configuration
Expected output for a properly configured system:

```
✅ All core dependencies are installed and configured
```

If you see missing dependencies:

```
❌ X core dependencies are missing
Run './setup_environment.sh' or see installation instructions
```
Manual verification commands:

```bash
codex --version
python3 --version
nvc++ --version
nsys --version
nvidia-smi
echo $OPENAI_API_KEY
```

Key Python packages (see `requirements.txt` for the full list):

- `openai>=2.8.1` - OpenAI API client
- `pandas>=2.3.3` - Data processing
- `numpy>=2.1.2` - Numerical computing
- `matplotlib>=3.10.6` - Visualization
- `seaborn>=0.13.2` - Statistical visualization
- `tqdm>=4.67.1` - Progress bars
- NVIDIA CUDA libraries for GPU support
After installation, ensure your working directory is properly configured:
```bash
# The workdirs contain benchmark-specific source code and configurations
# For Rodinia benchmarks, use workdirs/serial_omp_rodinia_workdir/

# Ensure proper directory structure:
# - data/src/ containing parallel kernel directories (e.g., nw-omp, lud-omp)
# - golden_labels/src/ containing serial reference implementations
# - serial_kernels_changedVars/src/ containing transformed serial kernels (optional)

# Ensure a jsonl file with kernel names is present in pipeline/combined_*_filenames.jsonl
# Ensure golden_labels/src exists if you want to run the supervisor agent
```

ParaCodex requires you to specify the working directory path when running pipeline scripts. Here are the paths you need to configure:
**1. Codex Working Directory (`--codex-workdir`)**
This is the main path you need to set when running translation scripts. It points to the benchmark-specific workdir:
```bash
# For Rodinia benchmarks
--codex-workdir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/

# For NAS benchmarks
--codex-workdir /path/to/paracodex/workdirs/serial_omp_nas_workdir/

# For HeCBench benchmarks
--codex-workdir /path/to/paracodex/workdirs/serial_omp_hecbench_workdir/

# For ParEval benchmarks
--codex-workdir /path/to/paracodex/workdirs/cuda_omp_pareval_workdir/
```

**Alternative: Environment Variable**
You can also set the `CODEX_WORKDIR` environment variable instead of using the `--codex-workdir` flag:
```bash
export CODEX_WORKDIR=/path/to/paracodex/workdirs/serial_omp_rodinia_workdir/
python pipeline/initial_translation_codex.py --source-api serial --target-api omp
```

**2. Makefile Paths (`GATE_ROOT`)**
The Makefiles in each kernel directory use `GATE_ROOT` to locate the GATE SDK for correctness checking. It is set automatically using a relative path:

```make
GATE_ROOT ?= $(abspath ../..)
```

This means `GATE_ROOT` is resolved relative to the kernel directory, landing two levels up at the workdir root (e.g., `workdirs/serial_omp_rodinia_workdir/`). You typically don't need to change this unless you have a custom directory structure.
If you need to override it, you can set it when running make:
```bash
make -f Makefile.nvc GATE_ROOT=/custom/path/to/workdir
```

**3. Output Directories**
Translation outputs are saved to directories relative to the pipeline directory:
- Rodinia: `pipeline/rodinia_outputs/`
- NAS: `pipeline/nas_outputs/`
- HeCBench: `pipeline/hecbench_outputs/`
- ParEval: `pipeline/pareval_outputs/`
These paths are automatically determined based on the workdir name. You don't need to configure these.
Quick Path Setup Example:
```bash
# Set your repository path
export PARACODEX_ROOT="/path/to/paracodex"

# For Rodinia benchmarks
python pipeline/initial_translation_codex.py \
  --codex-workdir $PARACODEX_ROOT/workdirs/serial_omp_rodinia_workdir/ \
  --source-api serial \
  --target-api omp \
  --kernels nw
```

Translate Rodinia benchmarks from serial to OpenMP:
```bash
# To translate all the kernels in the jsonl file, run without --kernels
# Serial to OpenMP for Rodinia benchmarks
python pipeline/initial_translation_codex.py \
  --codex-workdir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/ \
  --source-api serial \
  --target-api omp \
  --kernels nw,srad,lud
```

Note: The `--codex-workdir` flag specifies the working directory containing your benchmark kernels. The output is saved to `pipeline/rodinia_outputs/` (or the appropriate benchmark output directory) by default.
```bash
# Serial to OpenMP with optimization
python pipeline/initial_translation_codex.py \
  --codex-workdir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/ \
  --source-api serial \
  --target-api omp \
  --optimize
```

```bash
# Serial to OpenMP with supervision for correctness verification
python pipeline/initial_translation_codex.py \
  --codex-workdir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/ \
  --source-api serial \
  --target-api omp \
  --supervise
```

This will create `initial_supervised_*` files in the output directory, including:

- `initial_supervised_nsys_output.txt` - Full nsys profiling output
- `initial_supervised_nsys_relevant.txt` - Extracted GPU performance metrics
- `initial_supervised_compilation.txt` - Compilation logs
- `initial_supervised_output.txt` - Execution output
```bash
# Serial to OpenMP with optimization and supervision after optimization steps
python pipeline/initial_translation_codex.py \
  --codex-workdir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/ \
  --source-api serial \
  --target-api omp \
  --optimize \
  --supervise \
  --opt-supervisor-steps 2
```

This will:
- Perform initial translation
- Run supervision (correctness verification)
- Run optimization steps (step1, step2)
- Run supervision after the specified optimization steps (creates a `step2_supervised/` directory)
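As a flavor of what an optimization step typically changes, here is a hand-written before/after sketch of one common performance-tuning transformation, hoisting per-iteration data transfers into a persistent device data region; the stencil kernel is a hypothetical example, not taken from the pipeline's transcripts:

```c
/* Before (correct but slow): both arrays are copied to and from the
 * device on every outer iteration. */
void relax_naive(float *a, float *b, int n, int iters) {
    for (int t = 0; t < iters; t++) {
        #pragma omp target teams distribute parallel for \
            map(tofrom: a[0:n], b[0:n])
        for (int i = 1; i < n - 1; i++)
            b[i] = 0.5f * (a[i - 1] + a[i + 1]);
        for (int i = 1; i < n - 1; i++)  /* copy-back on the host */
            a[i] = b[i];
    }
}

/* After: a target data region keeps both arrays resident on the device
 * across all iterations; only the final result is copied back at the end. */
void relax_tuned(float *a, float *b, int n, int iters) {
    #pragma omp target data map(tofrom: a[0:n]) map(alloc: b[0:n])
    {
        for (int t = 0; t < iters; t++) {
            #pragma omp target teams distribute parallel for
            for (int i = 1; i < n - 1; i++)
                b[i] = 0.5f * (a[i - 1] + a[i + 1]);
            #pragma omp target teams distribute parallel for
            for (int i = 1; i < n - 1; i++)
                a[i] = b[i];
        }
    }
}
```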
```
pipeline/rodinia_outputs/
├── {kernel_name}-{target_api}/ # Per-kernel results (e.g., nw-omp, lud-omp)
│ ├── compilation_result.txt # Initial compilation result
│ ├── initial_compilation.txt # Initial compilation result
│ ├── initial_transcript.txt # Initial translation transcript
│ ├── initial_transcript_summary.txt # Summary of initial translation
│ ├── {file}_initial.c # Initial translated code (root level)
│ ├── initial/ # Initial translation directory
│ │ └── {file}.c # Initial translated code
│ ├── initial_supervised_nsys_output.txt # Full nsys profiling output (if --supervise)
│ ├── initial_supervised_nsys_relevant.txt # Extracted GPU metrics (if --supervise)
│ ├── initial_supervised_compilation.txt # Compilation logs (if --supervise)
│ ├── initial_supervised_output.txt # Execution output (if --supervise)
│ ├── initial_correct/ # After supervisor correction
│ │ └── {file}.c # Supervised initial code
│ ├── step1/ # Optimization step 1
│ │ ├── {file}.c # Code after step 1
│ │ ├── transcript.txt # AI agent transcript
│ │ ├── transcript_summary.txt # Transcript summary
│ │ ├── nsys_output.txt # Full nsys profiling output
│ │ └── nsys_relevant.txt # Extracted relevant nsys metrics
│ ├── step2/ # Optimization step 2
│ │ ├── {file}.c # Code after step 2
│ │ ├── transcript.txt # AI agent transcript
│ │ ├── transcript_summary.txt # Transcript summary
│ │ ├── nsys_output.txt # Full nsys profiling output
│ │ └── nsys_relevant.txt # Extracted relevant nsys metrics
│ ├── step2_supervised/ # Supervision after step 2 (if --opt-supervisor-steps 2)
│ │ ├── {file}.c # Supervised code after step 2
│ │ ├── supervised_nsys_output.txt # Full nsys profiling output
│ │ └── supervised_nsys_relevant.txt # Extracted GPU metrics
│ └── optimized/ # Final optimized code
│     └── {file}.c # Final optimized code
├── {kernel2_name}-{target_api}/ # Results for second kernel
│ └── [same structure as above]
└── [additional kernels...] # Results for other kernels
```
Key Artifacts Explained:
- Source Code Snapshots: Versioned code at each optimization stage (in the `initial/`, `step1/`, `step2/`, and `optimized/` directories)
- Transcripts: AI agent conversations and decision logs (`initial_transcript.txt`, `step*/transcript.txt`)
- Nsys Outputs: GPU profiling data (`step*/nsys_output.txt`, `initial_supervised_nsys_output.txt`) and extracted metrics (`step*/nsys_relevant.txt`, `initial_supervised_nsys_relevant.txt`)
- Supervised Files: Correctness-verified code and performance metrics from the supervision phase
```bash
python performance_testers/performance_comparison.py \
  --candidate_dir <your path to the parent directory of translated code> \
  --reference_dir <parent directory of golden label parallel code> \
  --output <output directory for generated artifacts>
```

Example for Rodinia:

```bash
python performance_testers/performance_comparison.py \
  --candidate_dir /path/to/paracodex/pipeline/rodinia_outputs \
  --reference_dir /path/to/paracodex/workdirs/serial_omp_rodinia_workdir/data/src \
  --output /path/to/paracodex/results/perf_rodinia_nsys
```

The framework supports the following translation paths across multiple benchmark suites:
- Serial → OpenMP: Converting serial code to OpenMP with GPU offloading (primary use case for Rodinia, NAS, HeCBench)
- Serial → CUDA: Converting serial code to CUDA kernels
- OpenMP → CUDA: Translating OpenMP code to CUDA implementations
- CUDA → OpenMP: Converting CUDA kernels to OpenMP with GPU offloading (ParEval benchmarks; see the sketch below)
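To make the CUDA → OpenMP direction concrete, here is a hand-written sketch (the CUDA original is shown in a comment) of how a typical thread-indexed kernel maps onto OpenMP target offload; the `scale` kernel is an illustrative assumption, not a ParEval problem:

```c
/* CUDA original, for reference:
 *
 *   __global__ void scale(float *v, float s, int n) {
 *       int i = blockIdx.x * blockDim.x + threadIdx.x;
 *       if (i < n) v[i] *= s;
 *   }
 *   // launch: scale<<<(n + 255) / 256, 256>>>(d_v, s, n);
 *
 * OpenMP translation: the explicit grid/block index computation disappears;
 * `teams distribute parallel for` spreads the iterations across the GPU,
 * and map() replaces the cudaMalloc/cudaMemcpy boilerplate. */
void scale_omp(float *v, float s, int n) {
    #pragma omp target teams distribute parallel for map(tofrom: v[0:n])
    for (int i = 0; i < n; i++)
        v[i] *= s;
}
```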
For Rodinia benchmarks, the typical workflow is:
- Prepare serial kernels: Place serial reference implementations in `workdirs/serial_omp_rodinia_workdir/golden_labels/src/{kernel}-serial/`
- Optional transformations: Apply variable renaming, comment stripping, and reordering to create `serial_kernels_changedVars/src/{kernel}-serial/`
- Run translation: Use `initial_translation_codex.py` with `--codex-workdir` pointing to `workdirs/serial_omp_rodinia_workdir/`
- Verify correctness: Use the `--supervise` flag to run GATE SDK correctness checks
- Optimize performance: Use the `--optimize` flag for multi-stage GPU optimization
- Review metrics: Check the `*_nsys_relevant.txt` files for GPU performance metrics
For NAS benchmarks, use `workdirs/serial_omp_nas_workdir/` as the working directory with similar workflow steps.
For HeCBench benchmarks, use `workdirs/serial_omp_hecbench_workdir/` as the working directory.
For ParEval CUDA/OpenMP translation, use `workdirs/cuda_omp_pareval_workdir/` as the working directory.
ParaCodex includes several utility scripts to simplify environment management and troubleshooting:
Main environment setup and dependency checker. Checks for all required tools and installs missing Python dependencies.
Usage:

```bash
./setup_environment.sh
```

What it does:
- Checks Node.js, npm, and Codex CLI
- Installs Codex CLI if npm is available
- Verifies Python and pip
- Installs Python packages from `requirements.txt`
- Checks NVIDIA HPC SDK, Nsight Systems, and GPU
- Validates OpenAI API key
- Provides summary of missing dependencies
Automated installer for NVIDIA HPC SDK v25.7 with OpenMP GPU offload support.
Usage:

```bash
sudo ./install_nvidia_hpc_sdk.sh
source ~/.bashrc
```

Features:
- Downloads NVIDIA HPC SDK v25.7 (~3GB)
- Checks system requirements (10GB disk, x86_64 arch)
- Installs to `/opt/nvidia/hpc_sdk`
- Configures environment variables automatically
- Tests OpenMP CPU and GPU offload support
- Colorful progress indicators
Requirements: sudo access, 10GB free disk space
Comprehensive environment verification tool that checks all dependencies and displays detailed version information.
Usage:

```bash
./verify_environment.sh
```

Checks:
- Node.js, npm, Codex CLI versions
- Python version and key packages
- NVIDIA HPC SDK (nvc++) and OpenMP support
- Nsight Systems and CUDA toolkit
- GPU info (name, driver, memory)
- OpenAI API key configuration
Utilities to terminate GPU processes when needed (e.g., clearing hung processes during development).
Usage:

```bash
./kill_gpu_processes.sh
# or
python3 kill_gpu_processes.py
```

**clean_kernel_dirs.py**

Cleans up generated files in kernel directories during development.
Usage:

```bash
python3 utils/clean_kernel_dirs.py --help
```

For questions and support:
- Create an issue in the repository
- Check the prompts documentation:
  - `pipeline/SERIAL_OMP_PROMPTS.md` for serial-to-OpenMP translation
  - `pipeline/CUDA_PROMPTS.md` for CUDA-related translations
Using the automated scripts:
If `setup_environment.sh` reports missing dependencies, install them as indicated. Common issues:
- **NVIDIA HPC SDK not found:**

  ```bash
  sudo ./install_nvidia_hpc_sdk.sh
  source ~/.bashrc
  ```

  If installation fails:
  - Check disk space (need 10GB+)
  - Verify you have sudo privileges
  - Check internet connection (downloads ~3GB)
  - Manually download from https://developer.nvidia.com/hpc-sdk

- **Codex CLI not found:**

  ```bash
  npm install -g @openai/codex
  ```

  If this fails, ensure Node.js and npm are installed first.

- **OpenAI API key not set:**

  ```bash
  export OPENAI_API_KEY='your-api-key'
  echo "export OPENAI_API_KEY='your-api-key'" >> ~/.bashrc
  source ~/.bashrc
  ```

- **GPU not accessible:**
  - Check with `nvidia-smi`
  - Install NVIDIA drivers if needed: `sudo apt install nvidia-driver-XXX`
  - Verify CUDA compatibility
  - Reboot after driver installation

- **nvc++ installed but not in PATH:**

  ```bash
  export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin:$PATH
  export LD_LIBRARY_PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/lib:$LD_LIBRARY_PATH
  echo 'export PATH=/opt/nvidia/hpc_sdk/Linux_x86_64/25.7/compilers/bin:$PATH' >> ~/.bashrc
  source ~/.bashrc
  ```

- **Python package installation fails:**

  ```bash
  pip install --upgrade pip
  pip install -r requirements.txt
  ```

  If specific packages fail, install them individually:

  ```bash
  pip install openai pandas numpy matplotlib seaborn tqdm
  ```
- Compilation errors: Ensure NVIDIA HPC SDK is properly installed and in PATH
- Permission denied on scripts: Run `chmod +x setup_environment.sh`
- Python import errors: Run `pip install -r requirements.txt`
Note: This framework is designed for research and development purposes with multiple benchmark suites. Ensure you have appropriate hardware (NVIDIA GPU with OpenMP offloading support) and software (NVIDIA HPC SDK, CUDA Toolkit) before use. The framework supports translation between serial, OpenMP, and CUDA programming models with automated correctness verification and performance optimization.
Supported Benchmarks: The framework is configured for multiple benchmark suites:
- Rodinia: nw (Needleman-Wunsch), srad (Speckle Reducing Anisotropic Diffusion), lud (LU Decomposition), b+tree, backprop, bfs, hotspot, and others
- NAS Parallel Benchmarks: Scientific computing kernels
- HeCBench: Heterogeneous computing benchmarks
- ParEval: Parallel evaluation benchmarks for CUDA/OpenMP translation