Add gpu-search and gpu-create commands with multi-provider support #276
Open
theFong wants to merge 13 commits into main from claude/gpu-create-retry-command-6JSva
Conversation
Introduces a new `brev gpu-search` command (also aliased as `brev gpu`, `brev gpus`, and `brev gpu-list`) that lets users search and filter GPU instance types from the Brev API. Features include:
- Filter by GPU name (case-insensitive partial match)
- Filter by minimum VRAM per GPU (in GB)
- Filter by minimum total VRAM (GPU count * VRAM)
- Filter by minimum GPU compute capability (e.g., 8.0 for Ampere)
- Sort by price, gpu-count, vram, total-vram, vcpu, type, or capability
- Support for ascending/descending sort order

The command displays results in a formatted table showing instance type, GPU name, count, VRAM per GPU, total VRAM, compute capability, vCPUs, and hourly price. Includes comprehensive unit tests for filtering, sorting, and data processing.
Add capability mappings for:
- RTXPro6000 (12.0)
- B200 and RTX5090 (10.0, Blackwell)
- RTX6000Ada, RTX4000Ada (8.9, Ada Lovelace)
- A6000, A5000, A4000 (8.6, Ampere)
- RTX6000 (7.5, Turing)
- M60 (5.2, Maxwell)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Only show NVIDIA GPUs (exclude AMD Radeon, Intel Gaudi, etc.) since compute capability is NVIDIA-specific. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
This command creates GPU instances with automatic retry across multiple instance types. Key features:
- Accept instance types from the --type flag or piped from 'brev gpus'
- Create multiple instances with the --count flag
- Parallel creation attempts with the --parallel flag
- Automatic cleanup of extra instances beyond the requested count
- Detached mode that does not wait for instances to be ready

Usage examples:
  brev gpu-create --name my-instance --type g5.xlarge
  brev gpus --min-vram 24 | brev gpu-create --name my-instance
  brev gpus --gpu-name A100 | brev gpu-create --name cluster --count 3 --parallel 5
Previously gpu-create hardcoded workspaceGroupId to "GCP", causing
instance creation to fail for non-GCP providers (shadeform, nebius,
crusoe, lambda, etc.) with "instance type not found" errors.
This change:
- Adds GetAllInstanceTypesWithWorkspaceGroups API call to fetch instance
types with their associated workspace groups from the authenticated
endpoint (/api/instances/alltypesavailable/{orgId})
- Updates gpu-create to look up the correct workspaceGroupId for each
instance type before creating the workspace
- Adds WorkspaceGroup type and AllInstanceTypesResponse for the new API
- Adds provider filtering, disk size filtering, and enhanced display
columns to gpu-search command
Tested with: GCP, shadeform (massedcompute, hyperstack, vultr, scaleway),
nebius, crusoe, lambda, and devplane providers.
Adds a --startup-script (-s) flag to attach a startup script that runs when the instance boots. The script can be provided as:
- An inline string: --startup-script 'pip install torch'
- A file path (prefix with @): --startup-script @setup.sh
- An absolute file path: --startup-script @/path/to/setup.sh

Also improves the CLI documentation with examples of startup script usage.
Enable seamless command chaining between the provision, shell, and open commands:
- Add stdin support to `brev shell` and `brev open` to read instance names from piped input, enabling `brev provision | brev shell -c "cmd"`
- Add a `-c` flag to `brev shell` for running non-interactive commands with stdout/stderr piped back (supports inline commands and @filepath syntax)
- Fix error output to go to stderr instead of stdout for proper piping
- Add default GPU selection for `brev provision` when no type is specified (min 20GB VRAM, 500GB disk, capability 8.0+, boot time <7min)
- Skip version-check output for gpu-create/provision commands when piped
- Simplify gpusearch code by consolidating parseMemoryToGB/parseSizeToGB
- Export ProcessInstances, FilterInstances, and SortInstances from gpusearch
- Fix the duplicate-workspace-name error to fail immediately instead of retrying
- Remove dead code: unused -r/--remote and -d/--dir flags from brev shell
- Remove a debug print statement from workspace creation

Example workflows now supported:
  brev provision --name my-instance | brev shell -c "nvidia-smi"
  brev provision --name my-instance | brev open
  brev shell $(brev provision --name my-instance)  # interactive
Command renaming:
- Rename 'gpu-search' to 'search', keeping 'gpu-search' as an alias
- Rename 'gpu-create' to 'create', keeping 'gpu-create' as an alias
- Support a positional argument for the instance name: brev create my-instance
- Update all examples to use the new command names
- Keep backwards compatibility with the old command names as aliases

Search enhancements:
- Add a $/GB/MO column showing monthly disk storage pricing
- Extract the disk price from the API's price_per_gb_hr field (converted to monthly)
- Add a cloud field to distinguish the underlying cloud from the provider/aggregator
- Display cloud:provider format (e.g., hyperstack:shadeform) when they differ
- Auto-switch to JSON output when stdout is piped (for provision chaining)
- Add target_disk_gb to the JSON for passing --min-disk through the pipeline
- Add tests for extractCloud, extractDiskInfo, and cloud extraction
Both commands now support multiple instance names:

brev shell:
- Run commands on multiple instances: brev shell i1 i2 i3 -c "nvidia-smi"
- Read multiple instances from stdin (one per line)
- The interactive shell still supports only one instance

brev open:
- Open multiple instances in separate editor windows: brev open i1 i2 i3
- Read multiple instances from stdin (one per line)
- tmux still supports only one instance (requires interactive stdin)

Enables workflows like:
  brev create my-cluster --count 3 | brev shell -c "nvidia-smi"
  brev create my-cluster --count 3 | brev open
- Add a 'terminal' editor type: opens a new terminal window with SSH
- Update 'tmux' to open in a new terminal window with a tmux session
- Both now support multiple instances (each opens a new window)
- Add an --editor / -e flag for explicit editor selection

Cross-platform terminal support:
- macOS: Terminal.app (via osascript)
- Linux: gnome-terminal, konsole, or xterm (tried in order)
- WSL: Windows Terminal (wt.exe)
- Windows: Windows Terminal or cmd

Editor options:
- code - VS Code
- cursor - Cursor
- windsurf - Windsurf
- terminal - new terminal window with SSH
- tmux - new terminal window with SSH + tmux
In WSL, Windows .exe files cannot be executed directly; they need to go through cmd.exe. This fixes the "Exec format error" seen when running `brev open cursor` or similar commands in WSL. Changes:
- Add an isWSL() helper to detect Windows Subsystem for Linux
- Add wslPathToWindows() to convert /mnt/c/... paths to C:\...
- Add runWindowsExeInWSL() to run Windows executables via cmd.exe
- Update runVsCodeCommand, runCursorCommand, and runWindsurfCommand to detect WSL and use cmd.exe for Windows executables

This lets users in WSL seamlessly open VS Code, Cursor, and Windsurf installed on the Windows side.
Previously, parallel workers would each grab different instance types, leading to mixed-type usage even when capacity was available. Now the command tries to create ALL instances with the first type before moving to the next, resulting in more consistent instance configurations. Changes:
- Refactor parallel workers to focus on one type at a time
- Remove an unused context import
- Update the documentation to explain the new retry behavior, with examples
Summary
Adds new GPU commands for searching instance types and creating GPU instances with automatic retry logic across multiple providers.
gpu-search command
gpu-create command
Resolves the workspaceGroupId from the authenticated API.

Key Changes
- GetAllInstanceTypesWithWorkspaceGroups() API to fetch instance types with workspace groups
- gpu-create now correctly provisions instances on any provider (not just GCP)
- WorkspaceGroup type and AllInstanceTypesResponse for workspace group lookups
- gpu-search with provider and disk size filtering

Test plan
Tested provisioning and deletion with multiple providers:
- GCP (n1-standard-1:nvidia-tesla-t4:1, g2-standard-4:nvidia-l4:1)
- shadeform (massedcompute_A6000_plus, hyperstack_A4000x2, vultr_A16, scaleway_L4)
- nebius (gpu-l40s-a.1gpu-32vcpu-128gb)
- crusoe (l40s-48gb.2x)
- lambda (gpu_2x_b200_sxm6)
- devplane (g6.8xlarge)

All providers successfully created and deleted instances.