Add Crawl-delay Directive Support from robots.txt #1707
Summary
This PR adds support for respecting Crawl-delay directives from robots.txt files. When enabled, the crawler will automatically wait the specified delay between requests to the same domain, improving compliance with website policies and reducing the risk of being rate-limited or blocked.
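For illustration, here is a minimal usage sketch of the new flag. AsyncWebCrawler, CrawlerRunConfig, and arun_many() are existing crawl4ai names and respect_crawl_delay is the parameter introduced by this PR; the check_robots_txt setting and the exact call shape are assumptions, not part of this change.

```python
# Minimal usage sketch; assumes respect_crawl_delay is exposed on
# CrawlerRunConfig as described in this PR. Exact defaults may differ.
import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

async def main():
    config = CrawlerRunConfig(
        respect_crawl_delay=True,  # honor Crawl-delay from robots.txt (new in this PR)
        check_robots_txt=True,     # assumed prerequisite: robots.txt is fetched and checked
    )
    urls = ["https://example.com/a", "https://example.com/b"]
    async with AsyncWebCrawler() as crawler:
        # With the flag enabled, requests to the same domain are spaced by its Crawl-delay
        results = await crawler.arun_many(urls=urls, config=config)
        for result in results:
            print(result.url, result.success)

asyncio.run(main())
```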
Motivation
Many websites specify Crawl-delay directives in their robots.txt to indicate how long crawlers should wait between requests. Respecting this directive helps the crawler stay within each site's stated policies and avoid being rate-limited or blocked.
Changes
New Feature:
respect_crawl_delay configuration parameter
Files Modified:
async_configs.py - Added respect_crawl_delay parameter to CrawlerRunConfig
models.py - Added crawl_delay field to DomainState dataclass
utils.py - Added get_crawl_delay() method to RobotsParser
async_dispatcher.py - Enhanced RateLimiter to support crawl-delay (see the sketch after this list)
async_webcrawler.py - Wired up respect_crawl_delay in arun_many()
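To make the mechanism concrete, the sketch below shows one way a Crawl-delay value can be read from robots.txt (here via Python's standard urllib.robotparser) and enforced per domain. It does not reproduce the code added to utils.py or async_dispatcher.py; fetch_crawl_delay() and DomainThrottle are hypothetical names used only for illustration.

```python
# Standalone sketch of the crawl-delay idea; not this PR's implementation.
import asyncio
import time
import urllib.robotparser
from typing import Dict, Optional
from urllib.parse import urlparse

def fetch_crawl_delay(robots_url: str, user_agent: str = "*") -> Optional[float]:
    """Return the Crawl-delay advertised for user_agent, or None if absent."""
    parser = urllib.robotparser.RobotFileParser()
    parser.set_url(robots_url)
    parser.read()
    delay = parser.crawl_delay(user_agent)
    return float(delay) if delay is not None else None

class DomainThrottle:
    """Remembers the last request time per domain and sleeps to honor its delay."""

    def __init__(self) -> None:
        self._last_request: Dict[str, float] = {}

    async def wait(self, url: str, delay: Optional[float]) -> None:
        if not delay:
            return
        domain = urlparse(url).netloc
        now = time.monotonic()
        last = self._last_request.get(domain)
        if last is not None:
            remaining = delay - (now - last)
            if remaining > 0:
                await asyncio.sleep(remaining)
        self._last_request[domain] = time.monotonic()
```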
Files Added:
test_crawl_delay.py - Comprehensive test suite
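As an illustration of the kind of case such a suite can cover, here is a minimal pytest sketch that checks Crawl-delay parsing with the standard-library parser; it is not taken from the PR's test file.

```python
# Illustrative test only; not from tests/general/test_crawl_delay.py.
import urllib.robotparser

def test_crawl_delay_directive_is_parsed():
    parser = urllib.robotparser.RobotFileParser()
    parser.parse([
        "User-agent: *",
        "Crawl-delay: 5",
        "Allow: /",
    ])
    assert parser.crawl_delay("*") == 5
```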
Documentation Updates
Running Tests
python -m pytest tests/general/test_crawl_delay.py -v
Checklist: