Feature: Add built-in network test #874

jan-janssen · 2025-12-23T08:53:58Z

Summary by CodeRabbit

New Features
- Added executor initialization validation with timeout detection and enhanced error reporting for startup failures.

_{✏️ Tip: You can customize this high-level summary in your review settings.}

for more information, see https://pre-commit.ci

coderabbitai · 2025-12-23T08:54:13Z

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Walkthrough

A new internal self-test function is added to verify executor functionality immediately after initialization. The function submits a task, polls for completion with timeout, and raises an error with guidance if the task does not complete within the specified interval.

Changes

Cohort / File(s)	Summary
Executor Self-Test Addition `src/executorlib/executor/single.py`	Adds `_executor_self_test(exe)` function that validates executor task submission and completion. Imports `sleep` from `time` module. Function is invoked in two locations: after SingleNodeExecutor task scheduler construction and after creating the single-node executor. Implements sleep-based polling to wait for task completion and raises `TimeoutError` with guidance message if task does not complete within timeout window.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 66.67% which is insufficient. The required threshold is 80.00%.	You can run `@coderabbitai generate docstrings` to improve docstring coverage.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'Feature: Add built-in network test' directly aligns with the main change: introducing a self-test function that validates network connectivity in executors.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (2)

src/executorlib/executor/single.py (2)
459-463: Consider a more diagnostic error message.

The error message prescriptively suggests hostname_localhost=True as the solution, but timeout could occur for other reasons (network issues, resource contention, worker failures). Consider providing a more diagnostic message that describes what failed and suggests hostname_localhost=True as one possible solution rather than the definitive fix.
🔎 Suggested improvement
     if not f.done():
         exe.shutdown(wait=False, cancel_futures=False)
         raise TimeoutError(
-            'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).'
+            'Executor self-test timed out after 1 second. This may indicate network connectivity issues. '
+            'Try setting hostname_localhost=True in SingleNodeExecutor initialization if running on macOS or in HPC environments.'
         )
453-464: Consider optional refinements.

Two minor refinements to consider:

The function returns True but this value is never used (line 192 doesn't capture it). Consider removing the return statement or documenting why it returns a value.

The static analysis tool suggests avoiding long exception messages outside the exception class (TRY003). While this is a style preference, you could consider defining a custom exception class if this pattern is repeated elsewhere.

📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3281a55 and 4f1dc53.

📒 Files selected for processing (1)

src/executorlib/executor/single.py

🧰 Additional context used

🧬 Code graph analysis (1)

src/executorlib/executor/single.py (2)

src/executorlib/executor/base.py (2)

submit (71-108)

shutdown (110-126)

src/executorlib/task_scheduler/file/shared.py (1)

done (40-48)

🪛 Ruff (0.14.10)

src/executorlib/executor/single.py

461-463: Avoid specifying long messages outside the exception class

(TRY003)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
GitHub Check: unittest_flux_openmpi
GitHub Check: unittest_openmpi (macos-latest, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
GitHub Check: unittest_openmpi (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_mpich (macos-latest, 3.13)
GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
GitHub Check: unittest_slurm_mpich
GitHub Check: unittest_flux_mpich
GitHub Check: notebooks
GitHub Check: notebooks_integration
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
GitHub Check: unittest_win
GitHub Check: unittest_old
GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)

🔇 Additional comments (2)

src/executorlib/executor/single.py (2)

1-1: LGTM!

The import of sleep is necessary for the polling mechanism in the self-test function.

456-458: Verify the timeout duration is sufficient.

The self-test uses a 1-second timeout (10 iterations × 0.1s). Ensure this duration is sufficient for various environments, including slower systems, busy clusters, or CI/CD environments where resources may be constrained.

coderabbitai · 2025-12-23T08:58:03Z

src/executorlib/executor/single.py

                    log_obj_size=log_obj_size,
                )
            )
+        _executor_self_test(exe=self._task_scheduler)


⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, find the file and check its size find . -name "single.py" -path "*/executor/*" | head -5

Repository: pyiron/executorlib

Length of output: 98

🏁 Script executed:

# Examine the file around line 192 and the TestClusterExecutor area sed -n '185,300p' src/executorlib/executor/single.py

Repository: pyiron/executorlib

Length of output: 8212

🏁 Script executed:

# Continue reading to see TestClusterExecutor.__init__ implementation body sed -n '289,370p' src/executorlib/executor/single.py

Repository: pyiron/executorlib

Length of output: 4316

🏁 Script executed:

# Search for _executor_self_test definition rg -n "def _executor_self_test" src/executorlib/

Repository: pyiron/executorlib

Length of output: 130

🏁 Script executed:

# View the _executor_self_test function sed -n '453,500p' src/executorlib/executor/single.py

Repository: pyiron/executorlib

Length of output: 474

🏁 Script executed:

# Check the structure of both __init__ methods to see if they both expose _task_scheduler sed -n '100,200p' src/executorlib/executor/single.py

Repository: pyiron/executorlib

Length of output: 6781

Add self-test validation to TestClusterExecutor for consistency.

TestClusterExecutor initialization lacks the _executor_self_test() call present in SingleNodeExecutor, despite both having similar executor setup patterns. Consider adding _executor_self_test(exe=self._task_scheduler) after the super().__init__() call to validate executor initialization and catch connection issues early, consistent with SingleNodeExecutor's approach.

🤖 Prompt for AI Agents

In src/executorlib/executor/single.py around line 192, TestClusterExecutor's initializer is missing the self-test call used in SingleNodeExecutor; add a call to _executor_self_test(exe=self._task_scheduler) immediately after the super().__init__(...) return so the TaskScheduler/connection is validated during initialization, mirroring SingleNodeExecutor and allowing early failure detection.

coderabbitai · 2025-12-23T08:58:03Z

src/executorlib/executor/single.py

+def _executor_self_test(exe):
+    f = exe.submit(sum, [1, 1])
+    counter = 0
+    while not f.done() and counter < 10:
+        sleep(0.1)
+        counter += 1
+    if not f.done():
+        exe.shutdown(wait=False, cancel_futures=False)
+        raise TimeoutError(
+            'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).'
+        )
+    return True


⚠️ Potential issue | 🟡 Minor

Fix typo in error message.

Line 462 contains a typo: "Plase" should be "Please".

🔎 Proposed fix

raise TimeoutError( - 'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).' + 'Please try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).' )

🧰 Tools

🪛 Ruff (0.14.10)

461-463: Avoid specifying long messages outside the exception class

(TRY003)

🤖 Prompt for AI Agents

In src/executorlib/executor/single.py around lines 453 to 464, the error message raised on timeout contains a typo ("Plase"). Update the string to correct the spelling to "Please" so the message reads: 'Please try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).' Ensure only the typo is fixed and preserve punctuation and quoting exactly as in the original message.

jan-janssen · 2025-12-23T09:03:23Z

======================================================================
ERROR: test_meta_executor_parallel (test_singlenodeexecutor_mpi.TestExecutorBackend.test_meta_executor_parallel)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/executorlib/executorlib/tests/test_singlenodeexecutor_mpi.py", line 66, in test_meta_executor_parallel
    with SingleNodeExecutor(
         ~~~~~~~~~~~~~~~~~~^
        max_workers=2,
        ^^^^^^^^^^^^^^
        resource_dict={"cores": 2},
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
        block_allocation=True,
        ^^^^^^^^^^^^^^^^^^^^^^
    ) as exe:
    ^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 192, in __init__
    _executor_self_test(exe=self._task_scheduler)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 461, in _executor_self_test
    raise TimeoutError(
        'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).'
    )
TimeoutError: Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).

======================================================================
ERROR: test_meta_executor_parallel_cache (test_singlenodeexecutor_mpi.TestExecutorBackendCache.test_meta_executor_parallel_cache)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/executorlib/executorlib/tests/test_singlenodeexecutor_mpi.py", line 92, in test_meta_executor_parallel_cache
    with SingleNodeExecutor(
         ~~~~~~~~~~~~~~~~~~^
        max_workers=2,
        ^^^^^^^^^^^^^^
    ...<2 lines>...
        cache_directory="executorlib_cache",
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    ) as exe:
    ^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 192, in __init__
    _executor_self_test(exe=self._task_scheduler)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 461, in _executor_self_test
    raise TimeoutError(
        'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).'
    )
TimeoutError: Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).

======================================================================
ERROR: test_single_node_executor_init_function (test_singlenodeexecutor_noblock.TestFunctionCrashes.test_single_node_executor_init_function)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/executorlib/executorlib/tests/test_singlenodeexecutor_noblock.py", line 166, in test_single_node_executor_init_function
    with SingleNodeExecutor(max_workers=2, init_function=exit_funct, block_allocation=True) as exe:
         ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 192, in __init__
    _executor_self_test(exe=self._task_scheduler)
    ~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/runner/miniconda3/envs/test/lib/python3.13/site-packages/executorlib/executor/single.py", line 461, in _executor_self_test
    raise TimeoutError(
        'Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).'
    )
TimeoutError: Plase try "hostname_localhost=True" in the initialization of the SingleNodeExecutor(hostname_localhost=True).

======================================================================
FAIL: test_cache_data_class (test_singlenodeexecutor_cache.TestCacheFunctions.test_cache_data_class)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/executorlib/executorlib/tests/test_singlenodeexecutor_cache.py", line 52, in test_cache_data_class
    self.assertEqual(sum([c["output"] for c in cache_lst]), sum(result_lst))
    ~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: 14 != 12

======================================================================
FAIL: test_cache_key (test_singlenodeexecutor_cache.TestCacheFunctions.test_cache_key)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/runner/work/executorlib/executorlib/tests/test_singlenodeexecutor_cache.py", line 66, in test_cache_key
    self.assertTrue("same" in entry['filename'])
    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: False is not true

----------------------------------------------------------------------
Ran 244 tests in 67.719s

FAILED (failures=2, errors=3, skipped=35)

jan-janssen and others added 2 commits December 23, 2025 09:53

Feature: Add built-in network test

a29aa6d

[pre-commit.ci] auto fixes from pre-commit.com hooks

4f1dc53

for more information, see https://pre-commit.ci

jan-janssen marked this pull request as draft December 23, 2025 08:55

coderabbitai bot reviewed Dec 23, 2025

View reviewed changes

reset after successful selftest

11ad8e3

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Feature: Add built-in network test #874

Feature: Add built-in network test #874

Uh oh!

jan-janssen commented Dec 23, 2025 •

edited by coderabbitai bot

Loading

Uh oh!

coderabbitai bot commented Dec 23, 2025 •

edited

Loading

Review skipped

Uh oh!

coderabbitai bot left a comment

Uh oh!

coderabbitai bot Dec 23, 2025

Uh oh!

coderabbitai bot Dec 23, 2025

Uh oh!

jan-janssen commented Dec 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Feature: Add built-in network test #874

Are you sure you want to change the base?

Feature: Add built-in network test #874

Uh oh!

Conversation

jan-janssen commented Dec 23, 2025 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Dec 23, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Review skipped

Walkthrough

Changes

Estimated code review effort

Pre-merge checks and finishing touches

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Dec 23, 2025

Choose a reason for hiding this comment

Uh oh!

jan-janssen commented Dec 23, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jan-janssen commented Dec 23, 2025 •

edited by coderabbitai bot

Loading

coderabbitai bot commented Dec 23, 2025 •

edited

Loading