-
Notifications
You must be signed in to change notification settings - Fork 3
[Feature] Update plot routines for compatibility to the Python Workflow Definition #881
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Caution Review failedThe pull request is closed. 📝 WalkthroughWalkthroughAdds inspect-based argument signature binding to tasks, enriches dependency-graph nodes (function, input, collection types) and edges (start_label/end_label), updates task hashing to use bound signatures and composite function identity, and ensures node labels are rendered as strings in the plotted graph. Changes
Sequence Diagram(s)(omitted — changes are focused on internal graph construction and representation, not multi-component sequential interactions) Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
📜 Recent review detailsConfiguration used: defaults Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/executorlib/task_scheduler/interactive/dependency_plot.py (1)
141-141: Consider using f-string for better readability.The string concatenation can be replaced with an f-string for improved readability.
🔎 Proposed refactor
- "value": v["fn"].__module__ + "." + v["fn"].__name__, + "value": f"{v['fn'].__module__}.{v['fn'].__name__}",
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/executorlib/task_scheduler/interactive/dependency_plot.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (19)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.12)
- GitHub Check: unittest_mpich (macos-latest, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-22.04-arm, 3.13)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.13)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.11)
- GitHub Check: unittest_openmpi (macos-latest, 3.13)
- GitHub Check: unittest_mpich (ubuntu-latest, 3.12)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.13)
- GitHub Check: unittest_mpich (ubuntu-24.04-arm, 3.13)
- GitHub Check: unittest_mpich (ubuntu-22.04-arm, 3.13)
- GitHub Check: unittest_openmpi (ubuntu-latest, 3.11)
- GitHub Check: unittest_win
- GitHub Check: unittest_flux_mpich
- GitHub Check: unittest_old
- GitHub Check: unittest_slurm_mpich
- GitHub Check: unittest_flux_openmpi
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
- GitHub Check: notebooks
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
🔇 Additional comments (5)
src/executorlib/task_scheduler/interactive/dependency_plot.py (5)
1-1: LGTM!The
inspectmodule import is necessary for the signature resolution functionality introduced in this PR.
49-50: LGTM!The addition of
end_labelandstart_labelprovides more granular edge labeling forFutureSelectoroperations, enhancing graph visualization clarity.
237-237: LGTM!The defensive
str()casting ensures the label is always a string, which is good practice for compatibility with the graphing library.
89-105: Verify that the referenced external function exists.Line 91 references
"python_workflow_definition.shared.get_dict", which must exist in the Python Workflow Definition for compatibility. Ensure this function is available and properly handles the dict construction.
65-82: [No action needed - this is visualization metadata]The string values
"python_workflow_definition.shared.get_list"and"python_workflow_definition.shared.get_dict"at lines 67 and 85 are node labels in a dependency graph visualization, not executable function references. They are metadata within the node dictionary structure used for plotting, not imports or runtime dependencies. No external function verification is required.Likely an incorrect or invalid review comment.
| def extend_args(funct_dict): | ||
| sig = inspect.signature(funct_dict["fn"]) | ||
| args = sig.bind(*funct_dict["args"], **funct_dict["kwargs"]) | ||
| funct_dict["signature"] = args.arguments | ||
| return funct_dict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add error handling for signature binding failures.
The sig.bind() call can raise TypeError if arguments don't match the function signature (e.g., missing required parameters, unexpected keyword arguments). Without error handling, this will crash the plotting functionality.
🔎 Proposed fix with error handling
def extend_args(funct_dict):
- sig = inspect.signature(funct_dict["fn"])
- args = sig.bind(*funct_dict["args"], **funct_dict["kwargs"])
- funct_dict["signature"] = args.arguments
- return funct_dict
+ try:
+ sig = inspect.signature(funct_dict["fn"])
+ args = sig.bind(*funct_dict["args"], **funct_dict["kwargs"])
+ funct_dict["signature"] = args.arguments
+ except (TypeError, ValueError) as e:
+ # Fallback to kwargs if signature binding fails
+ funct_dict["signature"] = funct_dict["kwargs"]
+ return funct_dict🤖 Prompt for AI Agents
In src/executorlib/task_scheduler/interactive/dependency_plot.py around lines 28
to 32, the call to sig.bind(...) can raise TypeError when provided args/kwargs
don't match the function signature; wrap the bind call in a try/except that
catches TypeError, record the error (e.g., add a "binding_error" or
"binding_exception" entry to funct_dict with the exception message/exception
object) and either return funct_dict with that error attached or re-raise a more
descriptive exception; ensure you do not let an unhandled TypeError crash the
caller and include enough context (function name and provided args/kwargs) in
the stored or raised message.
| value_dict = { | ||
| str(n["value"]): n["id"] for n in node_lst if n["type"] == "input" | ||
| } | ||
| if str(arg) not in value_dict: | ||
| node_id = len(node_lst) | ||
| node_lst.append( | ||
| { | ||
| "name": label, | ||
| "value": arg, | ||
| "id": node_id, | ||
| "type": "input", | ||
| "shape": "circle", | ||
| } | ||
| ) | ||
| else: | ||
| node_id = value_dict[str(arg)] | ||
| edge_lst.append( | ||
| { | ||
| "start": node_id, | ||
| "end": link_to, | ||
| "label": label, | ||
| } | ||
| ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optimize input node deduplication to avoid repeated scans.
Lines 108-109 scan the entire node_lst for every non-Future/collection argument, creating O(n*m) complexity. For graphs with many tasks and arguments, this could become a performance bottleneck.
Additionally, line 110 uses str(arg) for deduplication, which may not be robust for all types (e.g., objects with custom __str__ methods, or complex objects where string representation is ambiguous).
🔎 Proposed fix to maintain value_dict outside the function
Move value_dict outside the add_element function and update it incrementally:
node_lst: list = []
edge_lst: list = []
hash_id_dict: dict = {}
+input_value_dict: dict = {}
def extend_args(funct_dict):
sig = inspect.signature(funct_dict["fn"])
args = sig.bind(*funct_dict["args"], **funct_dict["kwargs"])
funct_dict["signature"] = args.arguments
return funct_dict
def add_element(arg, link_to, label=""):
"""
Add element to the node and edge lists.
Args:
arg: Argument to be added.
link_to: ID of the node to link the element to.
label (str, optional): Label for the edge. Defaults to "".
"""
if isinstance(arg, FutureSelector):
# ... existing code ...
elif isinstance(arg, Future):
# ... existing code ...
elif isinstance(arg, list) and any(isinstance(a, Future) for a in arg):
# ... existing code ...
elif isinstance(arg, dict) and any(isinstance(a, Future) for a in arg.values()):
# ... existing code ...
else:
- value_dict = {
- str(n["value"]): n["id"] for n in node_lst if n["type"] == "input"
- }
- if str(arg) not in value_dict:
+ if str(arg) not in input_value_dict:
node_id = len(node_lst)
node_lst.append(
{
"name": label,
"value": arg,
"id": node_id,
"type": "input",
"shape": "circle",
}
)
+ input_value_dict[str(arg)] = node_id
else:
- node_id = value_dict[str(arg)]
+ node_id = input_value_dict[str(arg)]
edge_lst.append(
{
"start": node_id,
"end": link_to,
"label": label,
}
)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In src/executorlib/task_scheduler/interactive/dependency_plot.py around lines
107 to 129, avoid rebuilding value_dict inside the loop by hoisting a single
mutable value_dict out of add_element and updating it incrementally when you
append a new input node; use that dict to test membership and return node ids.
Replace the fragile str(arg) key with a stable key strategy: for primitives
(int/float/str/bool) use (type(arg).__name__, arg), otherwise attempt to derive
a stable identifier (e.g., an explicit .id or .uid attribute if present) and
fall back to repr(arg) as the last resort; use that key for deduplication and
store the node id in value_dict when creating a node so future checks are O(1).
| for k, task_dict in task_hash_modified_dict.items(): | ||
| for kw, v in task_dict["signature"].items(): | ||
| add_element(arg=v, link_to=hash_id_dict[k], label=str(kw)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add defensive check for "signature" key existence.
Line 147 assumes task_dict["signature"] exists, but if extend_args encounters an error (which currently has no error handling), the "signature" key may be missing, causing a KeyError.
🔎 Proposed fix with fallback
for k, task_dict in task_hash_modified_dict.items():
- for kw, v in task_dict["signature"].items():
+ signature = task_dict.get("signature", task_dict.get("kwargs", {}))
+ for kw, v in signature.items():
add_element(arg=v, link_to=hash_id_dict[k], label=str(kw))📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for k, task_dict in task_hash_modified_dict.items(): | |
| for kw, v in task_dict["signature"].items(): | |
| add_element(arg=v, link_to=hash_id_dict[k], label=str(kw)) | |
| for k, task_dict in task_hash_modified_dict.items(): | |
| signature = task_dict.get("signature", task_dict.get("kwargs", {})) | |
| for kw, v in signature.items(): | |
| add_element(arg=v, link_to=hash_id_dict[k], label=str(kw)) |
🤖 Prompt for AI Agents
In src/executorlib/task_scheduler/interactive/dependency_plot.py around lines
146 to 148, the loop assumes task_dict["signature"] always exists which can
raise KeyError if extend_args failed; guard access by checking for the
"signature" key (e.g., use task_dict.get("signature") or an if "signature" in
task_dict) and treat a missing signature as an empty mapping so the inner loop
is skipped; optionally log or warn when signature is missing to aid debugging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (3)
src/executorlib/task_scheduler/interactive/dependency_plot.py (3)
28-32: Add error handling for signature binding failures.The
sig.bind()call can raiseTypeErrorif arguments don't match the function signature. This issue was already flagged in a previous review and remains unaddressed.
94-111: Optimize input node deduplication to avoid repeated scans.This performance issue (O(n*m) complexity from rebuilding
value_dicton every call) was already flagged in a previous review and remains unaddressed.
128-130: Add defensive check for "signature" key existence.Line 129 assumes
task_dict["signature"]exists, which can causeKeyErrorifextend_argsfails. This issue was already flagged in a previous review and remains unaddressed.
🧹 Nitpick comments (1)
src/executorlib/task_scheduler/interactive/dependency_plot.py (1)
93-93: Consider casting dict key to string for consistency.For consistency with line 76 where list indices are cast to strings (
str(i)), consider casting the dict keyktto string:label=str(kt). This ensures labels are always strings regardless of the key type.🔎 Proposed fix
- add_element(arg=vt, link_to=node_id, label=kt) + add_element(arg=vt, link_to=node_id, label=str(kt))
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
src/executorlib/task_scheduler/interactive/dependency_plot.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: unittest_win
- GitHub Check: unittest_slurm_mpich
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-mpich.yml)
- GitHub Check: benchmark (ubuntu-latest, 3.13, .ci_support/environment-openmpi.yml)
🔇 Additional comments (4)
src/executorlib/task_scheduler/interactive/dependency_plot.py (4)
43-52: LGTM!The addition of
end_labelandstart_labelfields enriches edge metadata for better visualization of FutureSelector dependencies.
219-219: LGTM!Casting node labels to strings is good defensive coding that ensures compatibility with networkx and pygraphviz rendering requirements.
67-67: [Your rewritten review comment text here]
[Exactly ONE classification tag]
85-85: Clarify the purpose of the hardcoded visualization string.The string
"python_workflow_definition.shared.get_dict"at line 85 is a visualization label in the dependency graph node dictionary, but the module path does not exist in the codebase. Similarly, line 67 contains"python_workflow_definition.shared.get_list"following the same pattern. These appear to be placeholder strings representing the operations used to reconstruct lists and dicts from futures during visualization. Confirm whether these hardcoded strings are intentional placeholders, or if they should reference actual callable objects/imports available in the codebase.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #881 +/- ##
==========================================
+ Coverage 93.30% 93.34% +0.03%
==========================================
Files 38 38
Lines 1808 1817 +9
==========================================
+ Hits 1687 1696 +9
Misses 121 121 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.