-
Notifications
You must be signed in to change notification settings - Fork 33
feat: 清洗任务增加重试次数记录和日志展示 #280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
# Conflicts: # backend/services/data-cleaning-service/src/main/java/com/datamate/cleaning/domain/model/entity/CleaningTask.java # scripts/db/data-cleaning-init.sql
| public List<CleaningTaskLog> getTaskLog(String taskId, int retryCount) { | ||
| cleanTaskValidator.checkTaskId(taskId); | ||
| String logPath = FLOW_PATH + "/" + taskId + "/output.log"; | ||
| if (retryCount > 0) { |
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
user-provided value
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix
AI about 9 hours ago
In general, user-controlled data used in path expressions should be constrained so that the resulting path cannot escape an intended base directory or become an absolute path. A common and robust pattern is: (1) define a base directory as a Path, (2) resolve the user input as a child path using base.resolve(userInput), (3) normalize the result, and (4) verify that the normalized result still starts with the base directory. If any check fails, reject the request.
For this specific case, we should treat FLOW_PATH as the root of all cleaning task logs, and ensure that taskId cannot break out of that directory. Instead of using String logPath = FLOW_PATH + "/" + taskId + "/output.log"; and Paths.get(logPath), we should build a Path like:
Path baseFlowPath = Paths.get(FLOW_PATH).toAbsolutePath().normalize();
Path logFilePath = baseFlowPath.resolve(taskId).resolve("output.log").toAbsolutePath().normalize();
if (!logFilePath.startsWith(baseFlowPath)) {
throw new BusinessException(SystemErrorCode.ILLEGAL_ARGUMENT, "Invalid taskId");
}Then use Files.lines(logFilePath) and in the error logging use logFilePath.toString(). This preserves existing functionality (log files still live under FLOW_PATH/<taskId>/output.log[.retry]) while ensuring that even if taskId contains .. or separators, the normalized path is checked against the base directory before reading. We can keep the existing cleanTaskValidator.checkTaskId(taskId); call; the new checks are an additional defense-in-depth and the direct mitigation for this CodeQL alert.
We only need to adjust the getTaskLog method in CleaningTaskService and slightly change the imports to use java.nio.file.Path instead of java.nio.file.Paths directly where appropriate. No changes are required in CleaningTaskController.
-
Copy modified line R46 -
Copy modified lines R163-R165 -
Copy modified lines R167-R168 -
Copy modified lines R170-R175 -
Copy modified line R185
| @@ -43,6 +43,7 @@ | ||
| import java.io.FileWriter; | ||
| import java.io.IOException; | ||
| import java.nio.file.Files; | ||
| import java.nio.file.Path; | ||
| import java.nio.file.Paths; | ||
| import java.util.*; | ||
| import java.util.concurrent.atomic.AtomicReference; | ||
| @@ -159,11 +160,19 @@ | ||
|
|
||
| public List<CleaningTaskLog> getTaskLog(String taskId, int retryCount) { | ||
| cleanTaskValidator.checkTaskId(taskId); | ||
| String logPath = FLOW_PATH + "/" + taskId + "/output.log"; | ||
|
|
||
| Path baseFlowPath = Paths.get(FLOW_PATH).toAbsolutePath().normalize(); | ||
| Path logFilePath = baseFlowPath.resolve(taskId).resolve("output.log").toAbsolutePath().normalize(); | ||
| if (retryCount > 0) { | ||
| logPath += "." + retryCount; | ||
| logFilePath = logFilePath.resolveSibling("output.log." + retryCount); | ||
| logFilePath = logFilePath.toAbsolutePath().normalize(); | ||
| } | ||
| try (Stream<String> lines = Files.lines(Paths.get(logPath))) { | ||
|
|
||
| if (!logFilePath.startsWith(baseFlowPath)) { | ||
| throw new BusinessException(SystemErrorCode.ILLEGAL_ARGUMENT, "Invalid taskId"); | ||
| } | ||
|
|
||
| try (Stream<String> lines = Files.lines(logFilePath)) { | ||
| List<CleaningTaskLog> logs = new ArrayList<>(); | ||
| AtomicReference<String> lastLevel = new AtomicReference<>("INFO"); | ||
| lines.forEach(line -> { | ||
| @@ -175,7 +182,7 @@ | ||
| }); | ||
| return logs; | ||
| } catch (IOException e) { | ||
| log.error("Fail to read log file {}", logPath, e); | ||
| log.error("Fail to read log file {}", logFilePath.toString(), e); | ||
| return Collections.emptyList(); | ||
| } | ||
| } |
No description provided.