InvariantsMiner Optimisation #95

gutjuri · 2022-06-02T08:51:28Z

For datasets with a large number of log keys, InvariantsMiner is exceptionally slow.
In my tests with a Linux syslog dataset (415 log keys), fitting times were impractically long.

I profiled InvariantsMiner and found that by far the most time is spent in the method _join_set. I optimised this method to reduce its computational complexity.
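
For reference, a minimal sketch of how a profile like this can be collected with Python's standard cProfile and pstats modules. The helper `profile_call` and the stand-in function `slow_join` are hypothetical names used only for illustration; they are not part of InvariantsMiner, and `slow_join` merely imitates a quadratic loop with a list-membership check.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the dominating call shows up first.
    stats.sort_stats("cumulative").print_stats(10)
    return result, buffer.getvalue()


def slow_join(n):
    """Hypothetical stand-in for an expensive method (not the real _join_set)."""
    seen = []
    for i in range(n):
        for j in range(i + 1, n):
            pair = [i, j]
            if pair not in seen:  # linear scan of a growing list
                seen.append(pair)
    return seen


result, report = profile_call(slow_join, 60)
print(report)
```

In a real run, `slow_join` would be replaced by the model's fit call on the dataset in question, and the report would be read for the entry with the largest cumulative time.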

Runtimes are now considerably better for the Linux syslog dataset.
For HDFS logs, runtimes are unchanged.
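
To illustrate the kind of change that can lower the cost of a join step, here is a sketch under stated assumptions. It assumes the join builds candidate itemsets of a given length from pairwise unions and discards duplicates; whether this matches the actual change in this PR is not stated in the description. Both function names are hypothetical. The first version checks duplicates against a list, the second uses a set of tuples, so each duplicate check takes constant time on average instead of a scan over all results so far.

```python
from itertools import combinations


def join_set_naive(item_list, length):
    """Pairwise join; duplicates are detected by scanning a list (hypothetical)."""
    result = []
    for a, b in combinations(item_list, 2):
        joined = sorted(set(a) | set(b))
        if len(joined) == length and joined not in result:
            result.append(joined)
    return sorted(result)


def join_set_hashed(item_list, length):
    """Same join; duplicates are dropped via a set of tuples (hypothetical)."""
    item_sets = [frozenset(item) for item in item_list]  # build each set once
    seen = set()
    for a, b in combinations(item_sets, 2):
        union = a | b
        if len(union) == length:
            seen.add(tuple(sorted(union)))  # hashable, so membership is O(1) on average
    return [list(t) for t in sorted(seen)]


candidates = [[0, 1], [0, 2], [1, 2], [2, 3]]
print(join_set_hashed(candidates, 3))
```

Both versions still visit every pair of input itemsets; the saving comes only from the duplicate check, which matters more as the number of candidates grows.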

gutjuri added 2 commits June 2, 2022 10:41

optimized invariantsMiner

193410d

fix dataloader for npz files

19ec54c
