@gutjuri commented Jun 2, 2022

For datasets with a large number of log keys, InvariantsMiner is exceptionally slow.
I tested it with a Linux syslog dataset (415 log keys), and fitting times were impractically long.

I profiled InvariantsMiner and found that by far the largest share of the time is spent in the method _join_set. I optimised this method to reduce its computational complexity.
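To illustrate the kind of change that can reduce the cost of a join step like this, here is a minimal sketch. It is not the diff from this PR: the function names join_set_naive and join_set_fast are hypothetical, and the sketch assumes that the join unions pairs of itemsets and keeps the unique unions of a given length. The naive version checks for duplicates by scanning a list, while the faster one tracks them in a hash set.

```python
from itertools import combinations


def join_set_naive(item_list, length):
    """Union itemsets pairwise; duplicates are found by scanning a list."""
    result = []
    for a, b in combinations(item_list, 2):
        joined = sorted(set(a) | set(b))
        # `joined not in result` is a linear scan over everything found so far.
        if len(joined) == length and joined not in result:
            result.append(joined)
    return sorted(result)


def join_set_fast(item_list, length):
    """Same output, but duplicates are tracked in a hash set of tuples."""
    # Convert each itemset once instead of once per pair.
    item_sets = [frozenset(items) for items in item_list]
    seen = set()
    for a, b in combinations(item_sets, 2):
        union = a | b
        if len(union) == length:
            # Adding to a set is constant time on average.
            seen.add(tuple(sorted(union)))
    return sorted(list(t) for t in seen)


if __name__ == "__main__":
    pairs = [[0, 1], [0, 2], [1, 2], [2, 3]]
    print(join_set_fast(pairs, 3))
```

Both functions still visit every pair of itemsets. The difference is only in the duplicate check, which matters most when many pairs produce the same union, as can happen with a large number of log keys.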

With this change, fitting runtimes are considerably better for the Linux syslog dataset.
For HDFS logs, runtimes are unchanged.
