Skip to content

make relationship parsing to be more efficient through precomputation#743

Merged
armintaenzertng merged 4 commits into
spdx:mainfrom
lumjjb:make-rs-parser-faster
Aug 17, 2023
Merged

make relationship parsing to be more efficient through precomputation#743
armintaenzertng merged 4 commits into
spdx:mainfrom
lumjjb:make-rs-parser-faster

Conversation

@lumjjb

@lumjjb lumjjb commented Aug 2, 2023

Copy link
Copy Markdown
Contributor

I ran into long times parsing big SPDX documents that use the hasFiles shortcut fields.

Handling of hasFiles field results in the list of all relationships being iterated through for each package. This change moves the creation of existing_relationships_without_comments to the top level in parse_all_relationships.

Ideally, it would be nice to use a set for more optimal look-up (but that requires Relationship to be hashable, and probably would like a bit more input on how'd you think that would be better implemented).

before:
python3 src/spdx_tools/spdx/clitools/pyspdxtools.py -i   617.18s user 0.53s system 99% cpu 10:17.96 total
after:
python3 spdx_tools/spdx/clitools/pyspdxtools.py -i  2> /dev/null  316.08s user 0.46s system 99% cpu 5:17.73 total

Signed-off-by: Brandon Lum <lumjjb@gmail.com>
maxhbr

This comment was marked as outdated.

@maxhbr

maxhbr commented Aug 11, 2023

Copy link
Copy Markdown
Member

some related tests are failing:

 =========================== short test summary info ============================
FAILED tests/spdx/parser/jsonlikedict/test_relationship_parser.py::test_parse_has_files_without_duplicating_relationships[has_files0-existing_relationships0-contains_relationships0] - AssertionError: assert 1 == 0
 +  where 1 = len([Relationship(spdx_element_id='SPDXRef-Package', relationship_type=<RelationshipType.CONTAINS: 6>, related_spdx_element_id='SPDXRef-File1', comment=None)])
 +  and   0 = len([])
============ 1 failed, 1008 passed, 3 skipped, 4 warnings in 41.66s ============

you should be able to run these tests locally with pytest.

maxhbr
maxhbr previously requested changes Aug 11, 2023

@maxhbr maxhbr left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests are failing

Signed-off-by: Brandon Lum <lumjjb@gmail.com>
@maxhbr

maxhbr commented Aug 11, 2023

Copy link
Copy Markdown
Member

I triggered the CI

Signed-off-by: Brandon Lum <lumjjb@gmail.com>
@lumjjb

lumjjb commented Aug 11, 2023

Copy link
Copy Markdown
Contributor Author

Thanks! sorry for the multiple pushes, not super familiar with the tooling, just fixed the lint errors (i believe)

@maxhbr

maxhbr commented Aug 11, 2023

Copy link
Copy Markdown
Member

I am happy to re-trigger and to support you.

(Sadly, I am on my way to bed. Will trigger again tomorrow, if you made new pushes)

@maxhbr

maxhbr commented Aug 11, 2023

Copy link
Copy Markdown
Member

you can run the linting tests locally via:

$ isort src tests --check
$ black src tests --check
$ flake8 src tests

Signed-off-by: Brandon Lum <lumjjb@gmail.com>
@lumjjb

lumjjb commented Aug 11, 2023

Copy link
Copy Markdown
Contributor Author

I fixed some of the ones not caused by changes as drive-bys in this PR. Can't reproduce errors with the same tooling locally for some reason :|. So 🤞

@lumjjb

lumjjb commented Aug 14, 2023

Copy link
Copy Markdown
Contributor Author

looks like all green! PTAL Thanks!

@armintaenzertng armintaenzertng left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, thanks :)

@armintaenzertng armintaenzertng dismissed maxhbr’s stale review August 17, 2023 14:24

tests have been fixed

@armintaenzertng

Copy link
Copy Markdown
Collaborator

I noticed that this already fixes #747

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants