perf: Speed up creating archives when pushing big codebases #1011
matyascimbulka wants to merge 1 commit into apify:master from
Conversation
B4nan left a comment
Looks good to me, let's wait for Vlad before merging. I guess you tested this in the mentioned EU project, so you know it works. A test would be nice, but not crucial; we use the CLI beta for e2e tests in Crawlee, so we should be warned in case things fall apart.
@B4nan Thanks for the review. I agree that having a test would be better, but the EU project code lives in
Well, if you tested it with your codebase / can test it with the built binaries from CI and confirm it works the same way, we can merge it.
How do I get the binaries from the CI so I can test them?
You can find them here: https://github.com/apify/apify-cli/actions/runs/21478914434/attempts/1?pr=1011 at the bottom of the page, or download the Unix (mac/linux) artifact directly: https://github.com/apify/apify-cli/actions/runs/21478914434/artifacts/5303421975
This PR somewhat relates to #982. The main goal is to speed up creating ZIP archives for codebases larger than 2 MB when using `apify push`.

The issue was in using the `archive.glob()` function within the loop. This approach forces the library to go through the entire current working directory (and its children) on every call. Using this function is also redundant because `getActorLocalFilePaths` already uses `globby` to get all of the valid file paths to be archived.

The issue was easy to measure while working on a generic actor for `eu-monitoring-tool`, which has roughly 1200 source code files and a large `node_modules`. Here is a comparison of the compression times:
@B4nan or @vladfrangu Could you please have a look?