Skip to content

feat: improve performance of createFromJSON#142

Merged
AndreDalcher merged 1 commit intomasterfrom
feat/faster-validation
Mar 2, 2026
Merged

feat: improve performance of createFromJSON#142
AndreDalcher merged 1 commit intomasterfrom
feat/faster-validation

Conversation

@AndreDalcher
Copy link
Contributor

@AndreDalcher AndreDalcher commented Feb 19, 2026

Remove nested loops in validateDepGraphData

  • Ready for review
  • Follows CONTRIBUTING rules
  • Reviewed by Snyk internal team

createFromJson is a significant source of event loop blocks, and these largely originate from validateDepGraphData.

Upon inspection, the validation code repeatedly re-iterates over the same lists of objects, which can be very large. Refactoring this code to instead perform a single iteration with all validation code self-contained demonstrates a significant improvement. There is no change in validation behaviour, other than the removal of a check which could never fail, and a fix to a typo.

Further improvements are possible. This is an initial change to assess impact.

Utilising the stress test in this repo, a 17.4% speedup was observed for a dep-graph with 125,000 nodes.

@snyk-io
Copy link

snyk-io bot commented Feb 19, 2026

Snyk checks have passed. No issues have been found so far.

Status Scanner Critical High Medium Low Total (0)
Open Source Security 0 0 0 0 0 issues
Licenses 0 0 0 0 0 issues
Code Security 0 0 0 0 0 issues

💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.

Comment on lines 39 to 41
}

for (const node of depGraphData.graph.nodes) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pointlessly iterating twice over the same set of nodes

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of a O(n^2): 🧑‍🍳 💋

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correction: O(n * e), where n is the number of nodes in the graph, and e it's the number of deps for each node.

e of course could be 1 but also very large, so this could be considered, Ø(n^2) for worst case.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but aside from me showing that I went to university (and making a fool of myself): this is fundamentally an excellent optimization.

Comment on lines -95 to -98
assert(
nodesMap[rootNodeId].pkgId === rootPkgId,
`the root node .pkgId should be "${rootPkgId}"`,
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This check cannot fail, so has been removed.

rootPkgId is rootNode.pkgId and rootNode is nodesMap[rootNodeId]
So this actually just asserts that nodesMap[rootNodeId].pkgId === nodesMap[rootNodeId].pkgId

}, {} as { [pkdId: string]: types.PkgInfo });

const nodesMap = depGraphData.graph.nodes.reduce((acc, cur) => {
assert(!(cur.nodeId in acc), 'more than on node with same id');
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo fixed on -> one

Comment on lines -102 to -116
assert(
pkgIds.filter((pkgId) => pkgId !== DepGraphImpl.getPkgId(pkgsMap[pkgId]))
.length === 0,
'pkgs ids should be name@version',
);
assert(
Object.values(nodesMap).filter((node) => !(node.pkgId in pkgsMap))
.length === 0,
'some instance nodes belong to non-existing pkgIds',
);
assert(
Object.values(pkgsMap).filter((pkg: { name: string }) => !pkg.name)
.length === 0,
'some .pkgs elements have no .name field',
);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

filtering over all the entries in nodes/pkgs is unnecessary, as earlier iterations have already iterated over the same sets, so these operations can be combined

},
pkgs: [
{ id: 'toor', info: { name: 'toor', version: '1.0.0' } },
{ id: 'toor@1.0.0', info: { name: 'toor', version: '1.0.0' } },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was always an invalid package. The order of execution of validation steps meant that this test was failing on a different check previously.

Comment on lines -70 to -81
assert(
depGraphData.pkgManager && !!depGraphData.pkgManager.name,
'.pkgManager.name is missing',
);

const pkgsMap = depGraphData.pkgs.reduce((acc, cur) => {
assert(!(cur.id in acc), 'more than one pkg with same id');
assert(!!cur.info, '.pkgs item missing .info');

acc[cur.id] = cur.info;
return acc;
}, {} as { [pkdId: string]: types.PkgInfo });
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant iterations/reduces - these lists are iterated over elsewhere, and merging the validation in there makes this code much cheaper.

@AndreDalcher AndreDalcher reopened this Feb 19, 2026
@AndreDalcher AndreDalcher marked this pull request as ready for review February 19, 2026 12:22
@AndreDalcher AndreDalcher requested review from a team as code owners February 19, 2026 12:22
Copy link
Contributor

@mcombuechen mcombuechen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I never knew the function was an entry point to so much chaos. I especially like that we're getting rid of the reduce/filter.

@AndreDalcher AndreDalcher force-pushed the feat/faster-validation branch from 30f38ae to a57b8c2 Compare March 2, 2026 12:40
Remove nested loops in validateDepGraphData
@AndreDalcher AndreDalcher force-pushed the feat/faster-validation branch from a57b8c2 to 4bf21f9 Compare March 2, 2026 12:48
@AndreDalcher AndreDalcher merged commit 5a4b533 into master Mar 2, 2026
8 checks passed
@AndreDalcher AndreDalcher deleted the feat/faster-validation branch March 2, 2026 16:11
@snyksec
Copy link

snyksec commented Mar 2, 2026

🎉 This PR is included in version 2.14.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants