feat: improve performance of createFromJSON#142
Conversation
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse. |
| } | ||
|
|
||
| for (const node of depGraphData.graph.nodes) { |
There was a problem hiding this comment.
Pointlessly iterating twice over the same set of nodes
There was a problem hiding this comment.
Correction: O(n * e), where n is the number of nodes in the graph, and e it's the number of deps for each node.
e of course could be 1 but also very large, so this could be considered, Ø(n^2) for worst case.
There was a problem hiding this comment.
but aside from me showing that I went to university (and making a fool of myself): this is fundamentally an excellent optimization.
| assert( | ||
| nodesMap[rootNodeId].pkgId === rootPkgId, | ||
| `the root node .pkgId should be "${rootPkgId}"`, | ||
| ); |
There was a problem hiding this comment.
This check cannot fail, so has been removed.
rootPkgId is rootNode.pkgId and rootNode is nodesMap[rootNodeId]
So this actually just asserts that nodesMap[rootNodeId].pkgId === nodesMap[rootNodeId].pkgId
| }, {} as { [pkdId: string]: types.PkgInfo }); | ||
|
|
||
| const nodesMap = depGraphData.graph.nodes.reduce((acc, cur) => { | ||
| assert(!(cur.nodeId in acc), 'more than on node with same id'); |
There was a problem hiding this comment.
typo fixed on -> one
| assert( | ||
| pkgIds.filter((pkgId) => pkgId !== DepGraphImpl.getPkgId(pkgsMap[pkgId])) | ||
| .length === 0, | ||
| 'pkgs ids should be name@version', | ||
| ); | ||
| assert( | ||
| Object.values(nodesMap).filter((node) => !(node.pkgId in pkgsMap)) | ||
| .length === 0, | ||
| 'some instance nodes belong to non-existing pkgIds', | ||
| ); | ||
| assert( | ||
| Object.values(pkgsMap).filter((pkg: { name: string }) => !pkg.name) | ||
| .length === 0, | ||
| 'some .pkgs elements have no .name field', | ||
| ); |
There was a problem hiding this comment.
filtering over all the entries in nodes/pkgs is unnecessary, as earlier iterations have already iterated over the same sets, so these operations can be combined
| }, | ||
| pkgs: [ | ||
| { id: 'toor', info: { name: 'toor', version: '1.0.0' } }, | ||
| { id: 'toor@1.0.0', info: { name: 'toor', version: '1.0.0' } }, |
There was a problem hiding this comment.
This was always an invalid package. The order of execution of validation steps meant that this test was failing on a different check previously.
| assert( | ||
| depGraphData.pkgManager && !!depGraphData.pkgManager.name, | ||
| '.pkgManager.name is missing', | ||
| ); | ||
|
|
||
| const pkgsMap = depGraphData.pkgs.reduce((acc, cur) => { | ||
| assert(!(cur.id in acc), 'more than one pkg with same id'); | ||
| assert(!!cur.info, '.pkgs item missing .info'); | ||
|
|
||
| acc[cur.id] = cur.info; | ||
| return acc; | ||
| }, {} as { [pkdId: string]: types.PkgInfo }); |
There was a problem hiding this comment.
Redundant iterations/reduces - these lists are iterated over elsewhere, and merging the validation in there makes this code much cheaper.
mcombuechen
left a comment
There was a problem hiding this comment.
LGTM
I never knew the function was an entry point to so much chaos. I especially like that we're getting rid of the reduce/filter.
30f38ae to
a57b8c2
Compare
Remove nested loops in validateDepGraphData
a57b8c2 to
4bf21f9
Compare
|
🎉 This PR is included in version 2.14.0 🎉 The release is available on: Your semantic-release bot 📦🚀 |
Remove nested loops in validateDepGraphData
createFromJsonis a significant source of event loop blocks, and these largely originate fromvalidateDepGraphData.Upon inspection, the validation code repeatedly re-iterates over the same lists of objects, which can be very large. Refactoring this code to instead perform a single iteration with all validation code self-contained demonstrates a significant improvement. There is no change in validation behaviour, other than the removal of a check which could never fail, and a fix to a typo.
Further improvements are possible. This is an initial change to assess impact.
Utilising the stress test in this repo, a 17.4% speedup was observed for a dep-graph with 125,000 nodes.