It's useful to be able to list all of the third-party software dependencies used in a project, along with their versions and licenses: in other words, a Software Bill of Materials (SBOM). However, sometimes you just want a simple list rather than a heavyweight JSON document produced by the likes of CycloneDX, for example to quickly check that no dependency has a forbidden license. In this post we compare two approaches to listing the software dependencies of an open-source project like Apache CXF, which has many third-party dependencies.
Listing software dependencies from the project source
The first approach is to list the software dependencies directly from the project source code. The advantage of this approach is that we know exactly what the dependencies are and don't have to reverse engineer them from the binaries, as in the second approach.
We'll use the latest released version of Apache CXF (4.1.4), so clone the project directly from GitHub: https://github.com/apache/cxf/tree/cxf-4.1.4. The binary distribution is built in the distribution folder. As Apache CXF is built with Apache Maven, we'll use the license-maven-plugin to gather the license information:
- cd distribution; mvn license:download-licenses
This generates an XML file in target/generated-resources that lists all of the dependencies in the Maven project along with their licenses. We'll convert it into a tab-delimited file using yq:
- yq -j -r '.licenseSummary.dependencies.dependency[] | [.artifactId, .version, ([.licenses.license] | flatten | .[0].name) // "None"] | @tsv' licenses.xml
The output looks like:
It lists 240 dependencies and, impressively, finds licenses for all but one.
Listing software dependencies from the distribution
The second approach is to detect the software dependencies from the project distribution. This works best if you don't have access to the source code of the project, or if it uses a build system that's hard to use without prior experience. Previously I blogged about how Syft is able to detect licenses from Java archives. To enable Syft to look up licenses in Maven Central, add the following to ~/.syft.yaml:
java:
  maven-url: "https://repo1.maven.org/maven2"
  max-parent-recursive-depth: 8
  use-network: true
Download and extract the Apache CXF binary distribution. Go into the lib directory and use Syft as follows:
- syft . -o json | jq -r '.artifacts[] | [.name, .version, (.licenses[]?.value // "none") ] | @tsv'
It produces output similar to the source code approach above. It finds 263 dependencies, 23 more than the source code approach, because it's able to introspect jars and detect when other dependencies are shaded into them, which is an advantage over the source detection route.
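To see which artifacts account for the difference between the two lists, they can be compared by name. The following is a minimal sketch rather than part of the original workflow: it assumes the output of the two commands was saved to source.tsv and syft.tsv (hypothetical file names), and that the name Syft reports matches the Maven artifactId, which may not hold for every jar. The function name only_in_second is also made up for illustration.

```shell
# only_in_second FILE1 FILE2
# Print the names (first tab-delimited column) found in FILE2 but not in FILE1.
only_in_second() {
  awk -F'\t' '
    FILENAME == ARGV[1] { seen[$1] = 1; next }    # remember names from FILE1
    !($1 in seen) && !printed[$1]++ { print $1 }  # report each new name once
  ' "$1" "$2"
}

# Hypothetical usage, assuming both lists were saved beforehand:
#   only_in_second source.tsv syft.tsv
```

Swapping the two arguments shows the reverse case, that is, anything the Maven plugin lists that Syft did not find in the lib directory.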
The license detection isn't as good as with the source plugin route, though: 23 libraries are listed without a license. This is because Syft sometimes doesn't have enough information in the binary metadata to work out the correct path in Maven Central from which to retrieve the corresponding pom.xml (assuming it includes the license information). For example, it fails to find a license for the OpenSAML dependencies, because these jars don't come from Maven Central but from a third-party Maven repository specific to OpenSAML.
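Either tab-delimited list can be filtered to pull out the entries without a license, or to run the quick forbidden-license check mentioned at the start of this post. Below is a rough sketch, not a definitive policy checker: match_license is a hypothetical helper name, and the patterns in the usage comments are examples only. It checks every column from the third onwards, because the jq expression above can emit more than one license value per row.

```shell
# match_license PATTERN
# Read a tab-delimited list (name, version, license...) on stdin and print
# the rows where any license column matches PATTERN, ignoring case.
match_license() {
  awk -F'\t' -v pat="$1" '
    BEGIN { pat = tolower(pat) }
    {
      for (i = 3; i <= NF; i++)
        if (tolower($i) ~ pat) { print; next }  # print the row once, then move on
    }
  '
}

# Hypothetical usage, piping in the output of either command above:
#   ... | match_license "^none$"   # entries with no detected license
#   ... | match_license "gpl"      # example policy pattern; also matches LGPL
```

Because the match is a plain regular expression against the license name, a real policy check would need patterns tuned to the exact license strings each tool reports.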
