Thursday, March 7, 2024

Improving license detection when generating SBOMs

I blogged last year about generating a Software Bill of Materials (SBOM) for an Apache Maven project using the cyclonedx-maven-plugin. It's ideal to generate an SBOM at build time in this way, as you have access to an accurate dependency graph (from Maven in this case). However, sometimes you want to create an SBOM from a third-party binary artifact, such as a jar, zip or Docker image. Anchore Syft is ideal for this purpose. However, I found that it generated somewhat limited licensing information for jars. In this post I'll examine a series of contributions I made to Syft towards the end of 2023 that significantly improved this. As an aside, I found the Syft community to be very helpful and responsive, so it was an enjoyable process!

Improvements Contributed

The initial Syft release I looked at (Syft v0.92.0) only detected a license for a jar if the MANIFEST.MF contained in the jar had an OSGi Bundle-License header detailing the license used. As many projects don't support OSGi, this meant that relatively few licenses were detected. Here are the improvements I contributed and the versions they were released in:

An additional improvement (not by me) was made in a subsequent release (v0.103.1) to fix a bug with underscores in artifacts that resulted in licenses not being found.
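For illustration, the manifest-only detection that v0.92.0 relied on can be approximated in a few lines of Python. This is a minimal sketch of the idea, not Syft's actual (Go) implementation; bundle_license and make_jar are hypothetical helper names, and the jar is built in memory purely for the demo.

```python
import io
import zipfile
from typing import Optional


def bundle_license(jar_bytes: bytes) -> Optional[str]:
    """Return the Bundle-License header of a jar's manifest, or None."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        try:
            raw = jar.read("META-INF/MANIFEST.MF").decode("utf-8")
        except KeyError:
            return None  # the archive has no manifest at all

    # Long manifest values wrap onto continuation lines that start with a
    # single space, so unfold those before splitting into headers.
    unfolded = raw.replace("\r\n", "\n").replace("\n ", "")
    for line in unfolded.split("\n"):
        name, sep, value = line.partition(": ")
        if sep and name.lower() == "bundle-license":
            return value.strip()
    return None


def make_jar(manifest: str) -> bytes:
    """Build a tiny in-memory jar containing only the given manifest (demo helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", manifest)
    return buf.getvalue()


osgi_jar = make_jar("Manifest-Version: 1.0\r\nBundle-License: Apache-2.0\r\n")
plain_jar = make_jar("Manifest-Version: 1.0\r\n")
print(bundle_license(osgi_jar))   # the license header value
print(bundle_license(plain_jar))  # None: nothing to detect without OSGi metadata
```

A jar without OSGi metadata yields nothing here, which is why looking at pom.xml files and Maven Central made such a difference.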

One point to make is that going to Maven Central to find poms with license information is not enabled by default; this is what I have in a local .syft.yaml:

   maven-url: ""
   max-parent-recursive-depth: 8
   use-network: true


As a test-case, I chose Apache Spark as an example of a project containing a large number of third-party (Java-based) dependencies, specifically the distribution spark-3.5.1-bin-hadoop3.tgz. Using Syft v0.92.0 as a starting point, I generated a cyclonedx-json SBOM using Syft via:

  • syft packages ./spark-3.5.1-bin-hadoop3.tgz -o cyclonedx-json > spark.json (note: newer versions of Syft use "scan" instead of "packages")

Then I used jq to generate a CSV consisting of the dependencies found in the SBOM and their license detected, or "unknown-license" if no license was found:

  • jq -r '.components[] | .group + "/" + .name + ":" + .version + "," + try(.licenses[] | .license? | flatten | join(" ")) // .group + "/" + .name + ":" + .version + "," + .licenses?[]?.expression // .group + "/" + .name + ":" + .version +  ",unknown-license"' spark.json
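If jq isn't to hand, roughly the same CSV can be produced with a short Python script. This is a sketch under the assumption that each entry in a component's "licenses" array is either a "license" object with an "id" or "name", or an "expression" string; license_rows is a hypothetical helper name and the sample components are made up. Unlike the jq filter, it always emits exactly one row per component.

```python
import json


def license_rows(sbom: dict) -> list:
    """Return one 'group/name:version,license' row per SBOM component."""
    rows = []
    for comp in sbom.get("components", []):
        coords = "{}/{}:{}".format(
            comp.get("group", ""), comp.get("name", ""), comp.get("version", "")
        )
        found = []
        for entry in comp.get("licenses", []):
            lic = entry.get("license") or {}
            # Prefer an SPDX id, then a free-text name, then an expression.
            label = lic.get("id") or lic.get("name") or entry.get("expression")
            if label:
                found.append(label)
        rows.append(coords + "," + (" ".join(found) or "unknown-license"))
    return rows


# Made-up components standing in for the contents of spark.json.
SAMPLE = json.loads("""
{"components": [
  {"group": "org.example", "name": "alpha", "version": "1.0",
   "licenses": [{"license": {"id": "Apache-2.0"}}]},
  {"group": "org.example", "name": "beta", "version": "2.1",
   "licenses": [{"expression": "MIT OR Apache-2.0"}]},
  {"name": "gamma", "version": "0.3"}
]}
""")

for row in license_rows(SAMPLE):
    print(row)
# org.example/alpha:1.0,Apache-2.0
# org.example/beta:2.1,MIT OR Apache-2.0
# /gamma:0.3,unknown-license
```

For a real SBOM, replace SAMPLE with the result of json.load on the generated file.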


For the Apache Spark distribution detailed above, these are the results:

Syft version | Dependencies detected | Unknown licenses | % licenses detected
v0.92.0      | 440                   | 306              | 30.4%
v0.93.0      | 442                   | 245              | 44.5%
v0.94.0      | 470                   | 203              | 56.8%
v0.95.0      | 444                   | 157              | 64.6%
v0.96.0      | 468                   | 32               | 93.1%
v0.97.0      | 468                   | 27               | 94.2%
v0.103.1     | 467                   | 11               | 97.6%

Going from less than a third of dependencies getting their license detected correctly to almost 100% is pretty good! The remaining 11 dependencies don't contain a pom.xml or any other metadata that allows Syft to find the correct pom.xml in Maven Central. Possibly some improvements could be made in looking at the package names to try to find the correct path in Maven Central.
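The detection rate is simply the share of dependencies that did not end up as "unknown-license". The sketch below recomputes it from the dependency and unknown-license counts reported above, truncating to one decimal place (an assumption about how the published figures were derived); detected_percent is a hypothetical helper name.

```python
# (Syft version, dependencies detected, unknown licenses), as reported above.
RESULTS = [
    ("v0.92.0", 440, 306),
    ("v0.93.0", 442, 245),
    ("v0.94.0", 470, 203),
    ("v0.95.0", 444, 157),
    ("v0.96.0", 468, 32),
    ("v0.97.0", 468, 27),
    ("v0.103.1", 467, 11),
]


def detected_percent(total: int, unknown: int) -> float:
    """Percentage of dependencies with a detected license, truncated to 0.1."""
    # Integer arithmetic in tenths of a percent avoids float rounding surprises.
    tenths = (total - unknown) * 1000 // total
    return tenths / 10


for version, total, unknown in RESULTS:
    print(f"{version}: {detected_percent(total, unknown)}%")
```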

Monday, October 23, 2023

CVE-2023-44483 in Apache Santuario - XML Security for Java

A new CVE has been published for the recent Apache Santuario - XML Security for Java releases (4.0.0, 3.0.3, 2.3.4 and 2.2.6):

  • CVE-2023-44483: Apache Santuario: Private Key disclosure in debug-log output

"A private key may be disclosed in log files when generating an XML Signature and logging with debug level is enabled. Users are recommended to upgrade to version 2.2.6, 2.3.4, or 3.0.3, which fixes this issue."

Tuesday, October 10, 2023

Publishing SBOMs for open-source projects

Software Bill of Materials (SBOMs) are a recent hot topic, in part due to an executive order by the US government which references making an SBOM available on a public site. Making a signed SBOM available publicly allows downstream projects to consume the SBOM automatically using tooling to list dependency names and versions, what licenses they use, what vulnerabilities are known about them, etc. 

When looking at a recent Apache commons-codec version in Maven Central, I noticed that it was publishing signed SBOMs in both CycloneDX and SPDX formats. Inspired by this, I've started to add similar functionality to the other ASF projects I contribute to, starting at first with CycloneDX support. It's very easy to do this for Java-based projects by just adding Maven plugins, see for example.

Last month, version 4.0.0-M1 of the Apache XML Security for Java library was released. The SBOM is available on Maven Central along with the released artifacts.

Let's see what we can do with this SBOM by hand. Firstly let's download it and the signature:

  • wget
  • wget

Validate the signature (using the KEYS file):

  • gpg --verify xmlsec-4.0.0-M1-cyclonedx.json.asc

Now we're ready to answer some questions about the library. 

What are the names, versions and licenses of the third-party dependencies of Apache XML Security for Java 4.0.0-M1? We can extract this information using the jq tool and some hacking:

  • jq -r '.components[] | .group + "/" + .name + ":" + .version + "," + (.licenses?[]?.license | flatten | join(" "))' xmlsec-4.0.0-M1-cyclonedx.json

Next we might want to know if any of these dependencies have publicly known vulnerabilities. For this we can use the excellent Grype tool, which parses the SBOM directly:

  • grype sbom:./xmlsec-4.0.0-M1-cyclonedx.json

For now at least, it outputs that no vulnerabilities were found. So by using the SBOM with some third-party open-source tools we can find out what the third-party dependencies are, reassure ourselves that they are available under a business-friendly open-source license, and ensure that there are no known vulnerabilities associated with them. Hopefully more open-source projects will start publishing SBOMs for their releases to make answering these questions easier.
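The license check can also be automated rather than eyeballed. Here is a minimal sketch that flags any component whose SPDX license ids fall outside an allow-list. The ALLOWED set is a hypothetical policy chosen for illustration, outside_policy is a made-up helper name, and the sample components are invented; components with no SPDX id at all are flagged too, so that they get a manual look.

```python
# Hypothetical allow-list; a real policy would come from your legal guidance.
ALLOWED = {"Apache-2.0", "MIT", "BSD-3-Clause", "EPL-2.0"}


def outside_policy(sbom: dict, allowed: set) -> list:
    """Return (name, license ids) for components not fully covered by `allowed`."""
    flagged = []
    for comp in sbom.get("components", []):
        ids = {
            (entry.get("license") or {}).get("id")
            for entry in comp.get("licenses", [])
        }
        ids.discard(None)  # entries that only carry a name or expression
        if not ids or not ids <= allowed:
            flagged.append((comp.get("name"), sorted(ids)))
    return flagged


# Made-up components standing in for a downloaded CycloneDX SBOM.
SAMPLE = {
    "components": [
        {"name": "alpha", "licenses": [{"license": {"id": "Apache-2.0"}}]},
        {"name": "beta", "licenses": [{"license": {"id": "GPL-3.0-only"}}]},
        {"name": "gamma"},
    ]
}

print(outside_policy(SAMPLE, ALLOWED))
# [('beta', ['GPL-3.0-only']), ('gamma', [])]
```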

Friday, April 14, 2023

Open Source Software Composition Analysis

Software Composition Analysis (SCA) is the process of figuring out which third-party dependencies are used in your project. It's an essential part of the software security process as it helps you to answer questions like:

  • Does my project contain third-party dependencies with known vulnerabilities (CVEs)?
  • Does my project contain third-party dependencies with risky licenses?
  • Does my project comply with all legal requirements imposed by the upstream projects?

In this post we'll look at some popular open-source SCA options. It's not intended to be comprehensive, so let me know if I missed anything! Adding one or more of these projects to your CI/CD process will really improve your Supply Chain Security process.

GitHub Dependabot 

If your project is hosted on GitHub, then the first port of call for SCA is to enable GitHub Dependabot. You have the option to just enable alerts, which let you know if your dependencies have known CVEs, and also to have Dependabot automatically create pull requests to upgrade the dependencies in question to fix the vulnerability. Adding CI using GitHub Actions to this process to verify that the updates don't break any of the tests/build means fixing CVEs is a straightforward process. Dependabot has support for a wide range of software ecosystems.

GitHub also recently added support for downloading an SPDX SBOM for a GitHub repository, e.g. for Apache CXF.

OWASP Dependency-Check

OWASP Dependency-Check is another tool that can help you find CVEs in your dependencies. It's useful as an alternative to Dependabot if you don't have access to the security tab of a GitHub project, or if Dependabot is otherwise not enabled. You can run it on a Maven project via:

  • mvn org.owasp:dependency-check-maven:check


Aqua Trivy

Aqua Trivy is a really useful tool for SCA as it can help with a wide range of scenarios:

  • Scan a docker image for CVEs/Secrets: trivy image tomcat:9.0 
    • Exclude secret scanning: trivy --security-checks vuln image tomcat:9.0 (note: newer versions of Trivy use "--scanners" instead of "--security-checks")
    • Exclude OS level CVEs: trivy --security-checks vuln --vuln-type library image tomcat:9.0
  • Scan a GitHub repository: trivy repository
  • Scan the filesystem at the current working directory: trivy fs .


Anchore Syft

Anchore Syft is a tool which can help you with generating an SBOM from an image or filesystem:

  • Generate a CycloneDX SBOM from a docker image: syft -o cyclonedx-json tomcat:9.0
  • Generate an SBOM from a war file: syft packages ./fedizhelloworld.war


Anchore Grype

Anchore Grype is another super-useful tool that works well with Syft:

  • Scan the current working directory for CVEs: grype dir:.
  • Scan a docker image for CVEs: grype tomcat:9.0
  • Scan a CycloneDX SBOM produced by Syft for CVEs: grype sbom:./sbom.json


OSV-Scanner

Yet another tool is Google's OSV-Scanner:

  • Scan a docker image for CVEs: osv-scanner --docker tomcat:9.0
  • Scan the local filesystem for CVEs: osv-scanner -r .

Thursday, March 16, 2023

OpenSSF Allstar

In the previous blog post, I looked at how to use OpenSSF Scorecard to improve the security posture of your open-source GitHub projects. This is a really useful tool when working at the level of individual repositories. However, what if you want to apply security policies to many repositories in a GitHub organization? This is where OpenSSF Allstar comes in.

Getting Started

Detailed installation instructions are available here. The easiest way of getting started is to install the OpenSSF Allstar GitHub app in your organization. However, you may not wish to grant access to your internal/private repositories to this instance, in which case it's pretty easy to install it manually.

General Configuration

Allstar reads configuration from a GitHub repo called ".allstar" in your GitHub organization. Here an "allstar.yaml" file defines the general configuration for the tool, e.g.:

This configuration uses the "Opt out" strategy, meaning that all repositories in the organization are included unless you explicitly opt them out. Archived and forked repos are excluded, as you may not care about applying security policies to these types of repositories. Finally, the configuration blocks individual repositories from overriding the Allstar configuration.
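The effect of these settings can be modelled in a few lines. The sketch below is only a model of the behaviour as described in this post, not Allstar's implementation, and the field names (opt_out_strategy, opt_out_archived and so on) are illustrative rather than Allstar's real configuration keys.

```python
from dataclasses import dataclass, field


@dataclass
class OptConfig:
    """Illustrative stand-in for the organization-level opt settings."""
    opt_out_strategy: bool = True           # True: everything is in unless opted out
    opt_out_repos: set = field(default_factory=set)
    opt_in_repos: set = field(default_factory=set)
    opt_out_archived: bool = True
    opt_out_forked: bool = True


@dataclass
class Repo:
    name: str
    archived: bool = False
    fork: bool = False


def is_enrolled(repo: Repo, cfg: OptConfig) -> bool:
    """Decide whether policies would apply to `repo` under this model."""
    if cfg.opt_out_archived and repo.archived:
        return False
    if cfg.opt_out_forked and repo.fork:
        return False
    if cfg.opt_out_strategy:
        return repo.name not in cfg.opt_out_repos
    return repo.name in cfg.opt_in_repos


cfg = OptConfig(opt_out_repos={"sandbox"})
repos = [Repo("service"), Repo("sandbox"), Repo("old-tool", archived=True), Repo("upstream-fork", fork=True)]
print([r.name for r in repos if is_enrolled(r, cfg)])
# ['service']
```

With an opt-in strategy instead, only the repositories listed explicitly would be covered.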


Allstar policies are added by checking in the corresponding yaml file to the .allstar repository. Each policy allows you to define whether to just log the issue or whether to create a GitHub issue for it in the repository where a policy violation was found. GitHub issues are labelled with "allstar", making it easy to search for them across all repositories in your organization.

Here are some of the policies Allstar currently supports:

  • binary_artifacts.yaml: Enforce that binary artifacts aren't checked in to source control.
  • branch_protection.yaml: Enforce branch protection requirements on repos, for example:
    • Default branches are covered by branch protection. 
    • Approval is required for pull requests 
    • Block force pushes 
    • Require the branch is up to date before merging
  • dangerous_workflow.yaml: Flag dangerous patterns in GitHub Actions workflows.
  • outside.yaml: Enforce that outside collaborators can't be an admin on a repository. 
  • security.yaml: Enforce that repositories have a security policy. I use it with "optOutPrivateRepos: true" to only apply this policy to public repos. This helps to let external users of your software know how to report security issues to the project.


I've found Allstar pretty useful and, in the spirit of open source, submitted a few contributions to it which were included in the recent v3.0 release:

Tuesday, February 21, 2023

OpenSSF Scorecard

OpenSSF Scorecard is a tool that assesses your project against a number of security best practices and assigns a score (out of 10). It is a really useful thing to run on any open-source project you might contribute to, to try to improve the overall security posture of the project, or even to assess how secure a third-party project is that you might want to use. In this post I'll describe how I improved the security posture of a number of ASF projects I contribute to using OpenSSF Scorecard.

Getting Started

The first step is to install the OpenSSF Scorecard GitHub Action. This can be done in the GitHub dashboard, by going to "Actions", then "New Workflow" and searching for "OpenSSF Scorecard". Once this is committed to source control and runs successfully, the findings appear in the GitHub dashboard under "Security" and then "Code scanning". After the first run, you can add a Scorecard badge to the README of your project to display the current score. For example, for Apache Santuario.

Improving the score

After doing the initial run to get the base score, it's time to try to improve the score a bit. Here are some of the actions I performed:

  • Enable Dependabot. This involves adding dependabot.yml (for example) to your project to automatically create PRs for updated dependencies. As in the example, it should cover both the package ecosystem of the project (e.g. Maven) as well as GitHub Actions, to keep any GitHub Actions up to date as well.
  • Automated builds. Any pull request should have the full suite of project tests run on it before being committed. I made sure that all of the projects had Jenkins projects set up to build both maintained branches whenever new commits were made, as well as dedicated jobs to run on PRs. Note that at the ASF, the Dependabot user needs to be explicitly allow-listed in a .asf.yaml file to automatically run Jenkins jobs on submitted PRs. The combination of Dependabot and automated builds makes it easy to have confidence in automatically updating your project dependencies, assuming a good test-suite.
  • Adding CodeQL (and fixing the findings). CodeQL is a SAST tool that can be run on your project via a GitHub action by searching for "CodeQL". It should be run on the maintained branches of the project, as well as on any pull requests for the maintained branches.
  • Adding a security policy. A security policy (for example) should be added to source control to describe the supported versions of the project, and how to submit security issues.
  • Pin GitHub action commits. It's best practice to pin GitHub action commits so that new updates don't break your project or even introduce a security regression. Tooling is available to analyse the GitHub Actions of your project and to create pull requests with the correct versions pinned. Dependabot is then clever enough to be able to update your GitHub Actions based on the pinned commit.
  • Adding an OpenSSF Best Practices badge. The OpenSSF Best Practices badge programme allows you to obtain a best practices badge for your project and to embed it in the README.
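To make the pinning step concrete, here is a small sketch that scans the text of a workflow file and reports any "uses:" reference that is not pinned to a full 40-character commit SHA. It is a simple regex-based check rather than a real YAML parser, unpinned_actions is a hypothetical helper name, and the commit SHA in the sample is a placeholder, not a real commit.

```python
import re

# Matches "uses: owner/action@ref", optionally preceded by a list dash.
USES = re.compile(r"^[ \t]*(?:-[ \t]*)?uses:[ \t]*([^\s#]+)", re.MULTILINE)
FULL_SHA = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for ref in USES.findall(workflow_text):
        ref = ref.strip("'\"")
        if ref.startswith("./") or ref.startswith("docker://"):
            continue  # local actions and container images are out of scope here
        _, _, version = ref.partition("@")
        if not FULL_SHA.match(version):
            unpinned.append(ref)
    return unpinned


# Sample workflow; the 40-character SHA below is a placeholder.
WORKFLOW = """
jobs:
  build:
    steps:
      - uses: actions/checkout@v4
      - name: Set up JDK
        uses: actions/setup-java@0123456789abcdef0123456789abcdef01234567 # v4
      - uses: ./.github/actions/local
"""

print(unpinned_actions(WORKFLOW))
# ['actions/checkout@v4']
```

Keeping the version as a trailing comment next to the pinned commit, as in the sample, keeps the workflow readable.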

ASF Projects

Here are some of the ASF projects I applied the above to, and their current OpenSSF Scorecard result at the time of writing:

Future improvements that would improve the score are as follows:

  • No fuzzing. A fuzzing framework could be used to fuzz the projects.
  • No branch protection. Branch protection is not enabled on these projects as traditionally we have followed a CTR (commit-then-review) approach to development. OpenSSF Scorecard also penalises committing directly to the main branch without an approved PR, so adding branch protection would greatly improve the score of all projects above.
  • No packaging. OpenSSF Scorecard's packaging check doesn't support Maven Central, which is where the releases of all the above projects go.
  • No signed releases. Again OpenSSF Scorecard doesn't check Maven Central for signed releases.


Wednesday, December 14, 2022

New Apache CXF releases and CVEs published

Apache CXF has released versions 3.5.5 and 3.4.10. Notable security upgrades in these releases include picking up a fix for CVE-2022-40152 in Woodstox, and a fix for CVE-2022-40150 in Jettison. In addition, two new CVEs are published for issues found directly in Apache CXF itself:

  • CVE-2022-46363: Apache CXF directory listing / code exfiltration. A vulnerability in Apache CXF before versions 3.5.5 and 3.4.10 allows an attacker to perform a remote directory listing or code exfiltration. The vulnerability only applies when the CXFServlet is configured with both the static-resources-list and redirect-query-check attributes. These attributes are not supposed to be used together, and so the vulnerability can only arise if the CXF service is misconfigured.
  • CVE-2022-46364: Apache CXF SSRF Vulnerability. A SSRF vulnerability in parsing the href attribute of XOP:Include in MTOM requests in versions of Apache CXF before 3.5.5 and 3.4.10 allows an attacker to perform SSRF style attacks on webservices that take at least one parameter of any type.

Thanks to thanat0s from Beijing Qihoo 360 adlab for reporting both issues to the project. The first issue is not really applicable in practice as it only arises from a misconfiguration. For the second issue, we now restrict the following of MTOM URLs to message attachments by default. This can be controlled via a new property "org.apache.cxf.attachment.xop.follow.urls" (which of course defaults to false).