SBOM¶
Hermeto produces a Software Bill of Materials (SBOM) describing the dependencies it fetches. This document explains how Hermeto structures the SBOM, what custom fields it uses, how it maps between supported formats, and how conversion works.
Hermeto supports the following SBOM formats:
The output format can be selected with --sbom-output-type cyclonedx or
--sbom-output-type spdx when running hermeto fetch-deps or
hermeto merge-sboms.
Hermeto represents SBOMs internally as CycloneDX and can convert them to SPDX. Both formats use only a subset of their respective specifications; the subsets map into each other. The mapping is documented in CycloneDX to SPDX mapping.
Scheme¶
Hermeto produces a flat list of components (CycloneDX) or packages (SPDX). There is no nested tree of dependencies in either format.
CycloneDX¶
All packages processed by Hermeto are represented by CycloneDX components. In a
SBOM produced by Hermeto components form a single flat top-level components array,
no hierarchy is preserved withing it. Of all possible component types available
in CycloneDX spec Hermeto uses only two component types: library (default, used
for components produced by all package managers except generic) and file (used
for URL-based generic artifacts such as archives when the generic package manager
fetches a zip/tarball).
SPDX¶
At the time of writing, SBOMs are represented internally as CycloneDX. SPDX is produced with a converter (see CycloneDX to SPDX mapping). SPDX SBOMs provided by Hermeto have the following properties:
- The output has a
packagesarray and arelationshipsarray. packages: the first entry is a synthetic root withSPDXID:SPDXRef-DocumentRoot-File-(emptynameandversionInfo); the rest are SPDX package entries for actual packages. The root is required by the SPDX specification.relationships: the document describes the root (SPDXRef-DOCUMENT→SPDXRef-DocumentRoot-File-viaDESCRIBES); every real package is linked to the root viaCONTAINS.- Unlike CycloneDX SPDX does not have direct equivalents to "library" or "file"
component types; all packages are listed without that distinction. As a result,
a
filecomponent in CycloneDX that is converted to SPDX and back will become alibrarycomponent — this is a known limitation.
Custom data Hermeto reports¶
Dependency properties¶
Hermeto attaches properties from a fixed set to all dependencies. These
properties carry metadata about dependencies they are attached to. These
properties are stored in the CycloneDX components[].properties array; in SPDX
they are represented as package annotations (see
CycloneDX to SPDX mapping). Property names use
hermeto: as a prefix, except for the two standard CycloneDX npm properties
which use the cdx: prefix defined by the CycloneDX specification. The table
below lists all custom properties:
| Property name | Meaning | When it appears |
|---|---|---|
hermeto:found_by |
Tool that added the dependency | Always |
hermeto:missing_hash:in_file |
Path to a lockfile with a missing checksum | When there is a lockfile with some checksums missing |
hermeto:bundler:package:binary |
Bundler gem is a platform-specific binary | When the gem is a platform-specific binary |
cdx:npm:package:bundled |
Npm package is bundled | When the package is bundled within another |
cdx:npm:package:development |
Npm package is a dev dependency | When the package is a development dependency |
hermeto:pip:package:binary |
Pip package is a binary wheel | When the package is a binary wheel |
hermeto:pip:package:build-dependency |
Pip package is a build-time dependency | When the package is required at build time |
hermeto:rpm_modularity_label |
RPM modularity label | When the RPM has a modularity label |
hermeto:rpm_summary |
RPM summary | When you set include_summary_in_sbom to true |
Backend and source information¶
Unlike the dependency properties above, which are stored as flat key/value pairs
in components[].properties (CycloneDX) or package annotations (SPDX), backend
and source information use other SBOM fields because they describe where a
component came from, not arbitrary key/value metadata on the component.
Backends — lets consumers know which backend (e.g. gomod, pip, npm)
produced each dependency. Experimental backends (name starting with x-) are
tagged as experimental. Supported SBOM types represent them like follows:
- CycloneDX: Top-level annotations. Text format
hermeto:backend:<backend_name>(e.g.hermeto:backend:gomod); for experimental backends,hermeto:backend:experimental:<backend_name>. Each annotation references a component bybom-ref(which equals the component'spurl). - SPDX: Those backend tags become package annotations on the matching packages (see CycloneDX to SPDX mapping).
Actual source URL — When a dependency was downloaded via an artifact repository manager (e.g. JFrog Artifactory or Sonatype Nexus), Hermeto records the real download location as well as the usual distribution reference. If several URLs are known, all are recorded; the SBOM does not say which URL was actually used (some backends cannot provide that). It is represented as follows:
- CycloneDX: An external reference with
type:"distribution",comment:"proxy URL", andurlset to the download location. Multiple URLs are multiple such references. - SPDX: The same information is mapped to the package field
sourceInfo(semicolon-separated when there are several).
Main package representation¶
The SBOM does not mark which dependency is the main package. All are listed as peers. The SBOM alone does not distinguish "the app" from its dependencies.
Versioning¶
Hermeto provides several versioning mechanisms in SBOMs:
- Version: Each dependency can have a version, taken from the package manager's data (lockfile, metadata, or similar). Some dependencies have no version; Hermeto omits the field in that case. For further details see Reasons for lack of versions in some dependencies.
- PURL: The main identifier for each dependency is its PURL (package URL). The PURL can still include version-like information (e.g. a Git commit) even when the version field is missing.
- SBOM format version: The SBOM file itself carries a format version
— the schema version, not the application's version.
CycloneDX:
specVersion(e.g."1.6") and a top-level integerversion(BOM revision). SPDX:spdxVersion(e.g."SPDX-2.3").
Reasons for lack of versions in some dependencies¶
Some dependencies in the SBOM may have no version; when and why depends on the ecosystem:
- Go (gomod): The standard library version is tied to the Go runtime, but
Hermeto omits it because the Go version used during pre-fetching may differ
from the one used in the actual hermetic build; recording it could lead
vulnerability analyzers to incorrectly flag versions and to incorrectly apply
a particular provenance policy. Modules referenced via
replacedirectives pointing to local paths may also lack a version; other modules get one when resolved. - Yarn: Workspace and linked packages often have no version.
- Cargo: Missing if a local path dependency omits it, or if a workspace
member inherits from
[workspace.package]and that field is unset. - npm: Version is taken from the lock file, however in rare cases it could be missing.
- pip: PyPI packages always have a version; VCS or local path sources may not. May also be absent if the main project uses dynamic versioning (e.g. setuptools-scm).
Merging SBOMs¶
Merging is only meant for SBOMs produced by Hermeto (for example from
fetch-deps or an earlier successful merge) due to the fact that Hermeto
supports only subsets of SBOM formats. Input files must match Hermeto's
CycloneDX or SPDX model; arbitrary SBOMs from other tools are not supported and
will be rejected as invalid.
CycloneDX to SPDX mapping¶
Hermeto forms the SBOM as CycloneDX first; this section describes the
translation used when SPDX output is requested (e.g. via
--sbom-output-type spdx).
| CycloneDX | SPDX |
|---|---|
Component |
SPDXPackage |
| Component identity | SPDXID: SPDXRef-Package-{name}-{version}-{hash} (or SPDXRef-Package-{name}-{hash}). The hash is the SHA-256 hex digest of a JSON object with keys name, version, and purl, serialized with keys sorted. The idstring is sanitized per SPDX rules (only letters, digits, ., -). versionInfo is set to the component's version or omitted when None. |
purl |
externalRefs: one entry with referenceCategory=PACKAGE-MANAGER, referenceType=purl, referenceLocator=<purl> |
properties |
Package annotations: each property encoded as JSON {"name":"...","value":"..."} in comment; annotator is Tool: hermeto:jsonencoded. |
Top-level annotations (by bom-ref) |
Per-package annotations containing the same values as CycloneDX top-level annotations: comment = annotation text |
ExternalReference (type=distribution, comment=proxy URL) |
sourceInfo (semicolon-separated if multiple) |
| (no root in CycloneDX) | A synthetic root package SPDXRef-DocumentRoot-File- is created with name="" and versionInfo="". The document describes the root; the root contains every package. |
metadata.tools |
creationInfo.creators (Tool: / Organization:) |
SPDX to CycloneDX¶
When SPDX is converted to CycloneDX (e.g. when merging an SPDX SBOM with a CycloneDX one), the following applies:
- Each SPDX package becomes one or more CycloneDX components. Because CycloneDX
allows only one PURL per component, an SPDX package with multiple PURLs in
externalRefsbecomes multiple components. versionInfois passed through as the component'sversion.- Annotations whose
annotatorends with:jsonencodedare parsed as properties; others are stored as top-level annotations. sourceInfois converted back to ExternalReferences withtype=distributionandcomment=proxy URL.
Limitation: An SPDX package that has multiple PURLs in externalRefs becomes
multiple CycloneDX components when converted to CycloneDX, because CycloneDX does
not support multiple PURLs on one component.