Internet-Draft                                                 Yirong Yu
Intended status: Standards Track                 Zhongguancun Laboratory
Expires: [Date, e.g., October 2026]                           April 2026

A Profile for Resource Public Key Infrastructure (RPKI) Canonical Input Representation (CIR)

draft-yu-sidrops-rpki-cir-00

Abstract

This document specifies a Canonical Input Representation (CIR) content type for use with the Resource Public Key Infrastructure (RPKI). While the Canonical Cache Representation (CCR) profiles the validated output state of a Relying Party (RP), CIR is a DER-encoded data interchange format used to represent the exact, unvalidated raw input data fetched by an RP at a particular point in time. The CIR profile provides a deterministic "world view" snapshot, enabling advanced operational capabilities such as differential testing, failure path debugging, and highly accurate historical black-box replay of RPKI validation logic.

Status of This Memo

TBD

Table of Contents

  1. Introduction
     1.1. Requirements Language
  2. Motivation and Architecture
  3. The Canonical Input Representation Content Type
  4. The Canonical Input Representation Content
     4.1. version
     4.2. metaInfo
     4.3. BaseCIR Fields
     4.4. DeltaCIR Fields
  5. Operational Considerations
     5.1. Differential Testing and Historical Replay
     5.2. Delta Compression for Archival
  6. Security Considerations
  7. IANA Considerations
  8. References

1. Introduction

This document specifies a Canonical Input Representation (CIR) content type for use with the Resource Public Key Infrastructure (RPKI).

A Relying Party (RP) fetches RPKI objects from publication points using protocols such as rsync [RFC5781] or RRDP [RFC8182] before it performs cryptographic validation. The Canonical Cache Representation (CCR) [draft-ietf-sidrops-rpki-ccr] accurately describes the subset of objects that passed validation, but by design it omits objects that were rejected because of format errors, invalid signatures, or expired validity periods. An archive built only from CCR output therefore suffers from survivorship bias: the inputs that caused a rejection are exactly the ones that are lost.

CIR records the precise mapping of object URIs to their cryptographic hashes before validation occurs. By decoupling the network transport layer from the validation layer, CIR allows researchers and operators to reconstruct the exact physical file tree (the "dirty inputs") perceived by a vantage point.

1.1. Requirements Language

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all capitals, as shown here.

2. Motivation and Architecture

CIR is designed to solve the "time paradox" and "state desynchronization" problems inherent to RPKI historical archiving. It defines two distinct operational modes:

  • Base CIR: A complete snapshot of all fetched Trust Anchor Locators (TALs) and RPKI objects, typically generated by an RP immediately after a synchronization cycle.
  • Delta CIR: A compressed representation generated by offline archival processes, describing the additions, modifications, and deletions between two chronological Base CIR snapshots.

3. The Canonical Input Representation Content Type

The content of a CIR file is an instance of ContentInfo.

The contentType for a CIR is defined as id-ct-rpkiCanonicalInputRepresentation, with Object Identifier (OID) [TBD-OID].

The content is an instance of RpkiCanonicalInputRepresentation.

4. The Canonical Input Representation Content

The content of a Canonical Input Representation is formally defined using ASN.1. To ensure absolute deterministic serialization, CIR MUST be encoded using Distinguished Encoding Rules (DER, [X.690]).

RpkiCanonicalInputRepresentation-2026
  { iso(1) member-body(2) us(840) rsadsi(113549)
    pkcs(1) pkcs9(9) smime(16) mod(0) id-mod-rpkiCIR-2026(TBD) }

DEFINITIONS EXPLICIT TAGS ::=
BEGIN

IMPORTS
  CONTENT-TYPE, Digest
  FROM CryptographicMessageSyntax-2010 -- in [RFC6268]
  ;

ContentInfo ::= SEQUENCE {
  contentType      CONTENT-TYPE.&id({ContentSet}),
  content      [0] EXPLICIT CONTENT-TYPE.&Type({ContentSet}{@contentType}) }

ContentSet CONTENT-TYPE ::= {
  ct-rpkiCanonicalInputRepresentation, ... }

ct-rpkiCanonicalInputRepresentation CONTENT-TYPE ::=
  { TYPE RpkiCanonicalInputRepresentation
    IDENTIFIED BY id-ct-rpkiCanonicalInputRepresentation }

id-ct-rpkiCanonicalInputRepresentation OBJECT IDENTIFIER ::=
  { iso(1) member-body(2) us(840) rsadsi(113549) pkcs(1)
    pkcs-9(9) id-smime(16) id-ct(1) cir(TBD) }

RpkiCanonicalInputRepresentation ::= CHOICE {
  baseCIR   [0] BaseCIR,
  deltaCIR  [1] DeltaCIR
}

BaseCIR ::= SEQUENCE {
  version         INTEGER DEFAULT 0,
  metaInfo        CIRMetaInfo,
  talList         SEQUENCE OF URIAndHash,
  objectList      SEQUENCE OF URIAndHash
}

DeltaCIR ::= SEQUENCE {
  version         INTEGER DEFAULT 0,
  metaInfo        CIRMetaInfo,
  talChanges      [0] DeltaChanges OPTIONAL,
  objectChanges   [1] DeltaChanges
}

DeltaChanges ::= SEQUENCE {
  upserted        [0] SEQUENCE OF URIAndHash OPTIONAL,
  removed         [1] SEQUENCE OF IA5String OPTIONAL
}

CIRMetaInfo ::= SEQUENCE {
  validationTime  GeneralizedTime,
  rpSoftware      [0] UTF8String OPTIONAL,
  rpVersion       [1] UTF8String OPTIONAL,
  observerID      [2] UTF8String OPTIONAL
}

URIAndHash ::= SEQUENCE {
  uri             IA5String,
  hash            OCTET STRING,
  source          [0] SourceType OPTIONAL
}

SourceType ::= ENUMERATED {
  rsync   (0),
  rrdp    (1),
  https   (2),
  erik    (3),
  cache   (4),
  other   (5)
}

END
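
As an illustration of the DER requirement, the following minimal sketch hand-encodes a single URIAndHash value with its optional source field omitted. It is not a complete CIR encoder; the helper names (der_length, der_tlv, encode_uri_and_hash) and the example URI are assumptions made for this example only.

```python
import hashlib

def der_length(n: int) -> bytes:
    """Encode a DER definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body

def der_tlv(tag: int, value: bytes) -> bytes:
    """Concatenate a single-byte tag, the DER length, and the value."""
    return bytes([tag]) + der_length(len(value)) + value

def encode_uri_and_hash(uri: str, digest: bytes) -> bytes:
    """Encode URIAndHash ::= SEQUENCE { uri, hash } without the OPTIONAL source."""
    uri_tlv = der_tlv(0x16, uri.encode("ascii"))  # IA5String, universal tag 22
    hash_tlv = der_tlv(0x04, digest)              # OCTET STRING, universal tag 4
    return der_tlv(0x30, uri_tlv + hash_tlv)      # SEQUENCE, constructed

# Hypothetical URI and object bytes, used only to exercise the encoder.
example_uri = "rsync://rpki.example.net/repo/a.roa"
example_digest = hashlib.sha256(b"raw object bytes").digest()
encoded = encode_uri_and_hash(example_uri, example_digest)
print(encoded.hex())
```

Because DER admits exactly one encoding per value, two observers that record the same URI and the same file bytes obtain byte-identical output from such an encoder.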

4.1. version

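The version field contains the format version for the structure. In this version of the specification, it MUST be 0. Because the field is declared with DEFAULT 0 and DER requires that a component equal to its default value be omitted, a conforming encoding does not carry the version field.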

4.2. metaInfo

The metaInfo structure provides crucial temporal and environmental context:

  • validationTime: Contains a GeneralizedTime indicating the moment the synchronization concluded. This timestamp is REQUIRED, as it is strictly necessary to freeze the system clock when replaying RPKI validation logic to evaluate time-sensitive object expiration.
  • rpSoftware / rpVersion / observerID: OPTIONAL metadata to identify the specific software and observation vantage point generating the CIR.
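
A minimal sketch of how a producer might render validationTime follows. It assumes the whole-second UTC form "YYYYMMDDHHMMSSZ", which is a valid DER GeneralizedTime; dropping fractional seconds is an assumption of this example, not a rule stated in this document. The function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def to_generalized_time(moment: datetime) -> str:
    """Render an aware datetime as YYYYMMDDHHMMSSZ (UTC, whole seconds)."""
    if moment.tzinfo is None:
        raise ValueError("validationTime needs an explicit time zone")
    return moment.astimezone(timezone.utc).strftime("%Y%m%d%H%M%SZ")

def from_generalized_time(text: str) -> datetime:
    """Parse the same form back into an aware UTC datetime."""
    parsed = datetime.strptime(text, "%Y%m%d%H%M%SZ")
    return parsed.replace(tzinfo=timezone.utc)

# A synchronization that finished at 12:30 local time in a UTC+8 zone.
finished = datetime(2026, 4, 1, 12, 30, 0, tzinfo=timezone(timedelta(hours=8)))
print(to_generalized_time(finished))
```

A replay harness can parse the value with from_generalized_time and pin the clock of the RP under test to that instant.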

4.3. BaseCIR Fields

  • talList: A sequence of URIAndHash representing the Trust Anchor Locators used as the root of validation.
  • objectList: A sequence of URIAndHash representing every raw file fetched by the RP. The uri MUST be the absolute logical address (e.g., rsync://...), and the hash MUST be the SHA-256 digest of the raw file.
  • source: An OPTIONAL enumerated value, carried inside each URIAndHash entry, indicating the network transport or cache layer from which the file was successfully obtained (e.g., rrdp, rsync).
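
The construction of objectList entries can be sketched as follows. The function name build_object_list, the in-memory dict input, and the sort by URI are assumptions of this example; this document does not state an ordering rule for the sequence.

```python
import hashlib
from urllib.parse import urlsplit

def build_object_list(fetched: dict[str, bytes]) -> list[tuple[str, bytes]]:
    """Map each absolute URI to the SHA-256 digest of its raw bytes.

    No object is parsed or validated here: the digest is computed over
    whatever bytes were fetched, including malformed ones.
    """
    entries = []
    for uri, raw in fetched.items():
        parts = urlsplit(uri)
        if not parts.scheme or not parts.netloc:
            raise ValueError(f"not an absolute URI: {uri!r}")
        entries.append((uri, hashlib.sha256(raw).digest()))
    # Sorting by URI is an assumption made here to get a stable order.
    return sorted(entries)

# Hypothetical fetch result; the second object is deliberately garbage.
fetched = {
    "rsync://rpki.example.net/repo/b.mft": b"\x30\x03\x02\x01\x00",
    "rsync://rpki.example.net/repo/a.roa": b"not even ASN.1",
}
for uri, digest in build_object_list(fetched):
    print(uri, digest.hex())
```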

4.4. DeltaCIR Fields

To support compact archival, DeltaCIR describes changes relative to a preceding BaseCIR or DeltaCIR:

  • upserted: A sequence of URIAndHash for newly discovered objects and for objects whose URI is unchanged but whose hash differs from the previous snapshot.
  • removed: A sequence of IA5String containing URIs that were present in the previous snapshot but are no longer observed.
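
The relationship between two snapshots and the resulting upserted and removed sequences can be sketched as a dictionary comparison. The function name compute_delta and the representation of a snapshot as a dict from URI to digest are assumptions of this example.

```python
def compute_delta(
    previous: dict[str, bytes], current: dict[str, bytes]
) -> tuple[list[tuple[str, bytes]], list[str]]:
    """Return (upserted, removed) describing how to get from previous to current."""
    # New URIs and URIs whose digest changed both count as upserts.
    upserted = sorted(
        (uri, digest)
        for uri, digest in current.items()
        if previous.get(uri) != digest
    )
    # URIs seen before but absent now are removals.
    removed = sorted(uri for uri in previous if uri not in current)
    return upserted, removed

# Hypothetical snapshots; short byte strings stand in for SHA-256 digests.
old = {"rsync://h/a.roa": b"\x01", "rsync://h/b.mft": b"\x02", "rsync://h/c.crl": b"\x03"}
new = {"rsync://h/a.roa": b"\x01", "rsync://h/b.mft": b"\x09", "rsync://h/d.cer": b"\x04"}
upserted, removed = compute_delta(old, new)
print(upserted)
print(removed)
```

An unchanged object (a.roa in this example) appears in neither sequence, which is where the size saving of a delta comes from.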

5. Operational Considerations

5.1. Differential Testing and Historical Replay

Because CIR captures the global input state regardless of object validity, it allows operators to construct an isolated physical sandbox matching the exact network state at validationTime. By injecting this state into different RP software implementations (using native functionality like --disable-rrdp coupled with local rsync wrappers), operators can perform deterministic differential testing. Discrepancies in the resulting CCR outputs indicate implementation bugs or vulnerabilities in boundary-case handling.
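
A sandbox builder along these lines might look like the sketch below. It assumes a content-addressed blob store (here a dict from digest to raw bytes) and a local layout of root/host/path; neither is defined by this document, and the function name materialize is hypothetical. Objects are written as opaque bytes and are never decoded.

```python
import hashlib
from pathlib import Path
from urllib.parse import urlsplit

def materialize(entries, blobs: dict[bytes, bytes], root: Path) -> list[Path]:
    """Write each (uri, digest) entry under root as root/<host>/<path>."""
    root = root.resolve()
    written = []
    for uri, digest in entries:
        parts = urlsplit(uri)
        relative = parts.path.lstrip("/")
        if not parts.netloc or not relative:
            raise ValueError(f"cannot map URI to a file: {uri!r}")
        target = (root / parts.netloc / relative).resolve()
        # Refuse URIs such as rsync://h/../../x that escape the sandbox.
        if not target.is_relative_to(root):  # requires Python 3.9+
            raise ValueError(f"URI escapes sandbox root: {uri!r}")
        raw = blobs[digest]
        # Check the archive's integrity without interpreting the bytes.
        if hashlib.sha256(raw).digest() != digest:
            raise ValueError(f"blob does not match recorded hash: {uri!r}")
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(raw)
        written.append(target)
    return written
```

The resulting tree can then be served to each RP under test, for example through a local rsync wrapper, with the clock pinned to validationTime.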

5.2. Delta Compression for Archival

Given that the global RPKI repository experiences relatively low churn within short timeframes (e.g., 10-minute intervals), DeltaCIR significantly reduces storage overhead. Archival systems SHOULD compute DeltaCIR sequences from raw BaseCIR outputs to facilitate efficient streaming historical replays.
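
Streaming replay from one BaseCIR followed by a chain of DeltaCIRs can be sketched as below. Snapshots are modeled as dicts from URI to digest, each delta as an (upserted, removed) pair, and the function names are hypothetical. Silently ignoring a removal of an unknown URI is a choice made for this example; a stricter consumer could treat it as a sign that the chain is out of sync.

```python
from typing import Iterable, Iterator

Delta = tuple[list[tuple[str, bytes]], list[str]]

def apply_delta(state: dict[str, bytes], delta: Delta) -> dict[str, bytes]:
    """Return a new snapshot with one delta applied; the input is not modified."""
    upserted, removed = delta
    new_state = dict(state)
    for uri in removed:
        new_state.pop(uri, None)  # lenient: unknown URIs are ignored
    for uri, digest in upserted:
        new_state[uri] = digest
    return new_state

def replay(base: dict[str, bytes], deltas: Iterable[Delta]) -> Iterator[dict[str, bytes]]:
    """Yield the base snapshot, then the snapshot after each successive delta."""
    state = dict(base)
    yield state
    for delta in deltas:
        state = apply_delta(state, delta)
        yield state

# Hypothetical chain; short byte strings stand in for SHA-256 digests.
base = {"rsync://h/a.roa": b"\x01", "rsync://h/b.mft": b"\x02"}
chain = [
    ([("rsync://h/b.mft", b"\x09")], []),
    ([("rsync://h/c.cer", b"\x03")], ["rsync://h/a.roa"]),
]
for snapshot in replay(base, chain):
    print(sorted(snapshot))
```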

6. Security Considerations

Unlike RPKI signed objects, CIR objects are not cryptographically signed by CAs. They are observational records.

CIR explicitly permits the indexing of corrupted, malicious, or malformed ASN.1 objects. Parsers ingesting CIR to reconstruct sandboxes MUST NOT attempt to decode or execute the objects referenced by the hashes, and instead treat them as opaque binary blobs that are placed in the file system for the target RP to evaluate.

7. IANA Considerations

IANA is requested to register the media type application/rpki-cir, the file extension .cir, and the necessary SMI Security for S/MIME Module Identifiers (OIDs), modeled identically to the IANA considerations defined in the CCR specification.

8. References

[Standard IETF references for RFC 2119, RFC 8174, RFC 6488, RFC 8182, etc. to be populated]

