LITE-34406 Add Prometheus metrics for certificate renewal #125

Open
jonatrios wants to merge 2 commits into master from cr/LITE-34406-cert-renewal-metrics

@jonatrios jonatrios commented Jun 15, 2026

Contributor

Summary

  • Adds chaperone_cert_expiry_seconds gauge: set on startup (after cert load in NewTLSConfig) and after each Install hot-swap; tls.Certificate.Leaf is now pre-parsed so NotAfter is always accessible without a second round-trip through x509.ParseCertificate
  • Adds chaperone_cert_renewals_total counter with status label (success|failure): incremented in HandleInstall on every attempt outcome
  • Extracts namespace = "chaperone" constant in metrics.go to satisfy goconst linter
  • Updates /metrics reference table in docs/reference/http-api.md
  • Rewords the persistPair doc comment and the mTLS test comment to accurately describe the best-effort rollback (after a key-rename failure the cert file may be absent; the old pair is not restored)
  • Adds TestHandler_Install_PersistPair_KeyRenameFails covering the previously untested os.Remove(certPath) rollback branch

Jira: https://imorg.atlassian.net/browse/LITE-34406

🤖 Generated with Claude Code

@jonatrios jonatrios changed the title from "[CBPFR] CRTR-15: add Prometheus metrics for certificate renewal" to "LITE-34406 Add Prometheus metrics for certificate renewal" Jun 15, 2026
@jonatrios jonatrios changed the base branch from LITE-34405-renewal-http-handlers to master June 15, 2026 14:38
@jonatrios jonatrios requested a review from qarlosh June 15, 2026 14:42
- Extract const namespace = "chaperone" in metrics.go to satisfy
  goconst linter; replace all Namespace: "chaperone" occurrences
- CertExpirySeconds gauge: seeded on startup (after cert load) and
  updated in HandleInstall after each hot-swap; Leaf parsed in
  NewTLSConfig so NotAfter is accessible without a second parse
- CertRenewalsTotal counter (status: success|failure): incremented in
  HandleInstall on success and on Install error
- docs/reference/http-api.md: document both new metrics
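
A rough sketch of the "one increment per attempt outcome" rule for the renewals counter. Everything here is illustrative: `renewalCounter` is a hypothetical in-memory stand-in for a counter with a `status` label (not the Prometheus client type), and `recordInstall` and `errInstallFailed` are assumed names rather than code from this change.

```go
package main

import (
	"errors"
	"sync"
)

// errInstallFailed is a sample error used only for illustration.
var errInstallFailed = errors.New("install failed")

// renewalCounter is a hypothetical stand-in for a counter keyed by a
// "status" label value ("success" or "failure").
type renewalCounter struct {
	mu     sync.Mutex
	counts map[string]int
}

func newRenewalCounter() *renewalCounter {
	return &renewalCounter{counts: make(map[string]int)}
}

// inc adds one to the count for the given status label value.
func (c *renewalCounter) inc(status string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[status]++
}

// get returns the current count for the given status label value.
func (c *renewalCounter) get(status string) int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[status]
}

// recordInstall runs install and increments exactly one status per attempt:
// "failure" when install returns an error, "success" otherwise.
func recordInstall(c *renewalCounter, install func() error) error {
	if err := install(); err != nil {
		c.inc("failure")
		return err
	}
	c.inc("success")
	return nil
}
```

Keeping both increments in one wrapper makes it harder for a new early-return path to skip the metric.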
Reword the persistPair doc comment and the mTLS integration test comment
to accurately describe the best-effort rollback: after a key-rename failure
the cert file may be absent (the old pair is not restored). The rollback
only avoids leaving a new cert next to the old key.

Add TestHandler_Install_PersistPair_KeyRenameFails which forces the second
rename to fail (keyPath is an existing directory) and asserts that certPath
is removed by the rollback, covering the previously untested os.Remove(certPath)
branch in persistPair.
@jonatrios jonatrios force-pushed the cr/LITE-34406-cert-renewal-metrics branch from 48f2037 to 0dc4115 Compare June 18, 2026 09:01