Exporting Audit Logs to a SIEM

Audit events are only useful for compliance and incident response once they leave the portal and land in a queryable, tamper-evident store. This guide ships structured JSON audit logs from a Backstage portal to a SIEM — Splunk or Elasticsearch — with reliable delivery, defined retention, and the field schema auditors expect.

Centralizing logs is the operational endpoint of Audit Logging & Compliance within your Authentication, RBAC & Security Governance program. The portal emits events; a log shipper batches and forwards them over TLS; the SIEM indexes them and enforces retention. Getting the shipping layer right is what makes SOC 2 and ISO 27001 evidence collection a query instead of a fire drill.

Prerequisites

  1. Structured logging enabled in Backstage >= 1.20.0 so the backend emits JSON (not pretty-printed) logs with a stable event_id, actor, action, and correlation_id.
  2. A log shipper: Fluent Bit >= 3.0 as a Kubernetes DaemonSet, or the Elastic Filebeat >= 8.13 agent. The examples use Fluent Bit.
  3. SIEM ingest endpoint and token: a Splunk HEC token or an Elasticsearch API key, stored in Vault and surfaced as ${SIEM_API_KEY} — never inline.
  4. Network egress allowlisting the SIEM endpoint on 443 with TLS 1.3, matching the egress baseline from the parent section.
  5. Retention policy decided: hot tier for recent queryable logs, cold/archive tier for the compliance window.

Exact Configuration

1. Emit audit events as structured JSON

Ensure the portal writes one JSON object per event to stdout so the shipper can parse it without regex. The envelope mirrors the schema from the audit-logging baseline.

# app-config.production.yaml — Requires Backstage >= 1.20.0
backend:
  logger:
    format: json
    level: info
auditLog:
  enabled: true
  includeFields: [event_id, timestamp, actor, action, resource, outcome, correlation_id]
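
To catch schema drift before events reach the shipper, a small check can confirm that each emitted line is a single JSON object carrying every field listed in includeFields. The following is a minimal Python sketch, not part of Backstage; the function name missing_audit_fields and the sample event values are assumptions made for illustration.

```python
import json

# Field list copied from includeFields in app-config.production.yaml above.
REQUIRED_FIELDS = [
    "event_id", "timestamp", "actor", "action",
    "resource", "outcome", "correlation_id",
]

def missing_audit_fields(line: str) -> list[str]:
    """Return the required fields that are absent from one JSON log line.

    Raises ValueError if the line is not a single JSON object.
    """
    event = json.loads(line)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(event, dict):
        raise ValueError("audit event must be a JSON object")
    return [name for name in REQUIRED_FIELDS if name not in event]

# Hypothetical sample event; every value here is made up.
sample_event = json.dumps({
    "event_id": "evt-0001",
    "timestamp": "2024-05-01T12:00:00Z",
    "actor": "user:default/alice",
    "action": "rbac.policy_update",
    "resource": "policy:catalog-admins",
    "outcome": "success",
    "correlation_id": "req-42",
})
print(missing_audit_fields(sample_event))  # prints []
```

Running a check like this against a sample of stdout in CI is one way to notice a dropped field before it becomes a mapping conflict at the SIEM.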

2. Ship logs with Fluent Bit

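The pipeline tails the container logs, parses them, keeps only records whose log field contains an action key, and forwards them over TLS. Batching and an on-disk buffer let events survive shipper restarts and short SIEM outages. Retry_Limit bounds how many times a chunk is retried before Fluent Bit discards it, so size it for the longest outage you need to ride through (Retry_Limit False retries indefinitely).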

# fluent-bit.conf — Requires Fluent Bit >= 3.0
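[SERVICE]
    Flush         5
    Log_Level     info
    # Expose the monitoring API that the validation steps query on port 2020
    HTTP_Server   On
    HTTP_Port     2020
    storage.path  /var/log/flb-storage/
    storage.backlog.mem_limit 64M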

[INPUT]
    Name              tail
    Path              /var/log/containers/portal-backend-*.log
    # docker handles Docker JSON logs; cri handles containerd and CRI-O nodes
    multiline.parser  docker, cri
    Tag               portal.audit
    storage.type      filesystem

[FILTER]
    Name    grep
    Match   portal.audit
    Regex   log "action":

[OUTPUT]
    Name              splunk
    Match             portal.audit
    Host              ${SIEM_HOST}
    Port              443
    TLS               On
    TLS.Verify        On
    Splunk_Token      ${SIEM_API_KEY}
    Splunk_Send_Raw   On
    Retry_Limit       5
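
The grep filter keeps a record only when the value of its log key matches the regular expression. The Python sketch below imitates that rule on a few made-up lines so you can see what passes and what is dropped. It is an approximation for reasoning about the pattern, not Fluent Bit's own matcher, and the sample records are assumptions.

```python
import re

# Same pattern as the [FILTER] Regex above: the literal text "action":
AUDIT_PATTERN = re.compile(r'"action":')

def keep_record(record: dict) -> bool:
    """Mimic `Regex log "action":` by keeping records whose log value matches."""
    return bool(AUDIT_PATTERN.search(record.get("log", "")))

# Hypothetical records as the tail input might produce them.
records = [
    {"log": '{"event_id":"evt-1","action":"rbac.policy_update","outcome":"success"}'},
    {"log": '{"level":"info","message":"Listening on :7007"}'},
    # Same kind of event, but serialized with a space before the colon.
    {"log": '{"event_id":"evt-2","action" : "catalog.entity_delete"}'},
]

kept = [r for r in records if keep_record(r)]
print(len(kept))  # prints 1
```

The third record is a real audit event that the filter drops because of the whitespace before the colon. If your serializer can emit that form, a more tolerant pattern such as "action"\s*: is worth considering.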

For an Elasticsearch destination, swap the output block:

# Requires Fluent Bit >= 3.0, Elasticsearch >= 8.13
[OUTPUT]
    Name              es
    Match             portal.audit
    Host              ${SIEM_HOST}
    Port              443
    TLS               On
    # Base64-encoded API key; confirm your Fluent Bit release supports HTTP_API_Key
    HTTP_API_Key      ${SIEM_API_KEY}
    Index             portal-audit
    Suppress_Type_Name On
    Retry_Limit       5

3. Enforce retention at the SIEM

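Shipping is half the contract; the SIEM must keep logs for the mandated window and then expire them. The example uses an Elasticsearch ILM policy: the hot index rolls over at 30 days or 50 GB per primary shard, moves to cold 30 days after rollover, and is deleted 395 days after rollover. ILM measures min_age from rollover rather than from index creation, so every event is retained for at least 395 days.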

// PUT _ilm/policy/portal-audit — Requires Elasticsearch >= 8.13
{
  "policy": {
    "phases": {
      "hot":    { "actions": { "rollover": { "max_age": "30d", "max_primary_shard_size": "50gb" } } },
      "cold":   { "min_age": "30d", "actions": { "readonly": {} } },
      "delete": { "min_age": "395d", "actions": { "delete": {} } }
    }
  }
}
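
Because min_age counts from rollover, the age of an individual event when its index changes phase depends on how long before rollover the event was written. The Python sketch below works through that arithmetic for the policy values above. It assumes the index rolls over on max_age rather than on shard size and ignores the ILM poll interval; the helper name event_age_at is illustrative.

```python
from datetime import timedelta

# Values copied from the portal-audit policy above.
ROLLOVER_MAX_AGE = timedelta(days=30)
COLD_MIN_AGE = timedelta(days=30)
DELETE_MIN_AGE = timedelta(days=395)

def event_age_at(phase_min_age: timedelta, written_before_rollover: timedelta) -> timedelta:
    """Age of an event when its index enters a phase.

    ILM starts the min_age clock at rollover, so an event written some time
    before rollover is already that much older when the clock starts.
    """
    return written_before_rollover + phase_min_age

# Event written just before rollover versus at index creation.
youngest_deleted = event_age_at(DELETE_MIN_AGE, timedelta(0))
oldest_deleted = event_age_at(DELETE_MIN_AGE, ROLLOVER_MAX_AGE)
youngest_cold = event_age_at(COLD_MIN_AGE, timedelta(0))
oldest_cold = event_age_at(COLD_MIN_AGE, ROLLOVER_MAX_AGE)

print(youngest_deleted.days, oldest_deleted.days)  # prints 395 425
print(youngest_cold.days, oldest_cold.days)        # prints 30 60
```

Under these assumptions no event is deleted before it is 395 days old, which is the number to compare against your compliance window.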

4. Lock down the index against tampering

Restrict write access to the shipper’s API key, and grant that key create_doc rather than index or write so it can add events but cannot update or delete them. This append-only posture is the baseline auditors look for when they ask for tamper-evident logs. Scope the shipper’s credential narrowly using the same ownership discipline as your Role-Based Access Control Setup.

# Requires Elasticsearch >= 8.13 — create a write-only role for the shipper
curl -s -X POST "https://${SIEM_HOST}/_security/role/portal-audit-writer" \
  -H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" \
  -H "Content-Type: application/json" \
  -d '{"indices":[{"names":["portal-audit*"],"privileges":["create_index","create_doc"]}]}'
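
A role definition like the one above can be linted before it is applied, so a later edit does not quietly widen the shipper's access. The Python sketch below flags any index privilege outside the append-only pair. It is illustrative only: it inspects the indices section, does not check cluster privileges, and the function name and the too_broad example are assumptions.

```python
# Index privileges that let a credential add documents but never change or remove them.
APPEND_ONLY_PRIVILEGES = {"create_index", "create_doc"}

def non_append_only_privileges(role: dict) -> set[str]:
    """Return index privileges in the role that go beyond the append-only set."""
    granted: set[str] = set()
    for entry in role.get("indices", []):
        granted.update(entry.get("privileges", []))
    return granted - APPEND_ONLY_PRIVILEGES

# The role body from the curl command above.
shipper_role = {"indices": [{"names": ["portal-audit*"],
                             "privileges": ["create_index", "create_doc"]}]}
# Hypothetical over-permissioned variant for comparison.
too_broad = {"indices": [{"names": ["portal-audit*"],
                          "privileges": ["create_doc", "write", "delete"]}]}

print(non_append_only_privileges(shipper_role))       # prints set()
print(sorted(non_append_only_privileges(too_broad)))  # prints ['delete', 'write']
```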

Validation

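# 1. Fluent Bit is parsing and matching audit events (not dropping them)
# Needs HTTP_Server On in [SERVICE]; if the image has no curl, reach port 2020 with kubectl port-forward instead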
kubectl exec ds/fluent-bit -n logging -- curl -s http://127.0.0.1:2020/api/v1/metrics \
  | jq '.output."splunk.0".proc_records'
# Expect: a non-zero, increasing count

# 2. The SIEM received a recent event (Elasticsearch)
# Use a key that can read the index; the write-only shipper key from step 4 cannot search
curl -s "https://${SIEM_HOST}/portal-audit*/_search?q=action:rbac.policy_update&size=1" \
  -H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" | jq '.hits.total.value'
# Expect: >= 1

# 3. ILM policy is attached and tracking the index
curl -s "https://${SIEM_HOST}/portal-audit*/_ilm/explain" \
  -H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" | jq '.indices[].policy'
# Expect: "portal-audit"

# 4. Buffer is draining, not backing up
kubectl exec ds/fluent-bit -n logging -- du -sh /var/log/flb-storage/
# Expect: small and stable, not growing unbounded
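
The first check can be turned into a scripted health test by reading the counters for the output instead of eyeballing them. The Python sketch below parses a payload in the shape the monitoring endpoint returns and reports whether records are flowing and whether any chunks were abandoned after exhausting their retries. The sample payload is made up, and delivery_health is a hypothetical helper.

```python
import json

def delivery_health(metrics_json: str, output_name: str = "splunk.0") -> dict:
    """Summarize one output's counters from the /api/v1/metrics payload."""
    output = json.loads(metrics_json)["output"][output_name]
    delivered = output.get("proc_records", 0)
    abandoned = output.get("retries_failed", 0)  # chunks given up on after retries
    return {
        "delivered": delivered,
        "retries": output.get("retries", 0),
        "abandoned_chunks": abandoned,
        "healthy": delivered > 0 and abandoned == 0,
    }

# Hypothetical payload; the numbers are illustrative only.
sample_metrics = json.dumps({
    "input": {"tail.0": {"records": 1200, "bytes": 480000}},
    "output": {"splunk.0": {"proc_records": 1180, "proc_bytes": 470000,
                            "errors": 0, "retries": 3, "retries_failed": 0}},
})
print(delivery_health(sample_metrics)["healthy"])  # prints True
```

A non-zero abandoned_chunks value means events were discarded, which is the signal to revisit Retry_Limit and SIEM reachability.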

Edge Cases & Troubleshooting

Symptom | Root Cause | Resolution
Events missing in SIEM but pods healthy | grep filter regex too strict, dropping events | Relax the Regex match or confirm logs contain the action field
Filesystem buffer growing unbounded | SIEM unreachable; shipper retrying | Check egress/TLS to ${SIEM_HOST}; cap disk use with storage.total_limit_size on the output
Duplicate events after a restart | At-least-once retry re-sent a buffered batch | Use event_id as the document _id (Id_Key event_id in the es output) to dedupe on ingest
Logs rejected with mapping conflict | Schema drift in a field type | Pin an index template defining actor and outcome types before ingest
Old logs not expiring | ILM delete phase min_age misconfigured | Verify min_age matches the retention window and rollover is firing

Frequently Asked Questions

Should the portal push to the SIEM directly or go through a shipper?

Use a shipper. A dedicated agent like Fluent Bit adds an on-disk buffer, batching, and retry, so a SIEM outage never blocks the portal or loses events. Pushing directly from application code couples request latency to SIEM availability and drops events on failure.

How do I guarantee delivery if the SIEM is temporarily down?

Enable filesystem buffering: storage.type filesystem on the input and storage.path in [SERVICE]. The shipper persists undelivered chunks to disk and replays them when the SIEM recovers, giving at-least-once delivery as long as Retry_Limit is high enough to outlast the outage. Deduplicate on event_id at the index to absorb the resulting retries.
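
Deduplication on event_id works because a create operation with an explicit _id is rejected when that _id already exists. The Python sketch below builds a _bulk body that uses event_id as the document _id and then mimics the create semantics in memory to show that a replayed batch adds nothing. It does not call Elasticsearch; the helper names and sample events are assumptions.

```python
import json

def to_bulk_ndjson(events: list[dict], index: str = "portal-audit") -> str:
    """Build an Elasticsearch _bulk body that creates each event under _id=event_id."""
    lines = []
    for event in events:
        lines.append(json.dumps({"create": {"_index": index, "_id": event["event_id"]}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # the _bulk API requires a trailing newline

def simulate_create(store: dict, events: list[dict]) -> int:
    """Mimic create semantics: an existing _id is rejected, never overwritten."""
    rejected = 0
    for event in events:
        if event["event_id"] in store:
            rejected += 1  # Elasticsearch would report a 409 conflict for this item
        else:
            store[event["event_id"]] = event
    return rejected

# Hypothetical batch of two audit events.
batch = [{"event_id": "evt-1", "action": "rbac.policy_update"},
         {"event_id": "evt-2", "action": "catalog.entity_delete"}]
store: dict = {}
simulate_create(store, batch)          # first delivery
dupes = simulate_create(store, batch)  # replay after a shipper restart
print(len(store), dupes)  # prints 2 2
```

Because the operation is create rather than index, this approach stays compatible with a shipper credential that only holds create_doc.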

What retention satisfies common compliance frameworks?

SOC 2 does not prescribe a fixed retention period, but auditors typically expect at least 12 months of accessible logs, so an ILM policy that keeps roughly 13 months (395 days) before deletion gives a safe margin. ISO 27001 and HIPAA may require longer archival; tier older logs to cold or frozen storage to control cost while meeting the window.