Exporting Audit Logs to a SIEM
Audit events are only useful for compliance and incident response once they leave the portal and land in a queryable, tamper-evident store. This guide ships structured JSON audit logs from a Backstage portal to a SIEM — Splunk or Elasticsearch — with reliable delivery, defined retention, and the field schema auditors expect.
Centralizing logs is the operational endpoint of Audit Logging & Compliance within your Authentication, RBAC & Security Governance program. The portal emits events; a log shipper batches and forwards them over TLS; the SIEM indexes them and enforces retention. Getting the shipping layer right is what makes SOC 2 and ISO 27001 evidence collection a query instead of a fire drill.
Prerequisites
- Structured logging enabled in Backstage >= 1.20.0 so the backend emits JSON (not pretty-printed) logs with a stable event_id, actor, action, and correlation_id.
- A log shipper: Fluent Bit >= 3.0 as a Kubernetes DaemonSet, or the Elastic Filebeat >= 8.13 agent. The examples use Fluent Bit.
- SIEM ingest endpoint and token: a Splunk HEC token or an Elasticsearch API key, stored in Vault and surfaced as ${SIEM_API_KEY}, never inline.
- Network egress allowlisting the SIEM endpoint on 443 with TLS 1.3, matching the egress baseline from the parent section.
- Retention policy decided: hot tier for recent queryable logs, cold/archive tier for the compliance window.
Exact Configuration
1. Emit audit events as structured JSON
Ensure the portal writes one JSON object per event to stdout so the shipper can parse it without regex. The envelope mirrors the schema from the audit-logging baseline.
# app-config.production.yaml — Requires Backstage >= 1.20.0
backend:
  logger:
    format: json
    level: info
  auditLog:
    enabled: true
    includeFields: [event_id, timestamp, actor, action, resource, outcome, correlation_id]
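To make the envelope concrete, here is a minimal Python sketch of what "one JSON object per event" looks like on stdout. It is an illustration only, not Backstage code: the helper names `build_audit_event` and `format_event` and every sample value are hypothetical, and only the field names come from the config above.

```python
import json
import uuid
from datetime import datetime, timezone

# Field names and order mirror includeFields in the config above.
AUDIT_FIELDS = (
    "event_id", "timestamp", "actor", "action",
    "resource", "outcome", "correlation_id",
)


def build_audit_event(actor, action, resource, outcome, correlation_id):
    """Build one audit event containing exactly the fields in AUDIT_FIELDS."""
    return {
        "event_id": str(uuid.uuid4()),                        # unique per event
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "actor": actor,
        "action": action,
        "resource": resource,
        "outcome": outcome,
        "correlation_id": correlation_id,
    }


def format_event(event):
    """Serialize to a single compact line: one JSON object per event."""
    return json.dumps(event, separators=(",", ":"))


if __name__ == "__main__":
    # All values below are hypothetical.
    event = build_audit_event(
        actor="user:default/jane",
        action="rbac.policy_update",
        resource="policy:catalog-admins",
        outcome="success",
        correlation_id="req-7f3a",
    )
    print(format_event(event))  # stdout is what the shipper tails
```

Keeping each event on a single line matters because the shipper reads line by line; a pretty-printed object would be split into several unparseable records.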
2. Ship logs with Fluent Bit
The pipeline tails the container logs, parses JSON, keeps only audit events, and forwards them. Batching and an on-disk buffer let delivery survive SIEM restarts, within the bounds of Retry_Limit.
# fluent-bit.conf — Requires Fluent Bit >= 3.0
[SERVICE]
    Flush                     5
    Log_Level                 info
    # Built-in HTTP server; the Validation step reads metrics from port 2020
    HTTP_Server               On
    HTTP_Port                 2020
    storage.path              /var/log/flb-storage/
    storage.backlog.mem_limit 64M

[INPUT]
    Name          tail
    Path          /var/log/containers/portal-backend-*.log
    Parser        docker
    Tag           portal.audit
    storage.type  filesystem

[FILTER]
    Name   grep
    Match  portal.audit
    Regex  log "action":

[OUTPUT]
    Name             splunk
    Match            portal.audit
    Host             ${SIEM_HOST}
    Port             443
    TLS              On
    TLS.Verify       On
    Splunk_Token     ${SIEM_API_KEY}
    Splunk_Send_Raw  On
    Retry_Limit      5
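When events go missing, it helps to check offline which lines the grep filter would keep. The sketch below approximates that filter in Python; it is not Fluent Bit itself (which uses its own regex engine), but for a literal pattern like this one the matching behaviour should be the same. The `docker_line` helper, the timestamp, and the sample messages are hypothetical.

```python
import json
import re

# Same literal pattern as the grep filter: Regex log "action":
AUDIT_PATTERN = re.compile(r'"action":')


def docker_line(message):
    """Wrap a message the way the Docker json-file log driver does."""
    return json.dumps({"log": message + "\n", "stream": "stdout",
                       "time": "2025-01-01T00:00:00Z"})  # hypothetical timestamp


def keep_record(raw_line):
    """Return True if the record's log key matches the audit pattern."""
    try:
        record = json.loads(raw_line)
    except json.JSONDecodeError:
        return False  # this sketch only models lines the docker parser can decode
    log = record.get("log", "") if isinstance(record, dict) else ""
    return isinstance(log, str) and AUDIT_PATTERN.search(log) is not None


if __name__ == "__main__":
    samples = {
        "compact audit event":
            docker_line('{"event_id":"e-1","action":"rbac.policy_update"}'),
        "audit event with a space before the colon":
            docker_line('{"event_id":"e-2","action" : "login"}'),
        "plain text log":
            docker_line("Listening on :7007"),
    }
    for label, line in samples.items():
        print(f"{label}: {'kept' if keep_record(line) else 'dropped'}")
```

The second sample is dropped even though it is a real audit event, which is the kind of serializer whitespace difference that makes a literal regex too strict.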
For an Elasticsearch destination, swap the output block:
# Requires Fluent Bit >= 3.0, Elasticsearch >= 8.13
[OUTPUT]
    Name                es
    Match               portal.audit
    Host                ${SIEM_HOST}
    Port                443
    TLS                 On
    HTTP_Auth_Header    ApiKey ${SIEM_API_KEY}
    Index               portal-audit
    Suppress_Type_Name  On
    Retry_Limit         5
3. Enforce retention at the SIEM
Shipping is half the contract; the SIEM must keep logs for the mandated window and then expire them. The example uses an Elasticsearch ILM policy: hot for 30 days, then cold, then delete at the compliance boundary.
// PUT _ilm/policy/portal-audit — Requires Elasticsearch >= 8.13
{
  "policy": {
    "phases": {
      "hot": { "actions": { "rollover": { "max_age": "30d", "max_primary_shard_size": "50gb" } } },
      "cold": { "min_age": "30d", "actions": { "readonly": {} } },
      "delete": { "min_age": "395d", "actions": { "delete": {} } }
    }
  }
}
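One detail worth working through: when a policy uses rollover, the min_age of later phases is counted from the rollover time, not from when an individual event was written. The Python sketch below does that arithmetic for the values in the policy above. The function names are hypothetical, and it ignores the ILM poll interval, so treat the results as approximate bounds.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the ILM policy above.
ROLLOVER_MAX_AGE = timedelta(days=30)
COLD_MIN_AGE = timedelta(days=30)
DELETE_MIN_AGE = timedelta(days=395)


def phase_timeline(index_created, rolled_over=None):
    """Return when each phase starts; later phases count from rollover."""
    if rolled_over is None:
        # No size-triggered rollover: the index rolls over at max_age.
        rolled_over = index_created + ROLLOVER_MAX_AGE
    return {
        "rollover": rolled_over,
        "cold": rolled_over + COLD_MIN_AGE,
        "delete": rolled_over + DELETE_MIN_AGE,
    }


def retention_bounds(index_created, rolled_over=None):
    """Shortest and longest time any event in the index is retained."""
    timeline = phase_timeline(index_created, rolled_over)
    shortest = timeline["delete"] - timeline["rollover"]  # written just before rollover
    longest = timeline["delete"] - index_created          # written at index creation
    return shortest, longest


if __name__ == "__main__":
    created = datetime(2025, 1, 1, tzinfo=timezone.utc)  # hypothetical creation date
    shortest, longest = retention_bounds(created)
    print(f"shortest retention: {shortest.days} days")  # 395
    print(f"longest retention:  {longest.days} days")   # 425
```

Every event is therefore kept for at least 395 days, and events written early in an index's life for up to 425, which is what to compare against the compliance window.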
4. Lock down the index against tampering
Restrict write access to the shipper’s API key only, and disable updates so events are append-only. This append-only posture is what auditors mean by tamper-evident. Scope the shipper’s credential narrowly using the same ownership discipline as your Role-Based Access Control Setup.
# Requires Elasticsearch >= 8.13 — create a write-only role for the shipper
curl -s -X POST "https://${SIEM_HOST}/_security/role/portal-audit-writer" \
-H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" \
-H "Content-Type: application/json" \
-d '{"indices":[{"names":["portal-audit*"],"privileges":["create_index","create_doc"]}]}'
Validation
# 1. Fluent Bit is parsing and matching audit events (not dropping them)
kubectl exec ds/fluent-bit -n logging -- curl -s http://127.0.0.1:2020/api/v1/metrics \
| jq '.output."splunk.0".proc_records'
# Expect: a non-zero, increasing count
# 2. The SIEM received a recent event (Elasticsearch)
# The shipper key from step 4 is write-only, so query with a key that can read the index
curl -s "https://${SIEM_HOST}/portal-audit*/_search?q=action:rbac.policy_update&size=1" \
  -H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" | jq '.hits.total.value'
# Expect: >= 1
# 3. ILM policy is attached and tracking the index
curl -s "https://${SIEM_HOST}/portal-audit*/_ilm/explain" \
  -H "Authorization: ApiKey ${SIEM_ADMIN_KEY}" | jq '.indices[].policy'
# Expect: "portal-audit"
# 4. Buffer is draining, not backing up
kubectl exec ds/fluent-bit -n logging -- du -sh /var/log/flb-storage/
# Expect: small and stable, not growing unbounded
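Check 1 asks for an increasing count, which takes two readings. A minimal Python sketch of that comparison follows, assuming two snapshots of the metrics JSON saved a few seconds apart. The `proc_records` field comes from the command above; the `retries_failed` field, the function names, and the sample numbers are assumptions to verify against your Fluent Bit version.

```python
def output_delta(before, after, output="splunk.0"):
    """Compare two metrics snapshots for one output plugin instance."""
    old = before["output"][output]
    new = after["output"][output]
    return {
        "delivered": new["proc_records"] - old["proc_records"],
        # Assumed field name; treated as 0 if the snapshot lacks it.
        "retries_failed": new.get("retries_failed", 0) - old.get("retries_failed", 0),
    }


def is_healthy(delta):
    """Healthy means records moved and no chunk exhausted its retries."""
    return delta["delivered"] > 0 and delta["retries_failed"] == 0


if __name__ == "__main__":
    # Hypothetical snapshots taken a few seconds apart.
    first = {"output": {"splunk.0": {"proc_records": 1200, "retries_failed": 0}}}
    second = {"output": {"splunk.0": {"proc_records": 1275, "retries_failed": 0}}}
    delta = output_delta(first, second)
    print(delta, "healthy" if is_healthy(delta) else "investigate")
```

A flat `delivered` count with a growing buffer directory points at the output side (egress, TLS, token), while a flat count with an empty buffer points back at the grep filter.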
Edge Cases & Troubleshooting
| Symptom | Root Cause | Resolution |
|---|---|---|
| Events missing in SIEM but pods healthy | grep filter regex too strict, dropping events | Relax the Regex match or confirm logs contain the action field |
| Filesystem buffer growing unbounded | SIEM unreachable; shipper retrying | Check egress/TLS to ${SIEM_HOST}; raise storage.backlog.mem_limit temporarily and cap disk use with storage.total_limit_size on the output |
| Duplicate events after a restart | At-least-once retry re-sent buffered batch | Index on event_id as the document _id to dedupe on ingest |
| Logs rejected with mapping conflict | Schema drift in a field type | Pin an index template defining actor and outcome types before ingest |
| Old logs not expiring | ILM delete phase min_age misconfigured | Verify min_age matches the retention window and rollover is firing |
Frequently Asked Questions
Should the portal push to the SIEM directly or go through a shipper?
Use a shipper. A dedicated agent like Fluent Bit adds an on-disk buffer, batching, and retry, so a SIEM outage does not block the portal and buffered events survive it. Pushing directly from application code couples request latency to SIEM availability and risks dropping events on failure.
How do I guarantee delivery if the SIEM is temporarily down?
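Enable filesystem buffering (storage.type filesystem) with a backlog limit. The shipper persists undelivered batches to disk and replays them when the SIEM recovers, giving at-least-once delivery; deduplicate on event_id at the index to absorb the resulting retries. Retry_Limit 5 in the examples discards a chunk after five failed attempts, so raise it, or set it to no_limits, if an outage can outlast the retry window.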
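Deduplicating on event_id can be sketched as follows for Elasticsearch. The code builds a _bulk request body that uses the create action with event_id as the document _id, so a replayed event is rejected with a 409 conflict instead of being stored twice. The in-memory `apply_creates` function is a hypothetical stand-in for the index, included only to show the outcome; Fluent Bit's es output has Id_Key and Write_Operation options that should map onto the same behaviour, but verify them against your version.

```python
import json


def bulk_create_body(events, index="portal-audit"):
    """Build an NDJSON _bulk body: one create action line, then the document."""
    lines = []
    for event in events:
        lines.append(json.dumps({"create": {"_index": index, "_id": event["event_id"]}}))
        lines.append(json.dumps(event))
    return "\n".join(lines) + "\n"  # _bulk bodies must end with a newline


def apply_creates(store, events):
    """Toy model of create semantics: 201 for a new _id, 409 for an existing one."""
    statuses = []
    for event in events:
        if event["event_id"] in store:
            statuses.append(409)  # duplicate from a retry; nothing is overwritten
        else:
            store[event["event_id"]] = event
            statuses.append(201)
    return statuses


if __name__ == "__main__":
    # Hypothetical batch, sent twice to mimic an at-least-once replay.
    batch = [
        {"event_id": "e-1", "action": "rbac.policy_update"},
        {"event_id": "e-2", "action": "login"},
    ]
    print(bulk_create_body(batch), end="")
    index = {}
    print(apply_creates(index, batch))  # first delivery
    print(apply_creates(index, batch))  # replay after a restart
    print(len(index), "documents stored")
```

Because create never overwrites, this approach also fits the write-only shipper role: the replay needs no update or delete privilege to be harmless.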
What retention satisfies common compliance frameworks?
SOC 2 audits typically expect at least 12 months of accessible logs, so an ILM policy that keeps roughly 13 months before deletion (the 395d delete phase in step 3) gives a safe margin. ISO 27001 and HIPAA may require longer archival; tier older logs to cold or frozen storage to control cost while meeting the window.