Observability — Pawrly

Pawrly can emit traces, metrics, and an activity log so you can see what queries run, how long they take, where the time goes, and what failed. It speaks OpenTelemetry (OTLP) and Prometheus, and exposes its own activity as a queryable SQL table.

Everything here is off by default — with no configuration, Pawrly logs to stderr as before and exports nothing. You turn pieces on with CLI flags for ad-hoc runs, or the observability: block in pawrly.yaml for a persistent setup. Flags override config.

Quick start

# JSON logs instead of text
pawrly --log-format json sql "SELECT 1"

# Export traces + metrics + logs to a local OpenTelemetry collector
pawrly --otel-endpoint http://localhost:4317 serve

# Scrape metrics with Prometheus (no collector needed)
pawrly --prometheus-listen 127.0.0.1:9090 serve &
curl -s localhost:9090/metrics | grep pawrly_

Or persist it in pawrly.yaml (see Configuration):

observability:
  otel:
    enabled: true
    endpoint: http://localhost:4317
    prometheus: { enabled: true, listen: 127.0.0.1:9090 }
  activity:
    enabled: true
    sinks: [tracing, table]
    redact_sql: literals
    store: ~/.pawrly/activity

Logging

Logs go to stderr in text (the default) or json format. The level is an EnvFilter directive: either a bare level such as info, or a target-scoped one such as my_crate=debug. RUST_LOG, when set, always wins.

Setting	Flag / env	Config
Level	`--log-level` / `PAWRLY_LOG` / `RUST_LOG`	`tracing.level`
Format	`--log-format` / `PAWRLY_LOG_FORMAT`	`tracing.format`

The subscriber is unified across the CLI, the daemon, and the MCP server, so all three log the same way.
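The precedence described above can be sketched as a simple first-match lookup. This is an illustration only, not Pawrly's code. Two things here are assumptions rather than documented behavior: that PAWRLY_LOG ranks between the flag and the config file, and that the fallback level is info. The function name resolve_log_level is hypothetical.

```python
import os
from typing import Mapping, Optional


def resolve_log_level(
    flag: Optional[str] = None,
    config: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
    default: str = "info",  # assumed default, not taken from the docs
) -> str:
    """Return the first log directive that is set, in precedence order."""
    env = os.environ if env is None else env
    candidates = (
        env.get("RUST_LOG"),    # documented: always wins
        flag,                   # documented: flags override config
        env.get("PAWRLY_LOG"),  # assumed position in the order
        config,                 # tracing.level from pawrly.yaml
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return default
```

Passing env explicitly keeps the function easy to exercise without touching the real process environment.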

Tracing

When OTLP export is on, Pawrly produces one span tree per operation and exports it over OTLP. Spans are named pawrly.<subsystem>.<op>:

Span	Covers
`pawrly.engine.query` / `pawrly.engine.semantic_query`	query execution
`pawrly.engine.explain` / `pawrly.engine.materialize`	explain / materialize
`pawrly.semantic.compile`	semantic-model → SQL compilation
`pawrly.cache.refresh`	cache write-through
`pawrly.source.http.request`	an outbound REST/GraphQL request
`pawrly.server.query`	the gRPC transport hop
`pawrly.mcp.tool`	an MCP tool call

Trace context is propagated as a W3C traceparent header across the gRPC (CLI → daemon) and MCP HTTP boundaries, so a request that crosses processes shows up as a single trace. SQL text and parameter values are never attached to spans, both to keep attribute cardinality low and to keep secrets out of traces; they live in the activity log, subject to redaction.
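For readers unfamiliar with the header, here is a minimal sketch of what a version 00 traceparent value carries and why a downstream process ends up in the same trace. It follows the W3C Trace Context layout (version, trace id, parent span id, flags) and is not Pawrly's implementation; the helper names are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# version-traceid-parentid-flags, all lowercase hex (W3C Trace Context, version 00)
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class TraceContext(NamedTuple):
    trace_id: str   # shared by every span in the trace
    parent_id: str  # span id of the caller
    sampled: bool   # lowest bit of the flags byte


def parse_traceparent(header: str) -> Optional[TraceContext]:
    """Parse a traceparent header; return None if it is malformed."""
    match = _TRACEPARENT.match(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero ids are invalid per the spec.
    if version == "ff" or trace_id == "0" * 32 or parent_id == "0" * 16:
        return None
    return TraceContext(trace_id, parent_id, bool(int(flags, 16) & 0x01))


def format_traceparent(ctx: TraceContext) -> str:
    """Serialize a context back into a version 00 header."""
    return f"00-{ctx.trace_id}-{ctx.parent_id}-{'01' if ctx.sampled else '00'}"


def child_of(ctx: TraceContext, new_span_id: str) -> TraceContext:
    """The receiving process keeps the trace id and records its own span id."""
    return ctx._replace(parent_id=new_span_id)
```

Because child_of keeps trace_id unchanged, spans created on either side of the boundary are stitched into one trace by the backend.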

Configure tracing under otel: with endpoint, protocol (grpc | http), service_name (the OTel service.name resource attribute, default pawrly), sample_ratio (parent-based), and the traces / logs toggles.
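To make "parent-based" concrete: a parent-based ratio sampler follows the caller's decision when one exists and applies the ratio only to root spans. The sketch below shows one common way to do that, comparing the low 64 bits of the trace id against a threshold. Whether Pawrly's sampler uses exactly this comparison is an assumption; should_sample is a hypothetical name.

```python
from typing import Optional

_LOW_64_BITS = (1 << 64) - 1


def should_sample(trace_id: int, parent_sampled: Optional[bool], sample_ratio: float) -> bool:
    """Decide whether to record a span.

    parent_sampled is None for a root span, otherwise the sampled flag
    received from the caller (for example via traceparent).
    """
    if parent_sampled is not None:
        # Parent-based: never override an upstream decision, so a trace
        # is either recorded in full or not at all.
        return parent_sampled
    # Root span: deterministic decision derived from the trace id, so every
    # process that sees the same id reaches the same answer.
    bound = round(sample_ratio * (1 << 64))
    return (trace_id & _LOW_64_BITS) < bound
```

With sample_ratio set to 0.1, roughly one in ten root traces would be kept, while any request arriving with a sampled parent is always recorded.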

Metrics

Metrics are exported by OTLP push (otel.metrics), by a Prometheus pull endpoint (otel.prometheus), or both. The two are independent, so you can enable either, both, or neither. The instruments:

Instrument	Type	Key attributes
`pawrly.query.total`	counter	`status`, `error_code`
`pawrly.query.duration`	histogram (ms)	`status`
`pawrly.query.rows_returned`	histogram
`pawrly.query.active`	up/down counter
`pawrly.semantic.compile.duration`	histogram (ms)
`pawrly.cache.refresh.duration`	histogram (ms)	`source`, `status`
`pawrly.source.request.total` / `.duration`	counter / histogram	`kind`, `status`, `http.response.status_code`
`pawrly.activity.dropped`	counter
`pawrly.activity.redaction_failed`	counter

The number of in-flight queries (active_queries) is also surfaced by pawrly status and the daemon health check.
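The instrument names above use dots, while the Prometheus endpoint serves names with underscores (which is why the quick start greps for pawrly_). A minimal sketch of that character mapping follows. It covers only character replacement; Prometheus exporters commonly also append unit or type suffixes, so the exact series names on /metrics are not shown here and should be checked against real output. prometheus_name is a hypothetical helper.

```python
import re


def prometheus_name(otel_name: str) -> str:
    """Map an OTel instrument name onto the Prometheus metric-name alphabet."""
    # Prometheus metric names allow only [a-zA-Z0-9_:]; everything else
    # (notably the dots in OTel names) becomes an underscore.
    name = re.sub(r"[^a-zA-Z0-9_:]", "_", otel_name)
    # A metric name may not start with a digit.
    if name and name[0].isdigit():
        name = "_" + name
    return name
```

Matching on the prefix, as in grep pawrly_query_, is more robust than matching a full name when the exporter's suffix rules are not known.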

Activity log

The activity log writes one structured record per query, whether raw SQL (query) or semantic (semantic_query), capturing who ran it, how long it took, how many rows it returned, and whether it failed. Enable it under activity: and choose one or more sinks:

tracing — emits each record as a structured tracing event (target pawrly.activity), so it flows to your logs and, with OTLP, to your log pipeline.

table — exposes the records as the system.activity SQL table, so you can query your own history with SQL:

SELECT interface, status, count(*) AS n, avg(duration_ms) AS avg_ms
FROM system.activity
WHERE at > now() - INTERVAL '1 hour'
GROUP BY 1, 2
ORDER BY n DESC;

`system.activity` columns

Column	Notes
`id`	operation id
`at`	completion time (UTC)
`interface`	how it entered: `cli`, `grpc`, `mcp`, `flight`, `in_process`
`principal`	authenticated identity, when known
`operation`	`query` / `semantic_query`
`sql`	redacted per `redact_sql`
`param_keys`	parameter keys only — never values
`status`	`ok` / `error`
`error_code`	stable error code on failures
`duration_ms`, `rows_returned`, `bytes`
`trace_id`	OTel trace id, to cross-reference a trace

SQL redaction

redact_sql controls how much of the query text is stored:

Mode	Stored
`false`	the SQL verbatim
`literals`	the SQL with literal values replaced by `$REDACTED` (shape kept)
`true`	only the statement kind and referenced tables

Parameter values are never stored under any mode. Redaction is leak-safe: if a statement can't be parsed, it degrades step by step (from literals, to statement kind and tables, to only the bare leading keyword such as SELECT) and never falls back to raw text. Redaction parses with the same grammar the engine runs.
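The degradation chain can be sketched as a list of redaction levels tried in order, with a last step that can only ever emit an allowlisted keyword. Everything below is a toy under stated assumptions: Pawrly parses with the engine's real grammar, whereas this sketch uses regular expressions and treats unbalanced quotes as "cannot parse". The output format of the tables level, the keyword allowlist, and the UNKNOWN placeholder are all hypothetical.

```python
import re

# Hypothetical allowlist: the last-resort step may only emit one of these words.
STATEMENT_KEYWORDS = {"SELECT", "INSERT", "UPDATE", "DELETE", "WITH", "CREATE", "DROP", "EXPLAIN"}
LITERAL = re.compile(r"'(?:[^']|'')*'|\b\d+(?:\.\d+)?\b")
TABLE = re.compile(r"\b(?:FROM|JOIN)\s+([A-Za-z_][\w.]*)", re.IGNORECASE)


def leading_keyword(sql: str) -> str:
    """Last resort: the first word, but only if it is a known statement kind."""
    match = re.match(r"\s*([A-Za-z]+)", sql)
    word = match.group(1).upper() if match else ""
    return word if word in STATEMENT_KEYWORDS else "UNKNOWN"


def toy_literals(sql: str) -> str:
    """Replace literals, keeping the statement shape. Raises if it cannot 'parse'."""
    if leading_keyword(sql) == "UNKNOWN" or sql.count("'") % 2:
        raise ValueError("statement not understood")
    return LITERAL.sub("$REDACTED", sql)


def toy_tables(sql: str) -> str:
    """Keep only the statement kind and referenced tables."""
    shape = toy_literals(sql)  # never search for tables inside unparsed text
    tables = TABLE.findall(shape)
    if not tables:
        raise ValueError("no table references recognised")
    return f"{leading_keyword(sql)} {' '.join(tables)}"


def redact(sql: str, mode="literals") -> str:
    """mode mirrors redact_sql: False, 'literals', or True."""
    if mode is False:
        return sql
    levels = [toy_literals, toy_tables] if mode == "literals" else [toy_tables]
    for level in levels:
        try:
            return level(sql)
        except ValueError:
            continue  # degrade to the next, coarser level
    return leading_keyword(sql)  # raw text is never returned
```

The key design point is that the loop has no branch that returns the input unchanged once redaction is enabled: every exit path goes through a level that succeeded or through the allowlisted keyword. In this toy both levels fail on the same inputs; with a real parser they would fail independently.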

Durability

Without store, system.activity is a bounded in-memory ring holding the most recent ring_capacity records, and it is lost on restart. Set store to persist records as date/hour-partitioned Parquet (dt=YYYY-MM-DD/hr=HH/…) so the table survives restarts: it unions the on-disk history with the not-yet-flushed buffer. partition_hours sets the width of each hr= bucket. Records are flushed when flush_threshold records have buffered, when the flush_interval timer fires, and on clean shutdown. retention prunes files older than its window; omit it to keep everything.

activity:
  enabled: true
  sinks: [table]
  store: ~/.pawrly/activity
  partition_hours: 4        # hr= bucket width
  flush_threshold: 1000     # records buffered before a file is written
  flush_interval: 60s       # or this, whichever comes first
  retention: 30d            # prune older files; omit to keep all history

Configuration reference

The full observability: block — every field and default — is documented in Configuration → Observability. A runnable config lives at examples/observability.yaml.