Pawrly ships a Model Context Protocol server so AI assistants — Claude Desktop, Cursor, Codex, and others — can query your workspace directly. The MCP server is a frontend: it runs the same engine as the CLI (in-process by default, or proxied to a pawrly serve daemon), so an agent sees exactly the data and semantic models you do.
Running it
pawrly mcp-stdioThis speaks MCP over stdio — the transport assistants launch as a subprocess. It honors the global engine-selection flags, so you can point it at a shared daemon:
pawrly mcp-stdio --remote uds:///path/to/pawrly.sockRun several agents against one daemon and they share one engine and one cache.
Over HTTP
For network clients, run the HTTP transport instead:
pawrly mcp-http --addr 127.0.0.1:8090It serves JSON-RPC at POST /mcp and a liveness probe at GET /healthz. Without a token it refuses to bind anything but a loopback address. To accept remote connections, require a bearer token — resolved from the config's secret backend or an environment variable of the same name:
pawrly mcp-http --addr 0.0.0.0:8090 --bearer-token-from MCP_TOKENEvery /mcp request must then carry Authorization: Bearer <token>.
Connecting Claude Desktop
Add Pawrly to your MCP client config (for Claude Desktop, claude_desktop_config.json):
{
"mcpServers": {
"pawrly": {
"command": "pawrly",
"args": ["mcp-stdio", "--config", "/absolute/path/to/pawrly.yaml"]
}
}
}Use an absolute path to the binary if pawrly isn't on the client's PATH. Other stdio-capable clients (Cursor, Codex, …) use the same command and args.
Tools
The server exposes these tools:
| Tool | Input | Returns |
|---|---|---|
query |
{ sql, max_rows?, query_id? } |
{ columns, rows, row_count, truncated } |
cancel_query |
{ query_id } |
{ cancelled } — aborts an in-flight query with that id |
list_sources |
{} |
the configured sources, their kinds, status, and table counts |
list_tables |
{ source? } |
the tables across configured sources |
search_tables |
{ query, source?, limit? } |
tables whose name or description match the keywords, ranked; { tables, match_count, truncated } |
list_columns |
{ table?, source?, name?, limit? } |
columns flattened one-per-row across tables; name greps column name/description; { columns, column_count, truncated } |
describe_table |
{ table } |
one table's columns, descriptions, pushdown affordances, examples, and agent-facing wiki notes |
get_schema |
{ sources?, compact? } |
a compact catalog overview for grounding an LLM |
refresh_table |
{ table } |
forces a cache refresh; returns rows written, size, and expiry |
materialize |
{ name, sql? | file? | url?, format?, params? } |
persists a named, self-backed table; { name, file_path, row_count, size_bytes } |
drop_materialized |
{ name } |
drops a materialized table; { dropped } |
list_semantic_models |
{} |
the semantic models with dimension/measure counts |
describe_semantic_model |
{ name } |
one model's full spec — dimensions, measures, relationships |
semantic_query |
a structured query (below) | { columns, rows, row_count, truncated } |
query
Run raw SQL. max_rows (default 1000) caps the rows returned. Pass a query_id so a concurrent cancel_query can abort a long-running scan.
{ "sql": "SELECT status, COUNT(*) FROM data.orders GROUP BY status", "max_rows": 100 }cancel_query
Abort an in-flight query or semantic_query that was started with the same query_id. Returns { cancelled } — false if no query with that id was running. The cancel arrives on a separate request, so it is effective over the HTTP transport (where a second connection can reach the server mid-query).
{ "query_id": "report-42" }list_sources
List every configured source with its kind, connection status, and table count. Takes no arguments.
{}list_tables
List tables across all sources, or limit to one with source. Each row carries the table's schema, name, kind, description, cache flag, and any required filters.
{ "source": "github" }search_tables
Keyword discovery for large catalogs. Matches the query terms against table names and descriptions (case-insensitive; every term must appear), ranking name hits ahead of description-only hits. Returns { tables, match_count, truncated }. Reach for this before describe_table when a source has hundreds of tables.
{ "query": "pull request review", "source": "github", "limit": 20 }list_columns
List columns flattened to one row per column — the column-level counterpart to list_tables. Scope with table (one table), source (one source), and/or name, a case-insensitive keyword over column name and description. Use name to find which tables expose a column like created_at or email. Returns { columns, column_count, truncated }.
{ "name": "created_at", "source": "github" }describe_table
Full detail for one fully-qualified <schema>.<table>: column schema, pushdown affordances, example queries, and agent-facing wiki usage notes.
{ "table": "github.pulls" }get_schema
A compact catalog overview for grounding an LLM in one call — every schema, its tables, and a one-line column list per table. Limit to named sources, or set compact: false for fuller detail.
{ "sources": ["github", "warehouse"], "compact": true }refresh_table
Force an immediate cache refresh of a fully-qualified table. Only valid for tables with caching enabled; returns the rows written, size on disk, and next expiry.
{ "table": "github.pulls" }materialize
Persist data as a named, self-backed table queryable as <namespace>.materialized.<name> (see materialize). Provide exactly one origin: sql (a query), file (a local CSV/Parquet/JSON path), or url (an http(s) file). Create-or-replace by name; the table is pinned and never auto-evicted. Returns { name, file_path, row_count, size_bytes }.
{ "name": "top_customers", "sql": "SELECT * FROM data.customers ORDER BY revenue DESC LIMIT 100" }drop_materialized
Drop a materialized table by name. Returns { dropped } — false if no such table existed.
{ "name": "top_customers" }list_semantic_models
List the semantic-layer models with their dimension and measure counts. Takes no arguments.
{}describe_semantic_model
Full spec for one model: its dimensions, measures, relationships, named segments (reusable filter sets you can pass in segments), and any required filters to satisfy up front.
{ "name": "orders" }semantic_query
Run a structured query against the semantic layer — the recommended surface for agents, because models give them a curated business vocabulary instead of raw column names.
{
"measures": ["orders.revenue"],
"dimensions": ["orders.order_date.month", "orders.status"],
"filters": [{ "member": "orders.status", "op": "equals", "values": ["paid"] }],
"order_by": [{ "member": "orders.order_date.month", "direction": "asc" }],
"limit": 100,
"params": { "tenant_id": "acme" },
"max_rows": 1000
}params binds ${param:NAME} placeholders used by a model's row-level-security predicates. If a model requires a param and the agent omits it, the query is refused before any scan — so an agent can't accidentally read across tenants.
Grounding agents
The intended flow for an assistant:
list_semantic_modelsto see what's available.describe_semantic_modelto learn a model's dimensions, measures, and any required filters.semantic_query(orqueryfor ad-hoc SQL) to get results.
Because describe_semantic_model advertises required filters and RLS params up front, an agent can satisfy them in the very next call.