# Pawrly — agent quick reference > Query APIs, files, MCP servers, and databases with SQL. Describe each source once in pawrly.yaml, then query it with stable table and column names from the CLI, a local service, or any MCP client. ## When to use Pawrly Use Pawrly when an agent or script needs data from more than one place — REST/GraphQL APIs, files (Parquet/CSV/JSON), object storage, MCP servers, databases, or warehouses — and you want one SQL question instead of a custom integration per source. It is the read and context path: reach for it before deciding or acting. For a single write to one system (create a ticket, send a message), call that system's own tool directly. ## Sources Pawrly can query - HTTP APIs — any REST or GraphQL endpoint; point at an OpenAPI 3.0 spec to get one table per GET. - Files & object storage — Parquet, CSV, JSON on local disk or S3 / GCS / Azure. - MCP servers — query tools from Linear, GitHub, Notion, internal systems, or other MCP servers as tables. - Databases — Postgres, MySQL, SQLite, DuckDB. - Warehouses & lakehouses — Snowflake, Iceberg, Delta, DuckLake. --- # Installing Pawrly > Query APIs, files, MCP servers, and databases with SQL. Describe each > source once, then use the same table and column names from the CLI, scripts, or > MCP clients. This document is written for LLMs and agents: it is the fastest path from nothing to a working `pawrly` binary and a first query. Everything here is POSIX-friendly and copy-pasteable. ## Install (macOS / Linux) Download a prebuilt binary for your platform: ```sh curl -fsSL https://pawrly.dev/install.sh | sh ``` This installs the `pawrly` binary to `~/.local/bin` (override with `PAWRLY_INSTALL_DIR`). It detects your OS/arch, verifies the SHA-256 checksum, and falls back to building from source with `cargo` if no prebuilt binary matches. Prebuilt binaries are published for Linux (`x86_64`, `aarch64`) and macOS (Apple Silicon and Intel). Pin a version or change the install location: ```sh curl -fsSL https://pawrly.dev/install.sh \ | PAWRLY_VERSION=v0.1.0 PAWRLY_INSTALL_DIR=/usr/local/bin sh ``` ## Install (Windows, PowerShell) ```powershell irm https://pawrly.dev/install.ps1 | iex ``` ## Install with Cargo Straight from source, no install script: ```sh cargo install --git https://github.com/CITGuru/pawrly pawrly-cli ``` Requires Rust ≥ 1.85 (2024 edition) and a C/C++ toolchain for DuckDB (`xcode-select --install` on macOS; `build-essential pkg-config libssl-dev cmake` on Debian/Ubuntu). ## Verify Run the engine with no sources, no network, no config: ```sh pawrly sql "SELECT 1 AS hello" ``` A single-row table confirms a healthy install. ## If you are an agent After install, inspect the workspace before querying: ```sh pawrly schema ``` Then use `pawrly sql ""` for reads, `pawrly mcp-stdio` when connecting an MCP client, and `pawrly validate` after editing `pawrly.yaml`. ## First query — join two local files Create `pawrly.yaml`: ```yaml version: 1 name: quickstart sources: - name: data kind: file tables: - name: customers path: ./data/customers.csv format: csv - name: orders path: ./data/orders.csv format: csv ``` Then query across both files in one statement: ```sh pawrly sql " SELECT c.name, COUNT(o.id) AS orders, SUM(o.amount_cents)/100 AS total FROM data.customers c LEFT JOIN data.orders o ON o.customer_id = c.id GROUP BY c.name ORDER BY total DESC " ``` ## First query — join two live APIs Describe each API once, then join them in plain SQL — no SDKs, no pagination loops: ```yaml version: 1 name: quickstart secrets: - kind: env # resolves ${secret:NAME} from environment variables sources: - name: stripe kind: http config: base_url: https://api.stripe.com auth: type: header headers: - name: Authorization bearer: ${secret:STRIPE_API_KEY} tables: - name: customers endpoint: /v1/customers response: path: $.data schema: - { name: email, type: varchar } - { name: delinquent, type: bool } ``` ```sh pawrly sql "SELECT email FROM stripe.customers WHERE delinquent = true" ``` Point an `http` source at an OpenAPI 3.0 spec and Pawrly synthesizes one table per `GET` operation automatically — no hand-written schema. ## Connect Pawrly to an agent (MCP) Pawrly ships an MCP server, so Claude Desktop, Cursor, Codex, and other clients can query the same workspace your CLI uses, over stdio or HTTP: ```sh pawrly mcp-stdio --config /absolute/path/to/pawrly.yaml ``` Pawrly also *consumes* other MCP servers as sources — their tools become tables you can query and join. ## Useful CLI commands - `pawrly sql ""` — run a query. - `pawrly schema` — list every table the workspace knows about. - `pawrly validate` — sanity-check the YAML without running anything. - `pawrly serve --config ./pawrly.yaml` — run a local daemon for faster invocations. - `pawrly status` — confirm a running daemon and that sources loaded. ## Environment overrides for the install script - `PAWRLY_VERSION` — tag to install (e.g. `v0.1.0`). Default: latest release. - `PAWRLY_INSTALL_DIR` — directory to install into. Default: `$HOME/.local/bin`. - `PAWRLY_REPO` — `owner/repo` to pull releases from. Default: `CITGuru/pawrly`. - `PAWRLY_NO_VERIFY` — set to `1` to skip SHA-256 checksum verification. - `PAWRLY_BUILD_FROM_SOURCE` — set to `1` to `cargo install` instead of a prebuilt. ## Links - Source: https://github.com/CITGuru/pawrly - Docs: https://github.com/CITGuru/pawrly#quickstart - Sources reference: https://github.com/CITGuru/pawrly/blob/main/docs/sources.md - MCP guide: https://github.com/CITGuru/pawrly/blob/main/docs/mcp.md - Semantic layer: https://github.com/CITGuru/pawrly/blob/main/docs/semantic.md - JSON Schema for `pawrly.yaml`: https://pawrly.dev/pawrly.schema.json ## Editor completion & validation The JSON Schema for `pawrly.yaml` is published at **https://pawrly.dev/pawrly.schema.json**. Reference it once at the top of your config and most editors (via the YAML language server) will give you inline completion, hover docs, and validation: ```yaml # yaml-language-server: $schema=https://pawrly.dev/pawrly.schema.json version: 1 name: my-workspace ``` --- ## Features ### Save a result and query it later Turn a slow query, local file, or remote URL into a table Pawrly can reuse. Agents get stable data without fetching or rebuilding the same answer every time. More: https://pawrly.dev/features/materialization ### Give agents the right business vocabulary Name the metrics, fields, joins, and filters your team trusts. Then people and agents can ask for revenue, customers, or usage without guessing how your tables work. More: https://pawrly.dev/features/semantic-layer ### See what your queries are doing See who queried what, what failed, and which sources are slow. Keep a safe query history for people and agents, then send the signals to the monitoring tools you already use. More: https://pawrly.dev/features/observability ## Writing ### Agents Need a Query Surface, Not More Tools I keep coming back to the same problem with agents. Read: https://pawrly.dev/blog/agents-need-a-query-surface-not-more-tools ## Reference - Documentation / quickstart: https://github.com/CITGuru/pawrly#quickstart - Sources reference: https://github.com/CITGuru/pawrly/blob/main/docs/sources.md - MCP guide: https://github.com/CITGuru/pawrly/blob/main/docs/mcp.md - Semantic layer: https://github.com/CITGuru/pawrly/blob/main/docs/semantic.md - JSON Schema for pawrly.yaml: https://pawrly.dev/pawrly.schema.json - Agent skills (Claude Code / Codex plugin): https://pawrly.dev/skill.md - Machine-readable index: https://pawrly.dev/llms.txt - Source code: https://github.com/CITGuru/pawrly