Configuration

Single YAML file, passed as CLI argument or set via the PRAXIS_CONFIG environment variable. See example configs for working examples.

Structure

listeners:             # Required. Named listeners to bind.
filter_chains:         # Named, reusable filter chains.
clusters:              # Optional. Standalone cluster defs (health checks).
admin:                 # Optional. Admin health endpoint.
body_limits:           # Optional. Global body size ceilings.
runtime:               # Optional. Thread pool and logging tuning.
shutdown_timeout_secs: # Optional. Graceful drain time (default: 30).
insecure_options:      # Optional. Dev/test overrides. See development.md.

Validating Configuration

Use --validate (or -t) to check configuration without starting the server. The flag loads the config through the same parsing and validation path used during startup, including filter pipeline construction and ordering checks.

praxis --validate --config praxis.yaml
praxis -t -c praxis.yaml

Exits 0 on success (no output). Exits non-zero and prints an error to stderr on failure. Does not bind listener ports or enter the server runtime.

Dumping Effective Configuration

Use --dump (or -T) to validate and dump the effective parsed configuration as YAML to stdout. The output includes the effective parsed config (with defaults applied) plus resolved top-level listener chains.

praxis --dump --config praxis.yaml
praxis -T -c praxis.yaml

Exits 0 on valid config, writing YAML to stdout. Exits non-zero and writes errors to stderr on failure. Does not start the proxy or bind listeners. --dump and --validate are mutually exclusive.

Dynamic Configuration Reload

Praxis watches the config file for changes and automatically reloads filter pipelines without restart or disruption. When the file is modified, the server validates the new config, rebuilds pipelines, and swaps them atomically. In-flight requests complete on the old pipeline; new requests pick up the new config.

If the new config is invalid (bad YAML, unknown filter, validation failure), the server logs the error and continues serving with the old config.

Dynamically reloadable:

Filter pipeline configuration
Router routes and path mappings
Load balancer endpoints and weights
Rate limit and circuit breaker settings
Health check configuration

Requires restart (logged as warning):

Listener add, remove, or address rebind
Protocol changes (HTTP to TCP)
Compression module addition
TLS enable/disable

Stateful filters (rate limiter, circuit breaker) reset their state on reload. Operators should expect a brief burst window for rate limiters and a closed circuit for circuit breakers immediately after reload.

See hot-reload.yaml for an example.

Admin

admin.address binds a separate HTTP listener that serves /healthy, /ready, and /metrics.

/healthy returns 200 OK with {"status":"ok"} once the server is accepting connections (liveness).
/ready returns per-cluster health status with healthy/unhealthy/total counts when active health checks are configured; it returns 503 SERVICE UNAVAILABLE when any cluster has zero healthy endpoints. Without health checks, /ready returns {"status":"ok"}.
/metrics returns Prometheus text exposition format with HTTP request metrics (praxis_http_requests_total, praxis_http_request_duration_seconds).

Any other path returns 404 NOT FOUND. Useful for orchestrator health checks and monitoring without exposing them on the main listeners.

admin:
  address: "127.0.0.1:9901"

When admin.verbose: true, the /ready response includes per-cluster detail (cluster names, health counts). Default is false to avoid leaking internal topology.

admin:
  address: "127.0.0.1:9901"
  verbose: true

By default, binding admin to a public interface (0.0.0.0 / [::]) is a validation error.

Annotated Example

listeners:
  - name: web
    address: "0.0.0.0:8080"
    filter_chains:
      - observability
      - routing

filter_chains:
  - name: observability
    filters:
      - filter: request_id
      - filter: access_log

  - name: routing
    filters:
      - filter: router
        routes:
          - path_prefix: "/api/"
            cluster: api
          - path_prefix: "/"
            cluster: web
      - filter: load_balancer
        clusters:
          - name: api
            endpoints: ["127.0.0.1:4000"]
          - name: web
            endpoints:              # multi-line form
              - "127.0.0.1:3000"   # (equivalent to inline
              - "127.0.0.1:3001"   #  array above)

Listeners

Each listener has a required name, an address, optional tls, optional protocol (defaults to http), and an optional list of filter_chains to apply. When filter_chains is omitted it defaults to empty (no filters applied).

listeners:
  - name: public
    address: "0.0.0.0:80"
    filter_chains: [main]

  - name: secure
    address: "0.0.0.0:443"
    filter_chains: [main]
    tls:
      certificates:
        - cert_path: /etc/praxis/tls/cert.pem
          key_path: /etc/praxis/tls/key.pem

The name field uniquely identifies the listener and is used to resolve its pipeline at startup.

Network Binding

Binding to 0.0.0.0 or [::] exposes the listener on all network interfaces. For local development, prefer 127.0.0.1. In production, bind to specific internal IPs and use firewall rules to restrict access. The default configuration binds to 127.0.0.1:8080 as a security precaution.

TCP Listeners

TCP listeners set protocol: tcp and require an upstream address. Filter chains are optional for TCP listeners.

listeners:
  - name: postgres
    address: "0.0.0.0:5432"
    protocol: tcp
    upstream: "10.0.0.1:5432"

Optional tcp_idle_timeout_ms closes connections that have been idle longer than the specified duration:

listeners:
  - name: postgres
    address: "0.0.0.0:5432"
    protocol: tcp
    upstream: "10.0.0.1:5432"
    tcp_idle_timeout_ms: 300000   # 5 minutes

Optional tcp_max_duration_secs caps the total session duration regardless of activity:

listeners:
  - name: postgres
    address: "0.0.0.0:5432"
    protocol: tcp
    upstream: "10.0.0.1:5432"
    tcp_max_duration_secs: 3600   # 1 hour

Downstream Read Timeout

Optional downstream_read_timeout_ms sets how long the proxy waits for data from downstream clients during body reads. Mitigates slow-body attacks on HTTP listeners.

listeners:
  - name: web
    address: "0.0.0.0:8080"
    downstream_read_timeout_ms: 10000   # 10 seconds
    filter_chains: [main]

Pingora applies its own 60s default for initial request header reads on fresh connections. This setting controls body read timeouts within an active request.

Max Connections

Optional max_connections caps concurrent connections per listener. HTTP listeners reject excess requests with 503 Service Unavailable and a Retry-After: 1 header. TCP listeners close the socket immediately.

listeners:
  - name: public
    address: "0.0.0.0:8080"
    max_connections: 10000
    filter_chains: [main]

The limit is enforced via a per-listener semaphore. Permits are held for the request lifetime (HTTP) or connection lifetime (TCP) and released automatically on completion, error, or timeout. Each listener has an independent limit.

See max-connections.yaml for an example.

Mixed Protocols

HTTP and TCP listeners can run on a single server instance. Each listener gets its own filter chains appropriate to its protocol.

listeners:
  - name: web
    address: "0.0.0.0:8080"
    filter_chains: [routing]

  - name: db
    address: "0.0.0.0:5432"
    protocol: tcp
    upstream: "10.0.0.1:5432"

See TLS & mTLS for TLS details.

Filter Chains

Named filter chains are defined at the top level. Each chain has a name and an ordered list of filters. Listeners reference chains by name via filter_chains:.

filter_chains:
  - name: security
    filters:
      - filter: headers
        response_set:
          - name: "X-Content-Type-Options"
            value: "nosniff"

  - name: observability
    filters:
      - filter: request_id
      - filter: access_log

  - name: routing
    filters:
      - filter: router
        routes:
          - path_prefix: "/"
            cluster: backend
      - filter: load_balancer
        clusters:
          - name: backend
            endpoints: ["10.0.0.1:8080"]

Chain Composition

A listener can reference multiple chains. The filters from each chain are concatenated in order to form the listener's complete pipeline. This enables reuse without duplication.

listeners:
  - name: public
    address: "0.0.0.0:8080"
    filter_chains:
      - security
      - observability
      - routing

  - name: internal
    address: "0.0.0.0:9090"
    filter_chains:
      - observability
      - routing

The public listener runs security + observability + routing. The internal listener skips security but shares the same observability and routing chains.

Protocol Compatibility

Filters are protocol-aware. HTTP filters (e.g. router, load_balancer) only work on HTTP listeners. TCP filters (e.g. tcp_access_log) work on both HTTP and TCP listeners. An HTTP listener's protocol stack includes TCP, so it supports TCP-level filters too.

Built-in Filters

Filter	Category	Protocol
`router`	Traffic Management	HTTP
`load_balancer`	Traffic Management	HTTP
`timeout`	Traffic Management	HTTP
`static_response`	Traffic Management	HTTP
`rate_limit`	Traffic Management	HTTP
`circuit_breaker`	Traffic Management	HTTP
`headers`	Transformation	HTTP
`request_id`	Observability	HTTP
`access_log`	Observability	HTTP
`tcp_access_log`	Observability	TCP
`forwarded_headers`	Security	HTTP
`guardrails`	Security	HTTP
`ip_acl`	Security	HTTP
`credential_injection`	Security	HTTP
`a2a`	Payload Processing	HTTP
`json_body_field`	Payload Processing	HTTP
`json_rpc`	Payload Processing	HTTP
`mcp`	Payload Processing	HTTP
`compression`	Payload Processing	HTTP
`cors`	Security	HTTP
`csrf`	Security	HTTP
`redirect`	Traffic Management	HTTP
`path_rewrite`	Transformation	HTTP
`url_rewrite`	Transformation	HTTP
`sni_router`	Traffic Management	TCP
`tcp_load_balancer`	Traffic Management	TCP
`model_to_header`	AI / Inference	HTTP (requires `ai-inference` feature)
`prompt_enrich`	AI / Inference	HTTP (requires `ai-inference` feature)

Router

Routes requests to clusters by path prefix. Longest prefix wins. Optional host restricts matching to a specific Host header. Optional headers restricts matching to requests with all specified header values present (AND semantics, case-sensitive). Routes without host match any host.

Example configs: path-based-routing.yaml, hosts.yaml, canary-routing.yaml.

Load Balancing

Strategies:

round_robin (default): cycles through endpoints
least_connections: picks endpoint with fewest active requests
consistent_hash: hashes a request header (or URI path as fallback) to pin requests to stable endpoints

Example configs: weighted-load-balancing.yaml, least-connections.yaml, session-affinity.yaml.

Cluster-level options: connection_timeout_ms, total_connection_timeout_ms, idle_timeout_ms, read_timeout_ms, write_timeout_ms, tls.

total_connection_timeout_ms sets the combined budget for TCP connect and TLS handshake. When used alongside connection_timeout_ms, the difference is effectively the TLS handshake budget. It must be >= connection_timeout_ms.

Cluster tls enables TLS to the upstream. See TLS & mTLS for full details on upstream TLS, mTLS, CA trust, and certificate verification.

Health Checks

Clusters support active health checks via the health_check field. Endpoints that fail consecutive probes are removed from load balancer rotation until they recover. See health-checks.yaml.

Field	Type	Default	Description
`type`	string	required	`"http"` or `"tcp"` (`"grpc"` parses but is not yet supported)
`path`	string	`"/"`	HTTP path to probe (HTTP only)
`expected_status`	integer	200	Expected HTTP status code
`interval_ms`	integer	5000	Probe interval in ms
`timeout_ms`	integer	2000	Per-probe timeout in ms
`healthy_threshold`	integer	2	Consecutive successes to mark healthy
`unhealthy_threshold`	integer	3	Consecutive failures to mark unhealthy
`passive_unhealthy_threshold`	integer	none	Consecutive upstream failures (5xx or connect error) to mark unhealthy without probes
`passive_healthy_threshold`	integer	none	Consecutive upstream successes to recover a passively-marked endpoint

TCP health checks only verify a TCP connection can be established; path and expected_status are ignored. When active health checks are configured, the admin /ready endpoint reports per-cluster health counts.

Passive health checking tracks upstream request outcomes inline. When passive_unhealthy_threshold is set, endpoints that return consecutive 5xx responses or connect errors are marked unhealthy without dedicated probe traffic. Set passive_healthy_threshold to control how many consecutive successes are required to recover. Passive and active checks can be used together; either mechanism can mark an endpoint unhealthy, and either can recover it.

By default, health check endpoints that resolve to loopback, link-local, or cloud metadata addresses are rejected (SSRF protection).

Headers

Add headers to requests; add, set, or remove headers on responses:

- filter: headers
  request_add:
    - name: "X-Forwarded-Proto"
      value: "https"
  response_add:
    - name: "X-Served-By"
      value: "praxis"
  response_set:
    - name: "Server"
      value: "praxis"
  response_remove:
    - "X-Powered-By"

add appends (preserves existing), set replaces, remove deletes. Request headers support add only. Response headers support all three operations.

Timeout

Returns 504 if upstream response exceeds configured duration:

- filter: timeout
  timeout_ms: 5000

Request ID

Propagates an existing request ID header or generates a new one:

- filter: request_id
  header_name: "X-Request-Id"   # optional, this is the default

Access Log

Structured JSON logging of method, path, status, and timing:

- filter: access_log

Optional sampling to reduce log volume:

- filter: access_log
  sample_rate: 0.1    # log ~10% of requests

Forwarded Headers

Injects X-Forwarded-For, X-Forwarded-Proto, and X-Forwarded-Host into upstream requests:

- filter: forwarded_headers
  trusted_proxies:
    - "10.0.0.0/8"
    - "172.16.0.0/12"

When the client IP is from a trusted proxy, existing X-Forwarded-For values are preserved. Otherwise, the header is overwritten to prevent spoofing.

IP ACL

Allow or deny requests by source IP/CIDR:

- filter: ip_acl
  allow:
    - "10.0.0.0/8"

Use either allow or deny, not both (mutually exclusive). When allow is set, only matching IPs are permitted (implicit deny-all). Denied requests receive a 403 Forbidden response.

Credential Injection

Injects per-cluster API credentials into upstream requests and strips client-provided credentials to prevent forwarding. Pair with a source discriminator (IP ACL, client authentication) to control which clients receive credential upgrades. See credential-injection.yaml.

- filter: credential_injection
  clusters:
    - name: openai
      header: Authorization
      value: "sk-example-key"
      header_prefix: "Bearer "
      strip_client_credential: true

Field	Type	Required	Description
`clusters[].name`	string	yes	Cluster to inject credentials for
`clusters[].header`	string	yes	Header name to set
`clusters[].value`	string	one of	Inline credential value
`clusters[].env_var`	string	one of	Environment variable containing the credential
`clusters[].header_prefix`	string	no	Prefix prepended to the value (e.g. `"Bearer "`)
`clusters[].strip_client_credential`	bool	no	Remove client-sent value before injection (default: true)

TCP Access Log

Structured JSON logging of TCP connections. Works on both TCP and HTTP listeners:

- filter: tcp_access_log

SNI Router

Routes TLS connections to upstream addresses based on the SNI hostname from the TLS ClientHello. Supports exact matches and wildcard patterns (e.g. *.example.com). Performs exact-match lookup first, then longest-suffix wildcard match. Matching is case-insensitive per RFC 4343.

- filter: sni_router
  routes:
    - server_names: ["api.example.com"]
      upstream: "10.0.0.1:443"
    - server_names: ["*.example.com"]
      upstream: "10.0.0.2:443"
  default_upstream: "10.0.0.3:443"

Field	Type	Required	Description
`routes`	list	yes	SNI route entries
`routes[].server_names`	list	yes	Exact or wildcard hostnames
`routes[].upstream`	string	yes	Upstream address for matches
`default_upstream`	string	no	Fallback when no route matches

Connections without SNI or with no matching route use default_upstream if configured, otherwise receive a 421 rejection. Bare wildcards (*), IP addresses as server names, and duplicate server names across routes are rejected at config validation.

TCP Load Balancer

Selects an upstream TCP endpoint from a cluster using the configured load-balancing strategy. Reads ctx.cluster to find the target cluster, selects an endpoint, and writes ctx.upstream_addr. Supports round-robin (default), least-connections, and consistent-hash strategies.

- filter: tcp_load_balancer
  clusters:
    - name: db_pool
      endpoints:
        - "10.0.0.1:5432"
        - "10.0.0.2:5432"

Weighted endpoints and strategy selection follow the same syntax as the HTTP load_balancer filter. Health check integration is supported; if all endpoints are unhealthy, the filter enters panic mode and routes to all endpoints.

JSON Body Field

Extracts a top-level field from a JSON request body and promotes its value to a request header. Uses StreamBuffer mode to inspect the body before upstream selection, enabling body-based routing.

- filter: json_body_field
  field: model
  header: X-Model

field is the JSON key to extract. header is the request header name to promote the value into. If the field is missing or the body is not valid JSON, the filter passes through without modification.

JSON-RPC

Parses JSON-RPC 2.0 request bodies and promotes method, id, and message kind to request headers for routing. Uses StreamBuffer mode to inspect the body before upstream selection.

- filter: json_rpc
  max_body_bytes: 1048576
  batch_policy: reject
  on_invalid: continue

Field	Type	Default	Description
`max_body_bytes`	integer	1048576	Maximum body size to buffer (1 MiB)
`batch_policy`	string	`"reject"`	`"reject"` returns 400 for batch arrays; `"first"` uses first valid request
`on_invalid`	string	`"continue"`	`"continue"` passes non-JSON through; `"reject"` returns 400; `"error"` raises a filter error
`headers.method`	string	`"X-Json-Rpc-Method"`	Header name for the JSON-RPC method
`headers.id`	string	`"X-Json-Rpc-Id"`	Header name for the JSON-RPC id
`headers.kind`	string	`"X-Json-Rpc-Kind"`	Header name for the message kind

Message kinds: request, notification, response, batch. The filter also writes json_rpc.* entries to the filter result set for branch chain conditions.

MCP

Extracts Model Context Protocol metadata from JSON-RPC request bodies and promotes method, tool/resource/prompt name, session ID, and protocol version to request headers and filter results for routing. Validates MCP headers against body-derived values when header_validation is configured.

- filter: mcp
  max_body_bytes: 65536
  on_invalid: reject

Field	Type	Default	Description
`max_body_bytes`	integer	65536	Maximum body size to buffer (64 KiB)
`on_invalid`	string	`"reject"`	`"reject"` returns 400 for non-MCP; `"continue"` passes through
`header_validation.mismatch`	string	`"reject"`	`"reject"` or `"ignore"` when MCP headers conflict with body values
`header_validation.missing`	string	`"ignore"`	`"ignore"`, `"synthesize"` (inject from body), or `"reject"`
`headers.method`	string	`"x-praxis-mcp-method"`	Header name for MCP method
`headers.name`	string	`"x-praxis-mcp-name"`	Header name for tool/resource/prompt name
`headers.kind`	string	`"x-praxis-mcp-kind"`	Header name for JSON-RPC kind
`headers.session_present`	string	`"x-praxis-mcp-session-present"`	Header name for session presence

Recognized MCP methods include initialize, tools/call, tools/list, resources/read, resources/list, prompts/get, prompts/list, ping, and others. Methods requiring a name selector (tools/call, resources/read, prompts/get) return a JSON-RPC error if the selector is missing and on_invalid is "reject". The filter writes mcp.* and json_rpc.* entries to the filter result set for branch chain conditions.

Static Response

Returns a fixed response without contacting any upstream. Useful for health checks, status endpoints, or stub routes:

- filter: static_response
  status: 200
  headers:
    - name: Content-Type
      value: application/json
  body: '{"status": "ok", "server": "praxis"}'

status is required. headers and body are optional. Combine with conditions to serve static responses on specific paths.

Rate Limit

Token bucket rate limiter. Supports per_ip (one bucket per source IP) and global (one shared bucket) modes. Rejects excess traffic with 429 and Retry-After header. Injects X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers into both rejections and successful responses.

- filter: rate_limit
  mode: per_ip        # "per_ip" or "global"
  rate: 100           # tokens replenished per second
  burst: 200          # maximum bucket capacity

Field	Type	Required	Description
`mode`	string	yes	`"per_ip"` or `"global"`
`rate`	float	yes	Tokens per second (must be > 0)
`burst`	integer	yes	Max bucket capacity (must be >= rate)

Circuit Breaker

Per-cluster circuit breaker that prevents cascading failures. When consecutive upstream failures reach the threshold, the circuit opens and subsequent requests receive 503 immediately. After the recovery window, a single probe request is forwarded; if it succeeds the circuit closes. See circuit-breaker.yaml.

- filter: circuit_breaker
  clusters:
    - name: backend
      consecutive_failures: 5
      recovery_window_secs: 30

Field	Type	Required	Description
`clusters[].name`	string	yes	Cluster name to protect
`clusters[].consecutive_failures`	integer	yes	Failures before opening
`clusters[].recovery_window_secs`	integer	yes	Seconds before half-open probe

Guardrails

Rejects requests matching string or regex rules against headers and/or body content. Rejected requests receive 403 Forbidden.

- filter: guardrails
  rules:
    - target: header
      name: "User-Agent"
      pattern: "bad-bot.*"
    - target: body
      contains: "DROP TABLE"
    - target: body
      pattern: "^\\{.*\\}$"
      negate: true

Each rule has:

Field	Type	Required	Description
`target`	string	yes	`"header"` or `"body"`
`name`	string	header only	Header name to inspect
`contains`	string	one of	Literal substring match
`pattern`	string	one of	Regex pattern match
`negate`	bool	no	Invert match (default: false)

Each rule must have either contains or pattern, not both. Body rules use StreamBuffer mode (up to 1 MiB by default) to inspect the full request body.

CORS

Spec-compliant CORS filter with preflight handling, origin validation, and credential support. See cors.yaml.

Field	Type	Default	Description
`allow_origins`	list	required	Origins to allow; `["*"]` for any
`allow_methods`	list	GET, HEAD, POST	Allowed HTTP methods
`allow_headers`	list	none	Allowed request headers
`expose_headers`	list	none	Response headers exposed to client
`allow_credentials`	bool	false	Include credentials header
`max_age`	integer	86400	Preflight cache duration (seconds)
`allow_private_network`	bool	false	Private Network Access support
`disallowed_origin_mode`	string	"omit"	`"omit"` or `"reject"` for non-matching origins
`allow_null_origin`	bool	false	Allow `Origin: null`

Wildcard subdomain patterns (e.g. https://*.example.com) are supported. allow_credentials: true is incompatible with wildcard origins, methods, or headers per the Fetch spec.

CSRF

Cross-site request forgery protection via origin validation. Safe methods (GET, HEAD, OPTIONS by default) bypass the check. State-changing methods require an Origin or Referer header matching the trusted origins. Rejected requests receive 403 Forbidden. See csrf.yaml.

- filter: csrf
  trusted_origins:
    - "https://app.example.com"
    - "https://*.example.com"
  enforce_percentage: 100
  enable_sec_fetch_site: true

Field	Type	Default	Description
`trusted_origins`	list	required	Origins to allow; `["*"]` for any
`safe_methods`	list	GET, HEAD, OPTIONS	Methods that bypass CSRF checks
`enforce_percentage`	integer	100	Percentage of requests to enforce (0..=100); enables gradual rollout
`enable_sec_fetch_site`	bool	false	Reject requests with `Sec-Fetch-Site: cross-site`

Wildcard subdomain patterns (e.g. https://*.example.com) are supported. A bare wildcard ("*") cannot be mixed with other origins. Set insecure_options.csrf_log_only: true to log violations without rejecting requests during initial rollout.

Redirect

Returns a 3xx redirect without contacting any upstream:

- filter: redirect
  status: 301
  location: "https://example.com${path}"

Field	Type	Default	Description
`status`	integer	301	Redirect status (301, 302, 307, or 308)
`location`	string	required	URL template; `${path}` and `${query}` are substituted

Path Rewrite

Rewrites the request path before forwarding to upstream. Exactly one of strip_prefix, add_prefix, or replace per filter instance. Query strings are preserved. See path-rewriting.yaml.

Field	Type	Description
`strip_prefix`	string	Remove this prefix from the path
`add_prefix`	string	Prepend this prefix to the path
`replace.pattern`	string	Regex pattern to match
`replace.replacement`	string	Replacement string (`$1`, `$name` captures)

URL Rewrite

Regex-based path transformation and query string manipulation. Operations applied in order: regex_replace, strip_query_params, add_query_params. See url-rewriting.yaml.

Compression

Gzip, brotli, and zstd response compression. All three enabled by default. See compression.yaml.

Field	Type	Default	Description
`level`	integer	6	Default compression level (1-12)
`min_size_bytes`	integer	256	Skip responses smaller than this
`gzip`	object	enabled	Per-algorithm `enabled` and `level`
`brotli`	object	enabled	Per-algorithm `enabled` and `level`
`zstd`	object	enabled	Per-algorithm `enabled` and `level`
`content_types`	list	see above	MIME type prefixes that qualify

At least one algorithm must be enabled.

Prompt Enrich

Injects statically configured messages into OpenAI-compatible chat completion request bodies. The filter parses the JSON body, splices configured messages into the messages array, re-serializes, and updates Content-Length. Requires the ai-inference feature. See prompt-enrichment.yaml.

- filter: prompt_enrich
  prepend:
    - role: system
      content: "You are a helpful assistant."
  append:
    - role: user
      content: "Cite your sources."

Field	Type	Default	Description
`prepend`	list	`[]`	Messages inserted at the beginning of `messages` (system role only)
`append`	list	`[]`	Messages added at the end of `messages` (system or user role)
`on_invalid`	string	`"continue"`	`"continue"` passes non-JSON through; `"reject"` returns 400
`max_body_bytes`	integer	10485760	Maximum body size to buffer (10 MiB)

At least one of prepend or append must be non-empty. Each message has a role (system or user) and a non-empty content string. JSON is re-serialized, so byte-for-byte body identity is not preserved. In chains that also use json_body_field or model_to_header, place prompt_enrich first.

Conditions

when/unless gates on any filter chain entry. Request predicates: path (exact match), path_prefix, methods, headers. All fields within a condition are ANDed. Use response_conditions with status or headers predicates to gate response hooks. See conditional-filters.yaml.

Request conditions gate both request and body hooks. Response conditions gate only on_response and response body hooks. A filter can have both conditions and response_conditions.

Payload Size Limits

Global hard ceilings on request and response payload size. These apply across all body modes (Stream, StreamBuffer). When a filter also declares a per-filter max_bytes, the smaller of the two limits is enforced. Requests exceeding the limit receive 413 (Payload Too Large).

body_limits:
  max_request_bytes: 10485760    # 10 MiB
  max_response_bytes: 5242880    # 5 MiB

Both default to 10 MiB (10,485,760 bytes) when omitted. Setting either to null removes the ceiling but requires insecure_options.allow_unbounded_body: true; without that flag, startup fails with a validation error.

Header and Request Limits

Praxis inherits header and request limits from Pingora's HTTP/1.x parser. These are compile-time constants in Pingora and are not currently configurable in Praxis.

Limit	Value	Notes
Max total header size	1,048,575 B (~1 MiB)	Includes request line
Max number of headers	256	HTTP/1.x only
Request-URI max size	shared with header limit	No separate cap
Header read timeout	60 s	Pingora default
Body buffer chunk	65,536 B (64 KiB)	Per-read buffer

HTTP/2 header limits are governed by the h2 crate's HPACK and frame-level settings (typically 16 KiB for HEADERS frames by default, negotiated via SETTINGS).

Requests that exceed header size or count limits receive a 400 Bad Request from Pingora before reaching the filter pipeline.

Runtime

Worker thread pool and scheduling configuration.

runtime:
  threads: 8             # 0 = auto-detect (default)
  work_stealing: true    # default: true

threads: number of worker threads per service. When set to 0 (the default), the thread count is auto-detected from available CPUs.
work_stealing: allow work-stealing between worker threads of the same service. Enabled by default.
global_queue_interval: fixed global queue interval for the tokio scheduler. Option<u32>, defaults to Some(61). Set to null to use tokio's default.
upstream_keepalive_pool_size: maximum number of idle upstream connections kept per thread. Option<usize>, defaults to Some(64). Set to null to disable keepalive pooling.

runtime:
  threads: 4
  work_stealing: true
  global_queue_interval: 61
  upstream_keepalive_pool_size: 64

Upstream CA

upstream_ca_file sets a PEM CA file used as the root certificate store for all upstream TLS connections. Per-cluster tls.ca overrides this for individual clusters.

runtime:
  upstream_ca_file: /etc/praxis/tls/internal-ca.pem

This replaces the system trust store (not additive). See TLS & mTLS for details on CA trust precedence and combined bundles.

Logging

Set PRAXIS_LOG_FORMAT=json to emit structured JSON log output instead of the default human-readable format.

Per-module log level overrides can be configured under runtime.log_overrides:

runtime:
  log_overrides:
    praxis_filter::pipeline: trace
    praxis_protocol: debug

This is useful for debugging a specific subsystem without flooding output from every module.

Key-Value Stores

In-memory key-value stores for runtime-updatable mappings. Stores are created dynamically by filters at runtime via KvStoreRegistry::get_or_create and managed through the admin API. No YAML configuration is required.

Filters access stores by name through HttpFilterContext and TcpFilterContext.

Match Types

Stores support four match types for key lookup:

Type	Behavior
`exact`	Key must equal the lookup key
`prefix`	Stored key starts with the pattern
`suffix`	Stored key ends with the pattern
`regex`	Stored key matches a regex pattern

Admin API

When admin.address is configured, CRUD endpoints are available:

Method	Path	Description
`GET`	`/api/kv/{store}`	List all entries
`GET`	`/api/kv/{store}/{key}`	Get a value
`PUT`	`/api/kv/{store}/{key}`	Set a value (body)
`DELETE`	`/api/kv/{store}/{key}`	Delete a key

Writes are immediately visible to all filters on all threads. Unknown store names return 404.

Runtime Cache Semantics

Key-value stores are runtime caches, not durable storage. Data lives in memory and is lost on process exit.

The store is designed for operational overrides (routing tables, feature flags, config knobs) that can be reconstructed from an external source of truth. Do not use it as a primary data store.

Pluggable Backends

The KvBackend trait allows alternative implementations (e.g. Redis). The default InMemoryKvBackend uses DashMap for lock-free reads. See the praxis_core::kv module docs for the trait definition.

Graceful Shutdown

The shutdown_timeout_secs field controls how long the server drains in-flight connections before forcing shutdown:

shutdown_timeout_secs: 60    # default: 30

Default Configuration

When no configuration file is provided, Praxis starts with a built-in default config that listens on 127.0.0.1:8080 and responds with {"status": "ok", "server": "praxis"} on / (exact match) and 404 elsewhere. The default binds to localhost only, preventing accidental exposure to public networks during initial setup. This allows zero config startup for testing. The source lives in default.yaml. For a realistic starting point, see basic-reverse-proxy.yaml.

Example Configs

Working examples live under examples/configs/, organized by category:

Directory	Contents
`ai`	AI inference model-to-header routing
`traffic-management`	Router, load balancer, timeouts, static responses, redirects, rate limiting, health checks
`payload-processing`	Body processing: compression, field extraction, stream buffering, size limits
`security`	Forwarded headers, IP ACL, guardrails, CORS, downstream read timeout
`observability`	Access logs, request IDs
`transformation`	Header manipulation, path rewriting, URL rewriting
`protocols`	TCP, TLS, mixed protocol configs
`pipeline`	Filter chain composition and conditions
`operations`	Production gateway, multi-listener, admin

Validation and Security

Praxis validates configuration at startup and fails closed. Ambiguous or risky settings are errors, not warnings. Insecure overrides (see Testing) require explicit opt-in and emit warnings at startup.

Key validations: listener name uniqueness, filter chain reference resolution, TLS path traversal rejection, admin endpoint binding restrictions, health check SSRF protection, upstream TLS SNI requirements, and payload size enforcement.

Error Behavior

Praxis fails fast at startup for configuration problems. Common failure modes:

Invalid YAML or missing required fields: the process exits with a descriptive error before any listener binds.
Unknown filter chain reference: a listener references a chain name not defined in filter_chains:; caught at config validation.
TLS certificate load failure: the process exits if a certificate's cert_path or key_path cannot be read or parsed.
Address bind failure: if the listen address is already in use or invalid, the server fails to start.

At runtime:

Unreachable upstream: the request returns 502 (Bad Gateway). Connection timeouts are configurable per cluster.
Filter error: an Err from a filter results in a 500 response to the client. The error is logged.
Payload too large: exceeding body_limits.max_request_bytes or a filter's max_bytes returns 413.

Overrides

Some validations and features can be overridden for development and testing purposes. See insecure_options in Testing.

Structure​

Validating Configuration​

Dumping Effective Configuration​

Dynamic Configuration Reload​

Admin​

Annotated Example​

Listeners​

Network Binding​

TCP Listeners​

Downstream Read Timeout​

Max Connections​

Mixed Protocols​

Filter Chains​

Chain Composition​

Protocol Compatibility​

Built-in Filters​

Router​

Load Balancing​

Health Checks​

Headers​

Timeout​

Request ID​

Access Log​

Forwarded Headers​

IP ACL​

Credential Injection​

TCP Access Log​

SNI Router​

TCP Load Balancer​

JSON Body Field​

JSON-RPC​

MCP​

Static Response​

Rate Limit​

Circuit Breaker​

Guardrails​

CORS​

CSRF​

Redirect​

Path Rewrite​

URL Rewrite​

Compression​

Prompt Enrich​

Conditions​

Payload Size Limits​

Header and Request Limits​

Runtime​

Upstream CA​

Logging​

Key-Value Stores​

Match Types​

Admin API​

Runtime Cache Semantics​

Pluggable Backends​

Graceful Shutdown​

Default Configuration​

Example Configs​

Validation and Security​

Error Behavior​

Overrides​