Merged
4 changes: 4 additions & 0 deletions docs/configuration/pgdog.toml/databases.md
@@ -105,3 +105,7 @@ Overrides the [`idle_timeout`](general.md#idle_timeout) setting. Idle server con
### `read_only`

Sets the `default_transaction_read_only` connection parameter to `on` on all server connections to this database. Clients can still override it with `SET`.

### `server_lifetime`

Overrides the [`server_lifetime`](general.md#server_lifetime) setting. Server connections older than this will be closed when returned to the pool.
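
A per-database override might look like the sketch below. The `name`, `host`, and `role` values are hypothetical placeholders, and the example assumes `server_lifetime` is expressed in milliseconds like the other timeouts in this configuration; confirm the unit against the general setting before relying on it.

```toml
[[databases]]
name = "prod"        # hypothetical database name
host = "10.0.0.1"    # hypothetical host
role = "primary"
# Assumed to be milliseconds: close server connections older than 1 hour
# when they are returned to the pool.
server_lifetime = 3_600_000
```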
42 changes: 40 additions & 2 deletions docs/configuration/pgdog.toml/general.md
@@ -141,6 +141,12 @@ Delay running idle healthchecks at PgDog startup to give databases (and pools) t

Default: **`5_000`** (5s)

### `healthcheck_timeout`

Maximum amount of time to wait for a healthcheck query to complete.

Default: **`5_000`** (5s)

### `connection_recovery`

Controls whether server connections are recovered or dropped when a client disconnects abruptly.
@@ -237,6 +243,12 @@ Close client connections that have been idle, i.e., haven't sent any queries, fo

Default: **`none`** (disabled)

### `client_idle_in_transaction_timeout`

Close client connections that have been idle inside a transaction for this amount of time. This prevents clients from holding server connections indefinitely while in a transaction.

Default: **`none`** (disabled)
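
As an illustration, the two timeouts added in this change could be set together as follows. This is a sketch: the `30_000` value is arbitrary, and it assumes `client_idle_in_transaction_timeout` takes milliseconds like `healthcheck_timeout`.

```toml
[general]
# Give up on a healthcheck query after 5 seconds (the documented default).
healthcheck_timeout = 5_000
# Illustrative value, assumed to be milliseconds: disconnect clients that
# sit idle inside a transaction for 30 seconds.
client_idle_in_transaction_timeout = 30_000
```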

### `client_login_timeout`

Maximum amount of time new clients have to complete authentication. Clients that don't will be disconnected.
@@ -261,6 +273,17 @@ Which strategy to use for load balancing read queries. See [load balancer](../..

Default: **`random`**

### `read_write_strategy`

How aggressively the query parser classifies queries as reads or writes.

Available options:

- `conservative` (default): transactions are writes, standalone `SELECT` statements are reads
- `aggressive`: use the first statement inside a transaction to determine the query route

Default: **`conservative`**
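
A minimal sketch of opting into the non-default strategy, assuming the option is written as a quoted string in the `[general]` section:

```toml
[general]
# Route a transaction based on its first statement instead of
# treating every transaction as a write.
read_write_strategy = "aggressive"
```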

### `read_write_split`

How to handle the separation of read and write queries.
@@ -449,10 +472,25 @@ Available options:
### `reload_schema_on_ddl`

!!! warning
    This setting requires [PgDog Enterprise Edition](../../enterprise_edition/index.md) to work as expected. If using the open source edition,
    it will only work with single-node PgDog deployments, e.g., in local development or CI.

Automatically reload the schema cache used by PgDog to route queries upon detecting DDL statements (e.g., `CREATE TABLE`, `ALTER TABLE`, etc.).

Default: **`true`** (enabled)

### `load_schema`

Controls whether PgDog loads the database schema at startup for query routing.

Available options:

- `on`: always load schema on startup
- `off`: disable loading schema
- `auto` (default): load schema if number of database shards is greater than 1

Default: **`auto`**
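
For example, to force schema loading on an unsharded deployment, the setting could be written as below. This is a sketch that assumes the options are quoted strings, since bare `on` and `off` are not valid TOML values.

```toml
[general]
# Load the schema at startup even when there is only one shard.
load_schema = "on"
```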

## Logging

### `log_connections`
32 changes: 32 additions & 0 deletions docs/configuration/pgdog.toml/memory.md
@@ -0,0 +1,32 @@
---
icon: material/memory
---

# Memory

Memory settings control buffer sizes used by PgDog for network I/O and task execution.

```toml
[memory]
net_buffer = 4096
message_buffer = 4096
stack_size = 2097152
```

### `net_buffer`

Size of the network read buffer in bytes. This buffer is used for reading data from client and server connections.

Default: **`4096`** (4 KiB)

### `message_buffer`

Size of the message buffer in bytes. This buffer is used for assembling PostgreSQL protocol messages.

Default: **`4096`** (4 KiB)

### `stack_size`

Stack size for Tokio tasks in bytes. Increase this if you encounter stack overflow errors with complex queries.

Default: **`2097152`** (2 MiB)
2 changes: 2 additions & 0 deletions docs/configuration/pgdog.toml/rewrite.md
@@ -11,13 +11,15 @@ The `rewrite` section controls PgDog's automatic SQL rewrites for sharded databa
enabled = false
shard_key = "error"
split_inserts = "error"
primary_key = "ignore"
```

| Setting | Description | Default |
| --- | --- | --- |
| `enabled` | Enables/disables the query rewrite engine. | `false` |
| `shard_key` | Behavior when an `UPDATE` changes a sharding key: `error` rejects the statement,<br>`rewrite` migrates the row between shards,<br>`ignore` forwards it unchanged. | `"error"` |
| `split_inserts` | Behavior when a sharded table receives a multi-row `INSERT`: `error` rejects the statement, `rewrite` fans the rows out to their shards, `ignore` forwards it unchanged. | `"error"` |
| `primary_key` | Behavior when an `INSERT` is missing a `BIGINT` primary key: `error` rejects the statement,<br>`rewrite` auto-injects `pgdog.unique_id()` for missing keys,<br>`ignore` allows the `INSERT` without modification. | `"ignore"` |

!!! note "Two-phase commit"
    Consider enabling [two-phase commit](../../features/sharding/2pc.md) when either feature is set to `rewrite`. Without it, rewrites are committed shard-by-shard and can leave partial changes if a transaction fails.
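
To illustrate the new `primary_key` setting, a configuration that turns on the rewrite engine and auto-generates missing keys might look like this. It is a sketch built only from the values listed in the table above.

```toml
[rewrite]
enabled = true
shard_key = "error"
split_inserts = "error"
# Inject pgdog.unique_id() when an INSERT omits a BIGINT primary key.
primary_key = "rewrite"
```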
94 changes: 89 additions & 5 deletions docs/configuration/pgdog.toml/sharded_tables.md
@@ -57,15 +57,16 @@ Currently, PgDog supports sharding `BIGINT` (and `BIGSERIAL`), `UUID`, `VARCHAR`

The name of the database, as defined in the [`[[databases]]`](databases.md) section, in which the table is located. PgDog supports sharding thousands of databases and tables in the same configuration file.

### `schema`

The name of the PostgreSQL schema where the sharded table is located. This setting is optional. If not set, the table is sharded in every schema that contains it.

### `name`

The name of the PostgreSQL table. Only columns explicitly referencing that table will be sharded.

The name must contain only the table name, not the schema name. Use `schema` to restrict the entry to a single schema.

### `column`

The name of the sharded column.
@@ -79,6 +80,25 @@ The data type of the column. Currently supported options are:
- `varchar`
- `vector`

### `hasher`

The hash function to use for sharding. Available options:

- `postgres` (default) - PostgreSQL's native hash function
- `sha1` - SHA-1 hash function

### `centroids`

For vector sharding, specify the centroid vectors directly in the configuration. This is useful for small centroid sets.

### `centroids_path`

Path to a JSON file containing centroid vectors. This is useful when centroids are large (1000+ dimensions) and impractical to embed in `pgdog.toml`.

### `centroid_probes`

Number of centroids to probe during vector similarity search. If not specified, defaults to the square root of the number of centroids.
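
The sketch below shows how the new fields might fit together. The table and column names are hypothetical, and two details are assumptions not confirmed by this page: that the data type key is spelled `data_type`, and that `centroids` is written as an array of equal-length float arrays.

```toml
# Hash-sharded table restricted to a single schema.
[[sharded_tables]]
database = "prod"
schema = "public"
name = "users"
column = "tenant_id"
data_type = "bigint"   # assumed key name
hasher = "sha1"

# Vector-sharded table with two inline 3-dimensional centroids.
[[sharded_tables]]
database = "prod"
name = "embeddings"
column = "embedding"
data_type = "vector"   # assumed key name
centroids = [[0.1, 0.2, 0.3], [0.7, 0.8, 0.9]]   # assumed format
centroid_probes = 1
```

For large centroid sets, `centroids_path` would replace the inline `centroids` array.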

## Omnisharded tables

[Omnisharded](../../features/sharding/omnishards.md) tables are tables that have the same data on all shards. They are typically small and contain metadata, e.g., a list of countries or cities, and are used in joins. PgDog can read from these tables directly and load balances traffic evenly across all shards.
@@ -113,9 +133,56 @@ By default, PgDog uses hash-based sharding, with data evenly split between shard

To configure either one, you need to specify the value-to-shard mapping in the configuration.

## Mapping fields

!!! note
    The `column`, `table`, and `schema` fields must match a corresponding `[[sharded_tables]]` entry.

### `database`

The name of the database in the [`[[databases]]`](databases.md) section.

### `column`

The name of the column to match for routing. Must match a `column` in `[[sharded_tables]]`.

### `table`

The name of the table to match. Must match a `name` in `[[sharded_tables]]` if specified there. This is optional.

### `schema`

The name of the PostgreSQL schema to match. Must match a `schema` in `[[sharded_tables]]` if specified there. This is optional.

### `kind`

The type of mapping. Available options:

- `list`: match specific values (i.e., `PARTITION BY LIST`)
- `range`: match a range of values (i.e., `PARTITION BY RANGE`)
- `default`: fallback for unmatched values (list-based sharding only)

### `values`

For `list` mappings, the set of values that route to this shard.

### `start`

For `range` mappings, the starting value (inclusive).

### `end`

For `range` mappings, the ending value (exclusive).

### `shard`

The target shard number for matched queries.
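
As a sketch of how these fields combine, the following range mappings split a hypothetical `tenant_id` column across two shards. The database, schema, and table names are placeholders and would need to match an existing `[[sharded_tables]]` entry.

```toml
# tenant_id values from 1 (inclusive) to 100 (exclusive) go to shard 0.
[[sharded_mappings]]
database = "prod"
schema = "public"
table = "users"
column = "tenant_id"
kind = "range"
start = 1
end = 100
shard = 0

# tenant_id values from 100 (inclusive) to 200 (exclusive) go to shard 1.
[[sharded_mappings]]
database = "prod"
schema = "public"
table = "users"
column = "tenant_id"
kind = "range"
start = 100
end = 200
shard = 1
```

Because `end` is exclusive, setting the second mapping's `start` to the first mapping's `end` leaves no gap and no overlap between the ranges.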

## Mapping examples

### Lists

Lists are defined as a list of values and a corresponding shard number. Just like sharded tables, the mapping is database and column (and optionally, table and schema) specific:

```toml
[[sharded_mappings]]
@@ -153,3 +220,20 @@ shard = 0
UPDATE users SET deleted_at = NOW()
WHERE tenant_id IN (1, 2, 5, 10, 56)
```

### Default

The `default` kind specifies a fallback shard for values that don't match any list mapping:

```toml
[[sharded_mappings]]
database = "prod"
column = "tenant_id"
kind = "default"
shard = 2
```

Any sharding key value that doesn't match an explicit list mapping will be routed to the default shard.

!!! note
    The `default` kind only works with list-based sharding, not range-based sharding.