Deployment topology
Certeasy is designed and supported as a single-instance deployment. This page documents which topologies are supported today, and which ones will silently break your installation if you try.
Supported
Single instance (recommended)
One Certeasy process on one host, with its own database. This is the production-ready topology.
┌──────────────┐    HTTPS    ┌──────────────┐     RPC      ┌──────────┐
│ ACME clients │ ──────────► │   Certeasy   │ ───────────► │   ADCS   │
└──────────────┘             └──────────────┘              └──────────┘
                                    │
                                    ▼
                           ┌──────────────────┐
                           │     Database     │
                           │  SQLite / PG /   │
                           │    SQL Server    │
                           └──────────────────┘
This covers the vast majority of enterprise PKI volumes. A single Certeasy instance on a modest Windows Server processes several certificate orders per second.
Cold Active / Passive (manual switchover)
You can install Certeasy on two hosts for failover, as long as only one instance is running at a time. The standby is fully stopped (process not running, port not bound). The administrator switches over manually: stop the active node, then start the standby.
┌──────────────────────────────────────────┐
│                 VIP / LB                 │
└──────────────────┬───────────────────────┘
                   │ HTTPS
        ┌──────────┴──────────┐
        ▼                     ▼
   ┌────────┐           ┌──────────┐
   │ Node A │           │  Node B  │
   │ ACTIVE │           │ STOPPED  │
   └────┬───┘           └──────────┘
        │
        ▼
┌────────────────────────────────────┐
│ Shared DB (PostgreSQL / SQL Server)│
└────────────────────────────────────┘
Requirements for this topology:
- Database must be PostgreSQL or SQL Server. SQLite is NOT supported for any multi-host setup: its file-level locking is not reliable across hosts on shared filesystems (NFS, SMB, etc.) and corruption is a matter of when, not if.
- Each node has its own local work directory (TLS cache, transient files, audit log). The work directory does not need to be shared.
- The administrator owns the switchover discipline: never start the standby before the active is fully stopped. Starting two instances against the same database amounts to the unsupported Active / Active topology described below, and its failure modes appear silently.
Switchover procedure:
- Stop Certeasy on the active node (graceful shutdown drains in-flight ACME requests).
- Start Certeasy on the standby node.
- Update your load balancer to route to the new active node.
Expected switchover time: typically under a minute, including the standby's boot probe.
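The "never start the standby before the active is fully stopped" rule can be backed by a small pre-start check. The following is a minimal sketch, not part of Certeasy: it assumes the standby can reach the active node's listening port over TCP, and the function names `port_is_bound` and `safe_to_start_standby` as well as the default port 443 are hypothetical.

```python
import socket


def port_is_bound(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable: nothing answered on that port.
        return False


def safe_to_start_standby(active_host: str, port: int = 443) -> bool:
    """Hypothetical guard: allow the standby to start only if the active
    node no longer accepts connections on its listening port."""
    return not port_is_bound(active_host, port)
```

Treat the result as a hint rather than a proof. A network partition between the two hosts also makes the port look closed, so the operator should still confirm on the active host itself that the process has exited.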
Load balancer in front of a single instance
A reverse proxy or load balancer in front of a single Certeasy instance — for TLS termination, IP filtering, geo-routing, etc. — is fully supported. Forward the Host header and preserve the client IP if your audit log relies on it.
Node identity
Each Certeasy instance has a stable identifier called server_id. It is materialised on first boot as a UUID v4 stored in <workdir>/server_id (file permissions 0o600) and registered in the servers table of the database. Every subsequent boot of the same node reads back the same server_id and updates the last_seen timestamp; a background heartbeat refreshes it once per minute while the instance runs.
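To make the marker-file behaviour concrete, here is a minimal sketch of a "load or create" routine for such an identifier. It is an illustration under stated assumptions, not Certeasy's actual code: the function name `load_or_create_server_id` is hypothetical, and the database registration and heartbeat are left out.

```python
import os
import uuid
from pathlib import Path


def load_or_create_server_id(workdir: str) -> str:
    """Return the node's stable identifier, creating it on first boot."""
    marker = Path(workdir) / "server_id"
    try:
        # O_EXCL makes creation atomic: an existing marker is never overwritten.
        # Mode 0o600 has POSIX semantics; Windows largely ignores it.
        fd = os.open(marker, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    except FileExistsError:
        # Subsequent boot: read back the identifier written earlier.
        return marker.read_text(encoding="utf-8").strip()
    server_id = str(uuid.uuid4())  # random UUID, version 4
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(server_id + "\n")
    return server_id
```

Calling the function twice against the same work directory returns the same value, which is the property the rest of this section relies on.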
Two operator-visible consequences:
- Each line of the audit log carries the server_id of the node that wrote it. When a node opens its audit log at boot, it checks the last line. If that line was written by a different server_id, the node refuses to start with an explicit error message naming both identifiers. This is intentional: it prevents an operator from accidentally pointing two nodes at the same audit file and silently splicing two histories.
- Cold Active / Passive works naturally with this model. Each node has its own workdir, its own <workdir>/server_id, and its own <workdir>/audit.log. The database is shared, but the audit chain is per-node. The servers table will accumulate one row per host that has ever booted against this database, which is useful for operators tracking which nodes participated in the cluster over time.
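The boot-time audit check described in the first point can be sketched as follows. The on-disk format of the audit log is not documented here, so this example assumes, purely for illustration, one JSON object per line with a `server_id` field. The exception class and function name are hypothetical as well.

```python
import json
from pathlib import Path


class AuditLogOwnershipError(RuntimeError):
    """Raised when the audit log was last written by another node."""


def check_audit_log_owner(audit_path: str, my_server_id: str) -> None:
    """Refuse to continue if the last audit line belongs to another node."""
    path = Path(audit_path)
    if not path.exists():
        return  # first boot on this node: nothing to compare against
    # Assumed format: JSON Lines, each record carrying a "server_id" field.
    lines = [ln for ln in path.read_text(encoding="utf-8").splitlines() if ln.strip()]
    if not lines:
        return
    last_writer = json.loads(lines[-1])["server_id"]
    if last_writer != my_server_id:
        # Name both identifiers so the operator can see which nodes collided.
        raise AuditLogOwnershipError(
            f"audit log {audit_path} was last written by server_id {last_writer}, "
            f"but this node is {my_server_id}"
        )
```

Reading the whole file keeps the sketch short. A real implementation would more likely seek to the end of the file and read only the final line.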
Do not copy a workdir from one host to another. Each new host should generate its own server_id on first boot; that is the point of the marker file. To inspect or decommission known nodes, list them with:
certeasy audit list-servers -f config.yml
NOT supported — Active / Active
Running two or more Certeasy instances concurrently against the same database is not supported in the current release. Several core mechanisms hold in-process state that does not coordinate across nodes:
| Subsystem | What breaks under Active / Active |
|---|---|
| ACME nonces | Each instance generates nonces with its own secret and tracks them in local memory. A client whose first request lands on node A and second request lands on node B is rejected with badNonce, forcing a fresh newNonce call every two requests. Replay protection is also local per node — the same nonce can be rejected by one node and accepted by another within its 2-minute TTL. |
| Rate limiting | Several limits (failed-validation back-off, account-creation throttling, order-creation throttling, global) are in-memory per instance. A client hitting two nodes can effectively double its quota. |
| License enforcement | Per-instance counters for max_managed_servers. Two instances can both believe they are under the limit while the cluster as a whole has already exceeded it. |
| PKI health checks | Each instance pings the configured CAs independently. Operationally noisy in logs, not corruption-inducing. |
| TLS certificate manager | Each instance maintains its own server-certificate cache on disk. Two instances behind the same hostname will fetch or issue their TLS cert independently — risk of double-consuming a Let's Encrypt quota or returning different chains depending on which node a client lands on. |
Failure modes are silent and intermittent. Clients see sporadic badNonce errors, rate limits feel inconsistent, license counters drift, and certificate behaviour depends on which node terminates the TLS handshake. Diagnosing these in production after the fact is painful.
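The nonce row of the table can be reproduced with a toy model. The sketch below is not Certeasy's implementation: `NonceStore` is a hypothetical class that mimics "own secret, own in-memory replay set" using HMAC-signed nonces, which is one common way to build such a scheme.

```python
import hashlib
import hmac
import secrets
import time


class NonceStore:
    """Toy per-process nonce issuer: own secret, own in-memory replay set."""

    TTL_SECONDS = 120  # the 2-minute TTL mentioned in the table

    def __init__(self):
        self._secret = secrets.token_bytes(32)  # never shared with other nodes
        self._used = set()

    def _sign(self, payload: str) -> str:
        return hmac.new(self._secret, payload.encode(), hashlib.sha256).hexdigest()

    def issue(self) -> str:
        payload = f"{int(time.time())}.{secrets.token_hex(8)}"
        return f"{payload}.{self._sign(payload)}"

    def accept(self, nonce: str) -> bool:
        payload, _, sig = nonce.rpartition(".")
        if not hmac.compare_digest(sig, self._sign(payload)):
            return False  # signed with another node's secret: badNonce
        issued_at = int(payload.split(".")[0])
        if time.time() - issued_at > self.TTL_SECONDS:
            return False  # expired
        if nonce in self._used:
            return False  # replay, but only this node knows about it
        self._used.add(nonce)
        return True


node_a, node_b = NonceStore(), NonceStore()
nonce = node_a.issue()
crossed = node_b.accept(nonce)   # request landed on the other node
first_use = node_a.accept(nonce)
replay = node_a.accept(nonce)
```

In this model `crossed` is `False`, `first_use` is `True`, and `replay` is `False`: the client is rejected whenever consecutive requests land on different nodes, and the replay set that protects node A is invisible to node B.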
Why sticky sessions do not fix this
A load balancer with session affinity (cookie-based) does not solve the nonce problem in practice: the standard ACME clients (lego, certbot, acme.sh, native Go clients) do not enable an HTTP cookie jar for their ACME requests, so a Set-Cookie from the load balancer is ignored. Source-IP affinity is more reliable but breaks the moment clients sit behind NAT, a corporate proxy, or CGNAT.
If you need true multi-node availability today, use the cold Active / Passive topology described above and accept the manual switchover. True warm Active / Passive and Active / Active deployments are tracked on the public roadmap for V2.0 Enterprise.
Database backend behind a single instance
Independently of the Certeasy topology, the database tier can run its own HA setup:
- SQLite WAL: concurrent readers + single writer. Adequate for single-instance Certeasy. Not usable across hosts.
- PostgreSQL with replication: primary + read replicas for backup and reporting. Certeasy only writes to the primary. Database-level failover (e.g. pg_auto_failover, Patroni) is transparent to Certeasy as long as the connection string resolves to the new primary after the cut.
- SQL Server with Always On / mirroring: same principle. Certeasy connects to one target; failover at the database tier is handled by the listener / cluster role.
In all cases, scaling the database tier does not unlock Active / Active for Certeasy itself.