OpenClaw Multi-Tenant Production Deployment Architecture
Putting OpenClaw in production for a single tenant is straightforward. Putting OpenClaw in production for hundreds or thousands of tenants — where one tenant's noisy agent must not break another, every tenant's data is isolated cryptographically, and your SLOs hold under hostile traffic — is a different exercise. This article is the architecture playbook for that scenario, drawn from ECOSIRE's deployments serving SaaS, ERP, and managed-service workloads.
We assume you have read our installation quickstart and have a single-tenant agent working. Now you need to scale to N tenants safely. The patterns below cover isolation, secrets, message bus design, observability, scaling, and the trade-offs that come with each.
Key Takeaways
- Three deployment models: shared runtime + tenant-scoped messages, dedicated runtime per tenant, or namespace-isolated runtime — each with different cost/isolation trade-offs.
- Tenant context flows through every layer: agent invocation, message bus headers, audit log entries, and skill credentials.
- Use Postgres Row-Level Security on the audit table and per-tenant credential vaults for true data isolation.
- Message bus needs tenant-scoped queues or partitions; do not let one tenant's backlog block another.
- Resource quotas (CPU, RAM, tokens, concurrent runs) per tenant prevent noisy-neighbor problems.
- Secrets must NOT be shared across tenants — even if "everyone" calls the same external API, each tenant should have their own credential.
- Observability must filter by tenant ID; build dashboards that show per-tenant health, not just aggregate.
- For B2B SaaS embedding agents in your product, a dedicated runtime per tier (Free, Pro, Enterprise) is often the right balance.
The Three Deployment Models
You have three architecturally distinct ways to multi-tenant OpenClaw. Pick based on your isolation requirements, scale, and budget.
Model 1: Shared runtime, tenant-scoped messages (most common)
A single OpenClaw runtime cluster serves all tenants. Tenant ID is stamped on every message, every audit entry, every skill invocation. Skills receive tenant context and use tenant-scoped credentials.
| Pros | Cons |
|---|---|
| Cheapest at scale | Logical isolation only; bug in OpenClaw could leak across tenants |
| Single dashboard, single deploy | Noisy-neighbor risk requires quotas |
| Easy to share Skills across tenants | Compliance teams may push back on multi-tenant runtime |
Best for: many small tenants on a SaaS plan; cost is critical; tenants do not have strong data-residency requirements.
Model 2: Dedicated runtime per tenant
Each tenant gets their own OpenClaw runtime instance, often with its own message bus and audit DB.
| Pros | Cons |
|---|---|
| Maximum isolation | Highest cost — N runtimes |
| Per-tenant scaling | Operational overhead — N deploys |
| Compliance-friendly | Cross-tenant analytics requires aggregation |
Best for: regulated industries; few large tenants paying enterprise prices; strict data-residency requirements (e.g., EU, FedRAMP).
Model 3: Namespace-isolated runtime (middle path)
Single Kubernetes cluster, one runtime deployment per tenant in their own namespace. Shared bus and audit DB but with per-tenant network policies, resource quotas, and namespace-scoped credentials.
| Pros | Cons |
|---|---|
| Stronger isolation than Model 1 | More complex than Model 1 |
| Cheaper than Model 2 | Still shared underlying infra |
| Per-tenant resource quotas natural in K8s | Requires K8s expertise |
Best for: midmarket SaaS with tenants spanning Free, Pro, Enterprise tiers; want strong isolation without per-tenant infra cost.
In practice we deploy Model 1 for SaaS with hundreds of small tenants, Model 3 for B2B SaaS with mixed tier tenants, and Model 2 for regulated enterprises with 1-5 large tenants per environment.
Tenant Context Through the Stack
In Model 1 and Model 3, every layer must carry tenant context. The pattern:
1. Agent invocation
When you call an agent, pass tenant_id explicitly:
from openclaw import Client
client = Client()
result = client.run(
    agent="customer-support",
    tenant_id="acme-corp",
    input={"question": "Where is my order?"},
)
The runtime tags the run, the audit log, and every downstream call.
2. Message Bus headers
For agent-to-agent communication, the bus carries tenant_id as a header:
bus.publish(
    "ResearchRequest",
    body={"topic": "competitor analysis"},
    headers={"tenant_id": "acme-corp", "trace_id": "abc123"},
)
Subscribers read the header and pass it to their skills:
@bus.subscribe("ResearchRequest")
def handle(msg):
    tenant_id = msg.headers["tenant_id"]
    # ... call skills with tenant_id
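A subscriber that reads the header directly will raise a bare KeyError, or worse, fall back to some default, when a producer forgets to set it. One way to make that failure explicit is to validate the envelope before any skill runs. The sketch below is standalone Python; `Message`, `MissingTenantError`, and `require_tenant` are hypothetical names used for illustration and are not part of the OpenClaw API.

```python
from dataclasses import dataclass, field


class MissingTenantError(ValueError):
    """Raised when a bus message arrives without a usable tenant_id header."""


@dataclass
class Message:
    # Hypothetical stand-in for whatever envelope type the bus delivers.
    body: dict
    headers: dict = field(default_factory=dict)


def require_tenant(msg: Message) -> str:
    """Return the tenant_id header, or raise instead of guessing a default."""
    tenant_id = msg.headers.get("tenant_id")
    if not isinstance(tenant_id, str) or not tenant_id.strip():
        raise MissingTenantError("message has no tenant_id header; refusing to process")
    return tenant_id


if __name__ == "__main__":
    ok = Message(body={"topic": "competitor analysis"}, headers={"tenant_id": "acme-corp"})
    print(require_tenant(ok))
```

Rejecting the message outright is a design choice: a dead-lettered message is recoverable, while a message processed under the wrong tenant is not.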
3. Skill invocation
Skills receive tenant context via the runtime context object:
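from openclaw import skill, context
from simple_salesforce import format_soql  # assumes the client is simple_salesforce
@skill(name="crm.lookup_account")
def lookup_account(name: str) -> dict:
    tenant_id = context.tenant_id
    sf = get_salesforce_client_for_tenant(tenant_id)
    # Quote the caller-supplied value instead of interpolating it into SOQL.
    return sf.query(format_soql("SELECT Id, Name FROM Account WHERE Name = {}", name))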
The context is implicit but always available. Skills should never assume a global tenant.
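How an implicit-but-always-available context can work is easiest to see in miniature. The sketch below uses Python's standard `contextvars` module; it is an illustration of the general pattern, not OpenClaw's actual implementation, and `tenant_scope` and `current_tenant` are hypothetical names.

```python
import contextvars
from contextlib import contextmanager

# Holds the tenant for the current run. There is deliberately no default value,
# so code that runs outside a tenant scope fails instead of picking one silently.
_current_tenant = contextvars.ContextVar("current_tenant")


@contextmanager
def tenant_scope(tenant_id: str):
    """Bind tenant_id for the duration of one run, then restore the previous value."""
    token = _current_tenant.set(tenant_id)
    try:
        yield
    finally:
        _current_tenant.reset(token)


def current_tenant() -> str:
    """Return the bound tenant, raising if no run has set one."""
    try:
        return _current_tenant.get()
    except LookupError:
        raise RuntimeError("no tenant bound to this execution context") from None


if __name__ == "__main__":
    with tenant_scope("acme-corp"):
        print(current_tenant())
```

Because context variables are tracked per thread and per asyncio task, two runs for different tenants can interleave in one process without seeing each other's value.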
4. Audit log
Every audit entry includes tenant_id. With Postgres RLS:
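CREATE POLICY tenant_isolation ON audit_log
    USING (tenant_id = current_setting('app.tenant_id'));
ALTER TABLE audit_log ENABLE ROW LEVEL SECURITY;
ALTER TABLE audit_log FORCE ROW LEVEL SECURITY;
With the policy in place, a query that forgets its tenant filter still returns only the current tenant's rows, and a session that never set app.tenant_id gets an error rather than data. Superusers and roles with BYPASSRLS are exempt from RLS, and so is the table owner unless FORCE ROW LEVEL SECURITY is set, so the application should connect as a separate, unprivileged role.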
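The policy only helps if every transaction sets `app.tenant_id` before touching the table. A minimal sketch of that step follows, assuming a DB-API cursor with `%s` placeholders (as psycopg uses); `bind_tenant` and `RecordingCursor` are hypothetical names. `set_config(name, value, is_local)` is a built-in Postgres function, and passing `true` as the third argument limits the setting to the current transaction.

```python
SET_TENANT_SQL = "SELECT set_config('app.tenant_id', %s, true)"


def bind_tenant(cursor, tenant_id: str) -> None:
    """Scope the current transaction to one tenant before any audit_log query."""
    if not tenant_id:
        raise ValueError("tenant_id must be a non-empty string")
    # Passed as a bound parameter, never interpolated into the SQL text.
    cursor.execute(SET_TENANT_SQL, (tenant_id,))


class RecordingCursor:
    """Tiny fake cursor so the sketch can be exercised without a database."""

    def __init__(self):
        self.calls = []

    def execute(self, sql, params=None):
        self.calls.append((sql, params))


if __name__ == "__main__":
    cur = RecordingCursor()
    bind_tenant(cur, "acme-corp")
    print(cur.calls)
```

Using a transaction-local setting matters with pooled connections: the value disappears at commit or rollback, so the next request that borrows the connection cannot inherit the previous tenant.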
Per-Tenant Credentials
The single biggest mistake we see: a "shared API key" for an external service that all tenants' agents use. This is a data leak waiting to happen: if Tenant A manipulates their agent into revealing the key, that same key unlocks Tenant B's data.
Correct pattern: each tenant has their own credentials in your secret store, scoped by tenant_id.
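from openclaw import skill, context
from openclaw.secrets import get_secret_for_tenant
from simple_salesforce import Salesforce, format_soql  # assumed client library
@skill(name="crm.salesforce.lookup_account")
def lookup_account(name: str) -> dict:
    tenant_id = context.tenant_id
    # Credentials are resolved per tenant on every call; nothing is shared.
    creds = get_secret_for_tenant(tenant_id, "salesforce")
    sf = Salesforce(
        username=creds["user"],
        password=creds["pass"],
        security_token=creds["token"],
    )
    # Quote the caller-supplied value instead of interpolating it into SOQL.
    return sf.query(format_soql("SELECT ... FROM Account WHERE Name = {}", name))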
Storage backends:
- AWS Secrets Manager with one secret per <tenant>/<service>.
- HashiCorp Vault with per-tenant policies.
- Kubernetes Secrets namespaced per tenant in Model 3.
- Azure Key Vault with per-tenant access policies.
OpenClaw's built-in secrets.get_secret_for_tenant() works against all four with provider drivers.
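Whichever backend you choose, the `<tenant>/<service>` key is built from values that may originate in a request, so it is worth validating them before they reach the store. The sketch below is a standalone illustration: the naming rule, `secret_path`, and `InMemoryVault` are all assumptions made for the example, not OpenClaw behavior.

```python
import re

# Assumed naming rule for tenant and service identifiers: lowercase letters,
# digits, and hyphens. Adjust to whatever your tenant IDs actually look like.
_SAFE_NAME = re.compile(r"[a-z0-9][a-z0-9-]*")


def secret_path(tenant_id: str, service: str) -> str:
    """Build the '<tenant>/<service>' key, rejecting anything that could escape it."""
    for label, value in (("tenant_id", tenant_id), ("service", service)):
        if not _SAFE_NAME.fullmatch(value):
            raise ValueError(f"invalid {label}: {value!r}")
    return f"{tenant_id}/{service}"


class InMemoryVault:
    """Hypothetical stand-in for a real backend, for illustration only."""

    def __init__(self):
        self._store = {}

    def put(self, tenant_id, service, value):
        self._store[secret_path(tenant_id, service)] = value

    def get(self, tenant_id, service):
        return self._store[secret_path(tenant_id, service)]


if __name__ == "__main__":
    vault = InMemoryVault()
    vault.put("acme-corp", "salesforce", {"user": "..."})
    print(secret_path("acme-corp", "salesforce"))
```

The validation is what stops an identifier such as `acme-corp/../globex` from resolving to another tenant's secret.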
Message Bus Design for Multi-Tenancy
The bus must not let one tenant's backlog starve others. Two patterns work:
Pattern A: Tenant-scoped queues (Redis Streams)
# Each tenant gets their own stream
stream_name = f"openclaw:{tenant_id}:research-requests"
Each agent process subscribes to all tenant streams it serves. Consumer groups are per-tenant. A backlog in tenant-acme does not slow tenant-globex.
Pros: simplest. Cons: at thousands of tenants, the number of streams grows; memory overhead per stream is small but real.
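Separate streams only prevent starvation if the consumer also reads them fairly. A minimal sketch of one fair policy, round-robin with one message per tenant per pass, is shown below. It uses in-memory deques as a stand-in for Redis Streams, and `drain_round_robin` is a hypothetical helper.

```python
from collections import deque


def drain_round_robin(queues, budget):
    """Take up to `budget` messages, one per tenant per pass, so no tenant is starved.

    `queues` maps tenant_id -> deque of pending messages (a stand-in for streams).
    Returns the (tenant_id, message) pairs in the order they would be processed.
    """
    processed = []
    while len(processed) < budget:
        progressed = False
        for tenant_id, queue in queues.items():
            if queue and len(processed) < budget:
                processed.append((tenant_id, queue.popleft()))
                progressed = True
        if not progressed:  # every queue is empty
            break
    return processed


if __name__ == "__main__":
    pending = {
        "tenant-acme": deque(f"a{i}" for i in range(1000)),  # large backlog
        "tenant-globex": deque(["g0", "g1"]),
    }
    print(drain_round_robin(pending, budget=6))
```

In the usage example, tenant-globex's two messages are handled within the first four slots even though tenant-acme has a thousand messages waiting.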
Pattern B: Shared partitioned queue (Kafka)
One topic, partitioned by tenant_id hash. Consumers can scale horizontally.
producer.send(
    "research-requests",
    key=tenant_id.encode(),
    value=msg.encode(),
)
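Pros: scales to tens of thousands of tenants. Cons: ordering is guaranteed only within a partition; tenants whose keys hash to the same partition share its backlog, so cross-tenant fairness needs careful consumer configuration (e.g., tuning max.poll.records so that one hot partition does not stall the others).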
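With far more tenants than partitions, several tenants will always land on the same partition and wait behind each other. The sketch below makes that visible by grouping tenants per partition. It uses `zlib.crc32` purely as a stand-in hash; real clients use their own (the Java client's default partitioner uses murmur2), so actual assignments will differ. `partition_for` and `co_located` are hypothetical helpers.

```python
import zlib
from collections import defaultdict


def partition_for(tenant_id: str, num_partitions: int) -> int:
    """Stable key -> partition mapping. crc32 is a stand-in for the client's real hash."""
    return zlib.crc32(tenant_id.encode("utf-8")) % num_partitions


def co_located(tenant_ids, num_partitions):
    """Group tenants by partition; tenants in the same group share one backlog."""
    groups = defaultdict(list)
    for tenant_id in tenant_ids:
        groups[partition_for(tenant_id, num_partitions)].append(tenant_id)
    return dict(groups)


if __name__ == "__main__":
    tenants = [f"tenant-{i}" for i in range(20)]
    for partition, members in sorted(co_located(tenants, 4).items()):
        print(partition, members)
```

Running a check like this against your real tenant list and partition count shows which tenants a single noisy tenant could delay.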
We use Pattern A for <500 tenants and Pattern B above that.
Resource Quotas
Without quotas, one tenant's runaway agent can consume all your CPU, RAM, or token budget. OpenClaw supports per-tenant quotas:
# In runtime config
tenant_quotas:
  default:
    max_concurrent_runs: 10
    max_tokens_per_minute: 100000
    max_skill_calls_per_minute: 500
    max_memory_mb: 512
  enterprise-acme:
    max_concurrent_runs: 100
    max_tokens_per_minute: 1000000
    max_skill_calls_per_minute: 5000
When a tenant exceeds quota, the runtime returns 429 to the caller. Build retry + backoff into your client. Surface the quota usage to tenant admins so they can plan upgrades.
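To make the two halves of this concrete, here is a minimal sketch of a per-minute limit on the server side and a capped exponential backoff schedule on the client side. It is an illustration of the idea, not how the OpenClaw runtime enforces quotas; `TenantRateLimiter`, `QuotaExceeded`, and `backoff_delay` are hypothetical names, and the fixed-window counter is the simplest scheme rather than the most precise one.

```python
import time
from collections import defaultdict


class QuotaExceeded(Exception):
    """Signals the equivalent of a 429 for one tenant."""


class TenantRateLimiter:
    """Fixed one-minute window counter per tenant (a simple, approximate scheme)."""

    def __init__(self, limits, default_limit, clock=time.monotonic):
        self._limits = limits              # tenant_id -> allowed units per minute
        self._default = default_limit
        self._clock = clock
        self._windows = defaultdict(lambda: (0, 0))  # tenant_id -> (window index, used)

    def consume(self, tenant_id, units=1):
        window = int(self._clock() // 60)
        seen_window, used = self._windows[tenant_id]
        if seen_window != window:
            used = 0                       # a new minute started; reset the counter
        limit = self._limits.get(tenant_id, self._default)
        if used + units > limit:
            raise QuotaExceeded(f"{tenant_id} exceeded {limit} units/minute")
        self._windows[tenant_id] = (window, used + units)


def backoff_delay(attempt, base=0.5, cap=30.0):
    """Exponential backoff ceiling in seconds for retry `attempt` (0-based), capped."""
    return min(cap, base * (2 ** attempt))


if __name__ == "__main__":
    limiter = TenantRateLimiter({"enterprise-acme": 5000}, default_limit=500)
    limiter.consume("acme-corp")
    print([backoff_delay(n) for n in range(4)])
```

In a real client you would typically sleep for a random duration up to `backoff_delay(attempt)` so that many callers throttled at the same moment do not retry in lockstep.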
In Kubernetes (Model 3), use ResourceQuota and LimitRange per namespace as additional belt-and-braces.
Observability per Tenant
Aggregate dashboards lie about per-tenant health. Build per-tenant dashboards from day one. Tag every metric, log, and trace with tenant_id.
Metrics
from openclaw.metrics import counter
agent_runs_total = counter(
    "openclaw_agent_runs_total",
    labels=["tenant_id", "agent", "status"],
)
agent_runs_total.labels(
    tenant_id=context.tenant_id,
    agent="customer-support",
    status="success",
).inc()
Build a "Top 10 Tenants by Run Count" panel. When a tenant complains, drill in by tenant_id without leaving the dashboard.
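The gap between aggregate and per-tenant health is easy to demonstrate with a few lines of standard-library Python. The sketch below assumes run events arrive as dicts with `tenant_id` and `status` keys, a stand-in for whatever your metrics backend returns; `top_tenants` and `failure_rate_by_tenant` are hypothetical helpers.

```python
from collections import Counter


def top_tenants(run_events, n=10):
    """Return the n tenants with the most runs as (tenant_id, count) pairs."""
    counts = Counter(event["tenant_id"] for event in run_events)
    return counts.most_common(n)


def failure_rate_by_tenant(run_events):
    """Per-tenant share of runs whose status is not "success"."""
    totals, failures = Counter(), Counter()
    for event in run_events:
        totals[event["tenant_id"]] += 1
        if event["status"] != "success":
            failures[event["tenant_id"]] += 1
    return {tenant: failures[tenant] / totals[tenant] for tenant in totals}


if __name__ == "__main__":
    events = (
        [{"tenant_id": "acme", "status": "success"}] * 98
        + [{"tenant_id": "globex", "status": "error"}] * 2
    )
    print(top_tenants(events, n=1))
    print(failure_rate_by_tenant(events))
```

In the usage example the aggregate success rate is 98%, yet every run for the smaller tenant failed, which is exactly the case an aggregate panel hides.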
Logs
Structured logs with tenant_id as a top-level field. Use Loki/Datadog/CloudWatch filters to scope.
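A minimal sketch of such a log line using only Python's standard `logging` module follows. `TenantJsonFormatter` and `tenant_logger` are hypothetical names, and the field set is an assumption; a real deployment would add timestamps, trace IDs, and so on.

```python
import io
import json
import logging


class TenantJsonFormatter(logging.Formatter):
    """Render each record as one JSON object with tenant_id at the top level."""

    def format(self, record):
        return json.dumps({
            "level": record.levelname,
            "message": record.getMessage(),
            # "unknown" makes untagged log lines easy to find and fix.
            "tenant_id": getattr(record, "tenant_id", "unknown"),
        })


def tenant_logger(name, tenant_id, stream):
    """Return a logger adapter that stamps tenant_id on every record it emits."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    if not logger.handlers:  # attach the handler once per logger name
        handler = logging.StreamHandler(stream)
        handler.setFormatter(TenantJsonFormatter())
        logger.addHandler(handler)
    return logging.LoggerAdapter(logger, {"tenant_id": tenant_id})


if __name__ == "__main__":
    buffer = io.StringIO()
    log = tenant_logger("openclaw.example", "acme-corp", buffer)
    log.info("run finished")
    print(buffer.getvalue().strip())
```

Keeping `tenant_id` as a top-level key, rather than embedding it in the message string, is what lets the log backend filter on it without parsing free text.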
Traces
OpenTelemetry traces with tenant_id as a span attribute. In our experience, trace ID plus tenant ID are the two fields you need to debug the large majority of incidents.
Tenant-facing observability
Some tenants will ask for their own dashboard. Build a customer-facing analytics page in your SaaS product showing:
- Runs per day
- Success rate
- Token usage and cost
- Active agents
- Recent errors
This is a feature you can charge for. We have shipped this for ECOSIRE clients who run tenant-facing portals.
Compliance and Data Residency
Multi-tenant deployments must answer:
- Where is each tenant's data stored? Audit log, memory, secrets — list each and verify residency.
- Can data cross regions? EU tenants on a US runtime are a GDPR red flag. Region-scope your runtime per tenant tier.
- How do you delete a tenant's data on offboarding? Audit log purge, memory deletion, credentials revocation. Build a single CLI command.
- How do you prove isolation to auditors? RLS policies, network policies, encryption at rest with per-tenant keys (KMS envelope encryption).
For SOC 2 and ISO 27001, you will need:
- Documented isolation architecture.
- Audit log of data access.
- Annual penetration test that includes tenant isolation.
- Customer-facing security documentation.
OpenClaw Cloud handles much of this for you; self-hosters own it.
Scaling Patterns
Horizontal scaling per agent
Each agent is independently scalable. A customer-support agent might need 50 replicas; a nightly-reporting agent needs 2.
openclaw deploy --agent customer-support --replicas 50
openclaw deploy --agent nightly-reporting --replicas 2
Use HPA (Horizontal Pod Autoscaler) with custom metrics from OpenClaw's Prometheus exporter to scale on queue depth.
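To reason about how the autoscaler will react to a backlog, it helps to compute the target by hand. The Kubernetes documentation gives the core HPA rule as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue). The sketch below applies that rule to a per-replica queue depth; the function name, the bounds, and the example numbers are assumptions, and the real controller also applies a tolerance band and stabilization windows that are omitted here.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=100):
    """Replica count from the HPA rule: ceil(current * currentMetric / targetMetric).

    For queue-depth scaling, current_metric is the observed messages per replica
    and target_metric is the backlog each replica is expected to handle.
    """
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = math.ceil(current_replicas * current_metric / target_metric)
    # Clamp to the configured bounds, as minReplicas/maxReplicas do in an HPA spec.
    return max(min_replicas, min(max_replicas, raw))


if __name__ == "__main__":
    # 10 replicas each seeing 250 queued messages, with a target of 100 per replica.
    print(desired_replicas(10, 250, 100))
```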
Cold start handling
Agents that wake up on demand need to be ready within the target SLO. For agents serving real-time chat, keep min replicas ≥ 1 and use warm-pool patterns. For batch agents, accept the cold start and provision on a schedule.
Token cost scaling
Token cost grows with tenant count and traffic. Build per-tenant cost dashboards. For tenants on the Free tier, enforce a hard cap on token usage. For Enterprise, response caching and prompt caching reduce costs (see our token efficiency guide).
Tenant Onboarding and Offboarding
Onboarding for Model 1:
openclaw tenant create acme-corp --tier pro
openclaw tenant set-quota acme-corp --max-runs 10
openclaw secret put acme-corp/salesforce '{"user":"...","pass":"..."}'
Onboarding for Model 3:
kubectl create namespace tenant-acme-corp
helm install openclaw-acme ./helm-chart \
  --namespace tenant-acme-corp \
  --set tenant_id=acme-corp \
  --set tier=pro
Offboarding for either:
openclaw tenant offboard acme-corp \
  --delete-audit \
  --delete-memory \
  --revoke-secrets \
  --grace-period-days 30
Always have a --dry-run flag and a documented rollback for the first 30 days.
Comparison Table
| Concern | Model 1 (shared) | Model 2 (dedicated) | Model 3 (namespace) |
|---|---|---|---|
| Cost per tenant | $ | $$$ | $$ |
| Isolation | Logical | Strong | Medium-Strong |
| Operational complexity | Low | High | Medium |
| Suitable scale | 100s-1000s tenants | 1-50 large tenants | 50-500 mixed tenants |
| Compliance fit | Mid | Best | Strong |
| Time to onboard | Seconds | Hours | Minutes |
Frequently Asked Questions
Can I migrate between models?
Yes, but it requires planning. Model 1 → Model 3 requires moving each tenant's runtime to its namespace; the data layer (audit, memory, secrets) usually stays. Model 1 → Model 2 is heaviest because each tenant gets its own data layer too. Plan migrations during low-traffic windows.
Should every agent run for every tenant?
No. Make agent activation per-tenant. A tenant on the Free tier might only get the support agent; Pro adds analytics; Enterprise unlocks custom workflows. The runtime can then route messages only to the agents that are active for each tenant.
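The routing decision itself can be very small. The sketch below mirrors the tier example in the answer above; the tier names, agent names, and both functions are hypothetical and would normally come from your tenant configuration rather than a hard-coded table.

```python
# Hypothetical tier -> agents mapping mirroring the example in the text.
TIER_AGENTS = {
    "free": {"support"},
    "pro": {"support", "analytics"},
    "enterprise": {"support", "analytics", "custom-workflows"},
}


def is_agent_active(tier: str, agent: str) -> bool:
    """True if the agent is enabled for the tier; unknown tiers get nothing."""
    return agent in TIER_AGENTS.get(tier, set())


def route(tenant_tiers: dict, tenant_id: str, agent: str) -> bool:
    """Decide whether a message for `agent` should be delivered for this tenant."""
    tier = tenant_tiers.get(tenant_id)
    return tier is not None and is_agent_active(tier, agent)


if __name__ == "__main__":
    tiers = {"acme-corp": "pro", "tiny-co": "free"}
    print(route(tiers, "acme-corp", "analytics"))
    print(route(tiers, "tiny-co", "analytics"))
```

Defaulting unknown tenants and unknown tiers to "no agents" keeps the failure mode closed: a misconfigured tenant gets nothing rather than everything.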
How do I handle tenant-specific Skills?
Two options: (a) the same Skill checks tenant_id and branches behavior; (b) different Skills with different names per tenant. We prefer (a) for variations and (b) for genuinely separate logic. For tenant-specific Skills, publish to a private Marketplace and grant install permission per tenant.
What about per-tenant LLM provider choices?
OpenClaw supports per-tenant model overrides. A tenant on a PII-sensitive workload might want Bedrock with Anthropic Claude in their region; another might want OpenAI for cost. Configure in the tenant config:
tenants:
  acme-corp:
    model_override:
      provider: bedrock
      name: anthropic.claude-3-7-sonnet-20250219-v1:0
      region: eu-west-1
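Conceptually, an override like this is a merge of the tenant's settings over a platform default. The sketch below shows that resolution step on plain dicts; it is an assumption about how such a lookup could work, not OpenClaw's implementation, and `resolve_model` and the default values are placeholders.

```python
def resolve_model(tenant_id, tenants_config, default_model):
    """Merge a tenant's model_override (if any) over the platform default."""
    override = tenants_config.get(tenant_id, {}).get("model_override", {})
    # Keys present in the override win; everything else falls back to the default.
    return {**default_model, **override}


if __name__ == "__main__":
    # Placeholder default for illustration; substitute your platform's real one.
    default = {"provider": "example-provider", "name": "example-model", "region": "us-east-1"}
    tenants = {
        "acme-corp": {
            "model_override": {
                "provider": "bedrock",
                "name": "anthropic.claude-3-7-sonnet-20250219-v1:0",
                "region": "eu-west-1",
            }
        }
    }
    print(resolve_model("acme-corp", tenants, default))
    print(resolve_model("other-tenant", tenants, default))
```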
Where can I get help with our specific multi-tenant architecture?
ECOSIRE has shipped all three models for SaaS, ERP, and regulated industry clients. We will help you pick, design, and deploy. Talk to our OpenClaw implementation team or browse OpenClaw products for multi-tenant agent templates. For embedded SaaS analytics, also see our Power BI Embedded comparison.
Multi-tenant OpenClaw rewards thinking through tenant context, isolation, and quotas before you go live. The patterns above scale from 5 tenants to thousands without rework as long as you commit to per-tenant credentials, tenant-scoped buses, and per-tenant observability from day one. Skip those, and the migration cost grows with every tenant you add.
Author
ECOSIRE Team, Technical Writing
The ECOSIRE technical writing team covers Odoo ERP, Shopify eCommerce, AI agents, Power BI analytics, GoHighLevel automation, and enterprise software best practices. Our guides help businesses make informed technology decisions.