Part of our Performance & Scalability series
Power BI Capacity Planning: Sizing Premium and Fabric
Choosing the wrong Power BI capacity tier is one of the most expensive analytics mistakes an organization can make. Undersizing creates throttling, slow queries, and refresh failures during peak periods. Oversizing pays for compute that sits idle most of the day. Getting capacity right requires understanding how Power BI uses compute resources, what your workload actually demands, and how the SKU options map to those demands.
This guide covers Power BI Premium and Microsoft Fabric capacity planning — from understanding the compute model, through monitoring current utilization, to sizing new deployments and managing cost with autoscale.
Key Takeaways
- Power BI Premium capacity is measured in virtual cores (v-cores) that govern memory and compute throughput
- Microsoft Fabric uses Capacity Units (CUs) as the fundamental billing unit, replacing the Premium P-SKU model
- Background workloads (dataset refresh) and interactive workloads (query execution) compete for capacity resources
- The Capacity Metrics app is the essential monitoring tool for understanding resource utilization
- CPU smoothing over 24 hours means bursts are averaged — short peak periods don't immediately trigger throttling
- Autoscale (Premium Gen2) adds compute automatically during peak periods and removes it when demand drops
- Dataset memory consumption is the most common cause of capacity under-performance
- Proper capacity planning requires baseline measurement before sizing
Power BI Premium Capacity Model
Power BI Premium provides dedicated compute resources — isolated from the shared infrastructure used by Pro workspaces. This isolation delivers consistent performance regardless of what other Power BI tenants are doing.
The resource model: Premium capacity is measured in virtual cores (v-cores). Each v-core provides a specific amount of memory and CPU compute. The relationship between v-cores and capabilities determines what workloads the capacity can handle simultaneously.
| SKU | V-Cores | RAM | DirectQuery/Live Connection Throughput |
|---|---|---|---|
| P1 | 8 v-cores | 25 GB | 30 queries/second |
| P2 | 16 v-cores | 50 GB | 60 queries/second |
| P3 | 32 v-cores | 100 GB | 120 queries/second |
| P4 | 64 v-cores | 200 GB | 240 queries/second |
| P5 | 128 v-cores | 400 GB | 480 queries/second |
Microsoft Fabric replaces the P-SKU model with Fabric Capacity Units (CUs). Fabric F64 is roughly equivalent to P1, F128 to P2, and so on. The Fabric model allows more granular sizing and pay-as-you-go billing (pause/resume), which is often more cost-effective than the monthly subscription of P-SKUs.
| Fabric SKU | CUs | Equivalent P-SKU | Monthly Estimate |
|---|---|---|---|
| F2 | 2 CUs | — (small dev/test) | ~$262 |
| F4 | 4 CUs | — | ~$524 |
| F8 | 8 CUs | — | ~$1,047 |
| F16 | 16 CUs | — | ~$2,095 |
| F32 | 32 CUs | — | ~$4,189 |
| F64 | 64 CUs | P1 | ~$8,378 |
| F128 | 128 CUs | P2 | ~$16,756 |
| F256 | 256 CUs | P3 | ~$33,512 |
(Prices are approximate USD; actual pricing varies by region and negotiated agreements.)
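The monthly figures above can be sketched as a simple rate calculation. This is a minimal estimate assuming a pay-as-you-go rate of about $0.18 per CU-hour and Azure's 730-hour month convention, which roughly reproduces the table; actual rates vary by region, so treat both numbers as placeholders.

```python
# Sketch: estimate Fabric pay-as-you-go monthly cost from a SKU's CU count.
# The $0.18/CU-hour rate and 730-hour month are assumptions that roughly
# reproduce the table above; check current regional pricing before budgeting.
CU_HOUR_RATE_USD = 0.18   # assumed approximate US pay-as-you-go rate
HOURS_PER_MONTH = 730     # Azure's standard monthly-hours convention

def fabric_monthly_estimate(capacity_units: int) -> float:
    """Rough monthly cost for an always-on Fabric capacity."""
    return capacity_units * CU_HOUR_RATE_USD * HOURS_PER_MONTH

def fabric_monthly_estimate_paused(capacity_units: int, active_hours: float) -> float:
    """Cost when the capacity is paused outside active hours (dev/test pattern)."""
    return capacity_units * CU_HOUR_RATE_USD * active_hours

print(fabric_monthly_estimate(64))                 # always-on F64
print(fabric_monthly_estimate_paused(8, 10 * 22))  # F8 run 10 h/day, 22 workdays
```

The paused variant illustrates why pause/resume matters for dev/test: an F8 running only business hours costs a fraction of the always-on figure in the table.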
Workload Categories
Power BI capacity handles two categories of workload, and they compete for the same compute resources:
Background workloads run without user interaction:
- Dataset refresh (import mode refreshes)
- Dataflow refresh
- AI workloads (model training, inference)
- Paginated report rendering triggered by subscriptions
- Export operations
Interactive workloads respond to user interactions:
- Query execution (user opens a report page)
- DirectQuery/Live connection queries
- Dashboard tile refresh
- Report export triggered by a user
- Natural language Q&A
When both types of workload compete for the same v-cores, the capacity must have sufficient resources to handle peak overlap. A capacity that runs 20 simultaneous dataset refreshes overnight while handling 200 concurrent user queries during the business day may need to be sized for both peaks.
The Capacity Metrics App
The Microsoft Fabric Capacity Metrics app (previously Power BI Premium Capacity Metrics app) is the essential tool for capacity monitoring and planning. Install it from AppSource and connect it to your capacity.
What it shows:
CPU and Memory utilization by workload type. The utilization chart shows CPU consumption over time, with separate series for interactive and background workloads. The smoothed line shows the 24-hour smoothed average (what Power BI uses for throttling decisions).
Throttling events: When the 24-hour smoothed CPU exceeds 100% of capacity resources, Power BI begins throttling background workloads (delaying refreshes); if the overload persists or grows, interactive workloads are throttled as well. The metrics app shows throttling events with duration and severity.
Dataset memory: The memory waterfall shows which datasets are loaded into memory, how much memory they consume, and when they're evicted. A dataset that's constantly evicted and reloaded (high "evictions" count) is too large for the available memory — causing delays as users wait for the dataset to reload on each query.
Top datasets and reports by resource consumption: The metrics app identifies which datasets and reports consume the most resources — these are the candidates for optimization before scaling up.
Key metrics to monitor:
| Metric | Healthy | Warning | Critical |
|---|---|---|---|
| CPU Utilization (24h smoothed) | < 70% | 70–90% | > 90% |
| Memory Utilization | < 80% | 80–90% | > 90% |
| Dataset Evictions (daily, per dataset) | < 10 | 10–50 | > 50 |
| Interactive Query Wait | < 1s avg | 1–3s avg | > 3s avg |
| Refresh Success Rate | > 98% | 95–98% | < 95% |
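The thresholds above are easy to encode for automated alerting. This is an illustrative sketch; the metric names and boundaries mirror the table (they are planning guidance, not Microsoft-enforced limits), and refresh success rate is omitted because it scales in the opposite direction (higher is better).

```python
# Sketch: classify capacity health against the thresholds in the table above.
# Boundaries mirror the table; they are guidance, not hard Microsoft limits.
# Refresh success rate is excluded here because higher values are better.
THRESHOLDS = {
    # metric: (warning_floor, critical_floor) -- at/above warning escalates
    "cpu_smoothed_pct": (70, 90),
    "memory_pct":       (80, 90),
    "evictions_daily":  (10, 50),
    "query_wait_avg_s": (1, 3),
}

def classify(metric: str, value: float) -> str:
    warn, crit = THRESHOLDS[metric]
    if value > crit:
        return "critical"
    if value >= warn:
        return "warning"
    return "healthy"

print(classify("cpu_smoothed_pct", 65))   # healthy
print(classify("memory_pct", 85))         # warning
print(classify("evictions_daily", 120))   # critical
```

A script like this can run against exported Capacity Metrics data to flag capacities drifting toward the warning band before users notice.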
Sizing a New Deployment
When sizing a Power BI Premium deployment for the first time (without existing metrics data), the estimation process uses these inputs:
Step 1: Count users and usage patterns
- How many total users will access Power BI reports?
- What is the peak concurrent user count? (Typically 10–20% of total users)
- What are the peak usage hours? (Usually 9–11 AM and 2–4 PM business hours)
Step 2: Estimate dataset memory requirements
- Sum the uncompressed size of all datasets that will be active simultaneously
- Apply an average VertiPaq compression ratio of 5:1 to estimate in-memory size
- Add 20% overhead for query operations
- For most implementations, total dataset memory is the dominant sizing constraint
Step 3: Estimate refresh workload
- How many datasets need to refresh simultaneously at peak?
- What is the expected refresh duration for each?
- Peak refresh resource consumption = (number of simultaneous refreshes × average memory per dataset refresh)
Step 4: Add DirectQuery/Live Connection throughput
- How many users will use reports with DirectQuery?
- What is the expected peak queries per second?
- Compare against SKU throughput limits (P1 handles 30 DQ queries/second)
Example sizing calculation:
Organization with 500 Power BI users:
- 50 concurrent users at peak (10% of total)
- 15 active datasets, average 4 GB uncompressed → ~0.8 GB each in memory = 12 GB total dataset memory
- 10 datasets refresh overnight simultaneously, each consuming 2 GB during refresh = 20 GB refresh memory
- 20 DirectQuery report pages at peak = ~5 queries/second
Analysis: 32 GB peak memory (12 GB datasets + 20 GB refreshes) plus overhead exceeds P1's 25 GB limit, so P1 would be tight; P2 (50 GB) is the safer starting point. DirectQuery throughput is well within P1's 30 queries/second limit, so memory drives the sizing decision.
Starting with P1 and monitoring with the Metrics app for 30 days will reveal whether P2 is necessary.
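The four steps above can be sketched as a small calculator. The 5:1 compression ratio and 20% query overhead are the rules of thumb from Step 2, not measured values; validate against VertiPaq Analyzer output for your own models before committing to a SKU.

```python
# Sketch of the sizing estimate described in Steps 1-4. Compression ratio and
# query overhead are assumed rules of thumb, not measured values.
def sizing_estimate(n_datasets, avg_uncompressed_gb,
                    n_concurrent_refreshes, refresh_mem_gb,
                    compression_ratio=5.0, query_overhead=0.20):
    in_memory_each = avg_uncompressed_gb / compression_ratio
    dataset_mem = n_datasets * in_memory_each * (1 + query_overhead)
    refresh_mem = n_concurrent_refreshes * refresh_mem_gb
    return {
        "dataset_memory_gb": round(dataset_mem, 1),
        "refresh_memory_gb": round(refresh_mem, 1),
        "peak_memory_gb": round(dataset_mem + refresh_mem, 1),
    }

# The 500-user example: 15 datasets at 4 GB uncompressed, 10 overnight
# refreshes consuming 2 GB each.
est = sizing_estimate(15, 4.0, 10, 2.0)
print(est)  # peak ~34.4 GB with overhead applied -> above P1's 25 GB limit
```

Running the worked example through this produces roughly 14.4 GB of dataset memory (12 GB plus 20% overhead) and 20 GB of refresh memory, landing above P1's 25 GB and consistent with the recommendation to consider P2.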
Autoscale Configuration
Power BI Premium Gen2 (and Fabric) supports autoscale — automatically adding compute resources when demand exceeds the provisioned capacity, then removing them when demand drops.
Autoscale for Premium (P-SKUs): Configure in the Power BI Admin Portal → Capacity settings → Premium capacity → Autoscale. Set the maximum number of additional v-cores that can be added (1–71 for P1). When the capacity utilization approaches limits, autoscale adds v-cores in increments.
Autoscale billing: additional v-cores are billed per hour at a per-v-core rate. A P1 that adds 8 v-cores for 2 hours during a peak period pays for 16 v-core-hours.
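The billing arithmetic above is straightforward to model. In this sketch the hourly per-v-core rate is a placeholder (autoscale rates are published per region); only the v-core-hour calculation itself follows the P1 example in the text.

```python
# Sketch: autoscale overage billing. The rate is a placeholder -- look up the
# current per-region autoscale v-core rate; the v-core-hour arithmetic matches
# the P1 example above (8 v-cores for 2 hours = 16 v-core-hours).
def autoscale_overage(added_vcores: int, hours: float,
                      rate_per_vcore_hour: float):
    vcore_hours = added_vcores * hours
    return vcore_hours, vcore_hours * rate_per_vcore_hour

vch, cost = autoscale_overage(8, 2, rate_per_vcore_hour=3.50)  # assumed rate
print(vch)   # 16 v-core-hours
print(cost)
```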
Autoscale for Fabric: Fabric capacities can be paused and resumed (cost-effective for dev/test) and have burstable compute that scales within the CU limits purchased. Fabric also supports reservations (committed spend for significant discounts) alongside pay-as-you-go pricing.
When to use autoscale:
- You have predictable daily peaks (e.g., month-end financial reporting generates 3× normal load)
- You don't want to permanently provision for peak capacity that's only needed occasionally
- You want cost predictability with a safety valve for unexpected demand surges
When NOT to use autoscale:
- Sustained high utilization (you're consistently at capacity) — upgrade base tier instead
- Very large one-time report rendering loads — autoscale may not react fast enough
- Strict budget constraints where any variable billing is unacceptable
Capacity Optimization Before Scaling
Before upgrading to a larger capacity, optimize existing workloads. Most performance problems are fixable without spending more money.
Dataset optimization:
- Run DAX Studio's VertiPaq Analyzer to identify large tables and columns that can be removed or summarized
- Check for unused columns and measures consuming memory without being referenced in any report
- Optimize data types (use Integer instead of Text for date keys, Boolean instead of string for flags)
- Apply incremental refresh to reduce refresh duration and memory consumption during refresh cycles
Report optimization:
- Reduce the number of visuals per report page — each visual generates at least one DAX query on load
- Replace low-value visuals with cards or KPIs that generate simpler queries
- Avoid bidirectional relationships and complex DAX that generates multiple storage engine queries
- Use field parameters instead of many similar calculated columns
Refresh schedule optimization:
- Stagger refresh times to avoid multiple large datasets refreshing simultaneously
- Schedule lower-priority datasets during off-peak hours
- Use incremental refresh to shorten the refresh window for large datasets
- Pause or disable refreshes for rarely-used datasets
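Staggering can be planned with a simple wave schedule. This sketch assumes roughly equal refresh durations and illustrative names only; it is not a Power BI API, and the computed times would still be entered manually (or via the REST API) as each dataset's scheduled refresh.

```python
# Sketch: spread N refreshes across an off-peak window so at most
# `max_parallel` run at once. Assumes roughly equal refresh durations;
# this is planning arithmetic, not a Power BI service API.
from datetime import datetime, timedelta

def stagger_refreshes(window_start: datetime, n_datasets: int,
                      refresh_minutes: int, max_parallel: int):
    """Return one start time per dataset; each wave of `max_parallel`
    datasets begins after the previous wave's expected duration."""
    starts = []
    for i in range(n_datasets):
        wave = i // max_parallel
        starts.append(window_start + timedelta(minutes=wave * refresh_minutes))
    return starts

# 10 datasets, ~30-minute refreshes, at most 3 in parallel, starting 01:00.
starts = stagger_refreshes(datetime(2026, 1, 15, 1, 0), 10, 30, 3)
for t in starts:
    print(t.strftime("%H:%M"))
# waves of 3 at 01:00, 01:30, 02:00, and the last dataset at 02:30
```

Capping parallelism like this keeps peak refresh memory at `max_parallel × per-refresh memory` instead of the full sum, which directly shrinks the refresh term in the sizing estimate.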
Multi-Capacity Architecture
Large organizations sometimes use multiple capacities to isolate workloads, separate cost centers, or provide geographic redundancy.
Common multi-capacity patterns:
- Tier isolation: Production on P2, Development/Test on F8. Prevents dev refreshes from consuming production capacity.
- Workload isolation: Finance on one P1, HR on another P1. Keeps department workloads from affecting each other.
- Geographic distribution: US users on US East capacity, EU users on West Europe capacity. Reduces latency for regional user populations.
- Cost center separation: Each business unit has its own capacity, enabling precise cost chargeback.
Cross-capacity considerations: Datasets and reports must be published to workspaces assigned to specific capacities. A report can only use datasets in the same capacity (or import from a different capacity, which has performance implications). Plan workspace-to-capacity assignments before publishing to avoid cross-capacity data access patterns.
Frequently Asked Questions
What is the minimum Power BI capacity tier for enterprise use?
Power BI Premium P1 (or Fabric F64) is the minimum tier that supports the full enterprise feature set: paginated reports, deployment pipelines, XMLA endpoint access, AI insights, dataflow computed entities, and up to 400 GB model sizes. For smaller organizations or departmental implementations, Power BI Premium Per User (PPU) at $20/user/month provides most features without requiring a capacity commitment. For development and testing, Fabric F2 or F4 is sufficient.
How does the 24-hour CPU smoothing affect capacity planning?
Power BI uses a 24-hour CPU smoothing algorithm to determine whether a capacity is overloaded. Short bursts of high CPU consumption (a large refresh completing in 30 minutes) don't immediately cause throttling — the burst is averaged over the 24-hour window. This means you can handle moderate burst workloads without needing to size for peak. However, sustained high CPU (3+ hours of intensive workload) will push the smoothed average over the throttling threshold. Size for your sustained peak, not your momentary maximum.
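The effect described above can be illustrated with a simple rolling mean. Microsoft's actual smoothing algorithm is more nuanced than a flat 24-hour average, so this is only a sketch of why a short burst barely moves the smoothed value.

```python
# Sketch of the smoothing idea: a plain mean over a 24-hour window of
# per-minute CPU samples. Microsoft's real algorithm differs in detail;
# this only illustrates the burst-absorbing behaviour described above.
def smoothed_cpu(samples_pct):
    """Mean CPU% over the last 24 hours of per-minute samples (1440 values)."""
    window = samples_pct[-1440:]
    return sum(window) / len(window)

# 23.5 hours at a 50% baseline, then a 30-minute burst at 400%
# (e.g. a large refresh completing).
day = [50.0] * 1410 + [400.0] * 30
print(round(smoothed_cpu(day), 1))  # ~57.3 -- still far below the 100% threshold
```

Even a burst at 4× capacity for half an hour lifts the smoothed figure only a few points, whereas the same 400% load sustained for several hours would push the average past the throttling threshold, which is exactly why sizing should target the sustained peak.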
Is Microsoft Fabric better than Power BI Premium for new deployments?
For new enterprise deployments in 2026, Fabric is generally the recommended path. It provides the same Power BI capabilities as Premium plus additional workloads (Data Engineering, Data Science, Data Warehouse, Real-Time Analytics), more flexible billing (pause/resume, reservations), and a unified governance model. Organizations already on Premium P-SKUs with long-term contracts may find staying on Premium until renewal makes financial sense. All Power BI Premium content is compatible with Fabric.
How do I reduce capacity costs without degrading user experience?
The highest-impact cost reduction levers are: (1) optimize datasets to reduce memory footprint before sizing up, (2) stagger refresh schedules to prevent simultaneous resource competition, (3) use Fabric with pause/resume for development capacities (pay only during business hours), (4) enable autoscale on production capacity rather than permanently provisioning for peak, and (5) audit workspaces for unused reports and datasets that are consuming refresh resources without active users.
What monitoring tools does Microsoft provide for capacity health?
The primary tool is the Microsoft Fabric Capacity Metrics app (available on AppSource). It provides CPU utilization, memory utilization, throttling events, dataset activity, and query performance metrics. For deeper diagnostics, the XMLA endpoint (accessible via SSMS or Tabular Editor) allows querying DMVs (Dynamic Management Views) for real-time query performance data. The Power BI REST API provides programmatic access to capacity metrics for custom monitoring dashboards.
Next Steps
Capacity planning is an ongoing activity, not a one-time decision. Start with the right tier, monitor actively with the Capacity Metrics app, optimize workloads before scaling, and plan for growth. The organizations that get the most value from Power BI Premium treat capacity management as a performance engineering discipline.
ECOSIRE's Power BI performance optimization services include capacity assessment, workload analysis, and sizing recommendations. Contact us to audit your current capacity utilization and identify the most cost-effective path to improved performance.
Written by
ECOSIRE Research and Development Team
Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.