On-Premises Data Gateway: Setup and Configuration Guide
The on-premises data gateway is the bridge between Power BI Service (cloud) and your on-premises data sources. Without it, any data that lives behind your corporate firewall --- SQL Server databases, PostgreSQL instances, Oracle systems, file shares, ODBC sources --- cannot be refreshed in Power BI Service. The gateway is also required for live/DirectQuery connections from the cloud to on-premises databases.
Despite its critical role, the gateway is often treated as an afterthought. Organizations install it on a developer's laptop, skip high-availability configuration, and wonder why scheduled refreshes fail every weekend. This guide covers the full lifecycle: architecture decisions, installation, clustering, data source configuration, monitoring, performance tuning, and troubleshooting the most common errors.
Key Takeaways
- The on-premises data gateway comes in two modes: personal (single user, no sharing) and standard/enterprise (shared across the organization, supports clustering)
- Enterprise gateways should always be installed on a dedicated server (never a developer workstation) with reliable power, network, and uptime
- Gateway clustering with two or more nodes provides high availability --- if one node goes down, the other continues handling refresh requests
- All communication is outbound from the gateway to Azure Service Bus --- no inbound firewall ports need to be opened
- Data source credentials are encrypted locally on the gateway machine using the recovery key --- losing this key means reconfiguring all data sources
- The gateway logs are the single most useful troubleshooting resource, located in the GatewayComponents folder under the user's local app data
- Performance can be improved by enabling connection pooling for relational sources, setting appropriate timeout values, and ensuring the gateway machine has adequate RAM and CPU
Gateway Architecture
How the Gateway Works
The gateway establishes an outbound connection to Azure Service Bus using TCP port 443 (HTTPS). No inbound ports need to be opened in your firewall. The communication flow is:
- A user opens a Power BI report in the Service, or a scheduled refresh triggers
- Power BI Service sends a query request to Azure Service Bus
- The gateway (polling Azure Service Bus) picks up the request
- The gateway executes the query against the on-premises data source
- The gateway encrypts the results and sends them back through Azure Service Bus
- Power BI Service receives the results and renders the report or completes the refresh
This architecture means the gateway never receives inbound connections from the internet. It initiates all communication outbound, which simplifies firewall configuration significantly.
Personal Gateway vs Standard (Enterprise) Gateway
| Feature | Personal Gateway | Standard Gateway |
|---|---|---|
| Users | Single user only | Shared across organization |
| Data sources | User's own sources | Centrally managed sources |
| Clustering | Not supported | Up to 10 nodes |
| Administration | User self-service | Gateway admin role |
| Runs as | Windows application | Windows service |
| DirectQuery | Not supported | Supported |
| Dataflows | Not supported | Supported |
| Live connection | Not supported | Supported |
| Virtual network | Not supported | Supported (Premium) |
| Recommendation | Personal prototyping only | Production use |
For any production deployment, use the standard (enterprise) gateway. The personal gateway is suitable only for individual users prototyping with their own data sources.
Installation
Prerequisites
Before installing the gateway, ensure the target machine meets these requirements:
| Requirement | Minimum | Recommended |
|---|---|---|
| OS | Windows Server 2016 | Windows Server 2022 |
| CPU | 4 cores | 8 cores |
| RAM | 8 GB | 16 GB |
| Disk | 50 GB free | 100 GB SSD |
| .NET Framework | 4.8 | 4.8 (latest cumulative update) |
| Network | 1 Gbps | 1 Gbps with low latency to data sources |
| TLS | 1.2 required | 1.2 (1.0/1.1 disabled) |
Critical: Do not install the gateway on the same server as your database. The gateway competes for CPU and RAM during refresh operations, and co-locating it with the database can degrade both gateway and database performance.
Installation Steps
- Download the latest gateway installer from the official Microsoft download page
- Run the installer and select "On-premises data gateway (recommended)" for enterprise mode
- Accept the license terms and choose the installation directory
- Sign in with your organizational account (the account must be in the same Azure AD tenant as your Power BI Service)
- Select "Register a new gateway on this computer"
- Name the gateway (use a descriptive name: e.g., "PROD-GW-NY-01" for production gateway, New York, node 1)
- Set the recovery key --- store this securely in a password manager or key vault. You will need it to add cluster nodes or recover the gateway
- Complete the installation
The gateway service starts automatically and runs under the "NT SERVICE\PBIEgwService" account by default.
Changing the Service Account
By default, the gateway runs as a local service account. For accessing network resources (file shares, domain-joined databases with Windows authentication), you may need to change the service account to a domain account:
- Open Windows Services (services.msc)
- Find "On-premises data gateway service"
- Right-click, select Properties, then the Log On tab
- Select "This account" and enter the domain credentials
- Restart the service
Grant the service account the following:
- "Log on as a service" local policy
- Read access to data sources it needs to query
- Network access to data source servers
Gateway Clustering for High Availability
A single gateway is a single point of failure. If the machine goes down, all scheduled refreshes and DirectQuery connections fail. Gateway clustering solves this by distributing requests across multiple nodes.
Creating a Cluster
- Install the gateway on a second machine following the same installation steps
- During the "Register a new gateway" step, select "Add to an existing gateway cluster"
- Select the existing gateway name from the dropdown
- Enter the recovery key (the same key used for the first node)
- Complete the installation
The cluster now has two nodes. Requests are distributed across healthy nodes.
Load Balancing Configuration
By default, gateway clusters distribute requests randomly. You can configure load balancing:
Round-robin: Distributes requests evenly across all nodes. Best for clusters with identical hardware.
Weighted routing: Directs more requests to more powerful nodes. Configure in the Power BI admin portal under gateway settings.
Failover only: All requests go to the primary node. Secondary nodes only activate if the primary is unavailable. Best for cost-conscious deployments with a standby server.
Recommended Cluster Topology
For production deployments, ECOSIRE recommends a minimum of two gateway nodes:
| Component | Node 1 | Node 2 |
|---|---|---|
| Role | Primary | Secondary |
| Location | Primary data center | DR site or same DC |
| Hardware | 8 cores, 16 GB RAM | 8 cores, 16 GB RAM |
| Network | 1 Gbps, low latency | 1 Gbps, low latency |
| Maintenance window | Sunday 2-4 AM | Saturday 2-4 AM |
Stagger maintenance windows so both nodes are never down simultaneously. Windows updates, .NET patches, and gateway version upgrades should be applied to one node at a time.
Data Source Configuration
Adding a Data Source
After installing the gateway, configure data sources in the Power BI Service:
- Go to Settings (gear icon), then Manage Gateways
- Select your gateway cluster
- Click "Add data source"
- Choose the data source type (SQL Server, PostgreSQL, Oracle, ODBC, etc.)
- Enter the connection details (server name, database name)
- Select the authentication method (Windows, Basic, OAuth2)
- Enter credentials
- Test the connection
Supported Data Source Types
The standard gateway supports over 80 data source types. The most common for Power BI:
| Data Source | Auth Methods | DirectQuery | Notes |
|---|---|---|---|
| SQL Server | Windows, Basic, OAuth | Yes | Most common enterprise source |
| PostgreSQL | Basic | Yes | Used by Odoo, many open-source apps |
| Oracle | Windows, Basic | Yes | Requires Oracle client on gateway |
| MySQL | Basic | Yes | Community connector |
| SAP HANA | Basic, SAML | Yes | Requires SAP HANA client |
| File (CSV/Excel) | N/A | No | Files must be on a network share |
| ODBC | Basic, Windows | Yes | Generic connector for any ODBC source |
| Web API | Anonymous, Basic, OAuth | No | For REST/OData endpoints |
Credential Encryption
Data source credentials are encrypted using the recovery key and stored locally on the gateway machine. They are never sent to the cloud in plaintext. When you add a cluster node, credentials are synced using the shared recovery key.
Important: If you lose the recovery key and all gateway nodes fail, you must:
- Install a new gateway with a new recovery key
- Reconfigure all data sources and credentials
- Re-map all datasets in Power BI Service to the new gateway
Store the recovery key in Azure Key Vault or your organization's password manager.
Connection Pooling
For relational databases (SQL Server, PostgreSQL, Oracle), enable connection pooling to reuse database connections across refresh operations:
In the gateway configuration file (Microsoft.PowerBI.EnterpriseGateway.exe.config):
<setting name="PoolConnections" serializeAs="String">
<value>True</value>
</setting>
<setting name="MinPoolSize" serializeAs="String">
<value>2</value>
</setting>
<setting name="MaxPoolSize" serializeAs="String">
<value>20</value>
</setting>
Connection pooling reduces the overhead of establishing new database connections for each query, especially during DirectQuery workloads with many concurrent users.
Scheduled Refresh Configuration
Setting Up Scheduled Refresh
After publishing a dataset to Power BI Service:
- Go to the dataset settings
- Under "Gateway connection," select your gateway and the configured data source
- Under "Scheduled refresh," enable the toggle
- Set the refresh frequency (daily, weekly, or specific times)
- Configure the time zone
- Optionally set up failure notifications
Refresh Frequency Limits
| License | Max Refreshes per Day | Minimum Interval |
|---|---|---|
| Power BI Pro | 8 | 3 hours |
| Power BI Premium (per capacity) | 48 | 30 minutes |
| Power BI Premium Per User | 48 | 30 minutes |
Refresh Windows and Staggering
Do not schedule all dataset refreshes at the same time. The gateway has finite CPU and memory, and concurrent refreshes compete for resources.
Best practice: Create a refresh schedule that staggers datasets across the available window:
| Time | Dataset | Priority |
|---|---|---|
| 1:00 AM | Finance - GL Summary | Critical |
| 1:30 AM | Sales - Pipeline | Critical |
| 2:00 AM | HR - Headcount | High |
| 2:30 AM | Inventory - Stock Levels | High |
| 3:00 AM | Manufacturing - OEE | Medium |
| 3:30 AM | Marketing - Campaign Metrics | Medium |
Critical datasets refresh first, ensuring they complete even if later refreshes encounter issues.
Incremental Refresh and the Gateway
Incremental refresh significantly reduces the data volume processed through the gateway. Instead of refreshing the entire dataset, only new and changed rows are fetched. This is especially important for large datasets where full refresh would take hours and consume excessive gateway resources.
Configure incremental refresh in Power BI Desktop (see the RangeStart/RangeEnd parameter approach), then publish to the Service. The gateway handles the parameterized queries automatically.
Firewall and Proxy Configuration
Required Outbound Connections
The gateway requires outbound HTTPS (TCP 443) access to:
| Destination | Purpose |
|---|---|
| *.servicebus.windows.net | Azure Service Bus (query relay) |
| *.frontend.clouddatahub.net | Gateway registration and updates |
| *.core.windows.net | Azure Blob Storage (data transfer) |
| login.microsoftonline.com | Azure AD authentication |
| *.msftncsi.com | Network connectivity check |
| download.microsoft.com | Gateway updates |
If your firewall requires explicit IP allowlisting instead of wildcard domains, use the Azure IP Ranges JSON file from Microsoft (updated weekly) to find the IP ranges for Azure Service Bus in your region.
Proxy Server Configuration
If the gateway must route through a corporate proxy:
- Edit
Microsoft.PowerBI.EnterpriseGateway.exe.config - Add the proxy configuration in the
<system.net>section:
<system.net>
<defaultProxy useDefaultCredentials="true">
<proxy proxyaddress="http://proxy.company.com:8080"
bypassonlocal="true" />
</defaultProxy>
</system.net>
- Restart the gateway service
If the proxy requires specific credentials (not pass-through Windows authentication), you may need to use a proxy PAC file or configure the proxy to allow the gateway's service account without additional authentication.
TLS Configuration
The gateway requires TLS 1.2. If your environment still has TLS 1.0 or 1.1 enabled, the gateway will use TLS 1.2 by default. However, if the data source server only supports TLS 1.0, the connection will fail.
Verify TLS 1.2 is enabled in the Windows registry:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Client
Enabled = 1 (DWORD)
DisabledByDefault = 0 (DWORD)
Monitoring and Logging
Gateway Logs
The gateway writes detailed logs to:
C:\Users\<ServiceAccount>\AppData\Local\Microsoft\On-premises data gateway\
Key log files:
| File | Contents |
|---|---|
| GatewayInfo*.log | General gateway operations, startup, shutdown |
| GatewayErrors*.log | Errors and exceptions |
| Mashup*.log | Power Query (M) engine operations |
| Report*.log | Query execution details, performance counters |
Enabling Additional Logging
For troubleshooting, enable verbose logging:
- Open the gateway configuration application
- Go to Diagnostics
- Enable "Additional logging"
- Reproduce the issue
- Export logs using the "Export logs" button (creates a ZIP of all log files)
- Disable additional logging after troubleshooting (it generates large log volumes)
Performance Counters
The gateway exposes Windows performance counters under the "On-premises data gateway" category:
| Counter | Description | Alert Threshold |
|---|---|---|
| Active Connections | Current open connections to data sources | > 50 |
| Queries Executed/sec | Query throughput | Baseline + 50% |
| Average Query Duration | Time to execute queries | > 30 seconds |
| Queue Length | Pending queries waiting for execution | > 10 |
| Memory Usage | Gateway process memory consumption | > 80% of available |
| CPU Usage | Gateway process CPU consumption | > 70% sustained |
Set up Windows Performance Monitor or a monitoring tool (Prometheus, Datadog, Azure Monitor) to track these counters and alert on thresholds.
Power BI Admin Portal Monitoring
In the Power BI admin portal:
- Go to Admin Portal, then Gateway Management
- View all gateways, their status (online/offline), and version
- See data source usage statistics
- Monitor refresh success/failure rates
Configure email notifications for gateway offline events and refresh failures.
Performance Tuning
Hardware Right-Sizing
The gateway's performance is primarily bound by:
- CPU — for query parsing, data compression, and encryption
- RAM — for holding intermediate query results
- Network — for transferring data to Azure Service Bus
Sizing guidelines:
| Scenario | CPU | RAM | Network |
|---|---|---|---|
| 5 datasets, daily refresh | 4 cores | 8 GB | 100 Mbps |
| 20 datasets, twice daily | 8 cores | 16 GB | 1 Gbps |
| 50+ datasets, DirectQuery | 16 cores | 32 GB | 1 Gbps |
| Heavy DirectQuery, many concurrent users | 16+ cores | 64 GB | 10 Gbps |
Mashup Engine Settings
The gateway uses the Power Query (Mashup) engine for data transformation. Configure in the gateway app:
Max concurrent queries: Default is the number of CPU cores times 2. Increase for I/O-bound workloads (waiting on slow data sources). Decrease for CPU-bound workloads (heavy transformations).
Memory limit per query: Default is no limit. Set a limit (e.g., 2 GB) to prevent a single runaway query from consuming all available RAM.
Network Optimization
Locate the gateway close to the data source. Network latency between the gateway and the data source is multiplied by the number of queries per refresh. A gateway in the same data center as the database minimizes latency.
Do not locate the gateway based on proximity to Azure. The Azure Service Bus connection is a single persistent TCP connection. Latency to Azure affects initial connection setup but not query throughput.
Use a wired connection. Never run a production gateway on Wi-Fi. The intermittent connectivity causes refresh failures.
Query Optimization at the Source
The fastest way to improve gateway performance is to optimize the queries it executes:
- Use custom SQL queries instead of importing entire tables (reduce data volume)
- Create database indexes on columns used in WHERE clauses and JOINs
- Use views with pre-joins and pre-aggregations for complex data models
- Enable query folding in Power Query to push transformations to the database
- Implement incremental refresh to reduce the data volume per refresh cycle
Troubleshooting Common Errors
"The gateway is not reachable"
Cause: The gateway service is stopped, the machine is down, or network connectivity to Azure is blocked.
Resolution:
- Check if the gateway Windows service is running (services.msc)
- Verify outbound HTTPS to *.servicebus.windows.net is allowed
- Check proxy settings if behind a corporate proxy
- Verify the gateway machine has internet connectivity
- Check if the gateway version is outdated (auto-updates can fail silently)
"Unable to connect to the data source"
Cause: Incorrect credentials, network connectivity to the data source, or driver issues.
Resolution:
- Test the connection in the gateway configuration app (Diagnostics, then Test Connection)
- Verify the data source server is reachable from the gateway machine (ping, telnet to port)
- Verify credentials are correct and the account is not locked/expired
- For Oracle and SAP, verify the required client libraries are installed on the gateway machine
- Check the data source's firewall allows connections from the gateway's IP
"The on-premises data gateway's refresh is taking too long"
Cause: Large dataset, slow queries, insufficient gateway resources, or network bottleneck.
Resolution:
- Enable incremental refresh to reduce data volume
- Optimize SQL queries (add indexes, reduce columns, filter rows)
- Check gateway machine CPU and RAM usage during refresh
- Stagger refresh schedules to reduce concurrent load
- Consider adding a second gateway node for load distribution
"Data source credentials are invalid"
Cause: Password changed, account locked, or Kerberos delegation misconfigured.
Resolution:
- Re-enter credentials in Power BI Service (dataset settings, then Gateway connection)
- If using Windows authentication with Kerberos, verify:
- The gateway service account has delegation privileges in Active Directory
- SPNs are correctly configured for the data source
- The KDC (domain controller) is reachable from the gateway
"Gateway version is out of date"
Cause: Auto-update failed or was disabled.
Resolution:
- Download the latest gateway installer from Microsoft
- Run the installer on the existing gateway machine (it upgrades in place)
- For clusters, upgrade one node at a time with a gap between upgrades
- Verify the gateway version in the Power BI admin portal after upgrade
Security Best Practices
Principle of Least Privilege
- The gateway service account should have read-only access to data sources
- Do not use domain admin or database admin accounts
- Create dedicated service accounts per data source type if your security policy requires it
- Rotate service account passwords on a regular schedule and update the gateway data source configuration
Recovery Key Management
The recovery key encrypts all locally stored credentials. Treat it with the same care as a database master key:
- Store in Azure Key Vault or an enterprise password manager
- Document who has access to the recovery key
- Include recovery key rotation in your key management policy
- Test recovery by restoring a gateway from backup with the recovery key
Network Segmentation
Place the gateway in a network segment that can reach:
- Data source servers (SQL Server, PostgreSQL, Oracle, etc.)
- Azure Service Bus (outbound HTTPS)
- Azure AD (outbound HTTPS)
Block all other inbound and outbound traffic. The gateway does not need inbound connections from any source.
Audit Trail
Enable Windows Security auditing on the gateway machine to track:
- Service account logon events
- Gateway configuration changes
- Data source access patterns
Forward these events to your SIEM (Splunk, Sentinel, Datadog) for centralized monitoring.
Migration and Upgrade Scenarios
Migrating to a New Gateway Machine
- Install the gateway on the new machine
- During registration, select "Migrate, restore, or takeover an existing gateway"
- Enter the recovery key from the original gateway
- The new machine inherits all data source configurations and credentials
- Verify all data sources show as connected in the Power BI admin portal
- Update any IP-based firewall rules to include the new machine's IP
- Decommission the old gateway machine
Upgrading Gateway Versions
Microsoft releases gateway updates monthly. Best practices:
- Subscribe to the gateway release notes for advance notice of changes
- Test new versions in a non-production gateway cluster first
- For production clusters, upgrade one node at a time with a 24-hour gap
- Verify refresh success rates after each node upgrade
- Keep at least one node on the previous version until the new version is validated
The gateway supports N-1 version compatibility in clusters --- nodes do not need to run the exact same version.
FAQ
Can I install the gateway on a virtual machine?
Yes. The gateway runs on physical and virtual machines, including Azure VMs, AWS EC2, and on-premises Hyper-V or VMware. For Azure VMs, consider using the VNet data gateway (in preview for Premium capacities) which eliminates the need for a self-managed gateway entirely. For on-premises VMs, ensure the VM has dedicated (not shared) CPU and RAM resources, and that the hypervisor does not aggressively overcommit resources.
How many data sources can a single gateway support?
There is no hard limit on the number of data sources per gateway. In practice, gateways commonly support 50 to 100 data sources without issues. The limiting factor is the concurrent query load during refresh windows, not the number of configured data sources. If refresh times are degrading, add cluster nodes rather than creating additional gateway installations.
Does the gateway support Linux?
No. The on-premises data gateway requires Windows (Server 2016 or later). If your data sources run on Linux, install the gateway on a Windows machine that has network access to the Linux data source servers. The gateway connects to the data source over the network --- it does not need to run on the same operating system as the data source.
What happens if both gateway nodes in a cluster go offline simultaneously?
All scheduled refreshes fail, and all DirectQuery connections return errors. Power BI Service detects the offline status and sends notifications to gateway admins (if configured). Reports using cached data (Import mode) continue to display the last successfully refreshed data. When at least one node comes back online, pending refresh requests are processed automatically. To prevent this scenario, stagger maintenance windows and place cluster nodes on separate physical infrastructure.
Can the gateway handle real-time streaming data?
The gateway is designed for query-response patterns, not streaming. For real-time data, consider Power BI streaming datasets (which bypass the gateway entirely), Azure Stream Analytics, or Azure Event Hubs with Power BI real-time dashboards. The gateway supports DirectQuery for near-real-time access to on-premises databases, but each report interaction triggers a new query rather than receiving a continuous data stream.
Written by
ECOSIRE Research and Development Team
Building enterprise-grade digital products at ECOSIRE. Sharing insights on Odoo integrations, e-commerce automation, and AI-powered business solutions.
Related Articles
Power BI AI Features: Copilot, AutoML, and Predictive Analytics
Master Power BI AI features including Copilot for natural language reports, AutoML for predictions, anomaly detection, and smart narratives. Licensing guide.
Complete Guide to Power BI Dashboard Development
Learn how to build effective Power BI dashboards with KPI design, visual best practices, drill-through pages, bookmarks, mobile layouts, and RLS security.
Power BI Data Modeling: Star Schema Design for Business Intelligence
Master Power BI data modeling with star schema design, fact and dimension tables, DAX measures, calculation groups, time intelligence, and composite models.