Wednesday, December 11, 2019

Power BI Connection Types and Data refresh

I wanted to take a quick moment to outline the different connection types and data refresh. To understand how Power BI refreshes your datasets, reports and dashboards, you must aware of the following concepts:
Storage modes and dataset type
  • Power BI refresh types
  • Power BI Report Server
  • Storage modes and dataset type

A Power BI dataset can operate in one of the following modes to access data from variety data sources
  • Import mode
  • DirectQuery mode
  • LiveConnect mode
  • Push mode

Datasets in Import mode
Power BI imports the data from the original data sources into the dataset.
Because Power BI caches the data, Import mode dataset sizes can be substantial. Refer to the following table for maximum dataset sizes per capacity. Stay well below the maximum dataset sizes to avoid refresh issues that might occur if your datasets require more than the maximum available resources during a refresh operation.
Schedule Refresh Scheduled refresh dialog box
Datasets in DirectQuery/LiveConnect mode
Power BI does not import data over connections that operate in DirectQuery/LiveConnect mode. Instead, the dataset returns results from the underlying data source whenever a report or dashboard queries the dataset.
Although DirectQuery mode and LiveConnect mode are similar in that Power BI forwards the queries to the source, it is important to note that Power BI does not have to transform queries in LiveConnect mode. The queries go directly to the Analysis Services instance hosting the database without consuming resources on shared capacity or a Premium capacity.
Live Connection refers to connecting directly to a Model (SSAS Multidimensional or Tabular) in this regard, all your modelling work is done on those. With Direct Query - you are connecting to a database, and you can create calculated columns and measures in Power BI. So, Power BI serves as your model layer, while keeping your data in your databases.
The Scheduled cache refresh section of the Datasets tab is not available for datasets in import mode. These datasets don't require a separate tile refresh because Power BI refreshes the tiles automatically during each scheduled or on-demand data refresh.
Push datasets
Push datasets don't contain a formal definition of a data source, so they don't require you to perform a data refresh in Power BI. You refresh them by pushing your data into the dataset through an external service or process, such as Azure Stream Analytics. This is a common approach for real-time analytics with Power BI. Power BI still performs cache refreshes for any tiles used on top of a push dataset.
Power BI refresh types
A Power BI refresh operation can consist of multiple refresh types, including data refresh, OneDrive refresh, refresh of query caches, tile refresh, and refresh of report visuals.

Data refresh

For Power BI users, refreshing data typically means importing data from the original data sources into a dataset, either based on a refresh schedule or on-demand. Power BI limits datasets on shared capacity to 8 daily refreshes. If the dataset resides on a Premium capacity, you can schedule up to 48 refreshes per day in the dataset settings. It is also important to call out that the shared-capacity limitation for daily refreshes applies to both scheduled refreshes and API refreshes combined. Also note that datasets on a Premium capacity don't impose limitations for API refreshes.
Data refreshes must complete in less than 2 hours on shared capacity. If your datasets require longer refresh operations, consider moving the dataset onto a Premium capacity. On Premium, the maximum refresh duration is 5 hours.

OneDrive refresh

If you created your datasets and reports based on a Power BI Desktop file, Excel workbook, or comma separated value (.csv) file on OneDrive or SharePoint Online, Power BI performs another type of refresh, known as OneDrive refresh. 
Unlike a dataset refresh during which Power BI imports data from a data source into a dataset, OneDrive refresh synchronizes datasets and reports with their source files. By default, Power BI checks about every hour if a dataset connected to a file on OneDrive or SharePoint Online requires synchronization.

Refresh of query caches

If your dataset resides on a Premium capacity, you might be able to improve the performance of any associated reports and dashboards by enabling query caching, as in the following screenshot. Query caching instructs the Premium capacity to use its local caching service to maintain query results, avoiding having the underlying data source compute those results.

Tile refresh

Power BI maintains a cache for every tile visual on your dashboards and proactively updates the tile caches when data changes. In other words, tile refresh happens automatically following a data refresh.
By default, Power BI maintains a single cache for every tile, but if you use dynamic security to restrict data access based on user roles, as covered in the article row-level security (RLS) with Power BI, then Power BI must maintain a cache for every role and every tile. The number of tile caches multiplies by the number of roles.

Refresh of report visuals

This refresh process is less important because it is only relevant for live connections to Analysis Services. 

How to configure Power BI report scheduled refresh


To refresh data in your Power BI report, a scheduled refresh plan must be created. This is done within the Manage area of a Power BI report. This schedule is same as SSRS Report Schedule.
Cheers!
Uma