Data governance is one of the hot topics right now. Especially if you’re in a regulated industry like finance or insurance, looking after the data you process is not always straightforward.
Recently with our client, a large banking institution, we dove deep into the subject of Bring Your Own Key (BYOK) in Power BI. What did we find?
It turns out that there are exceptions to BYOK coverage that you should be aware of if you want to use this feature. I have prepared a brief guide to indicate key areas to pay attention to.
Read on if you:
Recently, we worked with one of our enterprise clients on a Power BI Governance project. The organization was rolling out the Power BI service to its numerous business units comprising thousands of users.
One of the principal regulatory requirements was that all sensitive data at rest must be encrypted with a key managed by the client. We had to ensure that all used features were covered by BYOK, disabled, or protected in another way.
All data at rest in Power BI is encrypted. By default, the keys are generated and handled by the cloud service provider, i.e. Microsoft.
Bring Your Own Key (BYOK) is a security method allowing you to provide and control encryption keys. It means that you can revoke the keys, making the data unreadable even to the service provider, which may be necessary to meet compliance requirements.
BYOK is part of the Power BI Premium offering, as you can apply keys to reserved Premium capacities.
You can apply BYOK to other services, such as Microsoft 365, but you will need to handle it separately. I mention it here because it is important in one of the scenarios covered in this article.
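Microsoft documentation specifies a number of exceptions to BYOK. At the time of writing, these are Excel workbooks (unless the data is first imported in Power BI Desktop), Metrics, Analysis Services Live Connection, streaming datasets, and push datasets.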
In addition to the above, we also discovered that My Workspace, usage metrics reports, and dataflows need separate attention.
So, let’s go through each of these exceptions in turn and see what, if anything, can be done about them.
We all handle a lot of data in Microsoft Excel. Power BI supports workbooks created in Excel 2007 or later, saved in .xlsx, .xlsm or .csv formats. In Power BI service (via a browser) you can:
You may wonder why workbooks brought in through the service are not covered by BYOK, especially in Import scenarios. We certainly did, so we investigated.
It turned out that in this specific case, the dataset is created in a different backend storage, not the one tied to the Premium capacity assigned to the workspace.
One way to confirm this is to use the Power BI REST API and check the EncryptionStatus property of the dataset. It will read “NotSupported”.
It’s important to know, however, that ingesting data from Excel workbooks in Power BI Desktop and publishing the resulting dataset is fully covered by BYOK. This is also true for the resulting dataset being refreshed in the service from the Excel file’s source (e.g. network drive).
You should educate your users on which method is supported and detect non-compliant workbook usage. For our client, we built an automation that detects such cases so that corrective action can be taken.
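The EncryptionStatus check lends itself to a small script. The sketch below is only an illustration, not the automation we built for the client. It assumes you have already fetched the dataset list as JSON, for example from an admin endpoint of the Power BI REST API with the encryption details expanded, and that each entry carries an `encryption` object with an `EncryptionStatus` field. The exact endpoint, field casing, and the sample records are assumptions to verify against the API reference; the function names are hypothetical.

```python
from __future__ import annotations

from typing import Any


def _get_ci(mapping: dict[str, Any], key: str, default: Any = None) -> Any:
    """Case-insensitive lookup, because the exact JSON casing is an assumption."""
    wanted = key.lower()
    for name, value in mapping.items():
        if name.lower() == wanted:
            return value
    return default


def encryption_status(dataset: dict[str, Any]) -> str:
    """Return the dataset's encryption status, or "Unknown" if it is absent."""
    encryption = _get_ci(dataset, "encryption") or {}
    return _get_ci(encryption, "EncryptionStatus", "Unknown")


def find_datasets_outside_byok(
    datasets: list[dict[str, Any]],
    flagged: tuple[str, ...] = ("NotSupported",),
) -> list[dict[str, str]]:
    """List datasets whose encryption status is one of the flagged values."""
    return [
        {
            "id": _get_ci(ds, "id", ""),
            "name": _get_ci(ds, "name", ""),
            "status": encryption_status(ds),
        }
        for ds in datasets
        if encryption_status(ds) in flagged
    ]


if __name__ == "__main__":
    # Hypothetical, hand-written records standing in for a parsed API response.
    sample = [
        {"id": "1", "name": "Budget.xlsx", "encryption": {"EncryptionStatus": "NotSupported"}},
        {"id": "2", "name": "Sales model", "encryption": {"EncryptionStatus": "InSyncWithWorkspace"}},
    ]
    for hit in find_datasets_outside_byok(sample):
        print(f"{hit['name']} ({hit['id']}): {hit['status']}")
```

Passing a wider `flagged` tuple, for instance one that also includes "Unknown", makes the check stricter at the cost of more false positives.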
Metrics in Power BI are a collaborative and adaptable way to measure KPIs. With them, your teams can easily select the metrics that matter most and analyze them in a single view.
You can use them, for example, to measure progress against goals, proactively share updates with stakeholders, and dive deeper into data when something needs further analysis.
This is an optional functionality that can be disabled in the Power BI Admin Portal.
For regulated industries, we recommend disabling Metrics in the tenant settings, as they are not yet covered by BYOK.
Analysis Services is an analytical data engine used in decision support and business analytics, providing semantic data model capabilities for business intelligence. Analysis Services is available on different platforms:
Live Connection is a type of connection between a Power BI report and one of the Analysis Services platforms (the other connection types are DirectQuery and Import).
According to the Power BI Security Whitepaper, Live Connect doesn’t persist any data in Power BI.
Analysis Services Live Connection is a BYOK exception by default. Because it doesn’t involve data at rest, there is nothing to be covered by this type of encryption.
For more information, refer to the Microsoft article on the BYOK feature.
Streaming datasets are a specific case of real-time data streaming. Here, Power BI only stores data in a temporary cache that is not secured by BYOK.
Data from this dataset can only be utilized as a tile within a dashboard and has no underlying database.
Streaming datasets are designed to work on highly processed, low latency, simple data. As such, they should never be considered for any data containing sensitive information.
When historic data analysis is enabled for a streaming dataset, it becomes a push dataset.
BYOK is not available for this functionality. For this reason, if you’re in a regulated industry, it’s advisable to avoid streaming datasets or to ensure they don’t contain any sensitive data. Here too, we set up an automation to detect and remediate potential issues.
A push dataset is a Power BI dataset that can be created and populated only through the Power BI API. It is typically used in cases where a single table is populated with data streaming in real time.
Unlike a streaming dataset, a push dataset is backed by a real database. It is updated by pushing data through the Push API instead of pulling data from data sources at refresh time. The way it is populated is the key difference from a regular import model.
As with streaming datasets, BYOK is not supported in this case. Therefore, you should avoid it or ensure no sensitive data is kept in push datasets. Once again, for our client, an automation to detect and fix any issues was introduced.
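Detecting API-fed datasets can be sketched in a few lines. The example below is not the client automation. It assumes a dataset list already fetched from the Power BI REST API in which each entry exposes a boolean `addRowsAPIEnabled` flag. Treat that field name and the sample records as assumptions to check against the API reference, including whether the flag is also set for pure streaming datasets. The function name is hypothetical.

```python
from __future__ import annotations

from typing import Any


def find_push_api_datasets(datasets: list[dict[str, Any]]) -> list[str]:
    """Return names of datasets that accept rows through the API.

    Assumption: such datasets carry "addRowsAPIEnabled": true.
    """
    return [
        ds.get("name", ds.get("id", "<unnamed>"))
        for ds in datasets
        if ds.get("addRowsAPIEnabled") is True
    ]


if __name__ == "__main__":
    # Hypothetical records standing in for a parsed API response.
    sample = [
        {"id": "1", "name": "Sensor feed", "addRowsAPIEnabled": True},
        {"id": "2", "name": "Finance model", "addRowsAPIEnabled": False},
    ]
    for name in find_push_api_datasets(sample):
        print(f"Review for sensitive data: {name}")
```

The result is a review list rather than a verdict: whether a flagged dataset actually holds sensitive data still needs a human or a classification step.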
There are a few more features not covered by BYOK. Depending on your requirements, you might be able to use a workaround here but there are additional considerations you will need to take into account. Let’s look at them more closely.
My Workspace is the personal workspace for any Power BI user to work with their content. By default, it is assigned to a shared capacity which is encrypted with Microsoft-managed keys.
With a Power BI Premium license, My Workspaces can be assigned to Premium capacities that you may protect with BYOK. Easy, right? In a way, yes, but it also brings up other considerations, such as cost. This is especially true if your users use My Workspaces a lot.
Managing personal workspaces is problematic for other reasons as well – but that’s a topic for another article.
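To see how many personal workspaces still sit outside your BYOK-protected capacities, a check along the following lines may help. It is a sketch under stated assumptions: the workspace list has already been fetched from an admin endpoint of the Power BI REST API, and each entry has `type`, `isOnDedicatedCapacity`, and `capacityId` fields. The field names, the type values for personal workspaces, and the sample data are assumptions to verify against the API reference. The function and constant names are hypothetical.

```python
from __future__ import annotations

from typing import Any, Iterable

# Assumed type values that mark a personal workspace ("My Workspace").
PERSONAL_TYPES = {"PersonalGroup", "Personal"}


def personal_workspaces_outside_byok(
    workspaces: list[dict[str, Any]],
    byok_capacity_ids: Iterable[str],
) -> list[str]:
    """Return names of personal workspaces not hosted on a BYOK-protected capacity."""
    protected = {cid.lower() for cid in byok_capacity_ids}
    flagged = []
    for ws in workspaces:
        if ws.get("type") not in PERSONAL_TYPES:
            continue
        capacity_id = (ws.get("capacityId") or "").lower()
        # Shared capacity, or a Premium capacity without your key, both count.
        if not ws.get("isOnDedicatedCapacity") or capacity_id not in protected:
            flagged.append(ws.get("name", ws.get("id", "<unnamed>")))
    return flagged


if __name__ == "__main__":
    # Hypothetical records and capacity IDs for illustration only.
    sample = [
        {"name": "PersonalWorkspace Anna", "type": "PersonalGroup", "isOnDedicatedCapacity": False},
        {"name": "PersonalWorkspace Ben", "type": "PersonalGroup",
         "isOnDedicatedCapacity": True, "capacityId": "CAP-1"},
        {"name": "Finance", "type": "Workspace", "isOnDedicatedCapacity": False},
    ]
    print(personal_workspaces_outside_byok(sample, ["cap-1"]))
```

The list of BYOK-protected capacity IDs is supplied by the caller here because which capacities carry your key is something you maintain on the admin side.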
You can check report usage metrics to see how your content is being used. For this purpose, under the hood, Power BI creates a dataset containing usage data.
This dataset is created in a different backend storage, not the one tied to the Premium capacity assigned to the workspace. Hence, it is not covered by BYOK. You can verify this in the same way as for the Excel exception described above.
Why does this matter for data encryption? The usage metrics report contains personal information, such as the usernames of people who recently visited the report. If you are concerned about PII that is not covered by BYOK, you can examine the following tenant settings:
For more information, refer to Audit and usage tenant settings.
Dataflows provide ETL in Power BI. They promote the reusability of transformation logic, which can later be consumed in datasets.
A dataflow is a collection of tables that are created and managed in workspaces in the Power BI service. Physically, the tables are stored in Azure Data Lake Storage Gen2. By default, this is a data lake created and managed for you by Microsoft. In this scenario too, it is a different backend storage from the one where your other workspace elements land.
Even though you create a dataflow in a workspace assigned to a Premium capacity encrypted with your key, the default data lake will be provisioned somewhere else. The way to resolve this could be to “bring your own data lake” from your Azure subscription. Then – provided you also brought your own key for the data lake – you are covered 😊
For more information, refer to Configuring dataflow storage to use Azure Data Lake Gen 2.
Microsoft documentation also mentions customer-managed keys in Power BI. Not much information is provided, but interestingly, it clearly states that this feature is not the same as BYOK.
When deciding on the encryption solution for Power BI, make sure that you understand the differences and choose the one appropriate for your requirements. And if you have any doubts, just ask me for advice.