Michiel Vromans

Databricks: ‘Publish to Power BI’

"Seamless Data Workflows: Azure Databricks Now Enables Direct Publishing to Power BI.”

Azure Databricks has announced that publishing to Power BI is generally available, allowing users to create Power BI semantic models directly from Unity Catalog. This integration simplifies workflows between data engineers and business analysts, turning raw data into actionable insights with minimal effort.


When data models are carefully crafted in Databricks' Unity Catalog, all tables, relationships (including primary and foreign keys), and descriptions are synchronized upon publishing. This eliminates the need to switch between Databricks and Power BI Desktop, simplifying the process of making your data available for visualization and analysis. Besides streamlining the workflow, the key benefits are:


  • Direct integration: Publish datasets to Power BI right from within Databricks, at the end of your data pipelines.

  • Simplified connection management: Automatically handle connections and credentials, removing the need for manual setup.

  • Always in sync: Push changes to tables and their relationships effortlessly, ensuring data remains up-to-date.

  • Single source of truth: Define entity relationships once in Unity Catalog, and have them reflected in Power BI without duplication of effort.

  • Consistent documentation: Table and column descriptions in Azure Databricks are automatically copied to the corresponding Power BI fields.



Requirements

The functionality is generally available, but certain permissions are required, and administrator rights are needed to grant them. These permissions allow the integration to create content and to read from and write to datasets. Specifically:


  • The data in Databricks must be in Unity Catalog.

  • You must grant permissions to the app Databricks Dataset Publishing Integration, which is used to publish from Databricks.

  • You must have a Power BI Premium license (either Premium capacity or Premium Per User).

  • The XMLA endpoint must be enabled for both read and write operations.

  • In the Power BI workspace settings, you must enable Users can edit data models in Power BI service (preview) in order to edit the semantic model after it is published.



Use case

To demonstrate the “Publish to Power BI” function, consider a boutique shop analysing the sales of its apparel items, sliced by customer, product and item size. To enable the analysis, the engineering team creates a star schema in the Databricks warehouse with one fact table and three dimensions:


  • Customer: the client purchasing the item

  • Product: the list of items available in the store and their category

  • Size: the available sizes in the store (S, M, L, XL)


The fact table records each item purchased and its price. The tables are created with primary and foreign keys defined and descriptions filled in. The image below shows the view in Unity Catalog.
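For illustration, such a star schema could be created with SQL DDL run from a Databricks notebook, as in the sketch below. The catalog, schema, table and column names (boutique.shop, dim_customer, fact_sales, and so on) are assumptions for this example, and a SparkSession named spark is assumed to be available in the notebook.

```python
# Sketch: create the star schema in Unity Catalog with informational
# primary/foreign key constraints and descriptions (COMMENTs).
# All object names are illustrative; `spark` is the notebook's SparkSession.

ddl_statements = [
    # Dimension: the client purchasing the item
    """CREATE TABLE IF NOT EXISTS boutique.shop.dim_customer (
         customer_id    BIGINT NOT NULL,
         customer_name  STRING COMMENT 'Full name of the customer',
         customer_email STRING COMMENT 'E-mail address of the customer',
         country        STRING COMMENT 'Country of residence',
         CONSTRAINT pk_customer PRIMARY KEY (customer_id)
       ) COMMENT 'The client purchasing the item'""",

    # Dimension: the items available in the store and their category
    """CREATE TABLE IF NOT EXISTS boutique.shop.dim_product (
         product_id   BIGINT NOT NULL,
         product_name STRING COMMENT 'Name of the apparel item',
         category     STRING COMMENT 'Product category, e.g. shirts or trousers',
         CONSTRAINT pk_product PRIMARY KEY (product_id)
       ) COMMENT 'The list of items available in the store and their category'""",

    # Dimension: the available sizes in the store
    """CREATE TABLE IF NOT EXISTS boutique.shop.dim_size (
         size_id   BIGINT NOT NULL,
         size_name STRING COMMENT 'Size label: S, M, L or XL',
         CONSTRAINT pk_size PRIMARY KEY (size_id)
       ) COMMENT 'The available sizes in the store (S, M, L, XL)'""",

    # Fact table: the item purchased and the price
    """CREATE TABLE IF NOT EXISTS boutique.shop.fact_sales (
         sale_id     BIGINT NOT NULL,
         customer_id BIGINT,
         product_id  BIGINT,
         size_id     BIGINT,
         price       DECIMAL(10, 2) COMMENT 'Sales price of the item',
         CONSTRAINT pk_sales PRIMARY KEY (sale_id),
         CONSTRAINT fk_sales_customer FOREIGN KEY (customer_id)
           REFERENCES boutique.shop.dim_customer (customer_id),
         CONSTRAINT fk_sales_product FOREIGN KEY (product_id)
           REFERENCES boutique.shop.dim_product (product_id),
         CONSTRAINT fk_sales_size FOREIGN KEY (size_id)
           REFERENCES boutique.shop.dim_size (size_id)
       ) COMMENT 'One row per item purchased, with its sales price'""",
]

for ddl in ddl_statements:
    spark.sql(ddl)
```

Because the primary and foreign keys are declared as informational constraints in Unity Catalog, they are picked up as relationships when the schema is published, and the COMMENTs become the descriptions in the semantic model.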




Publishing to Power BI

After crafting the star schema in Databricks, the data is published to Power BI using the “Connect to partner” feature. We want to publish to the workspace ‘Demo-publish-to-power-bi’ and use DirectQuery mode. These options can be selected in the “Connect to partner” dialog. Databricks recommends using OAuth authentication because it offers better security than personal access tokens. Note that the OAuth credentials might need to be configured on the Power BI dataset settings page under Data source credentials.




Governance and Observability

The schema can be published using Import mode or DirectQuery mode. By integrating Microsoft Entra ID with single sign-on (SSO), users can access the data in Power BI with their own credentials. Governance and observability are managed in Databricks without the need to duplicate security controls in Power BI. Configuring the semantic model in DirectQuery mode ensures that the governance rules (e.g. row-level security, data masking) established in Databricks are automatically enforced in Power BI. This approach streamlines security management, enhances compliance, and avoids redundancy.
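As an illustration of such rules, the sketch below defines a Unity Catalog row filter and column mask on the example tables from earlier. Because the semantic model runs in DirectQuery mode, Power BI queries are evaluated in Databricks and these rules apply to report users as well. The group and function names (sales_admins, country_filter, mask_email) are assumptions for this example.

```python
# Sketch: row-level security and data masking defined once in Unity Catalog.
# Function, group and column names are illustrative; `spark` is the
# notebook's SparkSession.

# Row filter: non-admin users only see customers from Belgium.
spark.sql("""
  CREATE OR REPLACE FUNCTION boutique.shop.country_filter(country STRING)
  RETURN IS_ACCOUNT_GROUP_MEMBER('sales_admins') OR country = 'BE'
""")
spark.sql("""
  ALTER TABLE boutique.shop.dim_customer
  SET ROW FILTER boutique.shop.country_filter ON (country)
""")

# Column mask: hide customer e-mail addresses from non-admin users.
spark.sql("""
  CREATE OR REPLACE FUNCTION boutique.shop.mask_email(email STRING)
  RETURN CASE WHEN IS_ACCOUNT_GROUP_MEMBER('sales_admins')
              THEN email ELSE '****@****' END
""")
spark.sql("""
  ALTER TABLE boutique.shop.dim_customer
  ALTER COLUMN customer_email SET MASK boutique.shop.mask_email
""")
```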



Configuring the Dataset

Under “Dataset Name”, users can select “Publish as a new data set”, which creates a new semantic model, or “Use an existing data set”, which updates an existing model without overwriting it and keeps the existing connections in place. As this is a new request, we publish the schema as a new dataset.




After successful publishing, the data is available in the Power BI service with the relationships that were defined in Databricks.



Updating the schema

After a first review by the business, a requirement is added to analyse sales by item colour. The schema is updated in Databricks by adding a ‘color’ dimension and a foreign key in the sales table, as sketched below.
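Continuing the illustrative example from above (the object names remain assumptions), the change could look like this:

```python
# Sketch: add the colour dimension and link it to the fact table.
# Names are illustrative; `spark` is the notebook's SparkSession.

# New dimension: the available colours of the apparel items.
spark.sql("""
  CREATE TABLE IF NOT EXISTS boutique.shop.dim_color (
    color_id   BIGINT NOT NULL,
    color_name STRING COMMENT 'Colour of the item, e.g. black, navy, red',
    CONSTRAINT pk_color PRIMARY KEY (color_id)
  ) COMMENT 'The available colours of the apparel items'
""")

# Add the foreign key column to the fact table and declare the relationship.
spark.sql("""
  ALTER TABLE boutique.shop.fact_sales
  ADD COLUMNS (color_id BIGINT COMMENT 'Colour of the purchased item')
""")
spark.sql("""
  ALTER TABLE boutique.shop.fact_sales
  ADD CONSTRAINT fk_sales_color FOREIGN KEY (color_id)
  REFERENCES boutique.shop.dim_color (color_id)
""")
```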



After the update is approved, the schema is published with the option ‘Use an existing data set’ in the ‘Connect to partner’ dialog, selecting the semantic model deployed earlier.

The new dimension is now available in the semantic model in Power BI, and its description is also published. “Publish to Power BI” adds new tables and columns to the semantic model. Deletes, however, are not synchronized to Power BI: when a table is deleted from the schema, it remains in the semantic model, and report users will get an error message when using DirectQuery mode.




The semantic model is now ready to be used in reports and visualizations for further analysis. The analyst can open the model in the service and create measures and additional calculated columns. However, as the semantic model is created via the XMLA endpoint, it cannot be downloaded as a .pbix file.



Conclusion

The general availability of the "Publish to Power BI" feature in Azure Databricks further integrates data engineering and business intelligence workflows.


By leveraging Unity Catalog, users can create semantic models that synchronize with schemas, preserve relationships, and maintain documentation, eliminating the need for manual context switching between Databricks and Power BI.


With options to publish schemas as new datasets or update existing models, and the ability to choose between Import and DirectQuery modes, this integration supports flexibility and scalability. Furthermore, centralized governance through Databricks and secure user access via Microsoft Entra ID with SSO ensure compliance and simplify security management.


This functionality enables teams to efficiently bring data from Databricks to Power BI, accelerating the development of rich, dynamic reports and visualizations in Power BI.

