The G2 + Databricks integration enables Databricks customers to access G2 Buyer Intent, Reviews, and Market Intelligence data through Databricks Delta Sharing.
Basics of the G2 + Databricks integration
About Databricks Delta Sharing
G2 shares datasets with Databricks customers using Databricks-to-Databricks Delta Sharing.
Before data can be shared, G2 must configure access for your organization using your Databricks sharing identifier.
Once access is configured, you can mount shared datasets to a catalog and query them from your Databricks workspace.
Understanding how G2 data is delivered
G2 publishes datasets to Databricks and shares them using Delta Sharing.
The sharing workflow consists of the following stages:
| Stage | Description |
|---|---|
| Access configuration | Your organization provides its Databricks sharing identifier to G2 |
| Share configuration | G2 configures a share and grants access to your organization |
| Share access | Shared datasets become available through Delta Sharing |
| Catalog mounting | Shared datasets are mounted to a catalog in your Databricks workspace |
| Querying | Shared datasets can be queried from Databricks |
Shared datasets are read-only and accessed through Databricks Delta Sharing.
G2 is not responsible for Databricks compute costs associated with querying shared datasets.
Available datasets
G2 currently shares the following datasets through Databricks:
- Buyer Intent
- Reviews
- Market Intelligence
Refer to the following documentation for dataset schemas and field definitions:
Databricks datasets are refreshed daily after the Snowflake refresh completes at 10:00 AM UTC.
Historical data is included when access is first provisioned.
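If you schedule downstream jobs around the daily refresh, it can help to compute the most recent 10:00 AM UTC Snowflake refresh time. The following is a minimal sketch; the function name is hypothetical, and because the Databricks refresh only starts after this time and its completion time is not stated here, any extra waiting period you apply is your own assumption.

```python
from datetime import datetime, time, timedelta, timezone

# Snowflake refresh completion time taken from the schedule described above.
SNOWFLAKE_REFRESH_UTC = time(10, 0)


def latest_snowflake_refresh(now: datetime) -> datetime:
    """Return the most recent 10:00 UTC at or before `now`.

    `now` should be timezone-aware. The Databricks datasets are refreshed
    some time after the returned moment, not exactly at it.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = datetime.combine(
        now_utc.date(), SNOWFLAKE_REFRESH_UTC, tzinfo=timezone.utc
    )
    if candidate > now_utc:
        # Today's 10:00 UTC has not happened yet, so use yesterday's.
        candidate -= timedelta(days=1)
    return candidate
```

For example, calling `latest_snowflake_refresh(datetime.now(timezone.utc))` returns the reference point for the current daily cycle.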
Providing your sharing identifier
Before G2 can configure access to shared datasets, provide your Databricks sharing identifier to your G2 representative.
The sharing identifier uniquely identifies the Unity Catalog metastore attached to the Databricks workspace where users will access shared data.
The sharing identifier uses the following format:
<cloud>:<region>:<uuid>
Example:
aws:eu-west-1:b0c978c8-3e68-4cdf-94af-d05c120ed1ef
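Before sending the identifier to G2, you may want to check that it matches this format. The following is a minimal sketch; the function name is hypothetical, and the allowed characters for the cloud and region parts are assumptions based on the example above, not a Databricks specification.

```python
import re

# Assumed pattern for <cloud>:<region>:<uuid>. The character sets for the
# cloud and region parts are inferred from the example, not from a spec.
SHARING_ID_PATTERN = re.compile(
    r"(?P<cloud>[a-z]+):"
    r"(?P<region>[a-z0-9-]+):"
    r"(?P<uuid>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def parse_sharing_identifier(value: str) -> dict:
    """Split a sharing identifier into its cloud, region, and uuid parts.

    Raises ValueError if the value does not match the assumed format.
    """
    match = SHARING_ID_PATTERN.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"Not a valid sharing identifier: {value!r}")
    return match.groupdict()
```

Passing the example identifier above returns a dictionary with the keys `cloud`, `region`, and `uuid`; a value with a missing part or a malformed UUID raises `ValueError`.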
You can retrieve the sharing identifier in one of the following ways:
Using SQL
Run the following command in a Databricks notebook:
SELECT CURRENT_METASTORE()
Using the Databricks UI
- Open Catalog.
- Open Delta Sharing.
- On the Shared with me tab, select your Databricks sharing organization.
- Copy the sharing identifier.
After receiving your sharing identifier, G2 configures access to the appropriate shared datasets.
Accessing shared datasets
After G2 configures access, shared datasets become available in Databricks Delta Sharing.
To view and mount shared datasets, your Databricks account must have the required Delta Sharing permissions.
For more information, refer to the Databricks documentation: Read data shared using Databricks-to-Databricks Delta Sharing (for recipients).
To mount a shared dataset:
- In your Databricks workspace, select Catalog to open Catalog Explorer.
- Select the gear icon and choose Delta Sharing. Alternatively, select Share > Delta Sharing.
- On the Shared with me tab, select the provider.
- Find the desired share and select Mount to catalog.
- Select Create a new catalog or Mount to existing catalog.
- Enter a catalog name or select an existing catalog.
- Select Create or Mount.
After the share is mounted, verify that the shared tables are available in the selected catalog.