Databricks

The G2 + Databricks integration enables Databricks customers to access G2 Buyer Intent, Reviews, and Market Intelligence data through Databricks Delta Sharing.

Basics of the G2 + Databricks integration

About Databricks Delta Sharing

G2 shares datasets with Databricks customers using Databricks-to-Databricks Delta Sharing.

Before data can be shared, G2 must configure access for your organization using your Databricks sharing identifier.

Once access is configured, you can mount shared datasets to a catalog and query them from your Databricks workspace.

Understanding how G2 data is delivered

G2 publishes datasets to Databricks and shares them using Delta Sharing.

The sharing workflow consists of the following stages:

Stage                  Description
Share configuration    G2 configures a share and grants access to your organization.
Access configuration   Your organization provides a sharing identifier to G2.
Share access           Shared datasets become available through Delta Sharing.
Catalog mounting       Shared datasets are mounted to a catalog in your Databricks workspace.
Querying               Shared datasets can be queried from Databricks.

Shared datasets are read-only and accessed through Databricks Delta Sharing.

G2 is not responsible for Databricks compute costs associated with querying shared datasets.

Available datasets

G2 currently shares the following datasets through Databricks:

  • Buyer Intent
  • Reviews
  • Market Intelligence

Refer to the following documentation for dataset schemas and field definitions:

Databricks datasets are refreshed daily after the Snowflake refresh completes at 10:00 AM UTC.

Historical data is included when access is first provisioned.

Providing your sharing identifier

Before G2 can configure access to shared datasets, provide your Databricks sharing identifier to your G2 representative.

The sharing identifier uniquely identifies the Unity Catalog metastore attached to the Databricks workspace where users will access shared data.

The sharing identifier uses the following format:

<cloud>:<region>:<uuid>

Example:

aws:eu-west-1:b0c978c8-3e68-4cdf-94af-d05c120ed1ef

You can retrieve the sharing identifier in one of the following ways:

Using SQL

Run the following command in a Databricks notebook:

SELECT CURRENT_METASTORE()
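
As a quick sanity check before sending the value to G2, the identifier can be split into its three parts. This is a minimal sketch, assuming a Unity Catalog-enabled workspace and a runtime that provides the built-in split_part function; the column aliases are illustrative only.

```sql
-- Return the sharing identifier together with its <cloud>:<region>:<uuid> parts.
-- The aliases (sharing_identifier, cloud, region, metastore_uuid) are illustrative.
SELECT
  CURRENT_METASTORE()                     AS sharing_identifier,
  split_part(CURRENT_METASTORE(), ':', 1) AS cloud,
  split_part(CURRENT_METASTORE(), ':', 2) AS region,
  split_part(CURRENT_METASTORE(), ':', 3) AS metastore_uuid;
```

If any of the three parts comes back empty, the value is probably not a complete sharing identifier.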

Using the Databricks UI

  1. Open Catalog.
  2. Open Delta Sharing.
  3. On the Shared with me tab, select your Databricks sharing organization.
  4. Copy the sharing identifier.

After receiving your sharing identifier, G2 configures access to the appropriate shared datasets.

Accessing shared datasets

After G2 configures access, shared datasets become available in Databricks Delta Sharing.

To view and mount shared datasets, you must be a metastore admin or hold the USE PROVIDER and CREATE CATALOG privileges on the Unity Catalog metastore.

For more information, refer to the Databricks documentation: Read data shared using Databricks-to-Databricks Delta Sharing (for recipients).

To mount a shared dataset:

  1. Select Catalog to open Catalog Explorer.
  2. Select the gear icon and choose Delta Sharing. Alternatively, select Share > Delta Sharing.
  3. On the Shared with me tab, select the provider.
  4. Find the desired share and select Mount to catalog.
  5. Select Create a new catalog or Mount to existing catalog.
  6. Enter a catalog name or select an existing catalog.
  7. Select Create or Mount.
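
The same mount can also be sketched in SQL instead of the UI. The following assumes Databricks SQL with Unity Catalog; the provider name g2, the share name g2_share, and the catalog name g2_shared are hypothetical placeholders, so substitute the names shown on your Shared with me tab.

```sql
-- List the providers that have shared data with this metastore.
SHOW PROVIDERS;

-- List the shares offered by one provider (hypothetical provider name: g2).
SHOW SHARES IN PROVIDER g2;

-- Mount a share as a new read-only catalog (all three names are hypothetical).
CREATE CATALOG IF NOT EXISTS g2_shared
USING SHARE g2.g2_share;
```

IF NOT EXISTS makes the statement safe to rerun when the catalog has already been created.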

After the share is mounted, verify that the shared tables are available in the selected catalog.
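
One way to carry out this check from a notebook or the SQL editor is sketched below. The catalog name g2_shared, the schema name buyer_intent, and the table name intent_signals are hypothetical; the first two statements show the actual schema and table names in your share.

```sql
-- List the schemas exposed by the mounted catalog (hypothetical catalog name).
SHOW SCHEMAS IN g2_shared;

-- List the tables in one schema (hypothetical schema name).
SHOW TABLES IN g2_shared.buyer_intent;

-- Preview a few rows from one table (hypothetical table name).
-- The shared data is read-only, so only SELECT-style statements apply.
SELECT *
FROM g2_shared.buyer_intent.intent_signals
LIMIT 10;
```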