Skip to main content

Databases

The Database Browser

Setting up a connection to your database is the best option for larger datasets or datasets that change over time. To open the Database Browser, from the homepage, click New > Dataset > Connect.

database browser

The Database Browser lets you create a database connection and explore tables that you can then load into a session.

Connect to a Database

DataChat supports the following database systems:

  • BigQuery
  • Databricks
  • MicrosoftSQLServer
  • MySQL
  • PostgreSQL
  • Presto
  • Snowflake
  • Redshift
  • ConnectionString
note

If you're working from DataChat's cloud platform and are connecting to a read-only database that's behind a firewall, refer to Identity-Aware Proxy Setup for more information on allowing access to read-only databases.

Database Types

note

If you're using a service account to connect to BigQuery, follow these steps before connecting in DataChat:

Create a service account

  1. In the Google Cloud Console, go to IAM & Admin > Service Accounts.
  2. Click Create Service Account.
  3. Enter a name (e.g., datachat-access) and click Create.

Assign BigQuery Permissions

In the Grant access section, add the following roles to the service account:

  • BigQuery Data Viewer. Grants read access to a specific tables or views.
  • BigQuery Job User. Grants permission to run jobs at the dataset level.

If you plan to use BigQuery ML, ensure the account has the following permissions:

  • bigquery.jobs.create
  • bigquery.models.create
  • bigquery.models.getData
  • bigquery.models.updateData
  • bigquery.models.updateMetadata
  • bigquery.models.delete

For more information on BigQuery ML permissions, see Google Cloud's documentation.

Download the Key File

  1. In the Service Accounts list, click the name of your new account.
  2. Go to the Keys tab, then click Add Key > Create new key.
  3. Select JSON and click Create. The file will download automatically. Keep this file secure as it won’t be downloadable again.
note

To access BigQuery via OAuth, refer to Enable Database OAuth.

To add your BigQuery database:

Big Query database

  1. Give the database a display name. This is how you'll reference the database across DataChat.
  2. In the Project field, enter the name of your BigQuery project.
  3. In the Dataset Name field, enter the name of your BigQuery dataset.
  4. Select whether this connection will use a service account or connect to BigQuery via OAuth.
  5. If this connection uses a service account, upload your key file.
  6. Optionally, set the database to be read-only.
  7. Optionally, enable using BigQuery's storage API. Note that selecting this option will improve performance when interacting with your BigQuery tables, but may result in increased BigQuery costs.
  8. Test your database, then click Save.

Load from a Database

After connecting to a database, they appear on the left side of the Database Browser. From here, you can:

  1. Switch between databases by clicking the connection name.
  2. Optionally, add filters to search for a specific table.
  3. Optionally, refresh the database schema by clicking the Refresh button.
  4. Select the tables you'd like to open. Optionally, you can expand table to view column information, or preview a dataset by clicking on it.
  5. Click Import to create dataset objects or click Import and Load to automatically load the datasets into a session.

database browser

note

Sizeable database schemas can cause the Database Browser to time out. If you encounter this issue, reach out to your database administrator to access a more manageable version of the database. Otherwise, you can load from the database within a session.

Edit a Database

To edit a database:

  1. Double-click the database to edit, or select a database and click the Edit button from the toolbar or right-click menu. The Database Browser opens.
  2. Make your edits and click Save.
  3. To edit additional database connections, click the gear icon next connection to edit from the list of database connections.

edit a database

Save a Copy of a Database

To save a copy of an existing database, navigate to the homepage:

  1. Click Databases in the sidebar. You can also find your databases in My Work.
  2. Select the database to save a copy of.
  3. Click Edit Database. The Database Browser opens. You can also right-click on a selected database and click Save As and enter a new name for the copy.
  4. Click Save As and enter a new name for the copy.

copy a database

Within a Session

To connect to a database while in a session:

  1. In the Data tab, select Add Dataset > Connect in the skill menu. You can also click New Dataset > Connect if data has not yet been loaded.
  2. This opens the Database Browser, where you can connect to your database

To load datasets from an existing database connection, refer to Load Datasets into a Session

Additional Database Connection Information

Computation Execution

Computations are executed within your specific database environment. This approach utilizes the database's processing power, ensuring that operations are performed close to the data source, minimizing data transfer and improving performance.

Read-Only vs Writeable Connections

Generally, most database connections in DataChat will be configured as read-only. This is designed to protect your data integrity. If a DataChat skill requires moving data to a new location, such as joining two datasets from different database platforms, DataChat copies the data to its own internal compute engine rather than writing to your database connection.

For writeable connections, DataChat optimizes the performance of skills such as Join and Bin by writing the data directly to your external database as a table or view. This is typically isolated in a dedicated schema or namespace to keep the results separate from other data.

note

Data written to external databases is cleaned up when your session is deleted unless you've saved an object to the homepage or published an object to an Insights Board.