Release Notes

We regularly release new versions of DataChat to bring you new features, fix issues, and more.

September 5, 2024

New Features

You can now:

  • Install DataChat in your cluster using Helm.
  • Share sessions with others directly from the session or from the homepage.
  • Resize the Data Assistant within the Data tab.
  • Ask the Data Assistant to modify column names based on conditions, such as "make column names more meaningful" or "capitalize all column names in all datasets."
  • Create temporal columns using the Data Assistant.
  • Ask the Data Assistant to describe a column.
  • Use the following keyboard shortcuts within the Data Assistant:
    • Enter: Send the message
    • Shift + Enter: Insert a new line without sending the message
    • Ctrl/Cmd + Enter: Alternative way to send the message
    • Ctrl/Cmd + A: Select all
    • Ctrl/Cmd + C: Copy selected text
    • Ctrl/Cmd + V: Paste text from the clipboard
    • Ctrl/Cmd + Z: Undo the last action
    • Ctrl/Cmd + Shift + Z: Redo the last action
    • Ctrl/Cmd + Up/Down Arrow: Navigate through previous/next messages
    • Ctrl/Cmd + Left/Right Arrow: Move the cursor to the start/end of the current line
    • Alt/Option + Left/Right Arrow: Move the cursor one word to the left or right
    • Up/Down Arrow: Move up or down one line in the current text box
    • Esc: Blur (remove focus from) the input

This release also includes the following improvements:

  • Large database connections now use approximate counts for faster descriptive statistics.
  • Improved handling of datetime columns in the Data Assistant.
  • Visualize now samples large datasets.
  • The Data Assistant now respects dataset and column name length/character restrictions.
  • Improved percentage aggregation calculations in the Data Assistant.
  • X-axis type for histograms is now set to "Category" for consistency.
  • Columns can no longer be created with duplicate names.

Bug Fixes

  • Tables and charts generated by the Data Assistant sometimes appeared off-center.
  • Horizontal lines on certain charts wouldn't render.
  • Filters on line charts wouldn't apply correctly.
  • Visualizing a column occasionally failed.
  • When concatenating, the Data Assistant used explicit delimiter values instead of the intended character.
  • Violin charts for large datasets wouldn't render.
  • Show Me Something Interesting occasionally failed to show recommendations.
  • The "Equals" predicate was case-sensitive, causing record fetch failures when matching exact LLM values.
  • Minimum and maximum aggregates weren't supported for datetime columns.
  • Drop rows matching... wouldn't work on datasets from a BigQuery read-only connection.
  • Dates with uppercase letters were processed incorrectly.
  • Heatmap charts with conditional logic wouldn't function properly.
  • The Train a Model form wouldn't fetch column types, disabling Advanced Options.
  • Refreshing datasets in the Database Browser wouldn't update the corresponding homepage objects.
  • Show Me Something Interesting wouldn't work on Snowflake datasets.
  • Keeping/dropping rows where a column value was 0 wouldn't work.
  • Data Assistant-generated datasets had an extra empty column.
  • Replaying workflows caused flickering.
  • Hours/minutes/seconds weren't displayed when creating new datetime columns.
  • Descriptive Statistics sometimes showed the wrong numeric column type.
  • Loading a BigQuery dataset with a modified column name failed.
  • Custom queries for BigQuery connections wouldn't run.
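
To illustrate the "Equals" predicate fix listed above, here is a minimal Python sketch of a case-insensitive equality filter. It is not DataChat's implementation: the function name `keep_rows_equal` and the sample rows are hypothetical, and the sketch assumes the fix made the comparison ignore letter case.

```python
def keep_rows_equal(rows, column, target):
    """Keep rows whose `column` equals `target`, ignoring letter case.

    Hypothetical helper for illustration only. Non-string values
    (including None) never match.
    """
    wanted = target.casefold()
    return [
        row
        for row in rows
        if isinstance(row.get(column), str) and row[column].casefold() == wanted
    ]


# Made-up sample data.
rows = [
    {"city": "Madison"},
    {"city": "MADISON"},
    {"city": "Chicago"},
    {"city": None},
]
matches = keep_rows_equal(rows, "city", "madison")
print(matches)
```

A case-sensitive comparison would return no rows for "madison" here, which is the kind of record fetch failure the fix describes.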

July 31, 2024

New Features

You can now:

  • Link multiple sign-in authentication options together, such as Google SSO and one-time passwords.
  • Add custom line coordinates to charts, such as a specified KPI value.
  • Create datetime columns by extracting values from integer and string columns.
  • Rename multiple columns at once, including the option to use the Data Assistant.
  • Format chart axis ticks to include prefixes and suffixes, such as percentage (%) and currency ($) signs.
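
As a rough illustration of what prefix and suffix tick formatting produces, here is a small Python sketch. The function `format_tick` and its parameters are hypothetical and only mimic the idea; they are not part of DataChat.

```python
def format_tick(value, prefix="", suffix="", decimals=0):
    """Render a numeric tick label with an optional prefix and suffix.

    Hypothetical helper for illustration only. Uses a thousands
    separator and a fixed number of decimal places.
    """
    return f"{prefix}{value:,.{decimals}f}{suffix}"


print(format_tick(1200, prefix="$"))
print(format_tick(12.5, suffix="%", decimals=1))
```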

This release also includes the following improvements:

  • Improved error and inaccuracy messaging to explain at which step a specific error occurred.
  • Improved skill handling when multiple instances of the same session are open.
  • Support for JSON columns in BigQuery connections.
  • Improved relative datetime detection, such as today and yesterday, when training a machine learning model.
  • Added the capability to submit detailed feedback through a form available in the Data Assistant responses.
  • Enhanced question context in the Data Assistant to better reference previous responses.

Bug Fixes

  • Some users were unable to respond to skill prompts, such as confirmations to replace an existing dataset.
  • The Data Assistant was unable to plot correlation matrices.
  • Customizing a chart with a horizontal line wouldn't work.
  • Users were unable to connect to a Databricks database using the hive_metastore catalog.
  • Cleaning a float column while connected to a BigQuery database wouldn't work.
  • Keeping and dropping rows based on a time unit, such as yesterday or tomorrow, wouldn't work.
  • Plotting a correlation matrix while connected to an MS SQL database wouldn't work.
  • Show Descriptive Statistics would not work while connected to a SQL Server database.
  • Customizing a chart with a horizontal or vertical line, such as to indicate a KPI value, wouldn't work.
  • Intersect would not work when connected to a BigQuery database.
  • Some users were unable to ask questions in the Data Assistant when working in a collaborative session.
  • Training a time series on a numeric column would sometimes not work.
  • Visualizations from training time series would sometimes not render.
  • Dropping all columns would return a nondescript error message.
  • Conditionally cleaning all columns by replacing nulls with a specified value wouldn't work.
  • Insight recommendations would not be shown in collaborative sessions.
  • Subplots on scatter charts wouldn't render.
  • Viewing edited charts in a larger window wouldn't display the charts.
  • Using math-based temporal expressions, such as Day + 5Days, while keeping or dropping rows wouldn't work.
  • Some column names could not be read when creating a new column.
  • Some Data Assistant responses would not generate the resulting dataset or chart.
  • The Data Assistant could not fuzzy match (approximate string match) if null values were present in the target columns.
  • Keeping or dropping rows from JSON columns wouldn't work.
  • Scrolling in certain Chart Builder dropdown menus caused unintended background scrolling in the application.
  • The Show Me Something Interesting feature occasionally failed to display insight recommendations.
  • Skills were not executing when responding to Yes or No prompts.
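
The fuzzy-matching fix above concerns null values in the target column. The sketch below shows one way approximate string matching can tolerate nulls, using Python's standard `difflib` module. It is only an illustration under stated assumptions: the helper `fuzzy_matches`, the cutoff of 0.8, and the sample values are hypothetical and do not describe how the Data Assistant works internally.

```python
from difflib import get_close_matches


def fuzzy_matches(query, values, cutoff=0.8):
    """Return up to three values that approximately match `query`.

    Hypothetical helper for illustration only. Null (None) and other
    non-string entries are skipped, because difflib cannot compare them.
    """
    candidates = [v for v in values if isinstance(v, str)]
    return get_close_matches(query, candidates, n=3, cutoff=cutoff)


# Made-up column values, including a null entry.
column_values = ["John Smith", None, "Jane Doe"]
result = fuzzy_matches("Jon Smith", column_values)
print(result)
```

Filtering nulls before matching is one simple design choice; another would be to treat nulls as empty strings, which would keep row positions aligned at the cost of extra comparisons.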

July 1, 2024

New Features

You can now:

  • Train clustering models directly within BigQuery.
  • Log in to DataChat using a one-time passcode instead of a password.

This release also includes the following improvements:

  • The Data Assistant can now extract a wider range of datetime formats.
  • The Data Assistant can now sort and refine data by specific keywords, helping you find information more efficiently.
  • The model evaluation metric has been updated from Silhouette Score to Davies-Bouldin Index.
  • The Data Assistant now has better recognition and matching of data entries that are similar but not identical, making it easier to handle variations in data and find relevant information.
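
For readers unfamiliar with the Davies-Bouldin Index mentioned above, the following Python sketch computes it from scratch using the standard definition: for each cluster, take the worst ratio of combined within-cluster scatter to centroid distance, then average over clusters. Lower values indicate tighter, better separated clusters, whereas a higher Silhouette Score is better. The function name and sample points are hypothetical, and this is not a description of how DataChat computes the metric.

```python
from collections import defaultdict
from math import dist


def davies_bouldin(points, labels):
    """Compute the Davies-Bouldin Index for labeled points (lower is better).

    Illustrative sketch only. Assumes at least two clusters and that no
    two clusters share the same centroid.
    """
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)
    if len(clusters) < 2:
        raise ValueError("need at least two clusters")

    centroids, scatter = {}, {}
    for label, members in clusters.items():
        dims = len(members[0])
        centroid = tuple(
            sum(m[d] for m in members) / len(members) for d in range(dims)
        )
        centroids[label] = centroid
        # Scatter: mean distance from each member to its cluster centroid.
        scatter[label] = sum(dist(m, centroid) for m in members) / len(members)

    total = 0.0
    for i in clusters:
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / dist(centroids[i], centroids[j])
            for j in clusters
            if j != i
        )
    return total / len(clusters)


# Made-up 2-D points forming two well separated groups.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
tight = davies_bouldin(points, [0, 0, 1, 1])  # groups match the geometry
loose = davies_bouldin(points, [0, 1, 0, 1])  # groups cut across it
print(tight, loose)
```

In this example the labeling that follows the geometry scores lower than the one that cuts across it, which is the behavior to expect when comparing candidate clusterings.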

Deprecated Functionality

  • The Publication Library feature has been deprecated and removed.

Bug Fixes

This release fixed the following issues:

  • Occasional long query request times in the Data Assistant.
  • Queries involving "matching digit values" or "match semantic substring" would fail.
  • Sticky scrolling in the Database Browser.
  • Creating new columns using date values wouldn't work.
  • Some users were prevented from sharing charts from the Data Assistant.
  • Mode, median, count, and percentile aggregates wouldn't work when connected to SQL Server.
  • Joining tables wouldn't work after connecting to a BigQuery database.
  • Some customizations in the Chart Builder wouldn't work.
  • Compute would fail if the dataset's column headers contained both alpha and numeric characters.
  • Clicks within some dropdown menus wouldn't register.
  • Some bar charts did not start on the first X-axis data point.
  • Loading data from Snowflake connections with all lowercase column names would fail.
  • Creating columns by concatenation in the Data Assistant wouldn't work.
  • Using some skills with a MySQL connection, such as Sort and Describe, would fail.
  • Some users were unable to ask questions on pivot tables created by the Data Assistant.
  • Generating charts using Visualize that required binning wouldn't work in the Data Assistant.
  • Users were unable to load Excel files exported from SAP.
  • Creating columns from a time delta would fail while connected to a SQL Server database.
  • Root mean squared model scores were displayed as negative values.
  • Filtering a chart with an expression wouldn't work.
  • Binning a column on a SQL Server database increased the total number of rows.
  • Creating new window columns would fail when connected to a SQL Server database.
  • Keeping rows based on date would cause some display issues.
  • Users couldn't open a public chart link shared by another user while logged into DataChat.
  • Dropping or keeping rows based on a temporal expression wouldn't work.

June 3, 2024

New Features

You can now:

  • Plot histograms using the Data Assistant.
  • Round values to the nearest whole number using the Data Assistant.
  • Take action on homepage objects in bulk.
  • Use Show me something interesting within the Data Assistant.
  • Train models directly in BigQuery.
  • Paste in the Workflow Editor.
  • Turn off public links if you are using a hosted version of DataChat.
  • Connect to MySQL databases using a connection string.

This release also includes the following improvements:

  • Chart tooltips are now scrollable.
  • You can now see a summary of both the row count and column count in the header of a dataset.
  • The Data Assistant now suggests questions when a dataset is loaded.
  • Added pagination to the Database Browser.
  • Column counts are now displayed.

Deprecated Functionality

  • The Dataset Builder has been deprecated.
  • The Number of Bins to Use for Target Column field in the Advanced Options section of the Train Model form has been removed.
  • Minimizing charts has been deprecated.

Bug Fixes

  • Datasets would "load" indefinitely if they had only one column and that column was dropped.
  • Adding a column using a time delta wouldn't work on BigQuery.
  • Show me something interesting would occasionally use a non-numeric column as the KPI variable for pivot tables.
  • Adding a chart or table to an Insights Board using the Publication Library would fail.

May 6, 2024

New Features

You can now:

  • Remove Duplicates using the Data Assistant.
  • Sort charts in the Chart tab.
  • Display table explanations in the Data Assistant.
  • Add an aggregate slice size value to donut charts.
  • Apply the Count of Records aggregation to all chart types that accept column aggregations.
  • Use the None aggregation on Donut charts.
  • View in-app tutorials for common actions.
  • Cancel a skill before it finishes running.
  • Select and deselect all available datasets when loading in a session.

This release also includes the following improvements:

  • Currently active datasets are automatically moved to the top of the dataset list in the Data tab.
  • BigQuery Databases have been renamed to Datasets.
  • Improved response time when cancelling a long-running skill.
  • New users now open DataChat with only the demo datasets.

Deprecated Functionality

  • The Data Assistant's histogram plotting feature is temporarily disabled.

Bug Fixes

  • Creating a new column with Text on a Presto database would fail.
  • Users could submit charts with incomplete fields.
  • Visualize would allow users to select columns from inactive datasets, causing sessions to crash.
  • Clicking the link to view a candidate model's pipeline report would fail.
  • Binning a column would sometimes create a non-integer binned column.

April 6, 2024

New Features

You can now:

  • Remove Duplicates from the Wrangle section of the skill menu.
  • Add a column using a time delta.
  • Add a column using a temporal expression.
  • Preview and edit a publication's workflow.
  • Hover over each tab in a session to preview the tab's contents.

This release also includes the following improvements:

  • Enhanced UI in the Data Assistant to handle long dataset names.
  • Data Assistant now handles the Split skill.
  • Data Assistant now handles the Clean skill.
  • Larger default column widths.
  • Cancelling a Data Assistant question is now available once the query has been running for more than 1 second.
  • Saved snapshots now automatically save the specified datasets to a folder.
  • Column types are now denoted next to the column name.

Bug Fixes

  • Extracting datetime columns with Extract would fail.
  • Multi-select charts would occasionally fail when users selected the first two options from the Required dropdown menu.
  • Reuploading a deleted file would sometimes prompt the user to replace the existing file despite the file already being deleted.
  • Navigating to an Insights Board directly after publishing would fail.
  • Replaying a workflow would sometimes fail to generate plots.
  • Cleaning a column by replacing NULL values with an aggregation would sometimes cause an error.
  • Some Databricks connections would fail to load data.
  • The Data Assistant would occasionally report an incorrect chart type.
  • The Keep Rows skill would use multiple datasets instead of only the one in focus.
  • Training a linear regression model would occasionally fail.
  • Creating a new column with Text on a Presto database would fail.
  • Users could submit charts with incomplete fields.
  • Visualize would allow users to select columns from inactive datasets, causing sessions to crash.
  • Clicking the link to view a candidate model's pipeline report would fail.
  • Binning a column would sometimes create a non-integer binned column.

Other Notes

  • Reduced Data Assistant hallucinations.
  • Chart Builder default functionality changes:
    • Changes the default aggregation for bubble and scatter charts to Count or Average depending on the column type.
    • When changing to Violin, Boxplot, or Donut, the Data Sample is set to 5000.

March 8, 2024

DataChat has introduced a comprehensive overhaul of the session UI, aimed at seamlessly integrating the previous "DataGrid" and "Ava Conversations" functionalities into a unified interface, enhancing the data analysis workflow.

New Features

  • The session interface has undergone significant refinement to consolidate previous conversations and sessions, providing users with a unified display. Notable UI enhancements include the addition of:
    • Data tab
    • Chart tab
    • Data Assistant tab
    • New Skill menu
  • "Recipes" have been renamed to "Workflows".
  • The "Ava" and "Ask" functionalities have been moved into the "Data Assistant".
  • You can now quickly generate insights on your data with Show me something interesting.
  • Datasets can now be hidden from the session view for better organization.
  • You can now add charts and datasets generated by the Data Assistant directly to your Data tab or Chart tab.
  • Charts now offer visibility into the underlying dataset used for their creation.
  • When saving a session as a workflow, you can now selectively include either the entire session or exclusively the generated and modified datasets.

This release also includes the following improvements:

  • Optimized display for tablet/mobile screens.
  • Implementation of more informative error messaging.

Deprecated Functionality

  • The Display skill.
  • DataChat tours.

Bug Fixes

  • Fixed an issue where Keep Rows and Drop Rows based on a condition for date and timestamp columns would fail.