Release Notes
We regularly release new versions of DataChat to bring you new features, fix issues, and more.
November 27, 2024
New Features
You can now:
- Ask multiple questions simultaneously in collaborative sessions.
- Train time series models using BigQuery ML.
- Copy Data Assistant workflows to your clipboard.
- Rename multiple datasets at once with the Data Assistant.
- Connect to Redshift databases through the Database Browser.
- Connect and authorize user views and permissions via OAuth with Google BigQuery.
This release also includes the following improvements:
- Training a time series model now generates chart visualizations.
- The Data Assistant now ensures smoother response generation if a user’s initial question encounters an error.
- If the Data Assistant's first attempt at returning a response fails, it now retries with another LLM.
- Data Assistant visualizations are now only generated when requested, or when the user's question is best answered with one.
- Data Assistant visualizations do not include unnecessary information, such as additional columns, unless requested.
- Improved integration with Snowflake databases.
- Enhanced support for large time series datasets for both BigQuery ML and in-house models with improved sampling for visualizations.
- The Data Assistant Dictionary has been removed from beta, and is now available while in a session.
Deprecated Functionality
- Staging databases for Google BigQuery.
Bug Fixes
- Extracting the YYYYMM or YYYYQ from a Snowflake connection would result in a server error.
- Sampling a dataset multiple times in a row did not always sample from the previous version.
- Some dataset columns would not be available when plotting bubble charts.
- Visualizing a column with optimization would not work when the column required binning.
- Data Assistant responses would include quotation marks when they were present in the user's original request.
- Editing a database connection would sometimes cause the Database Browser buttons, such as Save and Save As, to be unclickable.
- Binning a column based on width and a set interval size would not result in equal-width bins.
- Some charts that included multiple tabs would not render properly.
- Using the Clean skill on some Snowflake connections wouldn't work.
- The Data Assistant was unable to partially match some integer values, for example `Show me all sales in zip codes beginning with 95`.
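The partial-match fix in the last item can be pictured as a prefix comparison on the text form of an integer column. The sketch below is illustrative only: the `sales` records, the `zip_code` field, and the `starts_with_digits` helper are hypothetical and do not reflect how the Data Assistant is implemented.

```python
def starts_with_digits(value: int, prefix: str) -> bool:
    """Return True if the decimal text of `value` begins with `prefix`."""
    return str(value).startswith(prefix)


# Hypothetical sales records; zip codes are stored as integers.
sales = [
    {"zip_code": 95014, "amount": 120.0},
    {"zip_code": 53703, "amount": 80.5},
    {"zip_code": 95616, "amount": 42.0},
]

# Keep only the rows whose zip code begins with "95".
matches = [row for row in sales if starts_with_digits(row["zip_code"], "95")]
```

Note that an integer column cannot preserve leading zeros, so a prefix such as "01" would need the zip codes stored as text.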
October 28, 2024
New Features
- You can now sort pivot table columns and rows in ascending or descending order.
- The Data Assistant now uses AI-powered agents to improve user interactions by automatically tracking datasets, incorporating new ones, and delivering smarter, context-aware responses.
This release also includes the following improvements:
- Improved aggregations of non-numeric columns.
- When starting a conversation with the Data Assistant, it uses only datasets currently in focus, and new datasets are tracked and added automatically.
Deprecated Functionality
- Removed in-product tutorials of common actions.
Bug Fixes
- The Data Assistant would occasionally flicker when entering a question.
- The leftmost bar on a bar chart would sometimes overlap the Y-axis.
- Some predicate expressions, such as `is less than` and `is greater than`, would not work on datetime columns.
- Snowflake connections could not be edited.
- The Data Assistant occasionally referenced the DataChat API in responses.
- The Data Assistant would only show the active user avatar in collaborative sessions.
- Some histograms would display incorrect results due to column binning failures.
- The Data Assistant would incorrectly format some datetime column values, such as quarter.
- Users removing themselves from a collaborative session would cause the session to crash.
- Visualizing a binned column would not work.
September 30, 2024
New Features
- The Data Assistant Dictionary lets you add definitions to datasets and columns, improving the accuracy of Data Assistant responses.
- Table explanations are now more detailed for tables containing 10-50 rows.
- You can now use special characters in database connection passwords.
This release also includes the following improvements:
- User feedback forms now automatically include the relevant question context when submitted.
- Search functionality in the Load form has been enhanced for better results.
- When joining tables with identical column names, duplicate columns are now automatically suffixed with "_1," "_2," and so on for clarity.
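As a rough illustration of the suffixing behavior described in the last item, the sketch below renames repeated column names in a joined header. The `suffix_duplicates` helper is hypothetical, and the assumption that the first occurrence keeps its original name is ours; the release note does not specify the exact scheme.

```python
from collections import Counter


def suffix_duplicates(columns: list[str]) -> list[str]:
    """Rename repeated column names by appending _1, _2, ... (assumed scheme)."""
    seen = Counter()
    result = []
    for name in columns:
        count = seen[name]
        # Assumption: the first occurrence keeps its original name.
        result.append(name if count == 0 else f"{name}_{count}")
        seen[name] += 1
    return result


# Column names as they might appear after joining two tables that share "id".
joined = suffix_duplicates(["id", "name", "id", "amount", "id"])
```

This sketch does not handle the case where a generated name such as `id_1` collides with a column that already exists.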
Deprecated Functionality
- The `min`, `max`, and `avg` horizontal line options have been removed from stacked charts to prevent rendering errors and ensure smooth performance.
- The ability to share sessions using the Skill form has been removed.
Bug Fixes
- Dropping rows using `Is Before <date>` or `Is After <date>` would not work on BigQuery datasets.
- The Data Assistant would overlap the Data tab's placeholder when the tab was empty.
- The Data Assistant conversation history would not consistently auto-scroll when responding to user questions.
- `Visualize` would not run on some datasets.
- Some Data Assistant responses incorrectly referenced the DataChat API.
- Training a CatBoost model would occasionally fail due to unsupported column types.
- Visualizing columns from large datasets would sometimes not render a chart after showing a success message.
- Loading multiple datasets with similar names triggered versioning errors due to case sensitivity.
- Keeping or dropping rows with multiple predicates would not work correctly.
- The Data Assistant would not remove duplicates from the latest dataset version when asked.
- Decimal type columns were not always read correctly by the Data Assistant.
September 12, 2024
New Features
You can now:
- Install DataChat in your cluster using Helm.
- Share sessions with others directly from the session or from the homepage.
- Resize the Data Assistant within the Data tab.
- Ask the Data Assistant to modify column names based on conditions, like `make column names more meaningful` or `capitalize all column names in all datasets`.
- Create temporal columns using the Data Assistant.
- Ask the Data Assistant to describe a column.
- Use the following keyboard shortcuts within the Data Assistant:
Shortcut | Description |
---|---|
Enter | Send the message |
Shift + Enter | Insert a new line without sending the message |
Ctrl/Cmd + Enter | Alternative to sending the message |
Ctrl/Cmd + A | Select all |
Ctrl/Cmd + C | Copy selected text |
Ctrl/Cmd + V | Paste text from clipboard |
Ctrl/Cmd + Z | Undo last action |
Ctrl/Cmd + Shift + Z | Redo last action |
Ctrl/Cmd + Up/Down Arrow | Navigate through previous/next messages |
Ctrl/Cmd + Left/Right Arrow | Move cursor to the start/end of the current line |
Alt/Option + Left/Right Arrow | Move cursor one word to the left or right |
Up/Down Arrow | Move up or down one line in the current text box |
Esc | Blur the input |
This release also includes the following improvements:
- Large database connections now use approximate counts for faster descriptive statistics.
- Improved handling of datetime columns in the Data Assistant.
- `Visualize` now samples large datasets.
- The Data Assistant now respects dataset and column name length/character restrictions.
- Improved percentage aggregation calculations in the Data Assistant.
- X-axis type for histograms is now set to "Category" for consistency.
- Columns can no longer be created with duplicate names.
Bug Fixes
- Tables and charts generated by the Data Assistant sometimes appeared off-center.
- Horizontal lines on certain charts wouldn't render.
- Filters on line charts wouldn't apply correctly.
- Visualizing a column occasionally failed.
- When concatenating, the Data Assistant used explicit delimiter values instead of the intended character.
- Violin charts for large datasets wouldn't render.
- Show Me Something Interesting occasionally failed to show recommendations.
- The "Equals" predicate was case-sensitive, causing record fetch failures when matching exact LLM values.
- Minimum and maximum aggregates weren't supported for datetime columns.
- `Drop rows matching...` wouldn't work on datasets from a BigQuery read-only connection.
- Dates with uppercase letters were processed incorrectly.
- Heatmap charts with conditional logic wouldn't function properly.
- The Train a Model form wouldn't fetch column types, disabling Advanced Options.
- Refreshing datasets in the Database Browser wouldn't update the corresponding homepage objects.
- Show Me Something Interesting wouldn't work on Snowflake datasets.
- Keeping/dropping rows where a column value was 0 wouldn't work.
- Data Assistant-generated datasets had an extra empty column.
- Replaying workflows caused flickering.
- Hours/minutes/seconds weren't displayed when creating new datetime columns.
- Descriptive Statistics sometimes showed the wrong numeric column type.
- Loading a BigQuery dataset with a modified column name failed.
- Custom queries for BigQuery connections wouldn't run.
July 31, 2024
New Features
You can now:
- Link multiple sign-in authentication options together, such as Google SSO and one-time passwords.
- Add custom line coordinates to charts, such as a specified KPI value.
- Create datetime columns by extracting values from integer and string columns.
- Rename multiple columns at once, including the option to use the Data Assistant.
- Format chart axis ticks to include prefixes and suffixes, such as percentage (%) and currency ($) signs.
This release also includes the following improvements:
- Improved error and inaccuracy messaging to explain at which step a specific error occurred.
- Improved skill handling while having multiple instances of the same session open.
- Support for JSON columns in BigQuery connections.
- Improved relative datetime detection, such as `today` and `yesterday`, when training a machine learning model.
- Added the capability to submit detailed feedback through a form available in the Data Assistant responses.
- Enhanced question context in the Data Assistant to better reference previous responses.
Bug Fixes
- Some users were unable to respond to skill prompts, such as confirmations to replace an existing dataset.
- The Data Assistant was unable to plot correlation matrices.
- Customizing a chart with a horizontal line wouldn't work.
- Users were unable to connect to a Databricks database using the `hive_metastore` catalog.
- Cleaning a float column while connected to a BigQuery database wouldn't work.
- Keeping and dropping rows based on a time unit, such as `yesterday` or `tomorrow`, wouldn't work.
- Plotting a correlation matrix while connected to a MS SQL database wouldn't work.
- Show Descriptive Statistics would not work while connected to a SQL Server database.
- Customizing a chart with a horizontal or vertical line, such as to indicate a KPI value, wouldn't work.
- `Intersect` would not work when connected to a BigQuery database.
- Some users were unable to ask questions in the Data Assistant when working in a collaborative session.
- Training a time series on a numeric column would sometimes not work.
- Visualizations from training time series would sometimes not render.
- Dropping all columns would return a nondescript error message.
- Conditionally cleaning all columns by replacing nulls with a specified value wouldn't work.
- Insight recommendations would not be shown in collaborative sessions.
- Subplots on scatter charts wouldn't render.
- Viewing edited charts in a larger window wouldn't display the charts.
- Using math-based temporal expressions, such as `Day` + `5Days`, while keeping or dropping rows wouldn't work.
- Some column names could not be read when creating a new column.
- Some Data Assistant responses would not generate the resulting dataset or chart.
- The Data Assistant could not fuzzy match (approximate string match) if null values were present in the target columns.
- Keeping or dropping rows from JSON columns wouldn't work.
- Scrolling in certain Chart Builder dropdown menus caused unintended background scrolling in the application.
- The Show Me Something Interesting feature occasionally failed to display insight recommendations.
- Skills were not executing when responding to Yes or No prompts.
July 1, 2024
New Features
You can now:
- Train clustering models directly within BigQuery.
- Log in to DataChat using a one-time passcode instead of a password.
This release also includes the following improvements:
- The Data Assistant can now extract a wider range of datetime formats.
- The Data Assistant can now sort and refine data by specific keywords, helping you find information more efficiently.
- The model evaluation metric has been updated from Silhouette Score to Davies-Bouldin Index.
- The Data Assistant now has better recognition and matching of data entries that are similar but not identical, making it easier to handle variations in data and find relevant information.
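For readers unfamiliar with the Davies-Bouldin Index mentioned in the list above, the following is a minimal sketch of its standard textbook definition: for each cluster, take the worst-case ratio of combined within-cluster scatter to centroid separation, then average over clusters. Lower values indicate better-separated clusters, whereas a higher Silhouette Score is better. The `centroid` and `davies_bouldin` helpers and the sample points are hypothetical and are not DataChat's implementation.

```python
import math


def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def davies_bouldin(clusters):
    """clusters: list of clusters, each a list of coordinate tuples."""
    centers = [centroid(c) for c in clusters]
    # Scatter of a cluster: mean distance from its points to its centroid.
    scatter = [
        sum(math.dist(p, center) for p in c) / len(c)
        for c, center in zip(clusters, centers)
    ]
    k = len(clusters)
    total = 0.0
    for i in range(k):
        # Worst-case similarity between cluster i and any other cluster.
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k)
            if j != i
        )
    return total / k


# Two well-separated clusters versus two clusters that sit close together.
tight = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 0.0), (10.0, 2.0)]]
loose = [[(0.0, 0.0), (0.0, 2.0)], [(3.0, 0.0), (3.0, 2.0)]]
```

Comparing `davies_bouldin(tight)` with `davies_bouldin(loose)` shows the well-separated clustering receiving the lower score. The sketch assumes at least two clusters with distinct centroids.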
Deprecated Functionality
- The Publication Library feature has been deprecated and removed.
Bug Fixes
This release fixed the following issues:
- Occasional long query request times in the Data Assistant.
- Queries involving "matching digit values" or "match semantic substring" would fail.
- Sticky scrolling in the Database Browser.
- Creating new columns using date values wouldn't work.
- Some users were prevented from sharing charts from the Data Assistant.
- Mode, median, count, and percentile aggregates wouldn't work when connected to SQL Server.
- Joining tables wouldn't work after connecting to a BigQuery database.
- Some customizations in the Chart Builder wouldn't work.
- `Compute` would fail if the dataset's column headers contained both alpha and numeric characters.
- Clicks within some dropdown menus wouldn't register.
- Some bar charts did not start on the first X-axis data point.
- Loading data from Snowflake connections with all lowercase column names would fail.
- Creating columns by concatenation in the Data Assistant wouldn't work.
- Using some skills with a MySQL connection, such as `Sort` and `Describe`, would fail.
- Some users were unable to ask questions on pivot tables created by the Data Assistant.
- Generating charts using `Visualize` that required binning wouldn't work in the Data Assistant.
- Users were unable to load Excel files exported from SAP.
- Creating columns from a time delta would fail while connected to a SQL Server database.
- Root mean squared model scores were displayed as negative values.
- Filtering a chart with an expression wouldn't work.
- Binning a column on a SQL Server database increased the total number of rows.
- Creating new window columns would fail when connected to a SQL Server database.
- Keeping rows based on date would cause some display issues.
- Users couldn't open a public chart link shared by another user while logged into DataChat.
- Dropping or keeping rows based on a temporal expression wouldn't work.
June 3, 2024
New Features
You can now:
- Plot histograms using the Data Assistant.
- Round values to the nearest whole number using the Data Assistant.
- Take action on homepage objects in bulk.
- Use `Show me something interesting` within the Data Assistant.
- Train models directly in BigQuery.
- Paste in the Workflow Editor.
- Turn off public links if you are using a hosted version of DataChat.
- Connect to MySQL databases using a connection string.
This release also includes the following improvements:
- Chart tooltips are now scrollable.
- You can now see a summary of both the row count and column count in the header of a dataset.
- The Data Assistant now suggests questions when a dataset is loaded.
- Added pagination to the Database Browser.
- Column counts are now displayed.
Deprecated Functionality
- The Dataset Builder has been deprecated.
- The `Number of Bins to Use for Target Column` field in the Advanced Options section of the Train Model form has been removed.
- Minimizing charts has been deprecated.
Bug Fixes
- Datasets would "load" indefinitely if they had only one column and that column was dropped.
- `Add a column` using a time delta wouldn't work on BigQuery.
- `Show me something interesting` would occasionally use a non-numeric column as the KPI variable for pivot tables.
- Adding a chart or table to an Insights Board using the Publication Library would fail.
May 6, 2024
New Features
You can now:
- `Remove Duplicates` using the Data Assistant.
- Sort charts in the Chart tab.
- Display table explanations in the Data Assistant.
- Add an aggregate slice size value to donut charts.
- Apply the Count of Records aggregation to all chart types that accept column aggregations.
- Use the None aggregation on Donut charts.
- View in-app tutorials for common actions.
- Cancel a skill before it finishes running.
- Select and deselect all available datasets when loading in a session.
This release also includes the following improvements:
- Currently active datasets are automatically moved to the top of the dataset list in the Data tab.
- BigQuery Databases have been renamed to Datasets.
- Improved response time when cancelling a long-running skill.
- New users now open DataChat with only the demo datasets.
Deprecated Functionality
- The Data Assistant's histogram plotting feature is temporarily disabled.
Bug Fixes
- Creating a new column with Text on a Presto database would fail.
- Users could submit charts with incomplete fields.
- `Visualize` would allow users to select columns from inactive datasets, causing sessions to crash.
- Clicking the link to view a candidate model's pipeline report would fail.
- Binning a column would sometimes create a non-integer binned column.
April 6, 2024
New Features
You can now:
- `Remove Duplicates` from the Wrangle section of the skill menu.
- `Add a column` using a time delta.
- `Add a column` using a temporal expression.
- Preview and edit a publication's workflow.
- Hover each tab in a session to preview the tab's contents.
This release also includes the following improvements:
- Enhanced UI in the Data Assistant to handle long dataset names.
- Data Assistant now handles the `Split` skill.
- Data Assistant now handles the `Clean` skill.
- Larger default column widths.
- A Data Assistant question can now be cancelled once it has been running for more than 1 second.
- Saved snapshots now automatically save the specified datasets to a folder.
- Column types are now denoted next to the column name.
Bug Fixes
- Extracting datetime columns with `Extract` would fail.
- Multi-select charts would occasionally fail when users selected the first two options from the Required dropdown menu.
- Reuploading a deleted file would sometimes prompt the user to replace the existing file despite it already being deleted.
- Navigating to an Insights Board directly after publishing would fail.
- Replaying a workflow would sometimes fail to generate plots.
- Cleaning a column by replacing NULL values with an aggregation would sometimes cause an error.
- Some Databricks connections would fail to load data.
- The Data Assistant would occasionally report an incorrect chart type.
- The `Keep Rows` skill would use multiple datasets instead of only the one in focus.
- Training a linear regression model would occasionally fail.
- Creating a new column with Text on a Presto database would fail.
- Users could submit charts with incomplete fields.
- `Visualize` would allow users to select columns from inactive datasets, causing sessions to crash.
- Clicking the link to view a candidate model's pipeline report would fail.
- Binning a column would sometimes create a non-integer binned column.
Other Notes
- Reduced Data Assistant hallucinations.
- Chart Builder default functionality changes:
  - The default aggregation for bubble and scatter charts is now `Count` or `Average`, depending on the column type.
  - When changing to Violin, Boxplot, or Donut, the Data Sample is set to 5000.
March 8, 2024
DataChat has introduced a comprehensive overhaul of the session UI, aimed at seamlessly integrating the previous "DataGrid" and "Ava Conversations" functionalities into a unified interface, enhancing the data analysis workflow.
New Features
- The session interface has undergone significant refinement to consolidate previous conversations and sessions, providing users with a unified display. Notable UI enhancements include the addition of:
  - Data tab
  - Chart tab
  - Data Assistant tab
  - New Skill menu
  - The renaming of "Recipes" to "Workflows"
  - Transitioning "Ava" and "Ask" functionalities into the "Data Assistant"
- You can now quickly generate insights on your data with Show me something interesting.
- Datasets can now be hidden from the session view for better organization.
- You can now add charts and datasets generated by the Data Assistant directly to your Data tab or Chart tab.
- Charts now offer visibility into the underlying dataset used for their creation.
- When saving a session as a workflow, you can now selectively include either the entire session or exclusively the generated and modified datasets.
This release also includes the following improvements:
- Optimized display for tablet/mobile screens.
- Implementation of more informative error messaging.
Deprecated Functionality
- The `Display` skill.
- DataChat tours.
Bug Fixes
- Fixed an issue where `Keep Rows` and `Drop Rows` based on a condition for date and timestamp columns would fail.