Define
Define lets you create reusable objects, such as patterns, aggregations, predicates, math expressions, and more. You can then use these objects in other skills, such as Compute, Keep, or Drop.
Format
Define has a different format for each available object:
Define a column group <name> as the columns <columns>
Define a column reference <phrase> as the column <reference column>
Define a math expression <name> as <math expression>
Define a predicate expression <name> as the expression <predicate>
Define a predicate expression <name> that satisfies (any | all) of the following conditions <predicate>
Define an aggregate expression <name> as the expression <aggregation>
Define an aggregate math expression <name> as <math expression>
Compared to a standard math expression, an aggregate math expression can combine aggregations with standard math, such as sum(column A / column B).
Define an aggregate query expression <name> to be <aggregation> (for each <column> | where <predicate> | such that <predicate> | sorted in (ascending | descending) order | displaying (bottom | first | last | top) <number of rows>)
Define an extract expression <name> as the expression <date part> from <datetime column>
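The "(any | all) of the following conditions" form can be read as combining several conditions into one reusable predicate. The Python below is only a rough sketch of that idea, not how Define is implemented: the `combine` helper, the `Age` and `Fare` column names, and the thresholds are all hypothetical and chosen for illustration.

```python
from typing import Callable, Dict, List

Row = Dict[str, float]
Predicate = Callable[[Row], bool]


def combine(conditions: List[Predicate], mode: str) -> Predicate:
    """Build one predicate that holds when any/all of the conditions hold.

    Hypothetical helper: it only mirrors the "(any | all)" wording.
    """
    if mode not in ("any", "all"):
        raise ValueError("mode must be 'any' or 'all'")
    reducer = any if mode == "any" else all
    return lambda row: reducer(cond(row) for cond in conditions)


# Hypothetical conditions over made-up columns.
def is_adult(row: Row) -> bool:
    return row["Age"] >= 18


def paid_high_fare(row: Row) -> bool:
    return row["Fare"] > 50


adult_or_high_fare = combine([is_adult, paid_high_fare], "any")
adult_and_high_fare = combine([is_adult, paid_high_fare], "all")

sample_row = {"Age": 30.0, "Fare": 10.0}  # made-up row
result_any = adult_or_high_fare(sample_row)   # one condition holds
result_all = adult_and_high_fare(sample_row)  # not every condition holds
```

With this sample row, the "any" predicate is satisfied because the age condition holds, while the "all" predicate is not because the fare condition fails.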
Parameters
The parameters used in Define include:
name (required). The name of the object.
date part (required). For extract phrases, this is the part of the date or time that should be extracted, such as day or hour. See Extract for more information on the available options.
datetime column (required). For extract phrases, this is the column the date part should be extracted from.
math expression (required). A math expression, such as (<column x> * <column y>) / <column z>.
predicate (required). Operators used to compare two values. Refer to Compute for more information.
column (required). A comma-separated list of columns to include in the column group.
phrase (required). A phrase to use as the name of the object.
aggregation (required). A comma-separated list of calculations. Refer to Compute for more information.
expression (required). An already-defined pattern expression.
reference column (required). The column the phrase should reference.
number of rows (optional). The number of rows to display.
Output
If the object is successfully defined, a success message is returned in the conversation history. Otherwise, an error message is returned.
Examples
Consider a dataset called “Titanic” that contains information on each passenger, including the following columns:
Age. Their age.
Gender. Their gender.
Name. Their name.
PClass. Their class.
Survived. Whether they survived the disaster.
To define a predicate that returns true when the passenger is an adult, enter:
  Define a predicate expression isAdult as the expression Age is greater than or equal to the value 18
To define an aggregation that calculates the average age of the passengers, enter:
  Define an aggregate expression AverageAge as the expression average Age
To define a math expression that calculates the Age to Fare ratio for each row, enter:
  Define a math expression AgeFareRatio as Age / Fare
To define an aggregate math expression that calculates the total Age to Fare ratio for the dataset, enter:
  Define an aggregate math expression AgeFareRatio as sum(Age) / sum(Fare)
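The difference between a row-level math expression and an aggregate math expression is easier to see with numbers. The following Python is a minimal sketch of the arithmetic only, not of how the skill evaluates expressions; the three rows are made-up values, not taken from the Titanic dataset.

```python
# Made-up sample rows for illustration only.
rows = [
    {"Age": 20.0, "Fare": 10.0},
    {"Age": 40.0, "Fare": 10.0},
    {"Age": 30.0, "Fare": 30.0},
]

# Row-level math expression (Age / Fare): one value per row.
row_ratios = [row["Age"] / row["Fare"] for row in rows]

# Aggregate math expression sum(Age) / sum(Fare): one value for the dataset.
ratio_of_sums = sum(row["Age"] for row in rows) / sum(row["Fare"] for row in rows)

# Aggregate math expression of the form sum(column A / column B):
# also a single value, but generally a different one.
sum_of_ratios = sum(row_ratios)

print(row_ratios)     # per-row ratios
print(ratio_of_sums)  # 90 / 50
print(sum_of_ratios)  # 2 + 4 + 1
```

Note that sum(Age) / sum(Fare) and sum(Age / Fare) are not interchangeable: the first divides two totals, while the second totals the per-row ratios.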
Uses
The expressions defined above can be used in other steps. For example:
- To compute the total count of passengers who are adults, enter:
  Compute the count of records where isAdult
- To visualize the average age for each passenger class, enter:
  Visualize AverageAge by PClass
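As a rough guide to what these two steps calculate, the Python below reproduces the same logic on a handful of made-up passenger rows. It is a sketch under stated assumptions: the rows and the PClass labels ("1st", "2nd", "3rd") are hypothetical, and the code says nothing about how the skills are implemented or how the chart is drawn.

```python
from collections import defaultdict

# Made-up passenger rows; the PClass labels are hypothetical.
passengers = [
    {"Age": 29.0, "PClass": "1st"},
    {"Age": 2.0, "PClass": "1st"},
    {"Age": 30.0, "PClass": "2nd"},
    {"Age": 16.0, "PClass": "3rd"},
    {"Age": 40.0, "PClass": "3rd"},
]


def is_adult(row):
    """Mirror of the isAdult predicate: Age >= 18."""
    return row["Age"] >= 18


# "Compute the count of records where isAdult"
adult_count = sum(1 for row in passengers if is_adult(row))

# "AverageAge by PClass": average Age within each class.
ages_by_class = defaultdict(list)
for row in passengers:
    ages_by_class[row["PClass"]].append(row["Age"])

average_age_by_class = {
    pclass: sum(ages) / len(ages) for pclass, ages in ages_by_class.items()
}

print(adult_count)
print(average_age_by_class)
```

Defining isAdult and AverageAge once keeps the condition and the calculation in one place, so every step that refers to them stays consistent.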