Version: 0.22.2

Compare

Compare lets you compare how the values in a given column are distributed across each distinct value of one or more comparison columns. Columns that contain date or time values cannot be compared.

Format

Compare uses a single utterance: Compare the column <primary column> for each <columns to compare>

Parameters

Compare uses the following parameters:

  • primary column (required). The column whose value distribution is compared across the groups defined by the comparison columns.
  • columns to compare (required). One or more columns whose distinct values define the groups across which the primary column's distribution is compared.
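
As a rough illustration of how the two parameters fill the utterance template, the following Python sketch assembles the text to enter. The helper name build_compare_utterance is hypothetical, and joining several comparison columns with commas is an assumption, since the format above does not state a separator.

```python
from typing import Sequence


def build_compare_utterance(primary_column: str, columns_to_compare: Sequence[str]) -> str:
    """Fill in the Compare template with column names (hypothetical helper)."""
    # Both parameters are required.
    if not primary_column:
        raise ValueError("primary column is required")
    if not columns_to_compare:
        raise ValueError("at least one column to compare is required")
    # Assumption: multiple comparison columns are separated by commas.
    compare_part = ", ".join(columns_to_compare)
    return f"Compare the column {primary_column} for each {compare_part}"


print(build_compare_utterance("Fare", ["Gender"]))
# prints: Compare the column Fare for each Gender
```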

Output

If the values are successfully compared, a table is shown in the display panel. The table contains the following columns:

  • Group1, Group2, and so on. There is one column for each distinct value in the columns listed in the columns to compare parameter.
  • MatchingScore. A score, from 0 to 100, that indicates how similar the distributions of the two groups are.
  • GroupSimilarity. Either "Yes" or "No" depending on whether the two distributions are determined to be similar. The distributions are considered to be similar if the pValue is greater than or equal to 0.05.
  • SampleSize1, SampleSize2, and so on. There is one column for each distinct value in the columns that are being compared against. These columns contain the number of records found for each value.
  • pValue. A value, from 0 to 1, that measures the likelihood of both groups having the same value distribution. The lower the value, the more likely it is that the two distributions are different. The values in the MatchingScore column are derived from the values in the pValue column.
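
To make the relationship between these columns concrete, here is a minimal Python sketch that builds one row per pair of groups. It rests on several assumptions: the statistical test that Compare uses is not stated here, so a permutation test on the difference of means stands in for it; the function names are hypothetical; and MatchingScore is left out because its mapping from pValue is not documented. Only the 0.05 threshold for GroupSimilarity comes from the description above.

```python
import random
from itertools import combinations
from statistics import mean


def permutation_p_value(a, b, n_permutations=2000, seed=0):
    """Two-sided permutation test on the difference of means.

    Assumption: this is a stand-in test, not necessarily the one Compare uses.
    """
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # The add-one correction keeps the estimate strictly above 0.
    return (hits + 1) / (n_permutations + 1)


def compare_groups(groups, alpha=0.05):
    """Build one row per pair of groups, mirroring the documented columns."""
    rows = []
    for (name1, values1), (name2, values2) in combinations(groups.items(), 2):
        p = permutation_p_value(values1, values2)
        rows.append({
            "Group1": name1,
            "Group2": name2,
            # Documented rule: similar when pValue >= 0.05.
            "GroupSimilarity": "Yes" if p >= alpha else "No",
            "SampleSize1": len(values1),
            "SampleSize2": len(values2),
            "pValue": p,
        })
    return rows
```

Calling compare_groups with a dictionary that maps each group name to its list of values returns a list of row dictionaries. Two groups with identical values yield a pValue of 1.0 and a GroupSimilarity of "Yes", while two clearly separated groups fall below the 0.05 threshold and are marked "No".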

Examples

To compare the distribution of values in a column called Fare against the distributions of each value in the Gender column, enter Compare the column Fare for each Gender.
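
The grouping behind this example can be sketched in Python as follows. The records are invented purely for illustration, and split_by_group is a hypothetical helper rather than part of the product. The sketch shows how the Fare values would be split by each distinct Gender value, and how the per-group record counts correspond to the SampleSize columns.

```python
from collections import defaultdict

# Invented sample records, for illustration only.
records = [
    {"Fare": 7.25, "Gender": "male"},
    {"Fare": 71.28, "Gender": "female"},
    {"Fare": 8.05, "Gender": "male"},
    {"Fare": 53.10, "Gender": "female"},
    {"Fare": 8.46, "Gender": "male"},
]


def split_by_group(rows, primary_column, compare_column):
    """Collect the primary column's values under each distinct comparison value."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[compare_column]].append(row[primary_column])
    return dict(groups)


fare_by_gender = split_by_group(records, "Fare", "Gender")
sample_sizes = {name: len(values) for name, values in fare_by_gender.items()}
print(sample_sizes)  # {'male': 3, 'female': 2}
```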