Define and Use Complex Expressions
The Define skill is a useful tool for defining complex expressions. In this example, we'll explore how defined expressions simplify the use of other skills, such as Keep and Visualize.
Load Our Data
To start, load the file "sba_sample.csv" into the session.
Predicate Expressions
Define a Predicate Expression
Let's say that we'd like to examine SBA data for California between fiscal years 2000 and 2012 where NaicsCode is not null. We could enter three separate steps to achieve this:
Keep the rows where BorrState contains CA
Keep the rows where ApprovalFiscalYear is between the values 2000 to 2012
Drop the rows where NaicsCode is null
Instead, we can define an expression to contain all of these conditions in a single step. Enter in the GEL input field:
Define a predicate expression CAapprovals that satisfies all of the following conditions BorrState contains CA, ApprovalFiscalYear is between the values 2000 to 2012, NaicsCode is not null
The log returns a message stating that the predicate expression "CAapprovals" has been created.
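As a rough analogy outside of GEL, a predicate expression behaves like a named function that returns true only when every condition holds. The Python sketch below is illustrative and is not how the Define skill is implemented. The rows are hypothetical and not taken from "sba_sample.csv", the function name `ca_approvals` is invented for this example, and the year range is assumed to be inclusive.

```python
# Hypothetical rows for illustration only; they are not taken from sba_sample.csv.
rows = [
    {"BorrState": "CA", "ApprovalFiscalYear": 2005, "NaicsCode": "722110"},
    {"BorrState": "CA", "ApprovalFiscalYear": 1998, "NaicsCode": "722110"},
    {"BorrState": "NY", "ApprovalFiscalYear": 2005, "NaicsCode": "541110"},
    {"BorrState": "CA", "ApprovalFiscalYear": 2012, "NaicsCode": None},
]


def ca_approvals(row):
    """Named predicate: true only when all three conditions hold."""
    return (
        "CA" in row["BorrState"]                       # BorrState contains CA
        and 2000 <= row["ApprovalFiscalYear"] <= 2012  # assumed inclusive range
        and row["NaicsCode"] is not None               # NaicsCode is not null
    )


# Three separate steps, applied one after another.
step1 = [r for r in rows if "CA" in r["BorrState"]]
step2 = [r for r in step1 if 2000 <= r["ApprovalFiscalYear"] <= 2012]
three_steps = [r for r in step2 if r["NaicsCode"] is not None]

# One step using the named predicate.
one_step = [r for r in rows if ca_approvals(r)]
```

Both approaches keep the same rows. The named predicate simply gives the combined condition a reusable name.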
Use a Predicate Expression
We can now use our created expression, "CAapprovals", to view data that meets our conditions. In the GEL input field enter:
Keep the rows where CAapprovals
We can see that our step returned a new dataset, "sba_sample v2", with 17 rows that met our given conditions.
Aggregate Expressions
Define an Aggregate Expression
Using the dataset "sba_sample v2", let's say that we'd like to visually compare the average gross approval amount across SBA district offices. In this case, we need to create an aggregate expression before creating our visualization:
- Click Define in the sidebar.
- Select Aggregate for the expression type.
- Enter "AvGrossApproval" for the Expression Name.
- Select "average" for the expression, and "GrossApproval" for the column.
- Click Submit.
The log returns a message stating that the aggregate expression "AvGrossApproval" has been created.
Use an Aggregate Expression
We can now use this expression to visualize the average gross approval amount for each SBA district. Enter in the GEL input field:
Visualize AvGrossApproval by SBADistrictOffice
Then click Chart1B to view the generated bar chart:
This chart shows us that the San Diego SBA district office held the highest average gross approval in California between the years of 2000 and 2012.
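To make the aggregate step concrete, the Python sketch below shows the kind of computation that "average GrossApproval by SBADistrictOffice" describes: group the rows by office, then average the amounts within each group. It is an analogy under stated assumptions, not the tool's implementation. The office names and amounts are invented, and `average_by` is a hypothetical helper.

```python
from collections import defaultdict

# Hypothetical rows; office names and amounts are invented for illustration.
rows = [
    {"SBADistrictOffice": "Office A", "GrossApproval": 100000},
    {"SBADistrictOffice": "Office A", "GrossApproval": 300000},
    {"SBADistrictOffice": "Office B", "GrossApproval": 50000},
]


def average_by(rows, group_col, value_col):
    """Return {group value: mean of value_col within that group}."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        totals[row[group_col]] += row[value_col]
        counts[row[group_col]] += 1
    return {key: totals[key] / counts[key] for key in totals}


# Analogue of the aggregate expression evaluated per district office.
av_gross_approval = average_by(rows, "SBADistrictOffice", "GrossApproval")

# The tallest bar in the chart corresponds to the largest average.
top_office = max(av_gross_approval, key=av_gross_approval.get)
```

Defining the aggregate once and reusing it by name keeps the Visualize step short, in the same way that `average_by` keeps the grouping logic out of the calling code.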
Full Recipe
Load data from the file <strong>sba_sample.csv</strong>
Define a predicate expression <strong>CAapprovals</strong> that satisfies all of the following conditions BorrState contains CA, ApprovalFiscalYear is between the values 2000 to 2012, NaicsCode is not null
Keep the rows where CAapprovals
Define an aggregate expression <strong>AvGrossApproval</strong> as the expression average GrossApproval
Visualize <strong>AvGrossApproval</strong> by <strong>SBADistrictOffice</strong>
Plot a bar chart with the x-axis SBADistrictOffice, the y-axis AvGrossApproval