Creating Workflows
How to create and structure workflows in Kogenta Connect.
Creating a workflow involves setting up queries, choosing data sources, and organizing the execution flow.
Creating a new workflow
- Navigate to the Data Workflows section
- Click the create button
- Give your workflow a name and optional description
- Select the data source(s) you want to query
Adding queries
A workflow starts with one query, but you can add more to build multi-step pipelines. Each query needs:
- A name - a descriptive identifier (e.g.,
raw_transactions,aggregated_by_region) - SQL code - the actual query, written in cells
- An execution order - determines which queries run first
Query names are important because dashboards reference specific queries by name when binding data to widgets.
Execution order
Queries execute in the order you define. Earlier queries complete before later ones begin. This lets you structure dependent operations - for example, a summary query that depends on a detail query having already run.
Referencing other queries
A key feature of multi-query workflows is that later queries can reference the output of earlier queries. You do this by using the query name as a table name in your SQL:
-- Query 1: "raw_transactions" (execution order 0)
SELECT customer_id, amount, transaction_date
FROM sales.transactions
WHERE transaction_date >= {{start_date}}
-- Query 2: "customer_totals" (execution order 1)
SELECT customer_id, SUM(amount) as total_spend, COUNT(*) as transaction_count
FROM raw_transactions
GROUP BY customer_id
-- Query 3: "top_customers" (execution order 2)
SELECT t.customer_id, t.total_spend, c.name, c.region
FROM customer_totals t
JOIN customers.dim_customers c ON t.customer_id = c.id
WHERE t.total_spend > {{min_spend}}
ORDER BY t.total_spend DESCIn this example, customer_totals references raw_transactions by name, and top_customers references customer_totals. Each query builds on the results of the previous one.
Rules for cross-query references
- Use the query name, not the cell name - if a query is named
raw_transactions, reference it asFROM raw_transactions - No forward references - a query can only reference queries with a lower execution order (queries that run before it)
- Only SELECT queries can be referenced - stored procedure calls cannot be used as a data source for other queries
Referencing data layers
When writing SQL in your queries, you reference data layers by their virtual name. How you reference a data layer depends on how it is organized:
- In a folder - use the folder path and table name:
sales.transactions,places.regions.dim_regions - Unfiled - use the source schema and table name:
schema.table_name - Unfiled with no name collision - you can use just the table name:
table_name
-- Referencing a data layer in the "sales" folder
SELECT customer_id, SUM(amount) as total_spend
FROM sales.transactions
GROUP BY customer_id
-- Referencing an unfiled data layer by just its table name
SELECT * FROM dim_customersThe workflow editor provides autocomplete for virtual names - as you type, it suggests available data layers and their columns. Virtual names are handled automatically when the query runs.
A workflow can reference data layers from multiple data sources, giving you flexibility to combine data across different databases.
Organizing with folders
Workflows can be organized into folders, and queries within a workflow can be grouped into query folders for complex workflows with many queries.
Validating queries
Before publishing, you can validate your queries with a dry run. This checks the SQL syntax and returns the result schema (column names and types) without executing the full query. Use this to catch errors early and verify that your queries produce the expected output structure.