Common Table Expressions and Recursive Queries in PostgreSQL

Introduction to Common Table Expressions (CTEs)

Common Table Expressions (CTEs) are a great way to create temporary result sets that can be referenced within a SELECT, INSERT, UPDATE, or DELETE statement. They can enhance the readability and organization of SQL queries, making them easier to understand and maintain.

Creating a Simple CTE

Let's say we have a database of employees as follows:

CREATE TABLE employees ( id SERIAL PRIMARY KEY, name VARCHAR(100), department VARCHAR(100), salary DECIMAL(10, 2) ); INSERT INTO employees (name, department, salary) VALUES ('John Doe', 'Engineering', 90000), ('Jane Smith', 'Engineering', 95000), ('Emily Johnson', 'HR', 50000), ('Michael Brown', 'HR', 55000), ('Linda Davis', 'Sales', 65000);

Now, if we want to fetch the average salary for each department, we can create a CTE to do just that:

WITH avg_salaries AS ( SELECT department, AVG(salary) AS avg_salary FROM employees GROUP BY department ) SELECT * FROM avg_salaries;

Explanation of the CTE

  1. WITH Clause: The WITH clause defines the CTE named avg_salaries.
  2. Temporary Result Set: The SQL within the parentheses is executed first, creating a temporary result set containing the average salary per department.
  3. Subsequent Query: The outer SELECT statement queries this temporary result by simply referencing its name.

Using CTEs helps in breaking complex queries into smaller, manageable pieces.

Recursive Queries in PostgreSQL

Recursive queries are powerful CTEs that allow you to query hierarchical data, which is data that can be represented in a tree structure. This is useful in cases like finding employees in an organizational structure or navigating through deeply nested data.

Understanding Recursive Queries

A recursive query works through a two-part process: defining the base case and the recursive step. Let’s consider an example using a table representing organizational hierarchy:

CREATE TABLE org_chart ( employee_id SERIAL PRIMARY KEY, employee_name VARCHAR(100), manager_id INT REFERENCES org_chart(employee_id) ); INSERT INTO org_chart (employee_name, manager_id) VALUES ('CEO', NULL), ('CTO', 1), ('CFO', 1), ('Engineer 1', 2), ('Engineer 2', 2), ('Accountant', 3);

In this scenario, we want to find all employees under the CTO.

Writing a Recursive Query

The recursive query can be written as follows:

WITH RECURSIVE employee_hierarchy AS ( SELECT employee_id, employee_name, manager_id FROM org_chart WHERE employee_name = 'CTO' -- Base case UNION ALL SELECT o.employee_id, o.employee_name, o.manager_id FROM org_chart o INNER JOIN employee_hierarchy e ON o.manager_id = e.employee_id -- Recursive step ) SELECT * FROM employee_hierarchy;

Breaking Down the Recursive Query

  1. Base Case: The first SELECT fetches the employee with the name 'CTO'. This serves as the starting point for our recursion.
  2. Recursive Step: The UNION ALL combines the results of the base case with a recursive SELECT that fetches all employees managed by each employee already found in employee_hierarchy.
  3. Result: The outer query then collects all employees associated with the CTO, including any nested subordinates.

Conclusion

Understanding and utilizing CTEs and recursive queries can significantly enhance how you structure and query data in PostgreSQL. By breaking complex logic into digestible parts and leveraging recursion for hierarchical data, you can write cleaner, more efficient SQL that is easier to maintain as your databases grow.

Share now!

Like & Bookmark!