2014-04-14

DB view dependencies in PostgreSQL

Introduction

I recently learned a clever trick in PostgreSQL, which allows you to check which DB views and tables are used by any view. It allows you to see view's dependency tree, for example:
public.v_top_department depends on public.v_department_sales_per_employee
  public.v_department_sales_per_employee depends on public.v_department_sales
    public.v_department_sales depends on public.departments
    public.v_department_sales depends on public.sales
  public.v_department_sales_per_employee depends on public.v_department_employees
    public.v_department_employees depends on public.departments
    public.v_department_employees depends on public.employees
How to achieve it

Suppose we have a simple database with employees, departments and sales. Employees belong to departments and departments make sales:
create table departments
(
  dept_id integer primary key,
  dept_name varchar(255)
);

create table employees
(
  emp_id integer primary key,
  emp_name varchar(255),
  dept_id integer,
  constraint emp_dept_fkey foreign key(dept_id) references departments(dept_id)
);

create table sales
(
  sale_id integer primary key,
  sale_value numeric,
  dept_id integer,
  constraint sale_dept_fkey foreign key(dept_id) references departments(dept_id)
);
We also have a few views built on this tables: the number of employees in each department, the sum of sales in each department, the sum of sales per employee in each department and the "top" department - the one with the best sum of sales to number of employees ratio:
create view v_department_employees as
select d.dept_id, d.dept_name, count(1) num_employees
from departments d
join employees e on d.dept_id = e.dept_id
group by d.dept_id, d.dept_name;

create view v_department_sales as
select d.dept_id, d.dept_name, sum(sale_value) sum_sales
from departments d
join sales s on d.dept_id = s.dept_id
group by d.dept_id, d.dept_name;

create view v_department_sales_per_employee as
select de.dept_id, de.dept_name, de.num_employees, ds.sum_sales, 
  ds.sum_sales / de.num_employees sales_per_emp
from v_department_employees de
join v_department_sales ds on de.dept_id = ds.dept_id;

create view v_top_department as
select dspe.dept_id, dspe.dept_name, dspe.num_employees, dspe.sum_sales, dspe.sales_per_emp
from v_department_sales_per_employee dspe
order by sales_per_emp desc
limit 1;
So how to check dependencies of a view? The trick is to use information_schema.view_table_usage view. Its columns: view_schema, view_name, table_schema, table_name describe dependencies between views and tables. This is the simplest query against this view:
select vtu.view_schema || '.' || vtu.view_name || ' depends on ' 
  || vtu.table_schema || '.' || vtu.table_name
from information_schema.view_table_usage vtu
where vtu.view_schema = 'public' and vtu.view_name = 'v_top_department';
This is its result:
public.v_top_department depends on public.v_department_sales_per_employee
As you can see it shows only direct dependencies of a view. In order to see transitive dependencies, we have to use recursive query:
with recursive view_tree(parent_schema, parent_obj, child_schema, child_obj) as 
(
  select vtu_parent.view_schema, vtu_parent.view_name, 
    vtu_parent.table_schema, vtu_parent.table_name
  from information_schema.view_table_usage vtu_parent
  where vtu_parent.view_schema = 'public' and vtu_parent.view_name = 'v_top_department'
  union all
  select vtu_child.view_schema, vtu_child.view_name, 
    vtu_child.table_schema, vtu_child.table_name
  from view_tree vtu_parent, information_schema.view_table_usage vtu_child
  where vtu_child.view_schema = vtu_parent.child_schema 
  and vtu_child.view_name = vtu_parent.child_obj
) 
select tree.parent_schema || '.' || tree.parent_obj || ' depends on ' 
  || tree.child_schema || '.' || tree.child_obj
from view_tree tree;
It returns:
public.v_top_department depends on public.v_department_sales_per_employee
public.v_department_sales_per_employee depends on public.v_department_sales
public.v_department_sales_per_employee depends on public.v_department_employees
public.v_department_sales depends on public.departments
public.v_department_sales depends on public.sales
public.v_department_employees depends on public.departments
public.v_department_employees depends on public.employees
There are more results now, but they are not very readable. Let's add some indentation and sort the rows approprietly to display tree structure:
with recursive view_tree(parent_schema, parent_obj, child_schema, child_obj, ind, ord) as 
(
  select vtu_parent.view_schema, vtu_parent.view_name, 
    vtu_parent.table_schema, vtu_parent.table_name, 
    '', array[row_number() over (order by view_schema, view_name)]
  from information_schema.view_table_usage vtu_parent
  where vtu_parent.view_schema = 'public' and vtu_parent.view_name = 'v_top_department'
  union all
  select vtu_child.view_schema, vtu_child.view_name, 
    vtu_child.table_schema, vtu_child.table_name, 
    vtu_parent.ind || '  ', 
    vtu_parent.ord || (row_number() over (order by view_schema, view_name))
  from view_tree vtu_parent, information_schema.view_table_usage vtu_child
  where vtu_child.view_schema = vtu_parent.child_schema 
  and vtu_child.view_name = vtu_parent.child_obj
) 
select tree.ind || tree.parent_schema || '.' || tree.parent_obj 
  || ' depends on ' || tree.child_schema || '.' || tree.child_obj txt, tree.ord
from view_tree tree
order by ord;
It returns:
                                        txt                                        |   ord
-----------------------------------------------------------------------------------+---------
 public.v_top_department depends on public.v_department_sales_per_employee         | {1}
   public.v_department_sales_per_employee depends on public.v_department_sales     | {1,1}
     public.v_department_sales depends on public.departments                       | {1,1,3}
     public.v_department_sales depends on public.sales                             | {1,1,4}
   public.v_department_sales_per_employee depends on public.v_department_employees | {1,2}
     public.v_department_employees depends on public.departments                   | {1,2,1}
     public.v_department_employees depends on public.employees                     | {1,2,2}
I also included the ord column in the result, so you can exactly see how the ordering works. It uses row_number() window function and collects the row_numbers in an array to deliver desired order.

If you would like to see a working example, check this sqlfiddle (or, if sqlfiddle is unavailable, this gist for a complete source code).