2014-06-15

Virtual Private Database and how to integrate it with a Java application

Virtual Private Database (VPD) allows you to implement row and column level security in your database by "hiding" data in tables, so that only a subset of all table data is accessible. The cool things is that the "hiding" can be done based on your application user's identity or roles, so that privileged users can access more data than normal users - and it is all managed at database level!

Oracle
Oracle Enterprise Edition has a very powerful implementation of this concept. It allows you to define VPD policies, which can restrict access on row and/or column level. I won't describe it in a lot of details since there's a lot of documentation on this topic (for example here). I'll just show the most basic usage example - restricting access to rows of a table. In order to do this, you can associate a table with a function which returns predicate. Whenever you SELECT/UPDATE/DELETE on such table, the result of the function is added to the WHERE clause of your statement. For example, if you have function ident_func:
create or replace function ident_func (p_schema in varchar2 default null, p_object in varchar2 default null) return varchar2 
as 
begin 
   return 'assigned_user_id = sys_context(''userenv'', ''client_identifier'')';
end;
and you run this query:
select ... from tickets;
it will be transformed into this:
select ... from tickets where assigned_user_id = sys_context('userenv', 'client_identifier');

If you are using "lower" versions of Oracle database than EE, you can use "poor man's VDP". The idea is simple: for each table, that you want to secure, you should create a view, which contains all security related checks in its definition:
create view v_tickets_vpd as
select ... 
from tickets
where assigned_user_id = sys_context('userenv', 'client_identifier');
Then you must make sure, that you are doing all queries and modifications through that view (Oracle supports updateable views).

PostgreSQL
If you are using PostgreSQL, you can also implement poor man's VPD. It is possible with set_config()/current_setting() functions:
  • set_config(setting_name, new_value, is_local) sets the value for parameter setting_name to new_value. If is_local is true the value is set for current transaction, otherwise it is set for curernt DB session.
  • current_setting(setting_name) returns the current value for parameter setting_name
The previous view, rewritten for PostgreSQL, looks like this:
create view v_tickets_vpd as
select ... 
from tickets
where assigned_user_id = current_setting('myapp.user_id')
Is it fast? The current_setting() function is defined as STABLE. This means the database knows that, for given arguments, the function will return the same value for all rows within one statement. Thus it can be called once and used in an index scan:
create table tickets
(
  id bigserial primary key,
  title varchar(255) not null,
  assigned_user_id varchar(255) not null
);

insert into tickets(title, assigned_user_id)
select 'Ticket ' || i, 'user_' || (i % 10)
from generate_series(1, 1000000) i;

create index tickets_user_idx on tickets(assigned_user_id);

create view v_tickets_vpd as
select id, title, assigned_user_id 
from tickets
where assigned_user_id = current_setting('myapp.user_id');

select set_config('myapp.user_id', 'user_1', false);

explain select count(1) from v_tickets_vpd;
On my database it returned:
Aggregate  (cost=10902.24..10902.25 rows=1 width=0)
  ->  Bitmap Heap Scan on tickets  (cost=1835.41..10657.40 rows=97933 width=0)
        Recheck Cond: ((assigned_user_id)::text = current_setting('myapp.user_id'::text))
        ->  Bitmap Index Scan on tickets_user_idx  (cost=0.00..1810.93 rows=97933 width=0)
              Index Cond: ((assigned_user_id)::text = current_setting('myapp.user_id'::text))

How to use VPD with a Java app?
No matter which "implementation" of VPD you are using (Oracle EE or poor man's VPD in Oracle/PostgreSQL) the idea is the same:
  • before you send any statements to the DB, you need to call some function to setup information about your app's user
  • after you are done working with the database you should cleanup this information (because you probably obtained the DB connection from a pool, so it's best to return it to the pool in the same state)

I know two solutions for this problem (but there are more for sure):
  • Eclipse Link has very good support for VPD. It allows you to define a listener which is executed after you get a Connection from the DataSource and before you return it.
  • If you are using Spring to manage transactions, you can achieve similar effect with PlatformTransactionManager:
public class DelegatingPlatformTransactionManager implements PlatformTransactionManager {

  private final PlatformTransactionManager delegate;

  public DelegatingPlatformTransactionManager(PlatformTransactionManager delegate) {
    this.delegate = delegate;
  }

  @Override
  public TransactionStatus getTransaction(TransactionDefinition definition) throws TransactionException {
    return delegate.getTransaction(definition);
  }

  @Override
  public void commit(TransactionStatus status) throws TransactionException {
    delegate.commit(status);
  }

  @Override
  public void rollback(TransactionStatus status) throws TransactionException {
    delegate.rollback(status);
  }
}

public class VpdPlatformTransactionManager extends DelegatingPlatformTransactionManager {

  public VpdPlatformTransactionManager(PlatformTransactionManager delegate) {
    super(delegate);
  }

  @Override
  public TransactionStatus getTransaction(TransactionDefinition definition) throws TransactionException {
    TransactionStatus transaction = super.getTransaction(definition);
    if (transaction.isNewTransaction()) {
      // setup VPD info
      TransactionSynchronizationManager.registerSynchronization(new TransactionSynchronizationAdapter() {
        @Override
        public void beforeCompletion() {
          // cleanup VPD info
        }
      });
    }
    return transaction;
  }
}

Summary
VPD is an interesting mechanism for defining row-level security. Security rules are defined in a declarative language (SQL), are located close to the data they protect and are automatically applied when you access the database.