Blog of Ankarion

A blog about applied SQL/C/C++

Transform


Preface

Jsonb is not a new feature in PostgreSQL, and you can easily find articles about it just by googling (I even have one in drafts). But while writing my jsonb article, I discovered that I couldn't find a proper way of working with jsonb inside triggers.

After googling for a few more seconds I thought: "Meh, I've tried. Maybe it's easier to write my own way of working with jsonb."

My own way of working with jsonb

So, I'm happy to introduce to you "my own way of working with jsonb". The idea is to take the json as an incoming text parameter (which in perl is $_[0]) and parse it into the desired object inside the function. Ta-dah!

use JSON;
my $hash = decode_json($_[0]);  # $_[0] holds the json value, passed in as text

Well... In the end it became obvious that this is not the best solution, because PostgreSQL 9.5+ provides the CREATE TRANSFORM functionality, which is supposed to work better.

Intro

This article is dedicated to transforms and is designed to save you from some possible mistakes. I will show what a "transform" is and how to use it on a simple artificial example. At the end of this article there is a "Benchmarks" section, which compares jsonb + transform against json + decode_json (the approach described in the "Preface").

Definition

Transforms define the way a PostgreSQL type can be represented in a certain procedural language. For example, suppose we want to represent the PostgreSQL type "hstore" as a perl hash. This can be done with a few lines of SQL:

CREATE TRANSFORM FOR hstore LANGUAGE plperl (
    FROM SQL WITH FUNCTION name_of_function_from_sql (argument_type [, ...]),
    TO SQL WITH FUNCTION name_of_function_to_sql (argument_type [, ...])
);

The problem is that we have to define those two functions, "name_of_function_from_sql" and "name_of_function_to_sql", ourselves. They describe how a value is converted between its SQL representation and its perl representation.
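For hstore specifically, such a pair of functions already ships in the hstore_plperl contrib extension (PostgreSQL 9.5+), so you normally do not write them by hand. A minimal sketch, assuming the contrib modules are installed; hstore_keys is a made-up example function, not part of the extension:

```sql
-- hstore_plperl ships the hstore <-> plperl transform, so the two
-- conversion functions do not have to be written manually
CREATE EXTENSION hstore_plperl CASCADE;  -- CASCADE pulls in hstore and plperl

CREATE FUNCTION hstore_keys(h hstore) RETURNS text[]
LANGUAGE plperl
TRANSFORM FOR TYPE hstore
AS $$
    my $h = shift;         -- the hstore arrives as a perl hash reference
    return [sort keys %$h];
$$;
```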

It is not always the easiest solution, but the good news is that some transforms are already implemented: jsonb can be transformed to and from both python and perl (feel free to check them out and participate in the perl and python patch discussions).

So, the only thing you need to do is to install an extension:

create extension jsonb_plperl;

And declare that the transform should be used in the definition of the function:

TRANSFORM FOR TYPE jsonb

The full SQL code should look like this:

CREATE EXTENSION jsonb_plperl CASCADE;

CREATE FUNCTION blah(val jsonb) RETURNS jsonb
LANGUAGE plperl
TRANSFORM FOR TYPE jsonb
AS $$
    my $val = $_[0];  # already a perl data structure, no parsing needed
    ...
    return $val;
$$;
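With the transform in place, the function is called like any other SQL function. A sketch, reusing the placeholder name blah from above:

```sql
SELECT blah('{"a": 1, "b": 2}'::jsonb);
```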

Benchmarks

This is the most spectacular part of this article. We are going to find out which method is better (i.e. faster): the old one or the "transform".

Benchmarking process is divided into two stages:

  • the init part
  • the workload part

Init

In the "init" part, we create two functions: one that uses the transform, and one that parses the json manually, the way described in the "Preface":

CREATE EXTENSION jsonb_plperlu CASCADE;

CREATE FUNCTION test1(val jsonb) RETURNS jsonb
LANGUAGE plperlu
TRANSFORM FOR TYPE jsonb
AS $$
return $_[0];
$$;

CREATE FUNCTION test2(val text) RETURNS text
LANGUAGE plperlu
AS $$
use JSON;
my $hash = decode_json($_[0]);
return encode_json $hash;
$$;

Workloads

Taking the init part into account, the workloads look like:

select test2('{...}')::json;

and

select test1('{...}'::jsonb);

Where {...} is a JSON object that maps string keys to their integer representations. For example:

{
    "1": 1,
    "2": 2,
    ...
}

Here you can find the python script I used to generate the workloads. An important thing to note before running the script: you need to create a directory called "tests" in the directory where gen_tests.py is located.
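The original script is the one linked above; as an illustration, a generator in the same spirit could look like the sketch below. The file names, payload sizes, and helper names here are my assumptions, not the original script's:

```python
import json
import os

def make_payload(n):
    """Build a JSON object mapping string keys "1".."n" to their integer values."""
    return json.dumps({str(i): i for i in range(1, n + 1)})

def gen_tests(sizes, out_dir="tests"):
    """Write one workload .sql file per payload size into out_dir."""
    os.makedirs(out_dir, exist_ok=True)
    for n in sizes:
        payload = make_payload(n).replace("'", "''")  # escape for SQL literals
        with open(os.path.join(out_dir, f"test_{n}.sql"), "w") as f:
            f.write(f"select test1('{payload}'::jsonb);\n")  # transform version
            f.write(f"select test2('{payload}')::json;\n")   # decode_json version

if __name__ == "__main__":
    gen_tests([10, 100, 1000])
```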

Results

On this chart, you can see how bad my first approach was in comparison with the transform. The higher the graph goes, the more milliseconds the corresponding method took.

The "bad practice" line is the first approach I've been talking about. As you can see, "transform" grows more slowly, which means that the "transform" approach works a lot faster. This is the reason why I called "bad practice" bad.