PostgreSQL – NoSQL and Javascript Functions with PLV8 – Part 2

PostgreSQL and it’s NoSQL features are awesome, right? If you don’t know what I am talking about, you should check my previous post to familiarize yourself with an introduction to NoSQL in Postgres. While it doesn’t go in depth of the NoSQL features, it does get you started much faster knowing that you can manipulate and operate on JSONB/JSON data using the Javascript language.

In this post we are going to talk about a bit more of the NoSQL features and functions that we can use to navigate our way around NoSQL columns. Some of the things we are going to look at:

  • Making JSONB/JSON columns part of our SELECT or UPDATE (SQL) queries
  • Modifying JSONB fields partially (or filling in details that need to come from other relational fields)

Querying JSONB/JSON NoSQL columns

In the simplest of use cases, you will find a need to select your rows based on a fields value within the JSONB. For the purpose of our exercise, let us assume a table and possible values of the column cells.

Posts

id      |  owner_id  |  post_text                                                                |   comments
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]

In the above example, we have a Posts table which has a column called comments of type JSONB. Why JSONB? Because then we don’t need to have a separate table called comments which has a post_id column with foreign key reference to our posts table. The comments JSON structure could look like the following:

[{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
},
{
   "comment": "Where I live, Mongo rules!",
   "timestamp": 1234567890,
   "owner": 18
}]

The above structure, if you notice, is firstly a JSON array since it starts with the square brackets. Secondly, in each comment in the array, we do not repeat information like owner_name or owner_profile_picture_url. We make a safe assumption that these details are available in a single Profiles table which has the profile details of each user in the system. So far so good right? It absolutely seems like the perfect spot to use NoSQL JSONB datatype. But there are some problems we will encounter later when we get down to build usable APIs against our table that need to be consumed by our front end apps.

Problem 1

How do I get the owner_name and owner_profile_picture_url for each “owner” of the comment? In the traditional world of RDBMS, it would be a simple join. That is what we will do with the owner_id in the Posts table.

SELECT 
    o.name, o.profile_picture_url, p.post_text
FROM
    posts p INNER JOIN profiles o ON p.owner_id = o.id

Let us now see how we would do something similar with the comments JSONB array. But before we get to the array, we must see what difference it would have made if instead of being an array it was a single comment.

{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
}

In this case we would use one of the arrow selectors part of PostgreSQL, and do something like this:

SELECT 
 o.name, o.profile_picture_url, p.post_text, 
 p.comments->>'comment' as comment,
 co.name as comment_owner_name, co.profile_picture_url as comment_owner_profile_picture_url
FROM
 posts p INNER JOIN profiles o ON p.owner_id = o.id
 LEFT JOIN profiles co ON p.comments->>'owner' = co.id

Does that give you a fair idea of how to pick a field from a JSONB column? Therefore, if you have a meta JSONB column in a table, you could extend the meta column with additional future columns and never have to change the schema itself!

But the above hasn’t solved our problem yet, i.e. how do we do the same for a JSONB array. But before that we need to at a few more basic things with it. For instance, selecting is one thing, how do we filter based on the JSONB array? Let us look at that next.

Searching Through a JSONB Array

The simplest way to search a JSONB array from within a SQL query is to use the @> operator. This operator lets you find through individual elements in an array. Let us presume a facebook style ‘likes’ column on the posts table. Instead of solving it with a traditional RDBMS style approach where we introduce a likes table with the ids of all the users who have liked a certain post with the post_id foreign key, we will rather use a JSONB array to store all the owner IDs right within the posts table. In the example below, the post with id 1 has been liked with profiles with id 18 and 4.

id      |  owner_id  |  post_text                                                                |   comments   | likes
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]                      | [18, 4]

Now what if we need to find all the posts which have been liked by the user with id 4, we would need to do something like this –

SELECT * FROM posts p WHERE likes @> '[4]'::jsonb

We have to take explicit care about the operator, operands and the data types of the operands. It is very easy to get sucked down the rabbit hole if you miss these details.

Problem 2

How do we join the details from a JSONB array with the concerned tables and fetch more details like name of the user or their profile picture etc. ? We left a similar question unanswered in Problem 1 which we will find answers to now.

The jsonb_array_elements function is almost a must-use when we are down to opening up an array, picking up the elements we need and joining them with another table. What this function does, is to open and break up each element in the array as if it was a cell in a set of rows. So if you were to do something like this –

SELECT jsonb_array_elements(likes) as profile_id FROM posts
WHERE posts.id = 1;

Would return you something like this –

|    profile_id   |
——————————————-
18
4

Now we can use this to join on the profiles table and fetch the user details. How we choose to do it is up to us individual developers, but here is simple way of doing this

SELECT 
    p.name, p.profile_picture_url
FROM
    profiles p INNER JOIN
    (SELECT jsonb_array_elements(likes) as profile_id FROM posts
     WHERE posts.id = 1) pid
     ON pid.profile_id = p.id

This is extremely powerful. It simplifies our otherwise complex ER Diagram by an order of magnitude. I no more need redundant data or several tables that are a result of normalization and have id columns spread all over. Things can be kept concise as long as you are cognizant of your NoSQL and RDBMS hybrid design. Keep in mind NoSQL is by definition is not enforcing which means that you could throw in anything out of place in the JSONB columns and it wouldn’t complain. So an obvious design choice would be to use relational data-model where it makes sense and use NoSQL where it doesn’t.

For our final leg in this post, we will find the answer to the question posed in Problem 1, i.e. how can we modify the JSON output so that it includes all of the information needed to send out to our consumers e.g. an API service.

Modifying Your JSONB Response

Being able to search through and connect data with JSONB is one step in the right direction. However being able to turn around concise information in modern data formats like JSON right from within your queries is what we are gunning for. The simplest way to achieve this is using PLV8 which is a native Javascript programming environment. You can modify JSON objects just the way you would in web environments.

But in order to be cognizant of performance, it pays off to learn some of the functions in Postgres that let you modify a JSON/JSONB column on the fly during a query. Let us revisit the problem we left unsolved in Problem 1 and also return the names and profile picture URLs for the people who have commented on a post. For this we will use the function called jsonb_set.

SYNTAX: jsonb_set

jsonb_set(target jsonb, path text[], new_value jsonb[,create_missing boolean])

Once you get a hang of using jsonb_set, you can manipulate JSONB objects right from within a query. It’s time to solve our problem –

SELECT 
       jsonb_set(p.comments, '{name}', ('"' || pf.name || '"')::jsonb, true) as comments
FROM
     profiles pf INNER JOIN (SELECT
                                      jsonb_array_elements(comments) as comments
                               FROM posts) p
         ON p.comments->>'owner' = pf.id::text

RESULT:
{"name": "chris", "owner": 1, "comment": "i like it"}
{"name": "abi", "owner": 2, "comment": "me too!"}

In the above solution, we added a name field to the JSONB comment. If you notice this combines the solutions from both Problem 1 and 2 to produce the right JSONB output. Just the way we added the comment owners name to the JSONB we can fetch as many details as we like and append to create a formatted JSON that is just the way our end consumer apps want!

Now venture out into the world of NoSQL and Javascript PLV8 and tell me if it enhances or spices up a relational DB setup. Bon-voyage mes amis!

PostgreSQL – Writing Javascript Functions in PLV8 with NoSQL

PostgreSQL is awesome, right?! We are doing our fourth successful project with PostgreSQL 10.3 as our Data Persistence Layer (database) and with each implementation, we are loving it. At first, it was the NoSQL features within a relational environment that got us hooked. It just is so much easier to convince the dinosaurs (old techies in their post 50s who tend to have an adverse opinion on any new tech) to go flirt with NoSQL. In my experience, as long as we stayed in their comfort zones by keeping 90% of our Data Model relational, and only about 10% of NoSQL in our structure – everyone was happy.

How did we introduce NoSQL in the traditional world of RDBMS developers?

The first thing we did, was to add a “meta” column with JSONB type to almost every table. It was almost invisible to the naked eye on an ER diagram. No one bothered to ask about a column named “meta” much at first. That changed drastically over time. Here is an example. Any time someone realised that they actually needed a many to many relationship between two tables, we would lap up the opportunity to show off what NoSQL could achieve with minimal amount of changes – and with elegance. The neglected poor old “meta” column that had spent most of it’s lifetime remaining “null” now sprung into action and solved a real world problem. To take an example, assume two tables, one called Restaurants and another called Menus. Initially we designed the system believing that a restaurant can have multiple menus (while in real life it’s only a single menu for most) so we addressed the issue by having a one to many relationship between Restaurants and Menus – i.e. one Restaurant could have several Menus. As time went on, we encountered a client who had several restaurants and each restaurant had several menus (depending on what time of the day you went there). Now unfortunately our old fashioned approach needed work-arounds to solve this problem because there was no easy way to make the a menu be a part of several restaurants. We decided to solve it using two approaches. The first one was a traditional crosslink table. The second was adding a JSON array field into the meta JSON called restaurants in the menus table and vice versa in the restaurants table. I won’t go into much detail, but you already get the idea about which solution was more elegant. NoSQL clearly won the preference.

PLV8 JavaScript Functions

OK now we are ready to dive into the world of NoSQL inside of JavaScript and look at what PLV8 can do for us. The biggest criticism we would usually take for adding NoSQL into our Data Model was about how non-standard and cumbersome it was for us to use PostgreSQL JSON functions to play around with the NoSQL data. And yes it isn’t pretty and neither is it a standardized approach that someone from the world of Oracle or SQL Server could easily familiarise themselves with. Say hello to the PLV8 extension! Now we have a standard programming language called JavaScript that is understood and known by a large group of developers. The adventurous kinds in the area of RDBMS have at some point or the other dipped their toes in NoSQL and encountered JavaScript along the way. Those were the ones I convinced on exploring PLV8 – and eureka! – in a short amount of time we had a good chunk of functions written in JavaScript living right beside the traditional PL/pgSQL functions.

OK enough, show me how it’s done

Step 1.
Add the PLV8 extension to your PostgreSQL database.

CREATE EXTENSION plv8;

Step 2.
Write your first function!

CREATE OR REPLACE FUNCTION public.getreviews(postid bigint)
 RETURNS json
 LANGUAGE 'plv8'

COST 100
 VOLATILE 
AS $BODY$


var plan = plv8.prepare('SELECT COALESCE(reviews, \'[]\'::jsonb) as reviews FROM posts WHERE id = $1', ['bigint']);
 var reviews = plan.execute([i_postsid])[0].reviews;
 for (var index = 0; index < reviews.length; index++) {
    var review = reviews[index];
    //do something with your review object
 }
 return reviews;

$BODY$;

What the above function achieves is simply take a postid and return all the reviews in a NoSQL field. But if you are a JavaScript junkie, then you already know how to open the pandora’s box now! You can manipulate the JSON way more easily compared to using inbuilt JSON functions in PostgreSQL and pass it around. In the above example, note the few things of importance. Number one is the plv8 object which acts as our bridge to the PostgreSQL database. Second is the fact that your regular PL/pgSQL is no more a first class citizen within those $BODY$ start and end markers. We have gone JavaScript!

I have kept this short to serve as an introduction and motivational to help interested developers push the NoSQL agenda. Cheers!