PostgreSQL – NoSQL and Javascript Functions with PLV8 – Part 2

PostgreSQL and it’s NoSQL features are awesome, right? If you don’t know what I am talking about, you should check my previous post to familiarize yourself with an introduction to NoSQL in Postgres. While it doesn’t go in depth of the NoSQL features, it does get you started much faster knowing that you can manipulate and operate on JSONB/JSON data using the Javascript language.

In this post we are going to talk about a bit more of the NoSQL features and functions that we can use to navigate our way around NoSQL columns. Some of the things we are going to look at:

  • Making JSONB/JSON columns part of our SELECT or UPDATE (SQL) queries
  • Modifying JSONB fields partially (or filling in details that need to come from other relational fields)

Querying JSONB/JSON NoSQL columns

In the simplest of use cases, you will find a need to select your rows based on a fields value within the JSONB. For the purpose of our exercise, let us assume a table and possible values of the column cells.

Posts

id      |  owner_id  |  post_text                                                                |   comments
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]

In the above example, we have a Posts table which has a column called comments of type JSONB. Why JSONB? Because then we don’t need to have a separate table called comments which has a post_id column with foreign key reference to our posts table. The comments JSON structure could look like the following:

[{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
},
{
   "comment": "Where I live, Mongo rules!",
   "timestamp": 1234567890,
   "owner": 18
}]

The above structure, if you notice, is firstly a JSON array since it starts with the square brackets. Secondly, in each comment in the array, we do not repeat information like owner_name or owner_profile_picture_url. We make a safe assumption that these details are available in a single Profiles table which has the profile details of each user in the system. So far so good right? It absolutely seems like the perfect spot to use NoSQL JSONB datatype. But there are some problems we will encounter later when we get down to build usable APIs against our table that need to be consumed by our front end apps.

Problem 1

How do I get the owner_name and owner_profile_picture_url for each “owner” of the comment? In the traditional world of RDBMS, it would be a simple join. That is what we will do with the owner_id in the Posts table.

SELECT 
    o.name, o.profile_picture_url, p.post_text
FROM
    posts p INNER JOIN profiles o ON p.owner_id = o.id

Let us now see how we would do something similar with the comments JSONB array. But before we get to the array, we must see what difference it would have made if instead of being an array it was a single comment.

{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
}

In this case we would use one of the arrow selectors part of PostgreSQL, and do something like this:

SELECT 
 o.name, o.profile_picture_url, p.post_text, 
 p.comments->>'comment' as comment,
 co.name as comment_owner_name, co.profile_picture_url as comment_owner_profile_picture_url
FROM
 posts p INNER JOIN profiles o ON p.owner_id = o.id
 LEFT JOIN profiles co ON p.comments->>'owner' = co.id

Does that give you a fair idea of how to pick a field from a JSONB column? Therefore, if you have a meta JSONB column in a table, you could extend the meta column with additional future columns and never have to change the schema itself!

But the above hasn’t solved our problem yet, i.e. how do we do the same for a JSONB array. But before that we need to at a few more basic things with it. For instance, selecting is one thing, how do we filter based on the JSONB array? Let us look at that next.

Searching Through a JSONB Array

The simplest way to search a JSONB array from within a SQL query is to use the @> operator. This operator lets you find through individual elements in an array. Let us presume a facebook style ‘likes’ column on the posts table. Instead of solving it with a traditional RDBMS style approach where we introduce a likes table with the ids of all the users who have liked a certain post with the post_id foreign key, we will rather use a JSONB array to store all the owner IDs right within the posts table. In the example below, the post with id 1 has been liked with profiles with id 18 and 4.

id      |  owner_id  |  post_text                                                                |   comments   | likes
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]                      | [18, 4]

Now what if we need to find all the posts which have been liked by the user with id 4, we would need to do something like this –

SELECT * FROM posts p WHERE likes @> '[4]'::jsonb

We have to take explicit care about the operator, operands and the data types of the operands. It is very easy to get sucked down the rabbit hole if you miss these details.

Problem 2

How do we join the details from a JSONB array with the concerned tables and fetch more details like name of the user or their profile picture etc. ? We left a similar question unanswered in Problem 1 which we will find answers to now.

The jsonb_array_elements function is almost a must-use when we are down to opening up an array, picking up the elements we need and joining them with another table. What this function does, is to open and break up each element in the array as if it was a cell in a set of rows. So if you were to do something like this –

SELECT jsonb_array_elements(likes) as profile_id FROM posts
WHERE posts.id = 1;

Would return you something like this –

|    profile_id   |
——————————————-
18
4

Now we can use this to join on the profiles table and fetch the user details. How we choose to do it is up to us individual developers, but here is simple way of doing this

SELECT 
    p.name, p.profile_picture_url
FROM
    profiles p INNER JOIN
    (SELECT jsonb_array_elements(likes) as profile_id FROM posts
     WHERE posts.id = 1) pid
     ON pid.profile_id = p.id

This is extremely powerful. It simplifies our otherwise complex ER Diagram by an order of magnitude. I no more need redundant data or several tables that are a result of normalization and have id columns spread all over. Things can be kept concise as long as you are cognizant of your NoSQL and RDBMS hybrid design. Keep in mind NoSQL is by definition is not enforcing which means that you could throw in anything out of place in the JSONB columns and it wouldn’t complain. So an obvious design choice would be to use relational data-model where it makes sense and use NoSQL where it doesn’t.

For our final leg in this post, we will find the answer to the question posed in Problem 1, i.e. how can we modify the JSON output so that it includes all of the information needed to send out to our consumers e.g. an API service.

Modifying Your JSONB Response

Being able to search through and connect data with JSONB is one step in the right direction. However being able to turn around concise information in modern data formats like JSON right from within your queries is what we are gunning for. The simplest way to achieve this is using PLV8 which is a native Javascript programming environment. You can modify JSON objects just the way you would in web environments.

But in order to be cognizant of performance, it pays off to learn some of the functions in Postgres that let you modify a JSON/JSONB column on the fly during a query. Let us revisit the problem we left unsolved in Problem 1 and also return the names and profile picture URLs for the people who have commented on a post. For this we will use the function called jsonb_set.

SYNTAX: jsonb_set

jsonb_set(target jsonb, path text[], new_value jsonb[,create_missing boolean])

Once you get a hang of using jsonb_set, you can manipulate JSONB objects right from within a query. It’s time to solve our problem –

SELECT 
       jsonb_set(p.comments, '{name}', ('"' || pf.name || '"')::jsonb, true) as comments
FROM
     profiles pf INNER JOIN (SELECT
                                      jsonb_array_elements(comments) as comments
                               FROM posts) p
         ON p.comments->>'owner' = pf.id::text

RESULT:
{"name": "chris", "owner": 1, "comment": "i like it"}
{"name": "abi", "owner": 2, "comment": "me too!"}

In the above solution, we added a name field to the JSONB comment. If you notice this combines the solutions from both Problem 1 and 2 to produce the right JSONB output. Just the way we added the comment owners name to the JSONB we can fetch as many details as we like and append to create a formatted JSON that is just the way our end consumer apps want!

Now venture out into the world of NoSQL and Javascript PLV8 and tell me if it enhances or spices up a relational DB setup. Bon-voyage mes amis!

Leave a Reply

Your email address will not be published. Required fields are marked *