PostgreSQL – NoSQL and Javascript Functions with PLV8 – Part 2

PostgreSQL and it’s NoSQL features are awesome, right? If you don’t know what I am talking about, you should check my previous post to familiarize yourself with an introduction to NoSQL in Postgres. While it doesn’t go in depth of the NoSQL features, it does get you started much faster knowing that you can manipulate and operate on JSONB/JSON data using the Javascript language.

In this post we are going to talk about a bit more of the NoSQL features and functions that we can use to navigate our way around NoSQL columns. Some of the things we are going to look at:

  • Making JSONB/JSON columns part of our SELECT or UPDATE (SQL) queries
  • Modifying JSONB fields partially (or filling in details that need to come from other relational fields)

Querying JSONB/JSON NoSQL columns

In the simplest of use cases, you will find a need to select your rows based on a fields value within the JSONB. For the purpose of our exercise, let us assume a table and possible values of the column cells.

Posts

id      |  owner_id  |  post_text                                                                |   comments
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]

In the above example, we have a Posts table which has a column called comments of type JSONB. Why JSONB? Because then we don’t need to have a separate table called comments which has a post_id column with foreign key reference to our posts table. The comments JSON structure could look like the following:

[{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
},
{
   "comment": "Where I live, Mongo rules!",
   "timestamp": 1234567890,
   "owner": 18
}]

The above structure, if you notice, is firstly a JSON array since it starts with the square brackets. Secondly, in each comment in the array, we do not repeat information like owner_name or owner_profile_picture_url. We make a safe assumption that these details are available in a single Profiles table which has the profile details of each user in the system. So far so good right? It absolutely seems like the perfect spot to use NoSQL JSONB datatype. But there are some problems we will encounter later when we get down to build usable APIs against our table that need to be consumed by our front end apps.

Problem 1

How do I get the owner_name and owner_profile_picture_url for each “owner” of the comment? In the traditional world of RDBMS, it would be a simple join. That is what we will do with the owner_id in the Posts table.

SELECT 
    o.name, o.profile_picture_url, p.post_text
FROM
    posts p INNER JOIN profiles o ON p.owner_id = o.id

Let us now see how we would do something similar with the comments JSONB array. But before we get to the array, we must see what difference it would have made if instead of being an array it was a single comment.

{
   "comment": "Yes indeed it's fun",
   "timestamp": 1234567890,
   "owner": 12
}

In this case we would use one of the arrow selectors part of PostgreSQL, and do something like this:

SELECT 
 o.name, o.profile_picture_url, p.post_text, 
 p.comments->>'comment' as comment,
 co.name as comment_owner_name, co.profile_picture_url as comment_owner_profile_picture_url
FROM
 posts p INNER JOIN profiles o ON p.owner_id = o.id
 LEFT JOIN profiles co ON p.comments->>'owner' = co.id

Does that give you a fair idea of how to pick a field from a JSONB column? Therefore, if you have a meta JSONB column in a table, you could extend the meta column with additional future columns and never have to change the schema itself!

But the above hasn’t solved our problem yet, i.e. how do we do the same for a JSONB array. But before that we need to at a few more basic things with it. For instance, selecting is one thing, how do we filter based on the JSONB array? Let us look at that next.

Searching Through a JSONB Array

The simplest way to search a JSONB array from within a SQL query is to use the @> operator. This operator lets you find through individual elements in an array. Let us presume a facebook style ‘likes’ column on the posts table. Instead of solving it with a traditional RDBMS style approach where we introduce a likes table with the ids of all the users who have liked a certain post with the post_id foreign key, we will rather use a JSONB array to store all the owner IDs right within the posts table. In the example below, the post with id 1 has been liked with profiles with id 18 and 4.

id      |  owner_id  |  post_text                                                                |   comments   | likes
———————————————————————————————–
1        |  27                 |  PostgreSQL is fun!                                            |   [ ]                      | [18, 4]

Now what if we need to find all the posts which have been liked by the user with id 4, we would need to do something like this –

SELECT * FROM posts p WHERE likes @> '[4]'::jsonb

We have to take explicit care about the operator, operands and the data types of the operands. It is very easy to get sucked down the rabbit hole if you miss these details.

Problem 2

How do we join the details from a JSONB array with the concerned tables and fetch more details like name of the user or their profile picture etc. ? We left a similar question unanswered in Problem 1 which we will find answers to now.

The jsonb_array_elements function is almost a must-use when we are down to opening up an array, picking up the elements we need and joining them with another table. What this function does, is to open and break up each element in the array as if it was a cell in a set of rows. So if you were to do something like this –

SELECT jsonb_array_elements(likes) as profile_id FROM posts
WHERE posts.id = 1;

Would return you something like this –

|    profile_id   |
——————————————-
18
4

Now we can use this to join on the profiles table and fetch the user details. How we choose to do it is up to us individual developers, but here is simple way of doing this

SELECT 
    p.name, p.profile_picture_url
FROM
    profiles p INNER JOIN
    (SELECT jsonb_array_elements(likes) as profile_id FROM posts
     WHERE posts.id = 1) pid
     ON pid.profile_id = p.id

This is extremely powerful. It simplifies our otherwise complex ER Diagram by an order of magnitude. I no more need redundant data or several tables that are a result of normalization and have id columns spread all over. Things can be kept concise as long as you are cognizant of your NoSQL and RDBMS hybrid design. Keep in mind NoSQL is by definition is not enforcing which means that you could throw in anything out of place in the JSONB columns and it wouldn’t complain. So an obvious design choice would be to use relational data-model where it makes sense and use NoSQL where it doesn’t.

For our final leg in this post, we will find the answer to the question posed in Problem 1, i.e. how can we modify the JSON output so that it includes all of the information needed to send out to our consumers e.g. an API service.

Modifying Your JSONB Response

Being able to search through and connect data with JSONB is one step in the right direction. However being able to turn around concise information in modern data formats like JSON right from within your queries is what we are gunning for. The simplest way to achieve this is using PLV8 which is a native Javascript programming environment. You can modify JSON objects just the way you would in web environments.

But in order to be cognizant of performance, it pays off to learn some of the functions in Postgres that let you modify a JSON/JSONB column on the fly during a query. Let us revisit the problem we left unsolved in Problem 1 and also return the names and profile picture URLs for the people who have commented on a post. For this we will use the function called jsonb_set.

SYNTAX: jsonb_set

jsonb_set(target jsonb, path text[], new_value jsonb[,create_missing boolean])

Once you get a hang of using jsonb_set, you can manipulate JSONB objects right from within a query. It’s time to solve our problem –

SELECT 
       jsonb_set(p.comments, '{name}', ('"' || pf.name || '"')::jsonb, true) as comments
FROM
     profiles pf INNER JOIN (SELECT
                                      jsonb_array_elements(comments) as comments
                               FROM posts) p
         ON p.comments->>'owner' = pf.id::text

RESULT:
{"name": "chris", "owner": 1, "comment": "i like it"}
{"name": "abi", "owner": 2, "comment": "me too!"}

In the above solution, we added a name field to the JSONB comment. If you notice this combines the solutions from both Problem 1 and 2 to produce the right JSONB output. Just the way we added the comment owners name to the JSONB we can fetch as many details as we like and append to create a formatted JSON that is just the way our end consumer apps want!

Now venture out into the world of NoSQL and Javascript PLV8 and tell me if it enhances or spices up a relational DB setup. Bon-voyage mes amis!

PostgreSQL – Writing Javascript Functions in PLV8 with NoSQL

PostgreSQL is awesome, right?! We are doing our fourth successful project with PostgreSQL 10.3 as our Data Persistence Layer (database) and with each implementation, we are loving it. At first, it was the NoSQL features within a relational environment that got us hooked. It just is so much easier to convince the dinosaurs (old techies in their post 50s who tend to have an adverse opinion on any new tech) to go flirt with NoSQL. In my experience, as long as we stayed in their comfort zones by keeping 90% of our Data Model relational, and only about 10% of NoSQL in our structure – everyone was happy.

How did we introduce NoSQL in the traditional world of RDBMS developers?

The first thing we did, was to add a “meta” column with JSONB type to almost every table. It was almost invisible to the naked eye on an ER diagram. No one bothered to ask about a column named “meta” much at first. That changed drastically over time. Here is an example. Any time someone realised that they actually needed a many to many relationship between two tables, we would lap up the opportunity to show off what NoSQL could achieve with minimal amount of changes – and with elegance. The neglected poor old “meta” column that had spent most of it’s lifetime remaining “null” now sprung into action and solved a real world problem. To take an example, assume two tables, one called Restaurants and another called Menus. Initially we designed the system believing that a restaurant can have multiple menus (while in real life it’s only a single menu for most) so we addressed the issue by having a one to many relationship between Restaurants and Menus – i.e. one Restaurant could have several Menus. As time went on, we encountered a client who had several restaurants and each restaurant had several menus (depending on what time of the day you went there). Now unfortunately our old fashioned approach needed work-arounds to solve this problem because there was no easy way to make the a menu be a part of several restaurants. We decided to solve it using two approaches. The first one was a traditional crosslink table. The second was adding a JSON array field into the meta JSON called restaurants in the menus table and vice versa in the restaurants table. I won’t go into much detail, but you already get the idea about which solution was more elegant. NoSQL clearly won the preference.

PLV8 JavaScript Functions

OK now we are ready to dive into the world of NoSQL inside of JavaScript and look at what PLV8 can do for us. The biggest criticism we would usually take for adding NoSQL into our Data Model was about how non-standard and cumbersome it was for us to use PostgreSQL JSON functions to play around with the NoSQL data. And yes it isn’t pretty and neither is it a standardized approach that someone from the world of Oracle or SQL Server could easily familiarise themselves with. Say hello to the PLV8 extension! Now we have a standard programming language called JavaScript that is understood and known by a large group of developers. The adventurous kinds in the area of RDBMS have at some point or the other dipped their toes in NoSQL and encountered JavaScript along the way. Those were the ones I convinced on exploring PLV8 – and eureka! – in a short amount of time we had a good chunk of functions written in JavaScript living right beside the traditional PL/pgSQL functions.

OK enough, show me how it’s done

Step 1.
Add the PLV8 extension to your PostgreSQL database.

CREATE EXTENSION plv8;

Step 2.
Write your first function!

CREATE OR REPLACE FUNCTION public.getreviews(postid bigint)
 RETURNS json
 LANGUAGE 'plv8'

COST 100
 VOLATILE 
AS $BODY$


var plan = plv8.prepare('SELECT COALESCE(reviews, \'[]\'::jsonb) as reviews FROM posts WHERE id = $1', ['bigint']);
 var reviews = plan.execute([i_postsid])[0].reviews;
 for (var index = 0; index < reviews.length; index++) {
    var review = reviews[index];
    //do something with your review object
 }
 return reviews;

$BODY$;

What the above function achieves is simply take a postid and return all the reviews in a NoSQL field. But if you are a JavaScript junkie, then you already know how to open the pandora’s box now! You can manipulate the JSON way more easily compared to using inbuilt JSON functions in PostgreSQL and pass it around. In the above example, note the few things of importance. Number one is the plv8 object which acts as our bridge to the PostgreSQL database. Second is the fact that your regular PL/pgSQL is no more a first class citizen within those $BODY$ start and end markers. We have gone JavaScript!

I have kept this short to serve as an introduction and motivational to help interested developers push the NoSQL agenda. Cheers!

India-China Standoff – Checkmate – Part 2

In this post, we will further explore the strategic implications of the ongoing standoff and look at the broader perspective. We covered initial ground in the part 1 of this post which you can read via the link.

It’s all about CPEC (OBOR)

Analysts have been speculating about the reason and timing of the Doklam standoff for more than two months now. Several scenario outlines pointed out in my last post have become reality, for example the statement from Chinese military that they don’t foresee a military engagement specially the short border war that the Chinese media had been hyping up. Is that the acceptance of defeat we have been looking for? Not quite. Doklam for both China and India is not as important as the CPEC and both countries are at poles end on the subject. China understands how critical it is for them to bypass the Indian Ocean in Indian control to keep their economic growth in the right trajectory. At the same time, for India it is violation of the disputed territory of Kashmir which India claims in entirety. China has been “Salami Slicing” it’s way into Kashmir and anyone with enough interest can check border regions with China on Google maps in Satellite view and clearly see how China is encroaching upon Indian territory year by year. Go zoom in on Demchok and you will see their roads being built well inside undisputed Indian territory. Why are the Chinese doing this? It’s all about CPEC. China understands that to continue its dominant economic rise it needs to export more and export faster and of course, cheaper is always better. CPEC forms a large part of that equation. It reduces it’s bottleneck on South China Sea and Indian Ocean’s straits of Malacca. Looking at the other way round, oil imports of China go predominantly through the straits of Malacca, a spot the Indian Navy dominates due to its strategic positioning in the Andamans.

Why does CPEC terrify India?

Not in 5 years, not in 10 years however in 20 years CPEC can eventually become the lasso on India’s neck. Anytime India does something against Chinese whims and fancies, the lasso can be tightened and India can be brought to it’s knees. China’s biggest pain point in its North Western theatre is supply lines and local support. CPEC changes that by making the region economically active and more and more establishments sprouting up to support the supply lines. The local population which is largely anti-China then starts tilting more towards Beijing due to the economics of scale. If ever China decides to help and support Pakistan, India would stand lesser chances to last than it does today.

What can India do today?

India has already done something today to at least begin addressing the issue. Firstly, it has called China’s frenemy bluff. China has spewed venom all over India via its state controlled media exposing the dragons true intent all along. China has always considered India a rival and always considered Pakistan a puny pawn in the game. Its philosophy has throughout been that of keeping India at bay letting it tackle Pakistan while it itself gets to silently upgrade infrastructure and boost trade. If India has to rise, it has to take the dragon head on, and the Doklam standoff is towards that eventual goal.

How will things turn out from here on?

China will remember Doklam. It will certainly loose the Doklam battle but it won’t forget it that easy. Depending on Xi’s fate, China will either respond militarily in another sector where it has the military advantage or it will make mends to show the world, and slowly light up the “insurgency fire” in the north eastern states. It would get its little pawn in the game called Pakistan to play terrorism in the Kashmir valley.

What if there is a Short-war?

Both sides will be affected even if it’s a short war. China’s economy maybe the second largest but it is in tatters. The GDP to Debt ratio is unrealistically piling up on them and the gamble can totally backfire. If China had to do a short war, it would have done it by now. It knows the risk. Very less is known about the real state of Chinese economy but it is by no means doing good. Figures known to the public tell a grim story. Is it a balloon set to explode? Possibly. But so is India’s. India isn’t doing all that well itself, but clearly in a far superior position than that of China. Militarily China will loose and there is a simple explanation to this. India is positioned strategically, and has supply lines closer to mainland. China on the other hand will have to fetch supplies from exorbitant distances, rendering it easier for India to break its supply lines. In fact Doklam standoff is just to ensure India’s supply lines are never lost control of.

Even more, India may want to use a Chinese first shot aggression as an excuse to render CPEC unstable and useless. How? Siachen. We do control the highest battle grounds in the region and could use that to our advantage giving the Chinese a tough time with CPEC. If India can stay afoot on it’s recent statement about re-acquiring PoK, and put the plans in action, it would be a disaster for China. It would loose straits of Malacca as well as CPEC leading to an eventual crash of the Chinese economy. It would be far more dangerous for China to attack India than merely using it as rhetoric.

Conclusion

While India may rest assured of victorious outcome in a short war, it does need to worry about China’s increasing military supplies to insurgents and the Maoists. A proxy war is what India is really bad at fighting. We will have to invent and invest into our own bag of proxy wars against China specially with Tibet and Xinjuang. We will then have to brace for the long haul and patiently wait for India to fight it out. Highly unlikely if the Modi government succumbs to defeat in the 2019 elections, though.

India-China Standoff – Checkmate – Part 1

India and China have been on a military standoff in the Doklam region and tri-junction area in the Sikkim Sector. The disagreement from China’s angle is that Indian troops entered undisputed Chinese territory and blocked their road construction. They also claim that the Doklam region is disputed between itself and Bhutan and India has no part to play in a bilateral dispute. India’s angle is not very clear to the public apart from defending the Siliguri corridor, also known as the Chicken’s neck. Arguments from both the sides, in my humble opinion, are foolish because both governments have strategic objectives. For China, building a road leading to no-where, is foolish and serves no other purpose than sending military equipment out to the front more quickly and effectively. India too, is giving unwarranted importance to the Doklam plateau because in the event of a full scale military confrontation, China would loose Doklam first due to clear advantages that the Indian military has in the region.

It is imperative we look at what the geography of this region looks like and what may or can happen in case of a full confrontation or a limited tactical confrontation.

India - China border standoff 2017
India – China border standoff 2017

The interesting bit is, that this standoff has two very distinct sides, one public, and the other private yet strategic. We will be exploring the strategic angle in this post series, because in all honesty, that is the angle government is unable to disclose publicly, yet is what it has in mind as the primary objective. Being no expert by any means on military strategy, I would still like to make a humble effort at decoding the strategic angle. Here goes the outline from India’s perspective:

  • A strong protest by India (read Prime Minister Modi) against China’s dual play – play with us or play against us – pick a camp and stick to it
  • We will block any attempts by Chinese at “Salami-Slicing” tactics
  • If we can’t have a full scale confrontation with Pakistan due to Nuclear deterrence issues, neither can China with India – hence – the checkmate
  • Force China to up the ante and explore all confrontation options – come to the realisation it can’t do much – and then get to the negotiation table to re-build the entire partnership
  • Realising a full scale confrontation will destroy India and severely damage China, tame the dragon to play along bilaterally, because China isn’t a Kamikaze state but a mature world power
  • Having a humungous trade deficit between the two countries, let China know it has the lions share to loose even in case of a limited conflict war-theatre
  • Send a message to all countries having disputes with China specially in the South China Sea, that the dragon has met it’s match in India, regardless of Chinese rhetoric about how inferior the Indian Military is in comparison to the Chinese Military
  • Exploit the negative rhetoric against China in India to push the manufacturing sector in India
  • Let China be very well reminded of how upset (and terrified) India is of the CPEC/OBOR because of the easy troop movement it allows for China en-circling India
  • Let the reverse of the above also be known that it is China that will have much to loose if India starts exploring pre-emptive blocking strategies against CPEC/OBOR

I hope to cover each of the strategic objectives in detail in subsequent posts and play out mock confrontation scenarios. Much of this has probably already happened and discussed behind closed doors on both sides, however it is not in the open or public domain.

 

And I am back!

Fellas! My blog is back after being dead for more than a year. It took me some maintenance and technical fixing to get the website back up and running. I intend to be more active with my blogging from now on, and talk about subjects that I care about from the bottom of my heart. The latest edition to my loved subjects is Artificial Intelligence (AI) and my work with a new startup called Karmaloop AI (www.karmaloop.ai)

Post your comments!

C# dynamic objects

The best use of dynamic variables in C# I have found in use with deserializing JSON data for which I may not have the schema, or I know is a frequently changing schema.
For everything else I just avoid using it.

What do I use? Well, NewtonSoft’s JSON deserializer and this simple statement:

dynamic apiData = JsonConvert.Deserialize<dynamic>(jsonData);

Simple and neat.
#CSharp #dynamicvariables

Linux Mint / Ubuntu – Beats Audio on HP Laptop

I am glad someone figured it out!! I will repost him so that all those stuck with crappy sound without Beats on their HP laptops while using Ubuntu get a breather.

Please note – if you are using Ubuntu 13.10 or above, you do not need to install hda-jack-retask separately, its a part of the alsa package. Install alsa-tools-gui in that case using the standard software manager.

Follow these steps (skip installing hda-jack-retask if Ubuntu 13.10 or higher)

OK! I figured it out! It sounds *awesome*!

Step 1: Install hda-jack-retask from here: https://launchpad.net/~diwic/+archive/hda (ppa:diwic/hda)

Step 2: Open hda-jack-retask

Step 3: Select the IDT 92HD91BXX codec (may be different on other models)

Step 4: Check the “Show unconnected pins” box (the internal speakers do not show as connected)

Step 5: Remap 0x0d (Internal Speaker, Front side) to “Internal speaker”

Step 6: Remap 0x0f (“Not connected” but is the under-display speakers) to “Internal speaker”

Step 7: Remap 0x10 (“Not connected” but is the subwoofer) to “Internal speaker (LFE)”

Step 8: Apply now, then test with your favorite audio program (some may not work due to Pulse reset, so find one that does, verify sound is coming from all speakers).

Step 9: If it works, select “Install boot override” to save the settings to apply at boot time.

Step 10: Reboot. When it comes back, you should have full sound from all speakers. Also test headphones. Plugging in headphones should disable sound from all internal speakers.

 

This worked awesome on my laptop! If you have questions just post in comments here.

Few tips on improving speed of your MongoDB database

Those of you who have done a project with MongoDB will notice that it functions and behaves quite differently than traditional RDBMS systems. From super fast queries to all of a sudden taking forever to return 10 documents is something beginners always face with MongoDB. I am no expert but these are the steps I took and Mongo worked much nicer than it had earlier.

  • Configure “Mongod” to run as a service Many beginners make this mistake and its a very common one. Make sure you run it as a service which allows MongoDB to do better performance management and handle incoming queries a lot better.
  • Indexing This should not even need to be mentioned, but with Mongo don’t do a blind indexing. Think of the fields you group the documents the most in your queries and set the indexes with that in mind. This will do a lot to speed up your MongoDB
  • Start using _id Again this is something people do a lot, i.e. they don’t use the inbuilt _id field. You should using that over your own ids. Since it’s an ObjectID, it indexes better and is truly unique reducing programmer headache of creating unique id fields
  • Create a re-indexer service Like any other database MongoDB needs to be re-indexed occasionally. One of the easiest ways is to create a daemon or service in your favorite language and make it do some maintenance like re-indexing and data cleanups.
  • Implement Paging in your queries This is good to do in most projects. When showing large data sets, try to page your data so that you only show enough to start with, and then fetch more as you go. Mongo has an advantage over other databases in this regard in terms of speed. Please keep in mind the field you page on is a unique index

So these are a few observations I had while designing my project in MongoDB. I will be adding more improvement techniques as I go forward. If you think some of my above points are erroneous do let me know. Also share your tricks with me!

Secret of the Romani People

A few years back when I used to be big on finding out about the origins of Indo-European people/community, I had learned about the gypsies tracing their roots back to India. Yesterday I stumbled upon the history of Romani people who are a part of the gypsy tribe and have lived in Europe for centuries not really knowing their place of origin. It will be surprising to a lot of readers that the Romani people have been genetically proven to be a race of the North-Middle-Indian territory but have lived across Europe for at least a 1000 years. The language they speak is very closely related to Hindustani or the Hindi language of India which is another startling fact.

What probably remains a big mystery is how did the Romanis or the Gypsys get to Europe from India? What made them forget the land they came from and what made them never make an effort to go back. It is a mystery even the Romani people may not know the answer to. Someday their history and their historic connection to India will be lost in the annals of time and it is unlikely if anyone will be interested in finding out the real reasons.

Do the Romani people consider themselves Indians and would they ever want to come back to India? Hmm interesting question and only a Romani connected to the roots can answer…