Hstore vs. JSON - Which to Use in Postgres
If you’re deciding what to put in Postgres and what not to, consider that Postgres can be a perfectly good schema-less database. Of course as soon as people realized this then the common comes a question, is hstore or JSON better. Which do I use and in what cases. Well first, if you’re not familiar check out some previous material on them:
- hstore on PostgresGuide
- hstore in Postgres docs
- hstore with Django
- JSON datatype
If you’re already up to date with both of them, but still wondering which to use lets dig in.
hstore is a key value store directly within your database. Its been a common favorite of mine and has been for some time. hstore gives you flexibility when working with your schema, as you don’t have to define models ahead of time. Though its two big limitations are that 1. it only deals with text and 2. its not a full document store meaning you can’t nest objects.
Though major benefits of hstore include the ability to index on it, robust support for various operators, and of course the obvious of flexibility with your data. Some of the basic operators available include:
Return the value from column
foo for key
Does the specified column
foo contain a key
Does the specified column
foo contain a value of
baz for key
Perhaps one of the best parts of hstore is that you can index on it. In particular Postgres
gist indexes allow you to index all keys and values within an hstore. A talk by Christophe Pettus of PgExperts actually highlights some performance details of hstore with indexes. To give away the big punchline in several cases hstore with gin/gist beats mongodb in performance.
JSON in contrast to hstore is a full document datatype. In addition to nesting objects you have support for more than just text (read numbers). As you insert JSON into Postgres it will automatically ensure its valid JSON and error if its well not. JSON gets a lot better come Postgres 9.3 as well with some built in operators. Though if you need more functionality in it today you should look at PLV8.
Which to Use
So which do you actually want to use in your application? If you’re already using JSON and simply want to store it in your database then the JSON datatype is often the correct pick. However, if you’re just looking for flexibility with your data model then hstore is likely the path you want to take. hstore will give you much of the flexibility you want as well as a good ability to query your data in a performant manner. Of course much of this starts to change in Postgres 9.3.