Why PostgreSQL Part 2
This post is a list of many of the reasons to use Postgres, much this content as well as how to use these features will later be curated within PostgresGuide.com. If you need to get started check out Postgres.app for Mac, or get a Cloud instance at Heroku Postgres for free
Last week I did a post on the many reasons to use Postgres. My goal with the post was two fold:
- Call out some of the historical arguments against it that don’t hold any more
- Highlight some of the awesome but more unique features that are less commonly found in databases.
Once I published the post it was clear and was immediately pointed out in the comments and on hacker news that I missed quite a few features that I’d mostly come to take for granted. Perhaps this is due to so much awesomeness existing within Postgres. A large thanks to everyone for calling these out. To help consolidate many of these, here’s a second list of the many reasons to use PostgreSQL:
Create Index Concurrently
On most traditional databases when you create an index it holds a lock on the table while it creates the index. This means that the table is more or less useless during that time. When you’re starting out this isn’t a problem, but as your data grows and you then add indexes later to improve performance it could mean downtime just to add an index. Not surprisingly Postgres has a great means of adding an index without holding that lock. Simply doing
CREATE INDEX CONCURRENTLY instead of
CREATE INDEX will create your index without holding the lock.
Of course with many features there are caveats, in the case of creating your index concurrently it does take somewhere on the order of 2-3 times longer, and cannot be done within a transaction
If you’ve ever run a migration had something break mid-way, either due to a constraint or some other means you understand what pain can come of quickly untangling such. Typically migrations on a schema are intended to be run holistically and if they fail you want to fully rollback. Some other databases such as Oracle in recent versions and SQL server do support, this. And of course Postgres supports wrapping your DDL inside a transaction. This means if an error does occur you can simply rollback and have the previous DDL statements rolled back with it, leaving your schema migrations as safe as your data, and your application in a consistent state.
Foreign Data Wrappers
I talked before about other languages within your database such as Ruby or Python, but what if you wanted to talk to other databases from your database. Postgres’s Foreign Data Wrapper allows you to fully wrap external data systems and join on them in a similar fashion to as if they existed locally within the database. Here’s a sampling of just a few of the foreign data wrappers that exist:
Conditional Constraints and Partial Indexes
In a similar fashion to affecting only part of your data you may care about an index on only a portion of your data. Or you may care about placing a constraint only where a certain condition is true. Take an example case of the white pages. Within the white pages you only have one active address, but you’ve had multiple ones over recent years. You likely wouldn’t care about the past addresses being indexed, but would want everyones current address to be indexed. With Partial Indexes becomes simple and straight forward:
Postgres in the Cloud
Postgres has been chosen by individual shops and been proven to scale by places such as Instagram and Disqus. Perhaps even more importantly it’s becoming easy to use PostgreSQL due to the many clouds that are running Postgres as a Service, such as:
Full disclosure, I work at Heroku, and am also a large fan of their database service
If you want to use your database as a queue there’s some cases where it just won’t work, as heavily discussed in a recent write-up. However, much of this could be discarded if you included Postgres in this discussion due to Listen/Notify. Postgres will allow you to LISTEN to an event and of course NOTIFY for when the event has occurred. A great example of this in action is Ryan Smith’s Queue Classic.
Fast column addition/removal
Want to add a column or remove one. With millions of records this modification in some databases could take seconds or even minutes, in cases I’ve even heard horror stories of adding a column taking hours. With Postgres this is a near immediate action. The only time you pay a higher price is when you choose to write default data to a new column.
Want inheritance in your database just like you have in with models inside your application code? Not a problem for Postgres. You can have one table easily inherit for another, leaving a cleaner data model within your database while still giving all the flexibility you’d like on your data model. The Postgres docs on DDL Inheritance do a great job of documenting how to use this and giving a very simple but clear use case.
Per transaction synchronous replication
The default mode for Postgres streaming replication is asynchronous. This works well when you want to maintain performance, but also care about your data. There are cases where you want your replication to be synchronous though. Furthermore, for some cases asynchronous may work well enough where as other data you may care more about the data and want synchronous replication, within the same database. For example, if you care about user sign-ups and purchases, but ratings of products and comments is less important Postgres provides the ability to treat them as such. With Postgres you can have per transaction synchronous replication, this means you could have strong data guarantee on your user and purchase transactions, and less guarantees on others. This means you only pay the extra performance cost where you really care about versus an all or nothing approach you have with other databases.
Hopefully you’re convinced on why Postgres is a great tool, if not take a look back at my previous post.