Craig Kerstiens

Notice: Much of this post still applies, but now applies more directly to Citus. Since this post was originally published, pg_shard has been deprecated. You can find further guidance on sharding on the Citus blog and docs.

Back in 2012 I wrote an overview of database sharding. Since then I’ve had a few questions about it, which have really increased in frequency over the last two months. As a result I thought I’d do a deeper dive with some actual hands-on sharding. For this hands-on, though, because I do value my time, I’m going to take advantage of pg_shard rather than creating mechanisms from scratch.

For those unfamiliar, pg_shard is an open source extension from Citus Data, who also have a commercial product that you can think of as pg_shard++ (and probably much more). pg_shard adds a little extra to let data automatically distribute to other Postgres tables (logical shards) and Postgres databases/instances (physical shards), thus letting you outgrow a single Postgres node pretty simply.
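To make the logical/physical distinction concrete, here is a minimal Python sketch of hash-based routing. It is not pg_shard’s actual hashing or placement logic; the shard count, the node addresses, the `events_` table prefix and the function names are all assumptions made up for illustration.

```python
import hashlib

# Assumed values for illustration only; not pg_shard defaults.
LOGICAL_SHARDS = 16                      # number of logical shards (tables)
NODES = ["node-a:5432", "node-b:5432"]   # hypothetical physical Postgres instances


def logical_shard(key: str, shard_count: int = LOGICAL_SHARDS) -> int:
    """Hash the distribution key and map it to a logical shard id."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % shard_count


def physical_node(shard_id: int, nodes=NODES) -> str:
    """Place each logical shard on a physical node (simple round-robin)."""
    return nodes[shard_id % len(nodes)]


def route(key: str):
    """Return the (shard table name, node) a row with this key would live on."""
    shard_id = logical_shard(key)
    return f"events_{shard_id:04d}", physical_node(shard_id)


if __name__ == "__main__":
    for customer in ("customer-1", "customer-2", "customer-3"):
        print(customer, "->", route(customer))
```

Because there are more logical shards than nodes, adding a node later means moving whole shard tables rather than re-hashing every row, which is the usual reason for keeping the two layers separate.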

Alright, enough talk about it, let’s get things up and running.

What being a PM is really like - Software is easy, People are hard

In recent months I’ve been asked nearly once a week for advice/tips on becoming a Product Manager, more commonly referred to as a PM. The questions generally come from people who are either currently engineers, or previously were and are now in some engineer/customer role such as sales engineer or solution architect. There are a number of high-level pieces talking about PM and it often sounds glorious. I mean, you get to make product decisions, right? You get to call some shots. Well, that may sometimes be true, but don’t assume it’s all rainbows and sparkles.

Especially as a first-time PM, your day to day won’t be debating strategy all day long. Here are a few of the good and the bad sides of being a PM.

Marketing definitions for developers

Marketing often feels like a dirty, icky thing to many developers. Well, until you feel like you have a great product but no one is using it; then you have to get a crash course in all of that. And while I might cover some of the actual basics in the future, just knowing what marketing people actually mean when they’re talking can be a huge jump start. Here’s a guide that distills many of the acronyms and terms down to what they actually mean in reality.

Writing more legible SQL

A number of times I’ve asked a crowd how many people enjoy writing SQL, and often there’s a person or two. The follow-up is how many people enjoy reading other people’s SQL, and that’s unanimously 0. The reason for this is that so many people write bad SQL. It’s not that it doesn’t do the job, it’s just that people don’t tend to treat SQL the same as other languages and don’t follow strong code formatting guidelines. So, of course, here are some of my own recommendations on how to make SQL more readable.

My top 10 Postgres features and tips for 2016

I find that during the holiday season many pick up new books, learn a new language, or brush up on some other skill in general. Here’s my contribution to hopefully giving you a few new things to learn about Postgres and ideally utilize in the new year. It’s not a top 10 list so much as 10 tips and tricks you should be aware of, because when you need them they become incredibly handy. But first, a shameless plug: if you find any of the following helpful, consider subscribing to Postgres Weekly, a weekly newsletter with interesting Postgres content.

Postgres 9.5 - The feature rundown

The headline of Postgres 9.5 is undoubtedly Insert… on conflict do nothing/update, more commonly known as Upsert or Merge. This closes one of the last remaining feature gaps that other databases had over Postgres. Sure, we’ll take a look at it, but first let’s browse through some of the other features you can look forward to when Postgres 9.5 lands:

Going from blog posts to full launches

I recall the extremely early stage where you’d build a feature, realize it was awesome, then the next day write a blog post for it. At some point you start to move from that to more coordinated launches. A larger coordinated launch allows you to reach a bigger audience, can lead to bigger deals, and can help expand your overall market. But perhaps more importantly, by the time you hit full launch you’ve message tested and ensured it’s going to resonate in the way you expect.

The process itself will both help amplify and validate/refine your message

This is often a more gradual process than a sudden single change; you’ll introduce new parts of it over time. And for many, what an entire launch process looks like comes by trial and error. To help shorten that learning curve, here are the key areas I pay attention to for a launch and its process, followed by a rough timeline.

Postgres and Node - Hands on using Postgres as a Document Store with MassiveJS

JSONB in Postgres is absolutely awesome, but it’s taken a little while for libraries to come around to make it as useful as would be ideal. For those not following along with Postgres lately, here’s the quick catchup for it as a NoSQL database.

  • In Postgres 8.3, over 5 years ago, Postgres received hstore, a key/value store directly in Postgres. Its big limitation was that it was only for text
  • In the years after, it got GIN and GiST indexes, which make queries over hstore extremely fast by indexing the entire collection
  • In Postgres 9.2 we got JSON… sort of. Really this was only text validation, but it allowed us to create some functional indexes, which was still nice.
  • In Postgres 9.4 we got JSONB - the B stands for Better according to @leinweber. Essentially this is a full binary JSON on disk, which can perform as fast as other NoSQL databases using JSON.

Node, Postgres, MassiveJS - A better database experience

First some background: I’ve always had a bit of a love-hate relationship with ORMs. ORMs are great for basic CRUD applications, which inevitably happen at some point for an app. The two main problems I have with ORMs are:

  1. They treat all databases as equal (yes, this is a little overgeneralized but typically true). They claim to do this for database portability, but in reality an app still can’t just up and move from one to another.
  2. They don’t handle complex queries well at all. As someone that sees SQL as a very powerful language, taking away all the power just leaves me with pain.

Of course these aren’t the only issues with them, just the two I personally run into over and over.

While playing with Node I was eager to explore Massive.JS, as it seems to buck the trend of just imitating all other ORMs. My initial result: it makes me want to do more with Node just for this library. After all, the power of a language is the ecosystem of libraries around it, not just the core language. So let’s take a quick tour through a few highlights of what makes it really great.

Seeding a sharing-economy or platform company

These days if you’re creating a company you likely hope to accomplish more with fewer people. Two ways of doing this are the sharing economy and creating a platform. It’s easy to see the case for this when you have unicorns like AirBnB or Uber. The opportunity for each of those to compete against hotel chains or taxi services, which each need to manage their own inventory, is incredibly exciting and revolutionary. In a similar fashion, platforms can offer much the same: Heroku’s platform and marketplace made it easier than ever for developers to click a button and get everything they needed years ago. It’s not just their code, it’s everything from Postgres to Mongo to logging. Or take the App Store as an example. Smartphones weren’t a new thing when the iPhone came out, but it was only the savviest of users who had apps installed on their Windows smartphone or BlackBerry. The App Store made the iPhone different from any other phone by allowing others to build on and improve it, turning the iPhone from a phone into a platform.