PostgreSQL NOT IN with NULL

Software

Earlier this week my colleague clued me into a potential Postgres pitfall I’d yet to be bitten by, but it could happen.

Say you’re doing a once-off query like this. Yes, it’s a contrived, simplified example.

SELECT book, isbn 
FROM inventory 
WHERE isbn NOT IN (
    SELECT isbn FROM orders
)

This superficially works, but Postgres will not match NULL values in this subquery when using NOT IN. Ouch!

So the only way to be safe is to check the field is also NOT NULL:

SELECT book, isbn 
FROM inventory 
WHERE isbn NOT IN (
    SELECT isbn FROM orders 
    WHERE isbn IS NOT NULL
)

Or the more robust alternative, short of doing a proper join, is to strategically deploy NOT EXISTS:

SELECT book, isbn 
FROM inventory 
WHERE NOT EXISTS (
    SELECT isbn FROM orders 
    WHERE inventory.isbn = orders.isbn
)

I did some digging, and also found this interesting article on Explain Extended on the performance implicaitons of using these different queries.

Author bio and support

Me!

Ruben Schade is a technical writer and IaaS engineer in Sydney, Australia who refers to himself in the third person in bios. Wait, not BIOS… my brain should be EFI by now.

The site is powered by Hugo, FreeBSD, and OpenZFS on OrionVM, everyone’s favourite cloud infrastructure provider.

If you found this post helpful or entertaining, you can shout me a coffee or buy some silly merch. Thanks!