The LATERAL join
The LATERAL join is an interesting and powerful join type that is a bit intimidating at first but when you take a closer look it is very useful in certain scenarios.
So, what is a LATERAL join anyway? From the PostgreSQL documentation:
The LATERAL key word can precede a sub-SELECT FROM item. This allows the sub-SELECT to refer to columns of FROM items that appear before it in the FROM list. (Without LATERAL, each sub-SELECT is evaluated independently and so cannot cross-reference any other FROM item.)
In a gist, it allows you to perform a sub-query in the
FROM clause and reference column values from other records in the query. It is similar to a correlated subquery in that it can reference values from another query but has the added advantages that it can return multiple values and be used in the
1. Top-N per group
For each customer, we will return the 2 most recent orders. Without a
LATERAL join this type of query would be non-trivial.
SELECT c.id, o.order_id, o.date, o.amount FROM customers c CROSS JOIN LATERAL ( SELECT id as order_id, date, amount FROM orders WHERE customer_id = c.id ORDER BY date DESC LIMIT 2 ) o ORDER BY c.id;
NOTE: I use
CROSS JOIN LATERAL above which is eqivalent to
LEFT JOIN LATERAL (...) a ON true (as shown in LATERAL examples elsewhere) but I find it more readable. Effectively, it behaves like a
2. Call User-Defined Function for each row
If you have a User-Defined function that needs to be run for each row of a query, you can use the
LATERAL join to call it and return multiple values.
SELECT t.id, s. p.pl_amount, p.pl_percentage FROM trades t CROSS JOIN LATERAL get_trade_pl(t.id) p;
3. Reuse calculated values
This is a non-obvious use of
LATERAL but one I use often. Since you can reference columns from other records in the query, you can use
LATERAL to calculate values and then reuse them in the main
SELECT statement. Otherwise, you would have to recalculate values for each usage of them in the
In the following example, start_timestamp and end_timestamp are being parsed to a date and the output of those calculations are then being used multiple times from the
SELECT l.id, c.start_date, c.end_date, (c.end_date - c.start_date) as days_diff FROM log l CROSS JOIN LATERAL ( SELECT to_date(l.start_timestamp, 'MM/dd/YYYY'), to_date(l.end_timestamp, 'MM/dd/YYYY') ) as c(start_date, end_date);