dein.fr

Writing a correlated subquery with SQLAlchemy

Charles-Axel Dein
today-i-learned

We have two tables:

  • Folders (id, company_id)
  • Milestones (id, is_done, value: i.e. when the milestone will happen)

A folder has multiple milestones. We would like to get each folder as well as its most recent milestone.

As usual with SQLAlchemy, it is a good idea to start by writing the query in plain SQL. Here's one way to do it (see 4 Ways to Join Only The First Row in SQL):

select *
from folders f

join milestones m on m.id = (
select id from milestones m
where m.folder_id = f.id and f.company_id = ... and m.is_done = false
order by `value` asc
limit 1
)

And here's how to write it with SQLAlchemy:

subq = (session.query(TMilestone.id).filter(
    TMilestone.is_done.is_(False),
    TFolder.company_id == company_id).order_by(
        TMilestone.value.asc()).limit(1).correlate(TFolder))

query = session.query(TFolder, TMilestone).join(
    TMilestone, TMilestone.id == subq)

res = query.all()