SQLAlchemy and Idle in transaction queries


Summary

Orphaned transactions are a problem that could lead to messy states in the database. This is usually a good indicator that the something is not right in the manner in which connections are established and closed on the services that query the database.


How does one find these queries

This query lists all the database connections and their current state

This usually ends up returning a bunch of transactions

pid  |    duration     |        state
-------+-----------------+---------------------
 29206 | 00:00:29.951221 | idle
 29226 | 00:00:00.384681 | idle
 29385 | 00:00:34.609666 | idle
 29417 | 00:00:18.915824 | idle
 29115 | 00:00:52.712196 | idle
 29070 | 00:01:36.071392 | idle
 29397 | 00:00:00.360605 | idle
 29201 | 00:01:06.886096 | idle
 29213 | 00:00:59.415637 | idle
 29445 | 00:00:03.620537 | idle
 29132 | 00:01:35.968193 | idle
 28947 | 00:00:47.727746 | idle
 29371 | 00:00:00.522925 | idle
 29114 | 00:00:17.283915 | idle
 29384 | 00:00:00.259531 | idle
 29434 | 00:00:10.453241 | idle
 29379 | 00:00:17.517231 | idle
 27596 | 00:12:19.165115 | idle
 16765 | 05:58:04.261973 | idle in transaction
 28766 | 00:03:55.651335 | idle
 29413 | 00:00:18.421673 | idle
 28405 | 00:07:28.114135 | idle
 29485 | 00:00:03.501864 | idle
 29229 | 00:00:00        | active
 28056 | 00:10:28.577103 | idle

We have one transaction in the idle in transaction state, one in the active state and the rest in idle state.



SQLAlchemy, Pools and Idle connections

Now, the idle state indicates that a bunch of connections have been established and remain connected without performing any operations on the database. With respect to SQLAlchemy this refers to the pool configurations. You’ll usually notice a large number of connections remaining idle depending on how you checkout and release connections back to your pool.


For example

In this example, we tell SQLAlchemy to create a pool of 35 connections (with a maximum of 65, indicated by the max_overflow).


The pool_use_lifo specifies the mechanism with which connections are checked out and closed. Using LIFO reduces the number of connections used during non- peak periods of use.


LIFO essentially specifies that the last used connection (released back to the pool) to the pool is the one that is used for the next request.


This setting is largely a preference and depends on how you want your application to behave. It’s important that though having idle connections is not a major drawback, a large number of idle connections do consume resources on the database.


Terminating Idle sessions

The most effective approach is to handle this using the session pool manager and ensuring the pool size is adequate and uses the LIFO strategy.


However, this can also be configured on postgresql using the idle_session_timeout attribute. Setting this to an integer (indicating duration in milliseconds) in postgresql.conf ensures that idle sessions are terminated after the period.


A more brute force approach to this is described in this post


SQLAlchemy, Pools and Idle In Transaction sessions

The idle in transaction connections are slightly trickier and occur due to connections stuck in a transaction. This usually occurs when a connection has issued a BEGIN statement (and possibly a few other queries post that) and is waiting for the transaction to be committed or rolled back.


In my opinion the culprit for the issue is the way in which SQLAlchemy defines the configurations for this. It’s important to acknowledge that it adheres to the approach defined in the PEP 249


Now, when a connection is checked out on an engine defined below it issues a BEGIN statement by default.

Now, let’s assume that we make the following query

We end up seeing our transaction in the Idle in transaction state.


This brings us to Postgres’ default state, which is the AUTO COMMIT mode, which is intuitive. The database doesn’t expect us to wrap queries in a BEGIN and COMMIT block and invokes the command immediately when returned. Note, we’re referring to PostgreSQL here and not SQLAlchemy.


When we need an explicit transaction, we’re expected to wrap those in a BEGIN - COMMIT or ROLLBACK block.


But, with SQLAlchemy explicitly invoking a BEGIN when a connection is checked out, we’re forced to wrap all SELECT queries in a transaction block. Now, a select clause doesn’t really warrant a transaction block and not doing so leaves us with dangling orphaned sessions.


Terminating Idle sessions

As before the brute force approach is still an option and is described in this post here but one can also configure this directly with PostgreSQL in the postgresql.conf file setting the idle_in_transaction_session_timeout. Now, this can get tricky as this could leave the application in an unpredictable state.


So, the only reasonable thing to do here is to call a ROLLBACK or COMMIT at the end of each invoked session. Closing the session automatically issues a ROLLBACK


SQLAlchemy and ISOLATION_LEVEL=AUTOCOMMIT


So, this threw me off, given that one rarely refers to AUTOCOMMIT as an isolation level. Now, this isn’t really a database isolation level. The transaction isolation levels supported by Postgres are UNCOMMITTED, READ COMMITTED, REPEATABLE READ and SERIALIZABLE.


But for some reason SQLAlchemy defines a configuration called isolation_level on the engine which sets the database API into a non-transactional auto commit mode. What this means is that the default BEGIN issued when checking out a connection is no longer issued. The onus is now on the user to issue a BEGIN, ROLLBACK clause when needed. It also seems to set the actual database isolation level to READ COMMITTED.