Siv Scripts

Solving Problems Using Code

Wed 17 May 2017

A Gentle Introduction to Context Managers: The Pythonic Way of Managing Resources

Posted by Aly Sivji in Tutorials   

Summary

  • Explore with statements and the context manager protocol
  • Implement context manager class to query MongoDB
  • Convert try...finally block to with block and increase code readability

I recently read Steve McConnell's Code Complete to level up my software development skill-set. The book has helped me become more deliberate about programming and problem solving in general. Before I sit down to write a single line of code, I take some time to plan out the work I am going to do versus code by the seat of my pants. Coding without a plan means we will have to refactor our work to deal with problems that arise from not thinking our design through. I highly recommend this book to any aspiring code jockey.

One of the major themes of Code Complete is to Program into [our] Language, Not in It; Steve defines this as follows:

Don’t limit your programming thinking only to the concepts that are supported automatically by your language. The best programmers think of what they want to do, and then they assess how to accomplish their objectives with the programming tools at their disposal.

...

Programming using the most obvious path amounts to programming in a language rather than programming into a language; it’s the programmer’s equivalent of “If Freddie jumped off a bridge, would you jump off a bridge, too?” Think about your technical goals, and then decide how best to accomplish those goals by programming into your language.

Said another way, doing things a certain way in one language doesn't mean that we should follow the same pattern in another language. When I came into Python from C# and JavaScript, I brought all my habits with me. Instead of looking for a Pythonic solution, I looked for the Python syntax to do things the way I've always done them.

Take resource management as an example. I used try...except...finally blocks (try-catch-finally in C# & JavaScript) to ensure that I closed the resource that was being consumed, even if an exception occurred. Although this method works, we are not using the tools and language patterns provided by the Python standard library.

Cue Raymond Hettinger: There MUST be a better way!

And there is! It's something we've all used before and probably never even thought about: context managers and the with statement!

In this tutorial, we will explore the with statement and context manager protocol in a bit more depth before implementing our own context manager to query MongoDB.


What You Need to Follow Along

Development Tools (Stack)

Code


with Statement and Context Manager Protocol

The with statement is a control-flow structure that allows us to encapsulate try...except...finally blocks for convenient reuse. As a result, we can write cleaner and more readable code (PEP 343 | Python Docs).

The with statement supports a runtime context which is implemented through a pair of methods executed (1) before the statement body is entered (__enter__()) and (2) after the statement body is exited (__exit__()) (Source).

The basic structure looks as follows:

with context-expression [as var]:
        with_statement_body

The context-expression requires an object that supports the context manager protocol, i.e. a class containing __enter__() and __exit__() methods. We can also point to a context manager written using generators and the contextmanager decorator.
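
To make the generator-based option concrete, here is a minimal sketch using contextlib.contextmanager from the standard library. The function name managed_resource and the events list are made up for illustration; they only record the order in which things happen.

```python
from contextlib import contextmanager

events = []  # records the order in which things happen (illustration only)


@contextmanager
def managed_resource(name):
    # code before the yield plays the role of __enter__()
    events.append(f'acquire {name}')
    try:
        # the yielded value is bound to the name after "as"
        yield name.upper()
    finally:
        # code after the yield plays the role of __exit__()
        events.append(f'release {name}')


with managed_resource('demo') as res:
    events.append(f'using {res}')

print(events)
```

The try...finally around the yield is what guarantees the release step runs even when the with body raises.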

This blog gives a great explanation of the special dunder (double underscore) methods:

  • __enter__ should return an object that is assigned to the variable after as. Returning a value is optional; if nothing is returned, the variable is bound to None. A common pattern is to return self and keep the functionality required within the same class.
  • __exit__ is called on the original Context Manager object, not the object returned by __enter__.
  • If an error is raised in __init__ or __enter__ then the code block is never executed and __exit__ is not called.
  • Once the code block is entered, __exit__ is always called, even if an exception is raised in the code block.
  • If __exit__ returns True, the exception is suppressed and execution continues after the with block.
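
The rules above can be sketched by calling the two methods by hand. This is a simplified approximation of what the interpreter does (PEP 343 has the precise expansion); Tracker and run_with_by_hand are hypothetical names used only for this illustration.

```python
import sys


class Tracker:
    '''Hypothetical context manager that records which methods were called'''

    def __init__(self, suppress=False):
        self.suppress = suppress
        self.calls = []

    def __enter__(self):
        self.calls.append('enter')
        return 'value from __enter__'

    def __exit__(self, exc_type, exc_value, exc_traceback):
        name = exc_type.__name__ if exc_type else None
        self.calls.append(f'exit({name})')
        return self.suppress


def run_with_by_hand(manager, body):
    '''Roughly what "with manager as value: body(value)" does'''
    value = manager.__enter__()  # if this raises, __exit__ is never called
    try:
        body(value)
    except BaseException:
        # __exit__ gets the exception details; a true return value suppresses it
        if not manager.__exit__(*sys.exc_info()):
            raise
    else:
        # no exception: __exit__ is still called, with three None arguments
        manager.__exit__(None, None, None)


def failing_body(value):
    raise ValueError('boom')


quiet = Tracker(suppress=True)
run_with_by_hand(quiet, failing_body)  # exception is suppressed

loud = Tracker(suppress=False)
try:
    run_with_by_hand(loud, failing_body)
except ValueError:
    loud.calls.append('propagated')

print(quiet.calls)
print(loud.calls)
```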

Inside our class, we can implement the __init__() method to set up our object, since that setup runs once when the instance is created rather than every time a with block is entered. For a database context manager, we can set up our connection inside __init__() and return an object or cursor from the __enter__() method.
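
As a rough sketch of that database pattern, here is a version built on sqlite3 from the standard library, so it runs without a server. SQLite is swapped in purely for illustration; the class name SQLiteCursor and the tweets table are assumptions, not part of the project described later.

```python
import sqlite3


class SQLiteCursor:
    '''Open a SQLite connection in __init__ and hand out a cursor in __enter__'''

    def __init__(self, path):
        self.connection = sqlite3.connect(path)

    def __enter__(self):
        self.cursor = self.connection.cursor()
        return self.cursor

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # commit on success, roll back if the with body raised
        if exc_type is None:
            self.connection.commit()
        else:
            self.connection.rollback()
        self.connection.close()
        # returning None (falsy) lets any exception propagate


manager = SQLiteCursor(':memory:')
with manager as cur:
    cur.execute('CREATE TABLE tweets (text TEXT)')
    cur.execute('INSERT INTO tweets VALUES (?)', ('#codeeveryday',))
    cur.execute('SELECT text FROM tweets')
    rows = cur.fetchall()

print(rows)
```

Note that the rows are fetched inside the with block; once __exit__() has run, the connection is closed and can no longer be queried.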

The variable that comes after the as keyword is optional, but it should be included and used to refer to the object returned from __enter__() inside our with_statement_body.

I think that's more than enough theory. Let's head into the REPL:

In [1]:
import sys
sys.version
Out[1]:
'3.6.1 |Continuum Analytics, Inc.| (default, Mar 22 2017, 19:25:17) \n[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]'
In [2]:
# Let's figure out control flow... create object with __enter__ and __exit__ methods

class Foo():
    def __init__(self):
        print('__init__ called')
        self.init_var = 0
        
    def __enter__(self):
        print('__enter__ called')
        return self
    
    def __exit__(self, exc_type, exc_value, exc_traceback):
        print('__exit__ called')
        if exc_type:
            print(f'exc_type: {exc_type}')
            print(f'exc_value: {exc_value}')
            print(f'exc_traceback: {exc_traceback}')
            
    def add_two(self):
        self.init_var += 2
In [3]:
my_object = Foo()
__init__ called
In [4]:
my_object.init_var
Out[4]:
0
In [5]:
my_object.add_two()
my_object.init_var
Out[5]:
2
In [6]:
# regular flow without exceptions
with my_object as obj:
    print('inside with statement body')
__enter__ called
inside with statement body
__exit__ called
In [7]:
# what can we access in the object that is returned inside with statement context
with my_object as obj:
    print(obj.init_var)
__enter__ called
2
__exit__ called
In [8]:
# adding 2 to the var inside statement
with my_object as obj:
    my_object.add_two()
    print(obj.init_var)
__enter__ called
4
__exit__ called
In [9]:
# using a new instance in context expression
with Foo() as obj:
    print(obj.init_var)
__init__ called
__enter__ called
0
__exit__ called
In [10]:
# raising exceptions within block
with my_object as obj:
    print('inside with statement body')
    raise Exception('exception raised').with_traceback(None)
__enter__ called
inside with statement body
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised
exc_traceback: <traceback object at 0x110aebc08>
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-10-a6c95ae7fd46> in <module>()
      2 with my_object as obj:
      3     print('inside with statement body')
----> 4     raise Exception('exception raised').with_traceback(None)

Exception: exception raised
In [11]:
# try to handle exception using try...except...finally
try:
    with my_object as obj:
        print('inside with statement body')
        raise Exception('exception raised').with_traceback(None)
except Exception as e:
    print('handling exception')
    print(e)
finally:
    print('Finally section')
__enter__ called
inside with statement body
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised
exc_traceback: <traceback object at 0x110b52dc8>
handling exception
exception raised
Finally section

We can handle exceptions inside our __exit__() method and return True to suppress them so they do not propagate up the chain. We'll come back to this in a bit...

In [12]:
# with statement within a with statement... with-ception
with my_object as obj:
    print('inside first context')
    with my_object as obj2:
        raise Exception('exception raised inner most block').with_traceback(None)
    
    print('a')
__enter__ called
inside first context
__enter__ called
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised inner most block
exc_traceback: <traceback object at 0x110b070c8>
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised inner most block
exc_traceback: <traceback object at 0x110b070c8>
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-12-926af76a23fc> in <module>()
      3     print('inside first context')
      4     with my_object as obj2:
----> 5         raise Exception('exception raised inner most block').with_traceback(None)
      6 
      7     print('a')

Exception: exception raised inner most block
In [13]:
# increment before exception, does it go thru?
my_object.init_var
Out[13]:
4
In [14]:
# how does variable context change?
# with statement within a with statement... with-ception
with my_object as obj:
    my_object.add_two()
    print('inside first context')
    with my_object as obj2:
        raise Exception('exception raised inner most block').with_traceback(None)
    
    print('a')
__enter__ called
inside first context
__enter__ called
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised inner most block
exc_traceback: <traceback object at 0x110aeb988>
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised inner most block
exc_traceback: <traceback object at 0x110aeb988>
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input-14-f11e7f99c627> in <module>()
      5     print('inside first context')
      6     with my_object as obj2:
----> 7         raise Exception('exception raised inner most block').with_traceback(None)
      8 
      9     print('a')

Exception: exception raised inner most block
In [15]:
my_object.init_var
Out[15]:
6

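Since we created an instance of Foo outside of our with statement, we can access and change my_object both inside and outside the block; it is just an ordinary instance that we can use anywhere in our code.

If we create a new instance in the with statement's context expression, the only reference to it is the name bound by as. That name stays in scope after the block ends, because with does not create a new scope, but by then __exit__() has already run, so the object should be treated as released.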

Handling exceptions in __exit__() method

In [16]:
class Foo2():
    def __init__(self):
        print('__init__ called')
        self.init_var = 0
        
    def __enter__(self):
        print('__enter__ called')
        return self
    
    def __exit__(self, exc_type, exc_value, exc_traceback):
        print('__exit__ called')
        if exc_type:
            print(f'exc_type: {exc_type}')
            print(f'exc_value: {exc_value}')
            print(f'exc_traceback: {exc_traceback}')
            print('exception handled')
        # return True to handle exception...
        return True
            
    def add_two(self):
        self.init_var += 2
In [17]:
# using a new instance in context expression
with Foo2() as obj:
    print(obj.init_var)
    raise Exception('exception raised').with_traceback(None)
__init__ called
__enter__ called
0
__exit__ called
exc_type: <class 'Exception'>
exc_value: exception raised
exc_traceback: <traceback object at 0x110b74348>
exception handled

Playing around in the REPL is invaluable to figuring out how things work.
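
One thing to watch: Foo2 returns True unconditionally, so it swallows every exception raised in the block. A more cautious sketch suppresses only the exception types we expect. SuppressOnly is a hypothetical helper written for this example; contextlib.suppress is the standard-library equivalent.

```python
from contextlib import suppress


class SuppressOnly:
    '''Hypothetical helper: swallow only the listed exception types'''

    def __init__(self, *exc_types):
        self.exc_types = exc_types

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        # True only when an exception occurred AND it is one we expect
        return exc_type is not None and issubclass(exc_type, self.exc_types)


results = []

with SuppressOnly(KeyError):
    {}['missing']  # KeyError is in the list, so it is suppressed
results.append('KeyError suppressed')

try:
    with SuppressOnly(KeyError):
        1 / 0  # ZeroDivisionError is not in the list, so it propagates
except ZeroDivisionError:
    results.append('ZeroDivisionError propagated')

# the standard library ships the same idea as contextlib.suppress
with suppress(FileNotFoundError):
    open('/no/such/file/for/this/demo')
results.append('FileNotFoundError suppressed')

print(results)
```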


When to use Context Managers

Dave Brondsema gave a great talk on Decorators and Context Managers at PyCon 2012. He mentioned that we should use context managers when we see any of the following patterns in our code:

  • Open - Close (see example below)
  • Lock - Release
  • Change - Reset
  • Enter - Exit
  • Start - Stop
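
Two of these patterns can be sketched with nothing but the standard library. threading.Lock already supports the with statement (Lock - Release), and a small generator-based helper covers Change - Reset. The helper name working_directory is an assumption made for this example.

```python
import os
import tempfile
import threading
from contextlib import contextmanager

# Lock - Release: the lock is acquired on entry and released on exit
lock = threading.Lock()
with lock:
    held_inside = lock.locked()
held_after = lock.locked()


# Change - Reset: temporarily switch the working directory
@contextmanager
def working_directory(path):
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(previous)  # reset even if the body raised


start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        in_tmp = os.path.samefile(os.getcwd(), tmp)
end = os.getcwd()

print(held_inside, held_after, in_tmp, start == end)
```

tempfile.TemporaryDirectory is itself an example of the Open - Close pattern: the directory is created on entry and removed on exit.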

Arnav Khare details a lot of great use cases of Context Managers in the Real World and provides starter code for each example.


Creating a Context Manager

Project Background

A blog post by John Resig on the benefits of writing code every day inspired me to set aside a minimum of 30 minutes each day to work on side projects. I've been documenting my streak using the #codeeveryday hashtag on Twitter.

After knocking out a few side projects from my todo list, I started working on a script that would analyze my #codeeveryday tweets and create a dashboard to display progress.


In my AWS account, I have a Lambda script that downloads and stores my tweets in a MongoDB instance running on mLab. I will be querying this data store to generate my dashboard, but you can use any Mongo collection you want.

Using try...finally Blocks

Let's take a look at our function to query MongoDB for all tweets containing a specific hashtag using the try...finally pattern:

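from pymongo import MongoClient

def download_tweets_by_hashtag_nonpythonic(hashtag):
    '''Connect to MongoDB, download tweets with param hashtags

    Args:
        * hashtag - text hashtag

    Returns
        * list of Tweets containing hashtag
    '''
    # mlab_uri and collection are settings defined elsewhere in the script
    # create the client before the try block so that client is always
    # defined by the time the finally clause runs
    client = MongoClient(mlab_uri)
    try:
        db = client.get_default_database()
        coll = db[collection]
        # find() returns a lazy cursor; read it into a list while connected
        tweets = list(coll.find({'entities.hashtags.text': hashtag}))
    finally:
        client.close()

    # returning outside finally lets any exception propagate to the caller
    return tweets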

While this does work, we have to remember to add code in our finally block to close the connection to our resource in the event that we encounter an exception.

Using Context Managers

We begin by implementing our context manager as a class.

This class requires the following methods:

  • __init__() method to set up the object. We will be connecting to our Mongo database and setting our collection variable
  • __enter__() method to return a reference to the collection object
  • __exit__() method to close the connection to the database. The connection is closed even if an exception is raised in our with block

from pymongo import MongoClient

class MongoCollection(object):
    '''Connect to mongodb and return collection within context manager
    '''

    def __init__(self, uri, collection):
        self.client = MongoClient(uri)
        self.db = self.client.get_default_database()
        self.collection = self.db[collection]

    def __enter__(self):
        return self.collection

    def __exit__(self, exc_type, exc_value, exc_traceback):
        self.client.close()

With our context manager created, we write our search command inside a with block as follows:

def download_tweets_by_hashtag(hashtag):
    '''Connect to MongoDB, download tweets with param hashtags

    Args:
        * hashtag - text hashtag

    Returns
        * list of Tweets containing hashtag
    '''
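    with MongoCollection(mlab_uri, collection) as coll:
        # find() returns a lazy cursor, so read every document into a list
        # while the connection is still open
        tweets = list(coll.find({'entities.hashtags.text': hashtag}))

    return tweets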

The code is a lot cleaner as we have abstracted away the database connection information in our MongoCollection class.
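
If all we need is a guaranteed call to close(), the standard library's contextlib.closing can stand in for a hand-written class. The sketch below uses a stand-in FakeClient so it runs without a database; FakeClient and its find() method are hypothetical and only assume that the real client object exposes close(), as MongoClient does.

```python
from contextlib import closing


class FakeClient:
    '''Stand-in for a database client; hypothetical, for illustration only'''

    def __init__(self):
        self.closed = False

    def find(self, query):
        return [{'text': '#codeeveryday day 1'}]

    def close(self):
        self.closed = True


# normal flow: close() is called when the block ends
client = FakeClient()
with closing(client) as c:
    tweets = c.find({'entities.hashtags.text': 'codeeveryday'})

# error flow: close() is still called before the exception propagates
failing_client = FakeClient()
try:
    with closing(failing_client) as c:
        raise RuntimeError('query failed')
except RuntimeError:
    pass

print(client.closed, failing_client.closed)
```

A dedicated class like MongoCollection is still the better fit when __enter__() should return something other than the client itself, such as a specific collection.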


Conclusion

Thinking about software development in a deliberate manner makes us better coders. One of the ways we can do this is by "programming into our language, not in it." This means we should be using our language of choice in an idiomatic way versus porting over how we have always done things.

In this tutorial, we learned about context managers and the with statement. We created a context manager object to retrieve documents out of MongoDB and abstracted away the connection logic within our class.



 
    
 
 
