Resource cleanup, compared: Python, Go and C++

As a programmer you are expected to learn new technologies regularly. Even when the documentation is excellent, there will typically be underlying assumptions that go unstated because they are so obvious to the writer. And documentation is not always good.

But if you have relevant technical depth you will be able to recognize the commonalities and differences within a category of technologies, e.g. programming languages or databases. This means a new programming language will be easier to learn: you will recognize familiar features, different trade-offs, and some of the motivations of design choices. You will also be better able to judge the usefulness of the new technology. One way to improve your technical depth is to compare a single task across multiple technologies.

Resource cleanup

Let’s consider a particular task: cleaning up a resource. If your code wants to write to a file you will open the file, write to it, and eventually close it. Forgetting to close the file might mean writes don’t reach the disk until much later than you expected, or that certain resources get leaked. On Unix systems, if you don’t close file descriptors your process will eventually run out of them and not be able to open any new files.

Most programming languages allow returning from a function at multiple points, so cleanup ends up being repetitive. This makes it easier for you to forget to clean up a resource you acquired or created within the function.

def write():
    f = open("myfile", "w")
    if something():
        f.close()  # Repetitive resource cleanup
        return

    f.write("hello")
    f.close()  # Repetitive resource cleanup

Many languages allow leaving a function in more than one way, e.g. with both returns and exceptions. Once you have exceptions in your language any part of your code might result in leaving the function due to a thrown exception, making resource cleanup even harder to get right:

def write():
    f = open("myfile", "w")
    # If this throws an exception, e.g. when disk is full,
    # then f.close() will never be run:
    f.write("hello")
    f.close()

As a result most languages provide an idiom or feature for automatically cleaning up resources, regardless of how or when you return from a function. Let’s compare the idioms for C++, Go and Python and see what we can learn.

Python

A Python function can exit by returning a result or by raising an exception. One way to clean up a resource is a try/finally clause, which builds on the try/except exception handling syntax:

def write():
    f = open("myfile", "w")
    try:
        f.write("hello")
    finally:
        f.close()

The code in the finally block always runs, whether the try block returned, raised an exception, or simply ran to completion. (Python also has a more modern with idiom that I’m going to ignore for brevity’s sake.)
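For readers who want to see the with idiom mentioned above, here is a minimal sketch. The helper name write_with and the temporary path are assumptions made for illustration. The file object returned by open() is a context manager, so the file is closed when the with block exits, whether normally or because of an exception.

```python
import os
import tempfile


def write_with(path):
    # Hypothetical helper: the file is closed when the with block exits,
    # even if f.write() raises.
    with open(path, "w") as f:
        f.write("hello")
    return f  # returned only so the caller can inspect f.closed


# Use a temporary directory so the sketch does not touch real files.
with tempfile.TemporaryDirectory() as tmp:
    handle = write_with(os.path.join(tmp, "myfile"))
    print(handle.closed)  # True: the file was closed on leaving the with block
```

The behavior matches the try/finally version, with the cleanup logic supplied by the file object rather than written out at each call site.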

Go

Go lacks exceptions, so there is no exception syntax to build on. Instead, Go provides a defer statement that schedules a cleanup function to run when the enclosing function returns.

func write() {
    // os.Create opens the file for writing; os.Open would be read-only.
    f, err := os.Create("myfile")
    if err != nil {
        return
    }
    defer f.Close()
    f.WriteString("hello")
}

Python has a similar facility implemented as library code in the unittest.TestCase class, where you can register cleanup functions for a test:

class MyTest(TestCase):
    def test_files(self):
        f = open("/tmp/myfile")
        # f.close() will be called after test finishes:
        self.addCleanup(f.close)
        # etc.

try/finally could be used here too, but a failed test is signaled by raising an AssertionError exception, so any test that needs to clean up multiple resources would be forced into many nested try/finally clauses. That is the likely motivation for the TestCase.addCleanup API.

C++

The C++ idiom is very different, relying on class destructors: we construct a File class whose destructor closes the file, and then allocate the File object on the stack when we use it. When the function returns the File instance on the stack is destroyed, and therefore its destructor is called and the underlying file is closed.

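#include <cstdio>

class File {
public:
    File(const char* filename):
        m_file(std::fopen(filename, "w")) {
    }

    // The destructor name must match the class name exactly.
    ~File() {
        // fopen() returns a null pointer on failure; don't pass that to fclose().
        if (m_file != nullptr) {
            std::fclose(m_file);
        }
    }
// etc.
private:
    std::FILE* m_file;
// etc.
};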

void write() {
  File my_file("myfile");
  my_file.write("hello");
}

Notice that this relies on deterministic deallocation of my_file: since it’s on the stack, it will always be deallocated when the function ends. This mechanism cannot be used in Python or Go because they are garbage collected, and so there is no guarantee an object will be cleared from memory immediately. Python will close a file when it is garbage collected, but warns you that you should have closed it yourself:

$ python3 -Wall
>>> open("/etc/passwd", "rb")
<_io.BufferedReader name='/etc/passwd'>
>>> 1 + 2  # There's a decent chance the file will get GC'd now, and indeed:
__main__:1: ResourceWarning: unclosed file <_io.BufferedReader name='/etc/passwd'>
3

Some overall lessons

What have we learned from all this?

  • Languages without garbage collection can potentially rely on deterministic object destruction to clean up resources.
  • Languages with exceptions can always support resource cleanup via exception handlers plus success case cleanup, or perhaps more simply via related syntax as in Python or Java.
  • Scheduled cleanup functions can be a language feature as in Go, or a library feature available in any programming language.
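As one illustration of the last point, Python's standard library offers contextlib.ExitStack, which can schedule cleanup callbacks much like Go's defer. The sketch below is only an approximation under stated assumptions: the events list and the fail parameter are made up for the demonstration, and no real file is opened.

```python
from contextlib import ExitStack

events = []  # records what happened, in order


def write(fail=False):
    with ExitStack() as stack:
        events.append("open")
        # Scheduled cleanup: runs when the with block exits, however it exits.
        stack.callback(events.append, "close")
        if fail:
            raise OSError("simulated disk full")
        events.append("write")


write()
try:
    write(fail=True)
except OSError:
    pass
print(events)  # ['open', 'write', 'close', 'open', 'close']
```

One difference from Go is scope: defer runs when the function returns, whereas ExitStack callbacks run when the with block ends.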

You can now apply this knowledge to the next new programming language you learn.

Comparing a specific task can help you gain technical depth in other areas as well. And if you’re learning a new technology, comparing tasks with technologies you already know will help you learn it that much faster. A good task is easy but not completely trivial: adding integers doesn’t differ much between programming languages, so comparing it won’t teach you anything interesting. For databases you might compare “how would I allocate a unique id to a newly created record?” or “how can I safely increment a counter from multiple clients?” And ideally you should compare more than two technologies, since there are almost always more than two solutions to any problem.

