A Guide to Python Threading

In Python, threading allows you to run different parts of your program concurrently.

Threading can help you simplify the design of your program.

A major advantage of threading is that it can make programs that spend time waiting on external events run faster.

So, if you are an experienced Python programmer and you’re looking for a faster way to execute your programs, you’ve come to the right place.

That is what I will be showing you in this article.


What is a Thread?

A thread refers to a separate flow of execution.

In simple words, it is a sequence of instructions within a program that can be executed independently of the other code.

See a thread as a subset of a program.

Threads make it possible for a program to have two things happening at once.
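As a quick illustration (the thread name "worker" and the task() function here are just arbitrary examples), each thread is a separate flow of execution with its own identity:

```python
import threading

def task():
    # Runs in the separate thread created below
    print("running in:", threading.current_thread().name)

t = threading.Thread(target=task, name="worker")
t.start()
t.join()

# Meanwhile, the rest of the program runs in the main thread
print("running in:", threading.current_thread().name)
```

This prints "running in: worker" followed by "running in: MainThread".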

However, in most Python 3 implementations, threads do not actually execute in parallel; they only appear to.

You may be thinking of threading as a situation where you have two or more processors running on your program at the same time, with each processor doing an independent task.

You are almost right in your thinking.

Yes, the threads may be running on different processors, but only one of them will run at a time.

To have multiple tasks run simultaneously, a non-standard implementation of Python is required.

You can write some of the code in a different language, or you can use multiprocessing which comes with an extra overhead.

For CPython users, threading may not speed up all tasks.

This is because of the way CPython is implemented.

Threads must interact with the GIL (Global Interpreter Lock), which allows only one thread to execute Python bytecode at a time.

The best tasks to run using multithreading are the ones that spend much of their time waiting for external events.

Tasks that need heavy CPU computation and spend only a small period of time waiting for external events may not show an improvement in speed.

This applies to Python code running on CPython.
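Here is a rough benchmark sketch of that effect (the count_down() function and the iteration count are made up for illustration, and exact timings vary by machine): on CPython, splitting a CPU-bound loop across two threads typically takes about as long as running it sequentially.

```python
import threading
import time

def count_down(n):
    # Pure CPU work: no waiting, so threads gain nothing under the GIL
    while n > 0:
        n -= 1

N = 2_000_000

# Run the work twice sequentially
start = time.perf_counter()
count_down(N)
count_down(N)
sequential = time.perf_counter() - start

# Run the same work in two threads
start = time.perf_counter()
threads = [threading.Thread(target=count_down, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```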

If you write your threads in C, they can release the GIL and run concurrently.

However, if you want to run them on a different Python implementation, go through the documentation to know how threads are handled.

If you are using the standard Python implementation and need true parallelism, check out the multiprocessing module.
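A minimal sketch of the multiprocessing alternative (the square() function is just an illustration): each worker is a separate process with its own interpreter and its own GIL, so the work can run on multiple cores.

```python
import multiprocessing

def square(n):
    return n * n

if __name__ == "__main__":
    # Each worker process runs independently of the others
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3, 4]))  # [1, 4, 9, 16]
```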

Multithreading also makes code look cleaner.

In this article, I will be giving you examples of how multithreading makes code cleaner and easier to read.

Starting a Thread

In this section, I will be showing you how to make a thread.

The Python standard library comes with the threading module, which has most of the primitives that you will see in this article.

The Thread class, built into the module, encapsulates a thread of execution and provides a clean interface to programmers.

If you want to start a separate thread, create an instance of Thread and invoke the .start() function on it.

Consider the following example:

import logging
import threading
import time

def my_function(thread_name):
    logging.info("Thread %s: starting", thread_name)
    time.sleep(5)
    logging.info("Thread %s: finished", thread_name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    logging.info("Main : before thread is created")
    thread = threading.Thread(target=my_function, args=(1,))
    logging.info("Main : before running thread")
    thread.start()
    logging.info("Main : wait for thread to finish")
    # thread.join()
    logging.info("Main : all done")

The above Python code shows how to create and start a thread.

It should return output similar to the following:

10:21:53: Main : before thread is created
10:21:53: Main : before running thread
10:21:53: Thread 1: starting
10:21:53: Main : wait for thread to finish
10:21:53: Main : all done
10:21:58: Thread 1: finished

The timestamps shown will depend on when you run the code.

If you look at the logging statements, you will realize that the thread is created and started by the __main__ section.

thread = threading.Thread(target=my_function, args=(1,))
thread.start()
When creating a thread, you should pass a function to it and the list of arguments to that function.

In the above example, we have passed the function named my_function() and passed it the argument 1.

So, the thread will run the function named my_function().

However, the function doesn’t do much.

It only logs some messages with time.sleep() in between them.

That’s why the output shows some time gap in between the printed messages.

The output shows that the Main section of the code finished first, then the Thread finished afterwards.

Notice that there is a commented-out line in the code, that is, # thread.join().

Soon, I will be showing you where to use it.

Keep reading…

Daemon Threads

In computing, a daemon is simply a process running in the background.

However, Python has a more specific meaning for the word daemon.

A daemon thread will close immediately once the program exits.

See a daemon thread as the type of thread that runs in the background without you having to worry about shutting it down.

If a thread is not a daemon, the program will wait for the thread to complete before it terminates.

On the other hand, threads that are daemons are killed when the program exits.

Let’s have a closer look at the above program.

When you run it, you will realize that there is a pause of about 5 seconds between the time the all done message is printed and the time the thread finishes.

From the given output, the all done message was printed at 10:21:53.

Then, the Thread 1: finished message was printed at 10:21:58.

A pause of 5 seconds between the two.

This pause is the program waiting for the non-daemonic thread to complete.

Once the Python program ends, one of the tasks of the shutdown process is to clean up the threading routine.

The source code for Python threading shows that threading._shutdown() goes through each running thread and calls the .join() function on every thread that doesn't have its daemon flag set.

So, since the thread is waiting in a sleep, your program waits to exit.

Once it completes and prints the message, the .join() function returns and the program exits.

This is the kind of behavior that you will want frequently, but there are many other options available to you.

However, let’s first repeat our program using a daemon thread.

You simply have to add the daemon=True flag to the construction of the thread.

This is shown below:

thread = threading.Thread(target=my_function, args=(1,), daemon=True)

Yes, we’ve added the daemon=True flag.

When you run the code, it will give output similar to the following:

10:21:53: Main : before thread is created
10:21:53: Main : before running thread
10:21:53: Thread 1: starting
10:21:53: Main : wait for thread to finish
10:21:53: Main : all done

Compared to the previous run, the final Thread 1: finished line of the output is missing.

The reason is that the function named my_function() did not complete.

So, the thread was killed when __main__ reached the end and the program exited.

join() a Thread

Your goal may be to wait for a thread to stop instead of exiting your program.

We had commented out a line in our code. This is shown below:

# thread.join()
If you want to tell a thread to wait for another thread to finish, you should call the .join() function.

Just uncomment the line and see how the program behaves.

The main thread will pause, waiting for the thread to complete running.

Note that this applies to both regular and daemon threads.
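Here is a small sketch of that (the worker() function and the sleep duration are arbitrary examples): even though the thread below is a daemon, calling .join() makes the main thread wait for it to finish.

```python
import threading
import time

results = []

def worker():
    time.sleep(0.2)
    results.append("worker finished")

# A daemon thread would normally be killed at program exit,
# but .join() waits for it to complete first
t = threading.Thread(target=worker, daemon=True)
t.start()
t.join()
print(results)  # ['worker finished']
```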

Dealing with Multiple Threads

In our previous code, we were working with only 2 threads, that is, the main thread and the thread we started using the threading.Thread object.

However, many times you will want more than 2 threads doing interesting work.

You can do it using the harder or the easier way.

Let’s begin with the harder way…

It uses only what you already know.

Consider the following code:

import logging
import threading
import time

def my_function(thread_name):
    logging.info("Thread %s: starting", thread_name)
    time.sleep(5)
    logging.info("Thread %s: finished", thread_name)

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    threads = list()
    for num in range(3):
        logging.info("Main : create and start thread %d.", num)
        t = threading.Thread(target=my_function, args=(num,))
        threads.append(t)
        t.start()

    for num, thread in enumerate(threads):
        logging.info("Main : before joining thread %d.", num)
        thread.join()
        logging.info("Main : thread %d done", num)

The code returns output similar to the following (timestamps and finishing order will differ between runs):

10:22:10: Main : create and start thread 0.
10:22:10: Thread 0: starting
10:22:10: Main : create and start thread 1.
10:22:10: Thread 1: starting
10:22:10: Main : create and start thread 2.
10:22:10: Thread 2: starting
10:22:10: Main : before joining thread 0.
10:22:15: Thread 0: finished
10:22:15: Main : thread 0 done
10:22:15: Main : before joining thread 1.
10:22:15: Thread 1: finished
10:22:15: Main : thread 1 done
10:22:15: Main : before joining thread 2.
10:22:15: Thread 2: finished
10:22:15: Main : thread 2 done

For each thread, we have simply created a thread object and then invoked the .start() function on it.

The program stores a list of thread objects in order to wait for them later using the .join() function.

From the output, you’ll realize that the threads are started in the order that you expect.

However, they may finish in the opposite order.

Note that the orderings will be different for different runs of the code.

The operating system is responsible for determining the order by which the threads are run.

Due to this, it may be difficult for you to predict the order.

The good news is that Python provides a number of ways for coordinating threads to have them run together.

We shall look at this later.

For now, let’s look into how we can make management of threads a bit easier.

The ThreadPoolExecutor

This provides us with an easier way of starting a group of threads than the mechanism we have discussed above.

From Python 3.2, the ThreadPoolExecutor comes as part of the standard library in concurrent.futures.

You can easily create it as a context manager using the with statement, which creates the pool and cleans it up for you.

Let us rewrite the __main__ section from our previous example using the ThreadPoolExecutor:

import concurrent.futures

# [other code]

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    with concurrent.futures.ThreadPoolExecutor(max_workers=3) as thread_executor:
        thread_executor.map(my_function, range(3))

The above code creates a ThreadPoolExecutor as a context manager, stating the number of worker threads it needs in the pool.

The code has then used .map() to iterate over the values generated by range(3), passing each to a thread in the pool.

The end of the with block makes the ThreadPoolExecutor perform a .join() on every thread in the pool.

Always use the ThreadPoolExecutor as a context manager when possible to avoid forgetting to .join() the threads.

When you run the above code, you will see log output from the three threads starting and finishing, similar to the earlier examples.

Also, the threads may not finish in the same order that they were started.

The reason is that scheduling of threads is handled by the operating system, making it difficult to predict how they will finish.
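If you need each result as soon as its thread finishes, .submit() together with as_completed() gives you more control than .map(); here is a small sketch (the square() function is a made-up example):

```python
import concurrent.futures

def square(n):
    return n * n

results = {}
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # .submit() schedules a call and returns a Future
    futures = {executor.submit(square, n): n for n in range(3)}
    # as_completed() yields each Future as soon as it finishes,
    # regardless of submission order
    for future in concurrent.futures.as_completed(futures):
        results[futures[future]] = future.result()

print(results)
```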

Race Conditions

When handling threads, you will come across race conditions.

Race conditions occur when two or more threads access shared data or a resource at the same time.

Race conditions often happen only rarely, and they can produce different results on different runs.

This makes it difficult to debug them.

However, we will be dealing with a race condition that occurs every time.

We will write a class to update a database.

However, don’t worry about having a database, we will simply fake it!

We will give our class the name UpdateDatabase and add the .__init__() and .updateDB() methods to it:

import concurrent.futures
import logging
import threading
import time

class UpdateDatabase:
    def __init__(self):
        self.value = 0

    def updateDB(self, name):
        logging.info("Thread %s: starting update", name)
        local_copy = self.value
        local_copy += 1
        time.sleep(0.1)
        self.value = local_copy
        logging.info("Thread %s: finishing update", name)

The UpdateDatabase class keeps track of a single number, that is, .value.

This data is shared, hence, it will cause a race condition.

The .__init__() has helped us initialize the value of .value to 0.

We've also created the .updateDB() function.

It simulates how to read a value from a database, perform some computation on it, and then write the new value back to the database.

Since it’s a simulation, for reading from a database, we are copying .value to a local variable.

For the computation, we are adding 1 to the value then .sleep() for some time.

For writing to the database, we are copying our local variable back to .value.

You can use the fake database as follows:

if __name__ == "__main__":
    format = "%(asctime)s: %(message)s"
    logging.basicConfig(format=format, level=logging.INFO,
                        datefmt="%H:%M:%S")

    database = UpdateDatabase()
    logging.info("Testing update. Starting value is %d.", database.value)
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as thread_executor:
        for index in range(2):
            thread_executor.submit(database.updateDB, index)
    logging.info("Testing update. Ending value is %d.", database.value)

The code will create a ThreadPoolExecutor with two worker threads and then call .submit() twice, instructing the pool to execute database.updateDB().

The code will give output similar to the following upon execution (timestamps omitted):

Testing update. Starting value is 0.
Thread 0: starting update
Thread 1: starting update
Thread 0: finishing update
Thread 1: finishing update
Testing update. Ending value is 1.

We have two threads, each thread running updateDB() and adding 1 to .value.

So, you might have been expecting the value of database.value to be 2.

However, it’s not.


Because we have used threading.

The two threads have an interleaving access to the shared object, that is, .value.

Due to this, they are overwriting each other’s results.

Now that you know what race conditions are, let's discuss some of the ways that can help you avoid or handle them.

Basic Synchronization Using Lock

The lock is one of the mechanisms that you can use to avoid or solve race conditions.

It provides you with a way of allowing only one thread at a time into the read, modify, then write part of your code.

So, the lock works through the mechanism of mutual exclusion.

The lock operates in the same way as a hall pass.

Only one thread is allowed to have the lock at a time.

Any other thread that wants to acquire the lock is forced to wait for the current owner to give it up.

This is achieved using the .acquire() and .release() functions.

To acquire the lock, a thread should call my_lock.acquire().

If the lock is held by another thread, it has to wait for it to be released.

If a thread gets the lock and refuses to give it back, the program gets stuck.
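One way to avoid getting stuck forever is the timeout parameter of .acquire(); here is a small sketch (the 0.1-second timeout value is arbitrary):

```python
import threading

lock = threading.Lock()
lock.acquire()

# The lock is already held, so this call cannot succeed;
# the timeout makes it give up and return False instead of blocking forever
acquired = lock.acquire(timeout=0.1)
print("second acquire succeeded?", acquired)  # False

lock.release()
```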

However, the lock also operates like a context manager, hence, you can use it in the with statement.

This means that the lock will be released automatically when the with block exits.

Now, let’s add a lock to the UpdateDatabase class:

import concurrent.futures
import logging
import threading
import time

class UpdateDatabase:
    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def locked_updateDB(self, name):
        logging.info("Thread %s: starting update", name)
        logging.debug("Thread %s about to lock", name)
        with self._lock:
            logging.debug("Thread %s has lock", name)
            local_copy = self.value
            local_copy += 1
            time.sleep(0.1)
            self.value = local_copy
            logging.debug("Thread %s about to release lock", name)
        logging.debug("Thread %s after release", name)
        logging.info("Thread %s: finishing update", name)

The biggest change in the above code is the addition of the ._lock member, which is an object of threading.Lock(). Remember to update the __main__ section to submit database.locked_updateDB instead of database.updateDB.

When executed, the code should return output similar to the following (timestamps omitted):

Testing update. Starting value is 0.
Thread 0: starting update
Thread 1: starting update
Thread 0: finishing update
Thread 1: finishing update
Testing update. Ending value is 2.
Your program has just worked!

And look, the output is now 2, not 1.

The reason is that we’ve used a lock.

If you need to see the full logging, simply add the following statement after configuring the logging output in __main__:

logging.getLogger().setLevel(logging.DEBUG)
The code will now also print the debug messages, which show that Thread 0 acquired the lock first and Thread 1 acquired it afterwards.

Thread 1 had to wait for Thread 0 to release the lock in order to acquire it.


Deadlock

This is a common problem that is associated with locks.

As you know, when a thread calls .acquire(), it has to wait for the thread that is holding the lock to call .release().

Consider the code given below:

import threading

t = threading.Lock()
print("before first acquire")
t.acquire()
print("after first acquire")
t.acquire()
print("lock acquired twice")

So, what will happen when you run the above code?

The program will hang!

Ask me, why?

Because when it runs the second t.acquire(), it will have to wait for the lock to be released.

This will cause a deadlock.

The basic way to avoid this is by removing the second call to .acquire().

However, there are two subtle things that cause deadlocks to happen:

1. A bug in the code where a lock is not released properly.

2. A design issue in which a utility function will be called by functions that may or may not have the lock.

You can reduce the first situation by using a Lock as a context manager.

To solve the second issue, you can use RLock.

This is a Python object that allows a thread to acquire an RLock multiple times before calling .release().

The thread is still required to call .release() the same number of times that it called .acquire().
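A small sketch of RLock in action (the outer() and inner() functions are made up for illustration): a plain Lock would deadlock here, but an RLock lets the same thread acquire it twice.

```python
import threading

rlock = threading.RLock()
events = []

def inner():
    with rlock:  # second acquisition by the same thread: fine with an RLock
        events.append("inner has the lock")

def outer():
    with rlock:  # first acquisition
        events.append("outer has the lock")
        inner()  # helper that also takes the lock

outer()
print(events)
```

This is exactly the utility-function scenario from the second deadlock cause above.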


Final Thoughts

Here is what you’ve learned…

  • In Python, threading allows you to have parts of your program execute concurrently.

  • Threading makes the design of Python programs simple.

  • The Python standard library provides threading.

  • To start a separate thread, simply create an instance of Thread then invoke the .start() function.

  • If a thread is not a daemon, the program will wait for the thread to finish before it can terminate.

  • A daemon thread will shut down automatically once the program terminates.

  • Race conditions occur when two or more threads access a shared resource or piece of data.

  • You can use locks to avoid or solve race conditions.

If you enjoyed this article, be sure to join my Developer Monthly newsletter, where I send out the latest news from the world of Python and JavaScript.
Written on July 9, 2020