Python Performance Testing: Quick Tutorial and Best Practices

Ofer Dekel

Product Manager, Intel Granulate

What Is Python Performance Testing? 

Performance testing is an important part of the software development process. It verifies the application code’s stability, speed, scalability, and reliability.

The easiest way to make code perform better is to identify components that are slowing down the application. Usually, the code slowing down a system is just a small piece of the program, and fixing it is often easy once the critical code snippets are identified.

Python offers several tools to identify performance bottlenecks and track important metrics in your code. There are multiple tools that can help you test your Python code’s performance – this article will focus on a popular option, the Timeit library.

This is part of a series of articles about Python optimization.

Tutorial: Python Performance Testing Using the Timeit Library 

Python’s timeit module is used for timing code. It is intended for small snippets: it executes the code under test repeatedly, accepts statements (and optional setup code) as strings, and reports the total execution time in seconds.
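
For instance, a minimal call (adapted from the standard library documentation) times a one-line statement ten thousand times:

import timeit

# Time a short statement; number controls how many times it runs
print(timeit.timeit('"-".join(str(n) for n in range(100))', number=10000))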

The following Python code builds all pairs from lst1 and lst2 using nested for-loops:

output = []
lst1 = [1, 3, 5, 7, 9, 11]
lst2 = [2, 4, 6, 8, 10, 12]
for a in lst1:
    for b in lst2:
        output.append((a, b))

The corresponding list comprehension is as follows:

output = [(a, b) for a in lst1 for b in lst2]

We can determine which approach is faster: the for-loop or the list comprehension. timeit can be run from within Python code or from a command prompt; other than the interface, both methods produce the same results. To begin, let’s time the for-loop:

import timeit
test_code = '''
output = []
lst1 = [1, 3, 5, 7, 9, 11]
lst2 = [2, 4, 6, 8, 10, 12]
for a in lst1:
    for b in lst2:
        output.append((a, b))
'''

if __name__ == "__main__":
    time = timeit.timeit(test_code)
    print(time)

Results:

4.002317600010429

We import the module and store the code under test in a string; a triple-quoted string keeps multi-line code readable. The string is then passed to timeit.timeit(), which by default executes it one million times and returns the total elapsed time in seconds. Now for the list comprehension:

test_code = '''
lst1 = [1, 3, 5, 7, 9, 11]
lst2 = [2, 4, 6, 8, 10, 12]
output = [(a, b) for a in lst1 for b in lst2]
'''

if __name__ == "__main__":
    time = timeit.timeit(test_code)
    print(time)

Results:

3.0901128000114113

Given that the list comprehension requires only a single line of code, it could also be passed to timeit as an ordinary one-line string rather than a triple-quoted one. This does not affect performance; it is merely another way of passing the code under test. Either way, the list comprehension appears to be more efficient.
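
For reference, here is how the same measurement could be run from a command prompt, as mentioned above. This is a sketch of the command-line interface; the -s flag supplies setup code that runs once and is excluded from the timing:

# Time the list comprehension from the shell; setup code is not timed
python -m timeit -s "lst1 = [1, 3, 5, 7, 9, 11]; lst2 = [2, 4, 6, 8, 10, 12]" "[(a, b) for a in lst1 for b in lst2]"

The command-line interface chooses a suitable number of loops automatically and reports the time per loop rather than a total.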

Moreover, the sample size can be increased by overriding the default number of runs with the number argument:

test_code = '''
lst1 = [1, 3, 5, 7, 9, 11]
lst2 = [2, 4, 6, 8, 10, 12]
output = [(a, b) for a in lst1 for b in lst2]
'''

if __name__ == "__main__":
    time = timeit.timeit(test_code, number=2000000)
    print(time)

Doubling the number of iterations roughly doubles the measured time, so the runtime scales approximately linearly:

5.393972299993038

Re-timing the for-loop with the increased iteration count shows its runtime growing proportionally as well:

6.010824600001797

Across repeated cycles, the performance difference stays consistent. The timeit.repeat() function runs the entire timing test multiple times:

test_code = '''
lst1 = [1, 3, 5, 7, 9, 11]
lst2 = [2, 4, 6, 8, 10, 12]
output = [(a, b) for a in lst1 for b in lst2]
'''

if __name__ == "__main__":
    time = timeit.repeat(test_code, number=50000, repeat=5)
    print(time)

Result:

[0.12662399996770546, 0.16766209999332204, 0.13140149996615946, 0.14834039995912462, 0.15544729999965057]

The above examples show that the list comprehension is significantly faster than the equivalent for-loop.
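
When using repeat(), the standard library documentation recommends taking the minimum of the returned times, since higher values are usually caused by other processes interfering with the measurement rather than by the code itself. A small sketch building on the example above (the setup argument is another way to keep list creation out of the timed statement):

import timeit

setup_code = "lst1 = [1, 3, 5, 7, 9, 11]; lst2 = [2, 4, 6, 8, 10, 12]"
test_code = "[(a, b) for a in lst1 for b in lst2]"

# Take the fastest of five runs and convert it to a per-execution time
times = timeit.repeat(test_code, setup=setup_code, number=50000, repeat=5)
print(min(times) / 50000)  # seconds per single execution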

Learn more in our detailed guide to Python profiling

Python Performance Testing Best Practices 

Create an Isolated Testing Environment

Performance testing can stress or even break your application, so you should avoid running tests in a production environment. A separate environment should provide infrastructure similar to production – but with synthetic data sets – to enable safe testing.

Identify Test Scenarios 

All applications have scenarios that require testing. It is important to identify your application’s key requirements and issues so you can focus on testing relevant features and functions. Look at usage logs to determine which features are popular and risk creating a bottleneck. To recreate a realistic scenario, reproduce all the steps a user would perform in the real world.

Analyze and Test Again

You should collect data throughout each test to understand and report issues to the development team. It’s important to retest the application after the developers implement changes, ensuring they fix the issues identified during previous tests. Achieving adequate performance often requires running multiple tests and implementing multiple changes and iterations.

Track the Right Metrics

Each test must measure the relevant metrics to understand the application. Metrics should provide important data, such as response times, error rates, number of failed transactions, memory and CPU consumption, and wait times. This data helps explain how the number of concurrent users impacts performance.
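
As a minimal illustration of collecting two of these metrics in Python – this sketch uses only the standard library, and the measure_call helper and sample workload are hypothetical – response time and peak memory can be captured like this:

import time
import tracemalloc

def measure_call(func, *args, **kwargs):
    """Run func once and return (result, elapsed seconds, peak bytes)."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    _, peak = tracemalloc.get_traced_memory()  # (current, peak) allocations
    tracemalloc.stop()
    return result, elapsed, peak

# Hypothetical workload: build 100,000 pairs
_, elapsed, peak = measure_call(
    lambda: [(a, b) for a in range(100) for b in range(1000)]
)
print(f"response time: {elapsed:.4f}s, peak memory: {peak / 1024:.1f} KiB")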

Python Performance Testing with Granulate

Granulate’s code profiler is absolutely free and open-source (unlike most other tools on the market), and extremely easy to install and start using. Also, its user interface is intuitive and flexible, letting you select different time periods, filter processes, etc. 

One of the main qualities of the profiler’s UI is that it shows a unified view of your entire system, not just isolated areas written in a specific programming language. Granulate’s profiler also lets you share graphs and profiled data with other team members by inviting them to the profile, or by exporting it as an SVG image.

The profiler tracks CPU and memory utilization as well, displaying them in nice charts for easy viewing. From there, you can monitor for occasional spikes or observe a moment in time when a known performance drop occurred.

The continuous profiler is not limited to profiling Python in Docker containers; it supports all programming languages and runtimes, as well as orchestration systems like Kubernetes and Amazon Elastic Container Service. The profiler integrates smoothly into a system for greater observability at an aerial view, or you can drill down for granular detail. Additionally, it uses eBPF technology to minimize overhead (boasting a utilization penalty of less than 1%) and to make data transfer safer.
