Friday, September 15, 2017

Multithreading in Python

In the previous post, I investigated a way to preprocess images using multiple processes. In this post, I will investigate a way to preprocess images using multiple threads.

The code differs very little from the multiprocessing version: instead of the multiprocessing.Pool class, it uses the multiprocessing.pool.ThreadPool class. Below is the code:

from __future__ import print_function  # print() then works on Python 2 and 3

import glob
import time
import multiprocessing.pool as mp

import cv2


def process_img(image_file):
    # Skip edge maps written by an earlier run.
    if '_canny.jpg' not in image_file:
        print('processing', image_file)
        img = cv2.imread(image_file)
        img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        img = cv2.blur(img, (3, 3))
        img = cv2.Canny(img, 10, 25)
        # Write the result next to the input, e.g. a.jpg -> a_canny.jpg.
        cv2.imwrite(image_file[:-4] + '_canny.jpg', img)


def single_thread(image_files):
    start = time.time()
    for image_file in image_files:
        process_img(image_file)
    print('single thread elapsed time:', int(time.time() - start))


def multi_threads(image_files, nthr=2):
    start = time.time()
    pool = mp.ThreadPool(nthr)
    for image_file in image_files:
        pool.apply_async(process_img, (image_file,))
    # Stop accepting new tasks, then wait for the queued ones to finish.
    pool.close()
    pool.join()
    print(nthr, 'threads elapsed time:', int(time.time() - start))


if __name__ == '__main__':
    files = glob.glob('images/*.jpg')
    single_thread(files)
    multi_threads(files, nthr=2)
    multi_threads(files, nthr=4)
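One thing to watch with apply_async: an exception raised inside a worker is not reported unless get() is called on the returned AsyncResult. For example, cv2.imread returns None for an unreadable file, so the following cvtColor call would fail without anything being printed. Below is a minimal sketch of one way to collect such failures. The names process_item and run_all are hypothetical, and process_item is only a stand-in for process_img so that the example runs without OpenCV or image files.

```python
import multiprocessing.pool as mp


def process_item(name):
    # Hypothetical stand-in for process_img: fails on one input on purpose.
    if name == 'bad.jpg':
        raise ValueError('cannot read ' + name)
    return name[:-4] + '_canny.jpg'


def run_all(names, nthr=2):
    pool = mp.ThreadPool(nthr)
    # Keep each AsyncResult so that worker errors can be inspected afterwards.
    pending = [(name, pool.apply_async(process_item, (name,)))
               for name in names]
    pool.close()
    pool.join()

    done, failed = [], []
    for name, result in pending:
        try:
            # get() re-raises any exception that occurred in the worker.
            done.append(result.get())
        except ValueError as exc:
            failed.append((name, str(exc)))
    return done, failed
```

Calling run_all(['a.jpg', 'bad.jpg', 'c.jpg']) returns the two output names in one list and the failed input with its error message in the other. In the real script the except clause would probably need to catch cv2.error instead, which is an assumption about how the failure shows up.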
Multithreading ran slightly slower than multiprocessing, but the difference is small, so I am not sure this is always the case. The elapsed times, in seconds, were:

single thread elapsed time: 364
2 threads elapsed time: 184
4 threads elapsed time: 115
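To put these numbers in perspective, the speedup over the single-thread run can be computed directly from the elapsed times. The sketch below uses the three timings reported above. The helper names speedup and efficiency are my own, and efficiency here simply means speedup divided by the number of threads.

```python
def speedup(baseline_s, parallel_s):
    # How many times faster the parallel run is than the baseline.
    return baseline_s / float(parallel_s)


def efficiency(baseline_s, parallel_s, workers):
    # Fraction of the ideal (linear) speedup actually achieved.
    return speedup(baseline_s, parallel_s) / workers


single = 364                 # single thread, seconds
timings = {2: 184, 4: 115}   # number of threads -> seconds

for nthr, elapsed in sorted(timings.items()):
    print('%d threads: speedup %.2f, efficiency %.2f'
          % (nthr, speedup(single, elapsed),
             efficiency(single, elapsed, nthr)))
```

With two threads the speedup is about 1.98, which is close to ideal. With four threads it is about 3.17, or roughly 79% of a perfect fourfold gain.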
