Because of its global interpreter lock (GIL), Python can run threads but can't execute Python bytecode in more than one of them at a time, so multithreading buys you nothing for CPU-bound work. To me, this is a ridiculous limitation that should be gotten rid of post-haste: a programming language is not modern unless it supports true multithreading.
Python does support multiprocessing, but the straightforward way of using it requires you to pass data between processes by pickling and unpickling rather than by sharing memory. Needless to say, this slows down execution when processes need to share large amounts of data.
In my case, I’ve been using multiprocessing to speed up the training of a recursive autoencoder: instead of training one autoencoder, I train an ensemble of around 30 simultaneously, periodically average their parameters together, and use the averages as the initialization points for further training of all the models. Pickling the parameters here doesn’t seem to be that costly, because the optimization itself dominates the runtime. However, it’s nice to note that there’s a relatively simple workaround that lets you avoid the pickling/unpickling step: create the shared array as a ctypes object and pass it into the initializer of each worker process, making it a global variable in each process. Here’s some example code:
```python
from multiprocessing import Pool, sharedctypes
import warnings

import numpy as np

def populate(args):
    """Populate the portion of the shared array arr indicated by
    the offset and blocksize arguments."""
    offset, blocksize = args
    with warnings.catch_warnings():
        warnings.simplefilter('ignore', RuntimeWarning)
        v = np.ctypeslib.as_array(arr)
    for idx in range(offset, offset + blocksize):
        v[idx, :] = offset
    return offset

def _init(arr_to_populate):
    """Each pool process calls this initializer. Load the array to be
    populated into that process's global namespace."""
    global arr
    arr = arr_to_populate

size = 2000
blocksize = 100

# Build a ctypes array backed by shared memory, seeded from a numpy array
tmp = np.ctypeslib.as_ctypes(np.zeros((size, 10)))
shared_arr = sharedctypes.Array(tmp._type_, tmp, lock=False)

pool = Pool(processes=4, initializer=_init, initargs=(shared_arr,))
result = pool.map(populate,
                  zip(range(0, size, blocksize),
                      [blocksize] * (size // blocksize)))
pool.close()
pool.join()

with warnings.catch_warnings():
    warnings.simplefilter('ignore', RuntimeWarning)
    final_arr = np.ctypeslib.as_array(shared_arr)
print(final_arr)
```
The warnings filter is there because a PEP 3118 buffer-compliance bug in ctypes gives rise to some superfluous RuntimeWarnings when NumPy wraps the shared buffer.