resqpy.multi_processing.function_multiprocessing¶
- resqpy.multi_processing.function_multiprocessing(function: Callable, kwargs_list: List[Dict[str, Any]], recombined_epc: Union[Path, str], cluster, consolidate: bool = True, require_success=False, tmp_dir_path: Union[Path, str] = '.', backend: str = 'dask', clean_up: bool = True) List[bool] ¶
Calls a function concurrently with the specified arguments.
- Parameters
function (Callable) – the wrapper function to be called; it must return a tuple of: - index (int): the index of the kwargs dict in the kwargs_list; - success (bool): whether the function call was successful, however that is defined; - epc_file (Path or str): the epc file path where the objects are stored; - uuid_list (list of str): list of UUIDs of relevant objects
kwargs_list (list of dict) – A list of keyword argument dictionaries that are used when calling the function
recombined_epc (Path or str) – A pathlib Path or path string of where the combined epc will be saved
cluster – if using the Dask backend, a LocalCluster for running on a local machine or, if using a job queuing system, a JobQueueCluster such as an SGECluster, SLURMCluster, PBSCluster or LSFCluster
consolidate (bool) – if True and an equivalent part already exists in a model, it is not duplicated and the uuids are noted as equivalent
require_success (bool) – if True and any instance fails, then an exception is raised
tmp_dir_path (Path or str) – path of the directory in which the temporary directory is created; defaults to the directory of the calling code
backend (str) – the joblib parallel backend to use; 'dask' by default, in which case a Dask cluster must be passed to the cluster argument
clean_up (bool, default True) – if True, the temporary directory used during multi processing is deleted; if False, it is left in place with its contents (to facilitate debugging)
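The wrapper passed as the function argument must honour the return contract described above. A minimal sketch follows; the function name, its arguments and the placeholder work are all hypothetical, and a real wrapper would create or load a resqpy Model and write its objects to the epc file:

```python
from typing import List, Tuple


def example_wrapper(index: int, value: float,
                    epc_file: str = 'wrapper.epc') -> Tuple[int, bool, str, List[str]]:
    """Hypothetical wrapper returning (index, success, epc_file, uuid_list)."""
    uuid_list: List[str] = []  # UUIDs of objects created by the real work
    success = True
    try:
        _ = value * 2.0  # placeholder for the real work
    except Exception:
        success = False
    return index, success, epc_file, uuid_list
```

Each dictionary in kwargs_list supplies the keyword arguments for one such call.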
- Returns
success_list (List[bool]) – A boolean list of successful function calls
Notes
- a multiprocessing pool is used to call the function multiple times in parallel; once all results are returned, they are combined into a single epc file
- this function uses the Dask backend by default to run the given function in parallel, so a Dask cluster must be set up and passed as an argument if Dask is used
- Dask must be installed in the Python environment, as it is not a dependency of the project
- more info can be found at https://resqpy.readthedocs.io/en/latest/tutorial/multiprocessing.html
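Putting the pieces together, a run might be set up as sketched below. The wrapper, the kwargs values and the file names are hypothetical; the dask and resqpy imports are deferred to the script guard so that the kwargs_list construction can be read (and exercised) on its own:

```python
from typing import Any, Dict, List


def my_wrapper(index: int, value: float) -> tuple:
    # hypothetical wrapper obeying the required return contract:
    # (index, success, epc_file, uuid_list)
    return index, True, f'job_{index}.epc', []


def build_kwargs_list(values: List[float]) -> List[Dict[str, Any]]:
    # one keyword argument dictionary per concurrent call of the wrapper
    return [{'index': i, 'value': v} for i, v in enumerate(values)]


if __name__ == '__main__':
    from dask.distributed import LocalCluster
    from resqpy.multi_processing import function_multiprocessing

    cluster = LocalCluster(n_workers = 4)  # local Dask cluster
    success_list = function_multiprocessing(
        my_wrapper,
        build_kwargs_list([0.1, 0.2, 0.3]),
        recombined_epc = 'combined.epc',
        cluster = cluster)
```

On completion, success_list holds one boolean per entry in kwargs_list and the combined objects are in combined.epc.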