    dm kcopyd: fix callback race · 340cd444
    Mikulas Patocka authored
    If the thread calling dm_kcopyd_copy is delayed due to scheduling inside
    split_job/segment_complete and the subjobs complete before the loop in
    split_job completes, the kcopyd callback could be invoked from the
    thread that called dm_kcopyd_copy instead of the kcopyd workqueue.
    
    dm_kcopyd_copy -> split_job -> segment_complete -> job->fn()
    
    Snapshots depend on callbacks being called from the single-threaded
    kcopyd workqueue and expect that individual callbacks never race with
    each other. Racing callbacks can corrupt the exception store, and can
    also cause exception store callbacks to be called twice for the same
    exception, a likely reason for the crashes reported inside
    pending_complete() / remove_exception().
    
    This patch fixes two problems:
    
    1. job->fn being called from the thread that submitted the job (see above).
    
    - Fix: hand over the completion callback to the kcopyd thread.
    
    2. job->fn(read_err, write_err, job->context); in segment_complet...
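
The first problem and its fix can be illustrated with a small, deterministic user-space model. This is only a sketch, not the kernel code: `enum ctx`, `complete_queue`, `worker_run`, `run_scenario`, and the two `segment_complete_*` variants are hypothetical names invented for the illustration, and the real patch's data structures and locking are not reproduced here. The model assumes the last sub-job happens to complete while the submitting thread is still inside its split loop, which is the scenario the message describes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical execution contexts used only by this model. */
enum ctx { CTX_NONE, CTX_SUBMITTER, CTX_WORKER };

struct job {
    int sub_jobs;          /* sub-jobs still outstanding */
    enum ctx callback_ctx; /* context in which the callback ran */
    int callback_calls;    /* how many times the callback ran */
    struct job *next;      /* link in the completion queue */
};

static enum ctx current_ctx = CTX_NONE;
static struct job *complete_queue = NULL;

/* Stand-in for job->fn(): records the context it was called from. */
static void job_fn(struct job *job)
{
    assert(job->sub_jobs == 0); /* only valid once all sub-jobs are done */
    job->callback_ctx = current_ctx;
    job->callback_calls++;
}

/* Racy variant: the last completion calls the callback directly,
 * in whatever context happens to be running. */
static void segment_complete_direct(struct job *job)
{
    if (--job->sub_jobs == 0)
        job_fn(job);
}

/* Handover variant: the last completion only queues the job;
 * the callback is left to the worker. */
static void segment_complete_handover(struct job *job)
{
    if (--job->sub_jobs == 0) {
        job->next = complete_queue;
        complete_queue = job;
    }
}

/* Stand-in for the single-threaded worker: drains the queue and
 * runs each callback in worker context. */
static void worker_run(void)
{
    current_ctx = CTX_WORKER;
    while (complete_queue) {
        struct job *job = complete_queue;
        complete_queue = job->next;
        job->next = NULL;
        job_fn(job);
    }
    current_ctx = CTX_NONE;
}

/* Models a submitter whose two sub-jobs both complete while it is
 * still in its split loop, then lets the worker run. */
static struct job run_scenario(int handover)
{
    struct job job = { .sub_jobs = 2, .callback_ctx = CTX_NONE,
                       .callback_calls = 0, .next = NULL };
    int i;

    current_ctx = CTX_SUBMITTER;
    for (i = 0; i < 2; i++) {
        if (handover)
            segment_complete_handover(&job);
        else
            segment_complete_direct(&job);
    }
    worker_run();
    return job;
}
```

In this model, `run_scenario(0)` leaves `callback_ctx` set to `CTX_SUBMITTER`, mirroring the dm_kcopyd_copy -> split_job -> segment_complete -> job->fn() chain, while `run_scenario(1)` leaves it set to `CTX_WORKER`. The callback runs exactly once in both cases; what changes is which context runs it.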
dm-kcopyd.c 13.9 KB