[LinuxPPS] 3 Feb 19:26 - [PATCH] aio: fix buggy put_ioctx call in aio_complete - v2

linuxpps: Ken Chen <> webmaster at enneenne.com
Wed Feb 7 10:45:08 CET 2007


[PATCH] aio: fix buggy put_ioctx call in aio_complete - v2  
  
An AIO bug was reported in which a sleeping function is called in softirq  
context:  
  
BUG: warning at kernel/mutex.c:132/__mutex_lock_common()  
Call Trace:  
[] __mutex_lock_slowpath+0x640/0x6c0  
[] mutex_lock+0x20/0x40  
[] flush_workqueue+0xb0/0x1a0  
[] __put_ioctx+0xc0/0x240  
[] aio_complete+0x2f0/0x420  
[] finished_one_bio+0x200/0x2a0  
[] dio_bio_complete+0x1c0/0x200  
[] dio_bio_end_aio+0x60/0x80  
[] bio_endio+0x110/0x1c0  
[] __end_that_request_first+0x180/0xba0  
[] end_that_request_chunk+0x30/0x60  
[] scsi_end_request+0x50/0x300 [scsi_mod]  
[] scsi_io_completion+0x200/0x8a0 [scsi_mod]  
[] sd_rw_intr+0x330/0x860 [sd_mod]  
[] scsi_finish_command+0x100/0x1c0 [scsi_mod]  
[] scsi_softirq_done+0x230/0x300 [scsi_mod]  
[] blk_done_softirq+0x160/0x1c0  
[] __do_softirq+0x200/0x240  
[] do_softirq+0x70/0xc0  
  
See report: http://marc.theaimsgroup.com/?l=linux-kernel&m=116599593200888&w=2  
  
flush_workqueue() must not be called in softirq context because it can sleep.  
However, aio_complete(), when called from an I/O interrupt, can call  
put_ioctx() while holding the last reference to the ioctx, which triggers the  
bug. It is simply incorrect to free the ioctx from aio_complete().  
  
The bug can be triggered by a race between io_destroy() and aio_complete().  
A possible scenario:  
  
cpu0                               cpu1  
io_destroy                         aio_complete  
  wait_for_all_aios {                __aio_put_req  
     ...                                 ctx->reqs_active--;  
     if (!ctx->reqs_active)  
        return;  
  }  
  ...  
  put_ioctx(ioctx)  
  
                                     put_ioctx(ctx);  
                                        __put_ioctx  
                                          bam! Bug trigger!  
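
The interleaving above can be replayed deterministically in a small userspace
model. This is only a sketch, not kernel code: `struct model_ctx`,
`model_put()` and `replay_race()` are hypothetical names, and the starting
count of two references (one for the owner, one for the in-flight request) is
a simplifying assumption. It shows that the final reference is dropped from
the completion path, which in the kernel runs in softirq context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified model of an ioctx; not the kernel's struct kioctx. */
struct model_ctx {
    int users;              /* reference count on the context */
    int reqs_active;        /* requests still in flight */
    bool freed_in_softirq;  /* true if the final put came from the completion path */
};

/* Drop one reference and record which path performed the final put. */
static void model_put(struct model_ctx *ctx, bool in_softirq)
{
    assert(ctx->users > 0);
    if (--ctx->users == 0)
        ctx->freed_in_softirq = in_softirq;
}

/*
 * Replay the cpu0/cpu1 interleaving step by step.
 * Returns true if the completion path ends up dropping the last reference.
 */
bool replay_race(void)
{
    /* Assumption: one reference for the owner, one for the in-flight request. */
    struct model_ctx ctx = { .users = 2, .reqs_active = 1, .freed_in_softirq = false };

    /* cpu1: __aio_put_req() decrements the in-flight counter. */
    ctx.reqs_active--;

    /* cpu0: the unlocked check in wait_for_all_aios() already sees zero. */
    if (ctx.reqs_active != 0)
        return false;

    /* cpu0: io_destroy() drops its reference; the request still holds one. */
    model_put(&ctx, false);

    /* cpu1: aio_complete() drops the request's reference, which is now the last. */
    model_put(&ctx, true);

    return ctx.freed_in_softirq;
}
```

In this model `replay_race()` reports that the completion path performed the
final put, which corresponds to `__put_ioctx` being reached from
`aio_complete` in the trace.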
  
The real problem is that the check of ctx->reqs_active in  
wait_for_all_aios() is incorrect: access to reqs_active is not  
properly protected by the spin lock.  
  
This patch adds that protective spin lock and, at the same time, removes  
all duplicate ref counting for each kiocb, since reqs_active is already used  
as a ref count for each active ioctx. This also ensures that the buggy call  
to flush_workqueue() in softirq context is eliminated.  
  
Signed-off-by: "Ken Chen"   
Cc: Zach Brown   
Cc: Suparna Bhattacharya   
Cc: Benjamin LaHaise   
Cc: Badari Pulavarty   
Cc:   
Acked-by: Jeff Moyer   
Signed-off-by: Andrew Morton   
Signed-off-by: Linus Torvalds   
  
fs/aio.c

URL: http://gitweb.enneenne.com/?p=linuxpps;a=commit;h=dee11c2364f51cac53df17d742a0c69097e29a4e


