In the Linux kernel, the following vulnerability has been resolved:

dm-raid456, md/raid456: fix a deadlock for dm-raid456 while io concurrent with reshape

For raid456, if a reshape is still in progress, IO that crosses the reshape position waits for the reshape to make progress. However, for dm-raid, in the following cases the reshape will never make progress, hence the IO will hang:

1) the array is read-only;
2) MD_RECOVERY_WAIT is set;
3) MD_RECOVERY_FROZEN is set;

After commit c467e97f079f ("md/raid6: use valid sector values to determine if an I/O should wait on the reshape") fixed the problem that IO across the reshape position didn't wait for the reshape, the dm-raid test shell/lvconvert-raid-reshape.sh started to hang:

[root@fedora ~]# cat /proc/979/stack
[<0>] wait_woken+0x7d/0x90
[<0>] raid5_make_request+0x929/0x1d70 [raid456]
[<0>] md_handle_request+0xc2/0x3b0 [md_mod]
[<0>] raid_map+0x2c/0x50 [dm_raid]
[<0>] __map_bio+0x251/0x380 [dm_mod]
[<0>] dm_submit_bio+0x1f0/0x760 [dm_mod]
[<0>] __submit_bio+0xc2/0x1c0
[<0>] submit_bio_noacct_nocheck+0x17f/0x450
[<0>] submit_bio_noacct+0x2bc/0x780
[<0>] submit_bio+0x70/0xc0
[<0>] mpage_readahead+0x169/0x1f0
[<0>] blkdev_readahead+0x18/0x30
[<0>] read_pages+0x7c/0x3b0
[<0>] page_cache_ra_unbounded+0x1ab/0x280
[<0>] force_page_cache_ra+0x9e/0x130
[<0>] page_cache_sync_ra+0x3b/0x110
[<0>] filemap_get_pages+0x143/0xa30
[<0>] filemap_read+0xdc/0x4b0
[<0>] blkdev_read_iter+0x75/0x200
[<0>] vfs_read+0x272/0x460
[<0>] ksys_read+0x7a/0x170
[<0>] __x64_sys_read+0x1c/0x30
[<0>] do_syscall_64+0xc6/0x230
[<0>] entry_SYSCALL_64_after_hwframe+0x6c/0x74

This is because the reshape can't make progress.

For md/raid, the problem doesn't exist because registering a new sync_thread no longer relies on the IO being done:

1) If the array is read-only, it can be switched to read-write by ioctl/sysfs;
2) md/raid never sets MD_RECOVERY_WAIT;
3) If MD_RECOVERY_FROZEN is set, mddev_suspend() doesn't hold 'reconfig_mutex', hence the flag can be cleared and the reshape can continue through the sysfs api 'sync_action'.

However, I'm not sure yet how to avoid the problem in dm-raid. This patch, on the one hand, makes sure that sync_thread() can't be changed through raid_message() after presuspend(); on the other hand, it detects the above 3 cases before waiting for IO to be done in dm_suspend(), and lets dm-raid requeue those IO.
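The decision described above can be sketched as a small userspace model in C. This is not the kernel patch: the names struct array_state, reshape_stalled() and classify_io() are hypothetical, the boolean fields only stand in for the read-only state and the MD_RECOVERY_WAIT / MD_RECOVERY_FROZEN bits, and treating IO that does not cross the reshape position as free to proceed is a simplifying assumption.

```c
#include <stdbool.h>

/* Hypothetical model; these names do not exist in the kernel. */
enum io_action {
    IO_PROCEED,          /* IO is not affected by the reshape */
    IO_WAIT_FOR_RESHAPE, /* reshape can progress, so waiting is safe */
    IO_REQUEUE           /* reshape is stalled, hand the IO back */
};

struct array_state {
    bool reshape_active;  /* a reshape is still in progress */
    bool read_only;       /* case 1: the array is read-only */
    bool recovery_wait;   /* case 2: stands in for MD_RECOVERY_WAIT */
    bool recovery_frozen; /* case 3: stands in for MD_RECOVERY_FROZEN */
};

/* True when any of the three listed cases keeps the reshape from progressing. */
static bool reshape_stalled(const struct array_state *s)
{
    if (!s->reshape_active)
        return false;
    return s->read_only || s->recovery_wait || s->recovery_frozen;
}

/* Decide what to do with an IO, given whether it crosses the reshape position. */
static enum io_action classify_io(const struct array_state *s,
                                  bool crosses_reshape_pos)
{
    if (!s->reshape_active || !crosses_reshape_pos)
        return IO_PROCEED;
    if (reshape_stalled(s))
        return IO_REQUEUE; /* waiting here would never finish */
    return IO_WAIT_FOR_RESHAPE;
}
```

In this model, an IO that crosses the reshape position while the array is frozen is classified as IO_REQUEUE rather than IO_WAIT_FOR_RESHAPE, which mirrors the idea of requeueing instead of blocking in raid5_make_request().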
This Cyber News was published on www.tenable.com. Publication date: Thu, 02 May 2024 06:56:04 +0000