Add f2fs support #7

Open
wants to merge 22 commits into
base: lineage-18.1

haridhayal11

No description provided.

Jaegeuk Kim and others added 22 commits February 23, 2021 19:06
Cherry-picked from origin/upstream-f2fs-stable-linux-4.4.y:

ba1ade71012d fscrypt: resolve some cherry-pick bugs
9e32f17d241b fscrypt: move to generic async completion
4ecacbed6e1c crypto: introduce crypto wait for async op
42d89da82b25 fscrypt: lock mutex before checking for bounce page pool
2286508d17c2 fscrypt: new helper function - fscrypt_prepare_setattr()
5cbdd42ad248 fscrypt: new helper function - fscrypt_prepare_lookup()
a31feba5c18f fscrypt: new helper function - fscrypt_prepare_rename()
95efafb6239d fscrypt: new helper function - fscrypt_prepare_link()
2b4b4f98dddf fscrypt: new helper function - fscrypt_file_open()
8c815f381cd6 fscrypt: new helper function - fscrypt_require_key()
272e43502577 fscrypt: remove unneeded empty fscrypt_operations structs
1034eeec516a fscrypt: remove ->is_encrypted()
32c0d3ae9d66 fscrypt: switch from ->is_encrypted() to IS_ENCRYPTED()
a4781dd1f175 fs, fscrypt: add an S_ENCRYPTED inode flag
ff0a3dbc9392 fscrypt: clean up include file mess
bc4a61c60bea fscrypt: fix dereference of NULL user_key_payload
a53dc7e00559 fscrypt: make ->dummy_context() return bool

Change-Id: I461d742adc7b77177df91429a1fd9c8624a698d6
Signed-off-by: Jaegeuk Kim <[email protected]>
Pull f2fs updates from Jaegeuk Kim:
 "In this round, we've followed up to support some generic features such
  as cgroup, block reservation, linking fscrypt_ops, delivering
  write_hints, and some ioctls. And, we could fix some corner cases in
  terms of power-cut recovery and subtle deadlocks.

  Enhancements:
   - bitmap operations to handle NAT blocks
   - readahead to improve readdir speed
   - switch to use fscrypt_*
   - apply write hints for direct IO
   - add reserve_root=%u,resuid=%u,resgid=%u to reserve blocks for root/uid/gid
   - modify b_avail and b_free to consider root reserved blocks
   - support cgroup writeback
   - support FIEMAP_FLAG_XATTR for fibmap
   - add F2FS_IOC_PRECACHE_EXTENTS to pre-cache extents
   - add F2FS_IOC_{GET/SET}_PIN_FILE to pin LBAs for data blocks
   - support inode creation time

  Bug fixes:
   - sysfile-based quota operations
   - memory footprint accounting
   - allow to write data on partial preallocation case
   - fix deadlock case on fallocate
   - fix to handle fill_super errors
   - fix missing inode updates of fsync'ed file
   - recover renamed file which was fsync'ed before
   - drop inmemory pages in corner error case
   - keep last_disk_size correctly
   - recover missing i_inline flags during roll-forward

  Various clean-up patches were added as well"

Cherry-pick from origin/upstream-f2fs-stable-linux-4.4.y:

5f9b3abb911f f2fs: support inode creation time
9fb0de175172 f2fs: rebuild sit page from sit info in mem
1062a0c01829 f2fs: stop issuing discard if fs is readonly
fa043fae9030 f2fs: clean up duplicated assignment in init_discard_policy
b007190234d6 f2fs: use GFP_F2FS_ZERO for cleanup
35b11839a1ae f2fs: allow to recover node blocks given updated checkpoint
e56500860be0 f2fs: recover some i_inline flags
64aa9569a1bf f2fs: correct removexattr behavior for null valued extended attribute
70b3a923daff f2fs: drop page cache after fs shutdown
8069a0e983d9 f2fs: stop gc/discard thread after fs shutdown
bb924f777717 f2fs: hanlde error case in f2fs_ioc_shutdown
700b53f21ee8 f2fs: split need_inplace_update
f31d52811c1f f2fs: fix to update last_disk_size correctly
eeb0118b8340 f2fs: kill F2FS_INLINE_XATTR_ADDRS for cleanup
c1b74c967092 f2fs: clean up error path of fill_super
d5efd57e013b f2fs: avoid hungtask when GC encrypted block if io_bits is set
c4027d08430b f2fs: allow quota to use reserved blocks
18d267c273a9 f2fs: fix to drop all inmem pages correctly
4dca47531eb0 f2fs: speed up defragment on sparse file
999f806a7c9e f2fs: support F2FS_IOC_PRECACHE_EXTENTS
84960fca96c4 f2fs: add an ioctl to disable GC for specific file
292c8e1cfd4d f2fs: prevent newly created inode from being dirtied incorrectly
58b1f5b0fcf1 f2fs: support FIEMAP_FLAG_XATTR
6afa9a94d09b f2fs: fix to cover f2fs_inline_data_fiemap with inode_lock
10f4a4140b61 f2fs: check node page again in write end io
b203c58dfd55 f2fs: fix to caclulate required free section correctly
d49132d45cb0 f2fs: handle newly created page when revoking inmem pages
2ce6b9d8167e f2fs: add resgid and resuid to reserve root blocks
f53dcf6799ab f2fs: implement cgroup writeback support
1338f376d5a3 f2fs: remove unused pend_list_tag
d4f19f6266ab f2fs: avoid high cpu usage in discard thread
b78e9302e2e3 f2fs: make local functions static
62438ba87b79 f2fs: add reserved blocks for root user
06a366757ff7 f2fs: check segment type in __f2fs_replace_block
4c6bc4be375a f2fs: update inode info to inode page for new file
591b33638733 f2fs: show precise # of blocks that user/root can use
b242d7edc537 f2fs: clean up unneeded declaration
87b8168e9ef0 f2fs: continue to do direct IO if we only preallocate partial blocks
2b4d859bd9d8 f2fs: enable quota at remount from r to w
54bf13a0adcd f2fs: skip stop_checkpoint for user data writes
25ef3006ba23 f2fs: fix missing error number for xattr operation
cff2c7fe417b f2fs: recover directory operations by fsync
e2bb618a0a6b f2fs: return error during fill_super
8a2c11d8658d f2fs: fix an error case of missing update inode page
cd38d5ada5a4 f2fs: fix potential hangtask in f2fs_trace_pid
e81cafbeba4b f2fs: no need return value in restore summary process
04d44000d633 f2fs: use unlikely for release case
925d0933d8f0 f2fs: don't return value in truncate_data_blocks_range
f7986c416d1b f2fs: clean up f2fs_map_blocks
e4f5e26cdadf f2fs: clean up hash codes
1f994d47080c f2fs: fix error handling in fill_super
e7db649b5fb1 f2fs: spread f2fs_k{m,z}alloc
5d4e487b9929 f2fs: inject fault to kvmalloc
8b33886c37cd f2fs: inject fault to kzalloc
d94680798786 f2fs: remove a redundant conditional expression
3bc01114a338 f2fs: apply write hints to select the type of segment for direct write
c80f01959114 f2fs: switch to fscrypt_prepare_setattr()
bb8b850365ff f2fs: switch to fscrypt_prepare_lookup()
9ab470eaf8a8 f2fs: switch to fscrypt_prepare_rename()
aeaac517a12d f2fs: switch to fscrypt_prepare_link()
101c6a96ad1c f2fs: switch to fscrypt_file_open()
6d025237a1f8 f2fs: remove repeated f2fs_bug_on
b01e03d724de f2fs: remove an excess variable
e1f9be2f7c82 f2fs: fix lock dependency in between dio_rwsem & i_mmap_sem
e5c7c8601030 f2fs: remove unused parameter
f130dbb98a68 f2fs: still write data if preallocate only partial blocks
47ee9b259811 f2fs: introduce sysfs readdir_ra to readahead inode block in readdir
55e2f89181ce f2fs: fix concurrent problem for updating free bitmap
e1398f6554b4 f2fs: remove unneeded memory footprint accounting
2d69561135f2 f2fs: no need to read nat block if nat_block_bitmap is set
4dd2d0733809 f2fs: reserve nid resource for quota sysfile

Signed-off-by: Jaegeuk Kim <[email protected]>
Pull f2fs update from Jaegeuk Kim:
 "In this round, we've mainly focused on performance tuning and critical
  bug fixes occurred in low-end devices. Sheng Yong introduced
  lost_found feature to keep missing files during recovery instead of
  thrashing them. We're preparing coming fsverity implementation. And,
  we've got more features to communicate with users for better
  performance. In low-end devices, some memory-related issues were
  fixed, and subtle race conditions and corner cases were addressed as
  well.

  Enhancements:
   - large nat bitmaps for more free node ids
   - add three block allocation policies to pass down write hints given by user
   - expose extension list to user and introduce hot file extension
   - tune small devices seamlessly for low-end devices
   - set readdir_ra by default
   - give more resources under gc_urgent mode regarding to discard and cleaning
   - introduce fsync_mode to enforce posix or not
   - nowait aio support
   - add lost_found feature to keep dangling inodes
   - reserve bits for future fsverity feature
   - add test_dummy_encryption for FBE

  Bug fixes:
   - don't use highmem for dentry pages
   - align memory boundary for bitops
   - truncate preallocated blocks in write errors
   - guarantee i_times on fsync call
   - clear CP_TRIMMED_FLAG correctly
   - prevent node chain loop during recovery
   - avoid data race between atomic write and background cleaning
   - avoid unnecessary selinux violation warnings on resgid option
   - GFP_NOFS to avoid deadlock in quota and read paths
   - fix f2fs_skip_inode_update to allow i_size recovery

  In addition to the above, there are several minor bug fixes and clean-ups"

Cherry-pick from origin/upstream-f2fs-stable-linux-4.4.y:

42bf67fc543b f2fs: remain written times to update inode during fsync
6cb5aa02bfbd f2fs: make assignment of t->dentry_bitmap more readable
a8d07f1f9c62 f2fs: truncate preallocated blocks in error case
86444d600692 f2fs: fix a wrong condition in f2fs_skip_inode_update
db2188a68704 f2fs: reserve bits for fs-verity
ee2e74b3f00e f2fs: Add a segment type check in inplace write
0192e0a4502f f2fs: no need to initialize zero value for GFP_F2FS_ZERO
49338842e9b2 f2fs: don't track new nat entry in nat set
d6a69d5e6568 f2fs: clean up with F2FS_BLK_ALIGN
2c8834a7a2c9 f2fs: check blkaddr more accuratly before issue a bio
6ab573a9d96f f2fs: Set GF_NOFS in read_cache_page_gfp while doing f2fs_quota_read
7419dcb8be02 f2fs: introduce a new mount option test_dummy_encryption
9321e22c038c f2fs: introduce F2FS_FEATURE_LOST_FOUND feature
8a5719615847 f2fs: release locks before return in f2fs_ioc_gc_range()
739ace131cdf f2fs: align memory boundary for bitops
4c55abe4f8d2 f2fs: remove unneeded set_cold_node()
30654507e0a2 f2fs: add nowait aio support
d909e9410634 f2fs: wrap all options with f2fs_sb_info.mount_opt
5738be52b3e8 f2fs: Don't overwrite all types of node to keep node chain
0bdeb167c843 f2fs: introduce mount option for fsync mode
6bc490f0eedc f2fs: fix to restore old mount option in ->remount_fs
0c9c3e034410 f2fs: wrap sb_rdonly with f2fs_readonly
6c6611223a79 f2fs: avoid selinux denial on CAP_SYS_RESOURCE
076a6f32fe5d f2fs: support hot file extension
58edcdbca67a f2fs: fix to avoid race in between atomic write and background GC
1e0aeb0af9ed f2fs: do gc in greedy mode for whole range if gc_urgent mode is set
10b2d001d6ac f2fs: issue discard aggressively in the gc_urgent mode
a5052f32b940 f2fs: set readdir_ra by default
1aa536a624cc f2fs: add auto tuning for small devices
0ffdffc8f106 f2fs: add mount option for segment allocation policy
b79829891249 f2fs: don't stop GC if GC is contended
766d2321697f f2fs: expose extension_list sysfs entry
98b329de5026 f2fs: fix to set KEEP_SIZE bit in f2fs_zero_range
4d409fa3346b f2fs: introduce sb_lock to make encrypt pwsalt update exclusive
1f6bac14c100 f2fs: remove redundant initialization of pointer 'p'
946aefc7545d f2fs: flush cp pack except cp pack 2 page at first
e5081a52ac09 f2fs: clean up f2fs_sb_has_xxx functions
a292477154b5 f2fs: remove redundant check of page type when submit bio
190e64a819df f2fs: fix to handle looped node chain during recovery
889d98087652 f2fs: handle quota for orphan inodes
92b12bb1a23e f2fs: support passing down write hints to block layer with F2FS policy
22fa74c2b097 f2fs: support passing down write hints given by users to block layer
180900373ec1 f2fs: fix to clear CP_TRIMMED_FLAG
0671fae134bb f2fs: support large nat bitmap
eceb943d5d59 f2fs: fix to check extent cache in f2fs_drop_extent_tree
2e2a339c9853 f2fs: restrict inline_xattr_size configuration
41dda1164137 f2fs: fix heap mode to reset it back
39575737bb62 f2fs: fix potential corruption in area before F2FS_SUPER_OFFSET
7e0e7995ee97 fscrypt: fix build with pre-4.6 gcc versions
31d3279a4fca fscrypt: fix up fscrypt_fname_encrypted_size() for internal use
82bec888567b fscrypt: define fscrypt_fname_alloc_buffer() to be for presented names
168a90782888 fscrypt: calculate NUL-padding length in one place only
042ae9f4cfbf fscrypt: move fscrypt_symlink_data to fscrypt_private.h
f9550c24c20e fscrypt: remove fscrypt_fname_usr_to_disk()
7ac4756a2474 f2fs: switch to fscrypt_get_symlink()
6b76f58e24bd f2fs: switch to fscrypt ->symlink() helper functions
fd457d2c4e04 fscrypt: new helper function - fscrypt_get_symlink()
a1cdacb7ae0d fscrypt: new helper functions for ->symlink()
7f43602f4d10 fscrypt: trim down fscrypt.h includes
d9cadc11bdcf fscrypt: move fscrypt_is_dot_dotdot() to fs/crypto/fname.c
e6fe930580cb fscrypt: move fscrypt_valid_enc_modes() to fscrypt_private.h
efefa434f47e fscrypt: move fscrypt_operations declaration to fscrypt_supp.h
7ed178bc8ae9 fscrypt: split fscrypt_dummy_context_enabled() into supp/notsupp versions
3f16e09dadfb fscrypt: move fscrypt_ctx declaration to fscrypt_supp.h
8216a0b51a3b fscrypt: move fscrypt_info_cachep declaration to fscrypt_private.h
dfe0b3b1b67f fscrypt: move fscrypt_control_page() to supp/notsupp headers
3a2c79177822 fscrypt: move fscrypt_has_encryption_key() to supp/notsupp headers

Signed-off-by: Jaegeuk Kim <[email protected]>
Cherry-picked from:
  origin/upstream-f2fs-stable-linux-4.4.y

85d2070f60c6 ("f2fs: turn down IO priority of discard from background")
4738f527db84 ("f2fs: don't split checkpoint in fstrim")
31e2713935ea ("f2fs: issue discard commands proactively in high fs utilization")
70676ef73646 ("f2fs: add fsync_mode=nobarrier for non-atomic files")
bb53d06b5f21 ("f2fs: let fstrim issue discard commands in lower priority")

Signed-off-by: Jaegeuk Kim <[email protected]>
Patch series "Ranged pagevec tagged lookup", v3.

In this series I provide a ranged variant of pagevec_lookup_tag() and
use it in places where it makes sense.  This series removes some common
code and it also has a potential for speeding up some operations
similarly as for pagevec_lookup_range() (but for now I can think of only
artificial cases where this happens).

This patch (of 16):

Implement a variant of find_get_pages_tag() that stops iterating at
given index.  Lots of users of this function (through pagevec_lookup())
actually want a range lookup and all of them are currently open-coding
this.

Also create corresponding pagevec_lookup_range_tag() function.
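
As a rough illustration of what a ranged tagged lookup does, here is a minimal user-space sketch. It is not the kernel implementation: `struct toy_page`, `TOY_TAG_DIRTY` and `toy_lookup_range_tag()` are hypothetical names invented for this example, and a sorted array stands in for the page-cache tree. The point it shows is that the lookup itself stops at `end`, so callers no longer need to open-code the range check.

```c
#include <stddef.h>

/* Hypothetical stand-in for a page-cache entry; not a kernel type. */
struct toy_page {
    unsigned long index;   /* offset of the page within the mapping */
    unsigned int  tags;    /* bitmask of tags set on the page */
};

#define TOY_TAG_DIRTY 0x1u

/*
 * Collect up to `max` pages carrying `tag` whose index lies in
 * [*start, end]. `pages` must be sorted by index. When something is
 * found, *start is advanced just past the last page returned, so a
 * caller can loop without re-checking the range itself.
 */
static size_t toy_lookup_range_tag(const struct toy_page *pages, size_t n,
                                   unsigned long *start, unsigned long end,
                                   unsigned int tag,
                                   const struct toy_page **out, size_t max)
{
    size_t found = 0;

    for (size_t i = 0; i < n && found < max; i++) {
        if (pages[i].index < *start)
            continue;           /* before the requested range */
        if (pages[i].index > end)
            break;              /* range exhausted: stop iterating */
        if (pages[i].tags & tag)
            out[found++] = &pages[i];
    }
    if (found)
        *start = out[found - 1]->index + 1;
    return found;
}
```

A caller would typically invoke this in a loop until it returns 0, which is the pattern the later patches in the series convert writeback paths to.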

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Bob Peterson <[email protected]>
Cc: Chao Yu <[email protected]>
Cc: David Howells <[email protected]>
Cc: David Sterba <[email protected]>
Cc: Ilya Dryomov <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Cc: Ryusuke Konishi <[email protected]>
Cc: Steve French <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Cc: "Yan, Zheng" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in btree_write_cache_pages() and
extent_write_cache_pages().  Use pagevec_lookup_range_tag() instead of
pagevec_lookup_tag() and remove unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: David Sterba <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in ceph_writepages_start().  Use
pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and remove
unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Reviewed-by: "Yan, Zheng" <[email protected]>
Cc: Ilya Dryomov <[email protected]>
Cc: "Yan, Zheng" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in ext4_writepages().  Use
pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and remove
unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: "Theodore Ts'o" <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in f2fs_write_cache_pages().  Use
pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and remove
unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
In several places we want to iterate over all tagged pages in a mapping.
However the code was apparently copied from places that iterate only
over a limited range and thus it checks for index <= end, optimizes the
case where we are coming close to range end which is all pointless when
end == ULONG_MAX.  So just remove this dead code.

[[email protected]: fix warnings]
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
__get_first_dirty_index() wants to lookup only the first dirty page
after given index.  There's no point in using pagevec_lookup_tag() for
that.  Just use find_get_pages_tag() directly.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Jaegeuk Kim <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in gfs2_write_cache_jdata().  Use
pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and remove
unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Cc: Bob Peterson <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
We want only pages from given range in nilfs_lookup_dirty_data_buffers().
Use pagevec_lookup_range_tag() instead of pagevec_lookup_tag() and
remove unnecessary code.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Acked-by: Ryusuke Konishi <[email protected]>
Cc: Ryusuke Konishi <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Use pagevec_lookup_range_tag() in __filemap_fdatawait_range() as it is
interested only in pages from given range.  Remove unnecessary code
resulting from this.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Use pagevec_lookup_range_tag() in write_cache_pages() as it is
interested only in pages from given range.  Remove unnecessary code
resulting from this.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Currently pagevec_lookup_range_tag() takes number of pages to look up
but most users don't need this.  Create a new function
pagevec_lookup_range_nr_tag() that takes maximum number of pages to
lookup for Ceph which wants this functionality so that we can drop
nr_pages argument from pagevec_lookup_range_tag().

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Use new function for looking up pages since nr_pages argument from
pagevec_lookup_range_tag() is going away.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: "Yan, Zheng" <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
All users of pagevec_lookup() and pagevec_lookup_range() now pass
PAGEVEC_SIZE as a desired number of pages.  Just drop the argument.

Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Jan Kara <[email protected]>
Reviewed-by: Daniel Jordan <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Avoid conflicts with Samsung's backported __blkdev_issue_discard in blk-lib.c
tl;dr: trying to use backported one but it resulted in a horrible panic

Signed-off-by: Diep Quynh <[email protected]>
This patch shows the fsync_mode=nobarrier mount option in
f2fs_show_options().

Signed-off-by: Sahitya Tummala <[email protected]>
Reviewed-by: Chao Yu <[email protected]>
Signed-off-by: Jaegeuk Kim <[email protected]>
@Wizardboy92

God


@Wizardboy92 Wizardboy92 left a comment

Ok

Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
…loc()

commit d29f59051d3a07b81281b2df2b8c9dfe4716067f upstream.

The voice allocator sometimes begins allocating from near the end of the
array and then wraps around, however snd_emu10k1_pcm_channel_alloc()
accesses the newly allocated voices as if it never wrapped around.

This results in out of bounds access if the first voice has a high enough
index so that first_voice + requested_voice_count > NUM_G (64).
The more voices are requested, the more likely it is for this to occur.
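
The wrap-around problem can be sketched in a few lines of standalone C. This is an illustration only, not the driver code: `wrapped_voice_index()` and `naive_index_overflows()` are hypothetical helpers, and only `NUM_G` (64) comes from the message above. The idea is that the i-th voice of a block starting at `first_voice` has to be addressed modulo the array size.

```c
#include <stddef.h>

#define NUM_G 64   /* number of voices, as quoted in the commit message */

/* Index of the i-th voice in a block that starts at first_voice and may wrap. */
static size_t wrapped_voice_index(size_t first_voice, size_t i)
{
    return (first_voice + i) % NUM_G;
}

/* Returns 1 if indexing first_voice + i without wrapping would leave the array. */
static int naive_index_overflows(size_t first_voice, size_t i)
{
    return first_voice + i >= NUM_G;
}
```

For example, with a first voice of 60 and 16 requested channels, the unwrapped index for the sixth voice is 65, which matches the out-of-range index in the UBSAN report below, while the wrapped index is 1.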

This was initially discovered using PipeWire, however it can be reproduced
by calling aplay multiple times with 16 channels:
aplay -r 48000 -D plughw:CARD=Live,DEV=3 -c 16 /dev/zero

UBSAN: array-index-out-of-bounds in sound/pci/emu10k1/emupcm.c:127:40
index 65 is out of range for type 'snd_emu10k1_voice [64]'
CPU: 1 PID: 31977 Comm: aplay Tainted: G        W IOE      6.0.0-rc2-emu10k1+ #7
Hardware name: ASUSTEK COMPUTER INC P5W DH Deluxe/P5W DH Deluxe, BIOS 3002    07/22/2010
Call Trace:
<TASK>
dump_stack_lvl+0x49/0x63
dump_stack+0x10/0x16
ubsan_epilogue+0x9/0x3f
__ubsan_handle_out_of_bounds.cold+0x44/0x49
snd_emu10k1_playback_hw_params+0x3bc/0x420 [snd_emu10k1]
snd_pcm_hw_params+0x29f/0x600 [snd_pcm]
snd_pcm_common_ioctl+0x188/0x1410 [snd_pcm]
? exit_to_user_mode_prepare+0x35/0x170
? do_syscall_64+0x69/0x90
? syscall_exit_to_user_mode+0x26/0x50
? do_syscall_64+0x69/0x90
? exit_to_user_mode_prepare+0x35/0x170
snd_pcm_ioctl+0x27/0x40 [snd_pcm]
__x64_sys_ioctl+0x95/0xd0
do_syscall_64+0x5c/0x90
? do_syscall_64+0x69/0x90
? do_syscall_64+0x69/0x90
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Signed-off-by: Tasos Sahanidis <[email protected]>
Cc: <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Takashi Iwai <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
…g the sock

[ Upstream commit 3cf7203ca620682165706f70a1b12b5194607dce ]

There is a race condition in vxlan that when deleting a vxlan device
during receiving packets, there is a possibility that the sock is
released after getting vxlan_sock vs from sk_user_data. Then in
later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will get a
NULL pointer dereference. e.g.

   #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757
   #1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d
   #2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48
   #3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b
   #4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb
   #5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542
   #6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62
      [exception RIP: vxlan_ecn_decapsulate+0x3b]
      RIP: ffffffffc1014e7b  RSP: ffffa25ec6978cb0  RFLAGS: 00010246
      RAX: 0000000000000008  RBX: ffff8aa000888000  RCX: 0000000000000000
      RDX: 000000000000000e  RSI: ffff8a9fc7ab803e  RDI: ffff8a9fd1168700
      RBP: ffff8a9fc7ab803e   R8: 0000000000700000   R9: 00000000000010ae
      R10: ffff8a9fcb748980  R11: 0000000000000000  R12: ffff8a9fd1168700
      R13: ffff8aa000888000  R14: 00000000002a0000  R15: 00000000000010ae
      ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
   #7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan]
   #8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507
   #9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45
  #10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807
  #11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951
  #12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde
  #13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b
  #14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139
  #15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a
  #16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3
  #17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca
  #18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3

Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh

Fix this by waiting for all sk_user_data readers to finish before
releasing the sock.

Reported-by: Jianlin Shi <[email protected]>
Suggested-by: Jakub Sitnicki <[email protected]>
Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs")
Signed-off-by: Hangbin Liu <[email protected]>
Reviewed-by: Jiri Pirko <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Ulrich Hecht <[email protected]>
Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
[ Upstream commit b18cba09e374637a0a3759d856a6bca94c133952 ]

Commit 9130b8dbc6ac ("SUNRPC: allow for upcalls for the same uid
but different gss service") introduced `auth` argument to
__gss_find_upcall(), but in gss_pipe_downcall() it was left as NULL
since it (and auth->service) was not (yet) determined.

When multiple upcalls with the same uid and different service are
ongoing, it could happen that __gss_find_upcall(), which returns the
first match found in the pipe->in_downcall list, could not find the
correct gss_msg corresponding to the downcall we are looking for.
Moreover, it might return a msg which is not sent to rpc.gssd yet.

We could see the mount.nfs process hung in D state when multiple mount.nfs
processes are executed in parallel.  The call trace below is of CentOS 7.9
kernel-3.10.0-1160.24.1.el7.x86_64 but we observed the same hang w/
elrepo kernel-ml-6.0.7-1.el7.

PID: 71258  TASK: ffff91ebd4be0000  CPU: 36  COMMAND: "mount.nfs"
 #0 [ffff9203ca3234f8] __schedule at ffffffffa3b8899f
 #1 [ffff9203ca323580] schedule at ffffffffa3b88eb9
 #2 [ffff9203ca323590] gss_cred_init at ffffffffc0355818 [auth_rpcgss]
 #3 [ffff9203ca323658] rpcauth_lookup_credcache at ffffffffc0421ebc [sunrpc]
 #4 [ffff9203ca3236d8] gss_lookup_cred at ffffffffc0353633 [auth_rpcgss]
 #5 [ffff9203ca3236e8] rpcauth_lookupcred at ffffffffc0421581 [sunrpc]
 #6 [ffff9203ca323740] rpcauth_refreshcred at ffffffffc04223d3 [sunrpc]
 #7 [ffff9203ca3237a0] call_refresh at ffffffffc04103dc [sunrpc]
 #8 [ffff9203ca3237b8] __rpc_execute at ffffffffc041e1c9 [sunrpc]
 #9 [ffff9203ca323820] rpc_execute at ffffffffc0420a48 [sunrpc]

The scenario is like this. Let's say there are two upcalls for
services A and B, A -> B in pipe->in_downcall, B -> A in pipe->pipe.

When rpc.gssd reads pipe to get the upcall msg corresponding to
service B from pipe->pipe and then writes the response, in
gss_pipe_downcall the msg corresponding to service A will be picked
because only uid is used to find the msg and it is before the one for
B in pipe->in_downcall.  And the process waiting for the msg
corresponding to service A will be woken up.

Actual scheduling of that process might be after rpc.gssd processes the
next msg.  In rpc_pipe_generic_upcall it clears msg->errno (for A).
The process is scheduled to see gss_msg->ctx == NULL and
gss_msg->msg.errno == 0, therefore it cannot break the loop in
gss_create_upcall and is never woken up after that.

This patch adds a simple check to ensure that a msg which is not
sent to rpc.gssd yet is not chosen as the matching upcall upon
receiving a downcall.
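
The kind of check described here can be sketched as follows. This is a simplified user-space model, not the SUNRPC code: `struct toy_upcall`, its `sent` field and `toy_find_downcall()` are hypothetical names, and an array stands in for the pipe->in_downcall list. It only illustrates the rule that a message which has not yet been handed to the daemon must not be matched against an incoming downcall.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified upcall message; field names are illustrative. */
struct toy_upcall {
    unsigned int uid;
    int service;
    bool sent;      /* true once the message has been read by the daemon */
};

/*
 * Return the first message for `uid` that has already been sent to the
 * daemon, or NULL if there is none. Messages still waiting to be read
 * are skipped: a downcall cannot be the reply to a message the daemon
 * has not seen yet.
 */
static struct toy_upcall *toy_find_downcall(struct toy_upcall *msgs, size_t n,
                                            unsigned int uid)
{
    for (size_t i = 0; i < n; i++) {
        if (msgs[i].uid != uid)
            continue;
        if (!msgs[i].sent)
            continue;   /* the added check */
        return &msgs[i];
    }
    return NULL;
}
```

In the A/B scenario above, the entry for service A comes first in the list but has not been sent, so the lookup skips it and returns the entry for service B.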

Signed-off-by: minoura makoto <[email protected]>
Signed-off-by: Hiroshi Shimamoto <[email protected]>
Tested-by: Hiroshi Shimamoto <[email protected]>
Cc: Trond Myklebust <[email protected]>
Fixes: 9130b8dbc6ac ("SUNRPC: allow for upcalls for same uid but different gss service")
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[uli: backport to 4.4]
Signed-off-by: Ulrich Hecht <[email protected]>
Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
[ Upstream commit 6c4ca03bd890566d873e3593b32d034bf2f5a087 ]

During EEH error injection testing, a deadlock was encountered in the tg3
driver when tg3_io_error_detected() was attempting to cancel outstanding
reset tasks:

crash> foreach UN bt
...
PID: 159    TASK: c0000000067c6000  CPU: 8   COMMAND: "eehd"
...
 #5 [c00000000681f990] __cancel_work_timer at c00000000019fd18
 #6 [c00000000681fa30] tg3_io_error_detected at c00800000295f098 [tg3]
 #7 [c00000000681faf0] eeh_report_error at c00000000004e25c
...

PID: 290    TASK: c000000036e5f800  CPU: 6   COMMAND: "kworker/6:1"
...
 #4 [c00000003721fbc0] rtnl_lock at c000000000c940d8
 #5 [c00000003721fbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c00000003721fc60] process_one_work at c00000000019e5c4
...

PID: 296    TASK: c000000037a65800  CPU: 21  COMMAND: "kworker/21:1"
...
 #4 [c000000037247bc0] rtnl_lock at c000000000c940d8
 #5 [c000000037247be0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c000000037247c60] process_one_work at c00000000019e5c4
...

PID: 655    TASK: c000000036f49000  CPU: 16  COMMAND: "kworker/16:2"
...:1

 #4 [c0000000373ebbc0] rtnl_lock at c000000000c940d8
 #5 [c0000000373ebbe0] tg3_reset_task at c008000002969358 [tg3]
 #6 [c0000000373ebc60] process_one_work at c00000000019e5c4
...

Code inspection shows that both tg3_io_error_detected() and
tg3_reset_task() attempt to acquire the RTNL lock at the beginning of
their code blocks.  If tg3_reset_task() should happen to execute between
the times when tg3_io_error_detected() acquires the RTNL lock and
tg3_reset_task_cancel() is called, a deadlock will occur.

Moving tg3_reset_task_cancel() call earlier within the code block, prior
to acquiring RTNL, prevents this from happening, but also exposes another
deadlock issue where tg3_reset_task() may execute AFTER
tg3_io_error_detected() has executed:

crash> foreach UN bt
PID: 159    TASK: c0000000067d2000  CPU: 9   COMMAND: "eehd"
...
 #4 [c000000006867a60] rtnl_lock at c000000000c940d8
 #5 [c000000006867a80] tg3_io_slot_reset at c0080000026c2ea8 [tg3]
 #6 [c000000006867b00] eeh_report_reset at c00000000004de88
...
PID: 363    TASK: c000000037564000  CPU: 6   COMMAND: "kworker/6:1"
...
 #3 [c000000036c1bb70] msleep at c000000000259e6c
 #4 [c000000036c1bba0] napi_disable at c000000000c6b848
 #5 [c000000036c1bbe0] tg3_reset_task at c0080000026d942c [tg3]
 #6 [c000000036c1bc60] process_one_work at c00000000019e5c4
...

This issue can be avoided by aborting tg3_reset_task() if EEH error
recovery is already in progress.
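
The early-abort idea can be sketched in plain C. This is a single-threaded toy model that omits the RTNL lock and the workqueue entirely; the struct, its fields, and the function names are hypothetical stand-ins and not the tg3 driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's private state (not struct tg3). */
struct nic_model {
    bool recovery_in_progress; /* set when error recovery has started */
    bool reset_pending;        /* a reset work item is queued */
    int  resets_performed;     /* number of resets that actually ran */
};

/* Models the error-detected callback: cancel queued reset work first,
 * then mark recovery as in progress. */
static void model_io_error_detected(struct nic_model *nic)
{
    nic->reset_pending = false;
    nic->recovery_in_progress = true;
}

/* Models the reset work item: abort early if recovery owns the device,
 * so the two paths never reset the hardware concurrently. */
static bool model_reset_task(struct nic_model *nic)
{
    nic->reset_pending = false;
    if (nic->recovery_in_progress)
        return false; /* aborted, nothing touched */
    nic->resets_performed++;
    return true;
}
```

In this model a reset that runs after error detection returns without doing any work, which is the behaviour the paragraph above describes.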

Fixes: db84bf4 ("tg3: tg3_reset_task() needs to use rtnl_lock to synchronize")
Signed-off-by: David Christensen <[email protected]>
Reviewed-by: Pavan Chebbi <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Ulrich Hecht <[email protected]>
Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
commit 60eed1e3d45045623e46944ebc7c42c30a4350f0 upstream.

code path:

ocfs2_ioctl_move_extents
 ocfs2_move_extents
  ocfs2_defrag_extent
   __ocfs2_move_extent
    + ocfs2_journal_access_di
    + ocfs2_split_extent  //sub-paths call jbd2_journal_restart
    + ocfs2_journal_dirty //crash by jbd2 ASSERT

crash stacks:

PID: 11297  TASK: ffff974a676dcd00  CPU: 67  COMMAND: "defragfs.ocfs2"
 #0 [ffffb25d8dad3900] machine_kexec at ffffffff8386fe01
 #1 [ffffb25d8dad3958] __crash_kexec at ffffffff8395959d
 #2 [ffffb25d8dad3a20] crash_kexec at ffffffff8395a45d
 #3 [ffffb25d8dad3a38] oops_end at ffffffff83836d3f
 #4 [ffffb25d8dad3a58] do_trap at ffffffff83833205
 #5 [ffffb25d8dad3aa0] do_invalid_op at ffffffff83833aa6
 #6 [ffffb25d8dad3ac0] invalid_op at ffffffff84200d18
    [exception RIP: jbd2_journal_dirty_metadata+0x2ba]
    RIP: ffffffffc09ca54a  RSP: ffffb25d8dad3b70  RFLAGS: 00010207
    RAX: 0000000000000000  RBX: ffff9706eedc5248  RCX: 0000000000000000
    RDX: 0000000000000001  RSI: ffff97337029ea28  RDI: ffff9706eedc5250
    RBP: ffff9703c3520200   R8: 000000000f46b0b2   R9: 0000000000000000
    R10: 0000000000000001  R11: 00000001000000fe  R12: ffff97337029ea28
    R13: 0000000000000000  R14: ffff9703de59bf60  R15: ffff9706eedc5250
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018
 #7 [ffffb25d8dad3ba8] ocfs2_journal_dirty at ffffffffc137fb95 [ocfs2]
 #8 [ffffb25d8dad3be8] __ocfs2_move_extent at ffffffffc139a950 [ocfs2]
 #9 [ffffb25d8dad3c80] ocfs2_defrag_extent at ffffffffc139b2d2 [ocfs2]

Analysis

This bug has the same root cause as 'commit 7f27ec9 ("ocfs2: call
ocfs2_journal_access_di() before ocfs2_journal_dirty() in
ocfs2_write_end_nolock()")'.  For this bug, jbd2_journal_restart() is
called by ocfs2_split_extent() during defragmenting.

How to fix

ocfs2_split_extent() handles journal operations entirely by itself, so
the caller doesn't need to call the journal access/dirty pair; it only
needs to call the journal start/stop pair.  The fix is to remove the
journal access/dirty calls from __ocfs2_move_extent().
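
Why the outer access/dirty pair breaks can be illustrated with a toy model in C. It assumes, as a simplification, that a buffer may only be dirtied under the same transaction in which access to it was granted; the real jbd2 checks are more involved, and every name below is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

struct handle_model { int tid; };        /* current transaction of the handle */
struct buffer_model { int access_tid; }; /* transaction that granted access, -1 if none */

static void model_journal_access(const struct handle_model *h, struct buffer_model *b)
{
    b->access_tid = h->tid;
}

/* A restart moves the handle to a new transaction. */
static void model_journal_restart(struct handle_model *h)
{
    h->tid++;
}

/* Returns false where the real code would trip an assertion. */
static bool model_journal_dirty(const struct handle_model *h, const struct buffer_model *b)
{
    return b->access_tid == h->tid;
}

/* Callee that runs its own access/dirty pair and may restart the handle. */
static bool model_split_extent(struct handle_model *h, struct buffer_model *extent)
{
    model_journal_restart(h);
    model_journal_access(h, extent);
    return model_journal_dirty(h, extent);
}

/* Old shape: the caller wraps the callee in its own access/dirty pair. */
static bool model_move_extent_old(struct handle_model *h, struct buffer_model *inode,
                                  struct buffer_model *extent)
{
    model_journal_access(h, inode);
    model_split_extent(h, extent);
    return model_journal_dirty(h, inode); /* stale: the handle was restarted */
}

/* New shape: the caller leaves journal access/dirty to the callee. */
static bool model_move_extent_new(struct handle_model *h, struct buffer_model *extent)
{
    return model_split_extent(h, extent);
}
```

In the old shape the caller's dirty call refers to a transaction that no longer matches the handle, which mirrors the crash path listed above; in the new shape only the callee's self-contained pair remains.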

The discussion for this patch:
https://oss.oracle.com/pipermail/ocfs2-devel/2023-February/000647.html

Link: https://lkml.kernel.org/r/[email protected]
Signed-off-by: Heming Zhao <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Cc: Mark Fasheh <[email protected]>
Cc: Joel Becker <[email protected]>
Cc: Junxiao Bi <[email protected]>
Cc: Changwei Ge <[email protected]>
Cc: Gang He <[email protected]>
Cc: Jun Piao <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Ulrich Hecht <[email protected]>
Royna2544 pushed a commit to Roynas-Android-Playground/android_kernel_samsung_universal8895 that referenced this pull request May 15, 2023
[ Upstream commit 250870824c1cf199b032b1ef889c8e8d69d9123a ]

GCC warns about the pattern sizeof(void*)/sizeof(void), as it looks like
the abuse of a pattern to calculate the array size. This pattern appears
in the unevaluated part of the ternary operator in _INTC_ARRAY if the
parameter is NULL.

The replacement uses an alternate approach to return 0 in case of NULL
which does not generate the pattern sizeof(void*)/sizeof(void), but still
emits the warning if _INTC_ARRAY is called with a nonarray parameter.
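
A minimal sketch of the _Generic approach, assuming GCC or Clang in GNU C mode (it relies on `__typeof__` and on `sizeof(void)` being 1). The macro and function names are hypothetical and this is not the kernel's actual _INTC_ARRAY definition.

```c
#include <assert.h>
#include <stddef.h>

/* Selects 0 when the argument has the type of NULL (void *), otherwise the
 * size of the argument. An array argument decays to a pointer for the type
 * match, but sizeof(a) in the default branch still sees the full array. */
#define SIZEOF_OR_ZERO(a) \
    (_Generic((a), __typeof__(NULL): (size_t)0, default: sizeof(a)))

/* Element count for an array, 0 for NULL. For NULL this expands to
 * 0 / sizeof(void), so the sizeof(void *) / sizeof(void) pattern never
 * appears in the expansion. */
#define COUNT_OR_ZERO(a) (SIZEOF_OR_ZERO(a) / sizeof(*(a)))

static int sample_vectors[4];

static size_t count_sample(void)
{
    return COUNT_OR_ZERO(sample_vectors);
}

static size_t count_null(void)
{
    return COUNT_OR_ZERO(NULL);
}
```

Because the NULL case is resolved by type selection rather than by a ternary, the division by the element size is the only sizeof division left in the expansion.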

This patch is required for successful compilation with -Werror enabled.

The idea to use _Generic for type distinction is taken from Comment #7
in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108483 by Jakub Jelinek

Signed-off-by: Michael Karcher <[email protected]>
Acked-by: Randy Dunlap <[email protected]> # build-tested
Link: https://lore.kernel.org/r/619fa552-c988-35e5-b1d7-fe256c46a272@mkarcher.dialup.fu-berlin.de
Signed-off-by: John Paul Adrian Glaubitz <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Signed-off-by: Ulrich Hecht <[email protected]>