Overview

Original article: Racing against the clock -- hitting a tiny kernel race window

Part.1

The bug & race

The kernel tries to figure out whether it can account for all references to some file by comparing the file's refcount with the number of references from inflight SKBs (socket buffers). If they are equal, it assumes that the UNIX domain sockets subsystem effectively has exclusive access to the file because it owns all references.

The problem is that struct file can also be referenced from an RCU read-side critical section (which you can't detect by looking at the refcount), and such an RCU reference can be upgraded into a refcounted reference using get_file_rcu() / get_file_rcu_many() by __fget_files() as long as the refcount is non-zero.

dup() -> __fget_files()
    file = files_lookup_fd_rcu(files, fd);  // fdt->fd[fd] (1)
    ...
    get_file_rcu_many(file, refs)  // update: f_count+1 (2)

close() -> unix_gc()
    list_for_each_entry_safe(u, next, &gc_inflight_list, link) {
        total_refs = file_count(u->sk.sk_socket->file);  // read f_count: 1 (3)
        inflight_refs = atomic_long_read(&u->inflight);  // inflight_refs: 1
        ...
        if (total_refs == inflight_refs) {  // compare
            list_move_tail(&u->link, &gc_candidates);
            ...

What can go wrong if unix_gc() does not release the file and the skb in sync?