[Ocfs2-devel] [PATCH 2/2] Ocfs2: Implement new OCFS2_IOC_INFO ioctl for ocfs2.
Tristan
tristan.ye at oracle.com
Tue Dec 1 18:29:28 PST 2009
Joel Becker wrote:
> On Tue, Dec 01, 2009 at 11:20:25AM +0800, Tristan wrote:
>
>> Size of each request may differ from type to type, think about:
>>
>
> Yes, but you are not passing an array of structures, you are
> passing an array of pointers. Andi's right, of course, that NULL
> termination is rife for abuse. Let's pass it via a wrapper structure:
>
> ---------------------------------------------------------------------------
> struct ocfs2_info {
> __u64 info_requests; /* Array of __u64 pointers to our requests */
> __u32 info_count; /* Number of requests in info_requests */
> };
> ---------------------------------------------------------------------------
>
> Why is info_requests a __u64? Because we have to handle 32bit
> binaries on 64bit systems. In fact, info_requests is an array of 64bit
> values. See the example below.
> I've also modified struct ocfs2_info_request to have a magic
> number and a size:
>
> ---------------------------------------------------------------------------
> /* Magic number of all requests */
> #define OCFS2_INFO_MAGIC 0x4F32494E /* Magic number for requests */
>
> struct ocfs2_info_request {
> /*00*/ __u32 ir_magic; /* Magic number */
>
Why we need a magic here, just to improve the security?
> __u32 ir_code; /* Info request code */
> __u32 ir_size; /* Size of request */
> __u32 ir_flags; /* Request flags */
> /*10*/ /* Request specific fields */
> };
> ---------------------------------------------------------------------------
>
> A user now does:
>
> ---------------------------------------------------------------------------
> get_info(fd)
> {
> struct ocfs2_info_request_clustersize crq = {
> .ir_magic = OCFS2_INFO_MAGIC,
> .ir_code = OCFS2_INFO_REQUEST_CLUSTERSIZE,
> .ir_size = sizeof(struct ocfs2Info_request_clustersize),
> };
> struct ocfs2_info_request_blocksize brq = {
> .ir_magic = OCFS2_INFO_MAGIC,
> .ir_code = OCFS2_INFO_REQUEST_BLOCKSIZE,
> .ir_size = sizeof(struct ocfs2Info_request_blocksize),
> };
> uint64_t info_array[2] = {(unsigned long)&crq, (unsigned long)&brq};
> struct ocfs2_info info = {
> .info_requests = info_array,
> .info_count = 2,
> };
>
> rc = ioctl(fd, OCFS2_IOC_INFO, &info);
> ...
> }
> ---------------------------------------------------------------------------
>
> o2info can create convenience wrappers for setting the magic
> number, code, and size, of course. Note that we cast the request
> pointers to unsigned longs so that the promotion from ulong to uint64_t
> is handled by C. This code works on 32bit and 64bit.
>
Just a confusion here, uint64_t should be 'unsigned long long' on 32bits
arch, and 'unsigned long' on 64bits arch, why we not use:
uint64_t info_array[2] = {(unsigned long long)&crq, (unsigned long long)&brq};
> How does the kernel handle this? One by one, of course. First
> we copy_from_user() the ocfs2_info structure. We check that info_count
> is within some sane boundary (let's say 50 requests) so that someone
> doesn't pass one million. We also check that info_requests is not
> NULL. Both allow us to return -EINVAL.
> Then, for each request in the array, we grab the header,
> validate it, and call a handler function:
>
> --------------------------------------------------------------------------
> u64 req_addr;
> struct ocfs2_info_request __user *reqp;
>
> for (i = 0; i < info->info_count; i++) {
> ret = -EFAULT;
> if (get_user(&req_addr,
> (unsigned long)((u64**)(info->info_requests) + i)))
> break;
> reqp = (struct ocfs2_info_request *)(unsigned long)req_addr;
> ret = ocfs2_info_handle(file, reqp);
> if (ret)
> break;
> }
> --------------------------------------------------------------------------
>
How to avoid accessing an arbitrary mem address, such as, userspace
passing a wrong info_count(correct_info_count + 1), so the last entry
for pointer should be invalid, how to avoid to access that?
> Now we have a pointer to the *userspace* memory, we need to get
> the header into kernelspace:
>
> --------------------------------------------------------------------------
> int ocfs2_info_handle(file, struct ocfs2_info_request __user *reqp)
> {
> int ret = 0;
> struct ocfs2_info_request req;
>
> if (copy_from_user(&req, reqp, sizeof(struct ocfs2_info_request)))
> return -EFAULT;
>
> if (req.ir_magic != OCFS2_INFO_MAGIC)
> return -EINVAL;
> switch (req.ir_code) {
> case OCFS2_INFO_REQUEST_CLUSTERSIZE:
> if (req.ir_size != sizeof(struct ocfs2_info_request_clustersize)) {
> ret = -EINVAL;
> break;
> }
> ret = ocfs2_info_handle_clustersize(file, reqp, &req);
> break;
>
> ...
>
> default:
> /* Unknown code */
> ret = -EINVAL;
> break;
> }
>
> return ret;
> }
> --------------------------------------------------------------------------
>
> We can probably wrap the switch into a lookup table, but let's
> not worry about that for now.
> As you can see, we've validated the info structure, then we
> validate each request as we get it. We check the magic, the code, and
> the size to make sure userspace is doing it right. Only then do we
> actually call the handler function.
> Some handler functions will need information from the extra
> portion of the request. Simple ones will not, and need not copy
> anything else from userspace. The clustersize handler is an example.
> It needs no more information from the passed request. It just needs to
> fill in the result to userspace.
>
> --------------------------------------------------------------------------
> int ocfs2_info_handle_clustersize(struct file *file,
> struct ocfs2_info_request __user *user_req,
> struct ocfs2_info_request *req)
> {
> struct ocfs2_info_request_clustersize result;
>
> memcpy(&result, req, sizeof(struct ocfs2_info_request));
> result.ir_request.ir_flags |= OCFS2_INFO_FL_FILLED;
> result.ir_clustersize =
> OCFS2_SB(file->f_dentry->d_inode->i_sb)->s_clutsersize;
>
> if (copy_to_user((struct ocfs2_info_request_clustersize __user *)user_req,
> &result,
> sizeof(struct ocfs2_info_request_clustersize)))
> return -EFAULT;
> return 0;
> }
> --------------------------------------------------------------------------
>
> Done right, we can switch between a single ioctl with an array
> of requests, a single ioctl with one request per call, and even multiple
> ioctls, all without changing the handler functions. Whee!
>
The scheme above was fine to me:-)
> Joel
>
>
More information about the Ocfs2-devel
mailing list