[Ocfs2-test-devel] [PATCH 1/1] Ocfs2-test: Add destructive testcase for reflink_test v1.

Tristan tristan.ye at oracle.com
Mon Nov 2 22:02:21 PST 2009


Sunil Mushran wrote:
> I forgot to add. The reflink op should happen once before the children
> are forked. Make an option to set the number of children. We may want
> to limit the number during debugging.

Yes, the number of child procs will be made configurable, and the
reflink op already happens before fork(), see:

ret = prep_orig_file_in_chunks(orig_path, chunk_no);
should_exit(ret);

printf("  *SubTest %d: Do reflinks to reflink the extents.\n",
               sub_testno++);

ret = do_reflinks(orig_path, orig_path, ref_counts, 0);
should_exit(ret);
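
A minimal sketch of how the child count could be taken from a command-line argument. The helper name parse_child_count() and the MAX_CHILDREN bound are assumptions made for illustration; neither is in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

#define MAX_CHILDREN 1024	/* assumed upper bound, not from the patch */

/*
 * Hypothetical helper: parse the requested number of child processes.
 * Returns a count in [1, MAX_CHILDREN], or -1 if the argument is not
 * a plain decimal number in that range.
 */
long parse_child_count(const char *arg)
{
	char *end;
	long n;

	assert(arg != NULL);

	errno = 0;
	n = strtol(arg, &end, 10);
	if (errno || end == arg || *end != '\0')
		return -1;
	if (n < 1 || n > MAX_CHILDREN)
		return -1;

	return n;
}
```

A small value can then be passed while debugging, with the reflinks still done once before the fork() loop.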


Tristan.
>
> Sunil Mushran wrote:
>> Tristan Ye wrote:
>>  
>>> Per Sunil's request, we're going to add a destructive testcase for
>>> reflink to expose an expected issue in the existing reflink kernel
>>> code: O_DIRECT writes performed on reflinked files do not flush the
>>> metadata when the write completes, which means a reflinked file may
>>> wrongly point to an old reflinked extent after an unexpected crash
>>> of the machine.
>>>
>>> The v1 testcase is really a quick-and-dirty one, but it did reveal
>>> the problem mentioned above. After doing the reflinks, we fork
>>> processes to perform random writes on a reflinked file, and each
>>> write is logged over the wire to a remote listener server. We then
>>> crash the machine somehow; afterwards, the logfile recorded by the
>>> listener is used for verification.
>>>
>>> I may file a bug in bugzilla to track the issue when Tao is ready
>>> to change the reflink kernel code.
>>>       
>>
>> Do file it. I want to see if the problem was what we expected.
>>
>>  
>>> +
>>> +        /* child to do CoW*/
>>> +        if (pid == 0) {
>>> +
>>> +            srand(getpid());
>>> +
>>> +            for (j = 0; j < chunk_no; j++) {
>>> +                prep_rand_dest_write_unit(&du, get_rand(0, chunk_no - 1));
>>> +                ret = do_write_chunk_file(orig_path, &du);
>>> +                if (ret)
>>> +                    return -1;
>>>       
>>
>> do_write_chunk_file() is too heavy. Better if we explicitly open the
>> file once for each child and then do the io.
>>
>> Also, do_write_chunk() should handle short writes by resubmitting
>> the remaining write.
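
A minimal sketch of the short-write handling suggested above. The helper name pwrite_full() is an assumption and is not part of the patch. It resubmits the unwritten remainder until the whole buffer has been accepted. Note that with O_DIRECT the remaining buffer address, length and offset would all have to stay aligned, which this sketch does not check.

```c
#define _XOPEN_SOURCE 700	/* for pwrite() */
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: write all `count` bytes at `offset`, resubmitting
 * the remainder after a short write.
 * Returns 0 on success, -1 on error with errno set.
 */
int pwrite_full(int fd, const void *buf, size_t count, off_t offset)
{
	const char *p = buf;
	size_t done = 0;

	assert(buf != NULL);

	while (done < count) {
		ssize_t ret = pwrite(fd, p + done, count - done,
				     offset + (off_t)done);
		if (ret < 0) {
			if (errno == EINTR)
				continue;	/* interrupted, nothing written: retry */
			return -1;
		}
		if (ret == 0) {
			/* no progress; give up instead of spinning */
			errno = EIO;
			return -1;
		}
		done += (size_t)ret;
	}

	return 0;
}
```

Each child could open the file once, then call such a helper from do_write_chunk() for every chunk it rewrites.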
>>
>>  
>>> +           
>>> +                memset(log_rec, 0, sizeof(log_rec));
>>> +                snprintf(log_rec, sizeof(log_rec), "%lu\t%llu\t%c\n",
>>> +                     du.d_chunk_no, du.d_timestamp, 
>>> +                     du.d_char);
>>> +                write(sockfd, log_rec, strlen(log_rec) + 1);
>>>       
>>
>> So this would be the best place to reset the box. Do the memset and
>> snprintf before the fswrite so that we have back-to-back fswrite and
>> wirewrite. The reset should be after the wirewrite. Our aim is to
>> ensure the last odirect write made it to disk.... with the metadata
>> changes.
>>
>>  
>>> +
>>> +                /*
>>> +                if (get_rand(0, 1)) {
>>> +                    snprintf(dest, PATH_MAX,
>>> +                         "%s_target_%d_%d",
>>> +                         orig_path, getpid(), j);
>>> +                    ret = reflink(orig_path, dest, 1);
>>> +                    should_exit(ret);
>>> +                    memset(log_rec, 0, 100);
>>> +                    snprintf(log_rec, 100, "Reflinking:\t%s->%s\n",orig_path, dest);
>>> +                    write(sockfd, log_rec, 100);
>>> +                   
>>> +                }
>>> +                */
>>> +                usleep(100000);
>>> +            }
>>> +
>>> +            exit(0);
>>> +        }
>>> +
>>> +        if (pid > 0)
>>> +            child_pid_list[i] = pid;
>>> +
>>> +    }
>>> +
>>> +    usleep(100000 * 2);
>>> +
>>> +    /*
>>> +     * Are you ready to crash the box?
>>> +    */
>>> +    system("echo b>/proc/sysrq-trigger");
>>>       
>>
>> See my comment above. We could choose a random number from a pool of
>> say 1000, and the first thread to match that number resets the box
>> after the wirewrite.
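
A minimal sketch of the random reset idea above. The names draw_hits(), trigger_reset() and RESET_POOL are assumptions made for illustration. The trigger path is passed in as a parameter so the code can be exercised against a harmless file; the real target would be /proc/sysrq-trigger, as in the patch.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

#define RESET_POOL 1000	/* pool size suggested in the review */

/* Fold a raw random draw into the pool and compare it with the magic number. */
int draw_hits(int draw, int magic)
{
	assert(draw >= 0);
	return (draw % RESET_POOL) == magic;
}

/*
 * Write 'b' to the given trigger file.
 * Returns 0 on success, -1 on failure.
 */
int trigger_reset(const char *trigger_path)
{
	int fd;
	ssize_t n;

	assert(trigger_path != NULL);

	fd = open(trigger_path, O_WRONLY);
	if (fd < 0)
		return -1;

	n = write(fd, "b", 1);
	close(fd);

	return n == 1 ? 0 : -1;
}

/*
 * Intended use in the child loop, right after the wirewrite:
 *
 *	if (draw_hits(rand(), magic))
 *		trigger_reset("/proc/sysrq-trigger");
 */
```

Since each child already calls srand(getpid()), the draws differ per child and the reset lands at an unpredictable point between writes.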
>>
>>  
>>>  
>>> +int fill_chunk_pattern(char *pattern, struct dest_write_unit dwu)
>>> +{
>>> +    unsigned long mem_offset = 0;
>>> +    unsigned long checksum = 0;
>>> +
>>> +    memset(pattern, 0, CHUNK_SIZE);
>>> +    mem_offset = 0;
>>> +
>>> +    memmove(pattern , &dwu.d_chunk_no, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +    memmove(pattern + mem_offset, &dwu.d_timestamp,
>>> +        sizeof(unsigned long long ));
>>> +    mem_offset += sizeof(unsigned long long);
>>> +    memmove(pattern + mem_offset, &checksum, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +
>>> +    memset(pattern + mem_offset, dwu.d_char, CHUNK_SIZE - mem_offset * 2);
>>> +    mem_offset = CHUNK_SIZE - mem_offset;
>>> +
>>> +    memmove(pattern + mem_offset, &checksum, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +    memmove(pattern + mem_offset, &dwu.d_timestamp,
>>> +        sizeof(unsigned long long ));
>>> +    mem_offset += sizeof(unsigned long long);
>>> +    memmove(pattern + mem_offset, &dwu.d_chunk_no, sizeof(unsigned long));
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +int dump_pattern(char *pattern, struct dest_write_unit *dwu)
>>> +{
>>> +    unsigned long mem_offset = 0;
>>> +    unsigned long checksum = 0;
>>> +
>>> +    memset(dwu, 0, sizeof(struct dest_write_unit));
>>> +
>>> +    memmove(&dwu->d_chunk_no, pattern, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +    memmove(&dwu->d_timestamp, pattern + mem_offset,
>>> +        sizeof(unsigned long long));
>>> +    mem_offset += sizeof(unsigned long long);
>>> +    memmove(&checksum, pattern + mem_offset, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +
>>> +    memmove(&dwu->d_char, pattern + mem_offset, 1);
>>> +    mem_offset = CHUNK_SIZE - mem_offset;
>>> +
>>> +    memmove(&checksum, pattern + mem_offset, sizeof(unsigned long));
>>> +    mem_offset += sizeof(unsigned long);
>>> +    memmove(&dwu->d_timestamp, pattern + mem_offset,
>>> +        sizeof(unsigned long long));
>>> +    mem_offset += sizeof(unsigned long long);
>>> +    memmove(&dwu->d_chunk_no, pattern + mem_offset, sizeof(unsigned long));
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +int verify_chunk_pattern(char *pattern, struct dest_write_unit dwu)
>>> +{
>>> +    char tmp_pattern[CHUNK_SIZE];
>>> +   
>>> +    fill_chunk_pattern(tmp_pattern, dwu);
>>> +
>>> +    return !memcmp(pattern, tmp_pattern, sizeof(struct dest_write_unit));
>>> +}
>>> +
>>> +int prep_orig_file_in_chunks(char *file_name, unsigned long chunks)
>>> +{
>>> +
>>> +    int fd, ret, o_ret, flags;
>>> +    unsigned long offset = 0;
>>> +    unsigned long size = CHUNK_SIZE * chunks, chunk_no = 0;
>>> +    struct dest_write_unit dwu;
>>> +
>>> +    if ((CHUNK_SIZE % DIRECTIO_SLICE) != 0) {
>>> +
>>> +        fprintf(stderr, "File size in destructive tests is expected to "
>>> +            "be %d aligned, your chunk size %d is not allowed.\n",
>>> +            DIRECTIO_SLICE, CHUNK_SIZE);
>>> +        return -1;
>>> +    }
>>> +
>>> +    flags = FILE_RW_FLAGS;
>>> +
>>> +    fd = open64(file_name, flags, FILE_MODE);
>>> +
>>> +    if (fd < 0) {
>>> +        o_ret = fd;
>>> +        fd = errno;
>>> +        fprintf(stderr, "create file %s failed:%d:%s\n", file_name, fd,
>>> +            strerror(fd));
>>> +        fd = o_ret;
>>> +        return fd;
>>> +    }
>>> +
>>> +    /*
>>> +     * Original file for destructive tests, it consists of chunks.
>>> +     * Each chunk consists of the following parts:
>>> +     * chunkno + timestamp + checksum + random chars
>>> +     * + checksum + timestamp + chunkno
>>> +     *
>>> +    */
>>> +   
>>> +    while (offset < size) {
>>> +
>>> +        memset(&dwu, 0, sizeof(struct dest_write_unit));
>>> +        dwu.d_chunk_no = chunk_no;
>>> +        fill_chunk_pattern(chunk_pattern, dwu);
>>> +
>>> +        ret = pwrite(fd, chunk_pattern, CHUNK_SIZE, offset);
>>> +        if (ret < 0) {
>>> +            o_ret = ret;
>>> +            ret = errno;
>>> +            fprintf(stderr, "write failed:%d:%s\n", ret,
>>> +                strerror(ret));
>>> +            return ret;
>>> +        }
>>> +
>>> +        chunk_no++;
>>> +        offset += CHUNK_SIZE;
>>> +    }
>>> +
>>> +    close(fd);
>>> +    return 0;
>>> +}
>>> +
>>>  int verify_reflink_pair(const char *src, const char *dest)
>>>  {
>>>      int fds, fdd, ret, o_ret;
>>> @@ -1279,3 +1406,177 @@ int do_write_file(char *fname, struct write_unit *wu)
>>>  
>>>      return ret;
>>>  }
>>> +
>>> +unsigned long long get_time_microseconds(void)
>>> +{
>>> +    unsigned long long curtime_ms = 0;
>>> +    struct timeval curtime;
>>> +
>>> +    gettimeofday(&curtime, NULL);
>>> +
>>> +    curtime_ms = (unsigned long long)curtime.tv_sec * 1000000 +
>>> +                     curtime.tv_usec;
>>> +
>>> +    return curtime_ms;
>>> +}
>>> +
>>> +void prep_rand_dest_write_unit(struct dest_write_unit *du,
>>> +                   unsigned long chunk_no)
>>> +{
>>> +    du->d_char = rand_char();
>>> +    du->d_chunk_no = chunk_no;
>>> +    du->d_timestamp = get_time_microseconds();
>>> +}
>>> +
>>> +int do_write_chunk(int fd, struct dest_write_unit *du)
>>> +{
>>> +    int ret;
>>> +
>>> +    fill_chunk_pattern(chunk_pattern, *du);
>>> +
>>> +    ret = pwrite(fd, chunk_pattern, CHUNK_SIZE, CHUNK_SIZE * du->d_chunk_no);
>>> +    if (ret == -1) {
>>> +        fprintf(stderr, "write error %d: \"%s\"\n", errno,
>>> +            strerror(errno));
>>> +        return -1;
>>> +    }
>>> +
>>> +    return 0;
>>> +}
>>> +
>>> +int do_write_chunk_file(char *fname, struct dest_write_unit *du)
>>> +{
>>> +    int fd, ret, o_ret, flags = open_rw_flags;
>>> +
>>> +    if (test_flags & DSCV_TEST)
>>> +        flags |= O_DIRECT;
>>> +
>>> +    fd  = open64(fname, flags);
>>> +
>>> +    if (fd < 0) {
>>> +        o_ret = fd;
>>> +        fd = errno;
>>> +        fprintf(stderr, "open file %s failed:%d:%s\n", fname, fd,
>>> +            strerror(fd));
>>> +        fd = o_ret;
>>> +        return fd;
>>> +    }
>>> +
>>> +        ret = do_write_chunk(fd, du);
>>> +
>>> +    close(fd);
>>> +
>>> +    return ret;
>>> +}
>>> +
>>> +
>>> +int init_sock(char *serv, int port)
>>> +{
>>> +    int sockfd;
>>> +    struct sockaddr_in servaddr;
>>> +
>>> +    sockfd = socket(AF_INET, SOCK_STREAM, 0);
>>> +    bzero(&servaddr, sizeof(struct sockaddr_in));
>>> +    servaddr.sin_family = AF_INET;
>>> +    servaddr.sin_port = htons(port);
>>> +    inet_pton(AF_INET, serv, &servaddr.sin_addr);
>>> +
>>> +    connect(sockfd, (struct sockaddr *)&servaddr, sizeof(servaddr));
>>> +
>>> +    return sockfd;
>>> +}
>>> +
>>> +int verify_dest_files(char *log, char *orig, unsigned long chunk_no)
>>> +{
>>> +    FILE *logfile;
>>> +    struct dest_write_unit *dwus, dwu;
>>> +    unsigned long i, t_bytes = sizeof(struct dest_write_unit) * chunk_no;
>>> +    int fd = 0, ret = 0, o_ret;
>>> +    char dest[PATH_MAX];
>>> +
>>> +    memset(&dwu, 0, sizeof(struct dest_write_unit));
>>> +
>>> +    dwus = (struct dest_write_unit *)malloc(t_bytes);
>>> +    memset(dwus, 0, t_bytes);
>>> +
>>> +    logfile = fopen(log, "r");
>>> +    if (!logfile) {
>>> +        fprintf(stderr, "Error %d opening dest log: %s\n", errno,
>>> +            strerror(errno));
>>> +        ret = -EINVAL;
>>> +        goto bail;
>>> +    }
>>> +
>>> +    while (!feof(logfile)) {
>>> +
>>> +        ret = fscanf(logfile, "%lu\t%llu\t%c\n", &dwu.d_chunk_no,
>>> +                     &dwu.d_timestamp, &dwu.d_char);
>>> +
>>> +        if (ret != 3) {
>>> +            fprintf(stderr, "input failure from dest log, ret "
>>> +                "%d, %d %s\n", ret, errno, strerror(errno));
>>> +            ret = -EINVAL;
>>> +            goto bail;
>>> +        }
>>> +
>>> +        if (dwu.d_timestamp >= dwus[dwu.d_chunk_no].d_timestamp) {
>>> +
>>> +            /*
>>> +            printf("#%lu \tchunk record updated, from [%llu](%c) to [%llu](%c)\n",
>>> +                dwu.d_chunk_no, dwus[dwu.d_chunk_no].d_timestamp, dwus[dwu.d_chunk_no].d_char,
>>> +                dwu.d_timestamp, dwu.d_char);
>>> +            */
>>> +            memmove(&dwus[dwu.d_chunk_no], &dwu,
>>> +                sizeof(struct dest_write_unit));
>>> +        }
>>> +
>>> +    }
>>> +
>>> +    fd = open64(orig, open_ro_flags, FILE_MODE);
>>> +    if (fd < 0) {
>>> +        ret = fd;
>>> +        fd = errno;
>>> +        fprintf(stderr, "open file %s failed:%d:%s\n",
>>> +            orig, fd, strerror(fd));
>>> +        goto bail;
>>> +    }
>>> +
>>> +    for (i = 0; i < chunk_no; i++) {
>>> +
>>> +        ret = pread(fd, chunk_pattern, CHUNK_SIZE, CHUNK_SIZE * i);
>>> +        if (ret < 0) {
>>> +            o_ret = ret;
>>> +            ret = errno;
>>> +            fprintf(stderr, "read failed:%d:%s\n", ret,
>>> +                strerror(ret));
>>> +            ret = o_ret;
>>> +            goto bail;
>>> +        }
>>> +
>>> +        if (!verify_chunk_pattern(chunk_pattern, dwus[i])) {
>>> +
>>> +            dump_pattern(chunk_pattern, &dwu);   
>>> +            fprintf(stderr, "An inconsistent chunk record found!\n"
>>> +                "Expected:\tchunkno(%ld)\ttimestamp(%llu)\tchar(%c)\n"
>>> +                "Found   :\tchunkno(%ld)\ttimestamp(%llu)\tchar(%c)\n",
>>> +                dwus[i].d_chunk_no, dwus[i].d_timestamp, dwus[i].d_char,
>>> +                dwu.d_chunk_no, dwu.d_timestamp, dwu.d_char);
>>> +            ret = -1;
>>> +            goto bail;
>>> +
>>> +        }
>>> +
>>> +    }
>>>       
>>
>> So I wanted the timestamp in the header and trailer to expose fractured
>> blocks. Say we make two writes to a chunk and both writes happen to
>> use the same random character. Comparing the timestamp in the header
>> with that in the trailer will ensure the entire chunk hit the disk.
>> Currently your verification is not comparing the header with the
>> trailer.
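
A minimal sketch of the header/trailer comparison described above, assuming the chunk layout written by fill_chunk_pattern() (chunkno, timestamp, checksum at the front, and checksum, timestamp, chunkno at the end). The function name chunk_is_intact() and the HDR_LEN macro are assumptions; the chunk size is passed in because CHUNK_SIZE is not shown in this excerpt.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Header length: chunkno + timestamp + checksum; the trailer mirrors it. */
#define HDR_LEN (2 * sizeof(unsigned long) + sizeof(unsigned long long))

/*
 * Hypothetical check: return 1 if the chunk number and timestamp stored
 * in the header match those stored in the trailer, 0 if the chunk looks
 * fractured (only part of it reached the disk).
 */
int chunk_is_intact(const char *chunk, size_t chunk_size)
{
	unsigned long head_no, tail_no;
	unsigned long long head_ts, tail_ts;
	size_t tail;

	assert(chunk != NULL && chunk_size >= 2 * HDR_LEN);

	tail = chunk_size - HDR_LEN;	/* trailer starts with the checksum */

	memcpy(&head_no, chunk, sizeof(head_no));
	memcpy(&head_ts, chunk + sizeof(unsigned long), sizeof(head_ts));
	memcpy(&tail_ts, chunk + tail + sizeof(unsigned long), sizeof(tail_ts));
	memcpy(&tail_no, chunk + chunk_size - sizeof(unsigned long),
	       sizeof(tail_no));

	return head_no == tail_no && head_ts == tail_ts;
}
```

Such a check could run on each chunk read back in verify_dest_files() before the comparison against the logged record, so that two writes with the same random character are still told apart by timestamp.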
>>
>