


Oct 14th, 2008

SUSE: Mark Fasheh, Coly Li

ORACLE: Tao Ma, Tiger Yang, Sunil Mushran, Joel Becker, Tristan Ye

Kernel 2.6.28

The 2.6.28 merge window is open. Extended Attributes, Localalloc Throttling, JBD2 support and POSIX Lock support have been pushed. A few extended attribute patches still remain. Tiger and Tao will mail the patches for ACLs and Security attributes by Friday. If they do, Mark should be able to push the patches in this window.

Tiger will mail the tools patches for extended attributes after the kernel patches have been pushed.

Discussed the JBD2 soft lockup issue. As it happens only during umount, Mark suggested we check whether the journal has already been shut down.

Indeed it is. A patch with the fix will be provided for testing.

Joel has emailed JBD2 tools patches. Mark will review them.

OCFS2 1.4.2

Localalloc throttling needs to be backported to the 1.4 tree. That and the bug fixes should be enough on the fs side. Tools is missing Tao's fsck slot recovery patches that are being tested by Marcos. Also, Sunil will refresh the manpages.

Kernel 2.6.29

The list of features for 2.6.29 is as follows:


Jan Kara

Indexed Directories


Metadata Checksums


Online Add slots

Inode Allocation Improvement

Support for more than 32 bits worth of clusters

The last three tasks will be assigned once we are done with 2.6.28 merging.


Tristan will re-test extended attributes as pushed to Linus' tree.

Marcos has been testing 2.6.27-rc7 + patches for 2.6.28. While the tests have not yet completed, he has not encountered any crashes. However, he is seeing a slowdown during file creates. He will email us some numbers as compared to 2.6.26.

July 8th, 2008

SUSE: Mark Fasheh, Jeff Mahoney

ORACLE: Tao Ma, Tiger Yang, Sunil Mushran, Joel Becker, Marcos Matsunaga, Tristan Ye

Mark led the call by mentioning that he will be pushing a dlm bug fix patch tomorrow so that it hits 2.6.26. As that bug was introduced in 2.6.26-rc1, we shouldn't have to worry about older kernels.

The recovery/mount race was discussed. The issue is that in 2.6.25 we removed mount voting, and with it the message that told the recovery thread to skip a journal replay if another node managed to mount and take that slot before the recovery thread could lock the superblock. Currently the recovery thread attempts to lock the journal/slot and hangs if that slot is assigned to another node. While the race window is very small, we need to address this.

One solution is to keep a recovery generation in the journal inode. Mounting nodes replaying the journal will increment this generation. The recovery thread, before attempting the blocking lock on the journal, will dirty read the journal inode and compare the recovery generation with what it saved off during mount. If it has changed, it will skip the replay. If not, it will go ahead with the blocking lock. As all live nodes get the down event for a dead node, they will be able to keep the recovery generation up to date.
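The generation check described above can be sketched as a small model. This is only an illustration of the idea as minuted, not the actual patch: `JournalInode`, `replay_on_mount` and `should_replay` are hypothetical names, and the dirty read and the cluster lock are reduced to plain attribute access.

```python
from dataclasses import dataclass


@dataclass
class JournalInode:
    """Hypothetical stand-in for the on-disk journal inode of one slot."""
    recovery_generation: int = 0


def replay_on_mount(journal: JournalInode) -> None:
    """A mounting node that replays the journal bumps the generation."""
    journal.recovery_generation += 1


def should_replay(journal: JournalInode, saved_generation: int) -> bool:
    """Check made by the recovery thread before taking the blocking lock.

    Models the dirty read: replay only if the generation is unchanged
    since this node saved it off.
    """
    return journal.recovery_generation == saved_generation


journal = JournalInode()
saved = journal.recovery_generation        # saved off by a live node at mount

# Nobody has touched the slot yet, so the recovery thread would go ahead.
first_check = should_replay(journal, saved)

# Another node mounts, takes the slot and replays the journal first.
replay_on_mount(journal)

# The generation has changed, so the recovery thread skips the replay
# instead of blocking on a lock held by the new owner of the slot.
second_check = should_replay(journal, saved)
```

In this model the recovery thread never blocks on a slot that another node has already recovered, which is the hang the generation is meant to avoid.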

Tiger is working on the Extended attribute tools support.

Tristan has written an alpha drop of the single and multi node extended attribute tests. Marcos will review the tests. The tests and their statuses can be reviewed here.

Tao is working on adding inline data support in tools.

Joel is cleaning up tunefs. He is refactoring the code such that it becomes easy to add new functionality.

The array has been upgraded. Marcos will launch the db tests for 1.4 tonight.

Jeff asked about plans to add quotas. We need to look into it. He will be asking Jan Kara to join the effort.

The next call is scheduled for July 22nd, 2008.

May 27th, 2008

SUSE: Coly Li, Mark Fasheh, Jeff Mahoney

ORACLE: Tao Ma, Tiger Yang, Sunil Mushran, Joel Becker

Tiger's patch to add splice io support was pushed to the 1.4 tree. However, the build now fails on SLES10. Jeff will provide a patch to fix this issue. (splice io support will not be enabled on SLES10 as the kernel lacks the interface.)

Jeff will diff the 1.4 tree with the one in SLES10 and provide patches for the diffs. That is, ignoring the known userspace clusterstack (hasf) changes. The aim here is to ensure the two trees are as similar as possible.

Coly will start the destructive tests. He will email the test plan soon.

Tiger and Tao will email the first version of the EA patches next Friday.

Mark has coded up the dynamic sizing of local alloc. He will email the patches later this week/next week for review and testing.

There was a discussion about when to merge the stack-user branch to master. The question was whether to first address the bind mount bug in the branch itself or whether to merge and then fix the bug. The general consensus was to first merge and then fix, as the bug itself is fairly confined and easily fixable. The bug will have to be addressed before we ship 1.4.1 tools.

The next call is scheduled for June 10th, 2008.

April 1st, 2008

SUSE: Coly Li, Mark Fasheh, Jeff Mahoney

ORACLE: Tao Ma, Tiger Yang, Sunil Mushran, Joel Becker

Reviewed the list of patches for 1.4 queued for 2.6.26. Sunil needs to email a few more: o2net tracking and fslock instrumentation. Jan's dlm hash resize also needs to be looked at.

Mark will email lkml all patches queued for 2.6.26.

Coly ran all the tests on 2 nodes. He needs more nodes to run destructive tests.

Both ocfs2 and ocfs2-tools repos have been tagged with the 1.4.0 tag. SUSE will ship SLES10 SP2 with 1.4.0 of fs/tools.

We released the same packages for RHEL5 U2 Beta kernel 2.6.18-84.el5. Our next release will be after the patches queued for 2.6.26 are pushed upstream. We will call it 1.4.0-2.

The next call is scheduled for April 15th, 2008.

March 18th, 2008

SUSE: Coly Li

ORACLE: Tao Ma, Tiger Yang, Sunil Mushran

Informed all that Marcos was running tests for the BETA release. We are aiming to release the packages by the end of the week.

Coly announced that he has run most tests on a 2 node cluster. He will be getting two quad core boxes and was planning on running 4 Xen environments on each. Explained that it may be better if he ran 2, as we want at least a dual core on each node.

Tiger will look into activating splice io support in 1.4 for 2.6.18/2.6.16 kernels.

The next call is scheduled for April 1st, 2008.

March 4th, 2008

SUSE: Coly Li

ORACLE: Tao Ma, Marcos Matsunaga, Mark Fasheh, Joel Becker and Sunil Mushran

The discussion started with patch fixes pending for 2.6.25-rc4. Mark is getting the branch ready for Marcos to pull to run the tests. Meanwhile it was decided that we should pull even the trivial fixes into 1.4 (from mainline) to make future merging easier.

Jan's o2net null pointer fix patch is clashing with Tao's o2net reconnect patch. Tao will respin the patches and break them into two:

  1. o2net null pointer fix atop current mainline for pushing into 2.6.25-rc4.
  2. o2net reconnect patch with the null pointer fix for pushing into 2.6.26.

Some patches already in the queue for 2.6.26 are:

  1. Tao's inode steal patch
  2. Jan's ocfs2_rename() lock optimization patch

Some patches that need to be pulled into 1.4 from mainline are:

  1. Joel's new dlm handshake with versioning (including the endian fix)
  2. Mark's writeout fix in ocfs2_data_convert_worker()

Patches currently in review are:

  1. Jan's DLM hash size mount option
  2. Jan's fs lockres lock stats
  3. DLM debugging

Marcos indicated that he had tested ocfs2-tools head last week and that it was looking good. Tao has pushed the last of the online resize patches to the tools tree.

We are planning to release a BETA drop after the current bug fixes are pushed to the mainline. The packages will be provided with the 2.6.18-53.1.6.el5.bz472427 kernel. We will need to make that kernel available for download.

Coly has a 3 node Xen setup. He has run the single node tests and will get started on the multi node tests.

The next call is scheduled for March 18th, 2008.

February 5th, 2008

SUSE: Jeff Mahoney, Coly Li

ORACLE: Tao Ma, Marcos Matsunaga, Mark Fasheh, Joel Becker and Sunil Mushran

All of Jeff's compat patches for SLES10 have been pushed to the ocfs2-1.4 git tree. Also pushed were the new features added to 2.6.25, including clustered flock(), online resize, etc.

Next on the block are the ORACORE and CDSL patches.

Marcos has the ORACORE patch. The patch is smaller than the one we had in 1.2 in that it does not have any performance hack. What remains to be seen is whether tablespace creates are any better on sparse enabled volumes than on non-sparse enabled volumes. We are hoping that creates will be significantly faster on sparse enabled volumes and that performance on non-sparse volumes will be no worse than on 1.2.

The CDSL patch is still to be provided.

Marcos reported running all fs tests on a 4 node cluster without any problems.

Coly had issues running in his environment. Jeff suspects it is the SLES10 SP2 iscsi/xen code that is the cause of the problem and will provide Coly with SP1 kernel with ocfs2 1.4.

Tao will be working on a patch to allow the fs to steal inodes from the inode allocators of other slots. This fix is required as the fs returns ENOSPC when it cannot grow its own inode allocator, even though other slots may have plenty of space remaining.
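The intended behaviour can be illustrated with a toy model. This is a sketch under simplifying assumptions, not Tao's patch: `alloc_inode` is a hypothetical helper, each slot's allocator is reduced to a count of free inodes, and a count of zero stands for an allocator that is exhausted and cannot grow.

```python
import errno


def alloc_inode(free_inodes: list[int], my_slot: int) -> int:
    """Allocate one inode, preferring this node's own slot.

    free_inodes[i] is the number of free inodes in slot i's allocator
    (hypothetical model). Returns the slot the inode was taken from.
    """
    # Try our own allocator first, then steal from the other slots.
    order = [my_slot] + [s for s in range(len(free_inodes)) if s != my_slot]
    for slot in order:
        if free_inodes[slot] > 0:
            free_inodes[slot] -= 1
            return slot
    # Only now, with every slot exhausted, is ENOSPC the right answer.
    raise OSError(errno.ENOSPC, "no free inodes in any slot")


free = [0, 3]                      # slot 0 is exhausted, slot 1 has room
stolen_from = alloc_inode(free, my_slot=0)
```

Without stealing, the call for slot 0 would fail with ENOSPC even though slot 1 still has free inodes.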

Jan is getting access to a 16 node cluster and will be using it to get some performance numbers.

The next call is scheduled for February 19th, 2008.

January 29th, 2008

Call was canceled.

January 22nd, 2008

SUSE: Jan Kara, Jeff Mahoney

ORACLE: Marcos Matsunaga, Mark Fasheh, Joel Becker and Sunil Mushran

Details of the OCFS2 1.4 git tree were emailed last week. Jeff created a patch for it to build on SLES10 SP2. He wanted that patch to be added to the 1.4 tree. (Sunil will review the patch and ensure it does not break EL builds.)

Sunil announced 19 more patches were pushed today to make 1.4 compatible with 2.6.24.

Mark talked about the patches that are due to be pushed when 2.6.25 opens. These include clustered flock(), merging of the Meta and Data locks, removal of mount/umount votes as well as fs voting as a whole, online resize, among other changes.

Jeff expressed concern that xattr may not be available in time for SLES10 SP2. We will discuss this further next week.

Marcos is expected to get some nodes available later this week for 1.4 testing. Jeff indicated that Coly has been allocated some hardware and that he is expected to do the same on SP2.

The next call is scheduled for January 29th, 2008.

January 15th, 2008

Call was canceled.

January 8th, 2008

SUSE: Jan Kara, Jeff Mahoney and Coly Li

ORACLE: Marcos Matsunaga, Tao Ma, Mark Fasheh, Joel Becker and Sunil Mushran

It was announced that the OCFS2 1.4 GIT tree will be up by early next week. Jan and Jeff will then work on adding patches to make it work with SLES10 SP2.

Jan discussed his vfs patch for handling arch-specific s_maxbytes. This is a work-in-progress.

The next call is scheduled for January 15th, 2008.

December 18th, 2007

SUSE: Jan Kara, Jeff Mahoney and Coly Li

ORACLE: Marcos Matsunaga, Tiger Yang, Tao Ma, Mark Fasheh, Joel Becker and Sunil Mushran

After introductions, Sunil announced that the 1.4 git source tree was under construction. The source in that tree was obtained from mainline between 2.6.23 and 2.6.24. The git commit logs will have the exact details.

Marcos then described his testing procedure and that he runs all the fs tests against each new kernel tree. He is currently working on getting the test coverage numbers for the 2.6.24 kernel. The results will be posted on the wiki.

Jan mentioned that he had emailed Mark patches concerning s_maxbytes. Mark is reviewing the patches.

Jeff talked about customer requests coming from SUSE product management. Hosting Xen guest images was the biggest upcoming use case. His top two new feature requests were Extended Attributes (POSIX ACLs) and User Locks (with and without range locking).

Tiger talked about his progress with extended attributes. He has submitted patches that implement xattrs stored inline in the inode. He is currently working on supporting a larger number of xattrs, which have to be stored external to the inode.

Mark mentioned that he was working on implementing clustered flock(). For more on user locks, refer to this page. He also told Tao that his online resize patches were ready for commit.

Joel summarized the userspace clustering he has been working on. This new stack will be in addition to the in-kernel o2cb stack we have currently. Jeff mentioned that while SUSE is planning to stick with the Linux-HA stack they have currently, that stack will also be using openais under the hood. The open question here is the effort that will be required to integrate SUSE's Linux-HA stack with OCFS2's userspace clustering interface.

As far as tasks for the SUSE folks go, Jan will continue to work on benchmarking and look at performance bottlenecks. Coly Li will be helping us with testing. They both will continue to work on the mainline kernel/ocfs2 until 1.4 is available on SLES10.

As far as development tasks go, SUSE folks were made aware of the task lists maintained in the wiki and that one can contribute even by working on the userspace tools. That is, whenever we add a feature in the fs, we need supporting code in fsck/lib/mkfs/tune/debug. For example, inline data was pushed upstream before 2.6.24-rc1 but its tools support is still lagging.

The next call is scheduled for January 8th, 2008.

2012-11-08 13:01