[Ocfs2-users] Journal replay after crash, kernel BUG at fs/ocfs2/journal.c:1700!, 2.6.36
Ronald Moesbergen
intercommit at gmail.com
Fri Oct 29 04:51:33 PDT 2010
>>>> [157768.261818] (ocfs2rec,14060,0):ocfs2_replay_journal:1605
>>>> Recovering node 0 from slot 0 on device (8,32)
>>>> [157772.850182] ------------[ cut here ]------------
>>>> [157772.850211] kernel BUG at fs/ocfs2/journal.c:1700!
>>>
>>> Strange. The bug line is
>>> BUG_ON(osb->node_num == node_num);
>>> and it detects the same node number in the cluster.
>
> I just tried to reproduce it and succeeded. Here's what I did:
> - Unmount the filesystem on node app02
> - Shut down the o2cb services on app02
> - Do a halt -f on app01, which still has the OCFS2 volume mounted
> - Start the o2cb services on app02
> - Mount the OCFS2 filesystem -> BUG
>
> Works every time. So one of the two variables checked in that BUG_ON
> statement must not be set correctly somewhere.
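
To make the quoted check concrete, here is a rough user-space model of it. This is only a sketch: the names fake_osb and recovery_would_bug are made up for illustration and are not the kernel's code. Only the comparison osb->node_num == node_num is taken from the quoted BUG_ON line, and I am assuming that node_num there is the node being recovered.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-mount state; the real kernel
 * structure has many more fields. */
struct fake_osb {
    unsigned int node_num;   /* node number of the local (mounting) node */
};

/* Returns true when the condition from the quoted BUG_ON would fire,
 * i.e. the node number passed in equals the local node's number.
 * Assumption: node_num is the node whose journal is being replayed. */
bool recovery_would_bug(const struct fake_osb *osb, unsigned int node_num)
{
    return osb->node_num == node_num;
}
```

Under that assumption, with the log reporting "Recovering node 0", the check can only fire if the mounting node also believes it is node 0.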

One final bit of information: I just retested on 2.6.35.7 and it works
fine there, so this looks like a regression.

[ 819.719661] (ocfs2rec,4135,0):ocfs2_replay_journal:1605 Recovering node 0 from slot 1 on device (8,32)
[ 823.013843] (ocfs2rec,4135,0):ocfs2_begin_quota_recovery:407 Beginning quota recovery in slot 1
[ 823.018420] (ocfs2_wq,4117,0):ocfs2_finish_quota_recovery:598 Finishing quota recovery in slot 1

Note, though, that the slot number in the recovery message differs:
slot 0 on 2.6.36 versus slot 1 on 2.6.35.7 (recovery was running on
app02 in both cases).

Regards,
Ronald.