[Ocfs2-users] Partition table crash, where can I find debug message?

Frank Zhang Frank.Zhang at citrix.com
Wed Oct 12 10:49:44 PDT 2011


Sorry, it was not a power outage, just a normal reboot.
Is a normal reboot serious enough to corrupt the super block?

From: Frank Zhang
Sent: Wednesday, October 12, 2011 10:37 AM
To: 'Sunil Mushran'
Cc: 'ocfs2-users at oss.oracle.com'
Subject: RE: [Ocfs2-users] Partition table crash, where can I find debug message?

Thanks Sunil. Yes, the correct terminology is super block corruption.
I checked with my colleagues; they said the iSCSI server suffered a power outage yesterday, so they rebooted it.
Given that it was under heavy load from the many VMs running on it, I guess this may be the cause. I am now trying to recover it.

From: Sunil Mushran [mailto:sunil.mushran at oracle.com]
Sent: Wednesday, October 12, 2011 10:08 AM
To: Frank Zhang
Cc: 'ocfs2-users at oss.oracle.com'
Subject: Re: [Ocfs2-users] Partition table crash, where can I find debug message?

Not sure what you mean by a partition table crash. Is it that someone
overwrote the partition table on the iscsi server? That's what it looks
like. If mount cannot detect the fs type, then it means at least superblock
corruption. And such corruptions are typically caused by external entities.
Stray dd perhaps.
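
A quick read-only way to confirm that the primary superblock is gone is to look for its signature directly. The sketch below assumes the usual OCFS2 on-disk layout, in which the superblock sits in block 2 and starts with the string "OCFSV2", so the signature lands at byte offset 1024, 2048, 4096 or 8192 depending on the block size. The function name is made up for illustration, and /dev/sdX is the same placeholder used elsewhere in this thread.

```shell
#!/bin/sh
# Hypothetical helper: read-only probe for the OCFS2 superblock signature.
# Assumption: the superblock is block 2 and begins with "OCFSV2", so the
# byte offset is 2 * blocksize for block sizes 512, 1024, 2048 and 4096.
has_ocfs2_signature() {
    dev=$1
    for blocksize in 512 1024 2048 4096; do
        offset=$((blocksize * 2))
        # Read 6 bytes at the candidate offset; drop NUL bytes so the
        # shell does not complain when the area is zeroed.
        sig=$(dd if="$dev" bs=1 skip="$offset" count=6 2>/dev/null | tr -d '\000')
        if [ "$sig" = "OCFSV2" ]; then
            echo "signature found at byte offset $offset"
            return 0
        fi
    done
    echo "no OCFS2 signature found"
    return 1
}

# Example (needs read permission on the device):
#   has_ocfs2_signature /dev/sdX
```

If no signature turns up at any of the four offsets, that is consistent with mount being unable to detect the filesystem type. It says nothing about whether the backup superblocks are intact.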

Did you try recovering the superblock using one of the backups?
fsck.ocfs2 -r [1-6] /dev/sdX ?
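
Which of the six backup numbers are worth trying depends on the volume size. According to the fsck.ocfs2 man page, mkfs.ocfs2 places the backups at offsets 1G, 4G, 16G, 64G, 256G and 1T. The sketch below lists the candidates for a given size in bytes; the function name is hypothetical, it treats "G" as 2^30 bytes, and it assumes a backup can only exist if the volume extends past its offset. It does not check whether the volume was actually formatted with backup superblocks.

```shell
#!/bin/sh
# Hypothetical helper: list the backup superblock numbers that could exist
# on a volume of the given size (bytes), one "number offset" pair per line.
# Assumption: backup N lives at 1G * 4^(N-1), i.e. 1G, 4G, 16G, 64G, 256G, 1T.
backup_superblock_candidates() {
    size_bytes=$1
    n=1
    offset_gib=1
    while [ "$n" -le 6 ]; do
        offset_bytes=$((offset_gib * 1024 * 1024 * 1024))
        if [ "$size_bytes" -gt "$offset_bytes" ]; then
            echo "$n ${offset_gib}G"
        fi
        n=$((n + 1))
        offset_gib=$((offset_gib * 4))
    done
}

# Example: a 20 GiB volume could hold backups 1, 2 and 3.
#   backup_superblock_candidates $((20 * 1024 * 1024 * 1024))
# The size of a real device can be read with: blockdev --getsize64 /dev/sdX
```

Each number in the first column is a value that could be passed to fsck.ocfs2 -r.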

On 10/11/2011 07:04 PM, Frank Zhang wrote:
Hi Experts, I recently observed a partition table crash that really scared me.
I have two OVM servers sharing OCFS2 over iSCSI. After running a bunch of VMs for a while, all the VMs were gone and the OCFS2 mount points had disappeared on both hosts.
When I tried to mount it again, mounting the iSCSI device failed with "please specify filesystem type". I checked dmesg, but there is nothing useful except:

"SCSI device sdc: drive cache: write back
sdc: unknown partition table
sd 2:0:0:1: Attached scsi disk sdc
sd 2:0:0:1: Attached scsi generic sg3 type 0
OCFS2 Node Manager 1.4.4
OCFS2 DLM 1.4.4
OCFS2 DLMFS 1.4.4
OCFS2 User DLM kernel interface loaded
connection1:0: detected conn error (1011)"

Basically, after logging into the iSCSI device on both hosts, I created a soft link /dev/ovm_iscsi1 pointing to the device node under /dev/disk/by-path/real_isci_device, then I formatted /dev/ovm_iscsi1 as OCFS2 and mounted it somewhere (of course I configured /etc/ocfs2/cluster.conf and made sure o2cb started correctly).
Could somebody tell me where to get more debug info to trace the problem? This is really scary, considering I may lose all my VMs because of the silent crash.

And is there any way to recover the partition table? Thanks
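
As a starting point for collecting debug information, the kernel log on each host can be narrowed to lines that mention OCFS2 or the iSCSI connection. The filter below is only a sketch: the function name and the pattern list are assumptions chosen to match the kind of messages quoted above, not an exhaustive set of what the drivers can log.

```shell
#!/bin/sh
# Hypothetical helper: keep only log lines that look related to OCFS2,
# its cluster stack, or the iSCSI connection. Reads from standard input.
# The pattern list is an assumption and may need extending.
filter_storage_events() {
    grep -Ei 'ocfs2|o2net|o2hb|o2cb|iscsi|conn error|partition table'
}

# Examples:
#   dmesg | filter_storage_events
#   filter_storage_events < /var/log/messages
```

Running it on both hosts and comparing timestamps around the moment the mount points disappeared may show whether the iSCSI connection dropped before OCFS2 reported anything.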

_______________________________________________
Ocfs2-users mailing list
Ocfs2-users at oss.oracle.com
http://oss.oracle.com/mailman/listinfo/ocfs2-users
