[Oracleasm-users] Re: [suse-oracle] re: SELS 10 - Kernel 2.6.16.27.0.9 locks up - Again.

Peter Santos psantos at cheetahmail.com
Wed May 2 05:36:36 PDT 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexei,
	the reason we are using asmlib is because our experience with managing
	raw devices is limited and we don't want to run into additional trouble
	down the road.
	
	We've tried these tests over and over, and the machine locks up after
	about an hour of running consecutive "dd" commands. When the oracleasm
	service is down we can't reproduce this, but when the service is up, we
	get the lock-up. The only thing that I'm uncertain about is that when the
	raw service starts up the raw devices are bound, but the permissions on
	those devices were root:root when oracleasm started. Only afterwards did
	I change the permissions. I'm going to try this test one more time in
	this sequence.
		1. bind the raw devices.
		2. set the proper permissions on those devices
		3. start the oracleasm service.
		4. run /etc/init.d/oracleasm status and listdisks to make sure that
		   everything looks correct.
		5. run a number of "dd" commands to some local storage and see if
		   machine locks up.
		   prompt>  dd if=/dev/zero of=/z0/test/testthere3 bs=4k count=22000000
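
Step 5 could be scripted roughly as follows, so that each "dd" round leaves a timestamped line on disk before and after it runs. This is only a sketch: the function name run_dd_rounds, its arguments, and the log format are hypothetical, and the log is best kept on a device other than the one under test.

```shell
#!/bin/sh
# Sketch only: run_dd_rounds, its arguments, and the log format are
# hypothetical and not part of oracleasm or any SLES tooling.
# Usage: run_dd_rounds TARGET_DIR ROUNDS BLOCK_COUNT LOGFILE
run_dd_rounds() {
    target_dir=$1
    rounds=$2
    count=$3
    logfile=$4
    i=1
    while [ "$i" -le "$rounds" ]; do
        # Log the start of the round and flush it to disk, so the last
        # line in the log shows how far the test got before a hang.
        echo "$(date '+%Y-%m-%d %H:%M:%S') round $i start" >> "$logfile"
        sync
        # Same dd invocation as in the test above, with 4k blocks.
        dd if=/dev/zero of="$target_dir/testfile$i" bs=4k count="$count" 2>/dev/null || return 1
        echo "$(date '+%Y-%m-%d %H:%M:%S') round $i done" >> "$logfile"
        sync
        i=$((i + 1))
    done
}
```

For example, "run_dd_rounds /z0/test 10 22000000 /root/dd-rounds.log" would repeat the test above ten times; the log path here is an assumption. After a lock-up and restart, the last "start" line without a matching "done" line shows which round was in flight.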

	The frustrating thing is that the machine just locks up and there is no logging. Also
	it requires that we go to the data center to physically restart the machine.

	The other thing is that our hardware is certified on SLES 9 (SP3), but not on SLES 10. Again,
	I'm not sure how important this is, but we can/might try SLES 9 if we can't get this resolved.
	The certification bulletin for our hardware on SLES 9 is 83873.
	
	Here is the module information for ASM.

	dbt1:~ # modinfo oracleasm
	filename:       /lib/modules/2.6.16.27-0.9-smp/kernel/drivers/addon/oracleasm/oracleasm.ko
	license:        GPL
	version:        2.0.3
	author:         Joel Becker <joel.becker at oracle.com>
	description:    Kernel driver backing the Generic Linux ASM Library.
	vermagic:       2.6.16.27-0.9-smp SMP gcc-4.1
	depends:
	srcversion:     B35F9F20EF40931C318A5EA
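
Since a version mismatch kept ASM from starting under the self-compiled kernel, one quick check is whether the module's vermagic matches the running kernel release. A minimal sketch, assuming a hypothetical helper named vermagic_matches that compares the first field of the vermagic string with a kernel release string:

```shell
# Sketch only: vermagic_matches is a hypothetical helper, not an
# oracleasm command. It returns 0 when the first field of a vermagic
# string equals the given kernel release.
vermagic_matches() {
    vermagic=$1
    release=$2
    # The kernel release is the first space-separated field of vermagic.
    [ "${vermagic%% *}" = "$release" ]
}

# Example against the running system (needs the module installed):
# vermagic_matches "$(modinfo -F vermagic oracleasm)" "$(uname -r)" && echo match
```

With the output above, the vermagic "2.6.16.27-0.9-smp SMP gcc-4.1" matches a kernel release of 2.6.16.27-0.9-smp and would not match a 2.6.21-smp kernel built from source.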

	Any ideas on how to troubleshoot this would be great!


- -peter


Alexei_Roudnev wrote:
> Advice # 1 - drop asmlib and never use it. It is useless piece of software.
> Linux have 'raw' which do the same but is standard component, not home made
> as asmlib.
> 
> Then repeat tests again.
> 
> ----- Original Message ----- 
> From: "Peter Santos" <psantos at cheetahmail.com>
> To: <suse-oracle at suse.com>
> Sent: Monday, April 30, 2007 12:15 PM
> Subject: [suse-oracle] re: SEL 10 - Kernel 2.6.16.27.0.9 locks up
> 
> 
> Folks,
> I'm trying to find out how to go about investigating an issue
> where our test server running 10.2.0.3 (x86_64) is locking up when we run
> a few dd commands sequentially
> (dd if=/dev/zero of=/z0/test/testthere2 bs=4k count=5000000),
> where /z0 was just some local storage.
> 
> He did a kernel upgrade to version 2.6.16.27.0.9 a couple of weeks ago. We
> then installed the following ASM packages on top of that.
> 
> oracleasmlib-2.0.2-1.x86_64.rpm
> oracleasm-support-2.0.3-1.x86_64.rpm
> oracleasm-2.6.16.27-0.9-smp-2.0.3-1.x86_64.rpm
> 
> We are using SLES 10 + 10.2.0.3 + ASM via ASMLib.
> 
> At random intervals the machine would crash with no information in
> /var/log/messages. We ran a memory test on it and it was fine. Finally our
> SA recompiled the latest kernel from source (2.6.21-smp) and after a number
> of "dd" tests, the machine did NOT crash. With the latest kernel from
> source, ASM was not started because of a version mismatch!
> 
> ASM may or may not be the problem, but what is the best way to
> troubleshoot this?
> The machine has the following spec:
> - Dell 6800  with 4 dual core CPUs (Intel(R) Xeon(TM) CPU 2.60GHz )
> - Storage is DS4400
> - Storage Driver: Fibre Channel: QLogic Corp. QLA2312 Fibre Channel Adapter (rev 02)
> 
> -peter
> 
> 

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGOIXUoyy5QBCjoT0RAtF2AKCGy+d6+p/C88fQ2pbEYOOjmKIWZQCeInqA
nhkQebGQE+Dz3tC3EpzhC/U=
=o4fN
-----END PGP SIGNATURE-----


