[rds-devel] FW: RDS -- hanging kernel

Andy Grover andy.grover at oracle.com
Fri Apr 30 12:43:39 PDT 2010


On 04/29/2010 10:01 PM, Tang, Changqing wrote:
> Andy, I have a dynamic MPI test over RDS, where a rank is killed
> randomly, and a new process is forked and joins the 'game'. I get the
> following /var/log/messages output:

<snip>

> I understand the send completion error is a remote access error (10)
> or a remote op error (11).
>
> What could cause the recv completion error (4, local
> protection error)?

I'm afraid I don't know. We should be using the Reserved L_key so there 
shouldn't be any local access issues, afaik. I don't think this has 
previously been an issue we've had to address.
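
For reference when reading the log, the three status numbers mentioned 
above can be decoded with a small lookup. This is only a sketch: it 
assumes the numbers follow the order of enum ib_wc_status in the 
kernel's include/rdma/ib_verbs.h, it covers just the codes discussed in 
this thread, and wc_status_name() is a hypothetical helper, not 
something RDS provides.

```c
/* Completion status codes discussed in this thread.
 * Assumption: the numeric values follow the order of enum ib_wc_status
 * in the kernel's include/rdma/ib_verbs.h. Only three codes are listed. */
enum wc_status_subset {
    WC_LOC_PROT_ERR   = 4,   /* local protection error */
    WC_REM_ACCESS_ERR = 10,  /* remote access error */
    WC_REM_OP_ERR     = 11   /* remote operation error */
};

/* Hypothetical helper (not part of RDS): map a logged status number
 * to a readable name. */
static const char *wc_status_name(int status)
{
    switch (status) {
    case WC_LOC_PROT_ERR:   return "local protection error";
    case WC_REM_ACCESS_ERR: return "remote access error";
    case WC_REM_OP_ERR:     return "remote operation error";
    default:                return "not listed in this sketch";
    }
}
```

Calling wc_status_name(4) gives the local protection error string; any 
code outside the three above falls through to the default case.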

You might try asking again on linux-rdma at vger.kernel.org.

Regards -- Andy


