[Ocfs2-tools-devel] [RFC] PATCH: verify slot number in __ocfs2_read_slot_map()
Sunil Mushran
sunil.mushran at oracle.com
Sat Mar 7 09:21:52 PST 2009
On Sat, Mar 07, 2009 at 11:06:18PM +0800, Coly Li wrote:
> These days, I am testing ocfs2 with the user space cluster stack (pcmk). Right now
> there is a deadlock when stat-ing the ocfs2 volume mount point. See
> https://bugzilla.novell.com/show_bug.cgi?id=482752
>
> What I want to say is that when this deadlock happens, running mounted.ocfs2
> gets a segmentation fault. I traced the core dump: the data read by
> __ocfs2_read_slot_map() might be (partially?) invalid, so in ocfs2_print_nodes():
> 66 node_num = map->md_slots[i].sd_node_num;
> 67 if (names && names[node_num] && *(names[node_num]))
> node_num on line 66 can be a very large number (due to the invalid data from
> __ocfs2_read_slot_map()), and names[node_num] then references an illegal memory
> region.
>
> I checked the code and tried to find a way to verify whether the slot map read
> is valid when the deadlock happens, but I have no idea so far.
>
> A secondary solution is to verify the node numbers in the slot map in
> __ocfs2_read_slot_map(); I attach the patch here.
>
> I still have no idea how this deadlock happens and am still tracing the code.
> Forgive me that I cannot provide more information on the deadlock.
>
> Is the secondary solution acceptable?
> Or is there a way to check whether the I/O in __ocfs2_read_slot_map() is valid?
mounted.ocfs2 does dirty reads. So we cannot trust the read.
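Because the read is dirty, a caller that indexes an array with a node number taken from the slot map probably wants its own guard as well. A minimal sketch, assuming the caller knows the length of the names array; sketch_node_name() and names_len are hypothetical and not part of ocfs2-tools:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the name for node_num, or NULL if the
 * node number is out of range or no name is recorded for it.
 * names_len is an assumed parameter giving the number of entries in names.
 */
static const char *sketch_node_name(char *const *names, size_t names_len,
				    uint32_t node_num)
{
	/* Reject garbage node numbers before touching the array. */
	if (names == NULL || node_num >= names_len)
		return NULL;

	/* Same "is there a non-empty name" test as the quoted line 67. */
	if (names[node_num] == NULL || names[node_num][0] == '\0')
		return NULL;

	return names[node_num];
}
```

With a guard like this, a bogus node number leads to printing the bare number (or skipping the entry) instead of a segmentation fault.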
>
> Thanks.
>
> Signed-off-by: Coly Li <coly.li at suse.de>
> ---
> libocfs2/slot_map.c | 27 ++++++++++++++++++++++++++-
> 1 files changed, 26 insertions(+), 1 deletions(-)
>
> diff --git a/libocfs2/slot_map.c b/libocfs2/slot_map.c
> index c33f458..870112a 100644
> --- a/libocfs2/slot_map.c
> +++ b/libocfs2/slot_map.c
> @@ -54,6 +54,29 @@ void ocfs2_swap_slot_map_extended(struct ocfs2_slot_map_extended *se,
> bswap_32(se->se_slots[i].es_node_num);
> }
>
> +/* es_node_num should be swapped to local cpu endian */
> +static errcode_t __ocfs2_verify_node_num(struct ocfs2_slot_map *sm,
> + int num_slots)
> +{
> + int i;
> +
> + for (i = 0; i < num_slots; i++)
> + if (sm->sm_slots[i] > num_slots)
> + return OCFS2_ET_INTERNAL_FAILURE;
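This does not look right. num_slots should be changed to
OCFS2_MAX_NUM_NODES or whatever that macro is called. The slot contains
the node number; the slot number is implicit (it is the index into the
array), so the bound has to be the maximum node number, not the slot count.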
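To illustrate the point, here is a minimal sketch of such a check, bounded by the node limit instead of the slot count. SKETCH_MAX_NODES and its value are placeholders for the real macro, and the plain array stands in for the already byte-swapped slot array. A real check would also need to let through whatever value the on-disk format uses for an empty slot:

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the real maximum-node macro; the value is an assumption. */
#define SKETCH_MAX_NODES 255

/*
 * Hypothetical check: every slot must hold a node number below the
 * node limit. The slot index i is never compared with the contents,
 * because a slot's position says nothing about which node owns it.
 * Returns 0 if all entries are plausible, -1 otherwise.
 */
static int sketch_verify_node_nums(const uint16_t *slots, size_t num_slots)
{
	size_t i;

	for (i = 0; i < num_slots; i++)
		if (slots[i] >= SKETCH_MAX_NODES)
			return -1;

	return 0;
}
```

For example, a volume with two slots held by nodes 4 and 9 passes this check, whereas comparing against num_slots would wrongly reject it.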
> + return 0;
> +}
> +
> +/* es_node_num should be swapped to local cpu endian */
> +static errcode_t __ocfs2_verify_node_num_extended(struct ocfs2_slot_map_extended *se,
> + int num_slots)
> +{
> + int i;
> + for (i = 0; i < num_slots; i++)
> + if (se->se_slots[i].es_node_num > num_slots)
> + return OCFS2_ET_INTERNAL_FAILURE;
Same as above.
> + return 0;
> +}
> +
> static errcode_t __ocfs2_read_slot_map(ocfs2_filesys *fs,
> int num_slots,
> union ocfs2_slot_map_wrapper *wrap)
> @@ -90,13 +113,15 @@ static errcode_t __ocfs2_read_slot_map(ocfs2_filesys *fs,
> se = (struct ocfs2_slot_map_extended *)slot_map_buf;
> ocfs2_swap_slot_map_extended(se, num_slots);
> wrap->mw_map_extended = se;
> + ret = __ocfs2_verify_node_num_extended(se, num_slots);
> } else {
> sm = (struct ocfs2_slot_map *)slot_map_buf;
> ocfs2_swap_slot_map(sm, num_slots);
> wrap->mw_map = sm;
> + ret = __ocfs2_verify_node_num(sm, num_slots);
> }
>
> - return 0;
> + return ret;
> }
>
> errcode_t ocfs2_read_slot_map(ocfs2_filesys *fs,
>
>
>
> --
> Coly Li
> SuSE Labs
>