[Ocfs2-tools-devel] [PATCH] o2hbmonitor: add semaphore
Sunil Mushran
sunil.mushran at oracle.com
Fri Dec 10 08:48:24 PST 2010
On 12/10/2010 03:09 AM, Srinivas Eeda wrote:
> add semaphore to limit to one o2hbmonitor per node
>
> Signed-off-by: Srinivas Eeda <srinivas.eeda at oracle.com>
> ---
> o2monitor/o2hbmonitor.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
> 1 files changed, 45 insertions(+), 0 deletions(-)
>
> diff --git a/o2monitor/o2hbmonitor.c b/o2monitor/o2hbmonitor.c
> index f01da76..8c240f1 100644
> --- a/o2monitor/o2hbmonitor.c
> +++ b/o2monitor/o2hbmonitor.c
> @@ -43,6 +43,8 @@
> #include <libgen.h>
> #include <syslog.h>
> #include <errno.h>
> +#include <sys/ipc.h>
> +#include <sys/sem.h>
>
> #define SYS_CONFIG_DIR "/sys/kernel/config"
> #define O2HB_CLUSTER_DIR SYS_CONFIG_DIR"/cluster"
> @@ -61,6 +63,8 @@
> #define SLOW_POLL_IN_SECS 10
> #define FAST_POLL_IN_SECS 2
>
> +#define O2HB_SEM_MAGIC_KEY 0x4A32594B
Can you change this to 0x6F326862? That's "o2hb" in hex, which is
easier to remember.
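For reference, a minimal sketch of how the suggested value falls out of the ASCII codes for "o2hb". The helper name ascii_key is hypothetical and is not part of the patch; it only illustrates the byte packing.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Pack four ASCII characters into a 32-bit key, with the first
 * character in the most significant byte.
 * Hypothetical helper for illustration only.
 */
static uint32_t ascii_key(const char tag[4])
{
	return ((uint32_t)(unsigned char)tag[0] << 24) |
	       ((uint32_t)(unsigned char)tag[1] << 16) |
	       ((uint32_t)(unsigned char)tag[2] << 8) |
	       (uint32_t)(unsigned char)tag[3];
}
```

'o' is 0x6F, '2' is 0x32, 'h' is 0x68 and 'b' is 0x62, so ascii_key("o2hb") evaluates to 0x6F326862.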
> +
> char *progname;
> int interactive;
> int warn_threshold_percent;
> @@ -300,6 +304,41 @@ static void monitor(void)
> }
> }
>
> +static int getlock(void)
> +{
> +	int ret, semid, vals[1] = { 0 };
> +	struct sembuf trylock[2] = {
> +		{.sem_num = 0, .sem_op = 0, .sem_flg = SEM_UNDO|IPC_NOWAIT},
> +		{.sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO|IPC_NOWAIT},
> +	};
> +
> +	semid = semget(O2HB_SEM_MAGIC_KEY, 1, 0);
> +	if (semid < 0) {
> +		semid = semget(O2HB_SEM_MAGIC_KEY, 1,
> +			       IPC_CREAT|IPC_EXCL|S_IRUSR);
> +		if (semid < 0)
> +			goto out;
> +		semctl(semid, 0, SETALL, vals);
> +		if (semop(semid, trylock, 2) < 0)
> +			goto out;
> +		else
> +			return 0;
> +	}
> +	if (semop(semid, trylock, 2) < 0)
> +		goto out;
> +	return 0;
> +out:
> +	ret = errno;
> +	if (ret == EAGAIN) {
> +		fprintf(stderr, "o2hbmonitor already running\n");
> +		syslog(LOG_WARNING, "o2hbmonitor already running\n");
> +	} else {
> +		fprintf(stderr, "o2hbmonitor failed to start\n");
> +		syslog(LOG_WARNING, "o2hbmonitor failed to start\n");
> +	}
daemon() redirects stderr to /dev/null, so these fprintf() messages
will never be seen. That's why I had suggested taking the lock before
daemonizing. Also, we should not error out if we cannot determine
whether another instance is running. There is little downside to
multiple instances of this monitor running, but there is a real
downside if users assume it is running and it is not.
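A rough sketch of the policy described above, not a replacement for the patch: refuse to start only when the lock attempt proves another instance holds the semaphore (EAGAIN), and treat every other failure as "unknown". The names try_lock and should_refuse_start are hypothetical. The sketch assumes Linux behavior, where a newly created semaphore starts at 0, and uses mode 0600 so that semop() has alter permission. It does not address the interaction with daemon(): SEM_UNDO adjustments are per-process and are not inherited across fork().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

#define O2HB_SEM_KEY 0x6F326862	/* "o2hb" in hex */

/*
 * Try to take the per-node lock without blocking.
 * Returns 0 on success, or the errno of the call that failed.
 * EAGAIN means the semaphore is already held.
 * Assumption: a freshly created semaphore has value 0 (true on Linux,
 * not guaranteed by POSIX).
 */
int try_lock(key_t key)
{
	/* Applied atomically: require value 0, then raise it to 1. */
	struct sembuf trylock[2] = {
		{ .sem_num = 0, .sem_op = 0, .sem_flg = IPC_NOWAIT },
		{ .sem_num = 0, .sem_op = 1, .sem_flg = SEM_UNDO | IPC_NOWAIT },
	};
	int semid = semget(key, 1, IPC_CREAT | 0600);

	if (semid < 0)
		return errno;
	if (semop(semid, trylock, 2) < 0)
		return errno;
	return 0;
}

/*
 * Startup policy (hypothetical helper): only a definite "already
 * running" result blocks startup. Any other error leaves the answer
 * unknown, and the monitor should start anyway.
 */
int should_refuse_start(int lock_err)
{
	return lock_err == EAGAIN;
}
```

A caller would run try_lock(O2HB_SEM_KEY) while stderr is still attached, exit with a message if should_refuse_start() returns nonzero, and for any other nonzero result log a warning through syslog() and carry on into monitor().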
> +	return ret;
> +}
> +
> static void usage(void)
> {
> fprintf(stderr, "usage: %s [-w percent] -[ivV]\n", progname);
> @@ -360,6 +399,12 @@ int main(int argc, char **argv)
> }
>
> openlog(progname, LOG_CONS|LOG_NDELAY, LOG_DAEMON);
> +	ret = getlock();
> +	if (ret) {
> +		closelog();
> +		return ret;
> +	}
> +
> monitor();
> closelog();
>