[DTrace-devel] [PATCH] Review perf_event_open() calls

Kris Van Hees kris.van.hees at oracle.com
Wed Oct 14 22:18:34 PDT 2020


On Wed, Oct 14, 2020 at 05:35:45PM -0400, eugene.loh at oracle.com wrote:
> From: Eugene Loh <eugene.loh at oracle.com>
> 
> Our perf_event_open() calls should be cleaned up a little.  The issues
> are minor, with no user-observable effects, but cleaner code is better.
> 
> Set size in "struct perf_event_attr".

Sure, though see my suggestions below.

> Use flag PERF_FLAG_FD_CLOEXEC.

This is actually not necessary.  I am not opposed to it but the code as it
was didn't pose an issue because we're not using execve().

> Use PERF_SAMPLE_RAW only for the perf_event buffer to which BPF programs
> will write.

That is actually harmless, but yes, we can drop it.

> Distribute perf_events evenly among CPUs.  (Note that perf_event buffers
> and profile-n events are one-per-CPU.  Further, tick-n events are
> distributed randomly for load balancing.  The only change needed is to
> distribute tp_attach() randomly over CPUs rather than mapping them all
> to CPU 0.)

Can you explain your rationale here, i.e. point out where there is an issue?
As you mention, perf_event ring buffers (our output buffers) are per-CPU, and
profile-n events are (as required) per-CPU.  The tick-n events being spread
across all CPUs makes sense also because they are events that fire as timer
events (and are out of necessity tied to a specific CPU).  I am not quite sure
where speading SDT events over the various CPUs accomplishes load balancing
because those events are triggered by executing code and therefore execute on
the same CPU as where the executable code was running.  So I am unclear on
what exactly you are load balancing here?

> Signed-off-by: Eugene Loh <eugene.loh at oracle.com>
> ---
>  libdtrace/dt_peb.c          | 1 +
>  libdtrace/dt_prov_profile.c | 3 +--
>  libdtrace/dt_prov_sdt.c     | 8 ++++++--
>  3 files changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/libdtrace/dt_peb.c b/libdtrace/dt_peb.c
> index 0ef1c419..b8c49e2f 100644
> --- a/libdtrace/dt_peb.c
> +++ b/libdtrace/dt_peb.c
> @@ -75,6 +75,7 @@ dt_peb_open(dt_peb_t *peb)
>  	memset(&attr, 0, sizeof(attr));
>  	attr.config = PERF_COUNT_SW_BPF_OUTPUT;
>  	attr.type = PERF_TYPE_SOFTWARE;
> +	attr.size = sizeof(struct perf_event_attr);

attr.size = sizeof(attr);

>  	attr.sample_type = PERF_SAMPLE_RAW;
>  	attr.sample_period = 1;
>  	attr.wakeup_events = 1;
> diff --git a/libdtrace/dt_prov_profile.c b/libdtrace/dt_prov_profile.c
> index 8541bee6..22cd968a 100644
> --- a/libdtrace/dt_prov_profile.c
> +++ b/libdtrace/dt_prov_profile.c
> @@ -293,7 +293,6 @@ static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
>  	memset(&attr, 0, sizeof(attr));
>  	attr.type = PERF_TYPE_SOFTWARE;
>  	attr.config = PERF_COUNT_SW_CPU_CLOCK;
> -	attr.sample_type = PERF_SAMPLE_RAW;
>  	attr.size = sizeof(struct perf_event_attr);
>  	attr.wakeup_events = 1;
>  	attr.freq = 0;
> @@ -307,7 +306,7 @@ static int attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
>  			j = rand() % dtp->dt_conf.num_online_cpus;
>  
>  		fd = perf_event_open(&attr, -1, dtp->dt_conf.cpus[j].cpu_id,
> -				     -1, 0);
> +				     -1, PERF_FLAG_FD_CLOEXEC);
>  		if (fd < 0)
>  			continue;
>  		if (ioctl(fd, PERF_EVENT_IOC_SET_BPF, bpf_fd) < 0) {
> diff --git a/libdtrace/dt_prov_sdt.c b/libdtrace/dt_prov_sdt.c
> index 4b2ac31b..8775103f 100644
> --- a/libdtrace/dt_prov_sdt.c
> +++ b/libdtrace/dt_prov_sdt.c
> @@ -77,14 +77,18 @@ int tp_attach(dtrace_hdl_t *dtp, const dt_probe_t *prp, int bpf_fd)
>  	if (datap->event_fd == -1) {
>  		int			fd;
>  		struct perf_event_attr	attr = { 0, };
> +		int			cpu_index, cpu;
> +
> +		cpu_index = rand() % dtp->dt_conf.num_online_cpus;
> +		cpu = dtp->dt_conf.cpus[cpu_index].cpu_id;
>  
>  		attr.type = PERF_TYPE_TRACEPOINT;
> -		attr.sample_type = PERF_SAMPLE_RAW;
> +		attr.size = sizeof(struct perf_event_attr);

attr.size = sizeof(attr);

>  		attr.sample_period = 1;
>  		attr.wakeup_events = 1;
>  		attr.config = datap->event_id;
>  
> -		fd = perf_event_open(&attr, -1, 0, -1, 0);
> +		fd = perf_event_open(&attr, -1, cpu, -1, PERF_FLAG_FD_CLOEXEC);
>  		if (fd < 0)
>  			return dt_set_errno(dtp, errno);
>  
> -- 
> 2.18.4
> 
> 
> _______________________________________________
> DTrace-devel mailing list
> DTrace-devel at oss.oracle.com
> https://oss.oracle.com/mailman/listinfo/dtrace-devel



More information about the DTrace-devel mailing list