Calculate Relationship Between X And Y Variables - Example 1

Calculate the linear relationship between message size and transmission time using the linReg() function

Query

logscale
linReg(x=bytes_sent, y=send_duration)

Introduction

The linReg() function can be used to calculate a linear relationship between two variables by using least-squares fitting. The function is used to analyze different performance relationships in a system, for example: response size and transmission time, server load and total response size, or server load and request types.

In this example, the linReg() function is used to calculate the linear relationship between bytes_sent (x variable) and send_duration (y variable), that is, between message size (the number of bytes sent by a server) and transmission time (the time taken to send those bytes).

Example incoming data might look like this:

@timestamp           bytes_sent  send_duration
2025-04-07 13:00:00  1024        0.15
2025-04-07 13:00:01  2048        0.25
2025-04-07 13:00:02  4096        0.45
2025-04-07 13:00:03  8192        0.85
2025-04-07 13:00:04  512         0.08
2025-04-07 13:00:05  16384       1.65
2025-04-07 13:00:06  3072        0.35
2025-04-07 13:00:07  6144        0.65
2025-04-07 13:00:08  10240       1.05
2025-04-07 13:00:09  4608        0.48
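
To make the least-squares fit concrete, the following is a minimal Python sketch that computes a slope and intercept for the sample events above. It is an illustration outside LogScale, not the linReg() implementation; the names rows and least_squares are hypothetical.

```python
# (bytes_sent, send_duration) pairs copied from the sample events.
rows = [
    (1024, 0.15), (2048, 0.25), (4096, 0.45), (8192, 0.85), (512, 0.08),
    (16384, 1.65), (3072, 0.35), (6144, 0.65), (10240, 1.05), (4608, 0.48),
]


def least_squares(points):
    """Return (slope, intercept) of the line y = slope * x + intercept
    that minimises the sum of squared vertical distances to the points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares(rows)
print(slope, intercept)
```

The printed values should agree, up to floating-point rounding, with the _slope and _intercept fields that the query returns for the same data.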

Step-by-Step

  1. Starting with the source repository events.

  2. Flowchart: Events -> Aggregate (highlighted) -> Result Set
    logscale
    linReg(x=bytes_sent, y=send_duration)

    Correlates bytes_sent with send_duration, showing the relationship between message size and transmission time (the variables x and y), and outputs the results in fields named _slope (slope value), _intercept (intercept value), _r2 (adjusted R-squared value), and _n (number of data points). These four values indicate the strength and reliability of the relationship.

  3. Event Result set.
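
The adjusted R-squared value reported in _r2 can be checked with a short Python sketch. It assumes the standard textbook adjustment for a model with one predictor, 1 - (1 - R²)(n - 1)/(n - 2); that LogScale uses exactly this formula is an assumption, although it is consistent with the sample data in this example. The fitted line is taken from the sample output, and all variable names are hypothetical.

```python
# (bytes_sent, send_duration) pairs copied from the sample events.
rows = [
    (1024, 0.15), (2048, 0.25), (4096, 0.45), (8192, 0.85), (512, 0.08),
    (16384, 1.65), (3072, 0.35), (6144, 0.65), (10240, 1.05), (4608, 0.48),
]

# Fitted line, copied from the _slope and _intercept sample output.
slope = 9.823069852941172e-5
intercept = 0.04276470588235326

n = len(rows)  # corresponds to _n
mean_y = sum(y for _, y in rows) / n

# Residual sum of squares: variation the fitted line does not explain.
ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in rows)
# Total sum of squares: variation of y around its mean.
ss_tot = sum((y - mean_y) ** 2 for _, y in rows)

r2 = 1 - ss_res / ss_tot
# Assumed adjustment for a single predictor (n - 2 degrees of freedom).
adjusted_r2 = 1 - (1 - r2) * (n - 1) / (n - 2)
print(n, r2, adjusted_r2)
```

Because the adjustment penalises small sample sizes, the adjusted value is always slightly lower than the plain R-squared value for the same fit.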

Summary and Results

The query is used to calculate a linear relationship between bytes_sent (x variable) and send_duration (y variable).

Calculating the relationship between size of data transferred and time taken to send data is useful, for example, in trend analysis, performance monitoring, or anomaly detection.

Sample output from the incoming example data:

_slope                _intercept           _r2                 _n
9.823069852941172E-5  0.04276470588235326  0.9996897336895081  10

_slope is the additional time needed per byte sent.

_intercept is the baseline transmission time.

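_r2 is the adjusted R-squared value, which indicates how well the linear model fits the data; a value close to 1 indicates a strong linear relationship.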

_n is the total number of data points analyzed.
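
As one way to use these values for performance monitoring or anomaly detection, the Python sketch below estimates the expected transmission time for a given message size and flags observations that take much longer. The function names and the 50% tolerance are hypothetical choices for illustration, not part of LogScale.

```python
# Values copied from the sample output above.
SLOPE = 9.823069852941172e-5       # additional time per byte sent
INTERCEPT = 0.04276470588235326    # baseline transmission time


def expected_duration(bytes_sent):
    """Predict send_duration from bytes_sent using the fitted line."""
    return SLOPE * bytes_sent + INTERCEPT


def is_anomalous(bytes_sent, send_duration, tolerance=0.5):
    """Return True if the observed duration exceeds the prediction
    by more than the given fraction (0.5 means 50% slower)."""
    return send_duration > expected_duration(bytes_sent) * (1 + tolerance)


print(expected_duration(8192))
print(is_anomalous(8192, 0.85))  # close to the fitted line
print(is_anomalous(8192, 2.0))   # far slower than predicted
```

A fixed relative tolerance is only a starting point; a threshold derived from the spread of the residuals would usually be more robust.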