Window Functions

This page unpacks QuestDB window functions and provides references.

Window functions exist within many SQL dialects. QuestDB is consistent with expected function.

What is a Window Function?

A window function performs a calculation across a set of rows that are related to the current row. This set of related rows is called a "window", defined by an OVER clause that follows the window function.

In practical terms, window functions are used when you need to perform a calculation that depends on a group of rows, but you want to retain the individual rows in the result set. This is different from aggregate functions like a cumulative sum or avg, which perform calculations on a group of rows and return a single result.

The underlying mechanism of a window function involves three components:

  • Partitioning: The PARTITION BY clause divides the result set into partitions (groups of rows) upon which the window function is applied. If no partition is defined, the function treats all rows of the query result set as a single partition.

  • Ordering: The ORDER BY clause within the OVER clause determines the order of the rows in each partition.

  • Frame Specification: This defines the set of rows included in the window, relative to the current row. For example, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW includes all rows from the start of the partition to the current row.

Use cases for window functions are vast.

They are often used in analytics for tasks such as:

  • Calculating running totals or averages
  • Finding the maximum or minimum value in a sequence or partition
  • Ranking items within a specific category or partition
  • Calculating moving averages or cumulative sums

Window functions are tough to grok.

An analogy before we get to building:

Imagine a group of cars in a race. Each car has a number, a name, and a finish time. If you wanted to know the average finish time, you could use an aggregate function like avg to calculate it. But this would only give you a single result: the average time. You wouldn't know anything about individual cars' times.

Now, let's say you want to know how each car's time compares to the average. Enter window functions. A window function allows you to calculate the average finish time (the window), but for each car (row) individually.

For example, you could use a window function to calculate the average finish time for all cars, but then apply this average to each car to see if they were faster or slower than the average. The OVER clause in a window function is like saying, "for each car, compare their time to the average time of all cars."

So, in essence, window functions allow you to perform calculations that consider more than just the individual row or the entire table, but a 'window' of related rows. This 'window' could be all rows with the same value in a certain column, like all cars of the same engine size, or it could be a range of rows based on some order, like the three cars who finished before and after a certain car.

This makes window functions incredibly powerful for complex calculations and analyses.

Building Window Functions

At the peak of its complexity, a window function can appear as such:

functionName OVER (
PARTITION BY columnName [, columnName ...]
ORDER BY columnName [ASC | DESC] [, columnName [ASC | DESC] ...]
RANGE | ROWS (
UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW
| BETWEEN (UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW)
AND (offset PRECEDING | CURRENT ROW)
)
EXCLUDE CURRENT ROW | EXCLUDE NO OTHERS
)

The broad scope of possible choices can be overwhelming. But once the options become clear, assembling valuable and performant queries is quick work.

We will break down the above into:

  1. Base Function
  2. OVER Clause - PARTITION & ORDER
  3. "Frame" Clause - RANGE or ROWS
  4. Exclusion Option

Base Function

This reference page demonstrates 5 base functions:

  1. avg()
  2. first_value()
  3. rank()
  4. row_number()
  5. Cumulative sum()

We can assemble our window functions into "blocks" so that it is easier to understand. Each block is then explained in its own small section.

The base function is the first block.

It contextualizes the "way we look through our window":

avg(price) ... (
...
)
...
)

Next we define OVER. It is the key to assembling valuable and performant window functions:

avg(price) OVER (
...
)
...
)

Within OVER, we will define PARTITION BY and ORDER BY, as well as provide our "Frame" clause, which details our RANGE or ROWS. This is the heart and shape of our window:

avg(price) OVER (
PARTITION BY columnName [, columnName ...]
ORDER BY columnName [ASC | DESC] [, columnName [ASC | DESC] ...]
RANGE | ROWS (
UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW
| BETWEEN (UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW)
AND (offset PRECEDING | CURRENT ROW)
)
...
)

Finally, our exclusion clauses indicate what to omit. It's a bit like sculpting the final details into the window:

avg(price) OVER (
PARTITION BY columnName [, columnName ...]
ORDER BY columnName [ASC | DESC] [, columnName [ASC | DESC] ...]
RANGE | ROWS (
UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW
| BETWEEN (UNBOUNDED PRECEDING | offset PRECEDING | CURRENT ROW)
AND (offset PRECEDING | CURRENT ROW)
)
EXCLUDE CURRENT ROW | EXCLUDE NO OTHERS
)

OVER Clause - PARTITION & ORDER

The OVER clause defines how data is grouped and processed. When you set the function ahead of OVER, it's a bit like a "for each" operation. It is framed as: "perform this function OVER related rows based on the following terms".

It can be used with PARTITION BY and ORDER BY to set unique parameters and organize the rows. For performance reasons, if ORDER BY is set within an OVER clause, it should match the base query's ORDER BY.

"Frame" Clause - RANGE or ROWS

Window functions use a "frame" to define the subset of data the function operates on. Two modes are available for defining this frame: RANGE and ROWS.

RANGE Mode

RANGE mode defines the window frame based on a range of values in the ORDER BY column. This is useful when the data has a continuous or time-based nature.

For example, to calculate a moving average of prices over time, you might use RANGE mode with ORDER BY timestamp:

SELECT symbol, price, timestamp,
avg(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' HOUR PRECEDING AND CURRENT ROW)
as moving_avg
FROM trades

This calculates the average price for each symbol, for the current row and all rows with a timestamp within the preceding hour.

ROWS Mode

ROWS mode defines the window frame based on a specific number of rows. This is useful when you want to consider a fixed number of rows, regardless of their values.

For example, to calculate a moving average of the last N prices, you might use ROWS mode:

SELECT symbol, price, timestamp,
avg(price) OVER (PARTITION BY symbol ORDER BY timestamp ROWS BETWEEN 3 PRECEDING AND CURRENT ROW) as moving_avg
FROM trades

This calculates the average price for each symbol, for the current row and the three preceding rows.

Common Syntax

Both RANGE and ROWS modes share similar syntax for defining the frame:

  • UNBOUNDED PRECEDING: The window starts at the first row of the partition
  • value PRECEDING or offset PRECEDING: The window starts at a specified value or number of rows before the current row
  • CURRENT ROW: The window starts or ends at the current row
  • BETWEEN (UNBOUNDED PRECEDING | value PRECEDING | CURRENT ROW) AND (value PRECEDING | CURRENT ROW): The window starts and ends at specified points relative to the current row

The choice between RANGE and ROWS depends on the nature of your data and the specific requirements of your calculation.

Default Frame Definition

When the frame clause is not specified, the default frame is RANGE UNBOUNDED PRECEDING, which includes all rows from the start of the partition to the current row.

  • If ORDER BY is not present, the frame includes the entire partition, as all rows are considered equal.

  • If ORDER BY is present, the frame includes all rows from the start of the partition to the current row. Note that UNBOUNDED FOLLOWING is only allowed when the frame start is UNBOUNDED PRECEDING, which means the frame includes the entire partition.

Exclusion Option

The OVER clause can also include an exclusion option, which determines whether certain rows are excluded from the frame:

  • EXCLUDE CURRENT ROW: Excludes the current row in ROWS mode and all rows with the same ORDER BY value in RANGE mode. This is equivalent to setting the frame end to 1 PRECEDING.
  • EXCLUDE NO OTHERS: Includes all rows in the frame. This is the default if no exclusion option is specified.

Time Units

The time units that can be used in window functions are:

  • day
  • hour
  • minute
  • second
  • millisecond
  • microsecond

Plural forms of these time units are also accepted.

avg

In the context of window functions, avg(value) calculates the average of value over the set of rows defined by the window frame.

Arguments:

  • value: The column of numeric values to calculate the average of.

Return value:

  • The average of value for the rows in the window frame.

Description

When used as a window function, avg() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.

The avg() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate average for each row, based on the corresponding window of rows.

Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.

Examples:

Examples below use trades table:

CREATE TABLE trades (
symbol SYMBOL,
price DOUBLE,
amount DOUBLE,
timestamp TIMESTAMP
) TIMESTAMP (timestamp)
PARTITION BY DAY WAL;

INSERT INTO trades(symbol, price, amount, timestamp)
VALUES
('ETH-USD', 2615.54, 0.00044, '2022-03-08 18:03:57'),
('BTC-USD', 39269.98, 0.001, '2022-03-08 18:03:57'),
('BTC-USD', 39265.31, 0.000127, '2022-03-08 18:03:58'),
('BTC-USD', 39265.31, 0.000245, '2022-03-08 18:03:58'),
('BTC-USD', 39265.31, 0.000073, '2022-03-08 18:03:58'),
('BTC-USD', 39263.28, 0.00392897, '2022-03-08 18:03:58'),
('ETH-USD', 2615.35, 0.02245868, '2022-03-08 18:03:58'),
('ETH-USD', 2615.36, 0.03244613, '2022-03-08 18:03:58'),
('BTC-USD', 39265.27, 0.00006847, '2022-03-08 18:03:58'),
('BTC-USD', 39262.42, 0.00046562, '2022-03-08 18:03:58');

Moving average price over latest 4 rows

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW)
FROM trades
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z39267.645
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z39266.8666
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z39266.4775
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39264.8025
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z2615.445
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z2615.4166
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39264.7925
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39264.07

Moving average price over preceding X rows

If frame is specified only on preceding rows, avg() returns null until at least one non-null value enters the frame.

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol
ROWS BETWEEN UNBOUNDED PRECEDING AND 4 PRECEDING)
FROM trades
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Znull
BTC-USD39269.980.0012022-03-08T18:03:57.710419Znull
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Znull
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Znull
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Znull
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Znull
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Znull
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39267.6450
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39266.8666

Moving average price over values in the latest second

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW)
FROM trades
ORDER BY timestamp
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z39267.645
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z39266.8666
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z39266.4775
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39265.838
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z2615.35
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z2615.355
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39265.7433
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39265.2685

Moving average price over values in the latest second, descending designated timestamp order

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol
ORDER BY timestamp DESC
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW)
FROM trades
ORDER BY timestamp DESC
symbolpriceamounttimestampavg
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39262.42
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39263.845
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z2615.36
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z2615.355
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39263.6566
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z39264.07
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z39264.318
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z39264.4833
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z39265.2685
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z2615.54

Moving average over default frame

Default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which spans whole partition in absence of ORDER BY clause.

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol)
FROM trades
ORDER BY timestamp
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z2615.4166
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z39265.2685
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z39265.2685
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z39265.2685
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z39265.2685
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39265.2685
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z2615.4166
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z2615.4166
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39265.2685
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39265.2685

Moving average over default ordered frame

SELECT symbol, price, amount, timestamp,
avg(price) OVER (PARTITION BY symbol ORDER BY timestamp)
FROM trades
ORDER BY timestamp
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z39267.645
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z39266.8666
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z39266.4775
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z39265.838
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z2615.445
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z2615.4166
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z39265.7433
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z39265.2685

Moving average over whole result set

SELECT symbol, price, amount, timestamp,
avg(price) OVER ()
FROM trades
ORDER BY timestamp
symbolpriceamounttimestampavg
ETH-USD2615.540.000442022-03-08T18:03:57.609765Z28270.3130
BTC-USD39269.980.0012022-03-08T18:03:57.710419Z28270.3130
BTC-USD39265.310.0001272022-03-08T18:03:58.357448Z28270.3130
BTC-USD39265.310.0002452022-03-08T18:03:58.357448Z28270.3130
BTC-USD39265.310.0000732022-03-08T18:03:58.357448Z28270.3130
BTC-USD39263.280.003928972022-03-08T18:03:58.357448Z28270.3130
ETH-USD2615.350.022458682022-03-08T18:03:58.612275Z28270.3130
ETH-USD2615.360.032446132022-03-08T18:03:58.612275Z28270.3130
BTC-USD39265.270.000068472022-03-08T18:03:58.660121Z28270.3130
BTC-USD39262.420.000465622022-03-08T18:03:58.660121Z28270.3130

first_value

In the context of window functions, first_value(value) calculates the first value in the set of rows defined by the window frame.

Arguments:

  • value: Any numeric value.

Return value:

  • The first occurrence of value (including null) for the rows in the window frame.

Description

first_value() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.

The first_value() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.

Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.

Examples:

Examples below use trades table defined above.

First price over latest 4 rows

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
)
FROM
trades
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z39269.98
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39265.31
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39265.31
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39265.31

First price over preceding rows except 4 latest

If frame is specified only on preceding rows, first_value() returns null until at least one non-null value enters the frame.

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ROWS BETWEEN UNBOUNDED PRECEDING AND 4 PRECEDING
)
FROM
trades
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Znull
BTC-USD39269.980.0012022-03-08T18:03:57.000000Znull
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Znull
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Znull
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Znull
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Znull
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Znull
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39269.98
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39269.98

First value of price in the latest second

SELECT symbol, price, amount, timestamp,
first_value(price) OVER (PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW)
FROM trades
ORDER BY timestamp
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z39269.98
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39269.98
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39269.98

First value of price in the latest second, descending designated timestamp order

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp DESC
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW
)
FROM
trades
ORDER BY
timestamp DESC
symbolpriceamounttimestampfirst_value
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39262.42
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39262.42
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.36
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.36
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39262.42
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z39262.42
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z39262.42
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z39262.42
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39262.42
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.36

First value of price in default frame

Default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which spans the whole partition in absence of an ORDER BY clause.

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
)
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z39269.98
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39269.98
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39269.98

First value of price in default ordered frame

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
)
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z39269.98
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z39269.98
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z39269.98
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39269.98

First value of price in whole result set

SELECT
symbol,
price,
amount,
timestamp,
first_value(price) OVER ()
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampfirst_value
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z2615.54
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z2615.54
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z2615.54
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.54
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z2615.54
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z2615.54

rank

In the context of window functions, rank() assigns a unique rank to each row within the window frame, with the same rank assigned to rows with the same values. Rows with equal values receive the same rank, and a gap appears in the sequence for the next distinct value; that is, the row_number of the first row in its peer group.

Arguments:

  • rank() does not require arguments.

Return value:

  • The rank of each row within the window frame. Return value type is long.

Description

When used as a window function, rank() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY and ORDER BY components of the OVER clause.

The rank() function assigns a unique rank to each row within its window, with the same rank assigned to rows with the same values in the ORDER BY clause of the OVER clause. It ignores the frame clause, meaning it considers all rows in each partition, regardless of the frame specification.

Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.

Examples:

For a given table housing:

CREATE TABLE housing (
id INT,
price DOUBLE,
rating INT,
location STRING,
date_sold TIMESTAMP
);

INSERT INTO housing(id, price, rating, location, date_sold)
VALUES
(2, 246.3393, 1, 'alcatraz_ave', '2021-02-01 00:00:00'),
(10, 69.2601, 5, 'alcatraz_ave', '2021-02-01 04:00:00'),
(15, 616.2569, 3, 'westbrae', '2021-02-01 08:00:00'),
(3, 112.7856, 5, 'south_side', '2021-02-01 12:00:00'),
(17, 993.3345, 1, 'south_side', '2021-02-01 16:00:00'),
(8, 937.4274, 1, 'berkeley_hills', '2021-02-01 20:00:00'),
(4, 207.7797, 1, 'alcatraz_ave', '2021-02-02 00:00:00'),
(17, 352.3193, 3, 'downtown', '2021-02-02 04:00:00'),
(3, 140.0437, 1, 'westbrae', '2021-02-02 08:00:00'),
(15, 971.7142, 1, 'westbrae', '2021-02-02 12:00:00');

The following query uses rank() to display output based on the rating:

SELECT
location,
price,
date_sold,
rating,
rank() OVER (
ORDER BY rating ASC
) AS rank
FROM
housing
ORDER BY
rank
locationpricedate_soldratingrank
westbrae971.71422021-02-02T12:00:00.000000Z11
westbrae140.04372021-02-02T08:00:00.000000Z11
alcatraz_ave207.77972021-02-02T00:00:00.000000Z11
berkeley_hills937.42742021-02-01T20:00:00.000000Z11
south_side993.33452021-02-01T16:00:00.000000Z11
alcatraz_ave246.33932021-02-01T00:00:00.000000Z11
downtown352.31932021-02-02T04:00:00.000000Z37
westbrae616.25692021-02-01T08:00:00.000000Z37
south_side112.78562021-02-01T12:00:00.000000Z59
alcatraz_ave69.26012021-02-01T04:00:00.000000Z59

row_number

In the context of window functions, row_number() assigns a unique row number to each row within the window frame. For each partition, the row number starts with one and increments by one.

Arguments:

  • row_number() does not require arguments.

Return value:

  • The row number of each row within the window frame. Return value type is long.

Description

When used as a window function, row_number() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY and ORDER BY components of the OVER clause.

The row_number() function assigns a unique row number to each row within its window, starting at one for the first row in each partition and incrementing by one for each subsequent row. It ignores the frame clause, meaning it considers all rows in each partition, regardless of the frame specification.

Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.

Examples:

Given a table trades, the queries below use row_number() with a WHERE clause to filter trading records added within one day.

The following query assigns row numbers and orders output based on them:

SELECT
symbol,
price,
amount,
row_number() OVER () AS row_num
FROM trades
WHERE timestamp > DATEADD('d', -1, NOW())
ORDER BY row_num ASC;
-- The ORDER BY clause arranges the output based on the assigned row_num.
symbolpriceamountrow_num
BTC-USD20633.470.175692981
ETH-USD1560.041.32892
ETH-USD1560.040.33
ETH-USD15601.404267864
BTC-USD20633.480.001790925

The following query groups the table based on symbol and assigns row numbers to each group based on price:

SELECT
symbol,
price,
amount,
row_number() OVER (PARTITION BY symbol ORDER BY price) AS row_num
FROM trades
WHERE timestamp > DATEADD('d', -1, NOW())
ORDER BY row_num ASC;
-- The ORDER BY clause arranges the output based on the assigned row_num.
symbolpriceamountrow_num
BTC-USD1479.410.109046331
ETH-USD200000.11
BTC-USD1479.450.022
ETH-USD200000.0002492

Cumulative sum

In the context of window functions, sum(value) calculates the sum of value in the set of rows defined by the window frame. Also known as "cumulative sum".

Arguments:

  • value: Any numeric value.

Return value:

  • The sum of value for the rows in the window frame.

Description

When used as a window function, sum() operates on a "window" of rows defined by the OVER clause. The rows in this window are determined by the PARTITION BY, ORDER BY, and frame specification components of the OVER clause.

The sum() function respects the frame clause, meaning it only includes rows within the specified frame in the calculation. The result is a separate value for each row, based on the corresponding window of rows.

Note that the order of rows in the result set is not guaranteed to be the same with each execution of the query. To ensure a consistent order, use an ORDER BY clause outside of the OVER clause.

Examples:

Examples below use trades table defined above.

Moving price sum over latest 4 rows

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
ROWS BETWEEN 3 PRECEDING AND CURRENT ROW
)
FROM
trades
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z78535.29
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z117800.6
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z157065.91
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z157059.21
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z5230.89
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z7846.25
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z157059.17
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z157056.28

Moving price sum over preceding rows except 4 latest

If frame is specified only on preceding rows, sum() returns null until at least one non-null value enters the frame.

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
ROWS BETWEEN UNBOUNDED PRECEDING AND 4 PRECEDING
)
FROM
trades
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Znull
BTC-USD39269.980.0012022-03-08T18:03:57.000000Znull
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Znull
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Znull
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Znull
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z39269.98
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Znull
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Znull
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z78535.29
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z117800.6

Moving price sum over the latest second

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW
)
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z78535.29
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z117800.6
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z157065.91
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z196329.19
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z5230.89
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z7846.25
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z235594.46
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z274856.88

Moving price sum over the latest second, descending designated timestamp order

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
ORDER BY timestamp DESC
RANGE BETWEEN '1' SECOND PRECEDING AND CURRENT ROW
)
FROM
trades
ORDER BY
timestamp DESC
symbolpriceamounttimestampsum
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z39262.42
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z78527.69
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z2615.36
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z5230.71
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z117790.97
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z157056.28
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z196321.59
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z235586.9
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z274856.88
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z7846.25

Moving price sum over default frame

Default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which spans whole partition in absence of ORDER BY clause.

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
)
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z7846.25
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z274856.88
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z274856.88
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z274856.88
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z274856.88
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z274856.88
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z7846.25
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z7846.25
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z274856.88
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z274856.88

Moving price sum over default ordered frame

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER (
PARTITION BY symbol
ORDER BY timestamp
)
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z2615.54
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z39269.98
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z78535.29
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z117800.6
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z157065.91
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z196329.19
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z5230.89
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z7846.25
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z235594.46
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z274856.88

Moving price sum over whole result set

SELECT
symbol,
price,
amount,
timestamp,
sum(price) OVER ()
FROM
trades
ORDER BY
timestamp
symbolpriceamounttimestampsum
ETH-USD2615.540.000442022-03-08T18:03:57.000000Z282703.13
BTC-USD39269.980.0012022-03-08T18:03:57.000000Z282703.13
BTC-USD39265.310.0001272022-03-08T18:03:58.000000Z282703.13
BTC-USD39265.310.0002452022-03-08T18:03:58.000000Z282703.13
BTC-USD39265.310.0000732022-03-08T18:03:58.000000Z282703.13
BTC-USD39263.280.003928972022-03-08T18:03:58.000000Z282703.13
ETH-USD2615.350.022458682022-03-08T18:03:58.000000Z282703.13
ETH-USD2615.360.032446132022-03-08T18:03:58.000000Z282703.13
BTC-USD39265.270.000068472022-03-08T18:03:58.000000Z282703.13
BTC-USD39262.420.000465622022-03-08T18:03:58.000000Z282703.13