No dialect of SQL, as far as I know, allows you to average date values – but a little mathematical investigation reveals there’s no good reason for that.

Try the following Postgres query on the scott schema:

SELECT avg(hiredate)
FROM emp

Postgres complains that avg is not defined for the date type:

ERROR:  function avg(date) does not exist
LINE 2: SELECT avg(hiredate)
               ^
HINT:  No function matches the given name and argument types. You might
need to add explicit type casts.

Think about it this way: the average tenure of employees – the number of days since they were hired – is clearly well-defined, because tenure is a number, and you can add up the values and divide by the count. But when you have the average tenure, you can subtract it from today’s date and find the average hiredate:

SELECT current_date
     + ROUND(AVG(hiredate - current_date))::int AS avg_hiredate
FROM emp;

 avg_hiredate 
--------------
 1982-05-03
(1 row)

By the way, you don’t have to use today’s date. Whichever origin you pick, the query will give the same answer.

This origin-invariance is why you can’t easily tell what epoch a system uses to store dates. While the representation must have an origin – UNIX uses January 1st, 1970, Postgres uses January 1st, 2000, and Excel effectively uses December 31st, 1899 – a well-functioning date type doesn’t give any clues what that origin is.

Why do dates behave this way? There is no zero date, and hiredate * 2 is meaningless. What is meaningful is the difference of two dates (a number of days) and a date plus a number of days. A set with that structure – you can subtract two points to get a vector, and add a vector to a point, but not add two points – is called an affine space.

Temperatures are an affine space: the average of 0°C and 100°C is 50°C, and in Fahrenheit, the average of 32° and 212° is 122°, which is the same temperature – but adding two temperatures is nonsense.

Pointers in the C and C++ programming languages work the same way. Given a buffer defined by its endpoints, pointers pStart and pEnd, can you find the midpoint? It is not the arithmetic mean:

(pStart + pEnd) / 2

That expression does not compile, because you cannot add two pointers. But you can subtract them:

pStart + (pEnd - pStart) / 2

Another sign you are dealing with an affine space is when subtracting two values yields a different type. (The difference between two SQL DATE values is an INTERVAL; the difference between two C void * values is a ptrdiff_t.) The difference is a vector, not a point. Vectors have a zero value, which is the value you get when you subtract a point from itself.

The geospatial folks have their act together, as usual. The PostGIS ST_Centroid function lets you compute the center of mass – “average” generalized to 2D or 3D – of a collection of points, lines, shapes, or other geometric objects. Once again, the representation of geometries requires an origin, but the choice of origin does not affect the calculated centroid.

Conclusion

Dates, temperatures, pointers, and other affine spaces all have well-defined averages.

Because affine spaces have no origin, you cannot add values, and therefore the “add them up and divide by n” formula for arithmetic mean does not work, but you can average them by picking an arbitrary origin and averaging the distances to that origin.

So, let’s add avg for DATE, DATETIME and TIMESTAMP values to SQL.

If you have comments, please reply on Bluesky @julianhyde.bsky.social or Twitter: