Date : 4 August 1999

Climate Normals, Part 1

During most TV weather reports, and some others, you'll see a mention of normals, particularly normal high and low temperatures. Because of weather's inherent variability, many people realize that a day when both the maximum and minimum temperatures equal the normals is quite rare. Meteorologically speaking though, the term does not have its typical connotation of something usual or expected. Instead, a normal refers to the average, or smoothed average, of a meteorological parameter. Though seemingly simple, this can become quite complicated, as illustrated below. Thus the word normal is used more in a statistical sense, likely chosen because a plot of daily average temperatures for a year :


often closely resembles a (statistical) normal distribution.

Climate Normals

At a National Climatic Data Center (NCDC) WWW site, the term climate normal is defined as the arithmetic average of a meteorological element during a 30-year period. As further explained at that site though, this is actually true only for annual and monthly normals, not for daily normals. The number 30 was likely chosen because it is often treated as the smallest sample size for which statistics are considered reliable. I.e., if daily values are considered, fewer than 30 data points are unlikely to produce meaningful or reliable statistics (though the exact number is arbitrary - 29 points is obviously not much different from 30). Too many years are undesirable, because climates can change. Thus, the most recent 3 decades provide a reasonable notion of what the current average weather should be.

Monthly and Annual Normals

Monthly normals are determined as the average of all monthly values during a 30-year period :

$$T_{mnorm} = \frac{1}{30k} \sum_{y=1}^{30} \sum_{d=1}^{k} T_{yd}$$

$T_{mnorm}$ : monthly normal
$y$ : year number of the 30-year period
$d$ : day of month with $k$ days
$T_{yd}$ : value of parameter $T$ during year $y$ & day $d$

For example, for a 31-day month (k = 31), 30 × 31 = 930 values are averaged if the record is complete. T above could represent minimum or maximum temperature, for example. The average of the 12 monthly normals determines the annual normals :

$$T_{anorm} = \frac{1}{12} \sum_{m=1}^{12} T_m$$

$T_{anorm}$ : annual normal
$m$ : number of month
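
The two averages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `monthly_normal` and `annual_normal` are hypothetical, the data are made up, and dividing by the actual count of values (rather than a fixed 30 k) is my assumption for handling a month whose length differs between years.

```python
def monthly_normal(values_by_year):
    """Average every daily value of one calendar month over all years.

    values_by_year: one list of daily values per year (hypothetical layout).
    """
    total = sum(sum(year) for year in values_by_year)
    count = sum(len(year) for year in values_by_year)
    return total / count


def annual_normal(monthly_normals):
    """Average the 12 monthly normals."""
    if len(monthly_normals) != 12:
        raise ValueError("expected 12 monthly normals")
    return sum(monthly_normals) / 12


# Made-up data: 30 Januaries of 31 days each (930 values),
# every day 20 in the first 15 years and 30 in the last 15 years.
january = [[20.0] * 31 for _ in range(15)] + [[30.0] * 31 for _ in range(15)]
print(monthly_normal(january))  # 25.0
```

With a complete record of 31-day months the divisor is 30 × 31 = 930, matching the example in the text.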

Daily Normals

As mentioned, the calculation of these differs slightly from the definition stated above. Though people seemingly don't mind monthly normals which vary irregularly, daily variations such as those shown above are not tolerated. Common sense says that if climate were unchanging for an infinite number of years, such averages should vary smoothly. Thus, rather than using discrete averages, daily normals are calculated from smoothly varying curves. Below I show that this attempt is only as good as the monthly normals are. (Though many more values determine monthly normals, persistent spells of unusual weather cause them also to differ significantly from the supposed ideal distribution.)

Cubic Spline

As described in a referenced link above, daily averages are not used for computing daily normals. Instead, a cubic spline is fit thru the monthly normals. A cubic spline consists of cubic (3rd degree) polynomials which are used for interpolating (passing directly thru) a series of data points. This is done such that the value of the function and its 2nd derivative match at the interpolation points. Considering the following diagram :


The following equations define the cubic spline (for which i = 1 to 5 is shown above) :

$$y = A y_i + B y_{i+1} + C y_i'' + D y_{i+1}''$$

$$A = \frac{x_{i+1} - x}{x_{i+1} - x_i}$$
$$B = \frac{x - x_i}{x_{i+1} - x_i}$$
$$C = \frac{(A^3 - A)(x_{i+1} - x_i)^2}{6}$$
$$D = \frac{(B^3 - B)(x_{i+1} - x_i)^2}{6}$$

Perhaps you recognize the equations for A & B as linear interpolation formulas between points $x_i$ & $x_{i+1}$, such that A = 1 & B = 0 at $x = x_i$ and A = 0 & B = 1 at $x = x_{i+1}$, with intermediate A & B values between those points. The derivatives of $y$ are :

$$y' = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{(3A^2 - 1)(x_{i+1} - x_i)\, y_i''}{6} + \frac{(3B^2 - 1)(x_{i+1} - x_i)\, y_{i+1}''}{6}$$

$$y'' = A y_i'' + B y_{i+1}''$$

My purpose for writing all this is to illustrate the interpolation property of a cubic spline. Similarly to the above, within the interval between $x_i$ & $x_{i+1}$, A = 1 & B = 0 at $x = x_i$ so that $y'' = y_i''$, and A = 0 & B = 1 at $x = x_{i+1}$ so that $y'' = y_{i+1}''$. Thus, $y_i''$, $y_{i+1}''$, $y_{i+2}''$... are the 2nd derivatives at the interpolation points. Because C = D = 0 wherever A or B equals 1, $y = y_i$ at $x = x_i$ and $y = y_{i+1}$ at $x = x_{i+1}$ also. Thus the values at the interpolation points determine both y (the function) and its 2nd derivative (curvature) there. So the curves pass thru the interpolation points and their curvatures match there, providing the smooth curve thru them sought.
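
The interpolation property can be checked numerically with a minimal Python sketch of the formulas for y, y' and y'' on a single interval. It assumes the 2nd derivatives at both ends of the interval are already known. The function name `spline_segment`, its argument names and the numbers used are hypothetical, chosen only for illustration.

```python
def spline_segment(x, x0, x1, y0, y1, ypp0, ypp1):
    """Evaluate y, y' and y'' at x within the interval [x0, x1].

    y0, y1     : function values at x0 and x1
    ypp0, ypp1 : 2nd derivatives at x0 and x1 (assumed already known)
    """
    h = x1 - x0
    a = (x1 - x) / h                 # A in the text
    b = (x - x0) / h                 # B in the text
    c = (a**3 - a) * h**2 / 6        # C in the text
    d = (b**3 - b) * h**2 / 6        # D in the text
    y = a * y0 + b * y1 + c * ypp0 + d * ypp1
    dy = ((y1 - y0) / h
          - (3 * a**2 - 1) * h * ypp0 / 6
          + (3 * b**2 - 1) * h * ypp1 / 6)
    d2y = a * ypp0 + b * ypp1
    return y, dy, d2y


# Made-up interval: at each end the function value and 2nd derivative
# are reproduced, whatever the values at the other end are.
left = spline_segment(0.0, 0.0, 2.0, 1.0, 5.0, 0.5, -1.0)
right = spline_segment(2.0, 0.0, 2.0, 1.0, 5.0, 0.5, -1.0)
print(left[0], left[2])    # 1.0 0.5
print(right[0], right[2])  # 5.0 -1.0
```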

For climate normals, the values $x_i$, $x_{i+1}$, $x_{i+2}$... represent months along the abscissa, and $y_i$, $y_{i+1}$, $y_{i+2}$... are mean monthly values (of a weather parameter such as minimum temperature) along the ordinate. For the curve to be smooth, its 1st derivative must also be continuous at the interpolation points. Evaluating the equation for the 1st derivative ($y'$) at $x = x_i$ for the intervals $(x_{i-1}, x_i)$ & $(x_i, x_{i+1})$ and equating these yields the following equation for $y_{i-1}''$, $y_i''$, & $y_{i+1}''$ :

$$\frac{(x_i - x_{i-1})\, y_{i-1}''}{6} + \frac{(x_{i+1} - x_{i-1})\, y_i''}{3} + \frac{(x_{i+1} - x_i)\, y_{i+1}''}{6} = \frac{y_{i+1} - y_i}{x_{i+1} - x_i} - \frac{y_i - y_{i-1}}{x_i - x_{i-1}}$$

Considering N interpolation points, this provides a system of N-2 equations for the N unknown values of $y''$. 2 more conditions are needed for a solution. These are typically boundary conditions, the most common being the natural boundary conditions $y_1'' = y_N'' = 0$. Using those, the above system is typically solved as a (tridiagonal) matrix equation for the $y''$ values. Then these values can be inserted into the equation for $y$ to calculate its value at any point $x$ (between each pair of $x_i$ & $x_{i+1}$). I omit these gruesome details here.
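
For readers who do want the details, the solution can be sketched in plain Python as follows. This is a sketch under stated assumptions, not the official procedure: it uses natural boundary conditions, solves the tridiagonal system by the standard Thomas algorithm (forward elimination, then back substitution), and the function names `natural_spline_second_derivatives` and `spline_value` are hypothetical.

```python
def natural_spline_second_derivatives(x, y):
    """Solve the N-2 interior equations for y'' with y''[0] = y''[-1] = 0."""
    n = len(x)
    ypp = [0.0] * n
    if n < 3:
        return ypp
    lower, diag, upper, rhs = [], [], [], []
    for i in range(1, n - 1):
        lower.append((x[i] - x[i - 1]) / 6)
        diag.append((x[i + 1] - x[i - 1]) / 3)
        upper.append((x[i + 1] - x[i]) / 6)
        rhs.append((y[i + 1] - y[i]) / (x[i + 1] - x[i])
                   - (y[i] - y[i - 1]) / (x[i] - x[i - 1]))
    m = n - 2
    # Forward elimination; the end terms vanish because y'' is 0 there.
    for k in range(1, m):
        w = lower[k] / diag[k - 1]
        diag[k] -= w * upper[k - 1]
        rhs[k] -= w * rhs[k - 1]
    # Back substitution.
    sol = [0.0] * m
    sol[-1] = rhs[-1] / diag[-1]
    for k in range(m - 2, -1, -1):
        sol[k] = (rhs[k] - upper[k] * sol[k + 1]) / diag[k]
    ypp[1:n - 1] = sol
    return ypp


def spline_value(x, y, ypp, t):
    """Evaluate the spline at t (x must be increasing, x[0] <= t <= x[-1])."""
    i = 0
    while i < len(x) - 2 and t > x[i + 1]:
        i += 1
    h = x[i + 1] - x[i]
    a = (x[i + 1] - t) / h
    b = (t - x[i]) / h
    return (a * y[i] + b * y[i + 1]
            + ((a**3 - a) * ypp[i] + (b**3 - b) * ypp[i + 1]) * h**2 / 6)


# Made-up points: three values, so only one interior equation is solved.
xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.0]
second = natural_spline_second_derivatives(xs, ys)
print(second, spline_value(xs, ys, second, 0.5))
```

For equally spaced months the same code applies unchanged. Only the x values differ if day-of-year midpoints are used instead.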

Calculation of Daily Normals

If you have followed the discussion this far, you probably realize that the cubic splines will generally vary smoothly, except perhaps near the endpoints where boundary conditions are arbitrarily chosen. As mentioned in the links above, the official solution to this problem calculates the cubic splines for 24 monthly values, repeating the months of July-December before a year of data, and the months of January-June after it. I did this for a 36-year period, reasons for which are explained in the next article :


Then the cubic splines for the central 12-month period should be a smoothly varying curve with very similar values at the beginning and end of a year, as illustrated.
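
The 24-value series described above can be built with a one-line slice in Python. This is only a sketch: the function name `pad_year` is hypothetical, and it assumes the same 12 monthly normals are reused for the repeated months, as the description suggests.

```python
def pad_year(monthly):
    """Return the 24 values Jul-Dec, Jan-Dec, Jan-Jun from 12 monthly normals."""
    if len(monthly) != 12:
        raise ValueError("expected 12 monthly values")
    return monthly[6:] + monthly + monthly[:6]


# Month numbers stand in for the monthly normals here.
padded = pad_year(list(range(1, 13)))
print(padded[:8])  # [7, 8, 9, 10, 11, 12, 1, 2]
```

The spline is then fit thru all 24 points, but only its central 12 months (positions 6 to 17 of the padded list) are used.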

After this is done, the problem is then mapping the monthly (x,y) values to daily values. Considering the number of days during each month, counting February 29 as ¼ day :

 Month   Days of year   Midpoint
  JAN        0-31        15.5
  FEB      31-59.25      45.125
  MAR    59.25-90.25     74.75
  APR    90.25-120.25   105.25
  MAY   120.25-151.25   135.75
  JUN   151.25-181.25   166.25
  JUL   181.25-212.25   196.75
  AUG   212.25-243.25   227.75
  SEP   243.25-273.25   258.25
  OCT   273.25-304.25   288.75
  NOV   304.25-334.25   319.25
  DEC   334.25-365.25   349.75

the formulas for such mapping can become rather complicated. For example, the mean value for January corresponds with noon January 16 (day 15.5), but the mean for February with .125 of a day (3 hours) after the midnight which begins February 15 (day 45.125). The period between the means for July & August can easily be split as 31 equal periods between noons of the 16th of each month, because both months have 31 days. Showing all the gruesome details here would be too long, but doing this provides a first set of smooth daily normals (for a 36-year period here) :


These are cubic splines interpolating the monthly normals. A problem though is that the average of these daily normals during a month or year no longer equals the monthly or annual normal (except by chance). So they are then adjusted so that the daily normals in the tables you see do equal the monthly (and thus annual) normals when averaged. This is accomplished using both a modification of the cubic spline and manual editing where necessary. Thus the final daily normals are modified cubic spline interpolations of the monthly normals.
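
Going back to the mapping step, the day-of-year boundaries and midpoints tabulated earlier can be reproduced with a short Python sketch. It assumes, as in the table, that February counts as 28.25 days. The names `DAYS` and `month_midpoints` are hypothetical. These midpoints are the x values at which the monthly normals would be placed before interpolating to daily values.

```python
MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]
# February 29 is counted as 1/4 day, as in the table.
DAYS = [31, 28.25, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def month_midpoints(days=DAYS):
    """Return (start, end, midpoint) in days of year for each month."""
    rows = []
    start = 0.0
    for length in days:
        end = start + length
        rows.append((start, end, (start + end) / 2))
        start = end
    return rows


rows = month_midpoints()
for name, (start, end, mid) in zip(MONTHS, rows):
    print(f"{name}  {start:g}-{end:g}  {mid:g}")
```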

Disclaimer

I am not sure if the official method uses natural boundary conditions or the type of daily mapping I describe above, but the basic method is as described above. Some adjustments are also made if a station's location or surroundings change significantly, if some data are missing, etc., which are mentioned at the NCDC site. Special types of calculations are made for probabilities of precipitation and frost & freeze dates, and for variances of parameters. My main purpose here though is a discussion of the basic temperature and precipitation normals most often shown.

Examples

Now that the methods of calculating normals have been discussed, I provide some examples in the next article, and discuss their representativeness.

* August 1960 data for DTW - Detroit Metro airport (in Romulus, MI) used because that for 1996 was missing.


Text and embedded images are copyright of Joseph Bartlo, though may be used with proper crediting.
