Working with Climate Data

Monthly climatic data from the Eads Bridge, from 1893 to the 1960s. It's a comma-separated (.csv) file that can be imported into pretty much any spreadsheet program.

135045.csv

The last three columns are the mean (MMNT), minimum (MNMT), and maximum (MXMT) monthly temperatures, which are good candidates for analysis by pre-calculus students studying sinusoidal functions. For an extra challenge, students can also try analyzing the total monthly precipitation (TPCP), which is not nearly as clean a sinusoid as the temperature data.

Students should try to deconstruct the curve into component functions to see the annual cycles and any longer-term patterns. This type of work is also a precursor to the mathematics of Fourier analysis.
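For classes with a little programming, the same decomposition can be done outside a spreadsheet. Here's a minimal Python sketch, assuming the file has one row per month in chronological order and that the mean-temperature column really is named MMNT (check your download's header; the initial guesses also assume temperatures in degrees Fahrenheit):

    import numpy as np
    import pandas as pd
    from scipy.optimize import curve_fit

    df = pd.read_csv("135045.csv")
    t = np.arange(len(df))            # month index: 0, 1, 2, ...
    temp = df["MMNT"].astype(float)   # mean monthly temperature

    # Annual cycle: one full wave every 12 months.
    def annual(t, a, c, d):
        return a * np.sin(2 * np.pi * t / 12 + c) + d

    # p0 holds the initial guesses; adjust them to your units.
    (a, c, d), _ = curve_fit(annual, t, temp, p0=[20, 0, 55])

    # The residual holds whatever the annual fit doesn't explain,
    # including any longer-term pattern.
    residual = temp - annual(t, a, c, d)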

This data comes from the National Climatic Data Center (NCDC) website.

Influence Explorer: Data on Campaign Contributions by Politician and by Major Contributors

Influence Explorer is an excellent resource for accessing data about money in politics.

The website Influence Explorer has a lot of easily accessible data about the contributions of companies and prominent people to lawmakers. As a resource for civics research it's really nice, but the time series data also makes it a useful resource for math: algebra and pre-calculus in particular.

Analyzing the 20th Century Carbon Dioxide Rise: A pre-calculus assignment

Carbon dioxide concentration (ppm) measured at the Mauna Loa observatory in Hawaii shows exponential growth and a periodic annual variation.

The carbon dioxide concentration record from Mauna Loa in Hawaii is an excellent data set to work with in high-school mathematics classes for two key reasons.

The first has to do with the spark-the-imagination excitement that comes from being able to work with a live, real scientific record (updated every month) that is so easy to grab (from Scripps), and that is extremely relevant given all the issues we're dealing with regarding global climate change.

The second is that the data is very clearly the sum of two different types of functions. The exponential growth of CO2 concentration in the atmosphere over the last 60 years dominates, but does not swamp, the annual sinusoidal variability as local plants respond to the seasons.

Assignment

So here’s the assignment using the dataset (mona-loa-2012.xls or plain text mona-loa-2012.csv):

1. Identify the exponential growth function:

Add an exponential trendline in a spreadsheet program, or do the regression manually. If you use the manual regression (which I've found gives the best match), your equation should have the form:

 y = a b^{cx} + d

while the built-in exponential trendline will usually give something simpler like:

 y = a e^{bx}
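If you'd rather work outside a spreadsheet, here's a minimal Python sketch of the same regression using scipy; the column name "co2" is an assumption, so swap in whatever header the csv actually uses:

    import numpy as np
    import pandas as pd
    from scipy.optimize import curve_fit

    data = pd.read_csv("mona-loa-2012.csv")
    x = np.arange(len(data))           # months since the start of the record
    co2 = data["co2"].astype(float)    # ppm (assumed column name)

    # The four-parameter form from step 1: y = a*b^(cx) + d
    def expgrowth(x, a, b, c, d):
        return a * b**(c * x) + d

    # Exponential fits are sensitive to starting guesses; these
    # assume ppm-scale data with a baseline near 310 ppm.
    popt, _ = curve_fit(expgrowth, x, co2,
                        p0=[1.0, 2.0, 0.005, 310.0], maxfev=10000)
    print("a, b, c, d =", popt)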

2. Subtract the exponential function.

Put the exponential function (model) into your spreadsheet program and subtract it from the data set. The result should be just the annual sinusoidal function.

Dataset with the exponential curve subtracted.

If you look carefully you might also see what looks like a longer-wavelength periodicity overlain on top of the annual cycle. You can attempt to extract it as well if you wish.
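In the Python sketch from step 1, this subtraction is a single line:

    # Subtract the fitted exponential to leave just the annual cycle
    # (plus any longer-wavelength residual).
    detrended = co2 - expgrowth(x, *popt)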

3. Decipher the annual sinusoidal function

Try to match the stripped dataset with a sinusoidal function of the form:

 y = a \sin (bx+c) + d

A good place to start when finding the best-fit coefficients is recognizing that (a Python sketch follows the list):

  • a = amplitude;
  • b = angular frequency (2π divided by the period, which is 12 months for an annual cycle);
  • c = phase (to shift the curve left or right); and
  • d = vertical offset (this sets the baseline of the curve).
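Continuing the Python sketch, the same fit looks like this (the starting guesses are assumptions based on monthly data with a seasonal swing of a few ppm):

    def seasonal(x, a, b, c, d):
        return a * np.sin(b * x + c) + d

    # For monthly data an annual cycle repeats every 12 points,
    # so b should come out close to 2*pi/12.
    p0 = [3.0, 2 * np.pi / 12, 0.0, 0.0]
    (a, b, c, d), _ = curve_fit(seasonal, x, detrended, p0=p0)
    print("amplitude =", a, "period =", 2 * np.pi / b, "months")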

Wrap up

Now you have a model for carbon dioxide concentration, so you should be able to predict, for example, what the concentration will be for each month in the years 2020, 2050 and 2100 if the trends continue as they have for the last 60 years. This is the first step in predicting annual temperatures based on increasing CO2 concentrations.
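As a sketch of how that prediction falls out of the Python version, the two fitted models just get added back together; the only extra piece you need is which month index corresponds to which date (the Mauna Loa record begins in March 1958):

    # Trend plus seasonal cycle, evaluated at a future month index.
    def predict(month_index):
        return expgrowth(month_index, *popt) + seasonal(month_index, a, b, c, d)

    # Early 2050 is roughly (2050 - 1958) * 12 months after the
    # start of the record.
    print(predict((2050 - 1958) * 12))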

Inverse Relationships

Inverse relationships pop up everywhere. They're pretty common in physics (see Boyle's Law for example: P ∝ 1/V), but there you sort of expect them. You don't quite expect to see them in the number of views of my blog posts, as shown in the Popular Posts section of the column to the right.

Table 1: Views of the posts on the Montessori Muddle in the previous month as of October 16th, 2012.

Post                                                              Rank  Views
Plate Tectonics and the Earthquake in Japan                          1   3634
Global Atmospheric Circulation and Biomes                            2   1247
Equations of a Parabola: Standard to Vertex Form and Back Again      3    744
Cells, cells, cells                                                  4    721
Salt and Sugar Under the Microscope                                  5    686
Google Maps: Zooming in to the 5 themes of geography                 6    500
Market vs. Socialist Economy: A simulation game                      7    247
Human Evolution: A Family Tree                                       8    263
Osmosis under the microscope                                         9    219
Geography of data                                                   10    171

You can plot these data to show the relationship.

Views of the top 10 blog posts on the Montessori Muddle in the last month (as of 10/16/2012).

And if you think about it, it sort of makes sense that this relationship should be inverse. After all, as you get to lower-ranked (less visited) posts, the number of views should asymptotically approach zero.

Questions

So, given this data, can my pre-Calculus students find the equation for the best-fit inverse function? That way I could estimate how many hits my 20th or 100th ranked post gets per month.

Can my Calculus students use the function they come up with to estimate the total number of hits on all of my posts over the last month? Or even the top 20 most popular posts?
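For what it's worth, here's how that setup might look in Python, using the numbers from Table 1 and the simplest possible inverse model, views = a/rank (a more flexible form like a/rank^b would work the same way):

    import numpy as np
    from scipy.optimize import curve_fit

    rank = np.arange(1, 11)
    views = np.array([3634, 1247, 744, 721, 686, 500, 247, 263, 219, 171])

    # Simple inverse model: views = a / rank
    def inverse(r, a):
        return a / r

    (a,), _ = curve_fit(inverse, rank, views)
    print("predicted views for rank 20:", a / 20)

    # The calculus question: total views over the top N posts,
    # by summing the model (or integrating a/x as an approximation).
    N = 20
    print("top-20 total:", sum(a / r for r in range(1, N + 1)))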

Using Real Data, and Least Squares Regression, in pre-Calculus

The equation of our straight line model (red line) matches the data (blue diamonds) pretty well.

One of the first things that my pre-Calculus students need to learn is how to do a least squares regression to match any type of function to real datasets. So I’m teaching them the most general method possible using MS Excel’s iterative Solver, which is pretty easy to work with once you get the hang of it.

Log, reciprocal and square root functions can all be matched using least squares regression.

I’m teaching pre-Calculus using a graphical approach, and I want to emphasize that the main reason we study the different classes of functions — straight lines, polynomials, exponential curves, etc. — is how useful they are for modeling real data in all sorts of scientific and non-scientific applications.

So I’m starting each topic with some real data: either data they collect (e.g. bringing water to a boil) or data they can download (e.g. atmospheric CO2 from Mauna Loa). However, while it’s easy enough to pick two points, or draw a straight line by eye, and then determine its linear equation, it’s much trickier, if not impossible, when dealing with polynomials or transcendental functions like exponentials or square roots. Students need a technique they can use to match any type of function, and least squares regression is the most commonly used method for doing this. While calculators and spreadsheet programs like Excel use least squares regression to draw trendlines on their graphs, they can’t handle all the different types of functions we need to deal with.

The one issue that has come up is that not everyone has Excel and Solver. Neither OpenOffice nor Apple’s spreadsheet software (Numbers) has a good equivalent. However, if you have a good initial guess, based on a few datapoints, you can fit curves reasonably well by changing their coefficients in the spreadsheet by hand to minimize the error.
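If you want to see the same idea outside a spreadsheet entirely, here's a minimal Python sketch with made-up data, fitting a square-root function by minimizing the sum of squared errors, which is exactly the quantity Solver minimizes:

    import numpy as np
    from scipy.optimize import minimize

    # Made-up example data.
    x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y = np.array([2.1, 2.9, 3.4, 3.9, 4.3])

    # Sum of squared errors between the model y = a*sqrt(x) + b
    # and the data; this is the cell Solver would be told to minimize.
    def sse(params):
        a, b = params
        return np.sum((y - (a * np.sqrt(x) + b)) ** 2)

    result = minimize(sse, x0=[1.0, 1.0])   # x0 is the initial guess
    print("a, b =", result.x)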

I’m working on a post on how to do the linear regression with Excel and Solver. It should be up shortly.

Notes

If Solver is not available in the Tools menu, you may have to activate it because it’s an Add-In; Wikihow explains how.

Some versions of Excel for the Mac don’t have Solver built in, but you can download it from Frontline.

Snow and Ice Data

The National Snow and Ice Data Center has some interesting datasets available, including a number of measures of the extent of Arctic sea ice showing how fast it has been melting.

Current extent of Arctic Ice. Data from the National Snow and Ice Data Center.

The Easy-to-use Data Products page has a lot of real data that middle and high school students can use for projects.

Superfund Sites in Your Area – And Other Environmental Cleanups in Your Community

EPA's Cleanups in My Community map for St. Louis and its western suburbs.

Want to find your nearest Superfund site? The EPA has an interactive page called Cleanups in My Community that maps brownfields, hazardous waste, and Superfund sites anywhere in the U.S.

Note:

  • Brownfields are places, usually in cities, that can’t be easily re-developed because there’s some existing pollution on the site.
  • Superfund sites are places where there is hazardous pollution that the government is cleaning up because the companies that caused the pollution have gone out of business, or because the government caused the pollution in the first place. The military is probably the biggest source of government pollution, particularly from fuel leaks and radioactive waste.

EPA’s Enviromapper

Enviromapper via the EPA. Image links to the map for St. Albans, MO, but you can find information for anywhere in the U.S.

The EPA’s Enviromapper website is a great way to identify sources of hazardous materials and other types of pollution in your area, which might be a good way of stirring up student interest in the topic.

Not only can you map the broad categories of pollution – air, water, radiation, etc. – but you can also find specific information about the different types of pollution, or potential pollution, the EPA has information about. I found a nearby site with sulfuric acid, for example.

And, if you want to slog through a lot of closely written reports, you can find a lot more details about any site you come across. Some of this information might also be useful – who knows?