Since we most commonly talk about radioactive decay in terms of half-lives, we can write the equation for the amount of a radioisotope (A) as a function of time (t) as:

$$ A = A_0 \left(\frac{1}{2}\right)^{t/t_{1/2}} $$

where A_0 is the initial amount of the radioisotope and t_{1/2} is its half-life.

To reverse this equation, to find the age of a sample (time), we would have to solve for t.

Take the log of each side (use base 2 because of the half-life):

$$ \log_2 A = \log_2 \left[ A_0 \left(\frac{1}{2}\right)^{t/t_{1/2}} \right] $$

Use the rules of logarithms to simplify:

$$ \log_2 A = \log_2 A_0 + \frac{t}{t_{1/2}} \log_2 \left(\frac{1}{2}\right) $$

$$ \log_2 A = \log_2 A_0 - \frac{t}{t_{1/2}} $$

Now rearrange and solve for t:

$$ \frac{t}{t_{1/2}} = \log_2 A_0 - \log_2 A $$

So we end up with the equation for time (t):

$$ t = t_{1/2} \left( \log_2 A_0 - \log_2 A \right) $$
Now, because this last equation is linear (a graph of t against log_2 A is a straight line with a slope of -t_{1/2}), if we’re careful, we can use it to determine the half-life of a radioisotope. As an assignment, find the half-life for the decay of the radioisotope given below.
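To show the idea, here is a minimal sketch in Python using made-up measurements (not the data from the assignment): the amounts come from a radioisotope with a known half-life of 10 days, and a straight-line fit of log2(A) against t recovers it from the slope.

```python
import math

# Hypothetical sample data (NOT the assignment's data): amounts of a
# radioisotope with a true half-life of 10 days, measured every 5 days.
true_half_life = 10.0
times = [0, 5, 10, 15, 20, 25]
amounts = [100.0 * 0.5 ** (t / true_half_life) for t in times]

# From the derivation above, log2(A) = log2(A0) - t / t_half, so a
# straight-line fit of log2(A) against t has slope -1/t_half.
logs = [math.log2(a) for a in amounts]
n = len(times)
mean_t = sum(times) / n
mean_y = sum(logs) / n
slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs)) \
        / sum((t - mean_t) ** 2 for t in times)
half_life = -1.0 / slope
print(round(half_life, 6))  # recovers the 10-day half-life
```

With real (noisy) measurements the fit would only approximate the half-life, which is why the assignment says to be careful.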
Based on my students’ statistics projects, I automated the method (using R) of calculating the z-score for all the states in the U.S. We used the Johns Hopkins daily data.
The R script (test.R) assumes all of the data is in a folder (COVID-19-master/csse_covid_19_data/csse_covid_19_daily_reports_us/), and outputs the graphs to the folder 'images/zscore/', which needs to exist.
For a statistics project, I took raw COVID data from Johns Hopkins University on May 20, 2020. With the data, I found the general statistics and then looked at how cases are going up in Missouri every month.
| State | Confirmed | Deaths | Population | CasesPerCapita |
| --- | --- | --- | --- | --- |
| Alabama | 13052 | 522 | 4779736 | 2.73069475 |
| Alaska | 401 | 10 | 710231 | 0.564605037 |
| Arizona | 14906 | 747 | 6392017 | 2.33197127 |
| Arkansas | 5003 | 107 | 2915918 | 1.715754695 |
| California | 85997 | 3497 | 37253956 | 2.30839914 |
| Colorado | 22797 | 1299 | 5029196 | 4.532931308 |
| Connecticut | 39017 | 3529 | 3574097 | 10.91660355 |
| Delaware | 8194 | 310 | 897934 | 9.125392289 |
| District of Columbia | 7551 | 407 | 705749 | 10.69927127 |
| Florida | 47471 | 2096 | 18801310 | 2.524877256 |
| Georgia | 39801 | 1697 | 9687653 | 4.108425436 |
| Hawaii | 643 | 17 | 1360301 | 0.4726895003 |
| Idaho | 2506 | 77 | 1567582 | 1.598640454 |
| Illinois | 100418 | 4525 | 12830632 | 7.826426633 |
| Indiana | 29274 | 1864 | 6483802 | 4.514943547 |
| Iowa | 15620 | 393 | 3046355 | 5.127439186 |
| Kansas | 8507 | 202 | 2853118 | 2.981650251 |
| Kentucky | 8167 | 376 | 4339367 | 1.88207174 |
| Louisiana | 35316 | 2608 | 4533372 | 7.790227672 |
| Maine | 1819 | 73 | 1328361 | 1.369356673 |
| Maryland | 42323 | 2123 | 5773552 | 7.330496027 |
| Massachusetts | 88970 | 6066 | 6547629 | 13.5881248 |
| Michigan | 53009 | 5060 | 9883640 | 5.363307445 |
| Minnesota | 17670 | 786 | 5303925 | 3.331495072 |
| Mississippi | 11967 | 570 | 2967297 | 4.032963333 |
| Missouri | 11528 | 640 | 5988927 | 1.92488571 |
| Montana | 478 | 16 | 989415 | 0.4831137591 |
| Nebraska | 11122 | 138 | 1826341 | 6.089771844 |
| Nevada | 7388 | 377 | 2700551 | 2.735738003 |
| New Hampshire | 3868 | 190 | 1316470 | 2.938160383 |
| New Jersey | 150776 | 10749 | 8791894 | 17.14943333 |
| New Mexico | 6317 | 283 | 2059179 | 3.067727478 |
| New York | 354370 | 28636 | 19378102 | 18.28713669 |
| North Carolina | 20262 | 726 | 9535483 | 2.124905471 |
| North Dakota | 2095 | 49 | 672591 | 3.114820151 |
| Ohio | 29436 | 1781 | 11536504 | 2.551552879 |
| Oklahoma | 5532 | 299 | 3751351 | 1.474668726 |
| Oregon | 3801 | 144 | 3831074 | 0.992149982 |
| Pennsylvania | 68126 | 4770 | 12702379 | 5.36324731 |
| Rhode Island | 13356 | 538 | 1052567 | 12.68897847 |
| South Carolina | 9175 | 407 | 4625364 | 1.983627667 |
| South Dakota | 4177 | 46 | 814180 | 5.130315164 |
| Tennessee | 18412 | 305 | 6346105 | 2.90130718 |
| Texas | 51673 | 1426 | 25145561 | 2.054955147 |
| Utah | 7710 | 90 | 2763885 | 2.789551664 |
| Vermont | 944 | 54 | 625741 | 1.50861139 |
| Virginia | 32908 | 1075 | 8001024 | 4.112973539 |
| Washington | 18971 | 1037 | 6724540 | 2.821159514 |
| West Virginia | 1567 | 69 | 1852994 | 0.8456584317 |
| Wisconsin | 13413 | 481 | 5686986 | 2.35854282 |
| Wyoming | 787 | 11 | 563626 | 1.396315997 |
The table above is the raw data I extracted, but I added the population of each state and then calculated the cases per capita (per 1,000 people) by dividing the confirmed cases by the population and multiplying by 1,000. This lets you compare the states on an equal footing.
After getting the raw data, I did the statistical analysis on the confirmed cases and the cases per capita.
Confirmed Cases

| Statistic | Value |
| --- | --- |
| Min. | 401 |
| Q1 | 5268 |
| Median | 13052 |
| Q3 | 34112 |
| Max | 354370 |
| Mean | 30364 |
| Inter-Q | 28844 |
| Standard Dev | 5513.53 |
| Missouri | 11528 |
| Missouri Z | -3.416323118 |
The data above is the analysis of the confirmed cases. The analysis is for all 50 states.
Confirmed Cases per Capita

| Statistic | Value |
| --- | --- |
| Min. | 0.4727 |
| Q1 | 1.9543 |
| Median | 2.9013 |
| Q3 | 5.2468 |
| Max | 18.2871 |
| Mean | 4.4639 |
| Inter-Q | 3.2925 |
| Standard Dev | 4.101132 |
| Missouri | 1.92488571 |
| Missouri Z | -0.6191008458 |
The data above is the analysis of the confirmed cases per capita. The analysis is for all 50 states.
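As a quick check, the Missouri z-score can be recomputed from the mean and standard deviation reported in the per-capita table above (a Python sketch here, though the original analysis was done in R):

```python
# Recompute the Missouri z-score from the per-capita summary statistics.
mean = 4.4639
std_dev = 4.101132
missouri = 1.92488571

# z-score: how many standard deviations the value is from the mean
z = (missouri - mean) / std_dev
print(round(z, 4))  # -0.6191, matching the table
```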
Missouri Predictions
After I did the analysis for all 50 states, I focused on the rise of cases in Missouri from April to September. Then I predicted the number of cases in the future if the rise in cases stays the same. More than likely the actual cases will be higher or lower than the predicted number. If the state implements safety precautions, the curve could flatten out. If the state does nothing and people keep taking it less and less seriously, then more than likely the curve will get steeper.
Above are the data and graphs I used to predict the cases at the beginning and the end of October. The two highlighted boxes are the predictions.
I predict there will be 130,278 cases in Missouri on the first of October. On the 21st I predict there will be 166,268 cases.
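The two predictions imply a straight-line (constant daily rise) model. This sketch back-calculates the implied daily rate from the two numbers quoted above; the original fit itself is not reproduced here.

```python
# Daily rate implied by the two predictions quoted in the text
# (a linear, constant-rise model; not the original spreadsheet fit).
oct_1, oct_21 = 130278, 166268

rate_per_day = (oct_21 - oct_1) / 20  # 20 days between Oct 1 and Oct 21

def predicted_cases(days_after_oct_1):
    # Linear extrapolation from the October 1 prediction
    return oct_1 + rate_per_day * days_after_oct_1

print(rate_per_day)         # implied cases per day
print(predicted_cases(20))  # reproduces the October 21 prediction
```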
Let’s take a look at the summary statistics for the number of confirmed cases, which is in the column labeled “Confirmed”:
> summary(mydata$Confirmed)
Min. 1st Qu. Median Mean 3rd Qu. Max.
317 1964 4499 15347 13302 253060
This shows that the mean is 15,347 and the maximum is 253,060 confirmed cases.
I’m curious about which state has that large number of cases, so I’m going to print out the columns with the state names (“Province_State”) and the number of confirmed cases (“Confirmed”). From our colnames command above we can see that “Province_State” is column 1, and “Confirmed” is column 6, so we’ll use the command:
> mydata[ c(1,6) ]
The “c(1,6)” says that we want the columns 1 and 6. This command outputs
Province_State Confirmed
1 Alabama 5079
2 Alaska 321
3 American Samoa 0
4 Arizona 5068
5 Arkansas 1973
6 California 33686
7 Colorado 9730
8 Connecticut 19815
9 Delaware 2745
10 Diamond Princess 49
11 District of Columbia 2927
12 Florida 27059
13 Georgia 19407
14 Grand Princess 103
15 Guam 136
16 Hawaii 584
17 Idaho 1672
18 Illinois 31513
19 Indiana 11688
20 Iowa 3159
21 Kansas 2048
22 Kentucky 3050
23 Louisiana 24523
24 Maine 875
25 Maryland 13684
26 Massachusetts 38077
27 Michigan 32000
28 Minnesota 2470
29 Mississippi 4512
30 Missouri 5890
31 Montana 433
32 Nebraska 1648
33 Nevada 3830
34 New Hampshire 1447
35 New Jersey 88722
36 New Mexico 1971
37 New York 253060
38 North Carolina 6895
39 North Dakota 627
40 Northern Mariana Islands 14
41 Ohio 12919
42 Oklahoma 2680
43 Oregon 1957
44 Pennsylvania 33914
45 Puerto Rico 1252
46 Rhode Island 5090
47 South Carolina 4446
48 South Dakota 1685
49 Tennessee 7238
50 Texas 19751
51 Utah 3213
52 Vermont 816
53 Virgin Islands 53
54 Virginia 8990
55 Washington 12114
56 West Virginia 902
57 Wisconsin 4499
58 Wyoming 317
59 Recovered 0
Looking through, we can see that New York was the state with the largest number of cases.
Note that we could have searched for the row with the maximum number of Confirmed cases using the command:
> mydata[which.max(mydata$Confirmed),]
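For comparison, the same "find the row with the maximum" lookup in plain Python, using a small sample of the table above rather than the full dataset:

```python
# A small sample of (state, confirmed cases) pairs from the output above.
rows = [("Missouri", 5890), ("New York", 253060), ("Wyoming", 317)]

# max() with a key function plays the role of R's which.max()
worst = max(rows, key=lambda row: row[1])
print(worst)  # ('New York', 253060)
```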
Merging Datasets
In class, we’ve been editing the original data file to add a column with the state populations (called “Population”). I have this in a separate file called “state_populations.txt” (which is also a comma separated variable file, .csv, even if not so labeled). So I’m going to import the population data:
> pop <- read.csv("state_populations.txt")
Now I’ll merge the two datasets to add the population data to “mydata”.
> mydata <- merge(mydata, pop)
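The idea behind the merge, sketched in plain Python: join the case counts and the populations on the shared state name (small made-up fragments stand in for the two real files).

```python
# Two small fragments standing in for the case data and population file.
confirmed = {"Missouri": 5890, "Wyoming": 317}
population = {"Missouri": 5988927, "Wyoming": 563626}

# Keep only states present in both datasets, like R's merge() default
merged = {state: {"Confirmed": confirmed[state],
                  "Population": population[state]}
          for state in confirmed.keys() & population.keys()}
print(merged["Missouri"])  # {'Confirmed': 5890, 'Population': 5988927}
```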
Graphing (Histograms and Boxplots)
With the datasets together we can try doing a histogram of the confirmed cases. Note that there is a column labeled “Confirmed” in the mydata dataset, which we’ll address as “mydata$Confirmed”:
> hist(mydata$Confirmed)
Note that on April 20th, most states had very few cases, but there were a couple with a lot of cases. It would be nice to see the data that’s clumped in the 0-50000 range broken into more bins, so we’ll add an optional argument to the hist command. The option is called breaks and we’ll request 20 breaks.
> hist(mydata$Confirmed, breaks=20)
Calculations (cases per 1000 population)
Of course, simply looking at the number of cases is not very informative because you’d expect, all things being equal, that states with the highest populations would have the highest number of cases. So let’s calculate the number of cases per capita. We’ll multiply that number by 1000 to make it more human readable:
> mydata$ConfirmedPerCapita1000 <- mydata$Confirmed / mydata$Population * 1000
The dataset still has a long tail, but we can see the beginnings of a normal distribution.
The next thing we can do is make a boxplot of our cases per 1000 people. I’m going to set the range option to zero so that the plot has the long tails:
> boxplot(mydata$ConfirmedPerCapita1000, range=0)
The boxplot shows, more or less, the same information as the histogram.
Finding Specific Data in the Dataset
We’d like to figure out how Missouri is doing compared to the rest of the states, so we’ll calculate the z-score, which tells you how many standard deviations a value is away from the mean. While there is a built-in z-score function in R, we’ll first see how we can use the search and statistics methods to find the relevant information.
First, finding Missouri’s number of confirmed cases. To find all of the data in the row for Missouri we can use:
> mydata[mydata$Province_State == "Missouri",]
which gives something like this. It has all of the data but is not easy to read.
Province_State Population Country_Region Last_Update Lat
26 Missouri 5988927 US 2020-04-20 23:36:47 38.4561
Long_ Confirmed Deaths Recovered Active FIPS Incident_Rate People_Tested
26 -92.2884 5890 200 NA 5690 29 100.5213 56013
People_Hospitalized Mortality_Rate UID ISO3 Testing_Rate
26 873 3.395586 84000029 USA 955.942
Hospitalization_Rate ConfirmedPerCapita1000
26 14.82173 0.9834817
To extract just the “Confirmed” cases, we’ll add the column name to our command like so:
> mydata[mydata$Province_State == "Missouri",]$Confirmed
To follow up on the introduction to Logic Gates post, this assignment is intended to help students practice using functions and logic statements.
Write a set of functions that act as logic gates. That is, they take in one or two inputs and give a single true or false output based on the truth tables. Write functions for all 8 logic gates in the link. An example Python program with a function for an AND gate (the function is named myAND) is given in the glowscript link below.
Write a function that uses these gate functions to simulate a half-adder circuit. Create a truth table for its inputs and outputs.
Write a function that uses the gate functions to simulate a full-adder circuit. Create a truth table for its inputs and outputs.
Logic gates are the building blocks of computers. The gates in the figure above take one or two inputs (A and B) and give different results based on the type of gate. Note that the gates in the last row are just the opposites of the gates in the row above (NAND gives the opposite output to AND).
As an example, two gates, an AND and an XOR, can be used to make a half-adder circuit.
By feeding in the four different combinations of inputs for A and B ([0, 0], [1, 0], [0, 1], and [1, 1]) you can see how these two gates add the two numbers in binary.
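A sketch of what this might look like in Python (myAND follows the example function named in the glowscript link; the other function names are my own):

```python
# Logic gates as functions: each takes one or two True/False inputs and
# returns a single True/False output, following the truth tables.
def myAND(a, b):
    return a and b

def myOR(a, b):
    return a or b

def myNOT(a):
    return not a

def myXOR(a, b):
    # True when exactly one input is True
    return myAND(myOR(a, b), myNOT(myAND(a, b)))

# A half-adder is just an XOR (for the sum bit) and an AND (for the carry)
def half_adder(a, b):
    return myXOR(a, b), myAND(a, b)

# Print the truth table for the half-adder
print("A B | Sum Carry")
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(bool(a), bool(b))
        print(a, b, "|", int(s), int(c))
```

The printed table shows the binary addition: 1 + 1 gives a sum bit of 0 and a carry bit of 1.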
I find this to be an excellent introduction to how computers work and why they’re in binary.
Are the elements of larger atoms harder to melt than those of smaller atoms?
We can investigate this type of question if we assume that bigger atoms have more protons (larger atomic number), and compare the atomic number to the properties of the elements.
Your job is to use the data linked above to draw a graph showing the relationship between the atomic number of the element and the property you are assigned.
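As an example of the kind of pairing you would graph, here is a tiny hand-picked sample (melting points of the alkali metals, values looked up separately; this is not the linked dataset):

```python
# A hypothetical mini-dataset: atomic number vs. melting point (deg C)
# for the alkali metals. NOT the linked data table, just an illustration
# of pairing atomic number with a property for graphing.
elements = [("Li", 3, 180.5), ("Na", 11, 97.8), ("K", 19, 63.4),
            ("Rb", 37, 39.3), ("Cs", 55, 28.4)]

atomic_numbers = [z for name, z, mp in elements]
melting_points = [mp for name, z, mp in elements]

# Within this one group, melting point falls as atomic number rises,
# so here the larger atoms are actually easier to melt.
for z, mp in zip(atomic_numbers, melting_points):
    print(z, mp)
```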
Question 2.
What is the relationship between the number of valence electrons of the elements in the data table and the property you were assigned?
Bonus Question
Bonus 1: The atomic number can be used as a proxy for the size of the element because it gives the number of protons, but it’s not a perfect proxy. What is the relationship between the atomic number and the atomic mass of the elements?