Public Health Biostatistics Analysis Coursework


Paired t test

This section tests whether the quality-of-life (QoL) measure improved for ten patients drawn at random from participants in a trial of a minimally invasive procedure to treat benign prostatic hyperplasia (BPH). BPH is a non-cancerous enlargement of the prostate gland that affects millions of elderly men. The two variables in question are qol_base (quality of life at baseline, before treatment) and qol_3mo (quality of life after three months of treatment, hypothesized to show an improvement).

Table 1.

Paired Samples Statistics
| Pair 1 | Mean | N | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| qol_3mo | 2.10 | 10 | 1.287 | .407 |
| qol_base | 3.80 | 10 | 1.135 | .359 |

Table 1, from the output of the paired-samples t test, shows first of all that mean QoL had “fallen” from 3.80 at baseline to 2.10 after three months. This is a positive finding because QoL was a self-rated scale on which lower scores are better: 0 = Delighted, 1 = Pleased, 2 = Mostly satisfied, 3 = Mixed, 4 = Mostly dissatisfied, 5 = Unhappy, and 6 = Terrible.

Table 2.

Paired Samples Test
| Pair 1 | Paired Differences: Mean | Std. Deviation | Std. Error Mean | 95% CI of the Difference: Lower | Upper | t | df | Sig. (2-tailed) |
|---|---|---|---|---|---|---|---|---|
| qol_3mo – qol_base | -1.7 | 1.494 | 0.473 | -2.769 | -0.631 | -3.597 | 9 | 0.006 |

The derived t value of -3.597 in Table 2 (the negative sign simply reflects the order in which the two variables were entered in SPSS) is large enough in magnitude to yield a significance statistic of p = 0.006. In other words, if the treatment made no difference, a mean change at least this large would be expected in only about 6 of every 1,000 samples of this size. Since this p value falls well below the criterion of α = 0.05, one concludes that BPH patients who received the treatment did experience a subjective improvement.
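The t statistic and confidence interval in Table 2 can be re-derived by hand from the summary figures alone. The following is a minimal sketch, not the SPSS procedure itself; the critical value 2.262 (two-sided 95%, 9 degrees of freedom) is an assumption taken from a standard t table rather than computed.

```python
import math

# Summary values copied from Table 2 (paired differences, qol_3mo - qol_base).
mean_diff = -1.7
sd_diff = 1.494
n = 10

# Assumption: two-sided 95% critical value of Student's t with n - 1 = 9 df,
# read from a standard t table.
T_CRIT_9DF = 2.262

se = sd_diff / math.sqrt(n)   # standard error of the mean difference
t_stat = mean_diff / se       # paired t statistic against a null of 0
ci_low = mean_diff - T_CRIT_9DF * se
ci_high = mean_diff + T_CRIT_9DF * se

print(f"SE = {se:.3f}, t = {t_stat:.3f}, 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

Because the inputs are rounded to three decimals, the recomputed t may differ from the SPSS value in the last digit.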

One-sample t test

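Testing the variable delta (delt_qol), the per-patient change in QoL between baseline and three months, we apply 0 as the test value because zero is what the mean change would be if the treatment made no difference; the working hypothesis is that the treatment engenders a perceptible improvement in quality-of-life ratings. The positive mean in Table 3 indicates that the change was scored as qol_base minus qol_3mo, so that positive values denote improvement.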

Table 3.

One-Sample Statistics
| Variable | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| delt_qol | 10 | 1.70 | 1.494 | .473 |

Table 4.

One-Sample Test
Test Value = 0
| Variable | t | df | Sig. (2-tailed) | Mean Difference | 95% CI of the Difference: Lower | Upper |
|---|---|---|---|---|---|---|
| delt_qol | 3.597 | 9 | .006 | 1.700 | .63 | 2.77 |

Table 3 shows first of all that the mean QoL delta is 1.70, suggesting that the 10 patients typically rated their quality of life nearly two scale levels better after three months on the treatment. Table 4 then shows that the derived t value of 3.597, for the mean delta against the benchmark of zero, bears a significance statistic of p = 0.006. Once again, a difference from zero this large would be expected in only about six of every thousand samples if random variation alone were at work. We conclude yet again that the treatment makes a difference for perceived quality of life.
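The one-sample test is simply the paired test carried out on the per-patient deltas. The sketch below illustrates this with a HYPOTHETICAL set of ten deltas: the raw trial data are not given in this paper, and the values were chosen only because they reproduce the mean and standard deviation reported in Table 3.

```python
import math
import statistics

# HYPOTHETICAL deltas (qol_base - qol_3mo); not the trial's raw data.
# They were picked to match the summary statistics in Table 3.
deltas = [-1, 1, 1, 0, 2, 2, 2, 3, 3, 4]
test_value = 0  # the mean delta expected if treatment makes no difference

n = len(deltas)
mean_delta = statistics.mean(deltas)
sd_delta = statistics.stdev(deltas)      # sample SD (n - 1 denominator)
se_delta = sd_delta / math.sqrt(n)
t_stat = (mean_delta - test_value) / se_delta

print(f"mean = {mean_delta:.2f}, SD = {sd_delta:.3f}, "
      f"SE = {se_delta:.3f}, t = {t_stat:.3f}, df = {n - 1}")
```

Reversing the sign of every delta reverses only the sign of t, which is why Tables 2 and 4 report the same magnitude.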

Are Distributions Normal?

Visual inspection of the histograms suggests that the QoL ratings at baseline roughly follow a normal distribution while, three months later, the distribution is decidedly skewed, with ratings piled up toward the “Pleased” end of the scale.

Figure 1: Histogram for QoL at Baseline.
Figure 2: Histogram for QoL at Three Months.

Non-parametric Test of Median

If you have a concern about the small sample size and perhaps non-normal data, choose an appropriate nonparametric test to compare the median QoL score at the baseline and at 3 months. (Hint: You need to research nonparametric tests.)

We employ the nonparametric Wilcoxon signed-rank test, the counterpart of the paired t test, because the two sets of ratings are not independent groups but the same patients answering the rating scale at two points in time.

Since we are testing for a leftward shift in QoL ratings at three months, and since the negative deltas outnumber the positive ones and carry the higher mean rank (Table 5), the p value reported in Table 6 may be halved for a one-tailed test, thus: p = 0.014/2 = 0.007. This is lower than the criterion α = 0.05.

Table 5.

Ranks
| qol_3mo – qol_base | N | Mean Rank | Sum of Ranks |
|---|---|---|---|
| Negative Ranks | 8 (a) | 5.38 | 43.00 |
| Positive Ranks | 1 (b) | 2.00 | 2.00 |
| Ties | 1 (c) | | |
| Total | 10 | | |

a. qol_3mo < qol_base
b. qol_3mo > qol_base
c. qol_3mo = qol_base

Table 6.

Test Statistics (b)
| | qol_3mo – qol_base |
|---|---|
| Z | -2.448 (a) |
| Asymp. Sig. (2-tailed) | .014 |

a. Based on positive ranks.
b. Wilcoxon Signed Ranks Test

Switching to non-parametric statistics lends credence to the same conclusion: BPH patients are subjectively better off after three months under the treatment regimen.
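The mechanics of the Wilcoxon signed-rank test named in Table 6 can be sketched in a few lines. The differences below are HYPOTHETICAL: the raw ratings are not reported here, so the values were chosen only to be consistent with the counts and rank sums in Table 5. The function name is illustrative, and the normal approximation with a tie correction and no continuity correction is an assumption about how the Z in Table 6 was obtained.

```python
import math

# HYPOTHETICAL differences (qol_3mo - qol_base); not the trial's raw data.
diffs = [1, -1, -1, 0, -2, -2, -2, -3, -3, -4]

def wilcoxon_signed_rank(diffs):
    """Return (W+, W-, Z, two-sided p) using the normal approximation."""
    nonzero = sorted((d for d in diffs if d != 0), key=abs)  # ties (zeros) dropped
    n = len(nonzero)
    ranks = [0.0] * n
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and abs(nonzero[j]) == abs(nonzero[i]):
            j += 1
        avg_rank = (i + 1 + j) / 2      # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg_rank
        t = j - i                        # size of this tie group
        tie_term += t ** 3 - t
        i = j
    w_pos = sum(r for d, r in zip(nonzero, ranks) if d > 0)
    w_neg = sum(r for d, r in zip(nonzero, ranks) if d < 0)
    mean_w = n * (n + 1) / 4
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term / 48
    z = (w_pos - mean_w) / math.sqrt(var_w)   # based on positive ranks
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return w_pos, w_neg, z, p_two_sided

w_pos, w_neg, z, p_two = wilcoxon_signed_rank(diffs)
print(f"W+ = {w_pos}, W- = {w_neg}, Z = {z:.3f}, "
      f"two-sided p = {p_two:.3f}, one-sided p = {p_two / 2:.3f}")
```

With only nine non-zero differences the normal approximation is rough; an exact signed-rank distribution would be the more careful choice for so small a sample.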

Independent Sample Design

Comparing Boxplots

Figure 3: Boxplot of Bone Loss by Case-Control Group.

Comparing Histograms

Figure 4: Comparing Histograms Across Case and Control Groups.

Analysis of Bone Loss Employing a t test

This discretionary test employed the scale variable “change in bone mineral density or content” (reported as percent bone loss) and the nominal variable breastfeeding status. The subjects of the study were women who breastfed their newborns and were examined at three months of lactation. The control group consisted of formula-feeding mothers and non-pregnant, non-lactating women. Breastfeeding status was used as the grouping variable in the independent-samples t test.

One therefore formulates the hypotheses to be tested as:

H0, the null hypothesis = Among adult women, breastfeeding has no effect on bone mineral density.

Ha, the alternative hypothesis = Breastfeeding has an effect on bone mineral density in adult women.

Table 7.

Group Statistics
| Percent bone loss (Ref = Laskey) | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| control | 22 | .309 | 1.2983 | .2768 |
| breast-feeding | 47 | -3.587 | 2.5056 | .3655 |

Table 8.

Independent Samples Test
| Percent bone loss | Levene’s Test: F | Levene’s Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 11.255 | 0.001 | 6.857 | 67 | 0.000 | 3.896 | 0.568 | 2.762 | 5.031 |
| Equal variances not assumed | | | 8.499 | 66.197 | 0.000 | 3.896 | 0.458 | 2.981 | 4.812 |

Mothers who breastfeed do experience bone mineral loss. Control women showed a gain averaging 0.309 percent on whatever calcium intake they followed (it is not the purpose of the current analysis to account for dietary and other modifying factors in the Laskey et al. study). The comparable figure for breastfeeding mothers was a frank bone mineral loss of 3.59 percent over the three-month run of the study (see Table 7). The mean difference in bone mineral change between the two groups is 3.90, and its 95% confidence interval, running from about 2.8 to 5.0 (or 3.0 to 4.8 when equal variances are not assumed), excludes zero (Table 8).

A rather high F value for Levene’s test (11.255, see Table 8) with p = 0.001, below 0.05, indicates that the variances of the two groups are not homogeneous, so one basic assumption of the pooled-variance t test does not hold. Hence, one scrutinizes the row for “Equal variances not assumed” (t = 8.499, df = 66.197). In either row, the derived t score is high enough for the given degrees of freedom to yield a significance statistic below even the more rigorous cut-off of p ≤ 0.01. Since a difference this large would arise by chance less than once in a thousand samples if breastfeeding had no effect, one rejects the null hypothesis in favor of the alternative and concludes that breastfeeding has a deleterious effect on bone mineral density.
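Both rows of Table 8 can be re-derived from the group summaries in Table 7. The following is a minimal sketch with an illustrative function name: the pooled-variance formula corresponds to “Equal variances assumed” and the Welch formula with Welch-Satterthwaite degrees of freedom to “Equal variances not assumed”. That SPSS uses exactly these formulas is an assumption, though the results agree with the table to rounding.

```python
import math

def two_sample_t(m1, s1, n1, m2, s2, n2):
    """Pooled and Welch t statistics from group means, SDs and sizes."""
    diff = m1 - m2
    # Pooled-variance (Student) version: assumes equal population variances.
    sp2 = ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)
    se_pooled = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    # Welch version: each group keeps its own variance.
    v1, v2 = s1 ** 2 / n1, s2 ** 2 / n2
    se_welch = math.sqrt(v1 + v2)
    df_welch = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return {
        "diff": diff,
        "t_pooled": diff / se_pooled, "df_pooled": n1 + n2 - 2,
        "t_welch": diff / se_welch, "df_welch": df_welch,
    }

# Group statistics from Table 7: control first, then breast-feeding.
res = two_sample_t(0.309, 1.2983, 22, -3.587, 2.5056, 47)
print(f"difference = {res['diff']:.3f}")
print(f"pooled: t = {res['t_pooled']:.3f}, df = {res['df_pooled']}")
print(f"Welch:  t = {res['t_welch']:.3f}, df = {res['df_welch']:.3f}")
```

The breast-feeding group's standard deviation is nearly twice the control group's, which is why the Welch degrees of freedom fall slightly below 67.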

95% C.I. for Mean Difference in Percentage of Bone Loss

The 95% confidence interval is shown below.

Table 9.

| 95% Confidence Interval of the Difference | Lower | Upper |
|---|---|---|
| Equal variances assumed | 2.762 | 5.031 |
| Equal variances not assumed | 2.981 | 4.812 |

Cross-Tabulation

Association Between Esophageal Cancer and Alcohol Consumption

This is a test of the relationship between alcohol consumption, measured in grams per day and recorded in four ordered bands, and a nominal variable with two levels (“no” and “yes”) for developing esophageal cancer. The latter was used as the grouping variable in the independent-samples t test.

The hypotheses may therefore be articulated as follows:

H0, the null hypothesis = there is no linear relationship at all between alcohol consumption levels and esophageal cancer morbidity. Elevated alcohol consumption is not a predictor of developing esophageal cancer.

Ha, the alternative hypothesis = Adults presenting with a history of elevated alcohol consumption are at greater risk for esophageal cancer.

An exploratory cross-tabulation (Table 10) suggests that those who have not developed esophageal cancer are much less likely to be moderate or heavy drinkers. Nearly half of the control group of 775 adults reported consuming 39 grams a day or less. Together with the next band, a cumulative 86% of controls had self-reported alcohol consumption of under 80 grams a day.

On the other hand, the 200 men who did develop esophageal cancer tended to cluster in the 40 to 119 grams daily range (63%). Even more revealing, their distribution skewed towards high consumption, with 22.5% reporting 120 grams or more a day against 2.8% of controls.

Table 10.

Alcohol consumption * Esophageal cancer Crosstabulation
| Alcohol consumption | | Esophageal cancer: case | control | Total |
|---|---|---|---|---|
| 0 – 39 gm/day | Count | 29 | 386 | 415 |
| | % within Esophageal cancer | 14.5% | 49.8% | 42.6% |
| 40 – 79 gm/day | Count | 75 | 280 | 355 |
| | % within Esophageal cancer | 37.5% | 36.1% | 36.4% |
| 80 – 119 gm/day | Count | 51 | 87 | 138 |
| | % within Esophageal cancer | 25.5% | 11.2% | 14.2% |
| 120+ gm/day | Count | 45 | 22 | 67 |
| | % within Esophageal cancer | 22.5% | 2.8% | 6.9% |
| Total | Count | 200 | 775 | 975 |
| | % within Esophageal cancer | 100.0% | 100.0% | 100.0% |

Both the Pearson chi-square and likelihood ratio tests (Table 11) bore significance statistics with p < 0.001, suggesting that the differences between the case and control groups cannot be accounted for by chance variation.

Table 11.

Chi-Square Tests
| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 159.0 | 3 | .000 |
| Likelihood Ratio | 146.498 | 3 | .000 |
| Linear-by-Linear Association | 152.974 | 1 | .000 |
| N of Valid Cases | 975 | | |

a. 0 cells (.0%) have expected count less than 5. The minimum expected count is 13.74.
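The Pearson chi-square in Table 11 can be recomputed directly from the counts in Table 10. This is a minimal sketch of the standard test of independence; the variable names are illustrative, and the p value is left to a chi-square table (the criterion value at 3 degrees of freedom and α = 0.05 is 7.81).

```python
# Observed counts from Table 10: (case, control) for each consumption band.
observed = {
    "0-39 gm/day": (29, 386),
    "40-79 gm/day": (75, 280),
    "80-119 gm/day": (51, 87),
    "120+ gm/day": (45, 22),
}

rows = list(observed.values())
col_totals = [sum(r[j] for r in rows) for j in range(2)]
grand_total = sum(col_totals)

chi2 = 0.0
min_expected = float("inf")
for r in rows:
    row_total = sum(r)
    for j, obs in enumerate(r):
        # Expected count under independence: row total x column total / N.
        expected = row_total * col_totals[j] / grand_total
        min_expected = min(min_expected, expected)
        chi2 += (obs - expected) ** 2 / expected

df = (len(rows) - 1) * (2 - 1)
print(f"chi-square = {chi2:.2f}, df = {df}, minimum expected count = {min_expected:.2f}")
```

The largest single contribution comes from the 120+ gm/day cases, where 45 were observed against fewer than 14 expected.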

Table 12.

Group Statistics
| Alcohol consumption, by Esophageal cancer | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|
| case | 200 | 2.56 | .996 | .070 |
| control | 775 | 1.67 | .785 | .028 |

This is bolstered by the outcome of the independent-samples t test (Table 13), for which Table 12 reports group means on the four-band consumption code rather than in grams. Levene’s test is significant, so the “Equal variances not assumed” row applies, but under either assumption about homogeneity of variances the t value is so large (p < 0.001) that the difference in alcohol consumption between case and control groups is extremely unlikely to have occurred by chance alone.

Table 13.

Independent Samples Test
| Alcohol consumption | Levene’s Test: F | Levene’s Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | 36.596 | 0.000 | 13.464 | 973 | 0.000 | 0.889 | 0.066 | 0.759 | 1.019 |
| Equal variances not assumed | | | 11.722 | 266.224 | 0.000 | 0.889 | 0.076 | 0.740 | 1.038 |

Accordingly, one rejects the null hypothesis and concludes that esophageal cancer patients report markedly higher alcohol consumption than controls.

Applicability of Fisher’s Exact Test

Fisher’s exact test of independence, here between case-control status and drinking, applies to 2 × 2 contingency tables. This means Fisher’s can be used only if alcohol consumption is recoded as a dichotomous variable. One would have to go back to the raw data to reclassify all subjects as either “non-drinker” or “drinker.” Alternatively, a better real-world representation might be “non-drinker or light drinker” versus “moderate or heavy drinker.”

In implementing Fisher’s exact test, the null hypothesis would be that the relative proportions of the dependent variable, having esophageal cancer, are independent of a second variable. That is, the proportion who drink (or indulge heavily) is equal among those who contract esophageal cancer and control adults.

Another consideration that weighs against Fisher’s exact test for this data set is that its advantage in accuracy over the chi-square or G tests of independence arises principally when sample sizes are small. In this case, however, the study covered 200 adults in varying stages of esophageal cancer and no fewer than 775 control subjects. Hence, Fisher’s is not needed in this situation.
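For readers who want to see what the test involves, Fisher’s exact test for a 2 × 2 table can be sketched with hypergeometric probabilities. The function name is illustrative, and the two-sided rule used here (summing every table no more probable than the observed one) is one common convention, assumed rather than taken from this paper. The counts are those of the dichotomized cross-tabulation (Table 14).

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p value for the table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more probable than the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Small textbook-style table, where the exact test is most useful.
p_small = fisher_exact_two_sided(3, 1, 1, 3)

# Table 14 counts: cases (96 heavy, 104 light), controls (109 heavy, 666 light).
p_table14 = fisher_exact_two_sided(96, 104, 109, 666)

print(f"small table: p = {p_small:.4f}")
print(f"Table 14:    p = {p_table14:.3g}")
```

With 975 subjects the exact and chi-square approaches lead to the same decision, which is the practical point made above.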

Association Between Esophageal Cancer and Alcohol Consumption as Nominal Variable

In fact, the database does have a dichotomized version of the alcohol consumption variable, employing 80 grams or more daily as the cut-off for heavy drinking. The chi-square test is available for testing differences in proportions between categorical variables.

We now test for differences in proportions of those who drink relatively little (if at all) and the heavier drinkers between adults who presented with esophageal cancer or did not.

The hypotheses may be articulated as follows:

H0, the null hypothesis = there is no association between drinking alcohol and esophageal cancer morbidity; put another way, there is no difference in esophageal cancer risk between adults who drink heavily or not.

Ha, the alternative hypothesis = Drinking rate and esophageal cancer are related; esophageal cancer risk is greater for adults who are regular or heavy drinkers.

Table 14.

Esophageal cancer * Alcohol dichotomized Crosstabulation
| Esophageal cancer | | Alcohol dichotomized: 80+ gms/day | 0-79 gms/day | Total |
|---|---|---|---|---|
| case | Count | 96 | 104 | 200 |
| | % within Alcohol dichotomized | 46.8% | 13.5% | 20.5% |
| control | Count | 109 | 666 | 775 |
| | % within Alcohol dichotomized | 53.2% | 86.5% | 79.5% |
| Total | Count | 205 | 770 | 975 |
| | % within Alcohol dichotomized | 100.0% | 100.0% | 100.0% |

Cross-tabulation reveals that the likelihood of presenting with esophageal cancer is greater among regular or heavy drinkers. Just 13.5% (about one in seven) of the light or non-drinkers in the sample were cancer cases. Among regular or heavy drinkers, the share rose to just under half (46.8%).

Table 15.

Alcohol dichotomized
| | Observed N | Expected N | Residual |
|---|---|---|---|
| 80+ gms/day | 205 | 487.5 | -282.5 |
| 0-79 gms/day | 770 | 487.5 | 282.5 |
| Total | 975 | | |

Table 16.

Esophageal cancer
| | Observed N | Expected N | Residual |
|---|---|---|---|
| case | 200 | 487.5 | -287.5 |
| control | 775 | 487.5 | 287.5 |
| Total | 975 | | |

If the proportions of dichotomized drinking rates are representative of the universe of adults, then around eight in ten are light or non-drinkers.

Table 17.

Test Statistics
| | Alcohol dichotomized | Esophageal cancer |
|---|---|---|
| Chi-Square | 327.410 (a) | 339.103 (a) |
| df | 1 | 1 |
| Asymp. Sig. | .000 | .000 |

a. 0 cells (.0%) have expected frequencies less than 5. The minimum expected cell frequency is 487.5.
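The two statistics in Table 17 can be reproduced from the observed counts in Tables 15 and 16. The sketch below assumes, as the expected counts of 487.5 indicate, that each variable was tested on its own against equal expected frequencies; the function name is illustrative.

```python
def chi_square_equal_frequencies(observed):
    """One-sample chi-square against equal expected counts per category."""
    expected = sum(observed) / len(observed)
    return sum((o - expected) ** 2 / expected for o in observed)

# Observed counts from Table 15 (80+ gms/day, 0-79 gms/day).
chi2_alcohol = chi_square_equal_frequencies([205, 770])
# Observed counts from Table 16 (case, control).
chi2_cancer = chi_square_equal_frequencies([200, 775])

print(f"Alcohol dichotomized: chi-square = {chi2_alcohol:.3f}")
print(f"Esophageal cancer:    chi-square = {chi2_cancer:.3f}")
```

Each statistic looks at one variable in isolation, so neither draws on the joint counts of the cross-tabulation.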

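The chi-square statistics of 327.4 and 339.1 in Table 17, for regular alcohol consumption and esophageal cancer morbidity respectively, both exceed the criterion chi-square value of 3.84 at 1 degree of freedom and α = 0.05, with p < 0.001. These, however, are one-sample tests against equal expected frequencies (487.5 per category; Tables 15 and 16), so they establish only that neither variable splits evenly in the sample. The association itself rests on the cross-tabulation in Table 14, where 46.8% of the heavier drinkers but only 13.5% of the light or non-drinkers were cases, and it is consistent with the significant chi-square result for the four-band version of the variable (Table 11). On that basis we reject the null hypothesis, accept the alternative hypothesis and conclude that adults who drink regularly or heavily bear a higher risk of manifesting esophageal cancer sooner or later.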
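A chi-square test of independence on the 2 × 2 counts in Table 14 addresses the stated hypotheses directly. The following is a minimal sketch using the shortcut formula for 2 × 2 tables, without a continuity correction (an assumption; SPSS also reports a corrected value); the cell labels a to d are illustrative.

```python
# Counts from Table 14.
a, b = 96, 104    # cases:    80+ gms/day, 0-79 gms/day
c, d = 109, 666   # controls: 80+ gms/day, 0-79 gms/day
n = a + b + c + d

# Shortcut form of the Pearson chi-square for a 2 x 2 table:
# n (ad - bc)^2 / (product of the four marginal totals).
chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

CRITICAL_1DF_05 = 3.84  # criterion value at 1 df and alpha = 0.05
print(f"chi-square = {chi2:.2f} (criterion {CRITICAL_1DF_05})")
print("reject H0" if chi2 > CRITICAL_1DF_05 else "fail to reject H0")
```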

Manual Calculation of 95% C.I. of Odds Ratio Estimate

The odds ratio can be defined as the ratio of the odds of an event occurring in one group to the odds of it occurring in another group, or to a sample-based estimate of that ratio. These groups might be men and women, an experimental group and a control group, or any other dichotomous classification. If the probabilities of the event in each of the groups are p1 (first group) and p2 (second group), then the odds ratio is:

\[
\mathrm{OR} = \frac{p_1 / q_1}{p_2 / q_2} = \frac{p_1 q_2}{p_2 q_1}
\]

where q_x = 1 - p_x. An odds ratio of 1 indicates that the condition or event under study is equally likely in both groups. An odds ratio greater than 1 indicates that the condition or event is more likely in the first group. And an odds ratio less than 1 indicates that the condition or event is less likely in the first group. The odds ratio must be greater than or equal to zero if it is defined. It is undefined if p_2 q_1 equals zero.
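Applied to the dichotomized counts in Table 14, the definition gives a sample odds ratio, and a 95% confidence interval can be worked out by hand on the log scale. The sketch below uses the Woolf (logit) method, which is an assumption, since the method intended for the manual calculation is not stated; the variable names are illustrative.

```python
import math

# Counts from Table 14.
a, b = 96, 104    # heavy drinkers (80+ gms/day): cases, ... see next line
c, d = 109, 666   # a, b = cases (heavy, light); c, d = controls (heavy, light)

# Sample odds ratio: odds of heavy drinking among cases over the same odds
# among controls, which equals the cross-product ratio ad / bc.
odds_ratio = (a * d) / (b * c)

# Woolf method: standard error of ln(OR), then exponentiate the limits.
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
Z_95 = 1.96  # two-sided 95% normal critical value
ci_low = math.exp(math.log(odds_ratio) - Z_95 * se_log_or)
ci_high = math.exp(math.log(odds_ratio) + Z_95 * se_log_or)

print(f"OR = {odds_ratio:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
```

An interval that excludes 1 points the same way as the chi-square results: heavier drinking is more common among cases than among controls.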
