Custom Random Number Generator Essay


Table of Contents

Introduction
Mean, median, mode calculations
Graphical representations
Formulation of hypotheses
Confidence intervals
Conclusion
References

Introduction

A custom random number generator is a computer program that produces a random integer between two chosen integers, which are keyed in according to the desired interval (Glosser, 1998). The number is generated in a predictable manner and is therefore pseudo-random in nature (Pseudo-random numbers, n.d.). For this analysis, the generated random number is 3.
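The behaviour described above can be illustrated with a minimal sketch. It uses Python's standard `random` module rather than the generator cited in the essay; the function name `custom_random_integer`, the interval 1 to 10 and the seed value are assumptions chosen only for illustration.

```python
import random
from typing import Optional


def custom_random_integer(low: int, high: int, seed: Optional[int] = None) -> int:
    """Return a pseudo-random integer in the closed interval [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    rng = random.Random(seed)  # a seeded generator always repeats the same sequence
    return rng.randint(low, high)


# The same seed gives the same number, which is why the output is
# called pseudo-random: it is fully determined by the starting state.
first = custom_random_integer(1, 10, seed=2024)   # interval and seed are illustrative
second = custom_random_integer(1, 10, seed=2024)
print(first, second)
```

Running the function without a seed gives a different integer on each run, which is how a starting point such as 3 would normally be drawn.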

Systematic sampling was applied starting with the 3rd piece of data, and the selected subset is shown in Table 1 below.

Age of unsuccessful applicants (yrs)	Age of successful applicants (yrs)
37	34
42	39
44	42
45	35
49	38
54	45
56	39
34	48
39	41
43	33
45	36
46	38
53	42
54	35
57	38
37	41
41	47
44	48
45	39
48	33
37	36
55	38
37	39
37	44
44	35
44	45
49	39
49	48
36	51
41	33
30	36
29	38
39	39

Table 1: Systematically sampled subsets of data.

Systematic sampling involves choosing a starting point at random and thereafter selecting items at regular intervals determined from the generated random integer. The data set contains 100 applicants, and dividing 100 by the random integer 3 gives 33.3, so every third applicant is chosen, starting from the 3rd applicant. The subset generated therefore contains 33 applicants.
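The selection rule can be sketched in a few lines of Python. The original applicant records are not reproduced in the essay, so the list `applicant_ids` below is a placeholder for positions 1 to 100, and the helper name `systematic_sample` is hypothetical.

```python
def systematic_sample(records, step, start=None):
    """Take every `step`-th record, beginning at 1-based position `start`."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    if start is None:
        start = step  # begin at the step-th record, as in the essay
    return records[start - 1::step]


# Placeholder positions 1..100 stand in for the 100 applicant records.
applicant_ids = list(range(1, 101))
subset = systematic_sample(applicant_ids, step=3, start=3)
print(len(subset), subset[:3], subset[-1])
```

With a step of 3 and a start of 3, the positions selected are 3, 6, 9 and so on up to 99, which gives the 33 applicants in each column of Table 1.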

Mean, median, mode calculations

Determining mean, median and mode

The sum of each data set is obtained first.

For unsuccessful applicants, ∑=1440

Mean=total age of applicants/total number of applicants

=1440/33=43.64 yrs

For successful applicants, ∑=1312

Mean=1312/33=39.76 yrs

The median is the middle value of the data set once it is arranged in ascending order. With 33 values in each subset, the median is the 17th value.

For unsuccessful applicants, median=44 yrs

For successful applicants, median=39 yrs

The mode is the value occurring most often in the data set.

For unsuccessful applicants, mode=37 yrs

For successful applicants, mode=39 yrs
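The three measures can be checked with Python's standard `statistics` module. This is only a verification sketch: the two lists are copied from Table 1, and the variable names are illustrative.

```python
from statistics import mean, median, mode

# Ages copied from Table 1, in the order sampled.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]

for label, ages in (("Unsuccessful", unsuccessful), ("Successful", successful)):
    # median() sorts internally, so the lists need not be pre-sorted.
    print(label, sum(ages), round(mean(ages), 2), median(ages), mode(ages))
```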

Determining range and standard deviation

The variance=∑(xi-x’)^2/n, where xi is the ith element in the data set, x’ is the mean and n is the number of values in the data set.

For unsuccessful applicants

Age Unsuccessful Applicants	xi-x’	(xi-x’)^2
29	-14.63636364	214.2231405
30	-13.63636364	185.9504132
34	-9.636363636	92.85950413
36	-7.636363636	58.31404959
37	-6.636363636	44.04132231
37	-6.636363636	44.04132231
37	-6.636363636	44.04132231
37	-6.636363636	44.04132231
37	-6.636363636	44.04132231
39	-4.636363636	21.49586777
39	-4.636363636	21.49586777
41	-2.636363636	6.950413223
41	-2.636363636	6.950413223
42	-1.636363636	2.67768595
43	-0.636363636	0.404958678
44	0.363636364	0.132231405
44	0.363636364	0.132231405
44	0.363636364	0.132231405
44	0.363636364	0.132231405
45	1.363636364	1.859504132
45	1.363636364	1.859504132
45	1.363636364	1.859504132
46	2.363636364	5.58677686
48	4.363636364	19.04132231
49	5.363636364	28.76859504
49	5.363636364	28.76859504
49	5.363636364	28.76859504
53	9.363636364	87.67768595
54	10.36363636	107.4049587
54	10.36363636	107.4049587
55	11.36363636	129.1322314
56	12.36363636	152.8595041
57	13.36363636	178.5867769
∑=1440		∑=1711.636364
Mean x’=43.63636364

Table 2: Variance table for unsuccessful data subset.

Variance=1711.636364/33=51.868

Population standard deviation σ=√variance=√51.868=7.20

The sample standard deviation S=√(∑(xi-x’)^2/(n-1))

S=√(1711.6363/32)=7.31359

The range is the difference between the lowest and highest values in a data set. For unsuccessful applicants, the range=57-29=28yrs.

Successful applicants

Age Successful Applicants	xi-x’	(xi-x’)^2
33	-6.757575758	45.66483012
33	-6.757575758	45.66483012
33	-6.757575758	45.66483012
34	-5.757575758	33.1496786
35	-4.757575758	22.63452709
35	-4.757575758	22.63452709
35	-4.757575758	22.63452709
36	-3.757575758	14.11937557
36	-3.757575758	14.11937557
36	-3.757575758	14.11937557
38	-1.757575758	3.089072544
38	-1.757575758	3.089072544
38	-1.757575758	3.089072544
38	-1.757575758	3.089072544
38	-1.757575758	3.089072544
39	-0.757575758	0.573921028
39	-0.757575758	0.573921028
39	-0.757575758	0.573921028
39	-0.757575758	0.573921028
39	-0.757575758	0.573921028
39	-0.757575758	0.573921028
41	1.242424242	1.543617998
41	1.242424242	1.543617998
42	2.242424242	5.028466483
42	2.242424242	5.028466483
44	4.242424242	17.99816345
45	5.242424242	27.48301194
45	5.242424242	27.48301194
47	7.242424242	52.45270891
48	8.242424242	67.93755739
48	8.242424242	67.93755739
48	8.242424242	67.93755739
51	11.24242424	126.3921028
∑=1312		∑=768.0606061
Mean x’=39.75757576

Table 3: Variance table for successful data subset.

Variance=768.0606061/33=23.275

Population standard deviation σ=√23.275=4.824

The sample standard deviation S=√(∑(xi-x’)^2/(n-1))

S=√(768.0606/32)=4.8991

Range=51-33=18 yrs
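The range, variance and standard deviations for both groups can be reproduced as a check on Tables 2 and 3. The sketch below relies on the standard `statistics` module, where `pvariance` and `pstdev` divide by n and `stdev` divides by n-1; the helper name `spread_summary` is hypothetical.

```python
from statistics import pstdev, pvariance, stdev

# Ages copied from Table 1.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]


def spread_summary(ages):
    """Return the range, population variance and both standard deviations."""
    return {
        "range": max(ages) - min(ages),
        "population_variance": pvariance(ages),  # sum of squared deviations / n
        "population_sd": pstdev(ages),           # square root of the line above
        "sample_sd": stdev(ages),                # uses n - 1 in the denominator
    }


for label, ages in (("Unsuccessful", unsuccessful), ("Successful", successful)):
    print(label, {key: round(value, 4) for key, value in spread_summary(ages).items()})
```

The sample standard deviation is slightly larger than the population value because dividing by n-1 compensates for estimating the mean from the same data.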

Results table

Statistic	Unsuccessful applicants	Successful applicants
Mean (yrs)	43.64	39.76
Median (yrs)	44	39
Mode (yrs)	37	39
Range (yrs)	28	18
Sample standard deviation (yrs)	7.31359	4.8991

Table 4: Results table.

The mean, median and mode of the two data sets lie between 37 and 44 yrs. The unsuccessful group contains more older applicants than the successful group, as shown by its wider range (28 yrs) and higher mean age (43.64 yrs). Ages in the unsuccessful group are widely spread, with a sample standard deviation of 7.31 yrs. The successful group is concentrated around 39 yrs, with a smaller standard deviation of 4.90 yrs.

Graphical representations

Histogram representations

*Figure 1: Histogram illustrating the distribution of unsuccessful applicants.*

*Figure 2: Histogram illustrating the distribution of successful applicants.*

*Figure 3: Histogram showing median and bell shaped distribution of unsuccessful applicants.*

The median is 44, the mode is 37 and the mean is 43.64.

*Figure 4: Histogram showing median and bell shaped distribution of successful applicants.*

The median is 39, the mode is 39 and the mean is 39.76.
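Since the figures themselves are not reproduced here, the shape of each histogram can be approximated by counting ages in fixed-width bins. The sketch below is illustrative only: the 5-year bin width and the names `bin_counts` and `print_histogram` are assumptions, and the essay's figures may have used different bins.

```python
from collections import Counter

# Ages copied from Table 1.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]


def bin_counts(ages, width=5):
    """Count ages per bin; each key is the lower edge of a `width`-year bin."""
    counts = Counter((age // width) * width for age in ages)
    return dict(sorted(counts.items()))


def print_histogram(label, ages, width=5):
    """Print a simple text histogram, one row per occupied bin."""
    print(label)
    for start, count in bin_counts(ages, width).items():
        print(f"  {start}-{start + width - 1}: {'#' * count} ({count})")


print_histogram("Unsuccessful applicants", unsuccessful)
print_histogram("Successful applicants", successful)
```

With these bins the unsuccessful ages spread fairly evenly across 35 to 49 yrs, while the successful ages pile up in the 35 to 39 bin with a longer tail towards older ages.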

Constructing box plots

Unsuccessful applicants

Minimum value=29 yrs;

Maximum value=57 yrs;

1st quartile=37;

Median=44;

3rd quartile=49;

Interquartile range (IQR)=49-37=12;

Upper fence for outliers=3rd quartile+1.5*IQR=49+(1.5*12)=67;

Lower fence for outliers=1st quartile-1.5*IQR=37-(1.5*12)=19.

All values lie between the fences of 19 and 67. Therefore, there are no outliers.

Successful applicants

Minimum value=33 yrs;

Maximum value=51 yrs;

1st quartile=36;

Median=39;

3rd quartile=42;

Interquartile range (IQR)=42-36=6;

Upper fence for outliers=3rd quartile+1.5*IQR=42+(1.5*6)=51;

Lower fence for outliers=1st quartile-1.5*IQR=36-(1.5*6)=27.

The maximum value, 51, lies exactly on the upper fence rather than beyond it. Therefore, there are no outliers.
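The five-number summary and outlier fences can be recomputed as follows. This is a sketch under one assumption: quartiles are taken with the "inclusive" method of `statistics.quantiles`, which reproduces the quartiles quoted above; other quartile conventions can give slightly different values. The helper name `box_plot_summary` is hypothetical.

```python
from statistics import quantiles

# Ages copied from Table 1.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]


def box_plot_summary(ages):
    """Return the five-number summary, the 1.5*IQR fences and any outliers."""
    q1, med, q3 = quantiles(ages, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # A value is an outlier only if it lies strictly beyond a fence.
    outliers = [age for age in ages if age < lower_fence or age > upper_fence]
    return {
        "min": min(ages), "q1": q1, "median": med, "q3": q3, "max": max(ages),
        "iqr": iqr, "lower_fence": lower_fence, "upper_fence": upper_fence,
        "outliers": outliers,
    }


for label, ages in (("Unsuccessful", unsuccessful), ("Successful", successful)):
    print(label, box_plot_summary(ages))
```

The strict comparison in the outlier test is what keeps the value 51 in the successful group from being flagged, since it equals the upper fence.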

Histograms and box plots show the distribution of the data sets. The histogram for unsuccessful applicants is approximately normal (bell shaped) and roughly even on both sides, which implies that the ages of the unsuccessful applicants are balanced about the central point (median = 44). The histogram for successful applicants is skewed towards the right, which implies that the ages of successful applicants are unbalanced about the central point (median = 39), although the distribution is still close to normal. The box plot for unsuccessful applicants has whiskers of equal length (8 yrs on each side), meaning the sample is evenly distributed, while the whiskers for successful applicants are uneven (3 yrs below the box and 9 yrs above it). In each box plot, 25% of the values lie below the lower quartile and 25% lie above the upper quartile.

Formulation of hypotheses

The hypotheses test whether the mean ages of the two groups differ. The hypothesized mean is 40 yrs. Under the null hypothesis, the mean age of unsuccessful applicants, μ1, equals the mean age of successful applicants, μ2; that is, the means of the two groups are not significantly different.

Null hypothesis H0: μ1=μ2
Alternate hypothesis H1: μ1≠μ2 (two tailed test)

The alpha level is the significance level of the test. It gives the probability of rejecting the null hypothesis when it is true (type 1 error), that is, the probability of concluding that the research hypothesis is true when the null hypothesis is in fact true. With an alpha level of 0.05, a two tailed test places 0.025 in each tail, so the null hypothesis is rejected when the two tailed p-value is less than 0.05.

The z statistic is used for hypothesis testing. Each sample is large at 33 (n>30), so the sample standard deviations can stand in for the population standard deviations.

The z statistic is

Z=(x̄1-x̄2)/√(σ1²/n1+σ2²/n2)

where x̄1 and x̄2 are the sample means of the unsuccessful and successful applicants, σ1 and σ2 are the corresponding standard deviations and n1 and n2 are the sample sizes.

The Z value is given as Z=(43.64-39.76)/√(7.31359²/33+4.8991²/33)=3.88/1.5324=2.53.

The shaded region below is the rejection region R.

Rejection region R: Z>1.96 or Z<-1.96

*Figure 7: Sketch of the critical region in hypothesis testing.*

Since 2.53>1.96, Z falls in the rejection region and the null hypothesis is rejected. The means of the unsuccessful and successful groups are significantly different at the 5% significance level: the mean age of the unsuccessful applicants is 43.64 yrs while that of the successful applicants is 39.76 yrs.
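The test statistic can be recomputed directly from the sampled ages as a check on the hand calculation. The sketch below assumes, as the essay does, that the sample standard deviations stand in for the population values; the function name `two_sample_z` is hypothetical, and `NormalDist` from the standard library supplies the normal distribution.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Ages copied from Table 1.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]


def two_sample_z(group_a, group_b):
    """Return (z, two-tailed p-value) for the difference between two means."""
    # Standard error of the difference: square root of the summed variance terms.
    standard_error = sqrt(stdev(group_a) ** 2 / len(group_a)
                          + stdev(group_b) ** 2 / len(group_b))
    z = (mean(group_a) - mean(group_b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # area in both tails
    return z, p_value


z, p_value = two_sample_z(unsuccessful, successful)
critical = NormalDist().inv_cdf(0.975)  # about 1.96 for alpha = 0.05, two tailed
print(round(z, 2), round(p_value, 4), abs(z) > critical)
```

Note that the denominator is the square root of the summed variance terms; leaving out the square root understates Z.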

Confidence intervals

Margin of error

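The margin of error is the critical value multiplied by the standard error of the mean: m=z*s/√n.

At 90% confidence level, the critical value is 1.64, which is found by calculating the tail area (1-0.9)/2=0.05 and looking up the corresponding z value in the table.

For the unsuccessful group, the margin of error is m=1.64*7.31359/√33=2.09 yrs, so the mean lies within 43.64±2.09=41.55 to 45.73.

For successful applicants, the margin of error is m=1.64*4.8991/√33=1.40 yrs, so the mean lies within 39.76±1.40=38.36 to 41.16.

For comparison, the interval mean±standard deviation is 43.64±7.31359=36.32 to 50.95 for the unsuccessful group and 39.76±4.8991=34.86 to 44.66 for the successful group. This interval describes the spread of individual ages rather than the margin of error of the mean.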

*Figure 8: Illustration of confidence intervals.*

Desired confidence interval

The difference of means x1-x2=43.64-39.76=3.88

Standard error of the difference=√(σ1²/n1+σ2²/n2)=√(7.31359²/33+4.8991²/33)=1.5324

CI=(x1-x2)±1.64*1.5324

CI=3.88±2.51=1.37 to 6.39

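Confidence intervals indicate how reliable an estimate is. The two data samples have been drawn from a larger group of data, so the difference between their means only estimates the true difference. A 90% confidence level means that, if the sampling were repeated many times, about 90% of the intervals constructed in this way would contain the unknown parameter.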
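The interval for the difference of means can be reproduced with a short sketch. It takes the critical value from `NormalDist().inv_cdf`, which gives about 1.645 rather than the rounded table value 1.64, so the bounds may differ from a hand calculation in the second decimal place. The function name `difference_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Ages copied from Table 1.
unsuccessful = [37, 42, 44, 45, 49, 54, 56, 34, 39, 43, 45, 46, 53, 54, 57, 37, 41,
                44, 45, 48, 37, 55, 37, 37, 44, 44, 49, 49, 36, 41, 30, 29, 39]
successful = [34, 39, 42, 35, 38, 45, 39, 48, 41, 33, 36, 38, 42, 35, 38, 41, 47,
              48, 39, 33, 36, 38, 39, 44, 35, 45, 39, 48, 51, 33, 36, 38, 39]


def difference_ci(group_a, group_b, confidence=0.90):
    """Return (lower, upper) bounds of a z-based interval for mean(a) - mean(b)."""
    tail = (1 - confidence) / 2                 # 0.05 in each tail at 90%
    z_critical = NormalDist().inv_cdf(1 - tail)
    difference = mean(group_a) - mean(group_b)
    standard_error = sqrt(stdev(group_a) ** 2 / len(group_a)
                          + stdev(group_b) ** 2 / len(group_b))
    margin = z_critical * standard_error
    return difference - margin, difference + margin


lower, upper = difference_ci(unsuccessful, successful)
print(round(lower, 2), round(upper, 2))
```

An interval that does not contain zero indicates that the two means differ at the corresponding significance level.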

Conclusion

From the graphical representations and the hypothesis test, the mean age of unsuccessful applicants was 43.64 yrs while that of successful applicants was 39.76 yrs. The margin of error for both data sets is small. The 90% confidence interval for the difference between the two means is 1.37 to 6.39 yrs; because this interval excludes zero, the unsuccessful applicants were, on average, significantly older than the successful ones. The histograms and box plots portray near normal, bell-shaped distributions. It therefore cannot be safely concluded that age played no part in the selection. A difference in mean age does not by itself prove discrimination, but since the sample was drawn systematically from the whole applicant pool, the full population is expected to show a similar result, and the hiring process should be reviewed before it is retained.

References

Glosser, M. G. (1998). Custom random number generator. Web.

Pseudo-random numbers. (n.d.). Web.