
SPSS Project

Student ID number

MSc Global Public Health 



Coursework Deadline:


Word Count:

Module Leader:

Table of Contents

Introduction

Objectives

Methods

Description of data collection and sampling procedure

Study design and quality control

Variables used in the analysis

Description of the analysis plan

Results

Description of the sample

Presentation of the results of analytical analyses

Discussion and conclusions

Comparing the findings to the WHO values

Suggested reasons for the patterns and trends

Limitations of the data analysis and survey methodology

Identification of the need for intervention

References


Introduction

Obesity is a condition in which excessive or abnormal fat accumulation in adipose tissue harms health. Obesity and overweight are usually measured using the body mass index (BMI), while separate weight recommendations and special growth charts exist for children. In Pinkland, the prevalence of obesity and overweight in children and adults has been increasing rapidly. Both conditions are strongly linked with a high risk of developing type 2 diabetes, cardiovascular disease and cancers. Data from the Health Survey indicate that around 24% of adults, both women and men, were obese, while 34% of women and 44% of men were overweight, with a BMI of 25-29.9 (Keaver et al., 2020). Approximately 16% of children aged 2 to 15 were obese and 14% were overweight. Among those aged 2 to 10 years, 14.4% of girls and 16.3% of boys were obese, and among those aged 11 to 15 years, 19.0% of girls and 17.6% of boys were obese. Obesity is therefore a public health problem that has a negative impact on all socio-economic and age groups in Pinkland.

The survey is needed because it provides relevant information on the prevalence of obesity and overweight by describing the characteristics of a large population in Pinkland. It also provides an adequate sample for collecting targeted outcomes, which helps in making major decisions and drawing conclusions. With a well-constructed questionnaire design, it is therefore considered a reliable method of inquiry.


Objectives

The objectives of the research are given below:

  • To identify the potential opportunities for CRC screening in Pinkland in the context of diabetes management
  • To determine the prevalence of obesity or overweight in Pinkland
  • To describe the demographic characteristics of the sample by age, marital status, sex and ethnicity
  • To summarize the data associated with BMI
  • To examine how BMI is related to educational attainment, age, sex, occupation, car ownership and ethnicity
  • To determine the relationship between ethnicity and long-standing illness
  • To analyze how the burden of disease is distributed among the ethnic groups


Methods

Description of data collection and sampling procedure

Data collection

Data for the periodic national Health Survey of Pinkland (HSP) were gathered between June 11 and July 2011 in order to analyze the dietary status of adults. The data were collected by the National Centre for Health Research (NCHR) in coordination with the Department of Public Health at the University of Central Pinkland (UCP). Office-based staff at the UCP Department of Public Health supported the data collection, and every research area had a Health Manager who was accountable for executing the project and monitoring its activities. Before the survey, a letter was sent to each address to explain the survey and its purpose briefly (Rudolf et al., 2019). Information leaflets provided by the interviewer gave participants further detail. Interviews were conducted using Computer-Assisted Personal Interviewing (CAPI) and covered socio-demographic features such as age, educational qualifications and sex.

At the end of the interview, each participant's height and weight were measured. For participants who wanted a record of their height and weight measurements, the interviewer completed a Measurement Record Card.

Sampling procedure

The sample was drawn using multistage stratified probability sampling, an efficient method that combines multistage sampling and stratified sampling (Etikan and Bala, 2017). Postcode sectors were the primary sampling units, stratified by health authority region and by the percentage of households whose head of household was in a non-manual occupation. The Pinkland Postcode Address File was used as the sampling frame for households (Musal and Ekin, 2018). A random sample of 7,000 addresses was selected within 700 postcode sectors, and up to 10 adults were interviewed in every household.
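The two sampling stages can be illustrated with a minimal Python sketch. It is not the procedure used by the HSP team: the function name, the toy sampling frame and the stratum labels are hypothetical, and the real design stratified on two characteristics rather than one. The sketch only shows the general idea of drawing postcode sectors at random within each stratum and then drawing addresses at random within each selected sector.

```python
import random


def multistage_stratified_sample(frame, sectors_per_stratum, addresses_per_sector, seed=0):
    """Stage 1: draw sectors at random within each stratum.
    Stage 2: draw addresses at random within each selected sector."""
    rng = random.Random(seed)
    selected = []
    for stratum in sorted(frame):
        sectors = frame[stratum]
        for sector in rng.sample(sorted(sectors), sectors_per_stratum):
            for address in rng.sample(sectors[sector], addresses_per_sector):
                selected.append((stratum, sector, address))
    return selected


# Hypothetical toy frame: 2 strata x 5 sectors x 20 addresses (not the real HSP frame).
frame = {
    f"region_{r}": {
        f"sector_{r}_{s}": [f"address_{r}_{s}_{a}" for a in range(20)]
        for s in range(5)
    }
    for r in range(2)
}

sample = multistage_stratified_sample(frame, sectors_per_stratum=2, addresses_per_sector=10)
print(len(sample), "addresses selected")
```

In the survey itself, 7,000 addresses across 700 postcode sectors correspond to an average of 10 addresses per sector, which is the value used for the second stage in this sketch.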

Study design and quality control

The research uses a cross-sectional study design, which observes a defined population at a single point in time (Alili and Krstev, 2019). In this design, outcomes and exposures are measured simultaneously. It made it possible to evaluate the data for the different variables gathered from participants.

Furthermore, the research used control variables, which helped the researcher to address selection bias within a particular observation group and to account for variables that might otherwise absorb the explanatory power of the model (Wang et al., 2017). Quality control was ensured by training the interviewers properly and by using interviewer-administered questionnaires.

Variables used in the analysis

The variables used in the analysis are age, sex, ethnicity, marital status, highest educational qualification, height, weight, occupation, car ownership and the presence of a limiting long-standing illness. These variables helped the researcher to collect and analyze accurate data and to drive the research process (Schlossarek, Syrovátka and Vencálek, 2019). The variables were measured on two levels, scale and nominal. A nominal variable uses values only as labels for categories with no quantitative order, for example female and male participants. A scale variable takes numeric values with a meaningful order and magnitude. Age, height and weight are scale variables, whereas sex, highest educational qualification, ethnic group, marital status, household size and car or van ownership are nominal variables.

Description of the analysis plan

The analysis plan uses HSP data to estimate the prevalence of obesity and overweight among the population living in Pinkland. The data were analyzed with the Statistical Package for the Social Sciences (SPSS), a statistical platform that researchers use to evaluate complex statistical data (Zou, Lloyd and Baumbusch, 2020). The sampling method helped to gather relevant data from the participants by observing a selected set of units. For the descriptive statistics, frequency tables were used to show how often each value of a variable occurs in the sample, for example the number of males and females. The weight and height of the participants were measured in order to calculate BMI (Choi et al., 2017). Weight was measured using Tanila electronic scales with a digital display. Together, these variables support the research objectives and the analysis of the research outcomes.
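The calculation of BMI from the measured weight and height can be sketched in Python as follows. The report does not state the units, so the sketch assumes the standard definition, weight in kilograms divided by the square of height in metres. The function name and the example values are illustrative and are not taken from the HSP data.

```python
def bmi(weight_kg, height_m):
    """Body mass index: weight in kilograms divided by height in metres squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight and height must be positive")
    return weight_kg / height_m ** 2


# Illustrative participant (hypothetical values): 80 kg and 1.75 m.
print(round(bmi(80, 1.75), 2))
```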


Results

Description of the sample

The sample of 540 participants is described in terms of age, sex, ethnic group and marital status (Zhou et al., 2016). The frequency table for sex shows 248 male participants (45.9%) and 292 female participants (54.1%).

Ethnic group    Frequency    Percent    Valid Percent    Cumulative Percent
White           523          96.9       96.9             96.9
Mixed           4            0.7        0.7              97.6
Asian           10           1.9        1.9              99.4
Black           1            0.2        0.2              99.6
Other           2            0.4        0.4              100.0
Total           540          100.0      100.0

Table 1: Ethnic Group


The above table shows the ethnicity of the participants: 96.9% of the respondents are white, 0.7% are mixed, 1.9% are Asian, 0.2% are black and 0.4% are other.
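The percentages in Table 1 follow directly from the frequencies. The short Python sketch below recomputes the percent and cumulative percent columns from the counts reported in the table. The function name is illustrative, and the code is a check on the arithmetic rather than a reproduction of the SPSS procedure.

```python
def frequency_table(counts):
    """Return (label, frequency, percent, cumulative percent) rows, rounded to one decimal."""
    total = sum(counts.values())
    rows = []
    running = 0
    for label, frequency in counts.items():
        running += frequency
        rows.append((
            label,
            frequency,
            round(100 * frequency / total, 1),
            round(100 * running / total, 1),
        ))
    return rows


# Frequencies as reported in Table 1.
ethnic_group = {"white": 523, "mixed": 4, "asian": 10, "black": 1, "other": 2}

for row in frequency_table(ethnic_group):
    print(row)
```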

The marital status frequency table shows that 23.7% of respondents are single, 57.8% are married, 2.0% are separated, 10.2% are divorced and 6.3% are widowed.

Figure: Bar chart for sex variable

(Source: Self-constructed)

Moreover, the BMI frequencies show that, of the 540 participants, 493 had a valid BMI and 47 had a missing BMI. The mean, median and mode are measures of location (Xu and Deng, 2017). The mean BMI is 27.09, the median is 26.56 and the mode is 28.57, with a standard deviation of 4.93.


BMI is treated as a continuous variable, which can take any value between the lowest and highest points of measurement (Liu et al., 2020). The results show that 2% of individuals have a BMI under 18.5 and are classed as underweight, while 6.5% have a BMI of 18.5 to 24.99 and are of normal weight. A further 8.3% have a BMI between 25 and 29.9 and are overweight, and 9.6% have a BMI of 30 or above and are obese. The graph given below shows the BMI of the participants.

Figure: BMI of the participants

(Source: Self-constructed)
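The grouping of BMI values into weight categories can be sketched in Python as below. The cut-offs are the ones stated in this report (under 18.5, 18.5 to 24.99, 25 to 29.9, and 30 or above). The function names and the example values are hypothetical and are not drawn from the survey data.

```python
from collections import Counter


def bmi_category(bmi):
    """Assign a BMI value to one of the four categories used in this report."""
    if bmi < 18.5:
        return "underweight"
    if bmi < 25:
        return "normal weight"
    if bmi < 30:
        return "overweight"
    return "obese"


def category_percentages(bmi_values):
    """Percentage of valid BMI values that fall in each category."""
    counts = Counter(bmi_category(value) for value in bmi_values)
    return {name: 100 * count / len(bmi_values) for name, count in counts.items()}


# Hypothetical BMI values for illustration only.
example = [17.9, 22.4, 24.9, 26.6, 27.1, 28.6, 31.2, 35.0]
print(category_percentages(example))
```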

Moreover, the 95% confidence interval for the mean BMI has a lower bound of 26.66 and an upper bound of 27.53. The whole interval lies above the overweight threshold of 25.
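The confidence interval can be checked approximately from the summary statistics reported above (mean 27.09, standard deviation 4.93, 493 valid cases). The Python sketch below assumes a normal approximation with a critical value of 1.96, whereas SPSS uses the t distribution and unrounded inputs, so small differences in the second decimal place are expected. The function name is illustrative.

```python
import math


def mean_confidence_interval(mean, sd, n, z=1.96):
    """Approximate confidence interval for a mean using the normal critical value z."""
    standard_error = sd / math.sqrt(n)
    margin = z * standard_error
    return mean - margin, mean + margin


# Summary statistics as reported in this section.
lower, upper = mean_confidence_interval(mean=27.09, sd=4.93, n=493)
print(f"{lower:.2f} to {upper:.2f}")
```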

Presentation of the results of analytical analyses

To investigate the association of BMI with ethnicity, car ownership, occupation and the presence of a limiting long-standing illness, Pearson and chi-square tests were used (Shih and Fay, 2017). The table given below shows that the Pearson correlation output reports values of 0.088 for ethnic group, 0.045 for car ownership, 0.049 for occupation and 0.184 for limiting long-standing illness (Turhan, 2020). None of these results is statistically significant, so there is no evidence that these variables are associated with BMI in this sample.

Table 2: Correlations

The Pearson test is used for identifying the relationship between BMI and age as both are scale variables.
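For reference, the Pearson correlation coefficient between two scale variables can be computed as in the following Python sketch. The function name and the age and BMI values are hypothetical and serve only to illustrate the calculation. The SPSS output for the survey data is not reproduced here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sum((a - mean_x) ** 2 for a in x)
    spread_y = sum((b - mean_y) ** 2 for b in y)
    return cross / math.sqrt(spread_x * spread_y)


# Hypothetical ages and BMI values for illustration only.
age = [23, 35, 41, 52, 60, 68]
bmi_values = [22.1, 25.3, 26.0, 27.8, 28.4, 27.2]
print(round(pearson_r(age, bmi_values), 3))
```

A coefficient close to +1 or -1 indicates a strong linear relationship, while a value close to 0 indicates little or no linear relationship.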

Table 3: T-Test for identifying association of BMI with sex

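Since BMI is a continuous variable and sex is a categorical variable with two levels, the independent-samples Student t-test is used to evaluate the relationship. The significance value is 0.004, which is above 0.001 but below the conventional 0.05 threshold, so the difference in mean BMI between men and women is statistically significant at the 5% level.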
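The test statistic behind an independent-samples Student t-test can be sketched in Python as follows. The sketch assumes equal variances in the two groups (the pooled-variance form) and returns only the t statistic and its degrees of freedom, since the p-value requires the t distribution. The function name and the BMI values are hypothetical.

```python
import math
from statistics import mean, variance


def student_t(group_a, group_b):
    """Pooled-variance t statistic and degrees of freedom for two independent groups."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical BMI values for men and women, for illustration only.
bmi_men = [27.5, 28.1, 26.9, 29.4, 27.8]
bmi_women = [25.9, 26.4, 27.0, 25.1, 26.2]
t_statistic, degrees_of_freedom = student_t(bmi_men, bmi_women)
print(round(t_statistic, 3), degrees_of_freedom)
```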

Table 4: ANOVA test for evaluating the relationship between BMI and educational attainment

The ANOVA test is used to evaluate the relationship between BMI and educational attainment. ANOVA tests the association between a categorical variable with more than two levels and a continuous variable. Educational attainment has five levels, so the ANOVA test is used to evaluate its relationship with the scale variable BMI.
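The F statistic of a one-way ANOVA compares the variation between group means with the variation within groups. The Python sketch below shows the calculation for a hypothetical set of groups. The function name and the data are illustrative, and only three groups are used for brevity, whereas educational attainment in the survey has five levels.

```python
def one_way_anova_f(groups):
    """F statistic with between-group and within-group degrees of freedom."""
    k = len(groups)
    n = sum(len(group) for group in groups)
    grand_mean = sum(sum(group) for group in groups) / n
    group_means = [sum(group) / len(group) for group in groups]
    ss_between = sum(
        len(group) * (m - grand_mean) ** 2 for group, m in zip(groups, group_means)
    )
    ss_within = sum(
        sum((value - m) ** 2 for value in group) for group, m in zip(groups, group_means)
    )
    df_between, df_within = k - 1, n - k
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return f_statistic, df_between, df_within


# Hypothetical BMI values for three attainment groups, for illustration only.
bmi_by_attainment = [
    [26.1, 27.4, 28.0, 26.8],
    [27.9, 28.3, 26.5, 27.2],
    [25.4, 26.0, 27.1, 25.9],
]
print(one_way_anova_f(bmi_by_attainment))
```

A large F statistic relative to its degrees of freedom suggests that at least one group mean differs from the others.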


Discussion and conclusions

Comparing the findings to the WHO values

According to WHO reports, the majority of adults in Pinkland, 60% of women and 67% of men, are overweight or obese. The reports also indicate that around 20% of children aged 6 years are obese. Among adolescents, up to 17% of girls and 18% of boys aged 11 years were overweight in Pinkland (Muttarak, 2018). In addition, 11,117 hospital admissions were directly attributable to overweight or obesity.

Suggested reasons for the patterns and trends

A trend is the overall direction of a measure over a period of time, while a pattern is a set of data that follows an identifiable form in the present data. One of the major reasons for the patterns and trends in the data, based on the research and literature, is the research questions together with descriptive reviews, which are generic and commonly associated with publication trends and patterns. Overall, patterns within the research outputs characterize the current elements as well as the historic trends of the field of medical informatics (Quessy, Rivest and Toupin, 2019). Another reason for the trends and patterns in the research is the use of journal frequencies along with impact factors related to obesity and overweight.

Limitations of the data analysis and survey methodology

One limitation of the data analysis is that the data were incomplete: some values were missing, and a substantial part of the data was unavailable, which limited its usability. Moreover, because the data were gathered through surveys, they may be inaccurate and may vary in format and quality, since the sources have different structures and attributes (Hazra, 2017). Survey data of this kind are also not fully compatible across data fields, so they need significant preprocessing before analysis.

Moreover, the reliability of the survey data is affected if respondents did not feel motivated, in which case they might have provided inaccurate answers. Participants might also have felt uncomfortable giving answers that presented themselves in an unfavourable manner.

Identification of the need for intervention

There is a need to verify all the variables used within the model and to analyze the scope of the data over time in order to avoid the seasonality trap. The limitations found in the survey methodology could be reduced by clearly defining the aim of the survey and keeping rating-scale questions consistent throughout the survey (Yang et al., 2019). It is recommended that the survey be short and focused and that the questions be kept simple by using closed-ended questions. This would help to increase the response rate of the survey, alongside the efficient use of sophisticated adjustment techniques. The data analysis, in turn, could be improved by setting clear, measurable priorities for collecting and evaluating the data. Moreover, it is necessary to interpret the results found after analyzing the data (Zeiler, Müller and Bertsche, 2016), since this helps in drawing appropriate conclusions and supports the decision-making process. Before collecting the data, it is important to review the information that could be gathered from existing sources or databases.



References

Alili, A. and Krstev, D., 2019. Using SPSS for research and data analysis. Knowledge International Journal, 32(3), pp.363-368.

Choi, J., Kim, B., Hahn, H., Park, H., Jeong, Y., Yoo, J. and Jeong, M.K., 2017. Data mining-based variable assessment methodology for evaluating the contribution of knowledge services of a public research institute to business performance of firms. Expert Systems with Applications, 84, pp.37-48.

Etikan, I. and Bala, K., 2017. Sampling and sampling methods. Biometrics & Biostatistics International Journal, 5(6), p.00149.

Hazra, A., 2017. Using the confidence interval confidently. Journal of Thoracic Disease, 9(10), p.4125.

Keaver, L., Xu, B., Jaccard, A. and Webber, L., 2020. Morbid obesity in the UK: A modelling projection study to 2035. Scandinavian Journal of Public Health, 48(4), pp.422-427.

Li, Y.H., 2017. Text feature selection algorithm based on Chi-square rank correlation factorization. Journal of Interdisciplinary Mathematics, 20(1), pp.153-160.

Liu, Y., Mu, Y., Chen, K., Li, Y. and Guo, J., 2020. Daily activity feature selection in smart homes based on Pearson correlation coefficient. Neural Processing Letters, pp.1-17.

Musal, M. and Ekin, T., 2018. Information-theoretic multistage sampling framework for medical audits. Applied Stochastic Models in Business and Industry, 34(6), pp.893-907.

Muttarak, R., 2018. Normalization of plus size and the danger of unseen overweight and obesity in England. Obesity, 26(7), pp.1125-1129.

Quessy, J.F., Rivest, L.P. and Toupin, M.H., 2019. Goodness-of-fit tests for the family of multivariate chi-square copulas. Computational Statistics & Data Analysis, 140, pp.21-40.

Rudolf, M., Perera, R., Swanston, D., Burberry, J., Roberts, K. and Jebb, S., 2019. Observational analysis of disparities in obesity in children in the UK: Has Leeds bucked the trend? Pediatric Obesity, 14(9), p.e12529.

Schlossarek, M., Syrovátka, M. and Vencálek, O., 2019. The Importance of Variables in Composite Indices: A Contribution to the Methodology and Application to Development Indices. Social Indicators Research, 145(3), pp.1125-1160.

Shih, J.H. and Fay, M.P., 2017. Pearson's chi-square test and rank correlation inferences for clustered data. Biometrics, 73(3), pp.822-834.

Turhan, N.S., 2020. Karl Pearson's chi-square tests. Educational Research and Reviews, 15(9), pp.575-580.

Wang, Y., Chen, G., Wang, J., Luo, C., An, Q. and Zhao, J., 2017. A variable-capacity power system driven by geothermal energy: research methodology and preliminary experimental study. Energy Procedia, 142, pp.278-283.

Xu, H. and Deng, Y., 2017. Dependent evidence combination based on shearman coefficient and pearson coefficient. IEEE Access, 6, pp.11634-11640.

Yang, Q., Su, M., Li, Y. and Wang, R., 2019. Revisiting the Relationship Between Correlation Coefficient, Confidence Level, and Sample Size. Journal of Chemical Information and Modeling, 59(11), pp.4602-4612.

Zeiler, P., Müller, F. and Bertsche, B., 2016. New methods for the availability prediction with confidence level. Risk, Reliability and Safety: Innovating Theory and Practice: Proceedings of ESREL 2016 (Glasgow, Scotland, 25-29 September 2016), p.313.

Zhou, H., Deng, Z., Xia, Y. and Fu, M., 2016. A new sampling method in particle filter based on Pearson correlation coefficient. Neurocomputing, 216, pp.208-215.

Zou, D., Lloyd, J.E. and Baumbusch, J.L., 2020. Using SPSS to analyze complex survey data: a primer. Journal of Modern Applied Statistical Methods, 18(1), p.16.