SOC 304 – Social Statistics

STATISTICAL REPORT

Your statistical report is based on demonstrating three skills:

– Your ability to use SPSS to generate the findings needed for the write-up of your report.

– Your ability to calculate statistics based on findings from the output.

– Your ability to interpret and make sense of your output.

This report is a means to assess your level of quantitative literacy (see rubric table

attached at the end of this document).

Your quantitative report should be organized as follows:

1. Analysis: Answer ALL questions below as each step is graded. Having the

answers ready on this word document will help you submit the graded answers in

the Statistical Literacy report quiz. Submit your answers via the Statistical

Literacy Report Quiz on D2L. You have unlimited attempts. Don’t press the

submit button until you are done. There is no time limit. There is, however, a

deadline.

2. Submit in the Statistical Literacy Report assignment dropbox: your

complete SPSS output used for the analysis (you can export your output as an

RTF file or you can copy and paste into your word document) and submit it in the

dropbox for the “Statistical Project.” Note that the deadline for the output is

earlier than the Statistical Literacy Report. I will be checking your output file to

see if you made some

Guidelines/requirements for the analysis, found under the module

“Statistical Literacy Report”:

1. You will need to use the GSS 2014 data (GSS2014.sav) file (available on

D2L)

2. You will need to use the Fall 2018 SCSU survey SPSS data file

(FallSurvey2018_SOC 304.sav file) (available on D2L).

3. Remember that you can only open these SPSS (.sav) files from SPSS

accessed via Apps Anywhere AFTER you have uploaded those files using

Web/Files Space (See “how to” Adobe Recording on D2L)

1. UNIVARIATE ANALYSIS (31 POINTS)

This section 1 on univariate analysis is based on using both data sets (GSS 2014

and SCSU 2018 Fall survey). Remember to choose appropriate measures of

descriptive statistics based on the level of measurement of each variable. In

addition, for EACH variable, you need to describe in one or two sentences the

descriptive statistics you computed. Nominal and ordinal variables require you to

mention valid percent (not percent). For the univariate analysis of HRS1 and

HRSRELAX, provide the mean and the standard deviation.
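To see why valid percent and percent can differ, here is a minimal sketch in Python (not SPSS) that computes both from a small made-up list of coded responses. The response codes, their meanings, and the function name `percent_tables` are illustrative assumptions only; they do not come from either data file. SPSS reports the same two columns in its frequency tables.

```python
from collections import Counter

# Hypothetical coded responses (NOT from GSS 2014 or the SCSU survey).
# Assumed coding: 1 = "agree", 2 = "disagree"; 88 and 99 are missing codes.
responses = [1, 1, 2, 1, 88, 2, 99, 1, 2, 1]
MISSING_CODES = {88, 99}


def percent_tables(values, missing_codes):
    """Return (percent, valid_percent) dictionaries keyed by response code."""
    counts = Counter(values)
    total_n = len(values)
    valid_n = sum(n for code, n in counts.items() if code not in missing_codes)
    # "Percent" divides by all cases, including the missing ones.
    percent = {code: 100 * n / total_n for code, n in counts.items()}
    # "Valid percent" divides by the non-missing cases only.
    valid_percent = {
        code: 100 * n / valid_n
        for code, n in counts.items()
        if code not in missing_codes
    }
    return percent, valid_percent


percent, valid_percent = percent_tables(responses, MISSING_CODES)
print("percent:", percent)
print("valid percent:", valid_percent)
```

In this made-up example, code 1 makes up a larger share under valid percent than under percent because the two missing cases are dropped from the denominator.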

1) SEX.

a. In your GSS 2014 file, describe the SEX composition. Use valid percent

in your statement. (1 point)

b. Also, based on this data file, and this SEX variable, what is the ratio of

men to women? For every 10 women, there are ___ men. (1 point)

c. In your SCSU 2018 Fall survey file, describe the sex composition (use the

SGender variable). Use valid percent in your statement. (1 point)

d. Also, based on this data file, and this SGender variable, what is the

ratio of men to women? For every 10 women, there are ___ men. (1

point)

2) HRSRELAX. (“After an average work day, about how many hours do you have to

relax or pursue activities that you enjoy?”) (from the GSS 2014 file)

a. Provide the mean and standard deviation. (2 points)

b. Provide the values for approximately 68% of the middle respondent

sample (between –1 and +1 SD). (2 points)

c. For HRSRELAX, compute the confidence interval and provide the

following missing information. (2 points) At a 99% confidence level, the

confidence interval for the population mean of HRSRELAX is between

____ and ____. (Hint: you need to use the EXPLORE menu from

ANALYZE and change the confidence level in “Statistics” from 95 to 99)

3) CHILDS. (from the GSS 2014 data file)

a. Recode CHILDS into CHILDSR with only 4 categories so that “0,” “1,” and “2”

are coded respectively “0,” “1,” and “2,” and values “3” through “8” are collapsed

into value “3.” Label the values of your new variable CHILDSR (in the

variable view window) so that “0” = “no children,” “1” = “1 child,” “2” = “2

children,” and “3” = “3 children or more.” (See the SPSS primer for using

the “transform into different variable” instructions). Then run a frequency

distribution of CHILDSR making sure the value labels are visible on your

table.

Fill in the blank. There are ___ respondents who have 3 children or more. (.5

point)

b. Use valid percent to describe CHILDSR in your statement. (2 points)

4) HRS1. (from the GSS 2014 data file)

a. Provide the mean and standard deviation. (2 points)

b. Provide the values for approximately 68% of the middle respondent

sample (between –1 and +1 SD). (2 points)

5) Q28IMM_MUS1. (SCSU 2018 Fall survey file)

a. Use valid percent values to describe respondents’ stand on the statement

“Muslim immigrants are more difficult to integrate into American society

because of their religion.” Treat values 88 and 99 as missing (see

“using SPSS primer” specific to missing and recoding or the Adobe

Connect recording for SPSS assignments 11, 12, and 13). Make sure you

set the missing values so that you end up with only 5 categories (strongly

agree, somewhat agree, neutral, somewhat disagree, and strongly

disagree). Set “Don’t Know” and “Refused” as system missing.

Remember that you can set the missing values on the variable view

window in the “missing” cells for that variable. (2.5 points)

b. Then adding valid percentages, fill in the blanks. ___ % of respondents

overall agree with this statement compared to ___ % disagree with this

statement. (2 points)

6) Q16JOB_TRUMP. (SCSU 2018 Fall survey file)

a. Use valid percent values to describe respondents’ rating of president

Trump’s overall performance. Treat values 88 and 99 as missing values.

Make sure you set the missing values so that you end up with only 5

categories (Excellent, pretty good, only fair, poor, very poor). Remember

that you can set the missing values on the variable view window in the

“missing” cells for that variable (2 points).

b. Then adding valid percentages, fill in the blanks. ____ % of all

respondents who answered this question rate president Trump at least as

“pretty good” compared to ____ % who view president Trump at best

as “poor.” (2 points)

7) Q5PARTY1 (SCSU 2018 Fall survey file)

a. Only set “refused” value 99 as missing. NOTE: In part 5 of this

assignment, you will need to set values 3 to 9 as system missing

(SYSMIS) using the recoding into a different variable command so that

you are only looking at 2 categories (Democrat and Republican). Run the

frequency distribution of Q5PARTY1.

b. The number of respondents who see themselves as “Not political” is ____.

(1 point)

c. Use valid percent values to describe respondents’ political affiliation. (5

points)
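The hand calculations asked for in this section (the ratio of men per 10 women, the range between –1 and +1 SD, and a confidence interval for a mean) can be sketched in Python as follows. All numbers are hypothetical and are not results from either data file. The confidence interval uses the normal (z) approximation, which assumes a large sample; SPSS Explore uses the t distribution, so its bounds may differ slightly. The function names are illustrative, not part of SPSS.

```python
import math
from statistics import NormalDist

# Hypothetical summary values (NOT results from either data file).
n_men, n_women = 450, 500       # valid counts read from a frequency table
mean, sd, n = 4.0, 2.0, 1000    # mean, standard deviation, and valid N


def men_per_10_women(men, women):
    """Sex ratio expressed as the number of men for every 10 women."""
    return 10 * men / women


def middle_68_range(mean, sd):
    """Values one standard deviation below and above the mean."""
    return mean - sd, mean + sd


def mean_confidence_interval(mean, sd, n, level=0.99):
    """Confidence interval for a mean using the normal (z) approximation.

    Assumption: n is large. SPSS Explore uses the t distribution instead,
    so its lower and upper bounds can differ slightly from these.
    """
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    margin = z * sd / math.sqrt(n)
    return mean - margin, mean + margin


print("Men per 10 women:", men_per_10_women(n_men, n_women))
print("Middle 68% range:", middle_68_range(mean, sd))
print("99% CI:", mean_confidence_interval(mean, sd, n))
```

Use the sketch only to check your reasoning; the values you report must come from your own SPSS output.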

2. COMPARING TWO GROUPS MEANS (15 POINTS)

For this section, you need to use the GSS 2014 data. Does SEX of the respondent

matter in terms of how many hours one has to relax at the end of the day? Run the

independent samples T-test on SPSS in order to answer this question. You will need to

select “Options” to get some information about sample means. Hint: Pay attention to

the value numbers for male and female when selecting two groups to run the

independent samples T-test. (See SPSS assignment covering chapter 9 for illustration)

Make sure you review the chapter on two-independent samples testing before you do

the following:

a. Run the independent samples T-test on SPSS (make sure the output

for this is in the output file submitted in the dropbox). (2 points)

b. Address the 3 required assumptions (Step 1 of hypothesis testing) (2

points)

c. State the hypotheses (H0 and H1) (1 point)

d. Provide values for sample means of each subgroup/categorical group. (2

points)

e. Calculate the sample mean difference between those two groups. (1 point)

f. Provide the value of the obtained t. (1 point)

g. Provide probability of t (p-value for T). Hint: look for “sig.” (1 point)

h. Answer the following question. Based on the number found for the p-value, did the test statistic fall into the critical region? If so, at what alpha

level (.05?, .02?, .01?, .001?)? Remember that the p-value must be

smaller than the alpha and that we want to use the smallest alpha

possible. (2 points)

i. Make a decision (Reject or fail to reject) (1 point)

j. Interpret the results in terms of what you were testing and what it means in

comparing those two groups on your dependent variable. (2 points)
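As a rough check on steps e, f, and h, the sketch below computes a mean difference and an obtained t from made-up group summaries, then picks the smallest alpha that a given p-value falls below. The numbers are hypothetical, not GSS 2014 results. The t formula is the pooled-variance version, which corresponds to the row SPSS labels "Equal variances assumed"; the p-value itself is assumed to be read from the "Sig." column of your output rather than computed here.

```python
import math

# Hypothetical group summaries (NOT GSS 2014 results).
mean_1, sd_1, n_1 = 4.0, 2.0, 50   # group 1: mean, SD, N
mean_2, sd_2, n_2 = 3.0, 2.0, 50   # group 2: mean, SD, N


def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Obtained t and degrees of freedom, equal variances assumed."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a**2 + (n_b - 1) * sd_b**2) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df


def smallest_alpha(p_value, alphas=(0.05, 0.02, 0.01, 0.001)):
    """Smallest listed alpha that the p-value is below, or None if none."""
    passed = [alpha for alpha in alphas if p_value < alpha]
    return min(passed) if passed else None


mean_difference = mean_1 - mean_2
t_obtained, df = pooled_t(mean_1, sd_1, n_1, mean_2, sd_2, n_2)
print("Mean difference:", mean_difference)
print("t obtained:", round(t_obtained, 3), "df:", df)
# Example with a hypothetical p-value, as if read from the "Sig." column.
print("Smallest alpha for p = .014:", smallest_alpha(0.014))
```

If `smallest_alpha` returns `None`, the p-value is not below any of the listed alpha levels, which corresponds to failing to reject the null hypothesis.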

3. ANALYSIS OF VARIANCE – ANOVA (18 POINTS)

You are still using GSS 2014 in this section. You will need to run a “One-Way

ANOVA” in order to answer the following question. Use HRSRELAX in the

“dependent list” and CHILDSR in “factor.” Also, make sure you select

“Descriptive” from the “Options” menu as well as “Bonferroni” from the “Post

Hoc…” menu. (See SPSS assignment 10 covering ANOVA for illustration)

Does the number of children matter in terms of how many hours to relax

respondents have?

a. Run ANOVA (make sure the output is in the dropbox)

b. Address the 4 required assumptions (Step 1 of hypothesis testing). (4

points)

c. State the hypotheses. (2 points)

d. Provide the value of the F ratio (obtained) (1 point)

e. Provide probability of F ratio (obtained) or the P-value of F. (1 point)

f. Indicate from your output how the MSW and MSB were calculated. Hint:

use the actual values from the output. (4 points)

MSW = _____ / _____.

MSB = _____ /___.

g. Use your results from the Bonferroni post hoc test in order to interpret

the mean difference between each pair of categorical groups. You

should describe the mean differences regarding statistical significance.

For each comparison you need to provide the value for the mean

difference as well as information about being statistically significant at .05.

Hint: Look for stars. For each two groups comparison, make a statement

about the difference in means between these groups. For instance, “there

is no statistically significant mean difference between democrats and

republicans.” If you have a total of 4 categories, you need to make 6

comparative statements (group 1 with group 2, group 1 with group 3,

group 1 with group 4, group 2 with group 3, group 2 with group 4, group 3

with group 4). (6 points)
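The arithmetic behind step f and the count of comparisons in step g can be sketched as follows. The sums of squares and degrees of freedom are hypothetical placeholders, not values from the assignment output; in your answer they come from the "Sum of Squares" and "df" columns of the SPSS ANOVA table. The group labels are the CHILDSR value labels defined in section 1.

```python
from itertools import combinations

# Hypothetical ANOVA table entries (NOT from the assignment output).
ss_between, df_between = 120.0, 3      # between-groups sum of squares and df
ss_within, df_within = 4000.0, 1000    # within-groups sum of squares and df


def mean_square(sum_of_squares, degrees_of_freedom):
    """A mean square is a sum of squares divided by its degrees of freedom."""
    return sum_of_squares / degrees_of_freedom


msb = mean_square(ss_between, df_between)   # mean square between
msw = mean_square(ss_within, df_within)     # mean square within
f_ratio = msb / msw                         # F is MSB divided by MSW

# With 4 groups there are 4 * 3 / 2 = 6 pairwise comparisons to describe.
groups = ["no children", "1 child", "2 children", "3 children or more"]
pairs = list(combinations(groups, 2))

print("MSB:", msb, "MSW:", msw, "F:", f_ratio)
for first, second in pairs:
    print(f"Compare: {first} vs. {second}")
```

Each printed pair corresponds to one comparative statement in step g.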

4. CORRELATION AND COEFFICIENT OF DETERMINATION (5 POINTS)

Now using HRS1 and HRSRELAX, what is the relationship between the number of

hours one works and number of hours to relax? Review chapter 13 for this section.

a. Run a bivariate correlation between those two variables.

b. Provide the value for Pearson’s r. (1 point)

c. Provide the significance level for Pearson’s r (P-value). Is r statistically

significant? At what alpha level? (0.5 point)

d. Interpret (based on r) the strength of the relationship, the direction of the

relationship and pattern between those two variables (Greater the number

of working hours per week…). (1.5 points)

e. Calculate r² based on r and interpret r² as a PRE measure. (Hint: how

much is explained by the variance in …) (2 points).
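To illustrate how r² follows from r, here is a small sketch that computes Pearson's r by hand for five made-up pairs of values and then squares it. The data are hypothetical and are not GSS 2014 observations; the function name `pearson_r` is illustrative. In the assignment, r comes from the SPSS bivariate correlation output.

```python
import math

# Hypothetical paired observations (NOT GSS 2014 data).
hours_worked = [20, 30, 40, 50, 60]
hours_relax = [5, 4, 4, 2, 1]


def pearson_r(x, y):
    """Pearson's r: the co-variation of x and y over their joint spread."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    co_variation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return co_variation / math.sqrt(ss_x * ss_y)


r = pearson_r(hours_worked, hours_relax)
r_squared = r ** 2

# The sign of r gives the direction; r squared gives the share of variance
# in one variable that is explained by the other (the PRE interpretation).
direction = "negative" if r < 0 else "positive"
print(f"r = {r:.3f} ({direction}), r squared = {r_squared:.3f}")
print(f"Variance explained: {100 * r_squared:.1f}%")
```

Note that r² is always positive, so the direction of the relationship has to be read from r itself.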

5. NOMINAL AND ORDINAL BIVARIATE RELATIONSHIPS (31 POINTS)

In this section, you need to use the SCSU Fall 2018 data file. Make sure you have

dealt with missing values prior to exploring those relationships.

Explore the following relationships.

1. Does being a democrat or a republican matter in terms of agreeing or

disagreeing with the statement “Muslim immigrants are more difficult to

integrate into American society because of their religion”? Hint: you need

to create a new variable called PARTYR recoding the Q5PARTY1

variable and setting as system missing (SYSMIS) any value that is NOT

either democrat or republican (values 3-9). You should then end up with a

binary variable (dichotomous variable). You will also need to use the

dependent variable “Q28IMM_MUS1.” Make sure you set the missing

values so that you end up with only 5 categories (strongly agree,

somewhat agree, neutral, somewhat disagree, and strongly disagree).

Remember that you can set the missing values on the variable view

window in the “missing” cells for that variable. (15.5 points)

2. Does the rating of Trump’s overall performance matter in predicting

attitudes towards the statement “Muslim immigrants are more difficult to

integrate into American society because of their religion”? Use the

variable “Q28IMM_MUS1” as previously used (only 5 categories) as the

dependent variable. Use the variable “Q16JOB_TRUMP” after making

sure that you only keep 5 categories (Excellent, Pretty good, Only fair,

Poor, Very poor). (15.5 points)

For each of them (1. and 2.), you need to do the following:

k. Run appropriate bivariate tables (column percentages). Hint: make

sure you select the appropriate variable for column or row and that you

check “column” percentage in the “cells” option.

l. Interpret percent values in each cell (you need to write sentences that include

percentages for each cell). Remember to compare percentages across

after you obtain them within each column. (2 points)

m. State the hypotheses for Chi-square. (1 point)

n. Compute Chi-square and provide the value of Chi-square. (1 point)

o. What is the p-value (sig.) for Chi-square? State the alpha level for Chi-square. Is it statistically significant? If so, at .05? at .01? at .001? (2

points)

p. Make a statement about your decision to reject/fail to reject the null

hypothesis, (1 point)

q. Interpret the results from chi-square (be specific about the variables). Is

there a relationship between those variables? (1 point)

r. Compute the appropriate measure of association (pay attention to level of

measurement of each variable). Use Cramer’s V and Lambda for nominal

variables. (2 points) Use Gamma for ordinal variables. (1 point)

s. State the alpha level (p-value) associated with that measure of

association. (2 points)

t. Describe the:

i. Strength of the relationship (weak, moderate, strong) while

providing the value of the measure of association, (1 point)

ii. Pattern of the relationship (Example: Democrat respondents are

more likely to… than…) reading the bivariate table’s column

percentages across, (1 point)

iii. Direction of the relationship when appropriate (positive/negative)

based on your measure of association. (1 point)

u. Use the measure of association that is appropriate for PRE and provide

the PRE value in % in a sentence interpreting the bivariate relationship.

(1.5 points)
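The sketch below shows, for a made-up 2 x 2 table, how column percentages, Pearson's Chi-square, and Cramer's V are obtained. The counts and the row and column meanings are hypothetical and do not come from the SCSU survey. The p-value is not computed here; it is assumed to be read from the SPSS output. Lambda and Gamma are also left to SPSS.

```python
import math

# Hypothetical table of counts (NOT from the SCSU survey).
# Rows = categories of the dependent variable,
# columns = groups of the independent variable.
table = [
    [30, 10],   # row 1, e.g. "agree"
    [20, 40],   # row 2, e.g. "disagree"
]


def column_percentages(counts):
    """Each cell as a percentage of its column total."""
    col_totals = [sum(col) for col in zip(*counts)]
    return [
        [100 * cell / col_totals[j] for j, cell in enumerate(row)]
        for row in counts
    ]


def chi_square(counts):
    """Pearson's Chi-square (no continuity correction) and the table N."""
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    return chi2, n


def cramers_v(counts):
    """Cramer's V: sqrt(Chi-square / (N * (smaller of rows, columns - 1)))."""
    chi2, n = chi_square(counts)
    k = min(len(counts), len(counts[0])) - 1
    return math.sqrt(chi2 / (n * k))


print("Column %:", column_percentages(table))
print("Chi-square:", round(chi_square(table)[0], 3))
print("Cramer's V:", round(cramers_v(table), 3))
```

Reading the column percentages across each row is what gives you the pattern of the relationship described in step t.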

Rating levels: Capstone (4); Milestones (3, 2, 1)

Interpretation

Ability to explain information presented in mathematical forms (e.g., equations, graphs, diagrams, tables, words)

4: Provides accurate explanations of information presented in mathematical forms. Makes appropriate inferences based on that information. For example, accurately explains the trend data shown in a graph and makes reasonable predictions regarding what the data suggest about future events.

3: Provides accurate explanations of information presented in mathematical forms. For instance, accurately explains the trend data shown in a graph.

2: Provides somewhat accurate explanations of information presented in mathematical forms, but occasionally makes minor errors related to computations or units. For instance, accurately explains trend data shown in a graph, but may miscalculate the slope of the trend line.

1: Attempts to explain information presented in mathematical forms, but draws incorrect conclusions about what the information means. For example, attempts to explain the trend data shown in a graph, but will frequently misinterpret the nature of that trend, perhaps by confusing positive and negative trends.

Representation

Ability to convert relevant information into various mathematical forms (e.g., equations, graphs, diagrams, tables, words)

4: Skillfully converts relevant information into an insightful mathematical portrayal in a way that contributes to a further or deeper understanding.

3: Competently converts relevant information into an appropriate and desired mathematical portrayal.

2: Completes conversion of information but resulting mathematical portrayal is only partially appropriate or accurate.

1: Completes conversion of information but resulting mathematical portrayal is inappropriate or inaccurate.

Calculation

4: Calculations attempted are essentially all successful and sufficiently comprehensive to solve the problem. Calculations are also presented elegantly (clearly, concisely, etc.)

3: Calculations attempted are essentially all successful and sufficiently comprehensive to solve the problem.

2: Calculations attempted are either unsuccessful or represent only a portion of the calculations required to comprehensively solve the problem.

1: Calculations are attempted but are both unsuccessful and are not comprehensive.

Application / Analysis

Ability to make judgments and draw appropriate conclusions based on the quantitative analysis of data, while recognizing the limits of this analysis

4: Uses the quantitative analysis of data as the basis for deep and thoughtful judgments, drawing insightful, carefully qualified conclusions from this work.

3: Uses the quantitative analysis of data as the basis for competent judgments, drawing reasonable and appropriately qualified conclusions from this work.

2: Uses the quantitative analysis of data as the basis for workmanlike (without inspiration or nuance, ordinary) judgments, drawing plausible conclusions from this work.

1: Uses the quantitative analysis of data as the basis for tentative, basic judgments, although is hesitant or uncertain about drawing conclusions from this work.

Assumptions

Ability to make and evaluate important assumptions in estimation, modeling, and data analysis

4: Explicitly describes assumptions and provides compelling rationale for why each assumption is appropriate. Shows awareness that confidence in final conclusions is limited by the accuracy of the assumptions.

3: Explicitly describes assumptions and provides compelling rationale for why assumptions are appropriate.

2: Explicitly describes assumptions.

1: Attempts to describe assumptions.

Communication

Expressing quantitative evidence in support of the argument or purpose of the work (in terms of what evidence is used and how it is formatted, presented, and contextualized)

4: Uses quantitative information in connection with the argument or purpose of the work, presents it in an effective format, and explicates it with consistently high quality.

3: Uses quantitative information in connection with the argument or purpose of the work, though data may be presented in a less than completely effective format or some parts of the explication may be uneven.

2: Uses quantitative information, but does not effectively connect it to the argument or purpose of the work.

1: Presents an argument for which quantitative evidence is pertinent, but does not provide adequate explicit numerical support. (May use quasi-quantitative words such as “many,” “few,” “increasing,” “small,” and the like in place of actual quantities.)

