Descriptive Statistics Data Analysis Plan

Identifying Information

Student (Full Name):

Class:

Instructor:

Date:

Scenario: Please write a few lines describing your scenario and the four variables (in addition to income)

you have selected.

Use Table 1 to report the variables selected for this assignment. Note: The information for the required

variable, “Income,” has already been completed and can be used as a guide for completing information

on the remaining variables.

Table 1. Variables Selected for the Analysis

Variable Name in the Data Set | Description (see the data dictionary for describing the variables) | Type of Variable (Qualitative or Quantitative)

Variable 1: “Income” | Annual household income in USD | Quantitative

Variable 2: “Marital status” | Married or not married | Qualitative

Variable 3: “Housing” | Annual expenditure on housing in USD | Quantitative

Variable 4: “Transport” | Annual expenditure on transport in USD | Quantitative

Variable 5: “Food” | Annual expenditure on food in USD | Quantitative

Reason(s) for Selecting the Variables and Expected Outcome(s):

1. Variable 1: “Income” - This variable gives the annual household income. It is important because the other variables, such as expenditures on food and housing, are paid out of income. The mean annual household income is expected to be greater than the mean annual household total expenditure.

2. Variable 2: “Marital status” - This variable records the marital status of the households in the data set. Households with married people are expected to have higher expenditures than households without.

3. Variable 3: “Housing” - This variable is used to analyze housing expenditures in the households. Housing is a major household expense, so annual expenditure on housing is expected to be among the largest expenditure items.

4. Variable 4: “Transport” - This variable is used to analyze the total annual household expenditure on transport. It is expected to be less than the annual household income but still substantial.

5. Variable 5: “Food” - Food is a major household expense, so it is necessary to analyze it. It is expected to take up a significant portion of total expenditures.

Data Set Description:

“Income” is a quantitative variable that records annual household income in USD. “Marital status” is a qualitative variable that records whether a person is married or not married. “Housing” is a quantitative variable that records annual household expenditure on housing in USD. “Transport” is a quantitative variable that records annual household expenditure on transport in USD. “Food” is a quantitative variable that records annual household expenditure on food in USD.

Proposed Data Analysis:

Measures of Central Tendency and Dispersion

Complete Table 2, Numerical Summaries of the Selected Variables, and briefly explain why you chose those measurements. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.

Table 2. Numerical Summaries of the Selected Variables

Variable Name | Measures of Central Tendency and Dispersion | Rationale for Why Appropriate

Variable 1: “Income”

Measures:
● Number of Observations
● Median
● Sample Standard Deviation

Rationale:
I am using median for two reasons:
1. If there are any outliers or the data is not normally distributed, the median is the best measure of central tendency.
2. The variable is quantitative.
I am using sample standard deviation for three reasons:
1. The data is a sample from a larger data set.
2. It is the most commonly used measure of dispersion.
3. The variable is quantitative.

Variable 2: “Marital status”

Measures: N/A

Rationale: Does not apply because it is a qualitative variable.

Variable 3: “Housing”

Measures:
● Mean
● Sample Standard Deviation
● Range

Rationale:
I am using mean because:
● It is one of the most popular measures of central tendency.
● The variable is quantitative.
I am using sample standard deviation because:
● It is a common measure of dispersion.
● The variable is quantitative.
I am using range because:
● It shows the gap between the highest and lowest values, so it reflects any outliers.
● It is a common measure of dispersion.

Variable 4: “Transport”

Measures:
● Median
● Sample Standard Deviation

Rationale:
I am using median because:
● It is a useful measure of central tendency when the data is skewed.
● The variable is quantitative.
I am using sample standard deviation because:
● It is a common measure of dispersion.
● The variable is quantitative.

Variable 5: “Food”

Measures:
● Mean
● Range

Rationale:
I am using mean because:
● It is one of the most popular measures of central tendency.
● The variable is quantitative.
I am using range because:
● It reflects the outliers, as I want to know the difference between the highest and lowest household expenditures on food.
● It is a common measure of dispersion.
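The measures listed in Table 2 could be computed along the following lines. This is only a minimal sketch using Python's standard `statistics` module; the `summarize` helper is a hypothetical name, and the values in `sample` are made-up placeholders, not figures from the assignment data set.

```python
import statistics

def summarize(values):
    """Return the Table 2 measures for one quantitative variable."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # statistics.stdev divides by n - 1, i.e. the sample standard deviation.
        "sample_sd": statistics.stdev(values),
        "range": max(values) - min(values),
    }

# Placeholder values only (hypothetical annual food expenditures in USD).
sample = [5200, 6100, 5800, 7400, 6900]

summary = summarize(sample)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In practice only the measures chosen for each variable would be reported, for example the median and sample standard deviation for “Income”.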

Graphs and/or Tables

Complete Table 3, Type of Graphs and/or Tables for Selected Variables, and briefly explain why you chose those graphs and/or tables. Note: The information for the required variable, “Income,” has already been completed and can be used as a guide for completing information on the remaining variables.

Table 3. Type of Graphs and/or Tables for Selected Variables

Variable Name | Graph and/or Table | Rationale for Why Appropriate

Variable 1: “Income” | Graph: Histogram | I will use a histogram to show the distribution of the data and to check whether it is approximately normal. A histogram is one of the best plots for showing the distribution of quantitative data.

Variable 2: “Marital status” | Table: Frequency distribution table | It will be useful in establishing the frequencies of people who are married and those who are not.

Variable 3: “Housing” | Graph: Histogram | It is useful for displaying the shape of the distribution of a quantitative data set.

Variable 4: “Transport” | Graph: Histogram | It is useful for displaying the shape of the distribution of a quantitative data set.

Variable 5: “Food” | Graph: Histogram | It is useful for displaying the shape of the distribution of a quantitative data set.
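The frequency distribution table and the histograms planned in Table 3 could be prepared roughly as follows. This is a sketch under stated assumptions: `frequency_table` and `histogram_counts` are hypothetical helper names, the bin width of 5000 USD is an arbitrary choice, and all data values are made-up placeholders rather than values from the assignment data set. The histogram is rendered as text so that the example needs only the standard library.

```python
from collections import Counter

def frequency_table(categories):
    """Count and relative frequency for each category of a qualitative variable."""
    counts = Counter(categories)
    total = len(categories)
    return {cat: (n, n / total) for cat, n in counts.items()}

def histogram_counts(values, bin_width):
    """Count observations per equal-width bin, keyed by each bin's lower edge."""
    # Align the first bin to a multiple of bin_width at or below the minimum.
    start = min(values) // bin_width * bin_width
    counts = Counter(start + (v - start) // bin_width * bin_width for v in values)
    return dict(sorted(counts.items()))

# Placeholder data only (hypothetical, not from the assignment data set).
marital_status = ["Married", "Not married", "Married", "Married", "Not married"]
housing = [18000, 21500, 19900, 26000, 23400]

freq = frequency_table(marital_status)
for category, (count, share) in freq.items():
    print(f"{category}: {count} ({share:.0%})")

bins = histogram_counts(housing, bin_width=5000)
for lower_edge, count in bins.items():
    # Text rendering of a histogram: one '#' per observation in the bin.
    print(f"{lower_edge}-{lower_edge + 5000}: {'#' * count}")
```

Each bin covers values from its lower edge up to, but not including, the next edge. A different bin width would change the apparent shape of the distribution, so it is worth trying more than one.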