Data Science Logistic Regression
Objective: Use logistic regression to model a binary outcome.
Given the prostate cancer dataset, in which biopsy results are given for 97 men:
• You are to predict tumor spread in this dataset of 97 men who had undergone a biopsy.
• The measures to be used for prediction are: age, lbph, lcp, gleason, and lpsa.
This implies that the binary dependent variable, lcavol, will be the outcome variable.
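For reference, the model implied by these predictors can be written as below. The symbols are notation introduced here, not taken from the assignment: p is the probability that the binary outcome equals 1, and the β terms are the coefficients to be estimated.

```latex
\log\!\left(\frac{p}{1-p}\right)
  = \beta_0 + \beta_1\,\mathrm{age} + \beta_2\,\mathrm{lbph}
  + \beta_3\,\mathrm{lcp} + \beta_4\,\mathrm{gleason} + \beta_5\,\mathrm{lpsa}
```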
We start by loading the required R packages (ROCR, ggplot2, and aod) as follows:
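A minimal sketch of the loading step, assuming the packages may not all be installed yet. The names pkgs and installed are illustrative; a plain library(ROCR), library(ggplot2), library(aod) would do the same job when all three are present.

```r
# Packages named in the text; loading is skipped for any that are not installed.
pkgs <- c("ROCR", "ggplot2", "aod")
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)

# Tell the user which packages still need install.packages().
if (!all(installed)) {
  message("Install first: install.packages(c(",
          paste0('"', pkgs[!installed], '"', collapse = ", "), "))")
}

# Attach every package that is available.
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```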
Next, we load the CSV file and check its structure as follows:
> setwd("C:/RData")  # your working directory
> tumor <- read.csv("prostate.csv")  # loading the file
> str(tumor)  # check the properties of the file
. . . continue from here!
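One possible way to continue is sketched below using base R only. It rests on several assumptions: the data frame is a simulated stand-in with the same column names (so the block runs without prostate.csv), lcavol is treated as a continuous value that must be cut into 0/1, and the median cut-off and the name high_lcavol are illustrative choices rather than part of the assignment. If lcavol in your file is already coded 0/1, use it directly as the outcome.

```r
# Simulated stand-in for prostate.csv (arbitrary values, hypothetical data);
# with the real file, skip this part and use the `tumor` object from read.csv().
set.seed(1)
n <- 97
tumor <- data.frame(
  age     = round(rnorm(n, mean = 64, sd = 7)),
  lbph    = rnorm(n),
  lcp     = rnorm(n),
  gleason = sample(6:9, n, replace = TRUE),
  lpsa    = rnorm(n, mean = 2.5, sd = 1)
)
# Simulated lcavol that depends on lcp and lpsa so the fit has some signal.
tumor$lcavol <- 0.5 * tumor$lcp + 0.6 * tumor$lpsa + rnorm(n, sd = 0.7)

# Assumption: cut continuous lcavol at its median to obtain a 0/1 outcome.
tumor$high_lcavol <- as.integer(tumor$lcavol > median(tumor$lcavol))

# Logistic regression on the five predictors named in the text.
fit <- glm(high_lcavol ~ age + lbph + lcp + gleason + lpsa,
           data = tumor, family = binomial)
print(summary(fit))

# Fitted probabilities and a confusion table at a 0.5 cut-off.
prob <- predict(fit, type = "response")
pred_class <- as.integer(prob > 0.5)
print(table(observed = tumor$high_lcavol, predicted = pred_class))
```

From here, the ROCR and aod packages loaded earlier can be used to assess the fit, for example with an ROC curve and a Wald test on the coefficients.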
Reference
R Documentation (2016). Prostate cancer data. Retrieved from http://rafalab.github.io/pages/649/prostate.html