Which titles matter for predicting survivability for the Titanic dataset

This code snippet shows which variables matter for predicting survival (Survived) in the Titanic dataset.

Reading the Titanic dataset and converting the Survived variable into a factor:

1. First, read the Titanic file. Make sure the working directory is set to the folder that contains titanic.csv.
2. Convert the Survived variable into a factor column.

titanic <- read.csv(
  file = "titanic.csv",
  stringsAsFactors = FALSE
)     #(1)

titanic$Survived <- as.factor(titanic$Survived)      #(2)
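The conversion matters because randomForest fits a classification model when the response is a factor and a regression model when it is numeric. A quick way to confirm the conversion worked is sketched below on a small made-up data frame; `demo` and its rows are hypothetical stand-ins, not taken from titanic.csv.

```r
# Hypothetical three-row stand-in for the real data
demo <- data.frame(Survived = c(0, 1, 1), Age = c(22, 38, NA))
demo$Survived <- as.factor(demo$Survived)

str(demo$Survived)      # structure: should report a factor with 2 levels
levels(demo$Survived)   # the class labels the model will predict
table(demo$Survived)    # number of rows in each class
```

The same three calls can be run on titanic$Survived after step (2).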

Discretizing categories:

Converting each value of Pclass into its own 0/1 indicator variable:

titanic$pclass_one <- 0
titanic$pclass_two <- 0
titanic$pclass_three <- 0
titanic[titanic$Pclass==1,"pclass_one"] <- 1
titanic[titanic$Pclass==2,"pclass_two"] <- 1
titanic[titanic$Pclass==3,"pclass_three"] <- 1

Converting each value of Embarked into its own 0/1 indicator variable:

titanic$embarked_q <- 0
titanic$embarked_s <- 0
titanic$embarked_c <- 0
titanic[titanic$Embarked=="Q","embarked_q"] <- 1
titanic[titanic$Embarked=="S","embarked_s"] <- 1
titanic[titanic$Embarked=="C","embarked_c"] <- 1

Converting the values of Sex into 0/1 indicator variables:

titanic$sex_m <- 0
titanic$sex_f <- 0
titanic[titanic$Sex=="male","sex_m"] <- 1
titanic[titanic$Sex=="female","sex_f"] <- 1
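The indicator columns above are built by hand, one assignment per category. As an alternative sketch (not the method used above), base R's model.matrix can generate the same kind of 0/1 columns from a factor. The `demo` data frame is a hypothetical stand-in, and the column names are set manually to match the ones used in this snippet.

```r
# Hypothetical four-row stand-in for the real data
demo <- data.frame(
  Pclass = c(1, 3, 2, 3),
  Sex    = c("male", "female", "female", "male")
)

# "- 1" removes the intercept so that every level gets its own column
pclass_dummies <- model.matrix(~ factor(Pclass) - 1, data = demo)
colnames(pclass_dummies) <- c("pclass_one", "pclass_two", "pclass_three")

# Attach the indicator columns to the data frame
demo <- cbind(demo, pclass_dummies)
print(demo)
```

Each row has exactly one 1 across the three pclass columns, which is the same encoding the manual assignments produce.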

Filling missing values of Age with a constant (28):

titanic[is.na(titanic$Age),"Age"] <- 28
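The value 28 is hard-coded above, and the snippet does not say where it comes from. If the intent is to fill gaps with a typical age, one option is to compute the median from the data instead of typing it in. The sketch below assumes that intent and uses a hypothetical `demo` vector of ages.

```r
# Hypothetical ages with two missing entries
demo <- data.frame(Age = c(22, NA, 35, 28, NA))

# Median of the observed ages, ignoring NA
age_median <- median(demo$Age, na.rm = TRUE)

# Replace only the missing entries
demo[is.na(demo$Age), "Age"] <- age_median
print(demo$Age)
```

Computing the value keeps the imputation in step with the data if the file changes.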

Building the random forest:

1. Install the randomForest package if you have not installed it yet.
2. Load the package.
3. Create the list of features for the random forest (including the target, Survived).
4. Create the random forest model.
5. Use varImpPlot to draw a dot chart of the variable importance calculated by the random forest.

install.packages("randomForest")     #(1)
library(randomForest)                #(2)
features <- c("Survived", "Age", "SibSp", "Parch", "Fare",
              "pclass_one", "pclass_two", "pclass_three",
              "embarked_q", "embarked_s", "embarked_c",
              "sex_m", "sex_f")      #(3)
titanic.forest <- randomForest(Survived ~ ., data = titanic[, features], importance = TRUE)    #(4)
varImpPlot(titanic.forest)           #(5)

Figure: varImportance (variable importance plot produced by varImpPlot)

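One way to analyze how much impact a feature has on the target (Survived) is to use the feature importance measures exposed by the machine learning library. Here, randomForest computes them because the model was built with importance=TRUE, and varImpPlot displays them.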
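Besides the plot, the same numbers can be read as a table with randomForest's importance() function, which returns a matrix with one row per feature. The sketch below sorts that matrix by one measure; `rank_features` is a hypothetical helper introduced here for illustration, and the model lookup runs only if titanic.forest from the code above exists in the session.

```r
# Hypothetical helper: sort an importance matrix by one of its columns
rank_features <- function(imp, measure = "MeanDecreaseAccuracy") {
  imp[order(imp[, measure], decreasing = TRUE), , drop = FALSE]
}

# Runs only when the model built earlier is available
if (exists("titanic.forest")) {
  imp <- randomForest::importance(titanic.forest)
  print(rank_features(imp))                       # by mean decrease in accuracy
  print(rank_features(imp, "MeanDecreaseGini"))   # by mean decrease in Gini
}
```

The two measures can rank features differently, so it is worth looking at both before drawing conclusions.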