How to submit predictions to Kaggle?

Code sample illustrating how to build a model, create predictions, and write out a CSV file suitable for submission to Kaggle.

Read in the Titanic training dataset:

NOTE - Set your working directory to the folder that contains train.csv and test.csv.
1. Read the train.csv file and set stringsAsFactors to FALSE.

titanic <- read.csv("train.csv", stringsAsFactors = FALSE)   #(1)
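
To see what stringsAsFactors = FALSE changes, here is a minimal sketch that reads a tiny inline CSV instead of train.csv. The three rows and the names csv.text, toy.chr and toy.fct are made up for illustration; they are not Titanic records.

```r
# Toy CSV text standing in for train.csv (hypothetical values).
csv.text <- "PassengerId,Survived,Sex
1,0,male
2,1,female
3,1,female"

# With stringsAsFactors = FALSE, text columns stay character vectors.
toy.chr <- read.csv(text = csv.text, stringsAsFactors = FALSE)

# With stringsAsFactors = TRUE, text columns become factors on read.
toy.fct <- read.csv(text = csv.text, stringsAsFactors = TRUE)

class(toy.chr$Sex)
class(toy.fct$Sex)
```

Since R 4.0.0, FALSE is already the default, so passing it explicitly mainly keeps the script behaving the same way on older R versions.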

Subset data for a simple model based on only Sex:

2. Select only the Survived and Sex columns:

titanic.simple <- titanic[, c("Survived", "Sex")]     #(2)
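
Before modelling, it can help to look at the survival rate per Sex, since that is the only signal this model will use. The sketch below does this on a small made-up data frame (toy, toy.simple and rates are illustrative names); with the real data, the same two calls would be applied to titanic.simple.

```r
# Toy stand-in for the titanic data frame (made-up values).
toy <- data.frame(
  Survived = c(1, 1, 1, 0, 0, 0, 0, 1),
  Sex      = c("female", "female", "female", "female",
               "male", "male", "male", "male"),
  Age      = c(22, 38, 26, 35, 54, 2, 27, 14)
)

# Keep only the two columns the simple model needs.
toy.simple <- toy[, c("Survived", "Sex")]

# Row-wise proportions: share of each Sex that did / did not survive.
rates <- prop.table(table(toy.simple$Sex, toy.simple$Survived), margin = 1)
rates
```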

Set up factors (categorical variables):

3. Convert the Survived variable to a factor.
4. Convert the Sex variable to a factor as well.

titanic.simple$Survived <- as.factor(titanic.simple$Survived)   #(3)
titanic.simple$Sex <- as.factor(titanic.simple$Sex)     #(4)

Build an rpart decision tree:

5. Install the rpart.plot package if it is not already installed.
6. Load the rpart library.
7. Load the rpart.plot library.

if (!requireNamespace("rpart.plot", quietly = TRUE)) install.packages("rpart.plot")    #(5)
library(rpart)    #(6)
library(rpart.plot)    #(7)

Set a seed so that everyone gets the same model, then train:

8. Set a seed for reproducibility.
9. Create the simple machine learning model with rpart.

set.seed(4786)     #(8)
simple.tree <- rpart(Survived ~ ., data = titanic.simple)     #(9)

10. Make a pretty plot of the tree.

prp(simple.tree)     #(10) 
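
Besides plotting, a quick sanity check is to print the tree and measure accuracy on the training rows. The sketch below does this on made-up data (toy.train, toy.tree, toy.fit and train.acc are illustrative names); training accuracy is optimistic and is not the score Kaggle will report.

```r
library(rpart)  # recommended package that ships with standard R installs

# Toy training data (made-up): 20 women, 16 survive; 20 men, 4 survive.
toy.train <- data.frame(
  Survived = factor(c(rep(1, 16), rep(0, 4), rep(1, 4), rep(0, 16))),
  Sex      = factor(c(rep("female", 20), rep("male", 20)))
)

set.seed(4786)
toy.tree <- rpart(Survived ~ ., data = toy.train)

# Text view of the splits, useful when no plotting device is available.
print(toy.tree)

# Predict on the training rows and compare with the known labels.
toy.fit <- predict(toy.tree, toy.train, type = "class")
confusion <- table(actual = toy.train$Survived, predicted = toy.fit)
train.acc <- sum(diag(confusion)) / sum(confusion)
train.acc
```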

Working with the test data:

11. Read the test.csv file with stringsAsFactors set to FALSE.
12. Convert the Sex variable to a factor.

titanic.test <- read.csv("test.csv", stringsAsFactors = FALSE)   #(11)
titanic.test$Sex <- as.factor(titanic.test$Sex)      #(12)
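
The test factor should use the same levels as the training factor, otherwise predict may complain or mis-code values. as.factor on both files works here because both contain "female" and "male"; a more defensive variant, sketched below on made-up vectors with illustrative names, reuses the training levels explicitly.

```r
# Toy training and test columns (made-up values).
toy.train.sex    <- factor(c("female", "male", "male", "female"))
toy.test.sex.raw <- c("male", "male", "female")

# Reuse the training levels so both factors are coded identically.
toy.test.sex <- factor(toy.test.sex.raw, levels = levels(toy.train.sex))

# Any value not seen in training becomes NA and is worth checking for.
identical(levels(toy.test.sex), levels(toy.train.sex))
anyNA(toy.test.sex)
```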

Create predictions:

13. Create predictions using the predict function.

preds <- predict(simple.tree, titanic.test, type = "class")     #(13)

Preparing for submission:

14. Create a data frame for submission.
15. Write out a .CSV file suitable for Kaggle submission.

submission <- data.frame(PassengerId = titanic.test$PassengerId,
                         Survived = preds)   #(14)
write.csv(submission, file = "MySubmission.csv", row.names = FALSE)   #(15)
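
Before uploading, it is worth reading the file back and checking its shape. The sketch below is one way to do that, using a made-up five-row submission and a temporary file (toy.submission, out.file and check are illustrative names); for the real file, the same checks would be run against MySubmission.csv.

```r
# Toy predictions standing in for the real test set (made-up values).
toy.submission <- data.frame(
  PassengerId = 892:896,
  Survived    = factor(c(0, 1, 0, 0, 1))
)

# Write to a temporary file so nothing in the working directory is touched.
out.file <- tempfile(fileext = ".csv")
write.csv(toy.submission, file = out.file, row.names = FALSE)

# Read the file back and check the columns, row count, and values.
check <- read.csv(out.file)
stopifnot(
  identical(names(check), c("PassengerId", "Survived")),
  nrow(check) == nrow(toy.submission),
  !anyNA(check$Survived),
  all(check$Survived %in% c(0, 1))
)
```

For the Titanic competition the real file should contain one row per passenger in test.csv, so comparing nrow(check) with nrow(titanic.test) is a reasonable final check.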