Introduction
As automated systems replace traditional business processes, a lot of data is being generated. Companies all around the world, from publishing houses to FMCG firms to health care providers, are trying to unlock the value of that data. The objective is to gain insights from larger and more complicated data sets, and to use those insights to make business decisions and stay competitive. As a result, Data Science is becoming one of the most sought-after fields for young professionals all over the world. We can see this passion and interest in the wide variety of people coming to Data Science from different backgrounds. Interest in the field is growing rapidly, and one of the reasons is that the most successful companies are actively looking to hire Data Scientists or are expanding current job roles to include aspects of this field.
The Ask Me Anything: Data Science Interviews webinar by Data Science Dojo was a great opportunity to learn about Data Science career options, foundations and job roles. The webinar also included a Q/A session in which data science enthusiasts asked the experts many questions about Data Science interviews and careers.
Data Science Dojo is one of the leading Data Science training organizations. It provides short-duration, in-person, hands-on data science training that can kick-start your career. Data Science Dojo has trained over 4000 working professionals from 1000 companies globally. These include companies from the banking sector to biotechnology, and from IT to social media giants.
Data Science Dojo holds its bootcamp all over the world. The course is basic to intermediate level and includes Fundamentals of Data Science, Machine Learning, Data Visualization, Data Exploration, Feature Engineering with R, Predictive Analytics, Classification models and much more. It offers pre-bootcamp content to get you prepared, and post-bootcamp content is available online so attendees can keep practicing and implementing what they learned at the bootcamp.
The webinar was presented by Rebecca Merrett, Data Scientist and Lead Instructor at Data Science Dojo. Rebecca holds a bachelor’s degree in information and media from the University of Technology Sydney, and a postgraduate diploma in mathematics and statistics from the University of Southern Queensland. She has a background in technical writing for games development and has written for tech publications. She has also worked on many consulting projects for Data Science Dojo's clients.
Co-hosting the webinar was Data Scientist and Instructor Tarun Shrivas, a seasoned professional in marketing research and analytics with more than a decade of experience in analytics on political campaigns, consumer behavior and the CPG industry. Tarun holds a master’s degree in business analytics from Seattle University (Seattle, WA) and a bachelor’s degree in electrical engineering from Jamia Millia Islamia (New Delhi).
The webinar started with a presentation on how best to prepare yourself for the Data Science industry. The discussion covered types of Data Scientists, types of interviews, the roles available, tools commonly used by Data Scientists, and building your portfolio. The presentation was followed by live questions from the audience.
Passion for Data Science
One thing that was evident from the webinar was the overwhelming response from the audience. More than 100 attendees joined the live webinar and stayed for around 90 minutes as the experts talked about Data Science careers and answered questions. The audience seemed genuinely passionate about Data Science and about building their careers in this field.
Entering the Field of Data Science
Data Science is a technical field, so there are prerequisites for building a successful career in it. Rebecca and Tarun talked about the foundation needed to pursue a career in this field, which should include a good understanding of mathematics and statistics concepts. Data Science Dojo believes in data science for everyone, so the experts strongly encouraged attendees to join the field even if they come from a completely different background.
Types of Interviews and Expectations from Candidates
During the presentation, a lot of focus was given to helping candidates understand the types of interview questions, the types of roles in data science, and how they can best capitalize on their skill set.
Interviewers expect candidates to know database tools and to have the skills to read, retrieve and make sense of the available data. This means a working knowledge of SQL queries to retrieve data, a scripting language or tool to analyze and model data, and familiarity with big data infrastructure tools and applications.
Data Scientist roles also require candidates to have a good foundation in mathematics and statistics. A few of the topics discussed in the webinar were:
1. Conditional probability
2. Bayes theorem
3. Normal and Binomial distribution
4. Central limit theorem
5. Linear Regression
These subjects are core to the field of data science. If you’re applying for a role involving product management and analytics, then you must demonstrate knowledge of A/B testing as well.
Types of Roles Available
Data Science is a vast field, so there is a variety of roles available. The following are the three main types of roles in the field of Data Science.
The ‘all-rounder’ Data Scientist
A Data Scientist is expected to build predictive models, which includes processing and cleaning data, isolating key features and collecting new features. Data Scientists should be familiar with big data and machine learning concepts and should be able to drive business decisions.
The ‘business facing’ Data Analyst
A Data Analyst is expected to visualize and segment data in a way that helps a business gather actionable insights. The Data Analyst uses the data to analyze a key problem, opportunity or trend that can inform decision making. This might include a product management role, driving the product forward with better insights and better strategy. Data Analysts should be able to transform and manipulate large data sets, produce data visualizations and track web analytics.
The ‘geeky’ Data Engineer
A Data Engineer is dedicated to deploying analytic solutions in the real world through front-end applications. A Data Engineer should be able to set up infrastructure for large data sets and should have strong software engineering skills.
Example questions
Some example interview questions were shared to give candidates an idea of what to expect and how best to prepare for the interview.
Math & Stats
Example Question: Students’ academic scores follow a normal distribution with a mean of 18 and a standard deviation of 6. What proportion of students have scored between 18 and 24?
To solve this, you should be familiar with statistics basics such as the z-score for a normal distribution, which measures how far a value lies from the mean in units of standard deviation.
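As a rough sketch of the calculation, the question above can be worked with the standard normal cumulative distribution function. The snippet uses only the Python standard library; the helper name normal_cdf is an illustrative choice and was not part of the webinar.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative probability P(X <= x) for a normal distribution."""
    # z-score: distance from the mean measured in standard deviations
    z = (x - mean) / sd
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


mean, sd = 18, 6

# P(18 < X < 24) is the area between z = 0 and z = 1
proportion = normal_cdf(24, mean, sd) - normal_cdf(18, mean, sd)
print(f"Proportion scoring between 18 and 24: {proportion:.4f}")
```

Because 24 is exactly one standard deviation above the mean, the answer is the familiar figure of roughly 34% of students.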
Product & Metrics
A company has created a web page to promote a product and encourage signups. One version of the page includes the text “Find out more”; the other version, “Learn more about us!”. Before going ahead with “Learn more about us!”, what action would you take to ensure this is the right choice in terms of user signups?
This is a typical A/B testing question. To answer it, you would conduct an A/B test with both versions of the page. One audience group would be exposed to version 1 and another to version 2, so that we can ascertain which version of the page leads to more signups.
We can also conduct a t-test to see whether there is a real difference in the number of signups between the two page versions.
The important thing is to keep your end goal in mind. If the end goal is the number of signups, then you would prefer the version that leads to more signups, even if that page does not get a lot of traffic.
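A minimal sketch of such a significance check follows. Note that it uses a two-proportion z-test rather than the t-test mentioned in the webinar, because signup rates are proportions and this version needs only the Python standard library. The function name and the visitor and signup counts are hypothetical and chosen purely for illustration.

```python
import math


def two_proportion_z_test(signups_a, visitors_a, signups_b, visitors_b):
    """Return (z, two-sided p-value) for the difference in signup rates."""
    rate_a = signups_a / visitors_a
    rate_b = signups_b / visitors_b
    # Pooled signup rate under the null hypothesis of no difference
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


# Hypothetical counts: version 1 ("Find out more") vs version 2 ("Learn more about us!")
z, p_value = two_proportion_z_test(120, 1000, 150, 1000)
print(f"z = {z:.2f}, p-value = {p_value:.3f}")
```

A small p-value suggests the difference in signup rates is unlikely to be due to chance alone; the threshold to use (0.05 is a common convention) is a decision to make before running the test.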
The tools most commonly used by data scientists were discussed so that the audience could become familiar with them and build their portfolios.
Here is the list of the most commonly used tools:
• R
• Python
• Apache Hadoop
• MapReduce
• NoSQL Databases
• Cloud Computing
• D3
• Apache Pig
• Tableau
• IPython Notebooks
• GitHub
R and Python have the most extensive sets of libraries and tools to help with and automate everyday tasks. As a Data Engineer, you’re more likely to work with Hadoop, MapReduce and Spark. Analysts are more likely to use interactive data visualization tools such as Tableau.
Resume Tips
Your resume is often the first impression a potential employer receives, so it is very important to design it carefully. Resume structure and design were discussed in detail in the webinar.
Structure
You should highlight your strongest selling points. This could be one of your interesting projects that is relevant to the employer. Organizing the resume well is important for communicating your strongest selling points and most relevant content.
Design
Keep the resume interesting and to the point. Avoid multiple pages and lengthy content. Your resume should include contact information and hyperlinks to your projects. It is a great idea to share links to your website, LinkedIn profile and other portfolio resources on your resume.
Experience section
If you have job experience, the important thing is to focus on the results you achieved rather than the actions you took. You want the hiring team to perceive you as results-driven. Be sure to list your experience in chronological order. Start bullets with action verbs whenever possible. Quantify or state the results of your actions where possible.
Include personal projects that demonstrate your practical skills and how they might help solve a real business problem the company is having. Make sure your most valuable and relevant projects are highlighted.
Make sure to highlight your business acumen along with your data science skills. Even as a data scientist, you should have great communication and interpersonal skills and a good understanding of other business functions. You want the interviewer to perceive you as a well-rounded candidate.
Customize your resume based on the company and the type of role you’re applying for. Carefully studying every role you apply for gives you an opportunity to customize your resume and present the most relevant and impactful content.
What Not to Do in an Interview
Rebecca explained what not to do in an interview, drawing on her own experience interviewing data science candidates.
The most important thing is to provide clear examples of your experience with data and statistical analysis; without them, your chances of landing the job might be affected. You should be able to give clear examples of each component of a project you worked on: the specific problem you solved, the outcomes of your effort and the other activities you were involved in.
Data Scientists are sometimes perceived as not actively contributing towards a business goal, as the majority of their role is focused on research and analytics. It is therefore very important to have business acumen and to think in terms of creating value for the business with your data analysis.
Having a purely academic or research background with no practical experience could backfire in some scenarios. It is very important to have some projects, internships or work experience.
Not asking the right questions is another pitfall. When a candidate asks the right questions, it shows their interest in the company and the field. Proper research about the company and the role is therefore important for coming up with the right questions to ask.
It's important to have good communication skills and be presentable. Soft skills can make or break a candidate. You need to position yourself as an approachable person; being too serious or distant might affect your chances of landing the job. Even as a Data Scientist, you should be interested in other business functions so that you’re perceived as an adaptable all-rounder.
Talking in clichés is not a good idea. Use the interview as an opportunity to differentiate yourself; relying on clichés may hinder your chances of standing out from the crowd.
Most of these tips apply to candidates for other roles as well. Knowing the company, being practical, building your project portfolio and improving your communication skills are helpful for candidates applying for a wide variety of roles.
Q/A Session
Some questions were posted by users before the webinar began, and some live questions were also answered. Attendees seemed very interested in the education and foundation required for Data Science and in how to enter the field as a fresh graduate with limited experience. Here are some of the questions from the webinar, along with the responses from Rebecca and Tarun.
Q: How should I handle LinkedIn invitations from strangers, and how should I respond to a recruiter reaching out?
The best way to respond to recruiters is to take time composing the reply. You want to present yourself as very interested in the company and its business. You also need to show appreciation for the fact that the recruiter is reaching out to you. You can talk about the company's products or services, a project it is working on or any new development that may require new hiring. Present yourself as a potential problem solver for its business problem.
Q: What are some of the important questions to ask during the data science interview?
You can ask what kind of data they are working with. The company could be working with highly problematic data, and that may be the reason it is hiring an expert. It could have a data modeling or even a data management problem. So it’s a good idea to find out what data problems they are facing; this will give you insight into your day-to-day activities and the job role.
Q: How do I answer “What are you expecting from this role?”
This question is another way of asking how the company fits into your overall career plan. Here you want to explain your current position: maybe you are just entering the field of data science, or switching careers or companies. You need to justify why you’re choosing that particular company and role.
Q: Would sharing new ideas about the company with the interviewers be a good sign?
It is definitely good to share your new ideas, but first make sure you completely understand the problem they are having. To propose a solution, you need a good understanding of the problem first.
Q: How can I answer a question about the most important metrics for an ad marketing campaign?
To answer these kinds of questions, it is important to have the end goal in mind. Ask yourself what the company is ultimately trying to achieve with these metrics. For example, if a company tracks the number of clicks on a webpage or an email without considering the end goal of signups, it won’t get a clear picture of the campaign’s success. If one page has a 60% click rate but a 0% signup rate while another has only a 20% click rate but a 90% signup rate, the latter would be considered more successful. So, when answering questions about marketing metrics, keep the end goal in mind.
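The comparison in this answer can be made concrete with a small calculation. The sketch below assumes that the signup rate is measured among visitors who clicked, so the share of all visitors who sign up is the product of the two rates; that interpretation and the function name are assumptions for illustration.

```python
def signups_per_visitor(click_rate, signup_rate):
    """Share of all visitors who sign up, assuming signup_rate applies to clickers."""
    return click_rate * signup_rate


# Rates taken from the example in the answer above
page_a = signups_per_visitor(0.60, 0.00)  # many clicks, no signups
page_b = signups_per_visitor(0.20, 0.90)  # fewer clicks, most clickers sign up

print(f"Page A: {page_a:.0%} of visitors sign up")
print(f"Page B: {page_b:.0%} of visitors sign up")
```

Judged by clicks alone, page A looks three times better; judged by the end goal of signups, page B is the clear winner.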
Q: What is the best thing I can do while in college to land a job in data science after graduating?
The best thing to do is gain experience, and one of the best ways to gain experience is through community projects. Look for charitable or community organizations that might not have the budget to hire someone but are willing to have volunteers point them in the right direction.
For example, consider an environmental organization looking to collect donations that has data about potential cities for donation drives. You could conduct a population and demographic analysis to find the best cities for setting up the drives.
Q: There is a lot of competition for entry-level data science jobs. How can I overcome it?
Yes, it is challenging, especially in the Indian subcontinent, where there are many candidates compared with the number of jobs available. But this is not the case in every region. In the US the scenario is different: jobs are abundant compared with the supply of talent, but there is the separate challenge of having the right skill set and experience. Companies are looking to hire individuals with a particular skill set and with business acumen. So it is important to keep improving your skill set and gaining experience to be able to compete. Having skills beyond data science can also help differentiate you from the competition. Try to learn about other business functions and build a more holistic skill set.
Q: What should a data scientist keep in mind when searching for a first job?
Sometimes it’s a better idea to go for smaller companies, as they can provide more valuable experience. You can really make an impact at a smaller company because only a few people are running its data science projects. As most of the competition is looking to get into tech giants, it might be a good idea to start your career with a smaller company where there is less competition and there are more opportunities to learn and grow.
Q: Do I need a master’s, PhD or other advanced degree to get into data science?
It is not necessary to have an advanced degree to start your career in data science. It is, however, very important to have a good foundation, which you can get from a bachelor’s or some other degree, as is the case with most technical fields. But continuing to collect advanced degrees does not guarantee you the best job. It is equally important to gain experience through community projects, internships or trainee opportunities. Having a PhD means you have become an excellent researcher and a specialist working on very difficult problems, which sometimes means the opportunities available to advanced degree holders are somewhat limited.
Q: Where can you practice machine learning?
Going to hackathons is a good way to practice your machine learning skills in a comfortable setting, surrounded by like-minded people. It is also a good environment for getting guidance and feedback to improve your machine learning skills. You can also start practicing with Kaggle competitions.
Q: What kind of portfolio is required to get an entry-level Data Science job?
Working on your foundation is important; a good foundation in mathematics and statistics is required. Being able to understand metrics and business problems is also required for most data science roles. An understanding of linear algebra, conditional probability, Bayes theorem and measures of central tendency is necessary. A strong foundation will also help you with the tools of data science and with analysis. Your portfolio should showcase your understanding of the core concepts and your familiarity with some of the commonly used tools.
Q: How do I transition from one career to another? For example, from a cloud computing development environment to data science, from marketing and automation to data science, or from software engineering to data science.
There are always some skills that are transferable. Digital marketing, for example, involves a lot of analytics and requires data science.
Big production systems involve many software engineering components, so being skilled in both software engineering and data science would be a great advantage.
With a cloud computing background, you can deploy models if the company is big enough to need heavy-duty infrastructure. You need to find a role where your skills are transferable.
Also, if you are already working somewhere, your current organization would be the best place to make the transition into another function. After that, you can look for a company where your preferred role is available and where data science is encouraged.
Q: What’s the interviewer’s approach when hiring fresh data scientists?
Conceptual clarity is very important, even if you don’t have a lot of experience in different data science domains. Make sure you are very clear about the concepts behind everything you put on your resume. The interviewer will also evaluate your understanding of basic concepts, including mathematics, statistics and machine learning. This gives the company a sense of how much effort would be required to train the candidate.
Q: How do I tell a story about myself and my projects to stand out?
It is very important to give the interviewer an opportunity to look at the work you have done. You can use a GitHub repository to make your analytics available, and including links to your repository on your resume is a good idea too. Even better is to build a portfolio on WordPress to get noticed.
Portfolio websites have become more common nowadays. If you look at companies’ hiring pages, they often ask for your LinkedIn profile, GitHub repository and website, so this is a great opportunity to showcase your work efficiently. Your portfolio should not be limited to code and output; it should also include a writing sample that describes your output. It is always a good idea to showcase your communication skills. Most of the time, the hiring person is evaluating whether you can clearly communicate your analysis and findings, so communication is an essential skill.
If you put your work online, it becomes easier for the hiring team to research you. By the time of the interview they will have a better idea of your abilities, and this could make a big difference.
After the Q/A session, the presenters talked about their own experience and the career paths they took. Rebecca talked about working on consulting projects and freelancing with Data Science Dojo for a while; during that time she worked on many community projects as well. Last year she joined Data Science Dojo in a more permanent role, and her time with the company has been very valuable.
Tarun talked about his seven years of experience in research and consulting. The role involved data analysis and reporting to a business audience, with the main focus on brand and consumer research. Tarun also has extensive experience in researching political campaigns and political parties.
Currently both presenters work as Lead Instructors and Data Scientists at Data Science Dojo. They are involved in curriculum and content development and also work on some interesting consulting projects for Data Science Dojo clients. The webinar lasted about 90 minutes, and the audience was quite interested throughout.
Conclusion
The webinar was a perfect combination of practical information and guidelines to kick-start your career in data science. A great deal of the discussion applies to candidates applying for roles outside the data science domain as well.
It is important for candidates to have a conceptual understanding of the field and to demonstrate interest in and understanding of the company they are applying to. To start your career in data science, the first step is to build a good foundation in the core subjects. The next step is to build your portfolio. To gain experience, the best thing is to take up a training opportunity or internship, or to volunteer for community projects. It is a good idea to look for smaller companies with more involved roles rather than larger companies with narrowly predefined job roles. Strong technical skills, along with interpersonal and communication skills, will help you stand out from the crowd in this highly competitive job market.
This is a companion discussion topic for the original entry at https://blog.datasciencedojo.com/p/dd4906c7-34d8-4142-bf56-53d90eb40bcf/