As automated systems replace traditional business processes, a huge amount of data is being generated. Companies all around the world, from publishing houses to health care, are trying to unlock the value of data. Consequently, data science is becoming one of the most sought-after fields for young professionals all over the world.
Recently, Data Science Dojo hosted a data science interview AMA. It was a great opportunity to learn about data science career options and job roles. The webinar also included a Q/A session in which data science enthusiasts asked the panelists many questions about data science interviews and careers.
The webinar is presented by Data Scientist and Lead Instructor Rebecca Merrett. She holds a postgraduate diploma in Mathematics and Statistics from the University of Southern Queensland. Co-hosting the webinar is Data Scientist Tarun Shrivas, a seasoned professional in Marketing Research and Analytics. He holds a master’s degree in Business Analytics from Seattle University (Seattle, WA).
The webinar begins with a presentation on how to best prepare yourself for the Data Science Industry. The discussion includes the different types of data scientists, data science interviews, job roles, commonly used tools, and how to go about building your portfolio. The presentation concludes with about 60 minutes of Q/A. You can watch the video below or continue reading.
- 0:00:00 – Introduction
- 0:01:41 – About Rebecca
- 0:02:22 – About Tarun
- 0:02:50 – Rebecca’s Presentation
- 0:22:23 – Q/A
Entering the Field of Data Science
The presenters explain that there is no single right way to build a foundation in data science. You can attend a university or a data science bootcamp, seek independent mentoring, or even take free online courses. Some of these paths take more effort than others, but one thing was evident: you MUST have a strong understanding of mathematical and statistical concepts.
Types of Interviews and Expectations
Throughout the presentation, emphasis was given to understanding the types of interview questions, the job roles, and how candidates can best capitalize on their skill set.
Interviewers expect candidates to know database tools and to have the skills required to read, retrieve, and make sense of the available data. A working knowledge of SQL queries is always helpful as well.
A Data Scientist role also requires candidates to have a fundamental understanding of the following:
- Conditional probability
- Bayes' theorem
- Normal and binomial distributions
- Central limit theorem
- Linear Regression
Does this cover everything you should know? No, but these are some of the core subjects in data science. If you're applying for a role involving product management and analytics, then experience with A/B testing will most likely need to be demonstrated.
Roles Available
Data science is a vast field, so it's understandable that a variety of job functions are available. The following are the three main types of data science roles.
1. The ‘All-Rounder’ Data Scientist
The Data Scientist is expected to build predictive models, which includes processing and cleaning data, isolating key features, and collecting new features. Data Scientists should be familiar with big data and machine learning concepts and should be able to drive business decisions.
2. The ‘Business Facing’ Data Analyst
The Data Analyst is expected to visualize and segment data in a way that helps a business gather actionable insights. A Data Analyst uses data to understand a key problem, opportunity, or trend that can inform decision making. Data Analysts should be able to transform and manipulate large data sets, produce visualizations, and track web analytics.
3. The ‘Geeky’ Data Engineer
The Data Engineer is dedicated to deploying analytic solutions in the real world through front-end applications. A Data Engineer should be able to set up the infrastructure for large amounts of data and possess strong software engineering skills.
Example questions
Here are some of the example questions and answers presented at the end of the AMA to give candidates an idea about what to expect and how to best prepare for the interview.
Math & Stats
Example Question: Students’ academic scores follow a normal distribution with a mean of 18 and a standard deviation of 6. What proportion of students have scored between 18 and 24?
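To solve this, you should be familiar with the z-score for a normal distribution, which expresses how far a value lies from the mean in units of the standard deviation. Here, 18 is the mean (z = 0) and 24 is one standard deviation above it (z = 1), so about 34% of students scored in that range.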
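The question above can also be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`; the variable names are illustrative and not taken from the webinar.

```python
from statistics import NormalDist

# Scores are assumed to follow N(mean=18, sd=6), as stated in the question.
scores = NormalDist(mu=18, sigma=6)

# z-scores: how many standard deviations each bound lies from the mean.
z_low = (18 - scores.mean) / scores.stdev   # 0.0
z_high = (24 - scores.mean) / scores.stdev  # 1.0

# P(18 <= X <= 24) = CDF(24) - CDF(18)
proportion = scores.cdf(24) - scores.cdf(18)
print(f"z from {z_low:.1f} to {z_high:.1f}: {proportion:.4f}")  # about 0.3413
```

In an interview you would more likely read the value for z = 1 from a z-table, but the code makes the two steps (standardize, then take the difference of the CDF values) explicit.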
Product & Metrics
A company has created a web page to promote a product and encourage signups. One version of the page uses the call-to-action “Find out more”; the other uses “Learn more about us!”. Before going ahead with the second call-to-action, what action would you take to ensure it is the right choice in terms of user signups?
This is a typical A/B test question. You will need to conduct an A/B test with both versions of the page. One audience group is exposed to version 1 and the other to version 2, so that you can ascertain which version of the page leads to more signups.
The important thing is to keep your end goal in mind. If the end goal is the number of signups, then you would prefer the version that leads to a higher proportion of signups, even if that page does not get a lot of traffic.
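One common way to judge such an A/B test is a two-proportion z-test on the signup rates. The sketch below is only an illustration: the webinar does not name a specific test, the function name is hypothetical, and the visitor and signup counts are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(signups_a, visitors_a, signups_b, visitors_b):
    """Compare signup rates of two page versions (two-sided z-test)."""
    rate_a = signups_a / visitors_a
    rate_b = signups_b / visitors_b
    # Pooled rate under the null hypothesis that both versions convert equally.
    pooled = (signups_a + signups_b) / (visitors_a + visitors_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return rate_a, rate_b, z, p_value


# Hypothetical counts: version 1 = "Find out more", version 2 = "Learn more about us!"
rate_a, rate_b, z, p_value = two_proportion_z_test(120, 2000, 156, 2000)
print(f"version 1: {rate_a:.1%}, version 2: {rate_b:.1%}, z = {z:.2f}, p = {p_value:.3f}")
```

With these made-up counts, version 2 has the higher signup rate and the p-value falls below the conventional 0.05 threshold. With real data, the sample size and the threshold should be decided before the test starts.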
The presenters also discussed the tools data scientists use most often, so that the audience could become familiar with them while building a portfolio.
Here is the list of those tools:
- R
- Python
- Apache Hadoop
- MapReduce
- NoSQL Databases
- Cloud Computing
- D3
- Apache Pig
- Tableau
- iPython Notebooks
- GitHub
R and Python have the most extensive sets of libraries and tools to help with and automate everyday tasks. If you're a Data Engineer, you’re more likely to work with Hadoop, MapReduce, and Spark. As a Data Analyst, you would frequently use interactive data visualization tools such as Tableau.
Resume Tips
The resume is often the first impression your potential employer receives, so it's important to design it carefully. The webinar discusses resume structure and design in detail.
Structure
You should highlight your strongest selling points first. This could be one of your interesting projects that is relevant to the employer. Organizing your resume well is important for communicating your selling points and relevant content.
Design
Keep your resume interesting and to the point. Avoid having multiple pages and lengthy content. Your resume should include contact information and hyperlinks to your projects. It's a great idea to share content like your website, LinkedIn profile, and other portfolio resources on your resume.
Experience section
If you have job experience, the important thing is to focus on the results you achieved rather than the actions you took. You want the hiring team to perceive you as results-driven. Be sure to list your experiences in chronological order.
Here are some tips to make your resume stand out:
- Start bullet points with action verbs where possible.
- Quantify or state results of your action where possible.
- Include Data Science projects and publications.
- Highlight your business acumen skills.
- Customize your resume based on the type of job role.
- Use resume analyzers such as VMock and Jobscan.
What not to do in an Interview
Tarun and Rebecca explained what not to do in an interview, drawing on their own experience interviewing data science candidates. The most important thing is to provide clear examples of your experience with data and statistical analysis; without them, your chances of landing the job may suffer. You should give clear examples of each component of a project you worked on: the specific problem you solved, the outcomes of your effort, and the other activities you were involved in.
Here are a few other things to avoid in an interview:
- Not giving concrete examples of experience with data and statistical analysis.
- Lack of business acumen.
- Purely academic or research background.
- Not asking the right questions.
- Being too serious. Try your best to make it a pleasant experience for your interviewer.
- Lack of knowledge about the company.
- Poor communication skills.
- Talking in clichés (“I’m a team player”, “I’m a perfectionist”).
Most of these tips apply to candidates for a variety of roles. Knowing the company, being practical, building your project portfolio, and improving your communication skills are relevant for most job roles today.
Questions and Answers
Attendees posted a few of the questions prior to the webinar, and some live questions were answered as well. The audience seemed very interested in data science education and foundation requirements, and in how to enter the field as a fresh graduate with limited experience.
Q: How to handle LinkedIn invitations from strangers and how to respond to a recruiter reaching out?
The best way to respond to recruiters is to take time composing the reply. You want to present yourself as very interested in the company and its business. You also need to show appreciation that the recruiter is reaching out to you. You can talk about the company's products and services, a project it is working on, or any new development that may require new hiring. Present yourself as a potential problem solver for the business.
Q: What are some of the important questions to ask during the data science interview?
You can ask what kind of data they are working with. The company could be working with highly problematic data, and that may be the reason they are hiring an expert. They could have a data modeling or data management problem. So, it’s a good idea to find out what data problem they are facing. This will give you insight into your day-to-day activities and the job role.
Q: How to answer what are you expecting from this role?
This question is another way of asking how the company fits into your overall career plan. Here you want to justify your current position: maybe you are just entering the field of data science, or switching careers or companies. You need to justify why you’re choosing this particular company and role.
Q: Is sharing new ideas about the company with the interviewers a good sign?
It is definitely good to share new ideas, but keep in mind that you first need to understand the problem they are having. To propose a solution, you need a good understanding of the problem.
Q: How can I answer a question about the most important metrics for an ad marketing campaign?
To answer this question, it is important to have an end goal in mind. Ask yourself what the company is ultimately trying to achieve with the help of this metric. For example, if a company tracks the number of clicks on a webpage without considering the end goal of signups, it will not get a clear picture of the campaign's success. If one page has a 60% click rate but zero signups while another has only a 20% click rate but a 90% signup rate, the latter would be considered more successful. So, when answering questions about marketing metrics, keep the end goal in mind.
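The arithmetic behind that comparison can be sketched in a few lines. This assumes the "signup rate" is measured among visitors who clicked, which the webinar summary does not state explicitly; the function name is hypothetical.

```python
def end_to_end_signup_rate(click_rate, signup_rate_after_click):
    """Share of all visitors who end up signing up.

    Assumes signup_rate_after_click is measured among visitors who clicked.
    """
    return click_rate * signup_rate_after_click


page_one = end_to_end_signup_rate(0.60, 0.0)   # many clicks, no signups
page_two = end_to_end_signup_rate(0.20, 0.90)  # fewer clicks, most sign up
best = "page two" if page_two > page_one else "page one"
print(f"page one: {page_one:.0%}, page two: {page_two:.0%}, better: {best}")
# prints "page one: 0%, page two: 18%, better: page two"
```

Measured against the end goal, the page with the lower click rate converts 18% of its visitors, while the page with the higher click rate converts none.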
Q: What is the best thing I can do while in college to land a job in data science after graduating?
The best thing to do is gain experience, and one of the best ways to gain experience is through community projects. Look for charitable or community organizations that might not have a big budget to hire someone but are willing to have volunteers lead them in the right direction.
For example, consider an environmental organization looking to collect donations. It has data about different potential cities for donation drives. You could conduct population and demographic analysis to find the best cities for setting up those drives.
Q: There is a lot of competition for the entry level data science jobs. How do you stand out?
Yes, it is challenging, especially if you’re talking about the Indian subcontinent. In the US, the scenario is different. Jobs are abundant compared to the supply of talent, but there is also the challenge of having the right skill set and experience. Companies are looking to hire individuals with a particular skill set, so it is important to keep improving your skills and gaining experience to be able to compete. Having skills beyond data science can also help differentiate you from the competition. Try to learn about other business functions to create a more holistic profile.
Q: What should a data scientist keep in mind when searching for his/her first job?
Sometimes it’s better to go for smaller companies, as they can provide more valuable experience. You can really make an impact at a smaller company, where only a few people run the data science projects. While most of the competition is looking to get into tech giants, it might be a good idea to start your career with a smaller company where there is less competition and more opportunities to learn and grow.
Q: Do I need a master's, PhD, or other advanced degree to get into data science?
It's not necessary to get an advanced degree to start your career in data science. It is, however, very important to have a good foundation, which you can get from your bachelor's or some other degree, as is the case in most technical fields. An advanced degree does not always guarantee you the best job. It's equally important to gain experience through community projects, internships, or trainee opportunities. Having a PhD means you have become an excellent researcher and a specialist working on very difficult problems. This sometimes means that the opportunities available to advanced degree holders are somewhat limited.
Q: Where can you practice machine learning?
Going to hackathons is a good way to practice your machine learning skills in a comfortable setting. It's also a good environment for guidance and feedback to improve your machine learning skills. You can also start practicing on Kaggle.
Q: What kind of portfolio is required to get into an entry level Data Science job?
Working on your foundation is very important for entry-level data science jobs. A good foundation in mathematics and statistics is required. Being able to understand the metrics and the business problem is also required for most data science roles. An understanding of linear algebra, conditional probability, Bayes' theorem, and central tendencies is necessary. A strong foundation helps you with the tools of data science and with performing analysis. Your portfolio should showcase an understanding of the core concepts and familiarity with some of the commonly used tools.
Q: How to transition from one career to another? For example, from a cloud computing development environment to data science, from marketing and automation to data science, or from software engineering to data science.
There are always some skills that are transferable. In digital marketing, for example, there is a lot of analytics, and that work requires data science.
If you’re looking at big production systems, many components of software engineering are involved, so being skilled in both software engineering and data science is a great advantage. With a cloud computing background, you can deploy models if the company is big enough to need heavy-duty infrastructure. You need to find a role where your skills are transferable.
Also, if you are already working somewhere, your current organization is the best place to make the transition into another function. After that, you can look for a company where your preferred role is available and where data science is encouraged.
Q: What’s the interviewer’s approach when hiring for fresh data scientists?
Conceptual clarity is very important, even if you don’t have years of experience in different data science domains. Make sure you are very clear about the concepts behind everything you mention in your resume. The interviewer will also evaluate your understanding of basic concepts, including mathematics, statistics, and machine learning. This gives the company a sense of how much effort is required to train the candidate.
Q: How do I tell a story about myself and my projects to stand out?
It is very important to give the interviewer an opportunity to look at the work you have done. For that purpose, you can use a GitHub repository to make your analytics available. Including links to your repository on your resume is a good idea too. Even better is to build a portfolio on WordPress to get noticed.
Portfolio websites are becoming more common nowadays. If you look at companies' hiring pages, they do ask for your LinkedIn profile, GitHub repository, and website. So, this is a great opportunity to showcase your work efficiently. Your portfolio should not be limited to your code and output; it should also include writing samples that describe your output. It's always a good idea to showcase your communication skills. Most of the time, the hiring person is evaluating whether you can clearly communicate your analysis and findings, so communication becomes an essential skill.
If you put your work online, it becomes easier for the hiring team to research you. By the time of the interview, they have a better idea of your abilities, which could make a big difference.
Conclusion
The webinar was a perfect combination of practical information and guidelines to kick-start your career in data science. A great deal of the discussion is also applicable to candidates applying for roles outside the data science domain.
It's important for candidates to have a conceptual understanding of the field and to demonstrate an interest in and understanding of the company they are applying to. To start your career in data science, your first step is to build a strong foundation in the core subjects. The next step is to build your portfolio. Make sure you are always adding to your experience. Volunteering for a community project is a great way to practice your skills. Strong technical skills, along with interpersonal and communication skills, will help you stand out from the crowd in this highly competitive job market. Don't forget about applying to smaller companies. Your role will be more involved, and the lessons you learn from mistakes and successes will be more profound.
Thanks for reading! I hope this has given you a good understanding about data science career options and how to best prepare for an interview. Here is another awesome blog on 101 Data Science Interview Questions to help you get fully prepared for the interview.
This is a companion discussion topic for the original entry at https://blog.datasciencedojo.com/data-science-interview-ama/