You might be wondering what a data scientist is and what data scientists do. Data scientists investigate data to extract knowledge, patterns and insights. They then use their findings to help solve problems or predict future behavior.
Advances in technology and wider access to data have made data science a popular, in-demand career field. Data science can be applied in almost any industry, including banking, manufacturing, transportation and education.
Credit card fraud detection is one example of how data science is used in banking. Credit card companies use your historical transaction data, such as your common purchases and locations, to build a picture of your spending patterns. They then use those patterns to predict whether a new charge is fraudulent.
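To make the fraud example concrete, here is a minimal Python sketch of the idea. It is only an illustration, not how any real card issuer works: the transaction history, the function names (build_profile, looks_suspicious) and the rule itself (a charge far above the usual amount and in a city never seen before) are all assumptions made up for this example.

```python
from statistics import mean, stdev

# Hypothetical transaction history for one cardholder: (amount in dollars, city).
history = [
    (42.10, "Austin"),
    (18.75, "Austin"),
    (63.40, "Austin"),
    (25.00, "Dallas"),
    (51.20, "Austin"),
    (37.95, "Austin"),
]


def build_profile(transactions):
    """Summarize past spending: typical amount, spread, and usual cities."""
    amounts = [amount for amount, _ in transactions]
    return {
        "mean": mean(amounts),
        "stdev": stdev(amounts),
        "cities": {city for _, city in transactions},
    }


def looks_suspicious(profile, amount, city, z_threshold=3.0):
    """Flag a charge that is both unusually large and in an unfamiliar city.

    The threshold of 3 standard deviations is an assumed value for illustration.
    """
    z_score = (amount - profile["mean"]) / profile["stdev"]
    return z_score > z_threshold and city not in profile["cities"]


profile = build_profile(history)
print(looks_suspicious(profile, 45.00, "Austin"))   # an ordinary purchase
print(looks_suspicious(profile, 980.00, "Lisbon"))  # a large charge in a new city
```

Real fraud systems typically combine many more signals and use trained models rather than a single hand-written rule, but the principle is the same: learn what is normal from past data, then flag what is not.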
If you are wondering exactly how to become a data scientist, we have some tips to help!
Let’s get into the nitty-gritty of a data scientist job description and what the work might entail. Here we walk through the steps a data scientist might take on a company project.
Investigate- A data scientist needs to put on their detective hat to investigate and understand the details of the problem. They begin a project by asking a lot of questions.
Go Digging- They find and pull all of the relevant data available. This data could come from multiple company platforms, such as web servers or databases.
Spring Cleaning- They clean the data set of errors, such as duplicates, missing values and inconsistent formats, using known relationships and mappings within the data to find and correct them.
Exploratory Data Analysis (EDA)- This is a statistical approach used to analyze data sets and summarize or categorize their common traits.
Data Modeling- They create a visual representation of the data and its connections. Data models show how the data connects to business operations.
Testing- They use sample data to test the model.
Present Your Findings- They show where errors could be reduced to improve efficiency and solve the problems identified during the Investigate step.
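The middle steps above (cleaning, exploring, modeling and testing) can be sketched in a few lines of Python. Everything here is a made-up illustration: the records, the column meanings and the helper names are assumptions. Note also that "model" in this sketch means a simple predictive line fitted to the data, which is a swapped-in stand-in and not the visual data model described in the list.

```python
from statistics import mean

# Hypothetical raw records: (customer_id, monthly_visits, monthly_spend).
raw_records = [
    ("c1", 2, 40.0),
    ("c2", 5, 100.0),
    ("c2", 5, 100.0),    # duplicate row
    ("c3", None, 75.0),  # missing value
    ("c4", 8, 160.0),
    ("c5", 10, 200.0),
    ("c6", 4, 85.0),
]


def clean(records):
    """Spring cleaning: drop rows with missing values and exact duplicates."""
    seen, cleaned = set(), []
    for row in records:
        if None in row or row in seen:
            continue
        seen.add(row)
        cleaned.append(row)
    return cleaned


def summarize(records):
    """EDA: a few summary statistics describing the data set."""
    return {
        "rows": len(records),
        "avg_visits": mean(v for _, v, _ in records),
        "avg_spend": mean(s for _, _, s in records),
    }


def fit_line(records):
    """Modeling: least-squares line, spend = slope * visits + intercept."""
    xs = [v for _, v, _ in records]
    ys = [s for _, _, s in records]
    x_bar, y_bar = mean(xs), mean(ys)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    slope = numerator / denominator
    return slope, y_bar - slope * x_bar


def mean_abs_error(model, records):
    """Testing: average absolute gap between predicted and actual spend."""
    slope, intercept = model
    return mean(abs(slope * v + intercept - s) for _, v, s in records)


data = clean(raw_records)
train, test = data[:3], data[3:]  # hold back some rows as sample data for testing

summary = summarize(data)
model = fit_line(train)
error = mean_abs_error(model, test)
print(summary)
print(model, error)
```

Holding back part of the data for testing matters because a model will almost always look good on the same rows it was built from.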
Hopefully, you now have a better idea of what a data scientist's job description entails on a typical day.
Below are the top technical skills specific to data science that a good data scientist should have:
Mathematics- This is considered to be one of the most important skills of a data scientist. Data scientists rely on math for algorithms, data analyses and developing patterns and insights from the data they are investigating.
Statistics- This fundamental skill is needed because data scientists analyze large amounts of data, some of it structured and some of it not. Statistics covers collecting, analyzing, interpreting and presenting data, which is what a data scientist does when cleaning and analyzing data and then presenting the findings.
Programming- The good news is that you don’t have to be an expert programmer, but you do need to understand programming. Python is the most common programming language for data science, so having some experience with it is important. Some would argue that R is just as important, but R was built primarily as a statistical language. Python is generally the better choice for machine learning, another skill on this list.
Data Wrangling- This is the process of cleaning and restructuring data to make it easier to access and analyze.
Software- Knowledge of SAS and other analytical tools is important as well. These tools help with data management, analytics and other important steps in your data science projects.
Database Management- Relational database systems are a widely used standard for storing structured data. Creating links and relations between data is a key component of data science, and these database systems help data scientists do just that! You should be familiar with SQL, the language used to query relational databases. Big data frameworks such as Spark and Hadoop also offer SQL interfaces.
Machine Learning- Machine learning is a branch of Artificial Intelligence (A.I.) in which algorithms learn patterns from data. It automates the building of models and helps identify the data patterns a data scientist needs.
Data Visualization- This skill is about telling the story of data and its relationships in a visual or graphic way.
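As a small taste of the database skill in the list above, here is a hedged sketch of linking two related tables with SQL, run from Python using the built-in sqlite3 module. The table names, columns and rows are invented for illustration. A company database would be far larger and would likely run on a different database system, but the JOIN idea carries over.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables linked by customer_id.
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        amount REAL
    );
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Ben');
    INSERT INTO orders VALUES (1, 1, 30.0), (2, 1, 45.5), (3, 2, 12.0);
""")

# Join the tables to get each customer's order count and total spend.
rows = conn.execute("""
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for name, order_count, total in rows:
    print(name, order_count, total)

conn.close()
```

The JOIN is what turns two separate lists into one answer about customers, which is the "links and relations between data" idea in practice.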
A data scientist's skills should also include a variety of non-technical, or soft, skills such as:
Communication- A good data scientist needs effective communication skills. This skill is critical to tell the story about a data set and present problems and solutions to stakeholders.
Structured Thinking- This is the practice of breaking a problem down into smaller problems, then solving the smaller problems first in order to solve the bigger one.
Curiosity- You have to be curious. To understand the big issues in the data, you have to ask a lot of "Why" questions.
Business Domain Knowledge- Having business expertise in a certain field like healthcare can help you better understand what stakeholders need.
Collaboration- You will need to be able to work across business units to fully understand the complexity of your data and the “why”.
Continuous Learner- Technology is always changing. The methods used by data scientists are also changing and growing so keeping up with these changes is important to your success.
A data scientist is generally considered more senior than a data analyst. The data scientist can tell the data story to help predict the future based on the past data patterns they have discovered. A data analyst focuses on one piece of what a data scientist does: gathering insights from the data and understanding its patterns.
A data engineer is someone who has greater expertise in programming than a data scientist. A data engineer maintains databases and software systems. The focus of the data scientist is cleaning and organizing the data to tell a story.
A Data Analyst, Data Engineer, Data Architect, Machine Learning Engineer, Deep Learning Engineer or Business Intelligence Developer might be confused with a Data Scientist. These careers are similar, and each covers a piece of what Data Scientists do.
The majority of data scientists have an advanced degree, such as a master's degree. A bachelor's degree is the minimum requirement. The most common fields of study for data scientists are mathematics, statistics, computer science and engineering. A PhD is not required but could give you an edge.
According to the U.S. Bureau of Labor Statistics’ Occupational Outlook Handbook, the median data scientist salary is $122,840 per year. According to Glassdoor, a data scientist's average salary is $113,353 per year. The low end of the salary range is $80,000 and the high end is $161,000.
According to the U.S. Bureau of Labor Statistics, the job outlook for data scientists from 2019 to 2029 is 15% growth. This is much faster than average career growth, which is why data science is considered an “In Demand” career field. This growth means adding 5,000 jobs over the course of 10 years.
“I think no student can ever go wrong with taking computer science courses.”
“If you wanna be a data scientist, take the computer science courses, take the math courses, take the stats courses. Make sure your technical foundation is solid, and beyond that, do what interests you.”
- Jay, Head of Data Science, Operations, & Quality for the Global Patents Organization, Google.
“Take courses that expose you to a wide variety of domains. That has grounding in math, statistics, spend some time doing computer programming and getting familiar with how to leverage computers to solve your problems at a big scale. And then take courses that force you to communicate complex ideas, because that's another key piece to this.”
“I think most of the people who do social sciences, behavioral science, physical science, come out with skills that are very transferable into the data science realm after they do some graduate work.”
- David, Director of Data Science for Infrastructure, Facebook.