The Top 5 Characteristics of an Elite Data Scientist (They may not be what you think)

James Ramadan
4 min read · Mar 10, 2022

As a former data scientist, and now someone who works closely with them, I have come to understand what separates the elite data scientists from the mediocre ones.

While mediocre data scientists can do their job with sufficient technical skills, they usually lack an overall understanding of how their skillset relates to the business, which limits their ability to make a meaningful impact.

Here are the top 5 characteristics of an elite data scientist:

1) Understanding Business & Business Needs

Data scientists need to take time to understand the primary business problems before doing data science. And yes, this means stepping away from the data.

Sometimes data scientists are hired by companies without clear problems to solve and told to just “do data science” because data science is an industry trend. Data scientists in this position are at a disadvantage, and they need to proactively take time to identify & define key problems upfront.

Without business understanding, data scientists will be left to explore company data aimlessly and will produce “insights” that may or may not be valuable to the business. They can still practice cutting-edge techniques on this type of data, but by not making an impact on the business, they lose credibility around the organization.

2) Storytelling & Communication

A data scientist needs to be able to communicate in order to succeed. There are several considerations here.

First, a data scientist should always understand the audience to whom they are presenting. It will help to have answers to questions like 1) How much time do I have to present? 2) How much technical knowledge does this audience have, and how technical should I make this presentation? 3) Why is the audience interested in the content of my presentation? 4) Why am I presenting these insights right now? 5) What behavioral changes do I expect these insights to produce after my presentation?

Presentations should include a high-level introduction or overview before jumping into any technical details that may not be understood by the audience.

Another consideration is the medium of communication. A data scientist should understand when to use a PowerPoint vs. a set of data visualizations. If the data scientist is presenting a set of data visualizations, they should make sure they choose the best presentation medium available to them and clearly explain the story behind the visualizations and their data insights.

3) Skepticism & Ability to Validate Assumptions

Data scientists should maintain constant skepticism about both the quality of their data and the insights they draw from it. They should seek to identify & validate any assumptions they make during their analyses.

The best way to validate assumptions is to engage key stakeholders. A data dictionary can be a great start to understanding data, but often the best option is to go directly to the people within the organization who know the most about the actual data. My recommendation is to build this step into the end-to-end data analysis or data science product process.
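As a minimal sketch of what building that step into an analysis could look like, the Python snippet below writes assumptions down as explicit checks and reports which records violate them. The records, field names, and rules are all hypothetical; in practice they would come from the data dictionary and from conversations with the people who know the data.

```python
# Hypothetical order records; field names and values are made up for illustration.
orders = [
    {"order_id": 1, "amount": 25.0, "country": "US"},
    {"order_id": 2, "amount": -5.0, "country": "US"},   # negative amount
    {"order_id": 3, "amount": 40.0, "country": None},   # missing country
    {"order_id": 3, "amount": 12.5, "country": "CA"},   # duplicate id
]


def check_assumptions(rows):
    """Return a dict mapping each assumption to the order_ids that violate it."""
    violations = {
        "amount_non_negative": [],
        "country_present": [],
        "order_id_unique": [],
    }
    seen_ids = set()
    for row in rows:
        if row["amount"] < 0:
            violations["amount_non_negative"].append(row["order_id"])
        if row["country"] is None:
            violations["country_present"].append(row["order_id"])
        if row["order_id"] in seen_ids:
            violations["order_id_unique"].append(row["order_id"])
        seen_ids.add(row["order_id"])
    return violations


report = check_assumptions(orders)
for assumption, bad_ids in report.items():
    status = "OK" if not bad_ids else f"violated by order_id(s) {bad_ids}"
    print(f"{assumption}: {status}")
```

A report like this is a conversation starter rather than a verdict: a negative amount might be a refund, which is exactly the kind of thing a stakeholder can confirm.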

4) Balance of Being Big-Picture & Detail-Oriented

Being a successful data scientist requires balancing big-picture thinking with attention to detail.

Being detail-oriented helps data scientists execute well, but they should also make sure they have asked enough questions to find the root cause of their data problems and that they are starting with the top-priority problems.

Being big-picture oriented helps data scientists understand concepts and prioritize important problems, but they also need to spend time reviewing the details of their code and presentation content to catch critical mistakes.

5) Technical Skills in Math & Coding

Elite data scientists love their craft and take joy in learning the newest technical skills. They are great coders and their math background gives them a great foundation to create new algorithms.

Data scientists can also unblock themselves if they understand the various ways data is stored, e.g. in centralized vs. distributed systems, and how to pull it. The data can be relational, e.g. tables queried with SQL, or unstructured, e.g. text, so understanding how to process various data types will maximize impact.
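To make the relational vs. unstructured distinction concrete, here is a small sketch using only Python's standard library. It assumes a hypothetical `reviews` table (the schema and rows are invented for illustration), answers a structured question with SQL, and then processes the free-text column with simple tokenization.

```python
import re
import sqlite3
from collections import Counter

# Relational data: an in-memory SQLite table (hypothetical schema and rows).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (product TEXT, rating INTEGER, comment TEXT)")
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [
        ("widget", 5, "Great value, great build quality"),
        ("widget", 2, "Stopped working after a week"),
        ("gadget", 4, "Great battery life"),
    ],
)

# Structured question answered with SQL: average rating per product.
avg_rating = dict(
    conn.execute("SELECT product, AVG(rating) FROM reviews GROUP BY product")
)

# Unstructured text: lowercase, tokenize, and count words across all comments.
comments = [row[0] for row in conn.execute("SELECT comment FROM reviews")]
word_counts = Counter(
    word
    for comment in comments
    for word in re.findall(r"[a-z]+", comment.lower())
)
conn.close()

print(avg_rating)
print(word_counts.most_common(3))
```

Real text processing usually needs more than a regular expression, but the split is the same: SQL handles the columns with a fixed structure, and separate tooling handles the free text.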

With regard to model building, feature engineering is the most important skillset, and understanding various model evaluation techniques is also important. Linear algebra knowledge is foundational, and differential equations can help with creating new algorithms.
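The sketch below illustrates how feature engineering and model evaluation work together, under simplifying assumptions: the data is synthetic, the model is a one-feature least-squares line written by hand, and the evaluation is a plain k-fold cross-validation that holds out every k-th point. The function names are illustrative, not taken from any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def cross_val_mae(xs, ys, k=5):
    """Mean absolute error averaged over k folds (fold f holds out points with index % k == f)."""
    fold_errors = []
    pairs = list(zip(xs, ys))
    for fold in range(k):
        train = [p for i, p in enumerate(pairs) if i % k != fold]
        held_out = [p for i, p in enumerate(pairs) if i % k == fold]
        slope, intercept = fit_line([x for x, _ in train], [y for _, y in train])
        errors = [abs(y - (slope * x + intercept)) for x, y in held_out]
        fold_errors.append(sum(errors) / len(errors))
    return sum(fold_errors) / k


# Synthetic data, made up for illustration: the target grows with the square of the input.
raw = [float(i) for i in range(1, 21)]
target = [3.0 * x ** 2 + 2.0 for x in raw]

# Engineered feature: square the raw input so the relationship becomes linear.
squared = [x ** 2 for x in raw]

mae_raw = cross_val_mae(raw, target)
mae_engineered = cross_val_mae(squared, target)
print(f"MAE with raw feature:        {mae_raw:.2f}")
print(f"MAE with engineered feature: {mae_engineered:.6f}")
```

The model is identical in both runs; only the feature changes. Measuring error on held-out points is what makes the improvement visible and trustworthy, rather than an artifact of fitting the training data.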

Finally, pure coding / software development skills also help with setting up environments, e.g. Docker, and integrating into live products, e.g. APIs.

Essentially, a data scientist’s math & technical skills determine their potential, but the other items on this list cannot be neglected. If a data scientist has all 5 characteristics from this list, they can be a rockstar.

Thanks for reading :P
