Hiring Data Scientists? Here's What You Should Know!
If you're looking to hire a data scientist, here are some tips.
I often meet people working in data science who want to leave their current jobs and ask to join our team.
Whenever I ask why, the usual answer I get is “Because I don’t like doing unnecessary reports!” The answer baffles me because, for a data scientist, especially in industry, reporting is part of the job. I looked a little further and found that the main problem is employers' lack of knowledge of what data science is and what exactly a data scientist does. As a result, reporting becomes the job rather than part of it.
Moreover, whenever I meet a new client, they often compare our data science services to Power BI, Tableau, or even their new cutting-edge ERP reports.
“My ERP does this and that, what can you do?”
The answer is so obvious that I sometimes struggle to put it clearly. It’s like someone asking, “Well, I have a computer, so why do we need software engineers or computer scientists?”
Data visualization and data analytics tools are, as the label suggests, a means, not the end. They offer enough analytics automation that most people, even with no background in data science, can use them to publish some good reports. It's like me driving a Ferrari: I will make it move, but if a Formula One driver takes the wheel in my place, I am pretty sure the experience will be much different. Let me say this as clearly and as simply as I possibly can:
Data Scientists are NOT about dashboards or Reports!
This causes confusion for so many employers, who end up hiring data scientists for the wrong reasons, which disengages the data scientists and leaves the employer disappointed with the outcome. A data scientist is someone who leverages data to find actionable insights or construct data-driven solutions. Interestingly, this is rarely done through a dashboard, though a dashboard can be a powerful illustrator.
To make this more tangible, take a simple example. A dashboard would tell you that you’re traveling at 120 km/hour. A data scientist would tell you that:
Speed = Distance / Time, and your car's average fuel consumption = 50 liters/100 km.
So, if you want to reach Alexandria, 250 km away, you will need about 2 hours and 125 liters of fuel.
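The arithmetic behind that answer can be written out as a short sketch. The function and parameter names are illustrative only, and the figures are the ones from the example: 250 km, 120 km/hour, and 50 liters per 100 km.

```python
def trip_estimate(distance_km, speed_kmh, liters_per_100km):
    """Return (hours, liters) for a trip driven at a constant speed."""
    hours = distance_km / speed_kmh                  # Time = Distance / Speed
    liters = distance_km * liters_per_100km / 100    # fuel scales with distance
    return hours, liters


hours, liters = trip_estimate(distance_km=250, speed_kmh=120, liters_per_100km=50)
print(f"{hours:.2f} hours, {liters:.0f} liters")  # prints "2.08 hours, 125 liters"
```

The exact travel time is a little over 2 hours, which is the kind of detail that matters once the answer feeds into a schedule.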
This is an answer that can be the basis of many decisions, such as whether we need to refuel or how to plan a trip schedule. Of course, real data science is rarely this simple.
Usually, data scientists deal with an immense number of variables, such as sales, marketing spend, customer demographics, product features, dates, macroeconomic data, employee data, and many more, using a variety of statistical modeling techniques and building algorithms to answer a business question. Now, an answer can be visualized with a dashboard, but even then a hurdle arises that is usually overlooked: translating the insights into action. This is an essential step on which many businesses fall short.
Also, most dashboards are built on a “Count of Records” frame. That is, how many sales did we achieve? How many conversions did we get? They’re built with the user already knowing what they are looking for. Often, the transformative gems and insights are hidden deep in the data and require a lot of data mining, engineering, and modeling to find the answer that can really give a business a competitive advantage.
The market nowadays is flooded with data science tools, but again, these tools might prove to be a wrong investment if not used correctly. A data science tool can make a data scientist's life easier through the automation it provides for analysis and model building. However, it can be very counterproductive if the person using the tool is not familiar with what a model is and how to validate and deploy it.
What should be the scope of a data scientist?
A data scientist can describe, predict, and prescribe actions and outcomes (answering the whys and so-whats). They should be able to answer tough business questions. Further, they can build data-driven optimization models, using machine learning and artificial intelligence to automate and enhance processes, such as energy utilization, or automatically segmenting customers and clustering them into different workflows, as in credit scoring.
As mentioned, data scientists are problem solvers, and their scope should reflect that. A data science project should usually start with a question, such as: How can I grow my sales by 10%? How can I lower my operational costs? How can I sell my product in market X? Where will the planet Mars be on its trajectory in 10 years?
Sometimes I like to think that a data scientist's job within a business context is finding the first Friedmann equation. This equation describes how the universe's expansion rate changes over time, based on what is in the universe. If you want to know where the universe came from and where it's headed, all you need to measure is how it is expanding today and what is in it. This equation allows you to predict the rest.
The same goes for a business: the goal is to find an equation that best describes how the business works and expands. The closest example is the concept of the Growth Equation, famous among growth marketers. It is basically the equation that describes how a business makes revenue. For example, the following is the Amazon Growth Equation:
What should be the skills of a data scientist?
This is a bit of a tricky question, as any data science project requires a fusion of skills rarely found in one individual. The following are the dominant and most essential skills for any data science project:
Business Acumen / Domain Expertise: Just knowing how to ask the right question is an integral, if not essential, part of any data science project's success.
Assorted Mathematics and Statistics: This is a really important skill, and it is sometimes confused with just being able to code by copying and pasting algorithms. Math is the basis of every algorithm; it allows the data scientist to choose and tune the best algorithm for every case. It also helps them diagnose and think critically about an algorithm that should be working and producing the desired predictions but is not (this is very frequent).
Computer Science: Ultimately, a data scientist will use code to build their models and algorithms. For that reason, knowing how to code, understanding computational complexity, and being able to model a problem or a solution in code are crucial. However, being able to code is NOT data science. Also, most data science projects nowadays require machine learning and artificial intelligence to make predictions and results more robust, accurate, and automated.
Software Engineering and IT: Eventually, a model will be deployed in a production environment. Understanding the different development frameworks, integration, and databases is vital for a successful data science project. Further, Big Data, the famous term for mining, processing, and integrating vast amounts of unstructured data of many types for fast processing and analysis, requires great familiarity and experience with Big Data frameworks.
Business Translators: All of the above skills prescribe the medicine. Whether the patient takes it or not depends on the business translator. Business translators are able to quickly frame data science projects, transform the insights generated into action, and ensure that the incremental or transformational change has occurred. This role can sometimes be filled by the domain expert. Yet this is one of the most overlooked positions in data science teams.
As is apparent, it’s extremely rare to find all of the above skills in just one person. Such people do exist, but they are very tough to hire. So how do you overcome such an obstacle? By building a data science team rather than hiring a single data scientist. It is more work and might seem like a big investment, but it is definitely worth it.
Defining the Data Science Problem
To define a data scientist's scope, first come to terms with what you actually want from the data scientist. A good rule of thumb: if you want to see some dashboards, or build a glorified Excel sheet that automates certain calculations, you don’t really need a data scientist.
As mentioned, a data scientist brings a problem-solving mindset, so defining the data science problem is the most important step in hiring a data scientist.
Here are some examples of problems that could be solved with Data Science.
A good data science problem is an optimization problem, in which one is trying to find the optimal balance between two variables:
How can I best allocate my merchandise across all my branches to maximize sales?
What is the best price point to increase my revenues?
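As a rough illustration of a pricing question framed as an optimization problem, the sketch below searches a list of candidate prices for the one with the highest revenue. The linear demand curve and its numbers (`base_demand`, `sensitivity`) are made-up assumptions for the example; in a real project the demand model would be estimated from sales data.

```python
def demand(price, base_demand=1000.0, sensitivity=8.0):
    """Hypothetical linear demand curve: units sold fall as the price rises."""
    return max(base_demand - sensitivity * price, 0.0)


def best_price(candidate_prices):
    """Return the candidate price with the highest revenue (price * units sold)."""
    return max(candidate_prices, key=lambda p: p * demand(p))


# Candidate prices from 0.5 to 125.0 in steps of 0.5 (arbitrary currency units).
candidates = [step * 0.5 for step in range(1, 251)]
price = best_price(candidates)
revenue = price * demand(price)
print(price, revenue)  # prints "62.5 31250.0" under the assumed demand curve
```

The balance here is between price and volume: a higher price earns more per unit but sells fewer units, and the search finds the point where the product of the two peaks.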
Another problem type is what we define as a recommendation problem:
Which product feature could be upsold to which customers to increase sales?
What products could be bundled together to liquidate more cash?
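One simple starting point for the bundling question is to count how often pairs of products appear in the same basket. The sketch below does this with made-up baskets; the product names and data are purely illustrative, and a real recommendation model would go well beyond raw co-occurrence counts.

```python
from collections import Counter
from itertools import combinations

# Made-up baskets for illustration; real input would come from sales records.
baskets = [
    {"coffee", "milk", "sugar"},
    {"coffee", "milk"},
    {"tea", "sugar"},
    {"coffee", "milk", "biscuits"},
    {"tea", "biscuits"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of products."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives each pair a single canonical order, e.g. (coffee, milk).
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


top_pair, times_bought_together = pair_counts(baskets).most_common(1)[0]
print(top_pair, times_bought_together)  # prints "('coffee', 'milk') 3"
```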
There are also scoring problems, in which you can use machine learning to score customers for sales or candidates for hiring:
Which customers are most likely to buy?
Which customers can be eligible for a promotion, loan or other special offers?
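To give a feel for what a score looks like, here is a minimal sketch that turns two customer attributes into a number between 0 and 1 with a logistic function and ranks customers by it. The feature names, weights, and bias are hypothetical; in practice they would be learned from historical data by a machine learning model rather than typed in by hand.

```python
import math

# Hypothetical weights; a real model would learn these from past purchases.
WEIGHTS = {"visits_last_month": 0.4, "past_purchases": 0.8}
BIAS = -3.0


def purchase_score(customer):
    """Map a customer's features to a score between 0 and 1 (logistic function)."""
    z = BIAS + sum(WEIGHTS[name] * customer[name] for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


# Made-up customers for illustration.
customers = {
    "A": {"visits_last_month": 1, "past_purchases": 0},
    "B": {"visits_last_month": 5, "past_purchases": 3},
}

# Rank customers from most to least likely to buy, according to the score.
ranked = sorted(customers, key=lambda c: purchase_score(customers[c]), reverse=True)
print(ranked)  # prints "['B', 'A']"
```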
These are some of the problems where a data scientist or a data science team can be most effective. This is, of course, very different from just reporting dashboards or Excel functions.
Certainly, after framing the problem and building the algorithm, dashboards for monitoring and tracking performance can be commissioned accordingly. Still, the dashboard is definitely not the end; it's always the means.
Founder & CEO of Synapse Analytics on how AI can be used in Retail. Synapse is a data science and AI company based in Egypt.