I recently came across this job description for a data scientist (anonymized to protect the innocent):
Translate business requirements into machine learning products.
Architect and build machine learning software products for our core business.
Communicate the strategy and rationale to business owners; define and execute a project plan.
Lead all aspects of ML automation including model training and development, feature selection and model tuning.
Develop production ML and data pipelines. Develop production code and ship to production environments.
While it is very clear why such a person would be hugely valuable to an organization, I know of very few individuals who would meet these requirements. It brings to mind the distinction between a software developer and a software architect. A software developer typically has a much more prescribed job function, defined by a specific development skill set and years of experience implementing certain types of software stacks. An architect, on the other hand, is a jack of many trades, equally able to perform deep technical tasks, do design, understand the business, communicate effectively with everyone from business owners to low-level developers, and be a part-time strategist, mentor and evangelist.
In the software world, this distinction is well understood. There are many software developer jobs but far fewer software architect jobs, and managers and recruiters all understand the complexity of hiring one versus the other. It is also very hard to know whether someone is good at such a complex role unless you have worked with them before. In the data science world, however, similar distinctions are still being developed. The net result is both difficulty in hiring data scientists and substantial turnover within the role itself.
Having worked in the AI field for the last five years, I have had my share of experiences with successful and unsuccessful data scientist hiring. One particularly grating scenario is when a senior data scientist joins the team and, after much struggle, the team and the data scientist are forced to part ways. The data scientist either cannot or does not want to write production code. The software developers are frustrated by the poor code quality, and the data scientist feels their algorithmic contributions are not appreciated. Other frustrations expressed by data scientists include being pigeonholed into data processing jobs where they cannot do the ML they enjoy, or not being able to relate to their counterparts (software and otherwise) on their teams.
Given some of these experiences, I have adopted the following process for hiring data scientists. Hopefully, it will help others along the same journey.
Four Stage Skill Evaluation
I break the data scientist skill set into the following buckets:
Do they understand the algorithms and underlying mathematics?
Have they ever used the algorithms to solve a practical problem with real data?
Have they ever built production code in an ML environment? Have they ever had to build an algorithm or a part of a system that executes one (like Spark, TensorFlow, etc.)?
Can they read and understand an improvement to an algorithm from a research paper, and apply it?
The first thing to understand as a manager is which subset of these four skills you need. People with all four are unicorns and, unless you are Facebook or Google, hard to hire in bulk. For my purposes, I have found that people with two or three of these skills work really well. For your environment, you may need even fewer. The fewer skills you actually need, the more likely you are to find someone good who will fit well within your team.
What Do They Want to Learn?
Given how ill-defined data science tasks still are, it is important to get a crystal clear understanding of what your candidate is looking to accomplish career-wise and make sure the job experience you are offering is a win-win. This is particularly true if you are hiring someone who scored three or more in the checklist above. Particular items I look at are:
If the job I am offering includes coding, do they want to code, or do they see it as a necessary evil? For roles that involve coding, I generally favor people with even limited programming skills who want to learn more about coding over more advanced coders who clearly do not enjoy that aspect of the work.
Do they want depth or breadth? Some would like to explore advanced algorithms. Others want to see their work in real-world use. Some want to learn more about computer science, etc.
How do they envision growing in the role? Do they want to learn and work with more advanced ML? Do they want to become a team lead? Do they want to have product impact? Can you support their growth ambitions?
Whom Will They Work With (and What Do They NOT Need to Know)?
Unless you intend to hire someone with all four skill sets above (and sometimes even then), it is important to pair the data scientist with other key roles for everyone's success. Does your team have an architect who can help define and optimize the use cases? Is there a product owner the data scientist can work with to help them get their work into real-world use?
This is also a good way to identify what complementary skills exist in your organization that your data scientist does not need to bring to bear. For example:
Is domain knowledge a requirement for your candidate, or can you pair them with a team member who can help there?
If your data scientist came from a statistics background and does not know software practices all that well (but wants to learn), can you train them?
This is the flip side of identifying bare minimum needs above; it can create a clear win-win, where there is growth for the candidate and a less competitive hiring field for you as a manager.
Other skills, like clear communication, are critical in this and every role. For a data scientist, however, it is especially important to be as clear as possible about what you need in the role, and to make sure that the path you envision matches your candidate's so that it is a win-win.
The role of data scientist is splitting into subfields such as ML researcher, ML engineer, decision scientist, and decision intelligence engineer. In each case, the goal is to isolate the specific mix of mathematics, practical use and coding skill required. As these roles become more clearly defined, it should become easier to match data scientists to roles. Until then, each manager needs to make the best individual match possible.
Nisha Talagala, contributor, is Co-Founder, CTO and VP, Engineering at ParallelM. Nisha has more than 15 years of expertise in software development, distributed systems, I/O solutions, persistent memory, and flash. Prior to ParallelM, Nisha was a Fellow at SanDisk and Fellow/Lead Architect at Fusion-io, where she drove innovation in non-volatile memory, in particular the industry’s first persistent memory solution. She was technology lead for server flash at Intel, where she led server platform non-volatile memory technology development, storage-memory convergence, and partnerships. Before joining Intel, Nisha was the CTO of Gear6, where she designed and built clustered computing caches for high performance I/O environments. Nisha earned her PhD at UC Berkeley with research on software clustering and distributed storage. Nisha holds 59 patents in distributed systems, networking, storage, performance and non-volatile memory and serves on multiple industry and academic conference program committees.