Scaling Data Science
Being the first Data Scientist (DS) at a startup is exciting, but it comes with many challenges: navigating data infrastructure and data engineering staffing, balancing proper modeling against ad hoc analytics, translating findings for business partners, and proving one’s own worth. The next step is equally challenging: advocating for a larger DS team, attracting the right talent, and motivating the team while continuing to strive for rigor. How do you do it with grace, confidence and impact? This post provides an overview of common challenges and approaches to scaling a Data Science team as a startup grows, and offers tips on finding the confidence to step up to the challenge and be your own best advocate.
1. Congratulations! You are the first data scientist at a startup. Now what?
Data scientists love rigorous statistical modeling. But is that what an early-stage startup really needs to grow its business rapidly? Sometimes a quick A/B test or a dashboard is significantly more impactful. Ultimately, what matters is the business problem we are trying to solve, not the method. At times, to establish trust and credibility, a DS might need to identify and solve fundamental problems, such as the reliability of A/B tests or of the experimentation platform as a whole. Having the confidence to go big and disrupt an existing paradigm is sometimes the best way to earn the trust and credibility you need to build that much-needed statistical model (which, at first, only you know is needed!).
2. Woohoo! You proved you know what you are doing! How can you be most effective?
Many experienced data scientists are surprised to find that data at startups is often messy. Sometimes columns are named colum1, colum2, etc. Sometimes data abruptly disappears. Sometimes bots and spam activity distort findings. An important step for any early DS team is to spend a lot of time meeting with partners in Data Infrastructure and Engineering, Sales, and Finance to understand where the data comes from and how it is maintained. It is important to reach consensus on the source of truth first, and only then to build the right data pipeline. Documentation and code reviews are also critical: they help team members share their knowledge and avoid duplicating effort.
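The three kinds of mess described above (placeholder column names, data that abruptly disappears, bot activity) can each be turned into a small automated check. The following is only a minimal sketch in plain Python: the function names, the naming pattern, and the threshold of 1000 events per user per day are illustrative assumptions, not something prescribed by this post, and would need tuning against your own data.

```python
import re
from datetime import date, timedelta

# Assumed pattern for auto-generated names such as "colum1" or "column_2".
PLACEHOLDER_NAME = re.compile(r"^col(um)?n?_?\d+$", re.IGNORECASE)


def placeholder_columns(columns):
    """Return column names that look auto-generated rather than descriptive."""
    return [c for c in columns if PLACEHOLDER_NAME.match(c)]


def missing_days(days_seen, start, end):
    """Return every date in [start, end] for which no rows were seen."""
    seen = set(days_seen)
    span = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(span))
    return [d for d in all_days if d not in seen]


def suspected_bots(events_per_user, max_events_per_day=1000):
    """Return user ids whose daily event count exceeds an assumed threshold."""
    return sorted(u for u, n in events_per_user.items() if n > max_events_per_day)
```

Checks like these do not replace the conversations with partner teams; they mainly give those conversations something concrete to start from.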
Here are a few tips:
– Build a strong relationship with different partners and understand their needs
– Provide key stakeholders with a weekly priority list to ensure alignment
– Develop procedures and checklists for recurring efforts, such as A/B tests
– Always ask the five whys before writing code
– Present results at the level of detail that business partners require and understand
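As one example of what an A/B test checklist item could look like in code, here is a sketch of a sample ratio mismatch check: it tests whether the observed split between control and treatment is consistent with the intended split, using a chi-square test with one degree of freedom. The choice of this particular check, the function name, and the 0.001 alert threshold are assumptions made for illustration; the post itself does not specify which checks belong on the list.

```python
import math


def sample_ratio_mismatch(n_control, n_treatment,
                          expected_treatment_share=0.5, alpha=0.001):
    """Chi-square check that the observed split matches the intended split.

    Returns (chi2, p_value, flagged). `alpha` is an assumed alert threshold.
    """
    total = n_control + n_treatment
    expected_treatment = total * expected_treatment_share
    expected_control = total - expected_treatment
    chi2 = ((n_control - expected_control) ** 2 / expected_control
            + (n_treatment - expected_treatment) ** 2 / expected_treatment)
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value, p_value < alpha


# Example: a 5000 / 5500 split on an intended 50/50 test gets flagged.
chi2, p_value, flagged = sample_ratio_mismatch(5000, 5500)
print(f"chi2={chi2:.2f}, flagged={flagged}")
```

A flagged result suggests a problem with assignment or logging, so it is usually worth investigating before reading anything into the experiment's metrics.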
3. Great work on demonstrating impact! Now how do you scale a DS team?
Given that the data science skill set is in such high demand today, attracting a qualified data scientist is no small task. What is the secret to successfully hiring DS candidates? Once again, demonstrating the impact that data scientists are able to deliver, combined with quality mentorship, is one of many effective approaches. The same combination also helps retain high-quality team members once they are hired.
Finally, building the initial DS team takes significant discipline and an understanding of business partners’ needs. Having an initial DS team that is motivated by impact rather than beautiful models is critical as a startup matures. Impact can come from jumping in to build a highly desired dashboard or to run and analyze a critical A/B test. Showing the team the impact of its work is important for motivation, while giving team members opportunities for self-development through modeling or learning new skills is critical for retention. Balancing the two is more an art than a science.
Once an initial DS team demonstrates its versatility and broad skill set, the case can be made for a larger team. A key consideration here is to scale the data infrastructure and engineering teams, as well as quantitative analysts, in parallel.