Michael Siers
blog-30-March-18

Why Breadth of Knowledge is so Important as a Data Scientist

Jack of all trades, master of some.

We must strike a balance between depth and breadth of knowledge in order to maximize our creative potential.
Frans Johansson, The Medici Effect

Here's an oversimplified view of how data scientists derive business value. It takes two abilities: A) using data science tools, and B) identifying places to use them. The more time I've spent in this field, the more I've realised that roughly 80% of the business value I've derived has come from the latter. That's because, as the data scientist, it's your responsibility to identify and communicate opportunities for applying data science to business problems. This post briefly describes the concept that both abilities rely on heavily: Breadth of Knowledge.

Breadth of Knowledge

As mentioned in the introduction, in my experience about 80% of business value comes from identifying avenues to apply data science tools to business problems. Actually, not just identifying these ideas, but communicating them too. The identification comes naturally to a data scientist who has a diverse set of projects under their belt. For example, suppose you've done projects in the following applied machine learning areas:

  • Time series analysis
  • Customer segmentation using clustering
  • Market basket analysis

Now suppose you're working as a data scientist at a company. You will likely spot opportunities to apply these techniques at work. However, you may be missing opportunities in other applied machine learning areas, such as:

  • Anomaly detection
  • Knowledge discovery using decision trees
  • Customer churn prediction
  • Demand forecasting
  • Margin forecasting
  • Self-service bots
  • Document classification
  • Intelligent data auditing
  • ... and many more...

In summary, the more applied machine learning (ML) and data science (DS) you've been exposed to, the more opportunities you are going to identify at work. As a result, you'll derive more business value.

So the question becomes: how do I get exposure to these areas? Online courses and pet projects are good, but I've had spectacular success with another approach. Read case studies, blog posts, forum discussions, etc. Eventually you are going to hear about areas of applied ML or DS that sound really interesting.

For example, I had recently become really interested in text mining, but I'd never worked on a full-scale text mining project. I skimmed some information about it: the wiki page, some 5-minute YouTube videos, a couple of blog posts. I asked what sort of text data we had at work. After finding some, I communicated that there was room for improvement using text mining techniques. I bought and read a book (20 minutes a day at work), and the next thing I knew I was working on a text mining project. After finishing the project, I fully intend to rinse & repeat the process, building up my exposure to different areas of applied ML & DS.

Final Notes

Want to derive as much business value from DS & ML as possible? Get exposure to as many areas of ML and DS as you can. It's your responsibility as a data scientist to introduce DS & ML concepts to your company wherever value can be derived.