Essential Skills for Career Success in Data Science
What skills does it take to be successful as a Data Scientist? We get asked this question many times a day by aspiring analysts, engineers, economists, and those at a late stage of their formal education. It seems that a tremendous number of highly computer-literate professionals and academics are looking to identify the full complement of prerequisite skills, whilst also seeking a point of entry that will enable them to establish a career as an authentic data scientist.
In this article we will outline the skills you will need to pursue a career in data science, and how companies and organisations today are shaping their analytics programmes around this ever-evolving field.
Despite the headline demand from industry and commerce for data scientists, competition for highly coveted roles with multinational brand names and leading organisations is fierce. Aspiring data scientists who combine a passion for solving problems with a commitment to high-level learning will establish an early advantage over the competition.
Getting down to basics, there is a huge array of skills that you will need to acquire in order to self-identify, and be recognised, as a bona fide ‘data scientist’. For a start, you will need core programming skills in order to manipulate and hack data; without them you cannot expect to go beyond rudimentary ‘point and click’ exercises, simple visualisations, and basic models. Make no mistake, high-level programming skills are essential for data extraction and ETL (Extract, Transform, Load: the process of moving big data into a data warehouse). Without them you cannot dive into the depths of databases to extract and manipulate data for precise insight.
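To make the ETL idea concrete, here is a minimal sketch in Python using only the standard library. The order data, the `orders` table and every function name are made up for illustration, and an in-memory SQLite database stands in for a real data warehouse; a production pipeline would look rather different.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a file or an API.
RAW_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,
3,alice,4.25
"""


def extract(text):
    """Extract: parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows with a missing amount and convert the types."""
    cleaned = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        cleaned.append((int(row["order_id"]), row["customer"], float(row["amount"])))
    return cleaned


def load(rows, conn):
    """Load: write the cleaned rows into a table (our stand-in warehouse)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three steps end to end against an in-memory database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(RAW_CSV)), conn)
    return conn


def totals_by_customer(conn):
    """Query the loaded data: total order amount per customer."""
    query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
    return conn.execute(query).fetchall()
```

Calling `totals_by_customer(run_etl())` returns one row per customer with a complete order. Keeping the three steps as separate functions is a design choice that makes each one easy to test on its own.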
You will need robust database management skills that go beyond Relational Database Management Systems (RDBMS). You should be familiarising yourself with NoSQL and its four main flavours of data management (key-value, column-oriented, document, and graph-based), which you can choose between according to your requirements.
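As a rough illustration of how the four NoSQL flavours differ, the sketch below models the same kind of user data in each shape using plain Python structures. These are not real database clients; the records, keys and helper functions are all hypothetical, and actual products vary considerably in how they implement each model.

```python
import json

# 1. Key-value: an opaque value fetched by a single key.
key_value_store = {"user:42": json.dumps({"name": "Ada", "city": "London"})}

# 2. Column-oriented (wide-column): row key -> column family -> column -> value.
column_store = {
    "user:42": {
        "profile": {"name": "Ada", "city": "London"},
        "stats": {"logins": 7},
    }
}

# 3. Document: self-describing, nested records that can be queried by field.
document_store = [
    {"_id": 42, "name": "Ada", "city": "London", "skills": ["python", "sql"]},
    {"_id": 43, "name": "Grace", "city": "New York", "skills": ["cobol"]},
]

# 4. Graph: nodes plus explicit relationships (edges) between them.
graph_edges = [
    ("Ada", "FOLLOWS", "Grace"),
    ("Grace", "FOLLOWS", "Ada"),
    ("Ada", "FOLLOWS", "Linus"),
]


def find_documents(field, value):
    """Document-style query: match records on the value of one field."""
    return [doc for doc in document_store if doc.get(field) == value]


def neighbours(node, relation):
    """Graph-style query: follow one type of edge outwards from a node."""
    return [dst for src, rel, dst in graph_edges if src == node and rel == relation]
```

The point to notice is the access pattern each shape favours: a single lookup by key, reading one column family at a time, querying on any field, or walking relationships.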
In addition, you need well-developed statistics skills. In order to competently use frameworks such as TensorFlow and Apache Spark to analyse big data, it is essential to understand the statistical theory behind them. This leads to the study of statistical learning, a theoretical framework for machine learning that draws on the fields of statistics and functional analysis. You need to understand the ideas behind the various statistical techniques in order to know how and when to use them. You should familiarise yourself with Linear Regression, Classification, Resampling Methods, Subset Selection, Shrinkage, Dimension Reduction, Nonlinear Models, Tree-Based Methods, Support Vector Machines, and Unsupervised Learning.
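Taking the first technique on that list as an example, here is a minimal sketch of simple linear regression fitted by ordinary least squares in plain Python. The function names and the toy data are made up for illustration; in practice you would normally reach for a statistics or machine learning library, but writing it out once shows what the library is doing for you.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of x and y divided by the variance of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predict y for a new x using the fitted line."""
    return slope * x + intercept


# Toy data that lies exactly on the line y = 2x + 1.
slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

On real, noisy data the fitted line will not pass through every point, which is where the other topics on the list, such as resampling methods, help you judge how far to trust the fit.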
You will also need mathematical programming, simulation and optimisation skills, backed by advanced maths and algebra. If this is something that you are looking to brush up on, the easiest approach is to tackle linear algebra and calculus by putting them to work on actual algorithms. The three maths components required for machine learning and data science are Linear Algebra (matrix algebra and eigenvalues), Calculus (derivatives and gradients), and Gradient Descent (the optimisation technique behind designing, building and implementing neural networks).
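To show how the calculus feeds into gradient descent, the sketch below minimises a deliberately simple one-dimensional function, f(w) = (w - 3)^2, whose derivative is 2(w - 3). The function, the learning rate and the step count are all assumptions picked for illustration; a real neural network applies the same update rule to many parameters at once.

```python
def loss(w):
    """The function to minimise; its lowest point is at w = 3."""
    return (w - 3.0) ** 2


def gradient(w):
    """Derivative of the loss with respect to w."""
    return 2.0 * (w - 3.0)


def gradient_descent(start, learning_rate=0.1, steps=100):
    """Repeatedly step against the gradient, starting from `start`."""
    w = start
    for _ in range(steps):
        w -= learning_rate * gradient(w)
    return w
```

Starting from `gradient_descent(0.0)`, each step shrinks the distance to the minimum by a constant factor, so the result ends up very close to 3. Try a learning rate above 1.0 to see the steps overshoot and diverge instead.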
Last, but by no means least, comes the skill of putting all of this together in the context of a problem or an opportunity. Being able to explain your work takes communication, and combining that with soft skills places a halo around the technical core. This is sometimes the missing link when companies are trying to secure the ‘finished article’: a 360° Data Scientist.
At graduate level it is a very exciting field to get into. For a successful career in data science you need to plan for success and bring something to the field that you care about, rather than just looking for a job. There needs to be a problem, an opportunity, or something that needs inventing that you care about deeply, to which you can apply the skillset and roll it out into the world. This demonstrates to the hiring managers you interview with that you are more than someone who runs code and builds models. It is important to talk about something that really impresses you, and then combine it with the skillset to show how you will implement your ideas. It is crucial to differentiate yourself from every other student graduating alongside you from highly regarded schools and programmes, so bring something personal.
The key message is to ‘find an angle’ and start tackling a problem, whether through open source collaboration projects or by looking for solutions to problems that you are passionate about. As long as this is your foundation for pursuing a career in data science, doors will open and opportunities will present themselves to you.