2 Introduction

This chapter covers some of the what and why behind the Data Science Practice course.

2.1 Motivation

The Master of Data Science and Innovation (MDSI) program takes students from a wide range of backgrounds and experiences, and focuses on developing them into leaders across the whole spectrum of data science. Until now, the core courses have been a mix of essential technical knowledge (statistics, machine learning) and data-focused professional skills (narrative, project management, innovation, decision making). With an increasing number of graduates entering the workforce, it has become clear that there is a technical gap in the MDSI offering, and that students (especially those without a computing background) need a new course to equip them with the practical skills required for “Data Scientist” roles across most industries.

This course has been developed to prepare students for their first day on the job, and aims to build on the UTS reputation for producing job-ready graduates with experience in modern, relevant technologies.

2.2 Style

This course covers key skills and technologies that form part of many data scientist roles in Australia in 2019. The skills and technologies included were selected in consultation with practicing data scientists and through a review of job advertisements for Data Scientist positions.

This course will (at times) present strong opinions about the practice of data science, and this is intentional. These opinions are based on observations of the skills that make people successful in Data Scientist roles; however, they will necessarily provide a one-sided view of life as a practicing Data Scientist. Students who want to get the most out of this course should make an effort to build their own networks within the data science community and seek out multiple opinions about the skills required to practice data science effectively.

For topics where effective teaching materials are already available online, the course will direct students to these existing materials rather than providing a rewritten (and likely inferior) set of notes. In particular, this means that a number of topics will be taught using DataCamp, and students will be provided with access to the service to enable this. These external modules are compulsory but will not form part of the assessment.

Where time and budget permit, we will seek to bring external practicing Data Scientists into the course for guest lectures and discussion sessions, to provide alternative opinions and reinforce commonalities.

2.3 Format

The course will be delivered over four eight-hour Saturday classes, spread out across the semester. The three assignments will also be spread across the semester, but we will endeavour to provide flexibility where there is an opportunity for students to use major projects from other subjects as a case study for Data Science Practice (DSP). For example, if a student were working with code as part of a group assignment in another course, and that project were a suitable way to demonstrate the required learnings for Assessment 1, we would consider flexibility in the DSP due dates to allow that student to apply their learnings to a real project rather than a synthetic one. The same flexibility may apply to projects in the workplace, or to iLab projects.

As far as practicable, all course materials will be available to students without charge. These course notes are released under a Creative Commons license and will remain free and available following the course. Course readings will give preference to free material, and most online services used in teaching the course (Bitbucket, GitHub, DataCamp, etc.) are available free of charge under student licenses.

Where practical, the course will be delivered directly from these course notes to minimise the number of places students need to search for information. Where slide decks are required, including from external speakers, they will be provided to students as soon as possible.

2.4 Questions

These course notes will evolve over time in response to student questions. Please submit questions about the course materials as Bitbucket Issues against the course notes repository. You can also ask questions directly within the course notes using Disqus comments (they should appear below), although you will need to create a Disqus account to do this.