Managing, curating, and assuring the quality of data is an essential part of the data scientist’s role. This module will introduce students to the central concepts and tools needed for the effective management of data. Topics covered will include database systems, data cleansing, quality assurance, and data provenance. The module will also address legal and ethical issues associated with data storage, including the requirements of data protection legislation.
Upon successful completion of this module you should be able to:
Design and implement databases using appropriate technology.
Work effectively with existing datasets and databases.
Develop and apply a set of quality criteria for a dataset, using appropriate data cleansing tools and techniques to curate the data.
Understand and be able to apply the core principles of relevant legal frameworks to ensure that data is being held in accordance with relevant legislation.
Mondays and Tuesdays 4:00pm-4:50pm. My office hours will take place in my office: School of Computer Science, room 215.
Assignments. Students are allowed, and even encouraged, to discuss assignment briefs, but must not share solutions, as this constitutes cheating. Likewise, using ChatGPT or similar software is prohibited by the University regulations and code of conduct and will not be tolerated.
Exam. No collaboration is allowed.
Students are expected to attend all scheduled lectures. You are encouraged to attend all the labs and participate in the in-class activities.
All module materials can be downloaded from the links below, and are organised week by week.