The course syllabus contains changes
Course syllabus adopted 2020-02-20 by Head of Programme (or corresponding).
Overview
- Swedish name: Tekniker för storskalig datahantering
- Code: DAT346
- Credits: 7.5 credits
- Owner: MPDSC
- Education cycle: Second-cycle
- Main field of study: Computer Science and Engineering, Software Engineering
- Department: COMPUTER SCIENCE AND ENGINEERING
- Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail
Course round 1
- Teaching language: English
- Application code: 87112
- Maximum participants: 100
- Block schedule
- Open for exchange students: No
- Open only to students who have the course round in their programme overview.
Credit distribution
Module | Sp1 | Sp2 | Sp3 | Sp4 | Summer | Not Sp | Examination dates |
---|---|---|---|---|---|---|---|
0119 Examination, 4 credits, Grading: TH | 4 credits |
0219 Written and oral assignments, 3.5 credits, Grading: UG | 3.5 credits |
In programmes
- MPALG - COMPUTER SCIENCE - ALGORITHMS, LANGUAGES AND LOGIC, MSC PROGR, Year 1 (elective)
- MPDSC - DATA SCIENCE AND AI, MSC PROGR, Year 1 (compulsory elective)
Examiner
- Graham Kemp
- Professor, Data Science and AI, Computer Science and Engineering
Eligibility
General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.
Specific entry requirements
English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.
Course specific prerequisites
At least 15 credits in programming and at least 7.5 credits in databases, e.g. TDA357 Databases.
Aim
The aim of this course is to deepen the students' knowledge and skills and to familiarize them with the technical and technological side of data science, including relevant data models and software and hardware environments.
Learning outcomes (after completion of the course the student should be able to)
On successful completion of the course the student will be able to:
Knowledge and understanding
- discuss important technological aspects when designing and implementing analysis solutions for large-scale data,
- describe index structures and discuss their utility,
- describe data models and software standards for sharing data on the web.
- implement applications for transforming and analyzing large-scale data with appropriate software frameworks,
- provide access to and utilize structured data over the web with appropriate data models and software tools.
- suggest appropriate computational infrastructures for analysis tasks and discuss their advantages and drawbacks,
- discuss mechanisms for concurrency and recovery in database systems,
- discuss the efficiency of query plans,
- discuss large-scale data processing from an ethical point of view.
Content
In particular, the course will include
- an overview of computer architectures, algorithmic approaches, and high-performance computing infrastructures with a focus on limitations for processing large-scale data,
- an introduction to relevant frameworks for cluster computing with large-scale data,
- implementation of data analysis tools on a cluster using Python and appropriate software frameworks,
- index structures, query processing and optimisation; concurrency, recovery,
- an overview of non-relational database technologies,
- semantic web and related technologies,
- an overview of ethical questions regarding large-scale data, e.g. with respect to licenses, accessibility, and anonymisation.
Organisation
Lectures, computer lab sessions, and exercise sessions.
Literature
Course literature will be announced no later than 8 weeks before the start of the course.
Examination including compulsory elements
The course is examined by an individual written exam carried out in an examination hall, as well as mandatory written assignments, some of which will be carried out individually and some of which will be carried out in groups of up to 4 students. There will be non-obligatory individual assignments which grant bonus points for the written exam. These bonus points are valid for the whole academic year.
The course syllabus contains changes
- Changes to course rounds:
- 2020-11-05: Maximum number of participants changed from 30 to 100 by PA [Course round 1]
- 2020-01-13: Examiner changed from Alexander Schliep (schliep) to Graham Kemp (kemp) by Vice Head of Department [Course round 1]