Course syllabus for Techniques for large-scale data

The course syllabus contains changes

Course syllabus adopted 2020-02-20 by Head of Programme (or corresponding).

Overview

  • Swedish name: Tekniker för storskalig datahantering
  • Code: DAT346
  • Credits: 7.5 credits
  • Owner: MPDSC
  • Education cycle: Second-cycle
  • Main field of study: Computer Science and Engineering, Software Engineering
  • Department: COMPUTER SCIENCE AND ENGINEERING
  • Grading: TH - Pass with distinction (5), Pass with credit (4), Pass (3), Fail

Course round 1

  • Teaching language: English
  • Application code: 87112
  • Maximum participants: 100
  • Block schedule
  • Open for exchange students: No
  • Only students with the course round in the programme overview.

Credit distribution

0119 Examination 4 c
Grading: TH
  • 02 Jun 2021 am J
  • 09 Oct 2020 am J
  • 20 Aug 2021 pm J
0219 Written and oral assignments 3.5 c
Grading: UG

In programmes

Examiner

Eligibility

General entry requirements for Master's level (second cycle)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Specific entry requirements

English 6 (or by other approved means with the equivalent proficiency level)
Applicants enrolled in a programme at Chalmers where the course is included in the study programme are exempted from fulfilling the requirements above.

Course specific prerequisites

At least 15 credits in programming and at least 7.5 credits in databases, e.g. TDA357 Databases.

Aim

The aim of this course is to deepen the students' knowledge and skills and to familiarize them with the technical and technological side of data science, including relevant data models and both software and hardware environments.

Learning outcomes (after completion of the course the student should be able to)

On successful completion of the course the student will be able to:

Knowledge and understanding
  • discuss important technological aspects when designing and implementing analysis solutions for large-scale data,
  • describe index structures and discuss their utility,
  • describe data models and software standards for sharing data on the web.
Skills and abilities
  • implement applications for transforming and analyzing large-scale data with appropriate software frameworks,
  • provide access to and utilize structured data over the web with appropriate data models and software tools.
Judgement and approach
  • suggest appropriate computational infrastructures for analysis tasks and discuss their advantages and drawbacks,
  • discuss mechanisms for concurrency and recovery in database systems,
  • discuss the efficiency of query plans,
  • discuss large-scale data processing from an ethical point of view.

Content

In particular, the course will include
  • an overview of computer architectures, algorithmic approaches, and high-performance computing infrastructures with a focus on limitations for processing large-scale data,
  • an introduction to relevant frameworks for cluster computing with large-scale data,
  • implementation of data analysis tools on a cluster using Python and appropriate software frameworks,
  • index structures, query processing and optimisation; concurrency, recovery,
  • an overview of non-relational database technologies,
  • semantic web and related technologies,
  • an overview of ethical questions regarding large-scale data, e.g. with respect to licenses, accessibility, and anonymisation.

Organisation

Lectures, computer lab sessions, and exercise sessions.

Literature

Course literature will be announced at the latest 8 weeks before the start of the course.

Examination including compulsory elements

The course is examined by an individual written exam held in an examination hall and by mandatory written assignments, some carried out individually and some in groups of up to 4 students. Optional individual assignments grant bonus points for the written exam. These bonus points are valid for the whole academic year.

The course syllabus contains changes

  • Changes to course rounds:
    • 2020-11-05: Max number of participants changed from 30 to 100 by PA
      [Course round 1]
    • 2020-01-13: Examiner changed from Alexander Schliep (schliep) to Graham Kemp (kemp) by Vice Head of Department
      [Course round 1]