Introduction to digital resources

This course aims to give PhD students the tools to make more effective use of digital resources and techniques that support research and improve research quality. The curriculum covers tools, tips, and tricks that are industry standards both within and outside research. The topics are taught by experts at Chalmers e-Commons, Chalmers' digital research infrastructure.

Every topic is taught as a separate module that takes a full day. Every module is taught at an introductory level: the subject is introduced, and students are helped over the first hurdle of familiarizing themselves with the topic.

Course information

The total number of modules is 12, and 10 modules are required to pass the course.

If you would like to formally include the course in your graduate study program, you need to discuss this with your supervisor.

Planning

This course will take place in weekly modules during the fall of 2024. Every module lasts a full day. The first module will be taught on Friday, September 20th, and the last on Friday, December 6th.

For more information or to sign up, contact Leon Boschman (leon.boschman@chalmers.se).

Modules

This course consists of 12 modules.

Unix shell (Bash)

The Unix terminal is a powerful tool that is often used in computational science. Most supercomputers offer only a terminal environment, with no graphical user interface, so experience with the terminal is essential for working with HPC facilities. It is also necessary for running slightly more advanced analysis scripts.

At the end of the module, students will know some basic Bash commands and how to navigate using a terminal.

Basic Python

Python is a general-purpose programming language widely used in data analysis, scientific computing, and machine learning.  

This module aims to teach students the basics of Python and help them set up a Python environment on their own laptop.  

They will do this in a beginner-friendly Jupyter Notebook environment.  

At the end of the module, they will be able to write and execute a small notebook that uses functions, lists, dictionaries, etc.  
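
As a rough indication of that level, a notebook cell might look like the sketch below. It is a hypothetical example and not course material: the function name mean and the measurements data are made up for illustration. It combines a function, lists, and a dictionary.

```python
# Hypothetical example: the function name and the data are illustrative only.
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# A dictionary that maps sample names to lists of measurements.
measurements = {
    "sample_a": [1.0, 2.0, 3.0],
    "sample_b": [4.0, 6.0],
}

# Build a new dictionary holding the mean of each sample.
averages = {name: mean(values) for name, values in measurements.items()}

# Print one line per sample, with the mean rounded to two decimals.
for name, average in averages.items():
    print(f"{name}: {average:.2f}")
```

Running the cell prints one line per sample with its mean value.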

Structured data analysis

In this module, students will learn how to write a reusable data analysis in Python. They will learn the basics of pandas, NumPy, and SciPy to work with tabular and numerical data, as well as the basics of creating a reusable workflow.  

Data visualization

Students will be taught different data visualization strategies. They will learn which visualizations to use and how to make them accessible to people with colorblindness.  

They will also learn how to make visualizations in Python using industry-standard plotting libraries.

High performance computing (HPC)

In this module, students will learn the difference between computing on their laptop and computing on a high-performance computing cluster.  

They will learn about strategies to make optimal use of HPC clusters, and how they might use HPC in their own research project.

Research data management

Students will learn about best practices in research data management, including an approach to making research data FAIR (findable, accessible, interoperable, reusable). They will also learn about GDPR compliance and data life cycle management.   

Version control & collaboration

Students will learn about Git, a distributed version control system widely used for source code and other plain-text files. They will also learn about version control in a collaborative setting where multiple researchers work on the same files.

Writing readable code

Here we discuss how students can ensure that the code they write is easy to read, with clear and easy-to-follow logic. Readable code helps in getting consistent results from data analyses.

Using Python Notebooks for communication

Notebooks are a great tool for communicating science and scientific results. In this module we will teach students how to use the interactive capabilities of Jupyter Notebook as an effective means of communication.

Digital project management

In this module, students will learn how to work effectively on digital projects, with a focus on scientific data analysis. We will discuss software versioning, effective collaboration, and how to ensure that a project can be taken over by colleagues.

Introduction to machine learning & AI

We discuss the basics of machine learning and AI, and the kinds of problems that can be solved using these techniques. We will also discuss how students could use these methods in their own fields.  

Ethics of AI

The use of AI, and especially generative AI, comes with a host of ethical dilemmas. We will make students aware of these dilemmas and discuss how the dilemmas apply to the students' own work.