Introduction to digital resources

The course gives PhD students practical tools for using digital resources and techniques effectively in support of high-quality research. The content covers tools, tips, and methods that are standard practice in both academia and industry. The course is taught by experts from Chalmers e-Commons, Chalmers' digital research infrastructure.

The course consists of twelve separate modules, each covering a specific topic over the course of a full day. The modules are taught at an introductory level, focusing on providing an overview and helping participants take the first steps into each subject area. To pass the course, participation in at least ten of the twelve modules is required.

If you wish to formally include the course as part of your doctoral studies, this must be discussed with your supervisor.

Planning

The course is delivered in weekly modules during the autumn term of 2024. Each module takes a full day. The first module takes place on Friday, 20 September, and the final module on Friday, 6 December.

For more information and to sign up, contact Leon Boschman (leon.boschman@chalmers.se).

Modules

The course consists of the following twelve modules:

1. Unix Shell (Bash) 

The Unix terminal is a powerful tool that is widely used in computational science. Most supercomputers offer only a terminal environment, with no graphical user interface, so experience with the terminal is essential for working with HPC facilities. It is also needed to run more advanced analysis scripts.

By the end of the module, students will know some basic Bash commands and how to navigate using the terminal.

2. Basic Python

Python is an all-round programming language widely used in data analysis, scientific computing, and machine learning.

This module teaches students the basics of Python and helps them set up a Python environment on their own laptop. They will work in a beginner-friendly Jupyter Notebook environment.

By the end of the module, students will be able to write and execute a small notebook that uses functions, lists, dictionaries, and similar building blocks.
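As an illustration of the level this module works towards, the following is a minimal sketch of a notebook cell that combines a function, lists, and a dictionary. The sample names and values are hypothetical and are not taken from the course material.

```python
def mean(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# Hypothetical measurements: a dictionary mapping sample names to lists of values.
measurements = {
    "sample_a": [1.0, 2.0, 3.0],
    "sample_b": [4.0, 6.0],
}

# Build a new dictionary holding the mean of each sample.
averages = {name: mean(values) for name, values in measurements.items()}

for name, average in averages.items():
    print(f"{name}: {average:.2f}")
```

Running the cell prints one line per sample with its mean value.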

3. Structured data analysis

In this module, students will learn how to write a reusable data analysis in Python. They will learn the basics of pandas, NumPy, and SciPy for working with tabular and numerical data, and the basics of building a reusable workflow.

4. Data visualization

Students will be taught different data visualization strategies. They will learn which visualizations to use in which situation, and how to make visualizations accessible to people with color blindness.

They will also learn how to create visualizations in Python using industry-standard plotting libraries.

5. High performance computing (HPC)

In this module, students will learn the difference between computing on their laptop and computing on a high-performance computing cluster.

They will learn strategies for making optimal use of HPC clusters, and how they might use HPC in their own research projects.

6. Research data management

Students will learn best practices in research data management, including an approach to making research data FAIR (findable, accessible, interoperable, reusable). They will also learn about GDPR compliance and data life cycle management.

7. Version control & collaboration

Students will learn about Git, a distributed version control system widely used for source code and other plain-text files. They will also learn how to use version control in a collaborative setting where multiple researchers work on the same files.

8. Writing readable code

In this module, we discuss how students can make sure that the code they write is easy to read and follows a clear logic. Readable code is easier to check and reuse, which helps in getting consistent results from data analyses.
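To make the idea concrete, here is a small sketch that contrasts a terse function with a more readable version that behaves the same way. The temperature conversion is a hypothetical example chosen for illustration and is not taken from the course material.

```python
# Hard to read: cryptic names, unexplained numbers, no documentation.
def f(x):
    return [i * 1.8 + 32 for i in x if i > -273.15]


# Easier to read: descriptive names, a named constant, and docstrings.
ABSOLUTE_ZERO_CELSIUS = -273.15


def celsius_to_fahrenheit(celsius):
    """Convert one temperature from degrees Celsius to degrees Fahrenheit."""
    return celsius * 1.8 + 32


def convert_valid_temperatures(temperatures_celsius):
    """Convert temperatures above absolute zero and skip all other values."""
    return [
        celsius_to_fahrenheit(temperature)
        for temperature in temperatures_celsius
        if temperature > ABSOLUTE_ZERO_CELSIUS
    ]
```

Both versions return the same result for the same input, but the second one states its intent in the names and docstrings, so a colleague can check it without reverse-engineering the arithmetic.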

9. Using Python Notebooks for communication

Notebooks are a great tool for communicating science and scientific results. In this module, we will teach students how to use the interactive capabilities of Jupyter notebooks as an effective means of communication.

10. Digital project management

In this module, students will learn how to work effectively on digital projects, with a focus on scientific data analysis. We will discuss software versioning, effective collaboration, and how to make sure that a project can be taken over by colleagues.

11. Introduction to machine learning & AI

We discuss the basics of machine learning and AI, and the kinds of problems that can be solved with these techniques. We will also discuss how students could use these methods in their own fields.

12. Ethics of AI

The use of AI, and especially generative AI, comes with a host of ethical dilemmas. We will make students aware of these dilemmas and discuss how they apply to the students' own work.