Health Data Lab
The lab was formerly called the Biological Data Processing Systems (BDPS) Lab.
The big data era in molecular biology has created exciting potential for novel biological discoveries, but also exciting challenges for computer scientists in data management, analysis, and interpretation. Especially exciting are the possibilities in combining omics data with medical images and phenotype variables. In the coming decades, sophisticated bioinformatics and machine learning methods and frameworks will therefore be developed to analyze and explore the information in the integrated data. However, the size and complexity of the datasets require novel infrastructure systems, analysis approaches, and data exploration tools targeted at such complex health datasets.
Our goal is to provide the systems, methods, and tools needed to analyze and interpret complex health datasets. Our research interests are threefold. First, build and experimentally evaluate infrastructure systems for bioinformatics and machine learning analyses. Second, apply bioinformatics, statistics, and machine learning methods for novel health data analyses. Third, build and evaluate data exploration and interpretation tools. All our research is interdisciplinary. We therefore combine experimental computer science with real problems, applications, and data obtained from our biomedical research collaborators.
We also contribute to research infrastructure development and operation, commercialization of our research, and many outreach activities.
We participate in several large and small projects.
The Norwegian Women and Cancer (NOWAC) biobank contains time series with questionnaire data from 170 000 women and more than 60 000 blood samples. The biobank is analyzed using several omics technologies including microarrays, RNA-seq, methylation, and mass spectrometry. The data is being analyzed by the Systems Epidemiology group led by Professor Torkjel Sandanger at the Department of Community Medicine, University of Tromsø. Our responsibility in the project is to build a backend for standardized data analysis pipelines, machine learning based data analysis, and a system for exploration and visualization of the analysis results. We are using these as building blocks for a platform for swift exploration of the data under different epidemiological designs.
High North Population Studies
We are members of the strategic initiative High North Population Studies at UiT, which combines epidemiological research and computer science to collect, analyze, and utilize the data from population studies at UiT. Our contributions are methods to uncover complex cross-level interactions in large heterogeneous population-study datasets, a framework for exploration of metagenomics data integrated with host genomics and phenotype data, and the development and operation of infrastructure for bioinformatics analyses on sensitive data.
In the Tromsø Lung Sounds project we are building a database with more than 18 000 lung sound recordings. The recordings are made as part of Tromsøundersøkelsen 7, which is an epidemiological study that was started in 1974. The database will be used to provide educational and analysis services for lung sounds. Our contributions are methods for automated classification and similarity search for the sounds. This project is done in collaboration with Hasse Melbye at the Department of Community Medicine, University of Tromsø. The results from this project are further developed by our Medsensio AS startup.
In the new Center for New Antibacterial Strategies (CANS) we are responsible for bioinformatics analyses and systems biology.
In the air:bit air pollution project we have developed educational projects for use in Norwegian high schools. This work is done in collaboration with Skolelaboratoriet i realfag og teknologi. We provide build instructions, programming guides, and a portal for data analysis and live visualization. Air:bit has been used by 13 high-school classes in Northern Norway.
Kodeklubben Tromsø is an after-school club for kids and youth who want to learn about computer programming. Together with volunteers from local tech companies and students from our department, members of our lab organize the activities. Each semester we run a 10-week program with two-hour classes each week. In the spring of 2016 we had over 130 kids aged 7 to 17 attending the club.
We are collaborating with Associate Professor Kristian Svendsen on analysis of adverse drug effect data, and with The Norwegian Historical Data Centre on transcription of Norwegian handwritten census books.
The lab currently consists of:
|Lars Ailo Bongo||Professor||Principal investigator||Homepage | GitHub | Bitbucket|
|Edvard Pedersen||Associate Professor||Homepage | GitHub|
|Einar Holsbø||Postdoc||Population studies in the north||Homepage | GitHub|
|Morten Grønnesby||Ph. D. student||NOWAC||Homepage | GitHub|
|Jo Inge Arnes||Ph. D. student||NOWAC||Homepage | GitHub|
|Rafael Nozal Cañadas||Ph. D. student||Population studies in the north||Homepage | GitHub|
|Tengel Skar||Master student||Adverse drug effect data||Homepage | GitHub|
|Nikita Shvetsov||Scientific staff||Population studies in the north||Bitbucket|
Former lab members are:
|Tengel Ekrem Skar||Master student, 2019, CS, UiT||Thesis: Scalable exploration of population-scale drug consumption data and Source code.|
|Mayeul Marcadella||Technical staff||ELIXIR.|
|Aleksandr Agafonov||Technical staff||ELIXIR.|
|Dr. Bjørn Fjukstad||PhD student, 2018, CS, UiT||Thesis: "Toward Reproducible Analysis and Exploration of High-Throughput Biological Datasets." and source code for Kvik and walrus.|
|Tim Alexander Teige||Master student, 2018, CS, UiT||Thesis: Auto scaling framework, simulator, and algorithms for the META-pipe backend and Source code.|
|Nina Angelvik||Master student, 2018, CS, UiT||Thesis: Data management platform for citizen science education projects. Source code: Backend and Frontend.|
|Mike Voets||Master student, 2018, CS, UiT||Thesis: Deep Learning: From Data Extraction to Large-Scale Analysis. Source code: replication study and DICOM anonymizer.|
|Inge Alexander Raknes||Technical staff||ELIXIR.|
|Rigmor Katrine Johansen||Intern, 2017||Bioethics, NOWAC|
|Johan Ravn||Master student, 2017, CS, UiT||Thesis: Detection of Wheezes and Breathing Phases using Deep Convolutional Neural Networks.|
|Dr. Giacomo Tartari||Technical staff||ELIXIR|
|Frode Opdahl||Master student, 2016, CS, UiT||Project: Virtual reality.|
|Jarl Fagerli||Master student, 2015, CS, UiT||Thesis: COMBUST I/O. Abstractions facilitating parallel execution of programs implementing common I/O patterns in a pipelined fashion as workflows in Spark. Thesis. Source code.|
|Kenneth Knudsen||Master student, 2015, CS, UiT||Thesis: Freia: Exploring Biological Pathways Using Unity3D. (Thesis, Source code, Demo)|
|Ove Kåven||Master student, 2015, CS, UiT||Thesis: Multiparadigm Optimizing Retargetable Transdisciplinary Abstraction Language (Thesis, Source code)|
|Ida Jaklin Johansen||Technical staff||Elixir.no|
|Martin Ernstsen||Master student, 2013, CS, UiT||Thesis: Mario - A system for iterative and interactive processing of biological data (Thesis, Source code)|
|Terje André Johansen||Master student, 2011, CS, UiT||Thesis: A scalable, interactive widget library for visualizing biological data (Thesis, Paper)|
|Torje Henriksen||Master student (main-advisor Otto J. Anshus, co-advisor Phuong Hoai Ha), 2008, CS, UiT||Thesis: Efficient intra-node Communication for Chip Multiprocessors|