Biological Data Processing Systems (BDPS)

Biological Data Processing Systems Lab

The big data era in molecular biology has created exciting potential for novel biological discoveries, but also exciting challenges for computer scientists in data management, processing, and visualization. In the coming decades, sophisticated bioinformatics methods and frameworks will be developed to analyze and explore the information in the data. However, these will require the development of novel infrastructure systems targeted at bioinformatics data and methods.

Our research goal is to build and experimentally evaluate infrastructure systems that support the methods under development by our biomedical collaborators. We are designing and implementing systems for big data management and processing, machine learning, interactive analysis, and interpretation. We are primarily interested in improving the scalability and usability of bioinformatics analysis methods and frameworks.

We combine experimental computer science with real problems, applications, and data obtained from our biology collaborators.

Projects

We participate in several large and small projects.

NOWAC

The Norwegian Women and Cancer (NOWAC) biobank contains time series with questionnaire data from 170 000 women and more than 60 000 blood samples. The biobank is analyzed using several omics technologies including microarrays, RNA-seq, methylation, and mass spectrometry. The data is being analyzed by the Systems Epidemiology group led by Professor Torkjel Sandanger at the Department of Community Medicine, University of Tromsø. Our responsibility in the project is to build a backend for standardized data analysis pipelines, machine learning based data analysis, and a system for exploration and visualization of the many NOWAC study designs.

We are also collaborating with Professors Lill-Tove Busund and Tom Dønnem from the Translational Cancer Research Group on multi-level and multi-tissue analysis and clinical use of data from NOWAC.

ELIXIR

ELIXIR is a large-scale European project to construct and operate a sustainable infrastructure to support life science research. One of the responsibilities of the Norwegian ELIXIR node, Elixir.no, is marine genomics. In the Norwegian node, the Center for Bioinformatics at the University of Tromsø is responsible for bioinformatics services and research in marine metagenomics. Our research is focused on building infrastructure systems for metagenomics analysis pipelines. In particular, we build and experimentally evaluate cloud-based infrastructure systems that provide more scalable, flexible, and portable data processing.

We participate in the "Marine metagenomic infrastructure as driver for research and industrial innovation" use case in the ESFRI ELIXIR-EXCELERATE infrastructure project, where we develop the META-pipe data analysis service for ELIXIR users and Norwegian users, and the Marine Metagenomics Portal, which provides marine reference databases. We were a partner in the ELIXIR Pilot Action "Marine metagenomics pilot – towards domain specific service".

The Tromsø Study and the Population Studies in the High North Initiative

In the lung sounds project we are building a database with more than 18 000 lung sound recordings. The recordings are made as part of Tromsøundersøkelsen 7, the seventh survey of the Tromsø Study, an epidemiological study that was started in 1974. The database will be used to provide educational and analysis services for lung sounds. Our contributions are methods for automated classification and similarity search for the sounds. This project is done in collaboration with Hasse Melbye at the Department of Community Medicine, University of Tromsø. The results from this project are being further developed by our startup, Medsensio AS.

We are also members of the new strategic initiative "Population Studies in the High North" at UiT, which combines epidemiological research and computer science to collect, analyze, and utilize data from population studies at UiT.

air:bit

In the air:bit air pollution project we have developed educational projects for use in Norwegian high schools. This work is done in collaboration with Skolelaboratoriet i realfag og teknologi. We provide build instructions, programming guides, and a portal for data analysis and live visualization. Air:bit has been used by 13 high-school classes in Northern Norway.

Other

Kodeklubben Tromsø is an after-school club for kids and youth who want to learn about computer programming. Together with volunteers from local tech companies and students from our department, members of our lab organize the activities. Each semester we run a 10-week program with a two-hour class each week. In the spring of 2016 we had over 130 kids aged 7 to 17 attending the club.

We are collaborating with Associate Professor Kristian Svendsen on analysis of adverse drug effect data, and with The Norwegian Historical Data Centre on transcription of Norwegian handwritten census books.

COST Action IC1406 - High-Performance Modelling and Simulation for Big Data Applications (cHiPSet). We represent Norway on the management committee.

People

The lab currently consists of:
Lars Ailo Bongo, Professor, Principal investigator (Homepage | Github | Bitbucket)
Einar Holsbø, Ph.D. student, NOWAC (Homepage | Github)
Bjørn Fjukstad, Ph.D. student, NOWAC (Homepage | Github)
Morten Grønnesby, Ph.D. student, NOWAC (Homepage | Github)
Tengel Skar, Master student, Adverse drug effect data (Homepage | Github)
Aleksandr Agafonov, Technical staff, Center for Bioinformatics, ELIXIR (Github)
Nikita Shvetsov, Technical staff, Biobank Norway (Bitbucket)
Mayeul Marcadella, Technical staff, ELIXIR (Github)
Jon Ivar Kristiansen, Technical staff, Center for Bioinformatics, ELIXIR (Homepage)

Former lab members are:
Tim Alexander Teige, Master student, 2018, CS, UiT. Thesis: Auto scaling framework, simulator, and algorithms for the META-pipe backend (Source code)
Nina Angelvik, Master student, 2018, CS, UiT. Thesis: . Source code: Backend and Frontend
Mike Voets, Master student, 2018, CS, UiT. Thesis: Deep Learning: From Data Extraction to Large-Scale Analysis. Source code: replication study and DICOM anonymizer
Inge Alexander Raknes, Technical staff, ELIXIR
Rigmor Katrine Johansen, Intern, Bioethics, NOWAC
Johan Ravn, Master student, 2017, CS, UiT. Thesis: Detection of Wheezes and Breathing Phases using Deep Convolutional Neural Networks
Dr. Giacomo Tartari, Technical staff, ELIXIR
Dr. Edvard Pedersen, Ph.D. student, 2016, CS, UiT. Thesis: A Data Management Model For Large-Scale Bioinformatics Analysis (Thesis and source code)
Frode Opdahl, Master student, 2016, CS, UiT. Project: Virtual reality
Jarl Fagerli, Master student, 2015, CS, UiT. Thesis: COMBUST I/O: Abstractions facilitating parallel execution of programs implementing common I/O patterns in a pipelined fashion as workflows in Spark (Thesis, Source code)
Kenneth Knudsen, Master student, 2015, CS, UiT. Thesis: Freia: Exploring Biological Pathways Using Unity3D (Thesis, Source code, Demo)
Ove Kåven, Master student, 2015, CS, UiT. Thesis: Multiparadigm Optimizing Retargetable Transdisciplinary Abstraction Language (Thesis, Source code)
Ida Jaklin Johansen, Technical staff, Elixir.no
Martin Ernstsen, Master student, 2013, CS, UiT. Thesis: Mario - A system for iterative and interactive processing of biological data (Thesis, Source code)
Terje André Johansen, Master student, 2011, CS, UiT. Thesis: A scalable, interactive widget library for visualizing biological data (Thesis, Paper)
Torje Henriksen, Master student (main advisor Otto J. Anshus, co-advisor Phuong Hoai Ha), 2008, CS, UiT. Thesis: Efficient intra-node Communication for Chip Multiprocessors

Collaborators

We are a lab in the High Performance Distributed Systems research group at the Department of Computer Science, University of Tromsø.

We are also part of the Center for Bioinformatics, which is co-located with the Norwegian Structural Biology Centre (NORSTRUCT).

We collaborate with Professors Eiliv Lund, Torkjel Sandanger, Lill-Tove Busund, and Tom Dønnem in the NOWAC project.

We collaborate with Professor Hasse Melbye in the Lung Sounds project.

EPINOR is a national research school in population-based epidemiology. We are one of the associated research groups, and our PhD students can apply for admission to the school.

NORBIS is the national research school for bioinformatics, biostatistics, and systems biology.

We have a long-term collaboration with Professors Kai Li and Olga Troyanskaya at Princeton University. We are also collaborating with Etienne Birmelé at Université Paris Descartes.