Bioinformatics and Deliverables

Our bioinformatics services include the analysis pipeline, deliverables, and data management described below.

RNA-Seq analysis pipeline

The CTG RNA-Seq analysis pipeline is divided into five main tasks:

  1. Demultiplexing: sorting the reads into FASTQ files by sample index and generating the statistics and report files. This task is performed with the bcl2fastq2 software using default settings.
  2. Quality Control (QC): FastQC runs quality control checks on the raw sequence data produced by the demultiplexing step. For each check it provides a plot and a warning if the data appear to have problems.
  3. Read Mapping: alignment of reads to a specified reference genome. This task is performed with the HISAT2 software, using the reference genome sequence from the database requested by the researcher or, by default, from the Ensembl database.
  4. Picard QC: quality control checks on the aligned data from the previous step, namely a) base distribution by cycle, b) insert size, c) quality by cycle, and d) quality distribution. This task is performed with the Picard tools.
  5. Expression counts: assembly of the alignments into full transcripts and quantification of the expression levels of each gene and transcript. This task is performed with the StringTie software.
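
As a rough illustration of how the five tasks chain together, the sketch below builds one command line per step for a single paired-end sample. It is not CTG's actual pipeline: the page does not state the exact options used, so every flag, file name, and the samtools sort/index step between mapping and Picard QC are assumptions, as is the `picard` wrapper command (some installations call `java -jar picard.jar` instead). The commands are only built, not executed.

```python
from pathlib import Path


def demultiplex_command(run_dir, out_dir, sample_sheet):
    """Step 1 (per sequencing run): bcl2fastq with otherwise default settings."""
    return ["bcl2fastq", "--runfolder-dir", str(run_dir),
            "--output-dir", str(out_dir), "--sample-sheet", str(sample_sheet)]


def build_commands(sample, r1, r2, index, ref_fasta, ref_gtf, out_dir):
    """Steps 2-5 for one paired-end sample, as a list of argv lists."""
    out = Path(out_dir)
    sam = out / f"{sample}.sam"
    bam = out / f"{sample}.bam"
    return [
        # 2. Quality control of the raw reads.
        ["fastqc", "-o", str(out), str(r1), str(r2)],
        # 3. Read mapping against a prebuilt HISAT2 index.
        ["hisat2", "-x", str(index), "-1", str(r1), "-2", str(r2), "-S", str(sam)],
        # Assumed intermediate step (not named on the page): sort and index.
        ["samtools", "sort", "-o", str(bam), str(sam)],
        ["samtools", "index", str(bam)],
        # 4. Picard QC; CollectMultipleMetrics bundles several metric collectors.
        ["picard", "CollectMultipleMetrics",
         f"I={bam}", f"O={out / sample}", f"R={ref_fasta}"],
        # 5. Expression counts; -B writes the *.ctab tables, -A the gene table.
        ["stringtie", str(bam), "-G", str(ref_gtf), "-B",
         "-A", str(out / f"{sample}.tsv"), "-o", str(out / f"{sample}.gtf")],
    ]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` in order, so that a failing step stops the chain.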

Deliverables

Project delivery report

Summarized information about the experimental setup, methods, and corresponding references and links (20XX-XX_Project_Delivery_Report.pdf).

Demultiplexing results

“DE_MUL_PLEX_project_number”. This folder contains one subfolder per sample, named after the Sample ID provided in the sample sheet. Each subfolder contains .fastq files named after the Sample Name provided in the sample sheet.

Quality Control results

“FASTQC_project_number”. For each read, this folder contains a FastQC summary report (.html) and a “.zip” file with the results of the FastQC analysis.
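
The FastQC “.zip” file can be inspected without unpacking it. The sketch below assumes the usual FastQC layout, in which the archive holds a single folder with a tab-separated `summary.txt` (status, module name, input file name per line); the function name is made up for illustration.

```python
import zipfile


def read_fastqc_summary(zip_path):
    """Return {module name: status} from the summary.txt inside a FastQC zip."""
    statuses = {}
    with zipfile.ZipFile(zip_path) as archive:
        # summary.txt is expected inside the archive's single top-level folder.
        name = next(n for n in archive.namelist() if n.endswith("/summary.txt"))
        for line in archive.read(name).decode("utf-8").splitlines():
            if not line.strip():
                continue
            # Each line: status <TAB> module <TAB> input file name.
            status, module, _filename = line.split("\t", 2)
            statuses[module] = status
    return statuses
```

Filtering the returned dictionary for values other than `PASS` gives a quick list of the modules worth opening in the .html report.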

Read Mapping results

“HISAT2_project_number”. This folder contains one subfolder per sample, named after the Sample ID provided in the sample sheet. Each subfolder contains the result files of the alignment to the reference genome.

Files:

  •   .bam
  •   .bam.bai
  •   summary.txt
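
A delivered read mapping folder can be checked for completeness with a few lines of Python. This is a minimal sketch that assumes each sample subfolder holds one file ending in each of the three names listed above; the exact file names are not stated here, so only the endings are compared.

```python
from pathlib import Path

# File endings listed for the read mapping results.
EXPECTED_SUFFIXES = (".bam", ".bam.bai", "summary.txt")


def missing_alignment_files(hisat2_dir):
    """Map each sample folder name to the expected file endings it lacks."""
    missing = {}
    sample_dirs = sorted(p for p in Path(hisat2_dir).iterdir() if p.is_dir())
    for sample_dir in sample_dirs:
        names = [f.name for f in sample_dir.iterdir() if f.is_file()]
        absent = [s for s in EXPECTED_SUFFIXES
                  if not any(n.endswith(s) for n in names)]
        if absent:
            missing[sample_dir.name] = absent
    return missing
```

An empty dictionary means every sample folder contains all three kinds of file.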

Picard QC results

“PICARD_QC_project_number”. This folder contains the QC results for the alignments to the reference genome.

QC metrics in text format:
  - alignment summary metrics
  - base distribution by cycle metrics
  - insert size metrics
  - quality by cycle metrics
  - quality distribution metrics

QC figures in pdf format:
  - base distribution by cycle
  - insert size histogram
  - quality by cycle
  - quality distribution
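
The text-format metrics can be loaded programmatically. The sketch below assumes the common Picard layout: comment lines starting with `#`, a `## METRICS CLASS` line, a tab-separated header row, and data rows up to the next blank line. The function name is hypothetical, and values are kept as strings.

```python
def read_picard_metrics(path):
    """Return the rows of the METRICS section as a list of dicts."""
    with open(path, encoding="utf-8") as handle:
        lines = [line.rstrip("\n") for line in handle]
    rows = []
    for i, line in enumerate(lines):
        if line.startswith("## METRICS CLASS") and i + 1 < len(lines):
            # The line after the class marker holds the column names.
            header = lines[i + 1].split("\t")
            for data in lines[i + 2:]:
                if not data.strip():
                    break  # a blank line ends the metrics table
                rows.append(dict(zip(header, data.split("\t"))))
            break
    return rows
```

Files that contain only a histogram section and no `## METRICS CLASS` line yield an empty list with this sketch.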

Expression counts results

“StringTie_project_number”. This folder contains one subfolder per sample, named after the Sample ID provided in the sample sheet. Each subfolder contains read counts for exons, introns, transcripts, and genes.

Files:

  • SampleID.tsv contains gene abundance information in tab-delimited format.
  • t_data.ctab contains transcript abundance information in tab-delimited format.
  • e_data.ctab contains exon read counts in tab-delimited format.
  • i_data.ctab contains intron read counts in tab-delimited format.
  • e2t.ctab contains the mapping from exon index to transcript index.
  • i2t.ctab contains the mapping from intron index to transcript index.
  • SampleID.gtf contains the fully covered transcripts that match the reference annotation transcripts.

See the StringTie website for more information about the files and file types.
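
As an example of working with the gene abundance table, the sketch below lists the most highly expressed genes in a SampleID.tsv file. It assumes the file carries the column headers that StringTie normally writes to its gene abundance output, including `Gene ID` and `TPM`; the function name is hypothetical.

```python
import csv


def top_genes_by_tpm(tsv_path, n=10):
    """Return the n genes with the highest TPM as (gene id, TPM) pairs."""
    with open(tsv_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle, delimiter="\t"))
    # Sort from highest to lowest expression.
    rows.sort(key=lambda row: float(row["TPM"]), reverse=True)
    return [(row["Gene ID"], float(row["TPM"])) for row in rows[:n]]
```

The same pattern applies to the .ctab files, which are also tab-delimited tables with a header row.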

Data management and analysis

CTG uses LUNARC (Center for Scientific and Technical Computing at Lund University) for data management and analysis.

Delivery medium

Your project data will be delivered through one of the following media:

  1. On a hard disk, encrypted and password-protected. For more information, please visit https://www.veracrypt.fr/en/Home.html.
  2. Through your UPPMAX account. If you do not have one, please visit the UPPMAX site https://www.uppmax.uu.se/support/getting-started/ for more details.

Data Storage

The project data will be stored for 6 months.

Additional Services

For additional bioinformatics services, please contact NBIS, the National Bioinformatics Infrastructure Sweden (https://nbis.se/).


Contact for service

E-mail: CTGservice [at] med [dot] lu [dot] se
Ingrid Wilson
  Phone: +46706855137
Markus Heidenblad
  Phone: +4646173592



Visiting Addresses:

   Medicon Village 404:B3
   Scheelevägen 2
   SE-223 81 Lund

   BMC - Biomedical Centre
   Sölvegatan 23 B
   SE-221 85 Lund