ideas4biology

RNA-Seq Data Analysis

Our comprehensive analysis of RNA-Seq data comprises a number of steps that combine tools and methods commonly applied in the field with our own analytical methods. It covers both standard procedures, such as identification of differentially expressed genes, and more specialized tasks, such as identification of long non-coding RNAs. Listed below are the most commonly requested types of data analysis, each with a brief summary of the expected output files.

Data pre-processing

  • Quality filtering
  • Adapter trimming
  • Discarding rRNA-mapped reads or contaminants (per request)

Output files include: a quality report (HTML or PDF format) and filtered reads in FASTQ format (per request)
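
To illustrate what adapter trimming and quality filtering mean in practice, here is a minimal Python sketch. It is not the pipeline used for this service: the function names, the exact-match adapter search, the Phred+33 quality encoding, and the thresholds (mean quality 20, minimum length 20) are all assumptions chosen for illustration. Dedicated trimming tools also handle partial and mismatched adapters, which this sketch does not.

```python
def mean_phred(quality, offset=33):
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in quality) / len(quality)


def trim_adapter(seq, quality, adapter):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, quality
    return seq[:pos], quality[:pos]


def filter_reads(records, adapter, min_mean_quality=20, min_length=20):
    """Yield (name, seq, quality) tuples that survive trimming and filtering.

    The thresholds are illustrative defaults, not recommended values.
    """
    for name, seq, quality in records:
        seq, quality = trim_adapter(seq, quality, adapter)
        if len(seq) < min_length:
            continue  # too short after trimming
        if mean_phred(quality) < min_mean_quality:
            continue  # low average base quality
        yield name, seq, quality
```

Each record is a `(name, sequence, quality)` tuple, so the generator can be fed from any FASTQ reader and its output written back out in FASTQ format.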

Fig. 1. A detailed and self-explanatory quality report for each sample provided.

Expression analysis

  • Estimating gene and transcript expression values
  • Differential expression analysis
  • Enrichment analysis of KEGG pathways and Gene Ontology terms

Output files include:

  • Tab-separated files with calculated expression values for genes and transcripts; these files can be opened in a spreadsheet program, such as MS Excel. The expression values are provided in FPKM/RPKM and TPM units
  • A list of differentially expressed genes, with fold change values and adjusted P-values
  • Raw expression values for each gene in each condition as a single table (per request)
  • A heatmap of differentially expressed genes in PDF format
  • MA and volcano plots - essential diagnostic plots for differential expression analysis, in PDF format
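
The FPKM and TPM units mentioned above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: `counts` and `lengths` are hypothetical dictionaries keyed by gene name, lengths are plain transcript lengths in bases, and at least one gene has a non-zero count. Quantification tools typically use effective lengths and fragment-level estimates instead.

```python
def fpkm(counts, lengths):
    """Fragments per kilobase of transcript per million mapped fragments."""
    total = sum(counts.values())
    return {g: counts[g] * 1e9 / (lengths[g] * total) for g in counts}


def tpm(counts, lengths):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rate = {g: counts[g] / lengths[g] for g in counts}
    scale = sum(rate.values())
    return {g: rate[g] / scale * 1e6 for g in counts}
```

Because TPM values always sum to one million within a sample, they are easier to compare across samples than FPKM values, whose sum depends on the length distribution of the expressed genes.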

Fig. 2. MA plot and Volcano plot, essential diagnostic plots for differential expression analysis.
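
For readers unfamiliar with these plots, the coordinates of a single gene can be computed as in the sketch below. The function names and the pseudocount of 1 are assumptions made for illustration; differential expression packages compute these values from their own normalized estimates.

```python
import math


def ma_point(mean_a, mean_b, pseudocount=1.0):
    """Return (A, M) for one gene from its mean expression in two conditions.

    A is the average log2 expression and M is the log2 fold change (b over a).
    The pseudocount avoids taking the logarithm of zero.
    """
    log_a = math.log2(mean_a + pseudocount)
    log_b = math.log2(mean_b + pseudocount)
    return (log_a + log_b) / 2, log_b - log_a


def volcano_point(log2_fold_change, adjusted_p):
    """Return (x, y) for a volcano plot; adjusted_p must be greater than zero."""
    return log2_fold_change, -math.log10(adjusted_p)
```

In an MA plot, strongly changed genes lie far from M = 0; in a volcano plot, genes that are both strongly changed and statistically significant appear in the upper left and upper right corners.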

Ab initio transcriptome assembly (genome dependent)

  • Read mapping against a reference genome sequence
  • Quality control and post-processing of mapping results
  • Ab initio assembly of the transcriptome (detection of novel genes and splicing isoforms, including antisense and intronic transcripts)
  • Quality filtering of the transcriptome (if required)
  • Comparison with a reference transcriptome/annotations (if available)
  • Possible further steps: Expression analysis; Annotation of transcriptomes

Output files include:

  • Quality report from read mapping
  • A file in BAM format containing raw results of read mapping (per request)
  • The transcriptome in GFF/GTF and FASTA formats; the GFF/GTF files can be used for direct visualization of the transcriptome in genome browsers, such as IGV
  • Sashimi plots for genes of interest (per request)

Fig. 3. Read mapping results in BAM format

Fig. 4. Sashimi plot

De novo transcriptome assembly (does not require genome sequence)

Output files include:

  • A transcriptome in FASTA format
  • A quality report for the transcriptome assembly
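
One widely used summary statistic for assemblies is N50: the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembly. Whether the quality report above includes N50 is an assumption; the sketch below only shows how such a statistic could be computed from a FASTA file, with hypothetical function names.

```python
def read_fasta_lengths(lines):
    """Return the length of every sequence in an iterable of FASTA lines."""
    lengths = []
    current = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0  # start counting a new sequence
        elif line and current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """N50 of a list of sequence lengths (0 for an empty list)."""
    half = sum(lengths) / 2
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half:
            return length
    return 0
```

With an open file handle `fh`, the call would be `n50(read_fasta_lengths(fh))`. For transcriptomes, N50 should be read with care, since it rewards long transcripts rather than correct ones.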

Annotation of transcriptomes

  • Identification of protein-coding genes
  • Annotation of proteins: finding structural and functional domains; similarity search against databases of annotated proteins in other organisms
  • Assignment of KEGG pathways and Gene Ontology (GO) terms
  • Identification of tRNAs, rRNAs, snoRNAs, snRNAs
  • Identification of long non-coding RNAs (lncRNAs)
  • Identification of circRNAs
  • Possible further steps: identification of microRNAs

Output files include:

  • A single tab-delimited file (which can be opened in a spreadsheet program, such as MS Excel) with most of the annotation results, including assigned KEGG and GO terms and detected protein domains
  • FASTA and GTF/GFF/BED files with identified open reading frames and different classes of non-coding RNAs
  • Interested in other types of output files? Feel free to ask us
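
As a simplified picture of what identifying open reading frames involves, the sketch below scans the three forward reading frames of a transcript for ATG-to-stop stretches. It is an illustration only, not the method used for the annotation: the function name and the `min_codons` threshold are assumptions, the reverse strand is ignored, and real gene finders use statistical models and homology evidence.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=30):
    """Return (start, end) pairs of forward-strand ORFs.

    Coordinates are 0-based with an exclusive end that includes the stop
    codon. min_codons counts the codons from ATG up to, but not including,
    the stop codon.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon in this open stretch
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs
```

The returned coordinates can be converted directly into BED intervals, which use the same 0-based, end-exclusive convention.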

Examples of other RNA-Seq-related types of analysis:

  • Finding RNA editing events
  • Finding targets for microRNAs
  • Analysis of alternative splicing (different splice variants)



About us

We have years of experience as bioinformaticians, and we also run our own scientific projects, so we constantly improve our analytical skills and pipelines and adjust them to the ever-changing methods and standards in the field. Importantly, we have experience in RNA-Seq data analysis for different groups of organisms, including human, plants, and bacteria, both model and non-model. Most of the result files we provide are in a user-friendly format and/or can be visualized in tools such as genome browsers and MS Excel. Last but not least, all of this comes at a competitive price, and there is no fee for evaluating your project. Please contact us to learn more about our offer, pricing, and scheduling.



CONTACT US

EMAIL
office@ideas4biology.com
POSTAL ADDRESS
Poznan Science and Technology Park
Rubiez 46
Building C4, Office 82
61-612 Poznań, Poland

COMPANY ADDRESS
os. Wichrowe Wzgórze 2/12
61-672 Poznań
BANK ACCOUNT
PLN: 06 2490 0005 0000 4520 2193 3163
EUR: 83 2490 0005 0000 4600 6031 7315
ALIOR BANK
SWIFT code: ALBPPLPW
PHONE NUMBER
+48 698 141 228

NIP: 9721251926
REGON: 302822671
