READemption

READemption (see footnote 4) is a pipeline for the computational evaluation of RNA-Seq data. It was originally developed to process dRNA-Seq reads (as introduced by Sharma et al. [1]) originating from bacterial samples. It has since been extended to process data generated in other experimental setups. A detailed description of all current features of READemption is available at <https://reademption.readthedocs.io/en/latest/#>. This section describes how to use READemption to produce wiggle files from raw reads generated with the dRNA-Seq protocol.

Comparison of different conditions

As explained in the installation section, TSSpredator can analyse genomes consisting of multiple contigs, a chromosome with one or more plasmids, or several chromosomes. After preparing the input data as explained, create a READemption project folder with the following command:

reademption create -f TestAnalysis

After the folder is created, READemption prints a message asking you to copy your reference sequences and reads into the folders TestAnalysis/input/reference_sequences/ and TestAnalysis/input/reads/, respectively. For the multi-fastA case, the sequences of all contigs, chromosomes or plasmids should be in one fastA file. After all required files are copied, start the mapping by running the subcommand align on the project folder.
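For the multi-fastA case, the individual sequence files have to be merged before they are copied into the reference folder. The following is a minimal sketch of one way to do that; the function name `combine_fasta` and the file names in the usage note are hypothetical and not part of READemption.

```python
from pathlib import Path


def combine_fasta(fasta_paths, output_path):
    """Concatenate several fastA files into one multi-fastA file."""
    with open(output_path, "w") as out:
        for path in fasta_paths:
            text = Path(path).read_text()
            # Make sure each file ends with a newline so that the next
            # header line (">...") starts on a line of its own.
            out.write(text if text.endswith("\n") else text + "\n")
```

A call such as `combine_fasta(["chromosome.fa", "plasmid.fa"], "combined.fa")` would produce a single file that can then be copied into the reference folder.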

Please check the website for more information about the parameters that can be set. In this step all provided reads undergo a processing step, and the processed reads are then mapped to the genome (in the multi-fastA case, to each entry) using segemehl (see footnote 5). To generate wiggle files, use the subcommand coverage:

reademption coverage -f TestAnalysis

Using the BAM files generated by the subcommand align, per-base coverages are calculated for each sample, separately for the forward and the reverse strand. Positions with zero coverage are not listed in the wiggle files. The coverage step creates three folders: one with raw coverages and two with coverages normalized by different factors. For TSSpredator we use the wiggle files contained in the coverage-tnoar-min-normalized folder.
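Because positions with zero coverage are omitted, any script that reads these wiggle files has to treat a missing position as zero. The sketch below illustrates this. It assumes the files use the variableStep wiggle layout (a `track` line, `variableStep chrom=...` lines, then `<position> <value>` lines); the function names are hypothetical.

```python
from collections import defaultdict


def read_wiggle_coverage(lines):
    """Parse variableStep wiggle lines into {chrom: {position: value}}."""
    coverage = defaultdict(dict)
    chrom = None
    for line in lines:
        fields = line.split()
        if not fields or fields[0] == "track":
            continue  # skip empty lines and the track header
        if fields[0] == "variableStep":
            # e.g. "variableStep chrom=plasmid1 span=1"
            chrom = dict(f.split("=", 1) for f in fields[1:])["chrom"]
        elif chrom is not None:
            coverage[chrom][int(fields[0])] = float(fields[1])
    return coverage


def coverage_at(coverage, chrom, position):
    """Return the coverage at a position; unlisted positions count as 0."""
    return coverage.get(chrom, {}).get(position, 0.0)
```

Storing only the listed positions keeps memory use proportional to the covered part of the genome rather than to its full length.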

Comparison of different strains/species

For the analysis of different strains or species, READemption should be run for each strain/species separately. Because the normalization should be done over all wiggle files from all strains/species, you have to perform the normalization yourself. We recommend taking the wiggle files from the folder coverage-tnoar-mil-normalized, dividing the coverage column of all files by 1,000,000 and multiplying it by the lowest number of aligned reads of all considered libraries. The lowest number can be found by comparing the file names of the wiggle files. After normalization the wiggle files can be used by TSSpredator as described before.
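The rescaling described above can be sketched as follows. This is an illustration under stated assumptions, not part of READemption: the function name `rescale_wiggle_lines` is hypothetical, the lowest number of aligned reads has to be determined by the user and passed in, and data lines are assumed to have the form `<position> <value>`, with all other lines (track and step headers) passed through unchanged.

```python
def rescale_wiggle_lines(lines, lowest_aligned_reads):
    """Rescale mil-normalized coverage values to the smallest library.

    Each coverage value is divided by 1,000,000 and multiplied by the
    lowest number of aligned reads of all considered libraries.
    """
    factor = lowest_aligned_reads / 1_000_000
    for line in lines:
        fields = line.split()
        # Data lines hold "<position> <value>"; header lines are kept as-is.
        if len(fields) == 2 and fields[0].isdigit():
            yield f"{fields[0]} {float(fields[1]) * factor}"
        else:
            yield line.rstrip("\n")
```

Applying the same `lowest_aligned_reads` value to the wiggle files of every strain/species puts all libraries on a common scale. Any sign on a coverage value is preserved, because each value is only multiplied by a positive factor.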

4. Förstner KU, Vogel J, Sharma CM, “READemption – A tool for the computational analysis of deep-sequencing-based transcriptome data”, Bioinformatics (2014, Aug 13)

5. Hoffmann S, Otto C, Kurtz S, Sharma CM, Khaitovich P, Vogel J, Stadler PF, Hackermueller J, “Fast mapping of short sequences with mismatches, insertions and deletions using index structures”, PLoS Comput Biol (2009) vol. 5 (9) pp. e1000502