NIAB - National Institute of Agricultural Botany

Transcriptomic data for Vicia faba

Reference transcriptome Vicia faba "Hedin"

This transcriptome was assembled from root tissue and whole shoot tissue of an inbred single-seed-descent (SSD) line of Vicia faba "Hedin". Plants were grown in polytunnels and under field conditions, and tissue was sampled between flowering and early pod set.

Total RNA was sequenced as paired-end Illumina libraries at a target read length of 150 bases at the Earlham Institute. Before assembly, read libraries were normalised for coverage with the insilico_read_normalization.pl script included in the Trinity package.

Multiple assemblers (SOAPdenovo-Trans, trans-ABySS, Trinity, BinPacker) and k-mer sizes (32 to 75) were tried, as well as several post-assembly merging steps (Transfuse, trans-ABySS merge). Assembly quality was assessed with BUSCO and TransRate.

Contigs for the final assembly were assembled with trans-ABySS at k-mer sizes 32 and 72 and with SOAPdenovo-Trans at k-mer sizes 65 and 75, then merged with trans-ABySS-merge.

Assembled contigs were screened for contaminants by searching them with tblastn against the NCBI nr database and with blastn against the NCBI nt database. Any contigs that matched organisms other than higher plants were removed.
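The contaminant screen can be sketched roughly as follows. This is an illustration only, not the script used for this dataset: it assumes the BLAST results have already been reduced to one lineage string per contig (the best hit), and it takes "higher plants" to mean the clade Embryophyta. The function name, the input mapping and the clade choice are all hypothetical.

```python
from typing import Dict, Iterable, List

# Assumption: "higher plants" is interpreted as land plants (Embryophyta).
PLANT_CLADE = "Embryophyta"


def classify_contigs(
    contig_ids: Iterable[str],
    best_hit_lineage: Dict[str, str],
) -> Dict[str, List[str]]:
    """Sort contigs into plant, contaminant and unidentified groups.

    best_hit_lineage maps a contig ID to the taxonomic lineage of its best
    BLAST hit (hypothetical input); contigs without any hit are absent.
    """
    groups: Dict[str, List[str]] = {
        "plant": [],
        "contaminant": [],
        "unidentified": [],
    }
    for contig_id in contig_ids:
        lineage = best_hit_lineage.get(contig_id)
        if lineage is None:
            # No database match at all: neither confirmed nor rejected.
            groups["unidentified"].append(contig_id)
        elif PLANT_CLADE in lineage:
            groups["plant"].append(contig_id)
        else:
            # Best hit lies outside higher plants: treated as a contaminant.
            groups["contaminant"].append(contig_id)
    return groups
```

In this sketch the "contaminant" group would be discarded, and the "unidentified" group is kept apart because it contains contigs with no database match.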

Unidentified contigs were searched against the coding sequence datasets Arabidopsis_thaliana.TAIR10.cds.all.fa, Glycine_max.V1.0.cdna.abinitio.fa and Trifolium_pratense.Trpr.cdna.all.fa, downloaded in November 2017 from Ensembl Genomes, and Mt4.0v2_GenesCDSSeq_20140818_1100.fasta, downloaded from the Medicago HapMap project.

Contigs found in any of these datasets were kept in the dataset; the remainder were removed. Contigs were translated into protein sequences with transeq, and annotation was performed with InterProScan and Blast2GO.
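As a rough illustration of these last two steps, the sketch below keeps only those unidentified contigs that had a match in the reference datasets and then translates a nucleotide sequence into protein. It is not the pipeline used here: the function names and inputs are hypothetical, and the translation assumes the first forward reading frame and the standard genetic code, since the transeq options used are not stated.

```python
from itertools import product
from typing import Iterable, List

# Standard genetic code, built in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def keep_reference_matches(
    unidentified_ids: Iterable[str],
    reference_hit_ids: Iterable[str],
) -> List[str]:
    """Return the unidentified contigs that matched a reference dataset.

    reference_hit_ids is a hypothetical collection of contig IDs that had
    a hit in at least one of the coding sequence datasets.
    """
    hits = set(reference_hit_ids)
    return [contig_id for contig_id in unidentified_ids if contig_id in hits]


def translate_frame1(nucleotides: str) -> str:
    """Translate the first forward frame; '*' marks stops, 'X' unknown codons."""
    seq = nucleotides.upper().replace("U", "T")
    usable = len(seq) - len(seq) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3)
    )
```

For example, `translate_frame1("ATGGCCTAA")` gives `"MA*"`. Codons containing ambiguous bases such as N are reported as `X` in this sketch.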

Files