1 Reading the data

The input files are:

2 Analysis at family level

2.1 FPKM for each family of Repetitive Elements

I compute FPKM values for each family of repetitive elements. When an element has no repFamily name, or its repFamily belongs to more than one repClass, I use the repClass instead. For each sample:

  • I compute the sum of counts for all elements belonging to that repFamily
  • I divide this sum by the total number of reads for that sample and multiply by 10⁶
  • I divide this number by the total length (in kb) of the elements belonging to that repFamily, which gives the FPKM
  • When specified, I subtract from each FPKM the total FPKM of all transposons belonging to the DNA repClass
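The steps above can be sketched in base R as follows. This is a minimal illustration on made-up numbers, not the code used for this report: the table layout, the column names (`repFamily`, `repClass`, `length_bp`), the sample names and the library sizes are all assumptions. The DNA background is taken here as the sum of the FPKM of the families in the DNA repClass, which is one possible reading of the last step.

```r
# Toy per-element table; column names and values are assumptions for this sketch.
elements <- data.frame(
  repFamily = c("L1", "L1", "Alu", "hAT"),
  repClass  = c("LINE", "LINE", "SINE", "DNA"),
  length_bp = c(6000, 4000, 300, 1000),
  sampleA   = c(120, 80, 30, 10),
  sampleB   = c(60, 40, 90, 20)
)
samples <- c("sampleA", "sampleB")

# Total number of reads per sample (hypothetical library sizes).
total_reads <- c(sampleA = 2e6, sampleB = 1e6)

# Sum counts and lengths over the elements of each repFamily.
fam_counts <- rowsum(as.matrix(elements[, samples]), group = elements$repFamily)
fam_kb     <- rowsum(elements$length_bp, group = elements$repFamily)[, 1] / 1000

# Counts per million reads, then per kb of total family length.
cpm  <- sweep(fam_counts, 2, total_reads[samples], "/") * 1e6
fpkm <- sweep(cpm, 1, fam_kb[rownames(cpm)], "/")

# Optional correction: subtract the total FPKM of the DNA-class families
# from every family, sample by sample (assumed interpretation).
dna_fams  <- unique(elements$repFamily[elements$repClass == "DNA"])
dna_fpkm  <- colSums(fpkm[dna_fams, , drop = FALSE])
fpkm_corr <- sweep(fpkm, 2, dna_fpkm, "-")
```

In this toy table the L1 family has 200 reads in `sampleA` over 10 kb and 2 million total reads, so its FPKM is 200 / 2 / 10 = 10.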

2.2 Heatmaps

The heatmaps are scaled by rows.
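As an illustration of what row scaling does, the sketch below converts each row of a made-up FPKM matrix to z-scores in base R. The matrix values and the names `famA`, `famB`, `s1` to `s3` are hypothetical; the plotting call used for this report is not shown here.

```r
# Toy FPKM matrix: rows are repFamilies, columns are samples (made-up values).
fpkm <- matrix(c(10, 20, 30,
                 100, 100, 400),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("famA", "famB"), c("s1", "s2", "s3")))

# Row scaling: centre each row on its mean and divide by its standard deviation.
# scale() works on columns, so the matrix is transposed before and after.
row_z <- t(scale(t(fpkm)))
```

After scaling, every row has mean 0 and standard deviation 1, so colours are comparable within a family across samples but not between families. With pheatmap, passing `scale = "row"` should give the same transformation.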

No sample shows particularly abundant DNA contamination, and no RNA TE family shows deregulation in any of the three experimental groups.

3 DESeq2 analysis of RNA transposons

I include the FPKM of DNA transposons as a confounding factor in the DESeq2 design formula.
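To show how such a covariate enters the model, the sketch below builds a design formula and expands it with base R's `model.matrix`. The sample sheet, the column names `condition` and `dna_fpkm`, and the group labels are hypothetical, and the actual formula used for this report may differ.

```r
# Hypothetical sample sheet: 'condition' and 'dna_fpkm' are assumed column names.
col_data <- data.frame(
  condition = factor(c("ctrl", "ctrl", "treatA", "treatA", "treatB", "treatB")),
  dna_fpkm  = c(1.2, 0.8, 1.5, 1.1, 0.9, 1.3),
  row.names = paste0("s", 1:6)
)

# Covariate first, variable of interest last.
design <- ~ dna_fpkm + condition

# Expand the formula: intercept, the numeric covariate, and one column
# per non-reference condition level.
mm <- model.matrix(design, data = col_data)
colnames(mm)
```

With DESeq2, the same formula would be passed as the `design` argument of `DESeqDataSetFromMatrix()`, so that the condition effect is estimated after accounting for the DNA transposon signal of each sample.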

Before running the differential expression analysis, the data are pre-filtered to remove all repetitive elements with fewer than 10 reads in total across all samples.
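A minimal sketch of this kind of pre-filter on a made-up count matrix is shown below. The element and sample names are hypothetical, and the threshold is read here as a total across samples.

```r
# Toy count matrix: rows are repetitive elements, columns are samples.
counts <- matrix(c(0, 3, 2,
                   5, 4, 1,
                   50, 70, 65),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("te1", "te2", "te3"), c("s1", "s2", "s3")))

# Keep elements whose reads, summed over all samples, reach at least 10.
keep     <- rowSums(counts) >= 10
filtered <- counts[keep, , drop = FALSE]
```

Here `te1` (5 reads in total) is dropped, while `te2` (exactly 10 reads) and `te3` are kept.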

3.1 DESeq2-normalized counts of EGA TEs per condition

3.2 MA-plots

  • A dot is coloured in the MA-plots when its adjusted p-value is < 0.1.
  • Transposable elements with mean expression > 10 and |log2FoldChange| > 0.2 are labelled.
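The two rules above can be expressed as logical flags on a results table, as in the sketch below. The table is made up; the column names `baseMean`, `log2FoldChange` and `padj` follow the DESeq2 results layout, and treating a missing adjusted p-value as not significant is an assumption of this example.

```r
# Toy results table with DESeq2-style column names (values are made up).
res <- data.frame(
  row.names      = c("te1", "te2", "te3", "te4"),
  baseMean       = c(5, 50, 200, 30),
  log2FoldChange = c(1.0, 0.1, -0.5, 0.8),
  padj           = c(0.01, 0.04, 0.20, NA)
)

# Coloured: adjusted p-value below 0.1; NA is treated as not significant.
res$coloured <- !is.na(res$padj) & res$padj < 0.1

# Labelled: mean expression above 10 and absolute log2 fold change above 0.2.
res$labelled <- res$baseMean > 10 & abs(res$log2FoldChange) > 0.2
```

The two flags are computed independently here, so an element can be labelled without being coloured, as `te3` is in this toy table.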

sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=it_IT.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] EDASeq_2.28.0               ShortRead_1.52.0           
##  [3] GenomicAlignments_1.30.0    Rsamtools_2.10.0           
##  [5] Biostrings_2.62.0           XVector_0.34.0             
##  [7] BiocParallel_1.28.3         ggpubr_0.5.0               
##  [9] ggrepel_0.9.2               gridExtra_2.3              
## [11] DESeq2_1.34.0               SummarizedExperiment_1.24.0
## [13] Biobase_2.54.0              MatrixGenerics_1.6.0       
## [15] matrixStats_0.62.0          GenomicRanges_1.46.1       
## [17] GenomeInfoDb_1.30.1         IRanges_2.28.0             
## [19] S4Vectors_0.32.4            BiocGenerics_0.40.0        
## [21] pheatmap_1.0.12             data.table_1.14.6          
## [23] ggplot2_3.4.0              
## 
## loaded via a namespace (and not attached):
##   [1] colorspace_2.0-3       rjson_0.2.21           ggsignif_0.6.4        
##   [4] deldir_1.0-6           hwriter_1.3.2.1        ellipsis_0.3.2        
##   [7] rstudioapi_0.14        farver_2.1.1           bit64_4.0.5           
##  [10] AnnotationDbi_1.56.2   fansi_1.0.3            xml2_1.3.3            
##  [13] codetools_0.2-18       splines_4.1.3          R.methodsS3_1.8.2     
##  [16] cachem_1.0.6           geneplotter_1.72.0     knitr_1.40            
##  [19] jsonlite_1.8.3         broom_1.0.1            annotate_1.72.0       
##  [22] ashr_2.2-54            dbplyr_2.2.1           png_0.1-7             
##  [25] R.oo_1.25.0            compiler_4.1.3         httr_1.4.4            
##  [28] backports_1.4.1        assertthat_0.2.1       Matrix_1.5-3          
##  [31] fastmap_1.1.0          cli_3.4.1              prettyunits_1.1.1     
##  [34] htmltools_0.5.3        tools_4.1.3            gtable_0.3.1          
##  [37] glue_1.6.2             GenomeInfoDbData_1.2.7 dplyr_1.0.10          
##  [40] rappdirs_0.3.3         Rcpp_1.0.9             carData_3.0-5         
##  [43] jquerylib_0.1.4        vctrs_0.5.1            rtracklayer_1.54.0    
##  [46] xfun_0.35              stringr_1.4.1          irlba_2.3.5.1         
##  [49] lifecycle_1.0.3        restfulr_0.0.15        rstatix_0.7.1         
##  [52] XML_3.99-0.12          zlibbioc_1.40.0        scales_1.2.1          
##  [55] aroma.light_3.24.0     hms_1.1.2              parallel_4.1.3        
##  [58] RColorBrewer_1.1-3     curl_4.3.3             yaml_2.3.6            
##  [61] memoise_2.0.1          sass_0.4.2             biomaRt_2.50.3        
##  [64] SQUAREM_2021.1         latticeExtra_0.6-30    stringi_1.7.8         
##  [67] RSQLite_2.2.18         highr_0.9              genefilter_1.76.0     
##  [70] BiocIO_1.4.0           GenomicFeatures_1.46.5 filelock_1.0.2        
##  [73] truncnorm_1.0-8        rlang_1.0.6            pkgconfig_2.0.3       
##  [76] bitops_1.0-7           invgamma_1.1           evaluate_0.18         
##  [79] lattice_0.20-45        purrr_0.3.5            labeling_0.4.2        
##  [82] bit_4.0.5              tidyselect_1.2.0       magrittr_2.0.3        
##  [85] R6_2.5.1               generics_0.1.3         DelayedArray_0.20.0   
##  [88] DBI_1.1.3              pillar_1.8.1           withr_2.5.0           
##  [91] prettydoc_0.4.1        mixsqp_0.3-48          survival_3.2-13       
##  [94] KEGGREST_1.34.0        abind_1.4-5            RCurl_1.98-1.9        
##  [97] tibble_3.1.8           crayon_1.5.2           car_3.1-1             
## [100] interp_1.1-3           utf8_1.2.2             BiocFileCache_2.2.1   
## [103] rmarkdown_2.18         progress_1.2.2         jpeg_0.1-9            
## [106] locfit_1.5-9.6         grid_4.1.3             blob_1.2.3            
## [109] digest_0.6.30          xtable_1.8-4           tidyr_1.2.1           
## [112] R.utils_2.12.2         munsell_0.5.0          bslib_0.4.1