1 Reading the data

The input files are:

2 Analysis at family level

2.1 FPKM for each family of Repetitive Elements

For each family of Repetitive Elements (in case of elements with no repFamily name or repFamilies belonging to more than one repClass I use repClass) I compute FPKM values as follows: for each sample:

  • I compute the sum of counts for all elements belonging to that repFamily
  • I divide this sum by the total number of STAR input reads (/2 because with featureCounts I quantify fragments) for that sample and multiply by 10⁶
  • I divide this number by the total sum of lengths (in Kb) of the elements belonging to that repFamily –> FPKM
  • I subtract from each FPKM the total FPKM of all transposons belonging to the DNA repClass.

2.2 FPKM of specific families per condition

The values are the FPKM corrected for DNA transposons.

2.2.1 testing the difference in the mean FPKM per condition

I use t-test on the FPKM values corrected for DNA transposons.

[[1]] [1] “L1 - F pval = 0.125541125541126 - M pval = 0.536796536796537”

[[2]] [1] “ERVK - F pval = 0.662337662337662 - M pval = 0.125541125541126”

[[3]] [1] “ERVL - F pval = 0.051948051948052 - M pval = 0.0303030303030303”

[[4]] [1] “Satellite - F pval = 0.536796536796537 - M pval = 0.00432900432900433”

3 DE-Seq analysis of RNA transposons

I include the FPKM of DNA transposons as confounding factor in DESeq2 formula.

Before running the Differential Expression analysis, the data are pre-filtered to remove all repetitive elements with < 10 reads among all samples.

3.1 Differentially expressed TEs

## log2 fold change (MMSE): condition M_HEMI vs M_WT 
## Wald test p-value: condition M_HEMI vs M_WT 
## DataFrame with 4 rows and 5 columns
##             baseMean log2FoldChange     lfcSE      pvalue      padj
##            <numeric>      <numeric> <numeric>   <numeric> <numeric>
## IAPLTR2_Mm   179.523       0.316178 0.1519884 0.000104618 0.0101654
## MERVL-int    527.511       0.258159 0.0914886 0.000145426 0.0110614
## L1MdTf_II    393.994       0.286710 0.1144614 0.000151786 0.0110614
## L1MdTf_I     401.114       0.280728 0.1140062 0.000176763 0.0114503

3.2 MA-plots

  • The threshold used for a dot to be coloured in the MA-plots is p-value adjusted < 0.05.
  • Transposable elements whose mean expression > 20 and abs(log2FoldChange) > 0.2 are labeled.

3.3 FPKM of differentially expressed TEs

I also include two non-DE TEs, one expressed and one non-expressed in murine trophoblast cells (Ext.Fig3g in Weigert et al. 2023 ).

sessionInfo()
## R version 4.1.3 (2022-03-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 20.04.6 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
## 
## locale:
##  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=it_IT.UTF-8        LC_COLLATE=en_US.UTF-8    
##  [5] LC_MONETARY=it_IT.UTF-8    LC_MESSAGES=en_US.UTF-8   
##  [7] LC_PAPER=it_IT.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=it_IT.UTF-8 LC_IDENTIFICATION=C       
## 
## attached base packages:
## [1] stats4    stats     graphics  grDevices utils     datasets  methods  
## [8] base     
## 
## other attached packages:
##  [1] ggpubr_0.5.0                ggrepel_0.9.2              
##  [3] gridExtra_2.3               DESeq2_1.34.0              
##  [5] SummarizedExperiment_1.24.0 Biobase_2.54.0             
##  [7] MatrixGenerics_1.6.0        matrixStats_0.62.0         
##  [9] GenomicRanges_1.46.1        GenomeInfoDb_1.30.1        
## [11] IRanges_2.28.0              S4Vectors_0.32.4           
## [13] BiocGenerics_0.40.0         pheatmap_1.0.12            
## [15] data.table_1.14.6           ggplot2_3.4.0              
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7           bit64_4.0.5            RColorBrewer_1.1-3    
##  [4] httr_1.4.4             tools_4.1.3            backports_1.4.1       
##  [7] bslib_0.4.1            irlba_2.3.5.1          utf8_1.2.2            
## [10] R6_2.5.1               DBI_1.1.3              colorspace_2.0-3      
## [13] withr_2.5.0            tidyselect_1.2.0       bit_4.0.5             
## [16] compiler_4.1.3         cli_3.4.1              DelayedArray_0.20.0   
## [19] labeling_0.4.2         sass_0.4.2             scales_1.2.1          
## [22] SQUAREM_2021.1         genefilter_1.76.0      mixsqp_0.3-48         
## [25] stringr_1.4.1          digest_0.6.30          rmarkdown_2.18        
## [28] XVector_0.34.0         pkgconfig_2.0.3        htmltools_0.5.3       
## [31] invgamma_1.1           highr_0.9              fastmap_1.1.0         
## [34] rlang_1.0.6            rstudioapi_0.14        RSQLite_2.2.18        
## [37] prettydoc_0.4.1        farver_2.1.1           jquerylib_0.1.4       
## [40] generics_0.1.3         jsonlite_1.8.3         BiocParallel_1.28.3   
## [43] car_3.1-1              dplyr_1.0.10           RCurl_1.98-1.9        
## [46] magrittr_2.0.3         GenomeInfoDbData_1.2.7 Matrix_1.5-3          
## [49] Rcpp_1.0.9             munsell_0.5.0          fansi_1.0.3           
## [52] abind_1.4-5            lifecycle_1.0.3        stringi_1.7.8         
## [55] yaml_2.3.6             carData_3.0-5          zlibbioc_1.40.0       
## [58] grid_4.1.3             blob_1.2.3             parallel_4.1.3        
## [61] crayon_1.5.2           lattice_0.20-45        Biostrings_2.62.0     
## [64] splines_4.1.3          annotate_1.72.0        KEGGREST_1.34.0       
## [67] locfit_1.5-9.6         knitr_1.40             pillar_1.8.1          
## [70] ggsignif_0.6.4         codetools_0.2-18       geneplotter_1.72.0    
## [73] XML_3.99-0.12          glue_1.6.2             evaluate_0.18         
## [76] png_0.1-7              vctrs_0.5.1            gtable_0.3.1          
## [79] purrr_0.3.5            tidyr_1.2.1            assertthat_0.2.1      
## [82] ashr_2.2-54            cachem_1.0.6           xfun_0.35             
## [85] xtable_1.8-4           broom_1.0.1            rstatix_0.7.1         
## [88] survival_3.2-13        truncnorm_1.0-8        tibble_3.1.8          
## [91] AnnotationDbi_1.56.2   memoise_2.0.1