Publications
Applications of statistical and ML methods in molecular biology including synthetic DNA QC
Abstract
SOLQC - Recent years have seen a growing number and a broadening scope of studies that use synthetic oligo libraries for a range of applications in synthetic biology. As experiments grow in number and complexity, analysis tools can facilitate quality control and support statistical assessment and inference. We present a novel analysis tool, called SOLQC, which enables fast and comprehensive analysis of synthetic oligo libraries. SOLQC takes as input the results of an NGS analysis performed on the library by the user. It then provides statistical information such as the distribution of variant representation and the different error rates, together with their dependence on sequence or library properties. SOLQC also produces graphical descriptions of the analysis results. The results are reported in a flexible report format. We demonstrate SOLQC by analyzing several literature libraries. In the context of these analyses we discuss the potential benefits and relevance of the different components of SOLQC.

Intrinsic Autoencoders - An important tool that helps the life sciences gain insight into the functionality of living cells is the analysis of gene expression in cells and populations. The results of gene expression profiling usually reside in a very high-dimensional space, obstructing efficient and effective inference and downstream tasks such as classification. We believe that working in a more adequate representation space can facilitate better results for downstream analysis tasks such as classification and clustering. As a step in this direction we investigate the intrinsic dimension of simple data. In particular, we show how autoencoders can be used to fully reconstruct data that resides in a manifold whose dimension is much smaller than that of the original ambient representation space.
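To make the SOLQC statistics mentioned above more concrete, the following is a minimal sketch of two of them: the distribution of variant representation and a per-position error rate. It is not SOLQC's implementation or interface. The function names, the toy reads, and the restriction to substitution errors in equal-length reads are all assumptions made for illustration.

```python
from collections import Counter


def variant_representation(assignments):
    """Count how many reads were assigned to each designed variant."""
    return Counter(assignments)


def mismatch_rate_per_position(design, reads):
    """Fraction of reads that differ from the design at each position.

    Assumption: only reads with the same length as the design are compared,
    so this captures substitutions and ignores insertions and deletions.
    """
    same_len = [r for r in reads if len(r) == len(design)]
    if not same_len:
        return []
    return [
        sum(r[i] != base for r in same_len) / len(same_len)
        for i, base in enumerate(design)
    ]


# Toy data invented for illustration: designed variants and
# (variant id, read sequence) pairs as they might come out of NGS mapping.
design = {"v1": "ACGT", "v2": "TTGA"}
reads = [("v1", "ACGT"), ("v1", "ACGA"), ("v1", "ACG"), ("v2", "TTGA")]

counts = variant_representation(v for v, _ in reads)
rates = mismatch_rate_per_position(
    design["v1"], [seq for v, seq in reads if v == "v1"]
)
print(counts)
print(rates)
```

In this toy case the read `ACG` counts toward the representation of `v1` but is left out of the mismatch rates because its length differs from the design. A real analysis would align reads to the design so that insertions and deletions can be counted as well.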
Product Used
NGS
Related Publications