Genetic Sequence compression using Machine Learning and Arithmetic Encoding Decoding Techniques

6 Dec 2022 · Mehedi Hasan Sarkar, Adnan Ferdous Ashrafi

Bioinformatics is expanding rapidly, and advances in high-throughput genome sequencing have produced a large quantity of genomic data, raising concerns about the cost of storing and transmitting it. How best to compress genomic sequence data remains an open question. Researchers have previously proposed many DNA compression methods, both with and without machine learning. Extending previous research, we propose a new architecture, a modified DeepDNA, together with a new methodology that deploys a double-based strategy for compressing DNA sequences. We validated the results by experimenting on datasets of three sizes: 100, 243, and 356. The experimental outcomes highlight the superiority of our improved approach over existing approaches, such as DeepDNA, for analyzing human mitochondrial genome data.
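
For readers unfamiliar with the arithmetic encoding and decoding named in the title, the following is a minimal illustrative sketch, not the authors' method. It assumes a static order-0 model (symbol frequencies counted from the sequence itself) where the paper would supply probabilities from a learned model, and it uses exact rational arithmetic for clarity rather than the fixed-precision integer arithmetic a practical coder would need. The function names `build_model`, `encode`, and `decode` are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def build_model(seq):
    """Map each symbol to its cumulative-probability interval [low, high).

    Assumption: a static order-0 model built from the sequence itself.
    """
    counts = Counter(seq)
    total = len(seq)
    model, cum = {}, Fraction(0)
    for sym in sorted(counts):
        p = Fraction(counts[sym], total)
        model[sym] = (cum, cum + p)
        cum += p
    return model


def encode(seq, model):
    """Narrow [0, 1) once per symbol; return a point inside the final interval."""
    low, width = Fraction(0), Fraction(1)
    for sym in seq:
        s_low, s_high = model[sym]
        low = low + width * s_low
        width = width * (s_high - s_low)
    return low + width / 2  # midpoint lies strictly inside the interval


def decode(code, model, length):
    """Recover `length` symbols by replaying the same interval narrowing."""
    out = []
    low, width = Fraction(0), Fraction(1)
    for _ in range(length):
        # Rescale the code into [0, 1) relative to the current interval.
        target = (code - low) / width
        for sym, (s_low, s_high) in model.items():
            if s_low <= target < s_high:
                out.append(sym)
                low = low + width * s_low
                width = width * (s_high - s_low)
                break
    return "".join(out)
```

Under these assumptions, calling `decode(encode(seq, model), model, len(seq))` with `model = build_model(seq)` returns the original sequence. The better the model predicts each base, the wider the final interval and the fewer bits are needed to name a point inside it, which is why a stronger predictive model can improve the compression ratio.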
