CSBL::Computational & Synthetic Biology Laboratory at KU

We Seek Answers for Big Questions by Reading Genome, Writing Genome and Editing Genome. From Molecules To Organisms, We See Everything in the Light of Evolution. Artificial Intelligence (e.g. Deep Learning) Facilitates Our Digital Biology Approach.

Computational Genomics

NGS - Reading Genomes

We understand living things by their genomes and transcriptomes

  • Sequencing is the process of reading the genetic code of living things. Deciphering the code of life begins with sequencing, followed by gene prediction and functional annotation. We have sequenced and annotated various genomes (DNAseq) and transcriptomes (RNAseq) of living organisms ranging from bacteria to fungi to insects, and even humans.
  • Since 2013, we have participated in the 1000 Fungal Genome Project (1kFGP) led by the Joint Genome Institute (JGI) of the U.S. Department of Energy (DOE). It is a data-driven approach to accessing all fungi on Earth. The project has officially ended, but the sequencing efforts continue.
  • We are exploring the Fungal Genome Universe by consolidating all known fungal genome information. We mainly focus on the biology of edible, medicinal and poisonous mushrooms. We are trying to extract new knowledge of fungal biology using various NGS techniques.
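
As a rough illustration of the gene prediction step mentioned above, the sketch below scans the three forward reading frames of a DNA string for open reading frames (ATG to the first in-frame stop codon). It is a toy under stated assumptions, not the pipeline used in the lab: the function name `find_orfs` and the `min_codons` parameter are hypothetical, the reverse strand is ignored, and real gene predictors use far richer models.

```python
# Toy ORF scan on the forward strand only (illustrative assumption, not a real gene predictor).
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) index pairs of ATG...stop ORFs in the three forward frames.

    `end` is exclusive and includes the stop codon; `min_codons` counts the stop codon too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG in this frame, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)

# Example with a made-up sequence: one ORF, "ATGAAATTTTAG", at positions 2..14.
print(find_orfs("CCATGAAATTTTAGGG"))
```

Functional annotation would then compare each predicted ORF against known sequences, a step this sketch leaves out.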

Microbiome, Metagenome & Pan-genome require another level of sequence informatics

  • We sequence DNA/RNA not only from living organisms but also from abiotic environmental samples. From Antarctic soils (environmental metagenome) to fermentation starters (food microbiome) to insect guts (gut microbiome), we explore a ‘parallel sequence universe’ using various NGS techniques. We recently gathered pan-genomic data on acetogens and probiotics (e.g. lactic acid bacteria), which is essential for their biotechnological applications. We are now moving from the gathering stage to the understanding stage with the help of AI and data science.
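
One basic operation in pan-genome analysis is splitting gene families into those shared by every genome (core), those found in some genomes (accessory), and those found in only one (unique). The sketch below shows that partition with plain set counting. The function name, strain labels and gene-family labels are all hypothetical, and it assumes gene families have already been clustered, which is the hard part in practice.

```python
from collections import Counter

def partition_pangenome(genomes):
    """Split gene families into (core, accessory, unique) sets.

    `genomes` maps a genome name to the collection of gene-family IDs it contains.
    core: present in every genome; unique: present in exactly one; accessory: the rest.
    """
    n = len(genomes)
    # Count in how many genomes each family occurs (set() guards against duplicates).
    counts = Counter(fam for families in genomes.values() for fam in set(families))
    core = {fam for fam, c in counts.items() if c == n}
    unique = {fam for fam, c in counts.items() if c == 1 and n > 1}
    accessory = set(counts) - core - unique
    return core, accessory, unique

# Hypothetical strains and gene families, for illustration only.
example = {
    "strain1": {"famA", "famB", "famC"},
    "strain2": {"famA", "famB", "famD"},
    "strain3": {"famA", "famE"},
}
core, accessory, unique = partition_pangenome(example)
print(sorted(core), sorted(accessory), sorted(unique))
```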

Genome to Function

3D structure - What You See Is What You Understand

  • Shape determines function - we used X-ray crystallography as a magnifying glass to probe bio-macromolecular structures at the molecular level. However, recent advances in artificial intelligence (AI, e.g. deep learning) have dramatically changed the tools and strategies of biomolecular studies.
  • All matter in the universe is composed of a finite number of elements (see the periodic table). Likewise, the innumerable proteins in the protein universe belong to a finite number of protein folds that can be further decomposed into a finite number of building blocks (folding units). The question is, ‘Are there Structural Foldons, like protein structure alphabets, that recombine to provide the molecular diversity of the protein universe during evolution?’ This question can be applied to protein structure model building and is now integrated with AI for the design and engineering of biomolecules/cells. In the same way, we can ask: do DNA/RNA foldons also exist? (Check back later!)

Deep learning (e.g. LLM) is a game changer

  • Biostrings are similar to natural languages in that they have a finite number of letters and words. Deep learning using NLMs (natural language models) such as GPT and BERT has advanced natural language processing. We use NLMs for biosequence analysis at multiple levels.
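
To make the language analogy concrete, the sketch below turns a biosequence into overlapping k-mer "words" and maps them to integer IDs, which is one common way to prepare sequences for a language model. This is an illustrative assumption rather than the tokenization used in the lab; the function names and the choice of k are hypothetical.

```python
def kmer_tokens(seq, k=3, stride=1):
    """Split a sequence into k-mers of length k, taken every `stride` positions."""
    seq = seq.upper()
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]

def build_vocab(token_lists):
    """Assign each distinct token an integer ID in order of first appearance."""
    vocab = {}
    for tokens in token_lists:
        for tok in tokens:
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(tokens, vocab):
    """Map tokens to IDs; tokens missing from the vocabulary are skipped."""
    return [vocab[tok] for tok in tokens if tok in vocab]

# Made-up sequences, for illustration only.
corpus = [kmer_tokens("ATGCA"), kmer_tokens("TGCAT")]
vocab = build_vocab(corpus)
print(corpus[0], encode(corpus[1], vocab))
```

The same functions work on protein strings; only the alphabet, and therefore the vocabulary size, changes.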

Genes and Proteins

Evolutionary Genomics

  • A huge number of genes and proteins exist in the protein universe. We explore the protein universe to see how protein structures evolve. We mapped the protein space, the Protein Structure Universe, where protein structures are born, evolve and innovate. As we explore the birth, innovation, and death of proteins, we try to derive novel principles of protein design. We are open-minded and try to use all AI tools currently developed by others.

Enzyme Genomics

  • We study the functionality of protein domains and families in the pan-genomic space where genes/proteins are born, evolve, innovate, are horizontally transferred and are eventually destroyed. This is a traditional approach to understanding protein space, but we aim to design novel enzymes with it.

Synthetic Biology

Biology is Technology

  • People (including scientists and engineers) have found it difficult to answer the question, “What is synthetic biology?” “Biology is Technology” was once a motto for synthetic biologists hacking living things. We see two perspectives on synthetic biology: science and technology. The fusion of science and technology will be true synthetic biology.

We design, build, test and learn biosystems

  • Richard Feynman said, ‘What I cannot create, I do not understand’, followed by ‘Know how to solve every problem that has been solved’. This is the goal of synthetic biology as a technology for tinkering with living things.

  • Construction by Design - We can construct synthetic metabolic pathways by design (e.g. iPNN - intelligent Pathway Network Navigator).

  • Learning by Construction - We can learn how nature builds ‘things’ by synthesis (e.g. PKSDS - PolyKetide Synthetase Design Suite).
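
Pathway construction by design can be pictured, in its simplest form, as a path search over a network of reactions. The sketch below runs a breadth-first search from a starting compound to a target compound. It is only a toy under loud assumptions: it does not describe how iPNN works, the compound names are placeholders, and real pathway design must also weigh enzymes, cofactors, thermodynamics and host compatibility.

```python
from collections import deque

def shortest_pathway(reactions, source, target):
    """Return the shortest list of compounds from source to target, or None.

    `reactions` is an iterable of (substrate, product) pairs, treated as directed edges.
    """
    graph = {}
    for substrate, product in reactions:
        graph.setdefault(substrate, []).append(product)

    queue = deque([[source]])  # each queue entry is a partial path
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

# Placeholder compounds A..E, not real metabolites.
toy_reactions = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E"), ("E", "C")]
print(shortest_pathway(toy_reactions, "A", "C"))
```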

iGEM & DIYBio

Biohackers

  • Synthetic biology is a hacking tool for biology. Amateur and citizen scientists who apply synthetic biology approaches are called ‘biohackers’. CSBL supports biohackers.

  • We run an undergraduate research program, the Korea_U_Seoul team for iGEM. The Korea_U_Seoul team is open to any undergraduate student.

  • DIYBio: CSBL supports the DIYBio movement in Korea.

  • We are also interested in manipulating the cell surface by displaying peptides and proteins on microbes and viruses.

Knowledge Discovery

Engineering Principles

  • We learn and discover nature’s design principles for engineering biology. For instance, the deconstruction of red algal biomass can be accelerated by a designed pathway.

  • Agar, a recalcitrant polysaccharide, has great potential as renewable biomass. We have recently elucidated the details of bacterial agarolytic pathways. We have sequenced the genomes (DNAseq) and transcriptomes (RNAseq) of several agarolytic microorganisms using next-generation sequencing (NGS) techniques. We have identified key enzymes in the agar metabolic pathway (e.g. beta-agarases, agarooligosaccharide beta-galactosidase - ABG, neoagarobiose hydrolase - NABH, anhydrogalactose dehydrogenase - AHGD and anhydrogalactonate cycloisomerase - ACI) and determined their atomic structures. A full understanding of the molecular and cellular functions of these novel agarolytic enzymes will provide design principles for synthetic agar degradation pathways and eventually guide the construction of synthetic microorganisms that convert agar into valuable chemicals.