Dividing of the standard set of amino acids into groups according to their evolutionary age
- Authors: Efimov V.M.1,2,3,4, Efimov K.V.5, Kovaleva V.Y.2
- Affiliations:
- Institute of Cytology and Genetics, Siberian Branch, Russian Academy of Sciences
- Institute of Animal Systematics and Ecology, Siberian Branch, Russian Academy of Sciences
- Novosibirsk State University
- Tomsk State University
- Higher School of Economics
- Issue: Vol 59, No 2 (2025)
- Pages: 299–308
- Section: BIOINFORMATICS
- URL: https://edgccjournal.org/0026-8984/article/view/682884
- DOI: https://doi.org/10.31857/S0026898425020111
- EDN: https://elibrary.ru/GFVVPE
- ID: 682884
Abstract
It is generally accepted that the set of proteinogenic amino acids encoded by the standard genetic code was formed step by step in the course of evolution. Most studies name Ala, Asp, Glu, Gly, Ile, Leu, Pro, Ser, Thr and Val as the early amino acids, presumably of extraterrestrial origin. Other studies, however, adopt a consensus list of early amino acids in which Ile is replaced by Arg. Using the physicochemical properties collected in the AAindex database, we compared how early and late amino acids differ under the list with Ile and under the list with Arg. For each of the two binary lists and each AA index, we calculated the point-biserial correlation coefficient rpb, Student's t statistic and its significance level (p-value). Because this yielded 2×553 p-values in total, the multiple comparisons problem was addressed with the Bonferroni correction and the Benjamini-Hochberg method. We then applied the 2B-PLS method, which takes two different sets of variables measured on the same objects and extracts the information common to both sets. The first set consisted of the binary lists of Trifonov (Arg) and Wong (Ile), and the second set consisted of the 553 AA indexes. The highest correlation with both lists (1.0 for the list with Ile and 0.8 for the list with Arg) was shown by the binary AA index CHAM830108, which characterizes the ability of an amino acid to act as a charge donor: late amino acids can be donors, whereas early ones cannot. This difference apparently reflects the different conditions, prebiotic and biotic, under which the standard set of amino acids evolved. The results of the 2B-PLS analysis also indicate that, in the list of 10 evolutionarily early amino acids, Ile is preferable to Arg. The analysis confirms the assignment of the last 6 amino acids (Cys, His, Met, Phe, Trp, Tyr), previously distinguished by their reduced HOMO-LUMO gap, to a separate, third stage in the evolution of the standard amino acid set.
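The correlation and multiple-comparison steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes that late amino acids are coded as 1 and early ones as 0, and it stands in for CHAM830108 with a binary vector equal to the late indicator of the Ile list, which is what a correlation of 1.0 between two binary variables implies. The function names, the significance level of 0.05 and the list `toy_p` of p-values are hypothetical and serve only to show the two correction rules.

```python
import math

AMINO_ACIDS = ["Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
               "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"]
# Early amino acids as named in the abstract; the Arg list swaps Ile for Arg.
EARLY_ILE = {"Ala", "Asp", "Glu", "Gly", "Ile", "Leu", "Pro", "Ser", "Thr", "Val"}
EARLY_ARG = (EARLY_ILE - {"Ile"}) | {"Arg"}


def late_indicator(early):
    """Return 1 for a late amino acid and 0 for an early one (assumed coding)."""
    return [0 if aa in early else 1 for aa in AMINO_ACIDS]


def point_biserial(binary, values):
    """Pearson correlation of a 0/1 list with a numeric list, which equals r_pb."""
    n = len(binary)
    mb, mv = sum(binary) / n, sum(values) / n
    cov = sum((b - mb) * (v - mv) for b, v in zip(binary, values))
    sb = math.sqrt(sum((b - mb) ** 2 for b in binary))
    sv = math.sqrt(sum((v - mv) ** 2 for v in values))
    return cov / (sb * sv)


def t_statistic(r, n):
    """Student's t for a correlation r over n objects (n - 2 degrees of freedom)."""
    if 1.0 - r * r < 1e-12:  # perfect correlation: t is unbounded
        return math.copysign(math.inf, r)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def bonferroni(pvalues, alpha=0.05):
    """Reject a hypothesis when its p-value is at most alpha / m."""
    m = len(pvalues)
    return [p <= alpha / m for p in pvalues]


def benjamini_hochberg(pvalues, alpha=0.05):
    """Step-up rule: reject the k smallest p-values, k = max rank with p <= k*alpha/m."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for i in order[:cutoff]:
        rejected[i] = True
    return rejected


# Stand-in for the binary donor index (assumption, see the text above).
donor_index = late_indicator(EARLY_ILE)
r_ile = point_biserial(late_indicator(EARLY_ILE), donor_index)
r_arg = point_biserial(late_indicator(EARLY_ARG), donor_index)
print(round(r_ile, 3), round(r_arg, 3))  # 1.0 0.8

toy_p = [0.001, 0.012, 0.025, 0.045, 0.5]  # made-up p-values, not study results
print(bonferroni(toy_p))          # [True, False, False, False, False]
print(benjamini_hochberg(toy_p))  # [True, True, True, False, False]
```

With 2×553 = 1106 tests and the assumed level of 0.05, the Bonferroni rule compares every p-value with 0.05/1106, roughly 4.5e-5, while the Benjamini-Hochberg rule relaxes the threshold as the rank of the p-value grows. The two lists differ only in Ile and Arg, so the same binary index that matches one list exactly correlates at 0.8 with the other.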
The 2B-PLS plane reveals a compact arrangement of the physicochemical properties of three groups of amino acids, namely those with adenine, thymine and cytosine, respectively, in the second codon position, together with the maximum dispersion of the amino acids with guanine in the second codon position.
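The 2B-PLS step can be sketched in the same hedged spirit. The code below follows the usual two-block formulation: standardize both blocks, form their cross-covariance matrix and take its singular value decomposition. Whether the authors standardized the variables, and which software settings they used, is not stated in the abstract, so those choices are assumptions. The second block here is a random placeholder matrix named `placeholder_indexes`, not AAindex data, and the helper names are hypothetical.

```python
import numpy as np

AMINO_ACIDS = ["Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
               "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val"]
EARLY_ILE = {"Ala", "Asp", "Glu", "Gly", "Ile", "Leu", "Pro", "Ser", "Thr", "Val"}
EARLY_ARG = (EARLY_ILE - {"Ile"}) | {"Arg"}


def standardize(block):
    """Center each column and scale it to unit sample variance."""
    block = np.asarray(block, dtype=float)
    return (block - block.mean(axis=0)) / block.std(axis=0, ddof=1)


def two_block_pls(x, y):
    """SVD of the cross-covariance matrix between two blocks on the same objects.

    Returns the singular values and the object scores on the paired axes
    of the first and the second block.
    """
    x, y = standardize(x), standardize(y)
    n = x.shape[0]
    cross = x.T @ y / (n - 1)  # p x q cross-covariance matrix
    u, s, vt = np.linalg.svd(cross, full_matrices=False)
    return s, x @ u, y @ vt.T


# First block: the two binary lists (1 = late, 0 = early; assumed coding).
lists = np.array([[0 if aa in EARLY_ARG else 1, 0 if aa in EARLY_ILE else 1]
                  for aa in AMINO_ACIDS])

# Second block: random placeholder standing in for the 553 AA indexes.
rng = np.random.default_rng(0)
placeholder_indexes = rng.normal(size=(len(AMINO_ACIDS), 8))

singular_values, list_scores, index_scores = two_block_pls(lists, placeholder_indexes)
print(list_scores.shape, index_scores.shape)  # (20, 2) (20, 2)
```

Each pair of score columns has a covariance equal to the corresponding singular value, so the first pair of axes captures the strongest shared signal between the two blocks. Plotting the amino acids by their first two scores gives a plane of the kind the abstract refers to. With real AAindex columns in place of the placeholder, the positions on that plane would become interpretable.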
About the authors
V. Efimov
Institute of Cytology and Genetics, Siberian Branch, Russian Academy of Sciences; Institute of Animal Systematics and Ecology, Siberian Branch, Russian Academy of Sciences; Novosibirsk State University; Tomsk State University
Corresponding author.
Email: vmefimov@gmail.com
Russia, Novosibirsk; Novosibirsk; Novosibirsk; Tomsk
K. Efimov
Higher School of Economics
Email: vmefimov@gmail.com
Russia, Moscow
V. Kovaleva
Institute of Animal Systematics and Ecology, Siberian Branch, Russian Academy of Sciences
Email: vmefimov@gmail.com
Russia, Novosibirsk
References
- Trifonov E.N. (2000) Consensus temporal order of amino acids and evolution of the triplet code. Gene. 261(1), 139–151.
- Trifonov E.N. (2004) The triplet code from first principles. J. Biomol. Struct. Dynamics. 22(1), 1–11.
- Sobolevsky Y., Trifonov E.N. (2005) Conserved sequences of prokaryotic proteomes and their compositional age. J. Mol. Evol. 61, 591–596.
- Jordan I.K., Kondrashov F.A., Adzhubei I.A., Wolf Y.I., Koonin E.V., Kondrashov A.S., Sunyaev S. (2005) A universal trend of amino acid gain and loss in protein evolution. Nature. 433(7026), 633–638.
- Trifonov E.N. (2009) The origin of the genetic code and of the earliest oligopeptides. Res. Microbiol. 160(7), 481–486.
- Granold M., Hajieva P., Toşa M.I., Irimie F.D., Moosmann B. (2018) Modern diversification of the amino acid repertoire driven by oxygen. Proc. Natl. Acad. Sci. USA. 115(1), 41–46.
- Demongeot J., Seligmann H. (2019) Spontaneous evolution of circular codes in theoretical minimal RNA rings. Gene. 705, 95–102.
- Seligmann H. (2020) First arrived, first served: competition between codons for codon–amino acid stereochemical interactions determined early genetic code assignments. Sci. Nature. 107(3), 20.
- Saralov A.I. (2021) Factors in protobiomonomer selection for the origin of the standard genetic code. Acta Biotheoretica. 69(4), 745–767.
- Zhao M., Ding R., Liu Y., Ji Z., Zhao Y. (2022) Determination of the amino acid recruitment order in early life by genome-wide analysis of amino acid usage bias. Biomolecules. 12(2), 171.
- Wong J.T.F. (1981) Coevolution of genetic code and amino acid biosynthesis. Trends Biochem. Sci. 6, 33–36.
- Brooks D.J., Fresco J.R., Singh M. (2004) A novel method for estimating ancestral amino acid composition and its application to proteins of the Last Universal Ancestor. Bioinformatics. 20(14), 2251–2257.
- Wong J.T.F. (2005) Coevolution theory of the genetic code at age thirty. BioEssays. 27(4), 416–425.
- Doi N., Kakukawa K., Oishi Y., Yanagawa H. (2005) High solubility of random sequence proteins consisting of five kinds of primitive amino acids. Protein Eng. Des. Sel. 18(6), 279–284.
- Trifonov E.N. (2008) Tracing life back to elements. Physics Life Rev. 5(2), 121–132.
- Higgs P.G., Pudritz R.E. (2009) A thermodynamic basis for prebiotic amino acid synthesis and the nature of the first genetic code. Astrobiology. 9(5), 483–490.
- McDonald G.D., Storrie-Lombardi M.C. (2010) Biochemical constraints in a protobiotic earth devoid of basic amino acids: the “BAA(-) world”. Astrobiology. 10(10), 989–1000.
- Longo L.M., Blaber M. (2012) Protein design at the interface of the pre-biotic and biotic worlds. Arch. Biochem. Biophys. 526(1), 16–21.
- Longo L.M., Lee J., Blaber M. (2013) Simplified protein design biased for prebiotic amino acids yields a foldable, halophilic protein. Proc. Natl. Acad. Sci. USA. 110(6), 2135–2139.
- Doig A.J. (2017) Frozen, but no accident – why the 20 standard amino acids were selected. FEBS J. 284(9), 1296–1305.
- Koonin E.V., Novozhilov A.S. (2017) Origin and evolution of the universal genetic code. Annu. Rev. Genet. 51(1), 45–62.
- Fried S.D., Fujishima K., Makarov M., Cherepashuk I., Hlouchova K. (2022) Peptides before and during the nucleotide world: an origins story emphasizing cooperation between proteins and nucleic acids. J. R. Soc. Interface. 19(187), 20210641.
- Makarov M., Sanchez Rocha A.C., Krystufek R., Cherepashuk I., Dzmitruk V., Charnavets T., Hlouchova K. (2023) Early selection of the amino acid alphabet was adaptively shaped by biophysical constraints of foldability. J. Am. Chem. Soc. 145(9), 5320–5329.
- Kawashima S., Pokarowski P., Pokarowska M., Kolinski A., Katayama T., Kanehisa M. (2007) AAindex: amino acid index database, progress report 2008. Nucl. Acids Res. 36(suppl. 1), D202–D205.
- Kendall M., Stuart A. (1973) Statistical Inference and Relationships. Moscow: Nauka, 900 p. (In Russian)
- Narkevich A.N., Vinogradov K.A., Grjibovski A.M. (2020) Multiple comparisons in biomedical research: the problem and its solutions. Ekologiya cheloveka (Human Ecology). 10, 55–64. (In Russian)
- Wasserstein R.L., Schirm A.L., Lazar N.A. (2019) Moving to a world beyond “p < 0.05”. Am. Statistician. 73(suppl. 1), 1–19.
- Breimann S., Kamp F., Steiner H., Frishman D. (2024) AAontology: an ontology of amino acid scales for interpretable machine learning. J. Mol. Biol. 168717.
- Rohlf F.J., Corti M. (2000) The use of partial least-squares to study covariation in shape. Systematic Biol. 49, 740–753.
- Hill T., Lewicki P. (2006) Statistics: methods and applications: a comprehensive reference for science, industry, and data mining. Tulsa, Okla., USA: StatSoft Ltd. 719 p. ISBN: 9781884233593
- Hammer Ø., Harper D.A.T., Ryan P.D. (2001) PAST: paleontological statistics software package for education and data analysis. Palaeontologia Electronica. 4, 1–9.
- Polunin D., Shtaiger I., Efimov V. (2019) JACOBI4 software for multivariate analysis of biological data. bioRxiv. 803684.
- Charton M., Charton B.I. (1983) The dependence of the Chou–Fasman parameters on amino acid side chain structure. J. Theor. Biol. 102(1), 121–134.
- Nakashima H., Nishikawa K. (1992) The amino acid composition is different between the cytoplasmic and extracellular sides in membrane proteins. FEBS Lett. 303(2–3), 141–146.
- Cedano J., Aloy P., Perez-Pons J.A., Querol E. (1997) Relation between amino acid composition and cellular location of proteins. J. Mol. Biol. 266(3), 594–600.
- Dayhoff M., Schwartz R., Orcutt B. (1978) A model of evolutionary change in proteins. Atlas Protein Sequence Struct. 5, 345–352.
- Jones D.T., Taylor W.R., Thornton J.M. (1992) The rapid generation of mutation data matrices from protein sequences. Bioinformatics. 8(3), 275–282.
- Hutchens J.O. (1970) Heat capacities, absolute entropies, and entropies of formation of amino acids and related compounds. In: Handbook of Biochemistry. 2nd ed. Ed. Sober H.A. Cleveland, Ohio: Chem. Rubber Co., B60–B61.
- Qian N., Sejnowski T.J. (1988) Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 202(4), 865–884.
- Fukuchi S., Nishikawa K. (2001) Protein surface amino acid compositions distinctively differ between thermophilic and mesophilic bacteria. J. Mol. Biol. 309(4), 835–843.
- Kumar S., Tsai C.J., Nussinov R. (2000) Factors enhancing protein thermostability. Protein Engineering. 13(3), 179–191.
- Jukes T.H., Holmquist R., Moise H. (1975) Amino acid composition of proteins: selection against the genetic code. Science. 189(4196), 50–51.
- Kakraba S., Knisley D. (2016) A graph theoretic model of single point mutations in the cystic fibrosis transmembrane conductance regulator. J. Adv. Biotech. 6, 780–786.
- Prabhakaran M., Ponnuswamy P.K. (1982) Shape and surface features of globular proteins. Macromolecules. 15(2), 314–320.
- McMeekin T.L., Groves M.L., Hipp N.J. (1964) Refractive indices of amino acids, proteins, and related substances. In: Amino Acids and Serum Proteins. Ed. Stekol J.A. Washington: Advances in Chemistry, Am. Chem. Soc., 44, pp. 54–66.
- Cronin J.R., Pizzarello S. (1983) Amino acids in meteorites. Adv. Space Res. 3, 5–18.
- Miller S.L. (1953) A production of amino acids under possible primitive earth conditions. Science. 117, 528–529.
- Fukui K. (1982) Role of frontier orbitals in chemical reactions. Science. 218, 747–754.
- Pearson R.G. (1986) Absolute electronegativity and hardness correlated with molecular orbital theory. Proc. Natl. Acad. Sci. USA. 83, 8440–8441.
- Aihara J. (1999) Reduced HOMO-LUMO gap as an index of kinetic stability for polycyclic aromatic hydrocarbons. J. Phys. Chem. A. 103, 7487–7495.
- Selassie C.D., Verma R.P. (2003) History of quantitative structure-activity relationships. Burger’s Med. Chem. Drug Discov. 1, 1–48.