C11ORF97, or chromosome 11 open reading frame 97, is a protein that in humans is encoded by the C11ORF97 gene.[5] It is predicted to localize to the cytoplasm and is thought to play a role in the ciliary basal body.[6] Based on its protein interactions, it is thought to have a role in Lemierre's syndrome and hepatic coma.[7]

C11orf97
Identifiers
Aliases: C11orf97, LINC01171, chromosome 11 open reading frame 97
External IDs: MGI: 1916575; HomoloGene: 75283; GeneCards: C11orf97; OMA: C11orf97 - orthologs
Orthologs
Species: Human, Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA): NM_001190462 (human); NM_029306 (mouse)
RefSeq (protein): NP_001177391 (human); NP_083582 (mouse)
Location (UCSC): Chr 11: 94.51 – 94.53 Mb (human); Chr 9: 14.67 – 14.68 Mb (mouse)
PubMed search[3][4]

Gene

The human C11ORF97 gene is 19,663 base pairs long including all introns, spanning positions 94,512,461 to 94,532,123.[8] It is located on the long arm of chromosome 11 at 11q21, on the plus strand.[8] Human C11ORF97 has only one known variant.[8]
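
The stated gene length follows from the two genomic positions if they are read as 1-based, inclusive coordinates (an assumption; the source does not state the coordinate convention). A minimal sketch of that arithmetic, using a hypothetical helper name:

```python
def inclusive_span_length(start: int, end: int) -> int:
    """Number of bases covered by 1-based, inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


# Positions quoted in the text for human C11ORF97.
gene_start = 94_512_461
gene_end = 94_532_123

print(inclusive_span_length(gene_start, gene_end))  # 19663
```

Under a 0-based, half-open convention the length would instead be `end - start`, so the convention is worth confirming before reusing coordinates from another source.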

mRNA transcript

Annotated conceptual translation of human C11ORF97 mRNA with aligned peptide sequence, showing important conserved, repeated, and modified regions.

Expression

Human C11ORF97 is expressed in many tissues, but mainly in the lungs and the brain.[9][10] In the Human Protein Atlas consensus dataset for RNA tissue specificity, six brain regions are among the tissues with the highest expression, and the enriched groups are brain, choroid plexus, fallopian tube, and lung.[11] Expression was also high in the testes, although this is most likely not significant for this gene.[12]

Protein

Features

The single human C11ORF97 variant is 126 amino acids long and has a predicted molecular weight of 13.9 kDa.[5][13] Its isoelectric point is 9.87.[14] It has no transmembrane regions and no domains of unknown function. Amino acid composition analysis with the SAPS tool shows that it is enriched in glycine (G) and arginine (R) and strongly depleted in serine (S), threonine (T), aspartate (D), and phenylalanine (F).[15]
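
As a rough illustration of the quantities above, the sketch below computes raw amino acid percentages for a sequence and a back-of-the-envelope molecular weight from length alone. It does not reproduce SAPS, which judges enrichment against a reference protein set. The peptide is a made-up toy string, not the C11ORF97 sequence, and the 110 Da average residue mass is a common rule of thumb rather than the method behind the cited 13.9 kDa value.

```python
from collections import Counter

# Rule-of-thumb average mass of one amino acid residue (assumption).
AVERAGE_RESIDUE_MASS_DA = 110.0


def composition_percent(sequence: str) -> dict[str, float]:
    """Percentage of each residue in a one-letter protein sequence."""
    seq = sequence.upper()
    counts = Counter(seq)
    total = len(seq)
    return {aa: 100.0 * n / total for aa, n in sorted(counts.items())}


def rough_mass_kda(length_aa: int) -> float:
    """Approximate molecular weight in kDa from sequence length only."""
    return length_aa * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Hypothetical toy peptide, NOT the real C11ORF97 sequence.
toy_peptide = "MGRRGGAARLGR"
for residue, percent in composition_percent(toy_peptide).items():
    print(f"{residue}: {percent:.1f}%")

# A 126-residue protein comes out near the predicted weight quoted above.
print(round(rough_mass_kda(126), 1))  # 13.9
```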

 
I-TASSER Tertiary Structure Results for Human C11ORF97.

Subcellular localization

Human C11ORF97 is predicted by DeepLoc to localize to the cytoplasm, with a score of 0.5188.[16] NetNES, SignalP, TatP, and the Human Protein Atlas returned no localization results for C11ORF97. Both a nuclear localization signal and a nuclear export signal were found, suggesting that C11ORF97 most likely has a role in the nucleus and is then exported to the cytoplasm.

 
Tertiary Structure for Human C11ORF97 from AlphaFold.

Structure

Annotated Tertiary Structure of Human C11ORF97 from AlphaFold, annotated with NCBI's iCn3D structure viewer.

The tertiary structure was predicted with AlphaFold and I-TASSER and annotated with NCBI's iCn3D tool.[17][18] The results, shown in the accompanying figures, share similar or nearly identical features: two alpha helices and no beta sheets.

The C-scores for the five I-TASSER models are, in order, -3.59, -4.88, -5.00, -4.47, and -5.00. Because a higher C-score indicates greater confidence, the first model is the most confident of the five predicted structures.
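
The ranking can be reproduced with a few lines of code. The sketch below assumes only that a higher C-score means higher confidence, as I-TASSER describes it. The dictionary keys are the model numbers in the order listed above.

```python
# C-scores of the five I-TASSER models, keyed by model number.
c_scores = {1: -3.59, 2: -4.88, 3: -5.00, 4: -4.47, 5: -5.00}

# The model with the highest (least negative) C-score is the most confident.
best_model = max(c_scores, key=c_scores.get)

# Full ranking from most to least confident; tied models keep their order.
ranked_models = sorted(c_scores, key=c_scores.get, reverse=True)

print(best_model)     # 1
print(ranked_models)  # [1, 4, 2, 3, 5]
```

I-TASSER reports C-scores on a scale of roughly -5 to 2, so all five values sit near the low end. The ranking therefore identifies the best of the available models, not a high-confidence structure.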

Function

Protein-protein interactions

Only two protein-protein interactions were found for human C11ORF97 at a medium or higher confidence threshold.[19]

| Name | Full name | Score | Identification | Description |
|---|---|---|---|---|
| MORN2 | MORN repeat-containing protein 2 | 0.693 | Textmining | Predicted to be involved in cell differentiation and spermatogenesis. Associated with Lemierre's syndrome and hepatic coma. |
| CRACR2A | Calcium release activated channel regulator 2A | 0.583 | Textmining | Enables GTPase activity and calcium ion binding. Involved in activation of store-operated calcium channel activity and store-operated calcium entry. |
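
To show how the confidence threshold applies to these scores, the sketch below labels each interaction using the cutoffs STRING conventionally offers (0.15 low, 0.4 medium, 0.7 high, 0.9 highest). The `Interaction` type and function name are hypothetical, and the scores are copied from the table rather than fetched from STRING.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    partner: str
    score: float
    evidence: str


# Conventional STRING confidence cutoffs, highest first.
STRING_CUTOFFS = [(0.9, "highest"), (0.7, "high"), (0.4, "medium"), (0.15, "low")]


def confidence_label(score: float) -> str:
    """Return the highest confidence band whose cutoff the score reaches."""
    for cutoff, label in STRING_CUTOFFS:
        if score >= cutoff:
            return label
    return "below low"


interactions = [
    Interaction("MORN2", 0.693, "textmining"),
    Interaction("CRACR2A", 0.583, "textmining"),
]

# Keep interactions at medium confidence or better, as in the text.
kept = [i for i in interactions if i.score >= 0.4]
for i in kept:
    print(i.partner, i.score, confidence_label(i.score))
```

Both partners land in the medium band. MORN2 at 0.693 falls just short of the 0.7 cutoff for high confidence, and both scores rest on text-mining evidence alone.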

Post-translational modifications

Illustration of C11ORF97 protein post-translational modifications, including phosphorylation, glycosylation, C-mannosylation, propionylation, sumoylation, and a SUMO interaction.
 
Unrooted phylogenetic tree for C11ORF97 orthologs. Three-letter codes can be found in the figures of multiple sequence alignments.

Many post-translational modifications are found in human C11ORF97, and many of them are conserved in orthologs. They include multiple phosphorylation sites as well as a SUMO-interaction motif and a sumoylation site, among others.[20] Post-translational modifications of these types have various functions and can play a role in cell growth and proliferation. The accompanying illustration gives a more detailed description.

Homology and evolution

Orthologs of the human C11ORF97 protein are found in vertebrates and invertebrates. Among vertebrates, they occur in mammals, birds, reptiles, amphibians, and fish. C11ORF97 appears to have first arisen in invertebrates about 686 million years ago. All of the comparisons are listed in Table 2 below. An unrooted phylogenetic tree is also provided, showing the predicted relationships among the orthologs of this gene.[21] Multiple sequence alignments for strict and distant orthologs are also provided as figures. The three-letter abbreviations are the same in all figures.


 
Annotated Multiple Sequence Alignment of Human C11ORF97 Protein and Strict Orthologs. The orthologs in this alignment are from mammals, birds, and reptiles. This alignment was made using ClustalW, and shading was done using BoxShade. The three-letter codes are as follows: Cca, Caretta caretta; Tca, Terrapene carolina triunguis; Pra, Podarcis raffonei; Cti, Crotalus tigris; Gga, Gallus gallus; Aap, Apus apus; Aro, Apteryx rowi; Tgu, Tinamus guttatus; Hsa, Homo sapiens; Ame, Ailuropoda melanoleuca; Mmu, Mus musculus; Mna, Miniopterus natalensis.

Annotated Multiple Sequence Alignment of Human C11ORF97 Protein and Distant Orthologs. The orthologs in this alignment are from amphibians, fish, and invertebrates. This alignment was made using ClustalW, and shading was done using BoxShade. The three-letter codes are as follows: Ler, Leucoraja erinacea; Ppe, Pristis pectinata; Xla, Xenopus laevis; Hys, Hyla sarda; Gse, Geotrypetes seraphini; Rbi, Rhinatrema bivittatum; Hsa, Homo sapiens; Hru, Haliotis rufescens; Gae, Gigantopelta aegis.
Table 2. Orthologs of Human C11ORF97. Compares orthologs from different groups. Sorted by date of divergence within groups, and then by sequence similarity.

| Group | Genus, species | Common name | Taxonomic group | Date of divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%) |
|---|---|---|---|---|---|---|---|---|
| Mammals | Homo sapiens | human | Primates | 0 | NP_001177391.1 | 126 | 100 | 100 |
| Mammals | Mus musculus | mouse | Rodentia | 87 | NP_083582.2 | 121 | 71.4 | 77 |
| Mammals | Ailuropoda melanoleuca | giant panda | Carnivora | 94 | XP_019648185.2 | 127 | 84.3 | 88.2 |
| Mammals | Miniopterus natalensis | natal long-fingered bat | Chiroptera | 94 | XP_016062178.1 | 97 | 47.2 | 50 |
| Aves | Apteryx rowi | okarito brown kiwi | Apterygiformes | 319 | XP_025927683 | 125 | 42.8 | 54.5 |
| Aves | Tinamus guttatus | white-throated tinamou | Tinamiformes | 319 | XP_010210901.1 | 120 | 34.2 | 49 |
| Aves | Apus apus | common swift | Apodiformes | 319 | XP_051500140 | 121 | 33.8 | 45.8 |
| Aves | Gallus gallus | chicken | Galliformes | 319 | XP_040517768.1 | 264 | 20.1 | 28.6 |
| Reptiles | Terrapene carolina triunguis | three-toed box turtle | Testudines | 319 | XP_024064440.1 | 118 | 46.8 | 54.7 |
| Reptiles | Podarcis raffonei | aeolian wall lizard | Squamata | 319 | XP_053241613.1 | 121 | 42.7 | 52.4 |
| Reptiles | Crotalus tigris | tiger rattlesnake | Squamata | 319 | XP_039205026.1 | 133 | 40 | 53.8 |
| Reptiles | Caretta caretta | loggerhead-turtle | Testudines | 319 | XP_048699259.1 | 130 | 38.8 | 46.2 |
| Amphibians | Rhinatrema bivittatum | two-lined caecilian | Gymnophiona | 352 | XP_029458218.1 | 149 | 26.6 | 36.2 |
| Amphibians | Geotrypetes seraphini | gaboon caecilian | Gymnophiona | 352 | XP_033805672.1 | 185 | 21.8 | 30.1 |
| Amphibians | Hyla sarda | sardinian tree frog | Anura | 352 | XP_056417471.1 | 184 | 19 | 27.5 |
| Amphibians | Xenopus laevis | african clawed frog | Anura | 352 | OCT93259.1 | 167 | 18 | 28.6 |
| Fish | Leucoraja erinacea | little skate | Rajiformes | 462 | XP_055493278.1 | 144 | 19.8 | 30.5 |
| Fish | Pristis pectinata | small-tooth sawfish | Pristiformes | 462 | XP_051882720.1 | 77 | 19.5 | 35.9 |
| Invertebrates | Haliotis rufescens | red abalone | Vetigastropoda | 686 | XP_046353937.1 | 139 | 16.7 | 26.9 |
| Invertebrates | Gigantopelta aegis | deep sea snail | Neomphalina | 686 | XP_041350579.1 | 154 | 15.8 | 23 |
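
For readers unfamiliar with the sequence identity column, the sketch below shows one common way to compute percent identity from a pairwise alignment. It is not the method behind the table, whose alignment tool and denominator are not stated in the source. The function name, the choice to count only gap-free columns, and the toy aligned strings are all assumptions for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns where both residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns that contain a gap in either sequence
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Hypothetical toy alignment, not taken from any C11ORF97 ortholog.
print(percent_identity("MKT-LLG", "MRTALL-"))  # 80.0
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which can shift the percentage noticeably for short or gapped alignments.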

Paralogs

No paralogs were found for the human C11ORF97 protein.

Clinical significance/pathology

Based on its protein interactions, C11ORF97 is predicted to have a role in Lemierre's syndrome and hepatic coma.[7] It has also been linked to ciliary movement in multiple published studies.[22][23] In addition, C11ORF97 was mentioned in a bioinformatics study of the respiratory illness COVID-19.[24]

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000257057 – Ensembl, May 2017
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000031927 – Ensembl, May 2017
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ a b "uncharacterized protein C11orf97 [Homo sapiens]". National Center for Biotechnology Information (NCBI).
  6. ^ "C11orf97". neXtProt. SIB Swiss Institute of Bioinformatics. Retrieved 2023-12-07.
  7. ^ a b "DISEASES - MORN2". diseases.jensenlab.org. Retrieved 2023-12-07.
  8. ^ a b c "C11orf97 chromosome 11 open reading frame 97 [Homo sapiens (human)] - Gene". National Center for Biotechnology Information (NCBI). U.S. National Library of Medicine. Retrieved 2023-12-07.
  9. ^ Fagerberg L, Hallström BM, Oksvold P, Kampf C, Djureinovic D, Odeberg J, et al. (February 2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Molecular & Cellular Proteomics. 13 (2): 397–406. doi:10.1074/mcp.m113.035600. PMC 3916642. PMID 24309898.
  10. ^ Duff MO, Olson S, Wei X, Garrett SC, Osman A, Bolisetty M, et al. (May 2015). "Genome-wide identification of zero nucleotide recursive splicing in Drosophila". Nature. 521 (7552): 376–379. Bibcode:2015Natur.521..376D. doi:10.1038/nature14475. PMC 4529404. PMID 25970244.
  11. ^ "Tissue expression of C11orf97 - Summary". The Human Protein Atlas. Retrieved 2023-12-07.
  12. ^ "Tissue expression of C11orf97 - Summary". The Human Protein Atlas. Retrieved 2023-12-16.
  13. ^ "CK097_HUMAN". UniProt. A0A1B0GVM6. Retrieved 2023-12-07.
  14. ^ "Protein Isoelectric Point". www.bioinformatics.org. Retrieved 2023-12-07.
  15. ^ "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2023-12-07.
  16. ^ "DeepLoc 2.0". DTU Health Tech - Bioinformatic Services. Retrieved 2023-12-07.
  17. ^ "I-TASSER results". zhanggroup.org. Retrieved 2023-12-07.
  18. ^ "iCn3D: Web-based 3D Structure Viewer". National Center for Biotechnology Information (NCBI). U.S. National Library of Medicine. Retrieved 2023-12-07.
  19. ^ "C11orf97 protein (human)". STRING interaction network. Retrieved 2023-12-16.
  20. ^ "GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interacting Motifs". sumo.biocuckoo.cn. Retrieved 2023-12-16.
  21. ^ "Phylogeny.fr".
  22. ^ Stauber M, Boldt K, Wrede C, Weidemann M, Kellner M, Schuster-Gossler K, et al. (September 2017). "1700012B09Rik, a FOXJ1 effector gene active in ciliated tissues of the mouse but not essential for motile ciliogenesis". Developmental Biology. 429 (1): 186–199. doi:10.1016/j.ydbio.2017.06.027. PMID 28666954.
  23. ^ Merlino J (2022). "Role of the Primary Cilium in the Crosstalk Between Obesity and Cancer". cdr.lib.unc.edu. doi:10.17615/gqs5-kz52. Retrieved 2023-12-16.
  24. ^ Vastrad B, Vastrad C, Tengli A (December 2020). "Bioinformatics analyses of significant genes, related pathways, and candidate diagnostic biomarkers and molecular targets in SARS-CoV-2/COVID-19". Gene Reports. 21: 100956. doi:10.1016/j.genrep.2020.100956. PMC 7854084. PMID 33553808.