Chromosome 9 open reading frame 85, commonly known as C9orf85, is a protein that in Homo sapiens is encoded by the C9orf85 gene. The gene is located at 9q21.13.[5] Alternative splicing produces four isoforms. C9orf85 has a predicted molecular weight of 20.17 kDa[6] and an isoelectric point of 9.54.[6] The function of the gene has not yet been confirmed; however, it has been found to be highly expressed in highly differentiated cells.[7]

C9orf85
Identifiers
Aliases: C9orf85, chromosome 9 open reading frame 85
External IDs: MGI: 1913456; HomoloGene: 11933; GeneCards: C9orf85; OMA: C9orf85 - orthologs
Orthologs
Species: Human; Mouse
Entrez
Ensembl
UniProt
RefSeq (mRNA): Human NM_182505, NM_198394; Mouse NM_025423
RefSeq (protein): Human NP_872311, NP_001351982, NP_001351984, NP_001351986; Mouse NP_079699
Location (UCSC): Human Chr 9: 71.91 – 71.99 Mb; Mouse Chr 19: 21.56 – 21.63 Mb
PubMed search: Human [3]; Mouse [4]

Background

Protein Sequence

The sequence for C9orf85 isoform 1 in Homo sapiens, derived from NCBI:[5]

MSSQKGNVARSRPQKHQNTFSFKNDKFDKSVQTKKINAKLHDGVCQRCKEVLEWRVKYSKYKPLSKPKKCVKCLQKTVKDSYHIMCRPCACELEVCAKCGKKEDIVIPWSLPLLPRLECSGRILAHHNLRLPCSSDSPASASRVAGTTGAHHHAQLIFVFLVEMGFHYVGQAGLELLTS
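
As a rough illustration of how a predicted molecular weight can be derived from this sequence, the following Python sketch sums average amino acid residue masses and adds one water molecule for the free termini. The mass table and the function name are assumptions made for illustration; this is not the procedure of the cited tool, so small differences from the reported value are possible.

```python
# Average residue masses in daltons (amino acid minus one water molecule).
# Illustrative standard values; not taken from the cited SAPS output.
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the N- and C-terminal groups

# C9orf85 isoform 1, as listed above (179 residues).
C9ORF85_ISOFORM_1 = (
    "MSSQKGNVARSRPQKHQNTFSFKNDKFDKSVQTKKINAKLHDGVCQRCKEVLEWRVKYSK"
    "YKPLSKPKKCVKCLQKTVKDSYHIMCRPCACELEVCAKCGKKEDIVIPWSLPLLPRLECS"
    "GRILAHHNLRLPCSSDSPASASRVAGTTGAHHHAQLIFVFLVEMGFHYVGQAGLELLTS"
)


def molecular_weight_kda(seq):
    """Estimate the average molecular weight of a peptide in kilodaltons."""
    total = sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS
    return total / 1000.0


print(len(C9ORF85_ISOFORM_1), "residues")
print(round(molecular_weight_kda(C9ORF85_ISOFORM_1), 2), "kDa")
```

With these masses the estimate should land close to the predicted molecular weight quoted in the introduction.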

Aliases

  • Uncharacterized Protein C9orf85[8]
  • MGC61599[9]
  • RP11-364E17.2[10]
  • LOC138241[10]
  • OTTHUMP00000021461[10]

Isoforms

Lengths of the C9orf85 isoforms
Isoform | mRNA length (bp) | Protein length (aa)
1[11] | 3821 | 179
2[12] | 1185 | 157
3[13] | 1316 | 138
4[14] | 3707 | 69

Isoform 1 is the predominant isoform. It contains four exons, and its accession number is NM_001365053.2.[5]

Homology

Orthologs

Orthologs of the C9orf85 gene were found across a wide range of taxa, from vertebrates to bacteria. Among protists, however, no ortholog of the human gene was found other than in Plasmodium.

A list of 20 orthologs of the C9orf85 gene in Homo sapiens[15]
Genus species | Common name | Taxonomic group | Date of divergence (MYA) | Accession number | Length (aa) | Identity | Similarity
Homo sapiens | Human | Chordata | 0 | NP_001351982 | 179 | 100% | 100%
Meriones unguiculatus | Mongolian gerbil | Rodentia | 90 | XP_021514638 | 154 | 74% | 84%
Gallus gallus | Chicken | Chordata | 312 | XP_001233821 | 166 | 78.70% | 60%
Terrapene carolina triunguis | Three-toed box turtle | Chordata | 312 | XP_024066792 | 171 | 85.45% | 61%
Chelonia mydas | Green sea turtle | Chordata | 312 | XP_007065676 | 178 | 84.31% | 61%
Calidris pugnax | Ruff | Chordata | 312 | XP_014813985 | 166 | 77.78% | 60%
Microcaecilia unicolor | Tiny Cayenne caecilian | Chordata | 351.8 | XP_030049723 | 178 | 76.15% | 60%
Xenopus tropicalis | Western clawed frog | Chordata | 351.8 | KAE8633085 | 133 | 78.18% | 61%
Electrophorus electricus | Electric eel | Chordata | 435 | XP_026886158 | 156 | 55.13% | 87%
Oncorhynchus mykiss | Rainbow trout | Chordata | 435 | XP_021461156 | 177 | 60.77% | 72%
Acanthaster planci | Crown-of-thorns starfish | Echinodermata | 684 | XP_022096254 | 197 | 56.76% | 62%
Photinus pyralis | Big dipper firefly | Arthropoda | 797 | XP_031346726 | 183 | 58.04% | 62%
Pomacea canaliculata | Golden apple snail | Mollusca | 797 | XP_025077101 | 208 | 47.33% | 73%
Drosophila melanogaster | Fruit fly | Arthropoda | 797 | NP_573209 | 234 | 49.58% | 65%
Acropora millepora | Coral | Cnidaria | 824 | XP_029187517 | 190 | 57.14% | 62%
Salpingoeca rosetta | Choanoflagellate | Choanoflagellate | 1023 | XP_004995700 | 286 | 41.67% | 60%
Apophysomyces ossiformis | Fungus | Mucoromycota | 1105 | KAF7725139 | 181 | 40% | 72%
Ricinus communis | Castor oil plant | Spermatophyta | 1496 | XP_002530997 | 227 | 34.21% | 78%
Plasmodium ovale wallikeri | Malaria parasite | Apicomplexa | 1768 | SBT56954 | 680 | 68.75% | 17%
Bacillus cereus | Bacterium | Firmicutes | 4290 | KXI72539 | 83 | 73.61% | 39%

Paralogs

Five possible paralogs of the C9orf85 gene in Homo sapiens[15]
Paralog | Accession number | Length (aa) | Identity | Similarity | Location
CCDC198 | XP_005267863 | 290 | 44.38% | 89% | Chromosome 14
Retbindin | EAW84316 | 224 | 60% | 51% | Chromosome 19
hCG2038446 | EAX11460 | 135 | 68.54% | 49% | Chromosome 2
hCG1820974 | EAW94215 | 143 | 72.58% | 41% | Chromosome 17
O-phosphoseryl-tRNA(Sec) selenium transferase isoform X1 | XP_016863766 | 586 | 70.67% | 41% | Chromosome 4

Rate of Molecular Evolution

A graph depicting the rate of divergence for the human gene C9orf85 in comparison to Homo sapiens Cytochrome C and Fibrinogen Alpha Chain.

A rate of divergence can be calculated using the molecular clock hypothesis. As the graph shows, C9orf85 lies between cytochrome c and fibrinogen alpha chain, with a slope closer to that of cytochrome c. C9orf85 is therefore possibly evolving at a slower rate than most proteins.
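
One common way to convert percent identity into a divergence value for a molecular clock plot is the Poisson correction, d = -ln(1 - p), where p is the fraction of differing residues. The sketch below applies it to two rows of the ortholog table. The choice of this correction and the function name are assumptions, since the article does not state how the plotted values were computed.

```python
import math


def corrected_divergence(identity_percent):
    """Poisson-corrected substitutions per site from percent identity.

    Assumes d = -ln(1 - p), with p the observed fraction of differing sites.
    """
    if not 0 < identity_percent <= 100:
        raise ValueError("identity must be in the range (0, 100]")
    p = 1.0 - identity_percent / 100.0
    return -math.log(1.0 - p)


# (species, divergence time in MYA, percent identity) from the ortholog table.
examples = [
    ("Meriones unguiculatus", 90, 74.0),
    ("Oncorhynchus mykiss", 435, 60.77),
]

for species, mya, identity in examples:
    d = corrected_divergence(identity)
    print(f"{species}: {mya} MYA, corrected divergence {d:.3f}")
```

Plotting the corrected divergence against divergence time for each ortholog, and fitting a line through the origin, gives a slope that can be compared between proteins.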

Conservation

Multiple Sequence Alignment

A multiple sequence alignment (MSA)[16] was performed on the 15 orthologs most closely related to Homo sapiens C9orf85. Twenty amino acids were conserved in all 15 sequences, all near the beginning of the protein sequence, within the first three exons.

In an MSA of distantly related homologs, five amino acids were conserved between exons two and three.

In contrast, an alignment of the Homo sapiens sequence with that of the extremely distant Bacillus cereus showed 53 conserved amino acids, primarily in the fourth exon.

Cysteine

Multiple sequence alignment of C9orf85 showing the most significant and conserved cysteines.

Cysteine is the most frequently conserved amino acid in the protein sequence, accounting for 8 of the 20 conserved residues. Cysteines 48, 70, 89, and 96 and tryptophan 54 are conserved in all groups examined (vertebrates, invertebrates, fungi, plants, and protists) except bacteria.

Using the Statistical Analysis of Protein Sequences (SAPS) tool,[6] five cysteine spacings were found: four with the pattern C-X-X-C, beginning at amino acids 45, 70, 86, and 96, and a fifth at amino acid 89 (CAC). The C-X-X-C pattern is known to be present in metal-binding proteins and oxidoreductases.[17] Additionally, three of the cysteines that begin these spacings (C70, C89, and C96) were also among the most conserved amino acids in the most closely related orthologs.
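
The cysteine spacings described above can be located directly in the isoform 1 sequence. The following is a minimal regular-expression scan, not a reimplementation of SAPS; the function name is hypothetical and positions are 1-based.

```python
import re

# C9orf85 isoform 1, as listed in the Protein Sequence section.
C9ORF85_ISOFORM_1 = (
    "MSSQKGNVARSRPQKHQNTFSFKNDKFDKSVQTKKINAKLHDGVCQRCKEVLEWRVKYSK"
    "YKPLSKPKKCVKCLQKTVKDSYHIMCRPCACELEVCAKCGKKEDIVIPWSLPLLPRLECS"
    "GRILAHHNLRLPCSSDSPASASRVAGTTGAHHHAQLIFVFLVEMGFHYVGQAGLELLTS"
)


def motif_starts(seq, pattern):
    """Return 1-based start positions of all matches, including overlaps.

    The lookahead (?=...) lets a match begin at every position, so
    overlapping occurrences such as C-R-P-C and C-A-C are both reported.
    """
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


cxxc_starts = motif_starts(C9ORF85_ISOFORM_1, "C..C")  # C-X-X-C
cxc_starts = motif_starts(C9ORF85_ISOFORM_1, "C.C")    # C-X-C, e.g. CAC

print("C-X-X-C starts:", cxxc_starts)
print("C-X-C starts:", cxc_starts)
```

The start positions printed by the scan can be compared against the spacings listed in the text.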

Localization

Gene Localization in Humans

C9orf85 has been found to be highly expressed in epithelial cells[18] of the pancreas.[19] High levels of expression have also been found in the urinary bladder and thymus of the adult human, and significant expression has been found in the intestine of a 20-week-old fetus.[5]

Subcellular Localization

The k-nearest neighbors (k-NN) prediction classifies C9orf85 as 78.3% nuclear, 8.7% mitochondrial, 8.7% cytoplasmic, and 4.3% vacuolar.[20]
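
In a k-nearest neighbors classifier, each percentage is the share of the k closest reference proteins that carry a given localization label. The reported values are consistent with k = 23 and a neighbor split of 18, 2, 2, and 1. That split is inferred here from the percentages and is an assumption, not a figure taken from the cited output.

```python
# Hypothetical neighbor counts, inferred from the reported percentages.
neighbor_votes = {
    "nuclear": 18,
    "mitochondrial": 2,
    "cytoplasmic": 2,
    "vacuolar": 1,
}

k = sum(neighbor_votes.values())

# Each localization score is the fraction of the k neighbors with that label.
percentages = {
    location: round(100 * votes / k, 1)
    for location, votes in neighbor_votes.items()
}

print("k =", k)
for location, pct in percentages.items():
    print(f"{location}: {pct}%")
```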

Promoter

Three promoters are predicted for the C9orf85 gene.[21] The promoter chosen was GXP_18858 on the plus strand, selected for its large number of CAGE tags and because it lies furthest upstream. It spans positions 71909780 to 71911841, covers 2062 base pairs, and has 13 transcripts. The last 500 base pairs of the double-stranded promoter are shown below:

5' GCAGGAGGCGGGGATTGCGGAAAAGAAGAACCAATAGGAACAAAGGTTCC 3'
3' CGTCCTCCGCCCCTAACGCCTTTTCTTCTTGGTTATCCTTGTTTCCAAGG 5'
5' CCGCCCCTTTGATTTGATGGACTACACATTCGGGCCAATGGGGGAATTCT 3'
3' GGCGGGGAAACTAAACTACCTGATGTGTAAGCCCGGTTACCCCCTTAAGA 5'
5' CATTTCGAAGAAAGTGGGACTTGTTCTCCGGGTTTGAGAAAGAGGCTGCG 3'
3' GTAAAGCTTCTTTCACCCTGAACAAGAGGCCCAAACTCTTTCTCCGACGC 5'
5' CGGAGCCGGAGGGGTCGAGGCTGCGCCGCGTGGAGTGGCTTGGCTTAACA 3'
3' GCCTCGGCCTCCCCAGCTCCGACGCGGCGCACCTCACCGAACCGAATTGT 5'
5' GCAGGGAGGGCAGAGCGATGCTCTTTGACCTCCCAGAAGAGTCACGTGGG 3'
3' CGTCCCTCCCGTCTCGCTACGAGAAACTGGAGGGTCTTCTCAGTGCACCC 5'
5' CTGACCCAGAGCCGGGGCGGAAAGGCTGCGTTTGTTTCTTCCGGGTCATT 3'
3' GACTGGGTCTCGGCCCCGCCTTTCCGACGCAAACAAAGAAGGCCCAGTAA 5'
5' GACAGAAGCGTCAATTCCTGGGAGTAGTTCGTTGGTTTTCTTTCCCCTCA 3'
3' CTGTCTTCGCAGTTAAGGACCCTCATCAAGCAACCAAAAGAAAGGGGAGT 5'
5' TCCTTTTGCCTGCTCCCGGCGAGGGGTGGCTTTGATTTCGGCGATGAGCT 3'
3' AGGAAAACGGACGAGGGCCGCTCCCCACCGAAACTAAAGCCGCTACTCGA 5'
5' CCCAGAAAGGCAACGTGGCTCGTTCCAGACCTCAGAAGCACCAGAATACG 3'
3' GGGTCTTTCCGTTGCACCGAGCAAGGTCTGGAGTCTTCGTGGTCTTATGC 5'
5' TTTAGCTTCAAAAATGACAAGTTCGATAAAAGTGTGCAGACCAAGGTAGG 3'
3' AAATCGAAGTTTTTACTGTTCAAGCTATTTTCACACGTCTGGTTCCATCC 5'
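
Two details of the promoter listing can be checked with a few lines of Python: the length implied by the inclusive start and end coordinates, and that each 3' to 5' line is the base-by-base complement of the 5' to 3' line above it. The helper name is hypothetical, and only the first pair of lines is checked in this sketch.

```python
# Promoter coordinates as given in the text (inclusive on both ends).
START, END = 71909780, 71911841
promoter_length = END - START + 1

# Translation table for Watson-Crick base pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(strand):
    """Return the complementary strand, written in the same left-to-right order."""
    return strand.translate(COMPLEMENT)


# First pair of lines from the listing above.
top = "GCAGGAGGCGGGGATTGCGGAAAAGAAGAACCAATAGGAACAAAGGTTCC"
bottom = "CGTCCTCCGCCCCTAACGCCTTTTCTTCTTGGTTATCCTTGTTTCCAAGG"

print("promoter length:", promoter_length)
print("strands are complementary:", complement(top) == bottom)
```

Because the lower strand is printed 3' to 5', a plain complement is the right comparison; a reverse complement would only be needed if it were written 5' to 3'.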
Sixteen transcription factors predicted to bind to the promoter[22]
Transcription factor | Detailed matrix information | Matrix score
CLOX | Transcriptional repressor CDP | 0.962
KLFS | Gut-enriched Krueppel-like factor | 1.000
CAAT | Nuclear factor Y (Y-box binding factor) | 0.940
HIFF | Aryl hydrocarbon receptor nuclear translocator-like, homodimer | 1.000
MZF1 | Myeloid zinc finger protein | 0.992
STAT | STAT5: signal transducer and activator of transcription 5 | 0.944
ETSF | ETS-like gene 1 (ELK-1) | 0.958
CREB | Tax/CREB complex | 0.834
P53F | Tumor suppressor p53 (3' half site) | 0.921
TCFF | TCF11/LCP-F1/Nrf1 homodimers | 1.000
FKHD | Fork head homologous X binds DNA with a dual sequence specificity (FHXA and FHXB) | 0.870
MIRF | Zinc finger protein 768 | 0.819
BCL6 | B-cell CLL/lymphoma 6, member B (BCL6B) | 0.878
AP2F | Transcription factor AP-2, alpha | 0.931
EBOX | MYC associated factor X | 0.926
GCMF | Glial cells missing homolog 1, chorion-specific transcription factor GCMa | 0.942

Regulation

Transmembrane Domain

Although the protein sequence contains hydrophobic regions,[6][23][24] no transmembrane domains have been identified.[25]

Phosphorylation

A protein kinase C phosphorylation site is predicted at amino acids 3–5.[26] There is also a possible CK2 phosphorylation site at amino acids 77–80.[26]

SUMOylation

There is one predicted SUMO site at position 23.[27] The result is significant with a p-value of 0.041.

Function

Because its level of expression varies across tissue samples, C9orf85 appears to be a regulated gene rather than a constitutive gene.[5]

Additionally, urinary bladder epithelial cells help shape the immune response to infection.[28] The thymus is a primary lymphoid organ of the immune system, composed of T cells and epithelial cells, and research has found that it plays a role in the development of intestinal immunity.[29] Both tissues are thus components of the immune system and contribute to its proper function.

References

  1. ^ a b c GRCh38: Ensembl release 89: ENSG00000155621. Ensembl, May 2017.
  2. ^ a b c GRCm38: Ensembl release 89: ENSMUSG00000035171. Ensembl, May 2017.
  3. ^ "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. ^ "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. ^ a b c d e "C9orf85 chromosome 9 open reading frame 85 [Homo sapiens (human)] – Gene – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-09-30.
  6. ^ a b c d EMBL-EBI. (2020). SAPS Results. Ebi.Ac.Uk. https://www.ebi.ac.uk/Tools/services/web/toolresult.ebi?jobId=saps-I20201219-191317-0344-54841082-p1m
  7. ^ Chen BZ, Yu SL, Singh S, Kao LP, Tsai ZY, Yang PC, et al. (January 2011). "Identification of microRNAs expressed highly in pancreatic islet-like cell clusters differentiated from human embryonic stem cells". Cell Biology International. 35 (1): 29–37. doi:10.1042/CBI20090081. PMID 20735361. S2CID 30538749.
  8. ^ C9orf85 - Uncharacterized protein C9orf85 - Homo sapiens (Human) - C9orf85 gene & protein. (2020). Uniprot.Org. https://www.uniprot.org/uniprot/Q96MD7
  9. ^ (2020). Genenames.Org. https://www.genenames.org/data/gene-symbol-report/#!/hgnc_id/28784
  10. ^ a b c "AceView: a comprehensive annotation of human and worm genes with mRNAs or ESTs". AceView. www.ncbi.nlm.nih.gov. Retrieved 2020-09-30.
  11. ^ "uncharacterized protein C9orf85 isoform 1 [Homo sapiens] – Protein – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-09-30.
  12. ^ "uncharacterized protein C9orf85 isoform 2 [Homo sapiens] – Protein – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  13. ^ "uncharacterized protein C9orf85 isoform 3 [Homo sapiens] – Protein – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  14. ^ "uncharacterized protein C9orf85 isoform 4 [Homo sapiens] – Protein – NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-12-19.
  15. ^ a b "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2020-10-26.
  16. ^ "Clustal Omega < Multiple Sequence Alignment < EMBL-EBI". www.ebi.ac.uk. Retrieved 2020-10-26.
  17. ^ Miseta A, Csutora P (August 2000). "Relationship between the occurrence of cysteine in proteins and the complexity of organisms". Molecular Biology and Evolution. 17 (8): 1232–9. doi:10.1093/oxfordjournals.molbev.a026406. PMID 10908643.
  18. ^ GENEVESTIGATOR Team at Nebion AG. (2020). Genevisible. Genevisible.com; genevisible. https://genevisible.com/tissues/HS/UniProt/Q96MD7
  19. ^ "Gene: C9orf85 – ENSG00000155621". bgee.org. Retrieved 2020-09-30.
  20. ^ PSORT II Prediction. (2020). Psort.Hgc.Jp. https://psort.hgc.jp/form2.html
  21. ^ Genomatix: Gene2Promoter Result. (2020). Genomatix.De. https://www.genomatix.de/cgi-bin/c2p/c2p.pl?s=c5402bf929e4d6000dfc7ce8c56fa1e6;TASK=c2p;SHOW=TempSeq_kd0ZKohP.html
  22. ^ Genomatix: MatInspector Result. (2019). Genomatix.De. https://www.genomatix.de/cgi-bin/eldorado/eldorado.pl?s=c5402bf929e4d6000dfc7ce8c56fa1e6;PROM_ID=GXP_18858;GROUP=vertebrates;GROUP=others;ELDORADO_VERSION=E35R1911
  23. ^ "ProtScale". Expasy. Archived from the original on 2019-01-08.
  24. ^ TMPred results. (2020). Vital-It.Ch. https://embnet.vital-it.ch/cgi-bin/TMPRED_form_parser
  25. ^ "SOSUI/submit a protein sequence". harrier.nagahama-i-bio.ac.jp. Retrieved 2020-12-19.
  26. ^ a b "Motif Scan". myhits.sib.swiss. Retrieved 2020-12-19.
  27. ^ "GPS-SUMO: Prediction of SUMOylation Sites & SUMO-interaction Motifs". sumosp.biocuckoo.org. Archived from the original on 2018-05-06. Retrieved 2020-12-19.
  28. ^ Abraham SN, Miao Y (October 2015). "The nature of immune responses to urinary tract infections". Nature Reviews. Immunology. 15 (10): 655–63. doi:10.1038/nri3887. PMC 4926313. PMID 26388331.
  29. ^ Falk W (July 2006). "A ticket to the gut for thymic T cells". Gut. 55 (7): 910–2. doi:10.1136/gut.2005.087288. PMC 1856347. PMID 16766746.