Family with sequence similarity 166, member C (FAM166C), is a protein encoded by the FAM166C gene. FAM166C (aliases C2orf70, LOC339778) is localized in the nucleus and has a calculated molecular weight of 23.29 kDa. It contains DUF2475, a domain of unknown function spanning amino acids 19–85.[5] The FAM166C protein is nominally expressed in the testis, stomach, and thyroid.[6]

FAM166C
Identifiers
Aliases: FAM166C, chromosome 2 open reading frame 70, family with sequence similarity 166 member C, C2orf70
External IDs: MGI: 1922684; HomoloGene: 49920; GeneCards: FAM166C; OMA: FAM166C - orthologs
Orthologs
Species | Human | Mouse
RefSeq (mRNA) | NM_001105519, NM_001322426 | NM_029285
RefSeq (protein) | NP_001098989, NP_001309355 | NP_083561
Location (UCSC) | Chr 2: 26.56 – 26.58 Mb | Chr 5: 30.62 – 30.64 Mb
PubMed search | [3] | [4]

Gene


The FAM166C gene, also known as C2orf70, is located on the positive-sense strand at locus 2p23.3. It has 9 exons; however, owing to overlap, only 4 are distinguishable in the human genome.[7] FAM166C spans positions 26,562,565 to 26,581,166, for a total length of 18.6 kbp.[8]
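The quoted gene length follows directly from the two coordinates. The snippet below is a minimal sketch of that arithmetic; it assumes the coordinates are 1-based and inclusive, and the function name `span_length` is only illustrative.

```python
# Coordinates quoted in the text (human chromosome 2).
# Assumption: both positions are 1-based and inclusive.
GENE_START = 26_562_565
GENE_END = 26_581_166


def span_length(start: int, end: int, inclusive: bool = True) -> int:
    """Return the number of bases covered by a genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


length_bp = span_length(GENE_START, GENE_END)
print(f"{length_bp} bp = {length_bp / 1000:.1f} kbp")
```

Whether or not the end position is counted, the result rounds to the 18.6 kbp given above.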

Gene neighborhood

Gene neighborhood of FAM166C.[7]

The gene neighborhood of FAM166C consists of DRC1, LOC112840921, OTOF, CIB4 and LOC122756675.[7] LOC112840921 and LOC122756675 are both predicted transcriptional regulatory regions.[9][10] DRC1 (dynein regulatory complex subunit 1) encodes a central component of the nexin-dynein complex, a regulator of ciliary dynein. Mutations in this gene can lead to ciliary dyskinesia.[11] OTOF encodes the protein otoferlin, which has been suggested to be involved in vesicle membrane fusion. Mutations in OTOF can lead to neurosensory nonsyndromic recessive deafness, DFNB9.[12] CIB4 (calcium and integrin binding family member 4) encodes the CIB4 protein, which regulates activation of the integrin alphaIIb subunit.[13]

Transcripts


FAM166C has 2 different transcript variants. The most abundant variant is FAM166C transcript variant 1, which is 718 nucleotides in length.[7]

FAM166C transcript variants
Accession number | Transcript length (nt) | Number of exons | Protein length (aa) | Isoform
NM_001105519.3 | 718 | 4 | 201 | 1
NM_001322426.2 | 754 | 5 | 184 | 2

Protein


The FAM166C protein is 201 amino acids in length, with a predicted molecular weight of 23 kDa and an isoelectric point of 10.[14] It has higher than normal levels of tyrosine and proline and lower than normal levels of isoleucine.[15]
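Statements about residue enrichment start from a simple per-residue tally. The sketch below shows only that first step, the percentage of each amino acid in a sequence; it does not reproduce the statistical comparison against a reference proteome that a tool such as SAPS performs. The peptide used here is a made-up 20-residue example, not the FAM166C sequence.

```python
from collections import Counter


def composition(seq: str) -> dict[str, float]:
    """Return the percentage of each residue in a one-letter protein sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    return {aa: 100 * n / len(seq) for aa, n in counts.items()}


# Hypothetical peptide for illustration only (NOT the FAM166C sequence).
toy = "MPPYYKRPLYPGSTYPRKAA"
comp = composition(toy)

# Report the three residues mentioned in the text.
for aa in "YPI":
    print(aa, f"{comp.get(aa, 0.0):.1f}%")
```

Deciding whether a percentage is "higher than normal" additionally requires a background distribution, which is outside the scope of this sketch.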

Domains and structure


The FAM166C protein has one domain of unknown function, DUF2475, spanning amino acids 19–85.[16] The secondary structure of FAM166C isoform 1 appears to be primarily alpha helical, with only short segments predicted to form beta sheets.[17] Tertiary structure predictions show 5 distinct alpha helices with high confidence.[18]

Isoforms


FAM166C has 2 different splice isoforms. The most abundant isoform is FAM166C protein isoform 1 which is 201 amino acids in length.[7]

FAM166C protein isoforms
Name | Transcript variant | Peptide length | Domains present
Isoform 1 | 1 | 201 aa | DUF2475
Isoform 2 | 2 | 184 aa | DUF2475

Regulation


Gene level regulation


Promoter

Bird's eye view of the coding region of the FAM166C gene and the promoters in the region. Blue boxes represent promoter regions while orange boxes represent exons.

FAM166C has 3 possible promoters that produce complete protein isoforms; however, isoform 1 is produced only from promoter GXP_1493451, which also gives rise to isoform 2.[19]

Transcription Factor Binding Sites


GXP_1493451 contains over 250 transcription factor binding sites. The most conserved sites, and those most likely to be bound, include sites for a forkhead box protein factor (V$FOXP2.01), a collagen krox domain factor (V$CKROX.01) and an E2F transcription factor (V$E2F3.01).[19]

Expression pattern

HPA RNA-seq normal tissue profiling for FAM166C.[20]

FAM166C is expressed at low levels overall compared with other proteins, but among the tissues in which it is expressed, it is most prominent in the testes, stomach and thyroid.[7] Within the cell, FAM166C is localized to the nucleus and contains 2 nuclear localization signals.[21] Antibody staining of the protein is highly indicative of localization to the nuclear membrane specifically.[22]

 
FAM166C protein antibody targeting reveals nuclear membrane localization.[22] Green represents FAM166C antibodies, blue the nucleus, and red the microtubules.

Transcript level regulation

Predicted structure of the FAM166C 3' UTR. Highly conserved regions are shown in green. Potential miRNA binding sites are labeled in orange and blue, and polyadenylation sites are labeled in red.

The 5' UTR of FAM166C transcript variant 1 is 29 bp in length.[23] Analysis of potential 3D structures identifies one hairpin structure; however, the 5' UTR differs heavily among orthologs, indicating that it is unlikely to be an important region for transcript-level regulation.

The 3' UTR is 89 bp in length and contains one polyadenylation signal at 699 bp. It is conserved among human transcript variants, but only small segments are well conserved among orthologs.[23] It contains 2 predicted miRNA binding sites in areas of moderate conservation, at 631 bp (hsa-miR-3184-3p) and at 641 bp (hsa-miR-4539, hsa-miR-12113).[24] 3D predictions identify two stem-loop structures.
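The positions quoted for the 3' UTR features can be checked against the transcript length with simple arithmetic. The sketch below assumes 1-based transcript coordinates and that the 3' UTR occupies the final 89 nucleotides of the 718-nucleotide transcript variant 1; the feature labels and the helper `in_utr3` are illustrative names.

```python
TRANSCRIPT_LENGTH = 718  # nt, transcript variant 1 (from the text)
UTR3_LENGTH = 89         # nt (from the text)

# Assumption: positions are 1-based and the 3' UTR is the last 89 nt.
utr3_start = TRANSCRIPT_LENGTH - UTR3_LENGTH + 1
utr3_end = TRANSCRIPT_LENGTH

# Feature positions quoted in the text.
features = {
    "miRNA site at 631": 631,
    "miRNA site at 641": 641,
    "polyadenylation signal": 699,
}


def in_utr3(position: int) -> bool:
    """True if a transcript position falls inside the assumed 3' UTR."""
    return utr3_start <= position <= utr3_end


print(f"3' UTR spans positions {utr3_start}-{utr3_end}")
for name, pos in features.items():
    offset = pos - utr3_start + 1  # 1-based offset within the 3' UTR
    print(f"{name}: in 3' UTR = {in_utr3(pos)}, offset {offset}")
```

Under these assumptions the 3' UTR begins at position 630, so all three features fall inside it.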

Protein level regulation


FAM166C is predicted to have 7 phosphorylation sites, 2 acetylation sites and one O-GlcNAc site, which are well conserved among orthologs.[25][26][27]

 

Conceptual translation of FAM166C transcript variant 1 / protein isoform 1. Phosphorylation sites are highlighted in green, N-linked acetylation sites in indigo, internal acetylation sites in pink, O-β-GlcNAc sites in yellow, nuclear localization signals in light blue and the poly A signal in red. The start and stop of transcription are marked with green and red text respectively. DUF2475 is marked with brackets, and amino acids conserved among all known orthologs are bolded.

Homology and evolution


Paralogs


The human FAM166C gene has two paralogs called FAM166A and FAM166B. They are located at 9q34.3 and 9p13.3 respectively.[28][29] The function of both proteins is not currently well understood.

Orthologs


FAM166C has orthologs in species as distant as insects. Mammalian orthologs are moderately similar to human FAM166C, with percent identity of roughly 70% or greater. Identity in reptiles, birds and amphibians ranges from about 65% down to 40%, and in fish and invertebrates from about 40% down to 20%. No orthologs were found in fungi, bacteria or plants.

Taxonomic group | Genus/species | Common name | Taxonomic order | Estimated date of divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%)
Mammalia | Homo sapiens | Human | Primates | 0 | NP_001098989.1 | 201 | 100 | 100
Mammalia | Mus musculus | Mouse | Rodentia | 90 | NP_083561.1 | 200 | 68.2 | 80.6
Mammalia | Canis lupus familiaris | Dog | Carnivora | 96 | XP_038546893.1 | 201 | 78.6 | 90
Mammalia | Camelus ferus | Wild Bactrian camel | Artiodactyla | 96 | XP_006189760.1 | 201 | 82.1 | 89.1
Reptilia | Podarcis muralis | Common wall lizard | Squamata | 312 | XP_028581109.1 | 201 | 66.2 | 80.1
Reptilia | Thamnophis elegans | Western terrestrial garter snake | Squamata | 312 | XP_032071055.1 | 201 | 65.7 | 79.6
Reptilia | Gopherus evgoodei | Goode's thornscrub tortoise | Testudines | 312 | XP_030410948. | 201 | 64.7 | 78.1
Aves | Gallus gallus | Chicken | Galliformes | 312 | XP_420014.2 | 128 | 58.5 | 76.4
Aves | Apteryx rowi | Okarito brown kiwi | Apterygiformes | 312 | XP_025911147.1 | 199 | 40.8 | 52.9
Amphibia | Ranitomeya imitator | Mimic poison frog | Anura | 351.8 | CAF5191195. | 200 | 67.2 | 80.6
Amphibia | Bufo bufo | Common toad | Anura | 351.8 | XP_040286881.1 | 155 | 51 | 71
Amphibia | Microcaecilia unicolor | Tiny cayenne caecilian | Gymnophiona | 351.8 | XP_030051403.1 | 173 | 46.8 | 69.3
Amphibia | Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 351.8 | XP_033791776.1 | 173 | 49.0 | 67.9
Fish | Alosa sapidissima | American shad | Clupeiformes | 435 | XP_041950360.1 | 201 | 41.1 | 62.4
Fish | Salmo trutta | Brown trout | Salmoniformes | 435 | XP_029626333.1 | 192 | 4.21 | 60.9
Fish | Carcharodon carcharias | Great white shark | Chondrichthyes | 473 | XP_041043370.1 | 200 | 41.0 | 60.8
Invertebrata | Ciona intestinalis | Vase tunicate | Enterogona | 676 | XP_002130039.1 | 206 | 43.9 | 62.3
Invertebrata | Anneissia japonica | Sea lily | Crinoidea | 684 | XP_033123630.1 | 203 | 38.9 | 58.8
Invertebrata | Saccoglossus kowalevskii | Acorn worm | Enteropneusta | 684 | XP_002733424.1 | 197 | 41.5 | 58.5
Invertebrata | Photinus pyralis | Common eastern firefly | Coleoptera | 797 | XP_031329322.1 | 200 | 23.7 | 39.9
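Sequence identity values like those tabulated above are derived from pairwise alignments. The sketch below shows one common way to compute percent identity from two already-aligned sequences, dividing identical columns by the alignment length. Alignment tools differ in the denominator they use, so this is an assumption rather than the method behind the table, and the two aligned strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Both inputs must already be aligned (equal length, '-' for gaps).
    Assumption: the denominator is the full alignment length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100 * matches / len(aligned_a)


# Hypothetical aligned fragments for illustration only.
seq_a = "MKT-AYLPRG"
seq_b = "MKSQAYLP-G"
print(f"{percent_identity(seq_a, seq_b):.1f}% identity")
```

Similarity scores additionally count conservative substitutions as matches, which requires a substitution matrix and is not shown here.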

Evolution


The most distant FAM166C orthologs are found in insects, which diverged from the human lineage approximately 797 million years ago.[30] Orthologs of FAM166A and FAM166B also occur in insects. FAM166C evolves at a moderate rate; a 1% change in amino acid sequence requires around 10 million years. Based on the sequence similarity of orthologs, FAM166C evolves at a rate between those of cytochrome c and fibrinogen alpha.
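A rough sense of the quoted rate can be obtained from the ortholog table by dividing divergence time by the percentage of residues that differ from the human protein. The sketch below is a crude, uncorrected estimate using three of the more distant orthologs; it ignores multiple substitutions at the same site and is not necessarily the method used to derive the figure above. The function name `myr_per_percent` is illustrative.

```python
def myr_per_percent(divergence_mya: float, identity_pct: float) -> float:
    """Million years per 1% uncorrected sequence difference from human."""
    difference = 100.0 - identity_pct
    if difference <= 0:
        raise ValueError("identity must be below 100%")
    return divergence_mya / difference


# (divergence in MYA, identity in %) taken from the ortholog table.
orthologs = {
    "Podarcis muralis": (312, 66.2),
    "Carcharodon carcharias": (473, 41.0),
    "Photinus pyralis": (797, 23.7),
}

rates = {name: myr_per_percent(t, ident) for name, (t, ident) in orthologs.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.1f} million years per 1% change")
```

For these three species the estimates fall between roughly 8 and 10.5 million years per 1% change; more recently diverged orthologs, such as mouse, give lower values under this simple calculation.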

 
Comparison of the rate of evolution of FAM166A, FAM166B, FAM166C, cytochrome c, and fibrinogen alpha. FAM166C appears to evolve at an average rate between those of cytochrome c and fibrinogen alpha.

Clinical significance


Disease association


Colorectal cancer


Several studies have evaluated FAM166C as a potential target for colorectal cancer treatment. One study assessed its viability as a drug target in KRAS G12A-driven colorectal cancer: FAM166C was one of 11 genes showing a significant twofold change between colorectal cancer patients with KRAS G12 mutations (KRAS being a mutated oncogene) and colorectal cancer patients with wild-type KRAS.[31] Another study identified FAM166C as one of four potential targets for cyclovirobuxine D (CVB-D), an inducer of autophagic cell death in colorectal cancer cells, based on its over-expression in colon adenocarcinoma.[32]

Mutations (SNPs of interest)


A genome-wide association study (GWAS) identified a FAM166C SNP that correlates with high levels of bacterial colonization, a trait that may be associated with periodontitis.[33]

In a study using whole exome sequencing, with the human reference genome as a comparison, a novel FAM166C SNP was the only variant identified with a PolyPhen score of 0.954, indicating that it was likely deleterious and may be involved in one patient's bilateral cleft lip and palate.[34]

References

  1. GRCh38: Ensembl release 89: ENSG00000173557. Ensembl, May 2017.
  2. GRCm38: Ensembl release 89: ENSMUSG00000029182. Ensembl, May 2017.
  3. "Human PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  4. "Mouse PubMed Reference:". National Center for Biotechnology Information, U.S. National Library of Medicine.
  5. "protein FAM166C isoform 1 [Homo sapiens]". National Center for Biotechnology Information. U.S. National Library of Medicine. Retrieved 3 October 2021.
  6. "C2orf70". The Human Protein Atlas. Knut and Alice Wallenberg Foundation. Retrieved 3 October 2021.
  7. "FAM166C family with sequence similarity 166 member C [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov.
  8. "Genome Data Viewer - NCBI". www.ncbi.nlm.nih.gov. Retrieved 17 December 2021.
  9. "LOC122756676 Sharpr-MPRA regulatory region 9124 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 18 December 2021.
  10. "LOC112840921 Sharpr-MPRA regulatory region 888 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 18 December 2021.
  11. "DRC1 dynein regulatory complex subunit 1 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 18 December 2021.
  12. "OTOF otoferlin [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 18 December 2021.
  13. "CIB4 calcium and integrin binding family member 4 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 18 December 2021.
  14. "ExPASy - Compute pI/Mw tool". Expasy.
  15. "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk.
  16. "protein FAM166C isoform 1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 16 December 2021.
  17. "A Protein Secondary Structure Prediction Server". JPred4.
  18. "AlphaFold Protein Structure Database". alphafold.ebi.ac.uk. Retrieved 18 December 2021.
  19. "Genomatix". Archived from the original on 24 February 2001. Retrieved 17 December 2021.
  20. "FAM166C family with sequence similarity 166 member C [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 16 December 2021.
  21. "PSORT II Prediction". psort.hgc.jp. Retrieved 16 December 2021.
  22. "Subcellular - FAM166C - The Human Protein Atlas". www.proteinatlas.org. Retrieved 17 December 2021.
  23. "Homo sapiens family with sequence similarity 166 member C (FAM166C), transcript variant 1, mRNA". 16 February 2021.
  24. "miRDB - MicroRNA Target Prediction Database". mirdb.org. Retrieved 17 December 2021.
  25. "NetPhos 3.1". DTU Health Tech.
  26. "NetAcet 1.0". DTU Health Tech.
  27. "YinOYang 1.2". DTU Health Tech.
  28. "FAM166A family with sequence similarity 166 member A [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 25 October 2021.
  29. "FAM166B family with sequence similarity 166 member B [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 25 October 2021.
  30. Kumar S, Stecher G, Suleski M. "TimeTree: The Timescale of Life". www.timetree.org. Retrieved 25 October 2021.
  31. Ohnami S, Maruyama K, Chen K, Takahashi Y, Hatakeyama K, Ohshima K, et al. (September 2021). "BMP4 and PHLDA1 are plausible drug-targetable candidate genes for KRAS G12A-, G12D-, and G12V-driven colorectal cancer". Molecular and Cellular Biochemistry. 476 (9): 3469–3482. doi:10.1007/s11010-021-04172-8. PMC 8342352. PMID 33982211.
  32. Jiang F, Chen Y, Ren S, Li Z, Sun K, Xing Y, et al. (July 2020). "Cyclovirobuxine D inhibits colorectal cancer tumorigenesis via the CTHRC1‑AKT/ERK‑Snail signaling pathway". International Journal of Oncology. 57 (1): 183–196. doi:10.3892/ijo.2020.5038. PMC 7252468. PMID 32319595.
  33. Divaris K, Monda KL, North KE, Olshan AF, Lange EM, Moss K, et al. (July 2012). "Genome-wide association study of periodontal pathogen colonization". Journal of Dental Research. 91 (7 Suppl): 21S–28S. doi:10.1177/0022034512447951. PMC 3383103. PMID 22699663.
  34. Shah NS, Sulong S, Sulaiman WA, Halim AS (2020). "Genetic Variations Associated with Non-Syndromic Cleft Lip and Palate in Malays with Whole Exome Sequencing: Case Report and Gene Review". Malaysian Journal of Human Genetics. 1 (1): 35–44. Retrieved 17 December 2021.