CFAP97D2

Predicted three-dimensional structure of CFAP97D2 from I-TASSER

Cilia- and flagella-associated protein 97 domain-containing 2 (CFAP97D2), previously known as KIAA1430[1], is a protein encoded by the CFAP97D2 gene. Sequence-based prediction suggests that it functions in the nucleus[2].

Gene

The CFAP97D2 gene is 50,003 bp long. It spans positions 114173082 to 114223032 on human chromosome 13 and contains 8 exons[3]. It belongs to the cilia- and flagella-associated protein 97 (CFAP97) gene family, which has three members: CFAP97, CFAP97D1, and CFAP97D2[4].

Transcripts

There are three known transcript variants: transcript variant X1, transcript variant 1, and transcript variant 2[5].

Transcript variants of CFAP97D2
| Transcript variant | Accession number | Length | Encoded protein | Protein length |
| --- | --- | --- | --- | --- |
| Transcript variant X1 | XM_017020910 | 5446 bp | isoform X1 (XP_016876399) | 166 aa |
| Transcript variant 1 | NM_001395230 | 952 bp | isoform 1 (NP_001382159) | 99 aa |
| Transcript variant 2 | NM_001395229 | 949 bp | isoform 2 (NP_001382158) | 98 aa |

Transcript expression

CFAP97D2 transcripts are present at low abundance and are expressed in a wide array of organs and tissues, including the brain, testes, ovary, fallopian tube, white blood cells, and bone marrow[6][7].

Among immune cells, CFAP97D2 expression is specifically enriched in naive CD8 T cells[8]. This specificity suggests that the gene may have a role in the immune system.

Proteins

There are three known CFAP97D2 protein isoforms: isoform X1, isoform 1, and isoform 2[9].

The longest protein isoform, X1, consists of 166 amino acids[10] and has a molecular weight of 19 kDa and an isoelectric point of 10.4[11]. CFAP97D2 has a relatively high content of lysine, phenylalanine, and leucine; the abundance of positively charged lysine contributes to a relatively high total charge[12].
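
As a rough sanity check on the reported mass, a protein's molecular weight can be approximated from its length alone. The sketch below assumes the common rule of thumb of about 110 Da per residue, which is an assumption rather than the method of the cited tool; the function name is illustrative.

```python
# Rule-of-thumb average mass of one amino-acid residue, in daltons.
# Assumed approximation; not the per-residue table used by the cited tool.
AVERAGE_RESIDUE_MASS_DA = 110.0


def estimate_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# Isoform X1 has 166 residues according to the text above.
print(f"{estimate_mass_kda(166):.1f} kDa")
```

The estimate is of the same order as the 19 kDa reported above; an exact value depends on the actual amino acid composition.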

Domains and motifs

CFAP97D2 has a domain named “KIAA1430” that spans residues 27 to 111[13]. This domain is highly conserved across vertebrates, invertebrates, and fungi[14], and it is also considered to be specifically related to motile cilia[15].

An analysis with PSORT II[16] predicted a conserved mitochondrial processing peptidase cleavage site at residue 16 of CFAP97D2 and a conserved nuclear localization signal at residue 27, which suggests that CFAP97D2 acts in the nucleus. Additionally, human CFAP97D2 is predicted to have a unique leucine zipper pattern at residue 37, one of the 63 DNA-binding motifs checked by PSORT II, which can help distinguish nuclear proteins[17].
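
A leucine zipper scan of this kind can be sketched with a regular expression. The example below assumes the PROSITE-style pattern L-x(6)-L-x(6)-L-x(6)-L (a leucine at every seventh position, four times); the function name and the toy sequence are hypothetical, and the toy sequence is not the CFAP97D2 sequence.

```python
import re

# PROSITE-style leucine zipper pattern: L-x(6)-L-x(6)-L-x(6)-L.
# The lookahead lets overlapping matches be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(sequence: str) -> list[int]:
    """Return 1-based start positions of every leucine zipper match."""
    return [m.start() + 1 for m in LEUCINE_ZIPPER.finditer(sequence.upper())]


# Hypothetical toy sequence with leucines at positions 4, 11, 18 and 25.
TOY_SEQUENCE = "MKT" + "LAAAAAA" * 3 + "L" + "GG"
print(find_leucine_zippers(TOY_SEQUENCE))
```

A match only shows that the spacing of leucines fits the pattern; whether the region forms a functional zipper would need structural or experimental support.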

Structure

The predicted secondary structure of CFAP97D2 is highly conserved among species and consists mostly of alpha-helices, with some beta-sheets and connecting coils[18][19]. The KIAA1430 domain consists of two helices joined by coils. A conserved coiled coil is predicted in human CFAP97D2 from residues 48 to 109[20].

No disulfide bonds or transmembrane domains have been found in CFAP97D2.

Subcellular localization

CFAP97D2 is predicted with high likelihood to localize to the mitochondria and the nucleus, and to be soluble in the cytoplasm[21].

Regulation

Gene level regulation

Two promoters were identified in the CFAP97D2 gene with Genomatix[22]. Promoter A (GXP_6735018) drives transcript X1, while promoter B (GXP_7530125) drives transcripts 1 and 2. Several transcription factors with binding sites in promoter A may regulate the expression of the CFAP97D2 gene (functions are taken from the Genomatix promoter annotation).

Conserved regulatory transcription factors in promoter
| Transcription factor | Full name | Function |
| --- | --- | --- |
| MYT1 | MYT1 C2HC zinc finger protein | Regulates expression in the brain and bone marrow (immune system). |
| PAR | PAR/bZIP family | Regulates expression in the brain and bone marrow (immune system). |
| Paralog Hox | Paralog hox genes 1-8 from the four hox clusters A, B, C, D | Functions in adipose tissue; may also lead to high expression in the fatty tissues of the reproductive system. |
| cAMP | cAMP-responsive element binding proteins | Functions in adipose tissue; may also lead to high expression in the fatty tissues of the reproductive system. |

Post-translational modifications

Sixteen residues of the CFAP97D2 protein are likely phosphorylation sites, and eleven internal lysines are likely acetylation sites[23]. Both types of sites are highly conserved in the KIAA1430 domain across species. There are also three SUMOylation consensus sites that overlap with acetylation sites. Each of these post-translational modifications is expected to affect the protein. Phosphorylation would reduce the isoelectric point of CFAP97D2 to 9.77 if all conserved sites were modified[24]. Acetylation of internal lysines may influence the intermolecular interactions and eventual degradation of CFAP97D2[25]. SUMOylation sites are residues to which SUMO (small ubiquitin-like modifier) proteins can attach; SUMOylation of CFAP97D2 may affect its nuclear-cytosolic transport and its role in transcriptional regulation[26].
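
The direction of the isoelectric point shift can be illustrated with a simple Henderson-Hasselbalch charge model. This is a minimal sketch under stated assumptions: the pKa values are approximate textbook-style figures chosen for illustration (the cited tools may use different tables), each phosphate is modeled as two extra acidic groups, and the toy peptide is hypothetical rather than the CFAP97D2 sequence.

```python
# Approximate pKa values (assumed for illustration only).
POSITIVE_PKA = {"K": 10.5, "R": 12.5, "H": 6.0}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.3, "Y": 10.1}
N_TERMINUS_PKA = 9.0
C_TERMINUS_PKA = 3.1
PHOSPHATE_PKAS = (1.2, 6.5)  # two ionizations per phosphate group (assumed)


def net_charge(sequence: str, ph: float, n_phospho: int = 0) -> float:
    """Net charge of a peptide at a given pH, with n_phospho phosphates added."""
    positive = [N_TERMINUS_PKA] + [POSITIVE_PKA[aa] for aa in sequence if aa in POSITIVE_PKA]
    negative = [C_TERMINUS_PKA] + [NEGATIVE_PKA[aa] for aa in sequence if aa in NEGATIVE_PKA]
    negative += list(PHOSPHATE_PKAS) * n_phospho
    charge = sum(1.0 / (1.0 + 10 ** (ph - pka)) for pka in positive)
    charge -= sum(1.0 / (1.0 + 10 ** (pka - ph)) for pka in negative)
    return charge


def isoelectric_point(sequence: str, n_phospho: int = 0) -> float:
    """Find the pH of zero net charge by bisection on the interval [0, 14]."""
    low, high = 0.0, 14.0
    for _ in range(60):
        mid = (low + high) / 2.0
        if net_charge(sequence, mid, n_phospho) > 0:
            low = mid  # still positively charged, so the pI lies higher
        else:
            high = mid
    return (low + high) / 2.0


toy_peptide = "MKKLSRKAEELKKSTK"  # hypothetical lysine-rich peptide
print(round(isoelectric_point(toy_peptide), 2))
print(round(isoelectric_point(toy_peptide, n_phospho=3), 2))
```

Because every added phosphate contributes only negative charge, the modeled isoelectric point can only move downward, which matches the direction of the shift described above. The model ignores local environment effects and does not remove the side-chain group of a phosphorylated tyrosine.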

Homology and evolution

CFAP97D2 is a homolog of the uncharacterized protein C17orf105[27] and has orthologs in a wide range of invertebrates, vertebrates, and fungi[28]. So far, no paralog has been detected.

| Seq # | Group | Genus and species | Common name | Taxonomic group | Date of divergence from the human lineage (MYA) | Accession number | Sequence length | Query cover | Sequence identity to human protein | Sequence similarity to human protein |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Mammals | Homo sapiens | Human | Hominin | 0 | XP_016876399 | 166 | 100% | 100.00% | 100.0% |
| 2 | Mammals | Mus musculus | House mouse | Rodentia | 90 | NP_001094989.1 | 96 | 57% | 62.50% | 44.0% |
| 3 | Mammals | Phyllostomus discolor* | Pale spear-nosed bat | Chiroptera | 96 | KAF6083800 | 169 | 58% | 68.04% | 58.6% |
| 4 | Mammals | Orycteropus afer afer | Aardvark | Tubulidentata | 105 | XP_042637093 | 97 | 57% | 70.83% | 51.8% |
| 5 | Birds | Egretta garzetta | Little egret | Pelecaniformes | 312 | XP_009637256 | 186 | 70% | 54.24% | 53.3% |
| 6 | Birds | Tauraco erythrolophus | Red-crested turaco | Musophagiformes | 312 | XP_009987599 | 162 | 70% | 53.39% | 53.3% |
| 7 | Birds | Antrostomus carolinensis | Chuck-will's-widow | Caprimulgiformes | 312 | XP_010175828 | 157 | 70% | 46.61% | 50.3% |
| 8 | Reptiles | Pogona vitticeps | Central bearded dragon | Squamata | 312 | XP_020639639 | 116 | 69% | 53.45% | 53.3% |
| 9 | Reptiles | Chelonia mydas | Green sea turtle | Testudines | 312 | XP_007063899 | 118 | 70% | 52.54% | 55.1% |
| 10 | Amphibians | Ranitomeya imitator | Mimic poison frog | Anura | 351.8 | CAF5206259 | 92 | 55% | 61.96% | 43.4% |
| 11 | Amphibians | Bufo bufo | Common toad | Anura | 351.8 | XP_040281015 | 111 | 66% | 60.36% | 53.9% |
| 12 | Amphibians | Rhinatrema bivittatum | Rhinatrema | Gymnophiona | 351.8 | XP_029460476 | 111 | 66% | 56.76% | 53.9% |
| 13 | Amphibians | Xenopus tropicalis | Western clawed frog | Anura | 351.8 | XP_012812129 | 118 | 70% | 49.15% | 53.3% |
| 14 | Fish | Lepisosteus oculatus | Spotted gar | Lepisosteiformes | 435 | XP_006639009 | 115 | 68% | 50.43% | 51.5% |
| 15 | Fish | Scyliorhinus torazame | Cloudy catshark | Carcharhiniformes | 473 | GCB80112 | 116 | 69% | 48.28% | 49.1% |
| 16 | Invertebrates | Styela clava | Stalked sea squirt | Stolidobranchia | 676 | XP_039250546.1 | 111 | 66% | 55.75% | 50.3% |
| 17 | Invertebrates | Apostichopus japonicus | Japanese spiky sea cucumber | Synallactida | 684 | PIK62955.1 | 111 | 66% | 49.56% | 49.7% |
| 18 | Invertebrates | Stylophora pistillata | Hood coral | Scleractinia | 824 | XP_022791075.1 | 111 | 66% | 51.33% | 49.7% |
| 19 | Invertebrates | Salpingoeca rosetta | S. rosetta | Choanoflagellate | 1023 | XP_004992110.1 | 102 | 60% | 42.69% | 47.9% |
| 20 | Fungi | Chytriomyces confervae | Chytrids | Chytridiales | 1105 | TPX77255 | 61 | 36% | 39.34% | 24.7% |
| 21 | Fungi | Batrachochytrium salamandrivorans | Bsal | Rhizophydiales | 1105 | OON08604 | 113 | 61% | 31.03% | 33.3% |

(*Only a partial sequence is shown on NCBI; BLAT was used to recover the full sequence.)

 
Time-calibrated phylogenetic tree of CFAP97D2

CFAP97D2 is inferred to have first appeared in a fungus of the order Chytridiales (Chytriomyces confervae) around 1105 MYA[29]. It most likely split from the LOC106699411 gene of the little brown bat (Myotis lucifugus) around 100 MYA[29].

The most distantly related species found so far with a CFAP97D2 ortholog is the fungus Batrachochytrium salamandrivorans (Bsal), which has no known isoforms. CFAP97D2 evolves more slowly than the fibrinogen alpha gene and slightly faster than the cytochrome c gene. This suggests that the CFAP97D2 gene may have a relatively low mutation rate in fungi and other species.
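
A rate comparison of this kind is usually based on a corrected sequence divergence for each ortholog, set against its divergence time. The sketch below uses the Poisson correction, d = -ln(1 - p), where p is the observed fraction of differing sites. The article does not state which correction was used, so that choice is an assumption; the identities and dates are copied from the ortholog table, and the identities cover only the aligned part of each sequence.

```python
import math

# (species, percent identity to human isoform X1, divergence time in MYA),
# copied from the ortholog table above.
ORTHOLOGS = [
    ("Mus musculus", 62.50, 90),
    ("Egretta garzetta", 54.24, 312),
    ("Xenopus tropicalis", 49.15, 351.8),
    ("Lepisosteus oculatus", 50.43, 435),
    ("Batrachochytrium salamandrivorans", 31.03, 1105),
]


def poisson_distance(percent_identity: float) -> float:
    """Poisson-corrected number of amino acid substitutions per site."""
    p = 1.0 - percent_identity / 100.0  # observed fraction of differing sites
    return -math.log(1.0 - p)


for species, identity, mya in ORTHOLOGS:
    d = poisson_distance(identity)
    print(f"{species}: d = {d:.3f} substitutions/site at {mya} MYA")
```

Plotting d against divergence time for CFAP97D2 next to the same values for fibrinogen alpha and cytochrome c would give the kind of comparison described above; the data for those two reference proteins are not included here.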

Interacting proteins

Only one interacting protein is known: E3 ubiquitin-protein ligase TRIP12 isoform X1[30]. A gene fusion between TRIP12 and CFAP97D2 joins the two genes, allowing them to be transcribed and translated as a single unit[31]. The activity of E3 ubiquitin-protein ligases is strictly regulated by post-translational modifications, including phosphorylation and SUMOylation, which may also affect the corresponding modification sites on the CFAP97D2 protein[32].

Pathological significance

The expression of CFAP97D2 has been reported to be affected in Parkinson's disease[33]. Both protein and RNA levels of CFAP97D2 and of tubulin genes decline, which suggests that CFAP97D2 may be closely related to tubulin in human tissues.

Another member of the CFAP97 gene family, CFAP97D1, is important for the structure of mammalian sperm flagella and for sperm motility, and thereby affects fertilization[34]. Because CFAP97D1 and CFAP97D2 may interact with each other, CFAP97D2 is tentatively expected to be related to sperm motility in mammals.

One variant reported in the literature lies in the 5′-untranslated region and is also a variant of the CDC16 gene (cell division cycle 16 homolog of S. cerevisiae)[35][36]. CDC16 is a component of the anaphase-promoting complex/cyclosome (APC/C), a cell cycle-regulated E3 ubiquitin ligase that controls progression through mitosis and the G1 phase of the cell cycle[37]. The variant in CDC16 was found to have a negative, DNA methylation-dependent effect on the gene's expression level.

Single nucleotide polymorphisms (SNPs) found in the coding region mostly cause missense changes. So far, none has been found to have a significant effect on a human phenotype or condition.

References

  1. ^ NCBI (National Center for Biotechnology Information) CFAP97D2 (Homo sapiens) isoform 2 protein
  2. ^ "PSORT II Prediction". psort.hgc.jp. Retrieved 2021-12-18.
  3. ^ "CFAP97D2 CFAP97 domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  4. ^ Oura, Seiya; Kazi, Samina; Savolainen, Audrey; Nozawa, Kaori; Castañeda, Julio; Yu, Zhifeng; Miyata, Haruhiko; Matzuk, Ryan M.; Hansen, Jan N.; Wachten, Dagmar; Matzuk, Martin M. (2020-08-12). "Cfap97d1 is important for flagellar axoneme maintenance and male mouse fertility". PLOS Genetics. 16 (8): e1008954. doi:10.1371/journal.pgen.1008954. ISSN 1553-7404. PMC 7444823. PMID 32785227.
  5. ^ "CFAP97D2 CFAP97 domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  6. ^ "CFAP97D2 CFAP97 domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  7. ^ "Tissue expression of CFAP97D2 - Summary - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2021-12-18.
  8. ^ "Immune cell - CFAP97D2 - The Human Protein Atlas". www.proteinatlas.org. Retrieved 2021-12-18.
  9. ^ "CFAP97D2 CFAP97 domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  10. ^ "uncharacterized protein CFAP97D2 isoform X1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  11. ^ "Compute pI/MW - SIB Swiss Institute of Bioinformatics | Expasy". www.expasy.org. Retrieved 2021-12-18.
  12. ^ "SAPS < Sequence Statistics < EMBL-EBI". www.ebi.ac.uk. Retrieved 2021-12-18.
  13. ^ "uncharacterized protein CFAP97D2 isoform X1 [Homo sapiens] - Protein - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  14. ^ "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  15. ^ Soulavie, Fabien; Piepenbrock, David; Thomas, Joëlle; Vieillard, Jennifer; Duteyrat, Jean-Luc; Cortier, Elisabeth; Laurençon, Anne; Göpfert, Martin C.; Durand, Bénédicte (2014-04-15). "hemingway is required for sperm flagella assembly and ciliary motility in Drosophila". Molecular Biology of the Cell. 25 (8): 1276–1286. doi:10.1091/mbc.E13-10-0616. ISSN 1059-1524. PMC 3982993. PMID 24554765.
  16. ^ "PSORT II Prediction". psort.hgc.jp. Retrieved 2021-12-18.
  17. ^ "PSORT Users' Manual". psort.hgc.jp. Retrieved 2021-12-18.
  18. ^ Kumar, Prof. T. Ashok. "CFSSP: Chou & Fasman Secondary Structure Prediction Server". www.biogem.org. Retrieved 2021-12-18.
  19. ^ "Bioinformatics Toolkit". toolkit.tuebingen.mpg.de. Retrieved 2021-12-18.
  20. ^ "MARCOIL - SIB Swiss Institute of Bioinformatics | Expasy". www.expasy.org. Retrieved 2021-12-18.
  21. ^ "Services". https://www.healthtech.dtu.dk. Retrieved 2021-12-18.
  22. ^ "Genome Annotation and Browser". Genomatix software suite.
  23. ^ "Services". https://www.healthtech.dtu.dk. Retrieved 2021-12-18.
  24. ^ "CFAP97D2 (human)". www.phosphosite.org. Retrieved 2021-12-18.
  25. ^ "Histone acetylation and deacetylation", Wikipedia, 2021-11-11, retrieved 2021-12-18
  26. ^ "SUMO protein", Wikipedia, 2021-07-20, retrieved 2021-12-18
  27. ^ "CFAP97D2 CFAP97 domain containing 2 [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  28. ^ "Protein BLAST: search protein databases using a protein query". blast.ncbi.nlm.nih.gov. Retrieved 2021-12-18.
  29. ^ a b "TimeTree :: The Timescale of Life". www.timetree.org. Retrieved 2021-12-18.
  30. ^ "STRING: functional protein association networks". string-db.org. Retrieved 2021-12-18.
  31. ^ "Fusion gene", Wikipedia, 2021-08-29, retrieved 2021-12-18
  32. ^ "Ubiquitin ligase", Wikipedia, 2021-12-03, retrieved 2021-12-18
  33. ^ Kim, Jeong-Min; Lee, Kyu-Hwa; Jeon, Yeo-Jin; Oh, Jung-Hwa; Jeong, So-Young; Song, In-Sung; Kim, Jin-Man; Lee, Dong-Seok; Kim, Nam-Soon (2006-01-01). "Identification of Genes Related to Parkinson's Disease Using Expressed Sequence Tags". DNA Research. 13 (6): 275–286. doi:10.1093/dnares/dsl016. ISSN 1340-2838.
  34. ^ Oura, Seiya; Kazi, Samina; Savolainen, Audrey; Nozawa, Kaori; Castañeda, Julio; Yu, Zhifeng; Miyata, Haruhiko; Matzuk, Ryan M.; Hansen, Jan N.; Wachten, Dagmar; Matzuk, Martin M. (2020-08-12). "Cfap97d1 is important for flagellar axoneme maintenance and male mouse fertility". PLOS Genetics. 16 (8): e1008954. doi:10.1371/journal.pgen.1008954. ISSN 1553-7404. PMC 7444823. PMID 32785227.
  35. ^ "dbSNP Short Genetic Variations". NCBI.
  36. ^ van Eijk, Kristel R.; de Jong, Simone; Boks, Marco PM; Langeveld, Terry; Colas, Fabrice; Veldink, Jan H.; de Kovel, Carolien GF; Janson, Esther; Strengman, Eric; Langfelder, Peter; Kahn, René S. (2012-11-17). "Genetic analysis of DNA methylation and gene expression levels in whole blood of healthy human subjects". BMC Genomics. 13 (1): 636. doi:10.1186/1471-2164-13-636. ISSN 1471-2164. PMC 3583143. PMID 23157493.
  37. ^ "CDC16", Wikipedia, 2021-11-02, retrieved 2021-12-19