Transmembrane protein 39B (TMEM39B) is a protein that in humans is encoded by the gene TMEM39B.[1] TMEM39B is a multi-pass membrane protein with eight transmembrane domains.[1] The protein localizes to the plasma membrane and vesicles.[1][2] The precise function of TMEM39B is not yet well understood, but differential expression is associated with survival in diffuse large B cell lymphoma, and knockdown of TMEM39B is associated with decreased autophagy in cells infected with the Sindbis virus.[3][4] Furthermore, the TMEM39B protein has been found to interact with the SARS-CoV-2 ORF9c (also known as ORF14) protein.[5] TMEM39B is expressed at moderate levels in most tissues, with higher expression in the testis, placenta, white blood cells, adrenal gland, thymus, and fetal brain.[6][7]

Gene

The TMEM39B gene in humans is located on the plus strand at 1p35.2.[1] The gene is composed of 14 exons and covers 30.8 kb, spanning from 32,072,031 to 32,102,866.[1] It is flanked by KHDRBS1 upstream and KPNA6 downstream.[1] The TMEM39B gene region also contains the microRNA-encoding gene MIR5585.[1]
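The stated gene length can be checked directly from the two coordinates. The short sketch below assumes the coordinates are 1-based and inclusive, which is an assumption about the annotation convention rather than something stated above.

```python
# Genomic coordinates of TMEM39B on chromosome 1, as given in the text.
GENE_START = 32_072_031
GENE_END = 32_102_866

# Assumption: both endpoints are included in the gene span.
span_bp = GENE_END - GENE_START + 1
span_kb = span_bp / 1000

print(f"{span_bp} bp = {span_kb:.1f} kb")
```

Either counting convention (inclusive or exclusive) rounds to the same 30.8 kb figure.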

Transcript

There are four validated transcript variants for TMEM39B produced by different promoters and alternative splicing.[1] Transcript variant 1 is translated into the longest and most abundant protein isoform.

Transcript variants of TMEM39B[1]
Transcript variant | RefSeq accession | Length (bp) | Description | Number of exons
Transcript variant 1 | NM_018056.4 | 1778 | Encodes isoform 1 | 9
Transcript variant 2 | NM_001319677.1 | 2106 | Extended 5' UTR, encodes isoform 2 | 9
Transcript variant 3 | NM_001319678.2 | 1542 | Lacks a portion of the 5' coding region, encodes isoform 3 | 7
Transcript variant 4 | NM_001319679.2 | 1539 | Lacks a portion of the 5' coding region, encodes isoform 3 | 7
 
Validated transcript variants of TMEM39B, retrieved from AceView.[8]

Protein

Isoforms

There are three validated protein isoforms for TMEM39B.[1] Isoform 1 is the longest and the other two isoforms use a downstream in-frame start codon.[1]

Protein isoforms of TMEM39B
Protein isoform | Protein size (aa) | Molecular weight (kDa) | Description
Isoform 1 | 492 | 56 | Longest and most abundant isoform
Isoform 2 | 365 | 42 | Shorter at N-terminus, uses downstream in-frame start codon
Isoform 3 | 293 | 33 | Shorter at N-terminus, uses downstream in-frame start codon

General properties

 
Diagram of TMEM39B topology predicted by Protter.[9]

The human TMEM39B protein isoform 1 is composed of 492 amino acids and has a predicted molecular weight of 56 kDa.[1] The basal isoelectric point (pI) of the protein is 9.51.[10] Compared to the composition of the human proteome, TMEM39B has a higher percentage of serine, histidine, and leucine and a lower percentage of glutamate and aspartate, making it basic overall.[11] It contains two pairs of tandem repeats: “GSSG” from amino acids 21–28 and “PPSH” from amino acids 107–114.[11] A periodic motif of four leucines spaced seven residues apart spans amino acids 168–195, but it is not predicted to form a leucine zipper. There is an “F..Y” motif with three repeats from amino acids 183–202 and a motif of phenylalanine at every other residue from amino acids 409–416.[11] There are no notable charge clusters, charge runs, or spacings, nor are there any sorting signals.[11]
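Tandem repeats such as “GSSG” and “PPSH” can be located with a simple pattern search. The following is a minimal sketch, not the method used by the cited analysis tools; the function name `find_tandem_repeats` and the toy sequence are hypothetical, and the toy sequence is not the TMEM39B sequence.

```python
import re

def find_tandem_repeats(seq, unit_len=4, min_copies=2):
    """Return (unit, start, end) for back-to-back repeats of a unit.

    Positions are 1-based and inclusive, matching the residue
    numbering style used in the text.
    """
    # A captured unit of fixed length followed by at least
    # (min_copies - 1) further copies of the same unit.
    pattern = re.compile(r"(.{%d})\1{%d,}" % (unit_len, min_copies - 1))
    return [(m.group(1), m.start() + 1, m.end()) for m in pattern.finditer(seq)]

# Hypothetical toy peptide containing two tandem repeats.
toy_sequence = "MA" + "GSSGGSSG" + "KLT" + "PPSHPPSH" + "QE"
repeats = find_tandem_repeats(toy_sequence)
print(repeats)  # [('GSSG', 3, 10), ('PPSH', 14, 21)]
```

A repeat of a four-residue unit occupies eight positions, which is consistent with the ranges 21–28 and 107–114 given above.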

Topology

TMEM39B isoform 1 contains eight transmembrane regions, and the N-terminus and C-terminus are predicted to be located in the cytosol.[9]

Regulation

Gene-level regulation

Promoter

 
Promoters for TMEM39B.

TMEM39B has several promoter regions predicted by Genomatix ElDorado.[12] Most of the predicted promoters overlap in the same region, so using a different promoter would only cause the first exon to be skipped.

Transcription factors

The promoter of TMEM39B transcript variant 1 contains numerous transcription factor binding sites. The transcription factors SMARCA3, TLX1, and CMYB have binding sites with high affinity near the binding site of transcription factor IIB, so they are potential regulators of gene transcription.

TMEM39B transcription factors
Transcription factor | Description | Matrix similarity
TCF11 | TCF11/LCR-F1/Nrf1 homodimers | 1
TFIIB | Transcription factor II B (TFIIB) recognition element | 1
ETV1 | Ets variant 1 | 0.996
ZNF300 | KRAB-containing zinc finger protein 300 | 0.994
CMYB | c-Myb, important in hematopoiesis, cellular equivalent to avian myeloblastosis virus oncogene v-myb | 0.994
ASCL1 | Achaete-scute family bHLH transcription factor 1 | 0.99
OSR2 | Odd-skipped related 2 | 0.99
E2F1 | E2F transcription factor 1 | 0.989
ZNF35 | Human zinc finger protein ZNF35 | 0.986
GKLF | Gut-enriched Krueppel-like factor | 0.981
PURALPHA | Purine-rich element binding protein A | 0.974
SMARCA3 | SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily a, member 3 | 0.973
NFAT | Nuclear factor of activated T-cells | 0.971
ZF5 | Zinc finger / POZ domain transcription factor | 0.962
INSM1 | Zinc finger protein insulinoma-associated 1 (IA-1) | 0.958
ZBTB14 | Zinc finger and BTB domain containing 14 (ZFP-5, ZFP161) | 0.914
KLF6 | Core promoter-binding protein (CPBP) with 3 Krueppel-type zinc fingers (KLF6, ZF9) | 0.891
GABPA | GA binding protein transcription factor, alpha | 0.891
TLX1 | T-cell leukemia homeobox 1 | 0.873
ZNF704 | Zinc finger protein 704 | 0.847

Expression pattern

RNA sequencing data show that TMEM39B is expressed in all tissue types, with higher levels in the testis, placenta, white blood cells, adrenal gland, thymus, and fetal brain.[6][7] Microarray data show that TMEM39B is expressed at moderate levels in most tissues, on average in the 58th percentile of genes expressed in a given tissue sample.[13] By percentile rank, TMEM39B is most highly expressed with respect to other genes in BDCA4+ dendritic cells, CD19+ B-cells, and CD14+ monocytes.[13]
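A percentile rank expresses where one gene's signal falls among all genes measured in the same sample. The sketch below shows one common definition (the share of genes with a strictly lower value); whether the cited dataset uses exactly this definition is an assumption, and the expression values are hypothetical.

```python
def percentile_rank(values, target):
    """Percentage of values strictly below the target value."""
    if not values:
        raise ValueError("values must not be empty")
    below = sum(1 for v in values if v < target)
    return 100.0 * below / len(values)

# Hypothetical expression values for ten genes in one tissue sample.
sample_expression = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
rank = percentile_rank(sample_expression, 6)
print(rank)  # 50.0
```

Ranking within each sample makes values comparable across tissues even when the absolute signal intensities differ.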

Transcript-level regulation

miRNA binding sites

The 3' UTR of the TMEM39B transcript contains binding sites for the miRNAs miR-1290, miR-4450, and miR-520d-5p.[14] Binding of these miRNAs may lead to RNA silencing.

 
5' UTR of TMEM39B transcript variant 1. Structure generated by Mfold.[15]
 
3' UTR of TMEM39B transcript variant 1. Structure generated by Mfold.[15]

mRNA-binding proteins

The RNA-binding proteins SFRS13A, ELAVL1, and KHDRBS3 have binding sites in the 3' UTR, and the proteins KHSRP, SFRS9, and YBX1 have binding sites in the 5' UTR.[16][17]

Secondary structure

The predicted secondary structure of the 5' and 3' UTR of TMEM39B contains multiple stem-loops which may play a role in stability and binding.[15]

Protein-level regulation

Post-translational modifications

 
Conceptual translation of TMEM39B transcript variant 1.

The TMEM39B protein contains numerous sites of predicted post-translational modifications, including phosphorylation, SUMOylation, acetylation, and glycosylation.[18][19][20][21][22][23][24] Sites of predicted S-palmitoylation at Cys13, Cys87, and Cys264 are conserved in orthologs. SUMOylation is predicted at Lys279 and Lys359. Several well-conserved sites of phosphorylation, glycation, and O-linked-N-acetylglucosaminylation are predicted in cytosolic regions of the protein, as annotated on the conceptual translation of TMEM39B transcript variant 1.

Sub-cellular localization

Immunohistochemistry has shown that the TMEM39B protein localizes to vesicles.[2]

Homology and Evolution

Paralogs

The human TMEM39B gene has a paralog called TMEM39A, also referred to by the alias SUSR2 (suppressor of SQST-1 aggregates in rpl-43 mutants), which is located at 3q13.33.[25] The TMEM39A protein contains 488 amino acids and shares 51.2% identity with TMEM39B.[26] Although the function of TMEM39A is not well understood, variants are associated with greater risk of autoimmune disease.[27] TMEM39A has also been found to interact with encephalomyocarditis virus (EMCV) capsid proteins as a regulator of the viral autophagy pathway.[28]

Orthologs

TMEM39B has orthologs in species as distant as cartilaginous fish.[26] Mammalian orthologs are highly similar to human TMEM39B, with percent identity greater than 85%. In orthologs in birds, reptiles, and amphibians, the percent identity to human TMEM39B ranges between 70% and 85%. In fish, the percent identity ranges from 40% to 75%. TMEM39B is only conserved in vertebrates, but the paralog TMEM39A has orthologs in species as distant as arthropods.[26] A selected list of orthologs from NCBI BLAST is displayed below.[26]

TMEM39B orthologs[26]
Genus and species | Common name | Taxonomic group | Date of divergence from humans (MYA)[29] | Accession # | Sequence length (aa) | Sequence identity to human protein (%) | Sequence similarity to human protein (%)
Homo sapiens | Human | Mammalia | 0 | NP_060526.2 | 492 | 100 | 100
Mus musculus | House mouse | Mammalia | 89 | NP_955009.1 | 492 | 96.1 | 98
Ornithorhynchus anatinus | Platypus | Mammalia | 180 | XP_028937398.1 | 489 | 85.8 | 90.5
Gallus gallus | Red junglefowl | Aves | 318 | NP_001006313.2 | 489 | 85 | 91.5
Thamnophis elegans | Western terrestrial garter snake | Reptilia | 318 | XP_032083369.1 | 491 | 81.5 | 88.5
Xenopus tropicalis | Western clawed frog | Amphibia | 352 | NP_001005048.1 | 483 | 75.2 | 83.2
Oryzias latipes | Japanese medaka | Actinopterygii | 433 | XP_004082414.1 | 488 | 74.3 | 85.2
Danio rerio | Zebrafish | Actinopterygii | 433 | NP_956154.1 | 491 | 74.2 | 84.9
Erpetoichthys calabaricus | Reedfish | Actinopterygii | 433 | XP_028675900.1 | 489 | 71.8 | 84.6
Callorhinchus milii | Australian ghostshark | Chondrichthyes | 465 | XP_007902480.1 | 490 | 73 | 85.1
Amblyraja radiata | Thorny skate | Chondrichthyes | 465 | XP_032900681.1 | 504 | 70.5 | 83.5
Scyliorhinus torazame | Cloudy catshark | Chondrichthyes | 465 | GCB75241.1 | 373 | 55.5 | 65.4
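Percent identity is the share of aligned positions at which two sequences carry the same residue. The sketch below illustrates the idea on a pair of pre-aligned strings; it is a simplification, since BLAST reports identity over its own local alignment, and both the function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned (equal length, gaps as '-').
    Columns that are gaps in both rows are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("alignment has no usable columns")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)

# Hypothetical 10-column alignment: 7 identical columns, 2 substitutions, 1 gap.
toy_human    = "MKTLLVAG-S"
toy_ortholog = "MKSLLIAGQS"
identity = percent_identity(toy_human, toy_ortholog)
print(identity)  # 70.0
```

Sequence similarity is computed the same way, except that conservative substitutions also count as matches, which is why the similarity column is never lower than the identity column.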

Evolution

The TMEM39B gene appears most distantly in cartilaginous fish (Chondrichthyes), which diverged from humans approximately 465 million years ago.[29] Orthologs of the paralog TMEM39A are found in arthropods, which diverged from humans approximately 763 million years ago, suggesting that TMEM39B was produced by the duplication of an ancestral form of TMEM39A.[29]

TMEM39B evolves at a relatively slow rate; a 1% change in the amino acid sequence requires approximately 13.9 million years. Based on sequence similarity of orthologs, TMEM39B evolves approximately 1.5 times faster than cytochrome c and 7 times slower than fibrinogen alpha.
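A rate of this kind can be estimated from the ortholog table by regressing sequence divergence on divergence date. The sketch below is one way to do it, under two assumptions that the text does not state: observed identity is converted to a corrected divergence with m = −100 · ln(identity / 100), and the line is fitted through the origin. The result is therefore only expected to land near the quoted figure, not to reproduce it.

```python
import math

# (divergence date in MYA, percent identity to human) from the ortholog table.
ORTHOLOGS = [
    (89, 96.1), (180, 85.8), (318, 85.0), (318, 81.5), (352, 75.2),
    (433, 74.3), (433, 74.2), (433, 71.8), (465, 73.0), (465, 70.5),
    (465, 55.5),
]

def corrected_divergence(identity_pct):
    """Percent change corrected for repeated substitutions at one site.

    Assumption: a Poisson-style correction, m = -100 * ln(identity / 100).
    """
    return -100.0 * math.log(identity_pct / 100.0)

def rate_through_origin(points):
    """Least-squares slope of corrected divergence against time, through (0, 0)."""
    numerator = sum(t * corrected_divergence(i) for t, i in points)
    denominator = sum(t * t for t, _ in points)
    return numerator / denominator

slope = rate_through_origin(ORTHOLOGS)  # percent change per million years
my_per_percent = 1.0 / slope            # million years per 1% change

print(f"{my_per_percent:.1f} million years per 1% change")
```

Running the same fit on orthologs of cytochrome c and fibrinogen alpha would give the reference rates needed for the comparison in the paragraph above.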

Interacting proteins

Immune proteins

Co-immunoprecipitation, affinity capture MS, and two-hybrid screens have shown that the TMEM39B protein interacts with various membrane glycoproteins.[30][31][32] Many interacting proteins have immune functions, including IL13RA1 (interleukin-13 receptor subunit alpha-1), KLRD1 (killer cell lectin-like receptor subfamily D, member 1), and SEMA7A (semaphorin-7A). SEMA7A acts as an activator of T cells and monocytes, while KLRD1 encodes an antigen presented on natural killer cells.[33][34] IL13RA1 has been proposed to mediate JAK-STAT signaling, which regulates immune cell activation.[35]

SARS-CoV-2

The TMEM39B protein interacts with the SARS-CoV-2 ORF9c accessory protein, also sometimes referred to as ORF14.[5][36] ORF9c is located within the nucleocapsid (N) gene, overlapping with ORF9b.[36] Two mutations in ORF9c resulting in premature stop codons have been observed in SARS-CoV-2 isolates, suggesting that this reading frame is dispensable for viral replication.[37] The ORF9c protein has been shown to localize to vesicles when transfected into HeLa cells and is predicted to have a non-cytoplasmic domain and a transmembrane domain.[38]

Variants

Many single nucleotide polymorphisms (SNPs) have been detected in the TMEM39B gene, a subset of which cause nonsynonymous amino acid changes.[39] Notably fewer SNPs occur at sites of post-translational modifications, motifs, or highly conserved amino acids; changes in these amino acids may be more likely to have phenotypic effects. The table below lists selected SNPs resulting in a change at such sites.

Selected SNPs in TMEM39B[39]
SNP | mRNA position | Base change | Amino acid position | Amino acid change | Description
rs1259613993 | 180 | C > T | 11 | S > P | “GSSG” repeat
rs1446462546 | 271 | C > T | 41 | S > F | O-GlcNAc, phosphorylation site
rs867417059 | 282 | A > T | 45 | S > C | O-GlcNAc, phosphorylation site
rs1009960963 | 289 | C > T | 47 | S > F | Phosphorylation site
rs377359320 | 503 | C > A | 118 | N > K | Highly conserved
rs748779192 | 555 | C > T | 136 | R > C | Highly conserved
rs778604874 | 558 | C > T | 137 | R > C | Highly conserved
rs1419668726 | 696 | T > C | 183 | F > L | “F..Y” motif
rs759591458 | 963 | C > T | 272 | R > C | Highly conserved
rs1180695332 | 1003 | G > C | 285 | R > P | Highly conserved
rs200048180 | 1009 | A > G | 287 | K > R | Glycation site
rs1445226108 | 1060 | C > T | 304 | P > L | Highly conserved
rs771743935 | 1206 | C > A | 353 | H > N | Highly conserved
rs376257849 | 1294 | G > A | 382 | G > D | Highly conserved
rs1368770455 | 1302 | G > T | 385 | V > L | Highly conserved
rs756106866 | 1336 | G > A | 396 | G > D | Highly conserved
rs868721112 | 1356 | C > T | 403 | P > S | Highly conserved
rs1383803294 | 1369 | C > G | 407 | S > C | Phosphorylation site
rs917085732 | 1581 | T > G | 478 | S > A | O-GlcNAc, phosphorylation site
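The mRNA and amino acid positions in the table are related by simple codon arithmetic. The sketch below assumes the coding sequence starts at mRNA position 150; that offset is not given in the source and is inferred here only because it is the one value that reproduces every row of the table.

```python
# Assumption: 1-based mRNA position of the first base of the start codon.
# Inferred from the table rows, not stated in the source.
CDS_START = 150

def codon_number(mrna_pos, cds_start=CDS_START):
    """Return the 1-based amino acid position encoded at an mRNA position."""
    if mrna_pos < cds_start:
        raise ValueError("position lies in the 5' UTR")
    return (mrna_pos - cds_start) // 3 + 1

# (mRNA position, amino acid position) pairs copied from the table.
TABLE_ROWS = [
    (180, 11), (271, 41), (282, 45), (289, 47), (503, 118), (555, 136),
    (558, 137), (696, 183), (963, 272), (1003, 285), (1009, 287),
    (1060, 304), (1206, 353), (1294, 382), (1302, 385), (1336, 396),
    (1356, 403), (1369, 407), (1581, 478),
]

mismatches = [(pos, aa) for pos, aa in TABLE_ROWS if codon_number(pos) != aa]
print(mismatches)  # [] when every row agrees with the assumed offset
```

A check of this kind is a quick way to catch transcription errors when SNP coordinates are copied between databases.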

Clinical significance

In a study using 164 tumor samples from patients with diffuse large B cell lymphoma, TMEM39B was one of 17 genes identified as part of a prognostic profile for 5-year progression-free survival.[3] In another study using a genome-wide siRNA screen, knockdown of TMEM39B with siRNAs decreased viral capsid/autophagosome colocalization, survival of virus-infected cells, and mitophagy in HeLa cells infected with Sindbis virus.[4] This suggests that TMEM39B may play a role in viral autophagy, like its paralog TMEM39A.

References

  1. ^ a b c d e f g h i j k l "TMEM39B transmembrane protein 39B [ Homo sapiens (human) ]". National Center for Biotechnology Information. U.S. National Library of Medicine. Retrieved 11 June 2020.
  2. ^ a b "Anti-TMEM39B antibody produced in rabbit HPA040191". Sigma-Aldrich. Retrieved 2020-07-30.
  3. ^ a b Kim; et al. (2014). "Gene expression profiles for the prediction of progression-free survival in diffuse large B cell lymphoma: results of a DASL assay". Annals of Hematology. 93 (3): 437–447. doi:10.1007/s00277-013-1884-0. PMID 23975159. S2CID 24280204.
  4. ^ a b Orvedahl; et al. (2011). "Image-based genome-wide siRNA screen identifies selective autophagy factors". Nature. 480 (7375). pp. 113–117, Figure 2. Bibcode:2011Natur.480..113O. doi:10.1038/nature10546. PMC 3229641. PMID 22020285.
  5. ^ a b Gordon, David E.; Jang, Gwendolyn M.; Bouhaddou, Mehdi; Xu, Jiewei; Obernier, Kirsten; White, Kris M.; O’Meara, Matthew J.; Rezelj, Veronica V.; Guo, Jeffrey Z.; Swaney, Danielle L.; Tummino, Tia A. (2020-04-30). "A SARS-CoV-2 protein interaction map reveals targets for drug repurposing". Nature. 583 (7816): 459–468. Bibcode:2020Natur.583..459G. doi:10.1038/s41586-020-2286-9. ISSN 1476-4687. PMC 7431030. PMID 32353859.
  6. ^ a b Fagerberg L, Hallström BM, Oksvold P, et al. (2014). "Analysis of the human tissue-specific expression by genome-wide integration of transcriptomics and antibody-based proteomics". Mol Cell Proteomics. 13 (2): 397–406. doi:10.1074/mcp.M113.035600. PMC 3916642. PMID 24309898.
  7. ^ a b "Illumina bodyMap2 transcriptome (ID 204271) - BioProject - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-07-30.
  8. ^ "AceView: Gene:TMEM39B, a comprehensive annotation of human, mouse and worm genes with mRNAs or ESTsAceView". www.ncbi.nlm.nih.gov. Retrieved 2020-07-30.
  9. ^ a b Omasits U, Ahrens CH, Müller S, Wollscheid B (15 March 2014). "Protter: interactive protein feature visualization and integration with experimental proteomic data". Bioinformatics. 30 (6): 884–6. doi:10.1093/bioinformatics/btt607. hdl:20.500.11850/82692. PMID 24162465.
  10. ^ "ExPASy - Compute pI/Mw tool". web.expasy.org. Retrieved 2020-08-02.
  11. ^ a b c d Madeira F, Park YM, Lee J, et al. (2019). "The EMBL-EBI search and sequence analysis tools APIs in 2019". Nucleic Acids Research. 47 (W1): W636–W641. doi:10.1093/nar/gkz268. PMC 6602479. PMID 30976793.
  12. ^ Genomatix: ElDorado. TMEM39B
  13. ^ a b NCBI GEO (National Center for Biotechnology Gene Expression Omnibus) GDS596. TMEM39B. Su AI, Wiltshire T, Batalov S, Lapp H et al. A gene atlas of the mouse and human protein-encoding transcriptomes. Proc Natl Acad Sci U S A 2004 Apr 20;101(16):6062-7. PMID 15075390
  14. ^ "miRDB - MicroRNA Target Prediction Database". mirdb.org. Retrieved 2020-07-30.
  15. ^ a b c "RNA Folding Form | mfold.rit.albany.edu". unafold.rna.albany.edu. Retrieved 2020-08-02.
  16. ^ "RBPDB: The database of RNA-binding specificities". rbpdb.ccbr.utoronto.ca. Retrieved 2020-08-02.
  17. ^ "miRDB - MicroRNA Target Prediction Database". mirdb.org. Retrieved 2020-08-02.
  18. ^ Xie Y, Zheng Y, Li H, Luo X, He Z, Cao S, Shi Y, Zhao Q, Xue Y, Zuo Z, Ren J (2016). "GPS-Lipid: a robust tool for the prediction of multiple lipid modification sites". Scientific Reports. 6: 28249. Bibcode:2016NatSR...628249X. doi:10.1038/srep28249. PMC 4910163. PMID 27306108.
  19. ^ Ren J, Wen L, Gao X, Jin C, Xue Y, Yao X (2008). "CSS-Palm 2.0: an updated software for palmitoylation sites prediction". Protein Engineering, Design & Selection. 21 (11): 639–644. doi:10.1093/protein/gzn039. PMC 2569006. PMID 18753194.
  20. ^ "SUMOplot Analysis Program". Abcepta.
  21. ^ Zhao Q, Xie Y, Zheng Y, Jiang S, Liu W, Mu W, Liu Z, Zhao Y, Xue Y, Ren J (2014). "GPS-SUMO: a tool for the prediction of sumoylation sites and SUMO-interaction motifs". Nucleic Acids Research. 42 (W1): W325–W330. doi:10.1093/nar/gku383. PMC 4086084. PMID 24880689.
  22. ^ Blom N, Gammeltoft S, Brunak S (1999). "Sequence and structure-based prediction of eukaryotic protein phosphorylation sites". Journal of Molecular Biology. 294 (5): 1351–1362. doi:10.1006/jmbi.1999.3310. PMID 10600390.
  23. ^ Johansen, Morten Bo; Kiemer, Lars; Brunak, Soren (2006). "Analysis and prediction of mammalian protein glycation". Glycobiology. 16 (9): 844–853. doi:10.1093/glycob/cwl009. PMID 16762979.
  24. ^ Steentoft C, Vakhrushev SY, Joshi HJ, Kong Y, Vester-Christensen MB, Schjoldager KT, Lavrsen K, Dabelsteen S, Pedersen NB, Marcos-Silva L, Gupta R, Bennett EP, Mandel U, Brunak S, Wandall HH, Levery SB, Clausen H (May 15, 2013). "Precision mapping of the human O-GalNAc glycoproteome through SimpleCell technology". EMBO J. 32 (10): 1478–88. doi:10.1038/emboj.2013.79. PMC 3655468. PMID 23584533.
  25. ^ "TMEM39A transmembrane protein 39A [Homo sapiens (human)] - Gene - NCBI". www.ncbi.nlm.nih.gov. Retrieved 2020-07-30.
  26. ^ a b c d e "Standard Protein BLAST". National Center for Biotechnology Information.
  27. ^ Tran Q, Park J, Lee H, Hong Y, Hong S, Park S, Park J, Kim SH (2017). "TMEM39A and Human Diseases: A Brief Review". Toxicological Research. 33 (3): 205–209. doi:10.5487/TR.2017.33.3.205. PMC 5523561. PMID 28744351.
  28. ^ Li X, Ma R, Li Q, Li S, Zhang H, Xie J, Bai J, Idris A, Feng R (2019). "Transmembrane Protein 39A Promotes the Replication of Encephalomyocarditis Virus via Autophagy Pathway". Frontiers in Microbiology. 10: 2680. doi:10.3389/fmicb.2019.02680. PMC 6901969. PMID 31849860.
  29. ^ a b c Kumar S, Stecher G, Suleski M, Hedges SB (2017) TimeTree: A Resource for Timelines, Timetrees, and Divergence Times. Mol Biol Evol doi:10.1093/molbev/msx116
  30. ^ Huttlin EL, Ting L, Bruckner RJ, et al. (2015). "The BioPlex Network: A Systematic Exploration of the Human Interactome". Cell. 162 (2): 425–440. doi:10.1016/j.cell.2015.06.043. PMC 4617211. PMID 26186194.
  31. ^ Huttlin EL, Bruckner RJ, Paulo JA, et al. (2017). "Architecture of the human interactome defines protein communities and disease networks". Nature. 545 (7655): 505–509. Bibcode:2017Natur.545..505H. doi:10.1038/nature22366. PMC 5531611. PMID 28514442.
  32. ^ Wang J, Huo K, Ma L, et al. (11 October 2011). "Toward an understanding of the protein interaction network of the human liver". Mol Syst Biol. 7: 536. doi:10.1038/msb.2011.67. PMC 3261708. PMID 21988832.
  33. ^ Xie J, Wang H (2017). "Semaphorin 7A as a potential immune regulator and promising therapeutic target in rheumatoid arthritis". Arthritis Research & Therapy. 19 (1): 10. doi:10.1186/s13075-016-1217-5. PMC 5251212. PMID 28109308.
  34. ^ Lanier, L. L. (2015). "KLRK1 (killer cell lectin-like receptor subfamily K, member 1)". Atlas of Genetics and Cytogenetics in Oncology and Haematology. 19 (3): 172–175. doi:10.4267/2042/56407. hdl:2042/56407.
  35. ^ Sheikh F, Dickensheets H, Pedras-Vasconcelos J, et al. (2015). "The Interleukin-13 Receptor-α1 Chain Is Essential for Induction of the Alternative Macrophage Activation Pathway by IL-13 but Not IL-4". Journal of Innate Immunity. 7 (5): 494–505. doi:10.1159/000376579. PMC 4553078. PMID 25766112.
  36. ^ a b Shukla A, Hilgenfeld R (2015). "Acquisition of new protein domains by coronaviruses: analysis of overlapping genes coding for proteins N and 9b in SARS coronavirus". Virus Genes. 50 (1): 29–38. doi:10.1007/s11262-014-1139-8. PMC 7089080. PMID 25410051.
  37. ^ von Brunn, A., Teepe, C., Simpson, J. C., Pepperkok, R., Friedel, C. C., Zimmer, R., ... & Haas, J. (2007). Analysis of intraviral protein-protein interactions of the SARS coronavirus ORFeome. PLOS ONE, 2(5), e459.
  38. ^ Baruah, C., Devi, P., & Sharma, D. K. (2020). Sequence analysis and structure prediction of SARS-CoV-2 accessory proteins 9b and ORF14: evolutionary analysis indicates close relatedness to bat coronavirus.
  39. ^ a b "SNP linked to Gene (geneID:55116) Via Contig Annotation". www.ncbi.nlm.nih.gov. Retrieved 2020-08-02.