Research article

Prognostic score model-based signature genes for predicting the prognosis of metastatic skin cutaneous melanoma


  • Purpose 

    Cutaneous melanoma (SKCM) is the most invasive form of skin cancer. Metastasis to distant lymph nodes or other organ systems is an indicator of poor prognosis in melanoma patients. The aim of this study was to identify reliable prognostic biomarkers for SKCM.

    Methods 

    Four RNA-sequencing datasets associated with SKCM were downloaded from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases, together with the corresponding clinical information. Differentially expressed genes (DEGs) between primary and metastatic samples were screened using the MetaDE tool. Weighted gene co-expression network analysis (WGCNA) was conducted to screen functional modules. A prognostic score (PS)-based predictive model and a nomogram model were constructed to identify signature genes and independent clinicopathologic factors.

    Results 

    Based on MetaDE analysis and WGCNA, a total of 456 overlapping genes were identified as hub genes related to SKCM progression. Functional enrichment analysis revealed that these genes were mainly involved in the Hippo signaling pathway, signaling pathways regulating pluripotency of stem cells, and pathways in cancer. In addition, eight optimal DEGs (RFPL1S, CTSV, EGLN3, etc.) were identified as signature genes using the PS model. Cox regression analysis revealed that pathologic stage T, pathologic stage N, and recurrence were independent prognostic factors. These three clinical factors and PS status were incorporated into a nomogram predictive model for estimating the three-year and five-year survival probability of individuals.

    Conclusions 

    The prognosis prediction model of this study may provide a promising method for clinical decision making and for predicting the prognosis of SKCM patients.

    Citation: Jiaping Wang. Prognostic score model-based signature genes for predicting the prognosis of metastatic skin cutaneous melanoma[J]. Mathematical Biosciences and Engineering, 2021, 18(5): 5125-5145. doi: 10.3934/mbe.2021261


    1. Introduction

    Approximately one-third of all proteins are integral membrane proteins (MPs) [1,2,3,4], and they comprise more than half of all drug targets due to their prevalence in a wide variety of biological functions [5,6,7,8]. However, of the more than 106,000 proteins with experimentally determined three-dimensional (3D) structures in the Protein Data Bank (PDB) [2,9,10,11,12,13,14], only about 2,300 are MPs [6,7,8,9,10,11,12,13,14,15,16,17,18,19,20]. Further, according to Stephen White’s database of MPs of known structure (http://blanco.biomol.uci.edu/mpstruc/), only approximately 520 unique MP structures have been determined. The disparity between the importance of MPs and the available 3D structures reflects the technical difficulties associated with MP structure determination by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. To study MPs in their biologically relevant native conformation(s), a membrane mimic must be present during the experiment. While exciting progress has been made in the field of crystallography, such as the use of femtosecond crystallography [21,22], robotics [23,24], and antibodies [1,3,4], MP crystallization remains a bottleneck. Limiting factors for solution NMR spectroscopy include line broadening due to the slow tumbling of large MPs embedded in membrane mimics. Cryo-probes, increasingly powerful NMR magnets, selective labeling, and the development of solid-state NMR techniques [5] are continuously pushing the MP NMR field forward, but challenges remain [9,10,13,14].

    1.1. EPR spectroscopy can serve as an alternative means of membrane protein structural characterization

    Site-directed spin labeling electron paramagnetic resonance (SDSL-EPR) spectroscopy may serve as another means of MP structure determination because it has a number of advantages compared to more traditional methods. For example, proteins can be studied in native-like environments, such as in lipid bicelles or vesicles, and no crystallization is required. Because of the sensitivity of SDSL-EPR, relatively small amounts of protein suffice, which is important in the case of MPs that are often difficult to express and purify. Because unpaired electrons are present only at the two labeling sites, results are straightforward to interpret: the error-prone resonance assignment process required in NMR spectroscopy is omitted [8,15,16,17,18,19,20].

    However, EPR is not without its disadvantages. Like NMR spectroscopy, structure determination is indirect in that the spectroscopic data are first converted to structural restraints [25,26,27,28]. Also, for distance measurements, SDSL requires the removal of all endogenous reactive cysteines in the protein and the mutation of the residues of interest into cysteines. As a result, in contrast to NMR spectroscopy, only one inter-residue distance can be measured per experiment. This results in low throughput and sparse datasets. In addition, the spin label itself introduces uncertainty, as the distance between the paramagnetic spin labels, which are at the tips of long and flexible side-chains, is measured. This distance then needs to be converted into a structural restraint based on MP backbone coordinates.

    There are two principal ways to accomplish this conversion. Explicit approaches such as MMM [29], mtsslWizard [30], PRONOX [31,32], and RosettaEPR [33] model the spin label explicitly. Such programs can predict EPR distances with an accuracy of ~3 Å. Unfortunately, such methods are too slow to be integrated into de novo folding simulations such as RosettaTMH. As a rapid but less precise alternative we also implemented an implicit Knowledge-Based Potential (KBP) into RosettaEPR that computes a likelihood distribution for the Cβ distance based on the observed distance of the unpaired electron in the SDSL-DEER experiment [25,26]. This potential is used in calculations labeled RosettaTMH+EPR in the present manuscript.

    1.2. Novel de novo membrane protein structure prediction tools are needed

    In order to aid in MP structure determination, several computational methods have been developed. These methods can be divided into two categories: template-based comparative modeling and de novo folding. Template-based methods, such as Modeller [34,35,36,37], Rosetta [38,39], SWISS-MODEL [40], and I-TASSER [41,42,43], are commonly used when the structure of a homologous protein exists. Template-based modeling methods are so named because they require a structural template onto which a target sequence can be threaded. For the sequence in question, a template structure, whether it is a sequence homolog or a structure exhibiting the same expected fold, must first be identified. Next, often after performing one or more sequence alignments, the target sequence is threaded onto the 3D coordinates of the template structure, thus replacing the sequence of the template with that of the target [44].

    Even though significant advances are being made in the structure determination of additional GPCR or LeuT-fold structures, progress is slow when it comes to the determination of new MP folds. Thus, it is often difficult to identify a suitable template structure for MP comparative modeling because there are a limited number of unique MP structures available in the PDB. Additionally, even though templates having a similar fold may exist, it is possible that the sequence homology between the target and the template is too low to be confidently detected. For example, of the more than 20 experimentally determined structures of G-protein coupled receptors (GPCRs), the majority are of class A, or class 1 (http://gpcr.scripps.edu/index.html) [45], even though there are five or six classes of GPCRs [46]. Similarly, while there are some structures of transporters, such as LeuT [47], vSGLT [48], BetP [49], and GadC [50], MPs having the LeuT fold perform a large variety of functions, show significant divergence in sequence, and can belong to a number of different protein superfamilies [51]. While comparative modeling based on an evolutionarily distant template can be useful for hypothesis generation, especially when combined with experimental methods in an iterative fashion [52], de novo structure prediction of MPs is needed in the absence of a structural template. Additionally, de novo folding methods allow for an unbiased exploration of the conformational space and can be used to complement comparative modeling in case of low similarity between template and target.

    Compared to template-based MP modeling methods, there are only a handful of tools for de novo folding of MPs. RosettaMembrane was introduced in 2006 [53] and was later expanded to include full-atom scoring potentials [54]. Its structure prediction capabilities were limited to MPs of fewer than 150 amino acids. The addition of limited helix-helix contact restraints derived from sequence conservation allowed for accurate modeling of larger MPs: in four of twelve test cases, models superimposable below an RMSD of 4 Å were observed [55]. One limitation was that this method could only account for one restraint at a time. Furthermore, the utility of RosettaMembrane in its current state is limited. For technical reasons that originate in the RosettaMembrane code base, it is not possible to de novo fold MPs with multiple restraints, such as those obtained from NMR and EPR.

    Other methods to predict MP structure, such as FILM3, exhibit modest success in predicting large MPs, but they rely on correlated mutational information to score MP models. Of 71 MP sequences, FILM3 was able to correctly predict 100% of inter-helix contacts for 17 proteins. Upon comparison with two-dimensional slices of the experimental structures, 9 predicted structures had the correct fold [56,57]. EVfold_membrane is also a promising method for MP structure determination but also relies on information from evolutionary covariation [58]. BCL::MP-Fold, on the other hand, is independent of contacts predicted from correlated mutations. It reduces the conformational search space by assembling secondary structure elements (SSEs) combined with knowledge-based potentials (KBPs) to assess model quality [59]. The disadvantage of BCL-generated models is the lack of inter-helix loop regions. Additionally, because its models are composed of idealized α-helices, it under-predicts secondary structural features often present in MPs, such as helical kinks.

    1.3. RosettaTMH allows for folding of membrane proteins, both with and without experimental restraints

    We have developed RosettaTMH to address the size limitation of other reported MP de novo folding methods. RosettaTMH assembles MP folds via rigid body perturbations of transmembrane helices (TMHs). However, unlike BCL::MP-Fold, 3- and 9-amino acid fragment insertions, as used in the traditional Rosetta de novo folding algorithm [53,55,60], are used to more thoroughly sample helical orientations and introduce bends and kinks. Throughout the de novo folding process, RosettaMembrane’s MP-specific scoring functions are used [53,54]. RosettaTMH can be combined with multiple experimental restraints, such as inter-residue distance information from EPR, which is an advantage compared to previously published RosettaMembrane folding protocols. This additional feature allows for improved sampling of native-like folds that are in agreement with empirical information.

    RosettaTMH was benchmarked on 34 MPs of known structure. It was compared to the original RosettaMembrane folding algorithm, “MembraneAbinitio” [53,55], and to the traditional fragment assembly-only method used for folding soluble proteins in Rosetta, “ExtendedChain” [61], but using the RosettaMembrane scoring function. In order to assess the performance of combining RosettaTMH with experimentally obtained structural data, EPR distance restraints were simulated for all MPs in the benchmark set. The purpose of the benchmark was to determine if these restraints increase the sampling of native-like MP folds. The simulated distance restraints were generated using the BioChemical Library (BCL, www.meilerlab.org) and the restraint-picking algorithm introduced by Kazmier, et al. [62]. We show that, by implementing the ability to fold MPs with structural restraints, native-like folds can be obtained for 30 MPs in the benchmark set. For the purpose of this manuscript we define a native-like fold as having an RMSD100SSE value smaller than 8 Å (see below).

    2. Materials and Methods

    2.1. The RosettaTMH de novo folding algorithm

    The RosettaTMH MP folding algorithm differs significantly from both the Rosetta folding algorithm for soluble proteins, “ExtendedChain” [61], as well as the published RosettaMembrane folding protocols [53,55]. The primary difference is that RosettaTMH allows for potentially enhanced sampling of MP folds by treating TMHs as rigid bodies. Each TMH can be rotated, translated, or transformed as an independent entity. In order to implement this new algorithm in the overall Rosetta folding framework, the model’s fold tree was modified. The fold tree of a protein model is a directed acyclic graph representing the connectivity of the model in internal coordinate space. This connectivity is distinct from chemical connectivity and enables Rosetta to rapidly move large sections of the protein independently [63,64]. In the case of a helical MP, a radial, or star-shaped, fold tree is used; therefore, the center of mass (CoM) of each TMH is connected to a central node (Figure 1).

    Figure 1. Generation of membrane protein fold tree in RosettaTMH and initial placement of transmembrane helices. This schematic outlines how RosettaTMH generates a radial fold tree for a 5-TMH MP. (A) In preparation for generating the fold tree, the primary sequence of the protein is read in and used to create an idealized α-TMH. RosettaTMH utilizes user-defined TMH definitions to divide the idealized TMH and insert each individual TMH into the implicit membrane. It then calculates each TMH’s center of mass (CoM). (B) The CoMs connect the TMHs to a central root residue (open circle) in internal coordinate space. (C) A hexagonal grid is computed, such that the vertices are aligned along the membrane center plane and are 15 Å away from one another and from the origin. Then, for each grid point, a TMH is chosen randomly, and the TMH is transformed to that grid point such that its CoM is aligned with the origin. The hexagonal grid can be expanded as needed, depending on the number of TMHs in the protein.
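The radial fold tree described above can be pictured as a small edge list. The following is a minimal sketch in plain Python, not Rosetta's own fold tree API: the function name `star_fold_tree`, the edge labels, and the use of the middle residue of each span as a stand-in for the residue nearest the CoM are all illustrative assumptions.

```python
def star_fold_tree(tmh_spans, root):
    """Build a star-shaped edge list for a helical membrane protein.

    tmh_spans: list of (first_residue, last_residue) per TMH (hypothetical input).
    root: index of a virtual central root residue (assumption).
    """
    edges = []
    for first, last in tmh_spans:
        # Stand-in for the residue closest to the helix centre of mass.
        centre = (first + last) // 2
        # A "jump" links the root to the helix without chemical connectivity.
        edges.append(("jump", root, centre))
        # Chain edges run outward from the centre to both helix termini.
        if centre > first:
            edges.append(("peptide", centre, first))
        if centre < last:
            edges.append(("peptide", centre, last))
    return edges


tree = star_fold_tree([(1, 20), (30, 50)], root=51)
```

Because every helix hangs off the root through its own jump, moving one jump repositions that helix as a rigid body without disturbing the others, which is the property the text relies on.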

    Before de novo folding begins, each TMH is inserted into the implicit RosettaMembrane environment [53]. The CoM of each TMH is set at the membrane center, and the helices are aligned to the membrane normal such that each TMH is antiparallel to its sequential neighbors. The helices are arranged in a hexagonal grid and are initially separated from each other by 15 Å. The starting fold of the model is randomized; that is, the arrangement of helices in the hexagonal grid is different for each starting model (Figure 1).
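The initial placement can be illustrated with a short sketch. It assumes a hexagonal lattice with 15 Å spacing in the membrane center plane (z = 0), excludes the origin as a site, and fills the sites closest to the origin first; the function names and the way helix direction is encoded are hypothetical and not taken from the RosettaTMH source.

```python
import math
import random

SPACING = 15.0  # Å, helix separation quoted in the text


def hex_grid_points(n_points, spacing=SPACING):
    """Return the n_points lattice sites closest to the origin (origin excluded)."""
    rings = 1
    while 3 * rings * (rings + 1) < n_points:  # sites available in `rings` rings
        rings += 1
    points = []
    for i in range(-rings, rings + 1):
        for j in range(-rings, rings + 1):
            if (i, j) == (0, 0) or abs(i + j) > rings:
                continue
            x = spacing * (i + 0.5 * j)
            y = spacing * (math.sqrt(3) / 2.0) * j
            points.append((x, y, 0.0))  # z = 0 is the membrane centre plane
    points.sort(key=lambda p: (round(math.hypot(p[0], p[1]), 6),
                               math.atan2(p[1], p[0])))
    return points[:n_points]


def initial_placement(n_tmh, seed=None):
    """Assign each TMH to a random grid site with alternating direction."""
    rng = random.Random(seed)
    sites = hex_grid_points(n_tmh)
    order = list(range(n_tmh))
    rng.shuffle(order)  # randomised starting arrangement
    placement = {}
    for site, tmh in zip(sites, order):
        # Sequence neighbours point in opposite directions along the normal.
        direction = 1 if tmh % 2 == 0 else -1
        placement[tmh] = {"com": site, "direction": direction}
    return placement


start = initial_placement(5, seed=1)
```

Calling `initial_placement` with different seeds yields different helix-to-site assignments, mirroring the statement that each starting model has a different arrangement.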

    2.2. Stages of de novo folding with RosettaTMH

    The pre-processing and de novo folding stages of RosettaTMH are summarized in Figure 2.

    Figure 2. Outline of stages for RosettaTMH de novo folding. This flowchart summarizes the process of de novo folding with RosettaTMH, including the pre-processing that takes place prior to Monte Carlo Metropolis (MCM) sampling.

    Folding begins after the initialization of the model. The first stage of de novo folding consists entirely of rigid body transformations [53] performed in a Monte Carlo Metropolis (MCM) fashion [65,66]. For each MCM move, the TMH is allowed to either rotate by up to 0.1° about any axis or translate up to 0.5 Å in any direction from its current position. The conformation resulting from each transformation is scored according to the RosettaMembrane centroid-based scoring function. Stage 1 of folding consists of 2,000 MCM moves, and the radius of gyration (RG) and RosettaMembrane-specific “density” terms are turned on [53]. These scoring terms aid in improving the compactness of the model. After the first stage, the model undergoes 9- and 3-amino acid fragment insertions using a protocol analogous to the one used for soluble proteins [53,61]. Briefly, in Stage 2, 2,000 MCM cycles are performed, during which 9mer fragments are inserted into the helical protein backbone. The density scoring term is turned off, and residue pairing, membrane environment, and membrane-specific penalties are added [53,54]. The density term is re-introduced in Stage 3, which consists of 10 inner cycles; during these inner cycles, the scoring function can be alternated if desired. However, for MPs, the scoring function is the same for each of two inner cycle sub-stages. Each sub-stage consists of 2,000 MCM cycles for inserting 9mer fragments, resulting in a total of 20,000 fragment insertions. Finally, the density term is up-weighted in Stage 4, and 4,000 MCM cycles of 3mer fragment insertions are performed. Note that we omitted construction of loops between TMHs because we wanted to test the TMH folding protocol. An inter-helix distance score ensures that TMHs are close enough to allow for construction of loops (see below).
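The Stage 1 rigid-body sampling can be sketched as a generic Metropolis loop. This is a toy illustration under stated assumptions, not the RosettaTMH implementation: the score is a made-up compactness measure rather than the RosettaMembrane scoring function, each helix is reduced to a center of mass plus a single tilt angle, and the temperature `kT` and the 50/50 choice between move types are arbitrary.

```python
import math
import random


def toy_score(helices):
    """Stand-in score: squared radius of gyration of helix centres plus a tilt term."""
    n = len(helices)
    centre = [sum(h["com"][k] for h in helices) / n for k in range(3)]
    rg2 = sum(sum((h["com"][k] - centre[k]) ** 2 for k in range(3))
              for h in helices) / n
    return rg2 + 0.01 * sum(h["tilt"] ** 2 for h in helices)


def random_translation(rng, max_step=0.5):
    """Random displacement vector of length <= max_step (Å)."""
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(x * x for x in v))
        if 0.0 < norm <= 1.0:
            break
    scale = max_step * rng.random() / norm
    return [x * scale for x in v]


def mcm_stage(helices, n_moves=2000, kT=1.0, seed=None):
    """Run n_moves Metropolis steps of small rigid-body perturbations."""
    rng = random.Random(seed)
    current = toy_score(helices)
    accepted = 0
    for _ in range(n_moves):
        idx = rng.randrange(len(helices))
        old = (list(helices[idx]["com"]), helices[idx]["tilt"])
        if rng.random() < 0.5:
            step = random_translation(rng)          # translate by up to 0.5 Å
            helices[idx]["com"] = [c + s for c, s in zip(old[0], step)]
        else:
            helices[idx]["tilt"] = old[1] + rng.uniform(-0.1, 0.1)  # up to 0.1°
        trial = toy_score(helices)
        delta = trial - current
        # Metropolis criterion: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / kT):
            current = trial
            accepted += 1
        else:
            helices[idx]["com"], helices[idx]["tilt"] = old  # revert the move
    return current, accepted
```

The later stages follow the same accept/reject pattern, with fragment insertions replacing the rigid-body perturbations and the score terms changing as described in the text.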

    2.3. Setup of RosettaTMH parameterization and benchmarking datasets

    The 34-protein benchmarking set exhibits a wide range of sizes and topological complexity. The number of EPR distance restraints simulated was computed as

    @\# restraints = 0.2 \times \# aa_{TMH}@
    where #restraints refers to the number of simulated EPR restraints generated, and #aaTMH refers to the number of amino acids in TMHs defined in the experimental structures. This number of restraints was chosen because it is on the order of the maximal number of distance restraints that have been obtained for several MPs [67,68,69,70]. The input files used (i.e., fragments, secondary structure prediction, span, lipophilicity, and native PDB files) were the same as, or based on, those employed for benchmarking of BCL::MP-Fold [59] and are provided in the Protocol Capture that accompanies this publication.
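As a quick worked example of the restraint-count rule, the sketch below applies the 0.2 factor to a TMH residue count. Rounding to the nearest integer is an assumption, since the text does not state how fractional values are handled, and the function name is hypothetical.

```python
def n_simulated_restraints(n_tmh_residues, fraction=0.2):
    """Number of simulated EPR restraints for a protein.

    n_tmh_residues: amino acids located in transmembrane helices.
    Rounding to the nearest integer is an assumption, not stated in the text.
    """
    return round(fraction * n_tmh_residues)


# A protein with 60 residues in TMHs would receive 12 restraints.
example = n_simulated_restraints(60)
```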

    2.4. Simulation of EPR distance restraints using the BioChemical Library

    Ten sets of EPR distance restraints were generated for each protein for the 34-MP benchmark. This was done to avoid bias resulting from using any single restraint set. The restraint selection algorithm developed by Kazmier, et al. [62] was employed. The algorithm optimizes the information content of the restraint set by maximizing the sequence separation between spin labeling sites. At the same time, the algorithm finds restraint sets that link all pairs of SSEs in the protein and excludes positions that are likely buried and unlikely to be labeled without disruption of the tertiary structure. In order to convert the resulting restraint sets to EPR-like distance restraints for testing during de novo folding, the Euclidean distances between the specified residues were determined from the MP experimental structures. Next, a spin label uncertainty was added to each distance, based on the cone model-based spin label statistics generated for the RosettaEPR KBP [26]. These statistics were generated by placing a pseudo-spin label in the form of a right-angle cone (based on methanethiosulfonate, or MTS) on exposed residue pairs in a database of over 3,500 proteins. The frequencies of observed values for the calculated difference between spin label distance and Cβ distance (dSL-d) were collected in a histogram, which was shown to match relatively well to experimentally determined dSL-d values for T4-lysozyme and αA-crystallin. This histogram of spin label statistics quantifies the expected uncertainty associated with EPR distances measured on proteins spin labeled with MTS.
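The conversion from a Cβ-Cβ distance to a simulated EPR distance can be sketched as sampling an offset from a histogram and adding it to the native distance. The histogram values below are placeholders invented for illustration, not the published cone-model statistics, and the function names are hypothetical; the actual work used the BioChemical Library.

```python
import math
import random

# Hypothetical histogram of (spin label distance - C-beta distance) in Å:
# bin centre -> relative frequency. Placeholder numbers only.
TOY_HISTOGRAM = {-4.0: 0.05, -2.0: 0.10, 0.0: 0.20, 2.0: 0.30,
                 4.0: 0.20, 6.0: 0.10, 8.0: 0.05}


def cb_distance(cb_a, cb_b):
    """Euclidean distance between two C-beta coordinates (x, y, z)."""
    return math.dist(cb_a, cb_b)  # available in Python 3.8+


def simulate_epr_distance(cb_a, cb_b, rng, histogram=TOY_HISTOGRAM):
    """Native C-beta distance plus an offset drawn from the histogram."""
    offsets = list(histogram)
    weights = list(histogram.values())
    offset = rng.choices(offsets, weights=weights, k=1)[0]
    return cb_distance(cb_a, cb_b) + offset


rng = random.Random(0)
d_sim = simulate_epr_distance((0.0, 0.0, 0.0), (30.0, 0.0, 0.0), rng)
```

Drawing the offset per restraint, rather than adding a fixed value, is what makes each of the ten simulated restraint sets carry its own label-induced noise.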

    2.5. Optimization of EPR distance restraint scoring term weighting

    The EPR distances for the residue pairs were simulated as described in the previous section. Preliminary benchmarking indicated that the EPR score used for the folding of T4-lysozyme [26] was insufficient to improve MP model quality of large MPs, such as rhodopsin. Instead, it was determined that a two-component scoring term was needed.

    The modified EPR restraint potential for folding MPs consists of an energetic bonus derived from the aforementioned cone model statistics. Indeed, this energetic bonus is the same KBP used in the de novo folding of T4-lysozyme [26]. However, in addition to the KBP energetic bonus, the EPR restraint score contains an energetic penalty characterized by the equation:

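    @f\left( x \right) = \left\{ \begin{array}{ll} {\left( x - lb \right)^2} & \text{for } x < lb \\ 0 & \text{for } lb \le x \le ub \\ {\left( x - ub \right)^2} & \text{for } ub < x \le ub + rswitch \\ x - ub - rswitch + rswitch^2 & \text{for } x > ub + rswitch \end{array} \right.@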
    where x is the currently measured distance within the model, lb is the restraint lower bound, ub is the restraint upper bound, and rswitch is set to 0.5. This quadratic penalty is similar to that used for NOE-derived distance restraints in NMR structure calculations. The EPR scoring potential is designed such that the quadratic penalty is enforced if, during folding, the simulated model’s dSL-d value for a given residue pair is greater than −12.0 Å and less than 12.0 Å.
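The penalty can be written as a small piecewise function. This sketch reads the final branch as the linear continuation x − ub − rswitch + rswitch², which is an assumption based on the form of bounded restraint functions commonly used in Rosetta; the function name is hypothetical.

```python
def bounded_penalty(x, lb, ub, rswitch=0.5):
    """Piecewise penalty for a distance x given bounds lb <= ub.

    Quadratic below lb, zero inside [lb, ub], quadratic just above ub,
    and linear beyond ub + rswitch (assumed form of the last branch).
    """
    if x < lb:
        return (x - lb) ** 2
    if x <= ub:
        return 0.0
    if x <= ub + rswitch:
        return (x - ub) ** 2
    return x - ub - rswitch + rswitch ** 2


# Inside the bounds there is no penalty; outside it grows with the violation.
inside = bounded_penalty(15.0, lb=10.0, ub=20.0)
below = bounded_penalty(5.0, lb=10.0, ub=20.0)
```

With this reading the function is continuous at x = ub + rswitch, where both branches evaluate to rswitch², and the linear tail keeps strongly violated restraints from dominating the total score.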

    The weight of each EPR scoring term component was optimized separately. One thousand models of each of the 9 proteins indicated in Table 1 were folded using RosettaTMH for each EPR restraint weighting scheme. Combinations of the weights for both components of the scoring term were systematically tested in a grid search. For each protein and each of 49 weighting schemes, the percentage of models having RMSD100SSE < 8 Å was computed, and the average of these values across the 9 proteins is reported in Table S1. The RMSD100SSE is defined as:

    @RMS{D_{100}}SSE = RMSD\_SSE/\left( {1 + ln\sqrt {N/100} } \right)@
    where RMSD is the root mean square deviation, SSE is secondary structural element, and N is the number of residues. In addition, the enrichment was computed based on the models obtained from each weighting scheme (Table S2). Enrichment was computed as:
    @enrichment = \frac{{TP}}{{TP + FP}} \times \frac{{P + N}}{P}@
    where the ratio @\left( {P + N} \right)/P@ is set to 10, limiting the maximum possible enrichment to 10.0. The models were sorted according to Rosetta score. Models that fell within the top 10% by score were counted as “positive” (P), and models whose scores fell into the bottom 90% by score were counted as “negative” (N). The positives were then sorted by RMSD100SSE relative to the native structures, and those models that fell within the top 10% by RMSD100SSE were labeled “true positives” (TPs). All other low-scoring models were considered “false positives” (FPs).
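Both quality measures are simple to compute. The sketch below implements the size-normalized RMSD and the enrichment under one interpretation of the definition above, namely that true positives are models in the top 10% by score that are also in the top 10% by RMSD100SSE among all models; that reading, and the function names, are assumptions.

```python
import math


def rmsd100(rmsd, n_residues):
    """Normalise an RMSD to a 100-residue protein."""
    return rmsd / (1.0 + math.log(math.sqrt(n_residues / 100.0)))


def enrichment(scores, rmsd100_values, fraction=0.10):
    """Enrichment of accurate models among the best-scoring models.

    Lower score and lower RMSD are both better. With fraction = 0.10 the
    factor (P + N) / P equals 10, so the result lies between 0 and 10.
    """
    n = len(scores)
    n_top = max(1, round(n * fraction))
    by_score = sorted(range(n), key=lambda i: scores[i])
    by_rmsd = sorted(range(n), key=lambda i: rmsd100_values[i])
    positives = set(by_score[:n_top])   # TP + FP: top fraction by score
    accurate = set(by_rmsd[:n_top])     # top fraction by RMSD100
    tp = len(positives & accurate)
    return tp / n_top * (n / n_top)


# Perfect agreement between score and accuracy gives the maximum value.
best_case = enrichment(list(range(100)), list(range(100)))
```

An enrichment of 1.0 corresponds to what random selection would give, so values above 1.0 indicate that the score preferentially picks accurate models.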

    Table 1. Proteins used for benchmarking.
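    PDB | Chain | Domain | # Res | # TMH | Contact Order | # Restraints
    3SYO | | 76–197 | 122 | 2 | 14.4 | 12
    2BG9 | A | 211–301 | 91 | 3 | 6.9 | 16
    1J4N | | 4–119 | 116 | 3 | 15.2 | 17
    2KSF | | 396–502 | 107 | 4 | 11.9 | 13
    1PY6 (1PY7)^a | | 77–199 | 123 | 4 | 13.3 | 20
    2PNO | A | 2–131 | 130 | 4 | 13.6 | 22
    2BL2 | | 12–156 | 145 | 4 | 20.7 | 25
    2K73 | | 1–164 | 164 | 4 | 15.5 | 19
    2ZW3 | A | 2–217 | 216 | 4 | 25.7 | 24
    1IWG | | 336–498 | 163 | 5 | 17.4 | 26
    1RHZ | A | 23–188 | 166 | 5 | 19.8 | 21
    2YVX | A | 284–471 | 188 | 5 | 20.6 | 26
    1OCC | C | 71–261 | 191 | 5 | 24.1 | 29
    4A2N | | 1–192 | 192 | 5 | 22.4 | 24
    1KPL | | 31–233 | 203 | 5 | 23.4 | 31
    2BS2 | C | 21–237 | 217 | 5 | 17.5 | 29
    3P5N | | 10–188 | 179 | 6 | 17.9 | 22
    2IC8 | | 91–272 | 182 | 6 | 17.9 | 23
    1PV6 | | 1–190 | 189 | 6 | 28.3 | 33
    2NR9 | | 4–195 | 192 | 6 | 17.6 | 24
    1OKC^b | | 2–293 | 292 | 6 | 25.8 | 34
    3B60^b | A | 10–328 | 319 | 6 | 25.7 | 52
    2KSY | | 1–223 | 223 | 7 | 20.1 | 37
    1PY6^b | | 5–231 | 227 | 7 | 25.2 | 36
    3KCU | | 29–280 | 252 | 7 | 29.7 | 33
    1FX8^b | | 6–259 | 254 | 7 | 28.6 | 38
    1U19^b | | 33–310 | 278 | 7 | 25.0 | 41
    3KJ6 | A | 35–346 | 311 | 7 | 39.5 | 31
    3HD6^b | | 6–448 | 403 | 12 | 43.6 | 59
    3GIA^b | | 3–435 | 433 | 12 | 62.5 | 64
    3O0R^b | B | 10–458 | 449 | 12 | 30.6 | 69
    3HFX^b | | 12–504 | 493 | 12 | 68.0 | 63
    2XUT | A | 13–500 | 488 | 14 | 42.8 | 71
    2XQ2 | A | 9–573 | 565 | 15 | 71.8 | 79
    ^a Referred to as 1PY7 in this publication
    ^b These proteins were used in RosettaTMH parameter optimization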

    During EPR restraint weight optimization, the Rosetta radius of gyration (RG) scoring term was used with a weight of 4.25. Each restraint was scored independently, and the sum of individual restraint scores constitutes the total raw restraint score. The total restraint score was multiplied by a normalization factor that is equal to:

    @weigh{t_{cst}} = \frac{{log\left( {\# cst} \right)}}{{\# cst}} \times \# aa@
    where weightcst is the weight by which the entire restraint score is multiplied before it is added to the total Rosetta score, or energy, #cst is equal to the number of simulated EPR restraints used, and #aa is the number of residues in the protein. Because the total restraint score is the sum of individual restraint scores, the weighted restraint score can be represented by:
    @cst\_scor{e_{weighted}} = average\left( {cst\_scor{e_{raw}}} \right) \times log\left( {\# cst} \right) \times \# aa@
    where average(cst_scoreraw) is the mean of the individual, unweighted restraint scores.
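The normalization can be checked numerically. This sketch assumes the logarithm is the natural logarithm, which the text does not specify, and the function names are hypothetical.

```python
import math


def restraint_weight(n_cst, n_aa):
    """Normalisation factor log(#cst) / #cst * #aa (natural log assumed)."""
    return math.log(n_cst) / n_cst * n_aa


def weighted_restraint_score(individual_scores, n_aa):
    """Sum of individual restraint scores multiplied by the normalisation factor."""
    n_cst = len(individual_scores)
    return sum(individual_scores) * restraint_weight(n_cst, n_aa)


# The weighted score equals mean(score) * log(#cst) * #aa.
scores = [-1.0, -2.0, -3.0]
weighted = weighted_restraint_score(scores, n_aa=100)
equivalent = (sum(scores) / len(scores)) * math.log(len(scores)) * 100
```

Dividing by the number of restraints turns the sum into an average, so the restraint contribution scales with protein size and only logarithmically with the number of restraints.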

    2.6. Benchmarking of RosettaTMH in the absence and presence of simulated EPR restraints

    The generation of input files for this benchmark, except for the simulated restraints, is described in the work by Weiner, et al. on BCL::MP-Fold [59]. Briefly, the primary sequence of each protein listed in Table 1 was used to generate 3- and 9-amino acid fragment files required for de novo folding in Rosetta. The Rosetta spanfiles containing the TMH definitions were obtained by using predictions from OCTOPUS [71]. Rosetta lipophilicity files were also generated for each protein using the LIPS algorithm [72]. Five thousand models were folded from the primary sequence, using TMH information and the RosettaMembrane centroid-based scoring function [53]. When multiple EPR restraint sets were used, the number of total models generated per restraint set was equal to the total number of models generated divided by the number of different restraint sets (i.e., 10 sets of 500 models for each protein).

    2.7. Computational details

    All computations were performed using the Vanderbilt University Advanced Computing Cluster for Research and Education (ACCRE) on a combination of AMD Opteron and Intel Nehalem processor nodes or the Vanderbilt University Center for Structural Biology computing cluster on a variety of x86 computing processors.

    2.8. Availability

    The RosettaTMH source code is available in the Rosetta master branch, which is available to developers in the RosettaCommons via https://github.com/RosettaCommons. Rosetta revision numbers d592380 and d7b5a70 were used for RosettaTMH parameter optimization and benchmarking, respectively. The software licenses and the complete protocol capture for this work are available from the RosettaCommons (www.rosettacommons.org), as well as in the Supplemental Information. Information for obtaining software licenses for the BioChemical Library is available at www.meilerlab.org/bclcommons.

    3. Results

    3.1. Rosetta de novo folding benchmarked on 34 helical membrane proteins

    Thirty-four α-helical MPs and MP subunits of known structure were chosen to test the RosettaTMH folding algorithm (Table 1). Nine of these proteins (marked with footnote b in Table 1) were used for the initial testing and parameter optimization of the RosettaTMH protocol. These nine proteins were chosen because we wanted to design a method that would be primarily used for folding larger, complex MPs. While many MPs are oligomers, the present benchmark includes only monomeric MPs or protomeric subunits of oligomeric MPs. Folding as well as EPR labeling strategies need to be adapted when moving to oligomeric MPs, which is a focus of ongoing research but beyond the scope of this initial manuscript.

    3.2. The optimal EPR restraint potential weighs both knowledge-based potential and quadratic penalty equally

    De novo folding of soluble proteins with EPR restraints in Rosetta had been optimized previously [25,26]. However, it was found that, for MPs, a quadratic penalty was needed in addition to the EPR KBP energetic bonus to sufficiently improve conformational sampling of native-like folds where a ‘native-like’ fold is defined as having a RMSD100SSE value smaller than 8 Å. The EPR KBP and the quadratic penalty were weighted equally. The enrichment for folding was 2.93 (Table S2). RMSD100SSE and enrichment are defined in Materials and Methods.

    As expected, the enrichment for de novo folding with EPR restraints was generally lower than for folding with no restraints. This is because the number of false positives, that is, low-scoring, high-RMSD models, was higher when folding with simulated restraints. This is perhaps due to the higher promiscuity of the EPR restraints, which are broader than distance restraints resulting from NMR nuclear Overhauser effects (NOEs). Therefore, lower-scoring models that fulfill the simulated restraints do not always have a native-like fold. This phenomenon is generally observed across all 34 benchmarked MPs as well (Table S3).

    3.3. Addition of EPR restraints significantly improves sampling for RosettaTMH and ExtendedChain

    To assess the overall sampling capability of each folding protocol, we computed the percentage of models having an RMSD100SSE < 8 Å (FractionRMSD<8 Å, Table 2), which serves as a cutoff for determining whether models have the correct fold. We also report the best RMSD100SSE (RMSD1st RMSD) obtained for each method and the mean RMSD100SSE of the five lowest-scoring models (RMSDtop5 score). As was observed with T4-lysozyme [25,26], the addition of EPR restraints increases the likelihood of obtaining the correct MP fold for both RosettaTMH and ExtendedChain. When looking at FractionRMSD<8 Å, followed by RMSD1st RMSD and RMSDtop5 score, RosettaTMH performs better than ExtendedChain for 11 of 34 proteins, including the 6 largest proteins. Further, when compared to other Rosetta MP folding methods, RosettaTMH+EPR obtains the highest percentage of correctly folded models for 3 of the 13 medium-sized proteins, 2 of the 8 large proteins, and 3 of the 6 very large proteins. Interestingly, ExtendedChain+EPR performs best for de novo folding of medium-sized proteins (Table 2).
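
    As a minimal sketch of how the three summary metrics described above could be computed for one protein and one folding protocol, the following Python function takes a list of (score, RMSD100SSE) pairs. The function name, argument names, and dictionary keys are illustrative assumptions and are not part of Rosetta; lower scores are assumed to be better.

```python
from statistics import mean


def summarize_models(models, cutoff=8.0, top_n=5):
    """Summarize a set of folded models.

    models: list of (score, rmsd100sse) tuples; lower score is assumed better.
    cutoff: RMSD100SSE threshold in angstroms for a "correct" fold.
    top_n:  number of lowest-scoring models to average over.
    """
    if not models:
        raise ValueError("at least one model is required")

    rmsds = [rmsd for _, rmsd in models]

    # Percentage of models below the RMSD cutoff (FractionRMSD<8 A).
    fraction = 100.0 * sum(rmsd < cutoff for rmsd in rmsds) / len(rmsds)

    # Lowest RMSD100SSE observed among all models.
    best = min(rmsds)

    # Mean RMSD100SSE of the top_n lowest-scoring models.
    lowest_scoring = sorted(models, key=lambda m: m[0])[:top_n]
    top_mean = mean(rmsd for _, rmsd in lowest_scoring)

    return {
        "fraction_below_cutoff": fraction,
        "best_rmsd": best,
        "top_by_score_mean_rmsd": top_mean,
    }
```

    Sorting by score and by RMSD100SSE are kept separate on purpose: the lowest-scoring models are not necessarily the most accurate ones, which is exactly what the comparison of these metrics is meant to expose.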

    Table 2. Overall performance of de novo folding membrane proteins with Rosetta.
    Metrics (column groups, left to right): RMSDtop5 score, the average RMSD100SSE (Å) to the experimental structure of the top five models by score; RMSD1st RMSD, the lowest RMSD100SSE (Å) to the experimental structure; FractionRMSD<8 Å, the percentage of models with an RMSD100SSE better than 8 Å.
    Columns: PDB, followed by each metric reported for Membrane Ab Initio, Extended Chain, Extended Chain + EPR, Rosetta TMH, and Rosetta TMH + EPR, in that order.
    3SYO11.011.812.312.612.47.17.28.56.48.000000
    2BG96.39.79.112.58.44.04.24.65.74.302330225
    1J4N9.77.88.012.911.35.04.74.46.77.413142801
    2KSF8.09.110.09.410.65.25.75.95.05.821101519
    1PY74.96.14.211.17.72.22.42.55.64.5666758123
    2PNO7.68.77.012.38.43.03.23.37.55.5292147011
    2BL26.15.54.711.57.02.42.52.85.54.0715568039
    2K739.710.210.212.710.95.94.86.87.96.721102
    2ZW313.112.913.016.013.28.79.910.510.19.800000
    1IWG7.39.17.312.98.45.05.54.48.15.916448013
    1RHZ10.110.78.811.611.56.16.66.18.57.7111001
    2YVX9.29.07.914.79.05.96.05.18.26.7512006
    1OCC11.310.410.413.18.46.77.07.27.95.4101017
    4A2N8.49.29.411.210.55.46.26.37.87.241602
    1KPL13.214.411.815.112.710.010.77.911.19.500100
    2BS210.09.710.013.510.16.15.35.88.86.911603
    3P5N9.510.49.512.611.75.36.06.99.59.021200
    2IC89.19.49.111.510.25.35.06.18.77.141501
    1PV610.410.18.111.58.15.76.45.18.05.44118015
    2NR910.210.99.812.211.35.86.96.38.67.420301
    1OKC12.312.513.112.611.99.010.88.99.28.700000
    3B60f9.89.76.313.79.76.36.14.19.26.02138010
    2KSY8.98.46.812.88.53.94.24.57.75.1301528019
    1PY6f8.08.58.212.27.93.95.15.27.85.227915018
    3KCU10.610.39.911.910.86.36.96.49.77.910200
    1FX8f10.711.310.512.410.47.88.37.78.48.000100
    1U19f12.714.611.812.48.79.211.48.48.76.100007
    3KJ614.615.215.515.615.311.512.112.012.412.700000
    3HD6f10.616.012.013.210.97.512.49.110.38.500000
    3GIAf14.125.513.914.212.011.817.29.211.69.500000
    3O0Rf10.124.011.213.010.06.418.67.68.96.520104
    3HFXf13.529.913.513.111.610.022.210.210.68.700000
    2XUT14.325.115.116.215.012.420.212.614.612.900000
    2XQ215.835.916.016.615.713.625.512.914.813.700000
    mean10.313.010.113.010.66.88.76.98.87.51071307
    stddev.2.67.02.913.010.62.85.82.72.22.4181519010

    3.4. De novo folding with RosettaTMH for large proteins

    A representative subset of 7 proteins was chosen from the 34-protein benchmark set for further RMSD100SSE analysis. For each protein and for each folding method, the RMSD100SSE values were sorted from lowest to highest, and the top 5% of models by RMSD100SSE were selected. Next, RMSD100SSE vs. RMSD100SSE plots comparing RosettaTMH and RosettaTMH+EPR with the other Rosetta MP folding methods were generated. This analysis clarifies a few key conclusions concerning RosettaTMH. First, when no EPR restraints are used, MembraneAbinitio and ExtendedChain outperform RosettaTMH for small- or medium-sized MPs. Second, RosettaTMH+EPR samples more lower-RMSD conformations for larger MPs when compared to RosettaTMH and ExtendedChain (distance restraints cannot be used with MembraneAbinitio). Finally, RosettaTMH performance is comparable to MembraneAbinitio and ExtendedChain for 3HFX and 1FX8, respectively, while RosettaTMH+EPR and ExtendedChain+EPR are comparable when folding 1FX8 (Figure 3).
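
    The selection step described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and pairing the two methods by rank after sorting is an assumption about how the RMSD100SSE vs. RMSD100SSE plots were constructed.

```python
def top_fraction_by_rmsd(rmsds, fraction=0.05):
    """Return the lowest `fraction` of RMSD100SSE values, sorted ascending."""
    if not rmsds:
        raise ValueError("at least one RMSD value is required")
    # Keep at least one model even for very small sets.
    keep = max(1, round(len(rmsds) * fraction))
    return sorted(rmsds)[:keep]


def rank_paired(rmsds_a, rmsds_b, fraction=0.05):
    """Pair the best models of two methods by rank (assumed plotting scheme).

    Each returned tuple holds the k-th best RMSD of method A and the
    k-th best RMSD of method B, ready for an A-vs-B scatter plot.
    """
    best_a = top_fraction_by_rmsd(rmsds_a, fraction)
    best_b = top_fraction_by_rmsd(rmsds_b, fraction)
    return list(zip(best_a, best_b))
```

    Points below the diagonal of such a plot would indicate that the method on the vertical axis samples lower-RMSD conformations at the same rank.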

    Figure 3. Sampling performance for de novo folding with RosettaTMH compared to other folding methods. For each panel, the top 5% of models by RMSD100SSE were selected for 7 proteins. A) MembraneAbinitio vs. RosettaTMH, B) Folding from an extended chain vs. RosettaTMH, C) Folding from an extended chain with EPR restraints vs. RosettaTMH with EPR restraints, D) RosettaTMH with EPR restraints vs. RosettaTMH (without EPR restraints).

    3.5. Addition of EPR restraints primarily responsible for improvement seen in RosettaTMH folding

    The MembraneAbinitio folding algorithm was first benchmarked on a dataset of relatively small proteins and performed best with small helical bundles [53]. However, it was found that MPs having more complex folds posed a much more difficult challenge, one which folding with experimental restraints and/or a new sampling algorithm could possibly address. Further, the addition of the RosettaEPR distance restraint potential improves sampling of native-like folds significantly. This appears to be primarily due to the influence of the EPR restraints, as folding with ExtendedChain+EPR also increases sampling efficiency to include the correct fold. In order to test this hypothesis, rhodopsin (PDB ID: 1U19 [73]) was selected for an in-depth analysis of the relationship between the number of EPR restraints and overall conformational sampling ability.

    Rhodopsin was folded using all of the Rosetta methods listed in Table 2. However, for RosettaTMH+EPR and ExtendedChain+EPR, multiple sets of models were generated based on whether 0, 10, 20, 40, 80, or 160 simulated EPR distance restraints were used. Unlike in the 34-protein benchmark, only one EPR restraint set for each scenario was generated, and 1,000 models were folded for each case. Box-and-whisker plots of the resulting RMSD100SSE distributions are displayed in Figure 4. When no restraints are used, MembraneAbinitio and folding from an extended chain perform similarly, while RosettaTMH generally appears to generate lower-RMSD models. When using 10 restraints, RosettaTMH+EPR and ExtendedChain+EPR exhibit similar median RMSD100SSE values, but RosettaTMH+EPR samples a wider range of conformations. However, when 20 or more restraints are used, RosettaTMH+EPR is consistently better in sampling the correct fold. As expected, the number of outliers correlates inversely with the number of restraints.

    Figure 4. Sampling performance of RosettaTMH with various EPR restraint set sizes for folding rhodopsin. Box-and-whisker plot indicating the breadth of model accuracy obtained for folding rhodopsin with RosettaTMH with 10, 20, 40, 80 and 160 simulated EPR distance restraints. The thick line indicates the median RMSD100SSE obtained, while the boxes indicate the interquartile range. The highest and lowest RMSD100SSE values, excluding outliers, are indicated by the “whiskers,” and outliers are shown as open circles.
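
    For readers who want to reproduce the quantities shown in such a box-and-whisker plot, a minimal Python sketch follows. The caption does not state how outliers were defined, so the common convention of 1.5 times the interquartile range is assumed here; the function name and the quartile interpolation method are likewise assumptions.

```python
from statistics import median, quantiles


def box_stats(values, whisker=1.5):
    """Compute box-and-whisker statistics for a list of RMSD100SSE values.

    Outliers are assumed to lie more than `whisker` * IQR outside the box.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("at least two values are required")

    # Quartiles with linear interpolation between data points.
    q1, _, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr

    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]

    return {
        "median": median(data),
        "q1": q1,
        "q3": q3,
        "whisker_low": inside[0],    # lowest value that is not an outlier
        "whisker_high": inside[-1],  # highest value that is not an outlier
        "outliers": outliers,
    }
```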

    3.6. Detailed analysis of individual de novo folding stages indicates rigid body sampling is not necessary

    In addition to studying the overall performance of RosettaTMH with and without EPR restraints, the ability of the protocol to sample MP folds during each stage of folding (see Figure 2) was also analyzed. As with the above experiment, rhodopsin was chosen as an example protein, and only one set of 41 optimally weighted restraints was used. For each folding method, 1,000 individual trajectories were run, and the conformations before folding began and after each stage of folding were output. Then, as in Figure 4, the RMSD100SSE distributions for each folding stage were plotted. The first step in the Rosetta MembraneAbinitio folding algorithm (Stage 0) samples a single, high-scoring, extended-chain conformation. For MembraneAbinitio and ExtendedChain, model quality significantly improves from initiation to Stage 1 and then from Stage 1 to Stage 2. In contrast, RosettaTMH-generated model accuracy decreases during Stage 1 of folding. That is, the rigid body sampling causes the quality of rhodopsin models to decrease. The RMSD100SSE values do not improve significantly for Stages 2-4 when no restraints are used. When EPR restraints are used, the models’ accuracy improves from Stage 1 to Stage 2 but does not change significantly after Stage 2. This was also observed for ExtendedChain+EPR.

    3.7. RosettaTMH-generated models exhibit large inter-helical distances

    RosettaTMH assembles MP folds by breaking up the proteins into individual TMHs and allowing the helices to move as independent rigid bodies. The resulting arrangements could produce distances between subsequent SSEs that cannot be spanned by a loop. To determine whether the SSEs can be connected by loops, the Euclidean distance between subsequent SSEs was measured for all 34 native MPs, as well as for the 5,000 models for all 34 benchmark MPs folded with and without EPR restraints. For each protein, the percentage of models in which all loops could be theoretically or realistically closed was determined. The maximum Euclidean distance theoretically possible (“Maximum Possible”) was computed as:

    $$\text{Maximum Possible} = 3.8 \times \left( \text{LoopLength} - 1 \right)$$
    where LoopLength refers to the number of residues in the inter-helix loop. These percentages were then grouped into four categories based on protein size (see Table 1) and whether or not EPR restraints were used during de novo folding. The resulting percentages of models having closeable loops are summarized in box-and-whisker plots in Figure S1.
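
    The “Maximum Possible” check can be sketched in Python as shown below. The formula is taken directly from the text; the function names, the data layout, and the choice of measuring the distance between the end of one SSE and the start of the next are assumptions made for illustration only.

```python
SPAN_PER_RESIDUE = 3.8  # angstroms per residue step, as in the formula above


def max_possible_span(loop_length):
    """Maximum Euclidean distance (angstroms) a loop of `loop_length` residues can span."""
    if loop_length < 1:
        raise ValueError("loop_length must be at least 1")
    return SPAN_PER_RESIDUE * (loop_length - 1)


def all_loops_closeable(loops):
    """Check one model.

    loops: iterable of (loop_length, distance) pairs, one per inter-helix loop,
    where distance is the measured Euclidean distance in angstroms.
    """
    return all(distance <= max_possible_span(n) for n, distance in loops)


def percent_closeable(models):
    """Percentage of models in which every loop can theoretically be closed."""
    if not models:
        raise ValueError("at least one model is required")
    closeable = sum(all_loops_closeable(loops) for loops in models)
    return 100.0 * closeable / len(models)
```

    A model counts as closeable only if every one of its loops passes the check, which mirrors the per-model criterion described above.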

    Generally speaking, RosettaTMH, either with or without EPR restraints, fails to accurately reflect the dependence of Euclidean distance on loop length. Indeed, every protein had at least five models in which all inter-helix distances can theoretically be spanned by a loop, but only a small percentage, if any, exhibit native-like inter-helix distances. Unsurprisingly, the addition of EPR restraints did improve the likelihood of generating models with closeable loops in the majority of cases. However, for the very large MPs, no models exhibited native-like loop distances, and of the large MPs, only 0.04% of 1FX8 models had this characteristic, even when restraints were used (Table S4).

    3.8. Guidelines for choosing a Rosetta membrane protein modeling protocol

    Based on the analysis presented in this work, we have developed an approximate guide for choosing which Rosetta MP modeling protocol to use based on the protein of interest (Figure S2). First, de novo folding should ideally be used only when no structure is available for a protein with ≥ 30% sequence homology. Next, Rosetta is currently capable of de novo folding only primarily helical MPs. If the protein of interest has fewer than 200 residues, we recommend folding with MembraneAbinitio (no restraints) or from an extended chain (with restraints). However, if the protein is large and relatively complex, MembraneAbinitio is recommended. Finally, RosettaTMH would be suitable for folding large MPs if experimental restraints, such as those from EPR or NMR, are available.
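
    The guidelines above can be condensed into a small decision function. This is a hedged sketch rather than an official Rosetta utility: the function and argument names are hypothetical, and treating every protein with 200 or more residues as “large and relatively complex” is a simplifying assumption, since the text gives no numeric threshold for that category.

```python
def recommend_protocol(has_homolog_structure, primarily_helical,
                       n_residues, has_restraints, small_cutoff=200):
    """Suggest a Rosetta MP modeling protocol following the guidelines above.

    has_homolog_structure: True if a structure with >= 30% sequence homology exists.
    primarily_helical:     True if the target is a primarily helical MP.
    n_residues:            length of the protein of interest.
    has_restraints:        True if experimental restraints (e.g., EPR, NMR) exist.
    small_cutoff:          residue count below which the protein is treated as
                           small (200 in the text; larger is assumed "large").
    """
    if has_homolog_structure:
        return "de novo folding not advised: homologous structure available"
    if not primarily_helical:
        return "not supported: de novo folding is limited to primarily helical MPs"
    if n_residues < small_cutoff:
        # Small proteins: extended chain with restraints, MembraneAbinitio without.
        return "ExtendedChain with restraints" if has_restraints else "MembraneAbinitio"
    # Large proteins: RosettaTMH only when restraints are available.
    return "RosettaTMH with restraints" if has_restraints else "MembraneAbinitio"
```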

    4. Discussion

    We introduce a new de novo folding algorithm for MPs. This initial implementation is already on par with other methods for folding large MPs within Rosetta, and it has two advantages. First, experimental restraints can be incorporated, a feature that earlier folding protocols lacked; we illustrate this feature using simulated EPR distance restraints. Second, the method has a very short run time and an observed tendency to work better for large MPs.

    4.1. EPR restraints significantly assist in obtaining models with the correct fold

    The results in Table 2 and Figures 3-5 indicate that, for large and very large MPs, the conformational search space of MP structures must be limited in order to obtain de novo-folded models with native-like folds. The MembraneAbinitio protocol attempts to accomplish this by folding MPs “from the inside out.” That is, a TMH in the middle of the protein sequence is inserted into the implicit membrane environment first. Next, the TMHs either N- or C-terminal to the initially inserted TMH are folded into the membrane via fragment-based assembly, beginning with the TMH adjacent to the starting TMH. Then, the TMHs on the other side (in terms of sequence) are folded in the same manner [53].

    Figure 5. Sampling performance of various Rosetta methods during each stage of de novo folding using rhodopsin as an example. Box-and-whisker plot indicating the breadth of model accuracy obtained during each stage of folding with MembraneAbinitio, folding from an extended chain with and without EPR restraints, and folding with RosettaTMH with and without restraints. The thick line indicates the median RMSD100SSE obtained, while the boxes indicate the interquartile range. The highest and lowest RMSD100SSE values, excluding outliers, are indicated by the “whiskers,” and outliers are shown as open circles.

    MembraneAbinitio was able to generate models with RMSD100SSE < 8 Å for 26 of the 34 proteins tested. For the remaining 8 cases, in which no correctly folded models were obtained, the addition of EPR restraints did enable other folding protocols to do so (Table 2). Indeed, for the majority of benchmarked proteins, the MembraneAbinitio protocol performs better than either RosettaTMH or folding from an extended chain when EPR restraints are not used. However, when EPR restraints were used, the additional restraints often resulted in more models having the correct fold. This is important because MembraneAbinitio, unlike RosettaTMH, cannot take EPR restraints into account.

    Therefore, for MPs of more than 4 TMHs and 145 residues, it is advantageous to include structural restraints, such as those available from NMR, EPR, etc. If one does employ such restraints, the traditional folding method, ExtendedChain, appears to be better suited for medium-sized MPs. On the other hand, when looking at MsbA (3B60), rhodopsin (1U19), and nitric oxide reductase (3O0R), RosettaTMH shows promise for de novo folding larger MPs, such as GPCRs, channels, and transporters (Table 2).

    4.2. Optimization of RosettaTMH folding protocol may lead to further improvement

    Even though Rosetta is now capable of folding MPs that have the correct fold and is sometimes able to recover intra-helical features, these models are not yet accurate enough to be used as the input to full-atom refinement using the RosettaMembrane all-atom scoring functions [54]. Typically, models of approximately 2.0 Å RMSD100SSE relative to the native structure are required in order to successfully obtain atomic detail information [74].

    Based on the information in Figure 5, one next step in protocol optimization would be to forego the rigid body sampling in Stage 1 of RosettaTMH folding. It is expected that the initial set of rigid body transformations results in less viable MP conformations (e.g., TMHs out of the membrane, lying close to orthogonal to the membrane normal, or too far apart in 3D space). The fragment insertions in Stages 2-4 are then not able to recover the correct fold. This is supported by the lowest-RMSD models displayed in Figure 6 and the data in Table S4, which show that there is a general lack of inter-helical packing and native-like placement that is not remedied by fragment insertions. Not surprisingly, the addition of EPR restraints assists in improving packing and in the recovery of helical features (Figure 6).

    Figure 6. Most accurate model resulting from RosettaTMH folding for six proteins. The most accurate models obtained from folding with RosettaTMH without EPR restraints (left model) and with EPR restraints (right model) are colored in rainbow. The native structures are colored in gray. The RMSD100SSE of the model compared to native is reported in angstroms.

    4.3. Implementation of loop closure filter and knowledge-based potential for de novo folding with RosettaTMH could improve inter-helix packing

    In order to create a radial fold tree for each model, the original simple fold tree must be “cut” to maintain the data structure’s acyclic nature. For folding with RosettaTMH, these cutpoints are chosen within the MP loops (Figure 1). However, now that the TMHs can move independently from one another, another external force must be applied to keep the TMHs in relatively close proximity, as they will otherwise drift apart and not exhibit native-like packing (Figure S2 and 6, Table S4). One approach is to integrate a loophash filter, which would ensure that TMHs that would normally be connected by a loop remain close enough in Cartesian space such that the inter-helical loop can be successfully rebuilt at a later stage.

    The loophash filter is based on work published by Tyka, Jung, and Baker [75]. In the protocol introduced by the authors, the loophash algorithm allows for extremely fast rebuilding of protein segments by rapidly determining if a loop of a given sequence length can span the distance defined by two endpoints. A hash lookup table is generated for a loop of a given sequence length, and the hashes in the table refer to specific protein segments found in a database of non-homologous proteins of known structure. In addition to the loophash, or loop closure, filter, the implementation of a loop distance KBP, such as that used by BCL::Fold [76,77], could also be beneficial. The loop closure filter would assist in ruling out models where TMHs could not theoretically be connected, while the loop distance KBP would provide an energetic incentive to place TMHs in more native-like conformations.

    4.4. Increased sampling may be needed in order to better observe RosettaTMH’s performance

    The RosettaTMH folding protocol appears to be a more rapid means of folding MPs than MembraneAbinitio and fragment-based assembly alone (Figure S3). This is probably a result of the lack of fragment insertions, and thus recalculation of torsion angles, during the first stage of folding. However, this decreased amount of fragment insertion may be the cause of the generation of lower-quality models. In any case, the significant speedup in model production allows for the generation of many more models. This increased sampling speed may be beneficial for obtaining higher quality models of large MPs when using a more optimized RosettaTMH protocol.

    5. Conclusion

    RosettaTMH is a new de novo folding protocol that assembles MP folds from the rigid body movements of TMHs, followed by peptide fragment insertions. This approach, along with the significantly decreased time required to fold models, allows for increased sampling of conformational space, which is important for the structure prediction of more complex proteins, such as GPCRs, transporters, and channels. RosettaTMH, unlike MembraneAbinitio, allows for the folding of MPs with experimental restraints. Further, while the new folding protocol alone improves sampling, the addition of experimental restraints may be necessary to obtain native-like folds, which is especially important for structure determination of MPs for which there is no structural template.

    Author Contributions

    J.M. and S.H.D. conceived the RosettaTMH modeling protocol for use within Rosetta. S.H.D. was primarily responsible for its implementation and testing. S.H.D. created the first version of this manuscript, including tables and figures. S.L.D. was involved in the implementation of RosettaTMH within the Rosetta framework, as well as in merging the RosettaTMH source code with the publicly released version of Rosetta. A.L.F. performed thorough testing of the RosettaTMH Protocol Capture and provided feedback to S.H.D. J.M. supervised the project and finalized the manuscript.

    Acknowledgements

    The authors would like to thank Drs. Frank DiMaio, Steven Lewis, and other members of the RosettaCommons for their assistance in the development of RosettaTMH. Axel Fischer was also very helpful in providing protocols on simulating EPR distance restraints in the BCL, and Dr. Brian Weiner provided many of the Rosetta-ready input files for the benchmark set.

    Supporting Information

    Table S1: Percentage of correctly folded models obtained for folding 1,000 models of 9 membrane proteins with RosettaTMH using a variety of restraint score weighting schemes

    Table S2: Enrichment obtained for folding 1,000 models of 9 membrane proteins with RosettaTMH using a variety of restraint score weighting schemes

    Table S3: Enrichment obtained for folding 5,000 models of 34 membrane proteins with and without simulated EPR distance restraints

    Table S4: RosettaTMH ability to generate models with loops that can or are likely to be closeable

    Figure S1: Percentage of models with closeable loops generated by RosettaTMH

    Figure S2: General guidelines for membrane protein modeling in Rosetta

    Figure S3: Average time required for de novo folding

    Protocol: Protocol Capture for the work presented



    [1] D. Burns, J. George, D. Aucoin, J. Bower, N. Bower, The pathogenesis and clinical management of cutaneous melanoma: an evidence-based review, J. Med. Imaging Radiat. Sci., 50 (2019), 460-469.
    [2] R. L. Siegel, K. D. Miller, A. Jemal, Cancer statistics, CA. Cancer J. Clin., 70 (2020), 7-30.
    [3] T. Crosby, R. Fish, B. Coles, M. Mason, Systemic treatments for metastatic cutaneous melanoma, Cochrane Database Syst. Rev., 2 (2018), CD001215.
    [4] L. C. van Kempen, M. Redpath, C. Robert, A. Spatz, Molecular pathology of cutaneous melanoma, Melanoma Manag., 1 (2014), 151-164. doi: 10.2217/mmt.14.23
    [5] C. Lugassy, S. Zadran, L. A. Bentolila, M. Wadehra, R. Prakash, S. T. Carmichael, et al., Angiotropism, pericytic mimicry and extravascular migratory metastasis in melanoma: an alternative to intravascular cancer dissemination, Cancer Microenviron., 7 (2014), 139-152. doi: 10.1007/s12307-014-0156-4
    [6] S. L. V. Es, M. Colman, J. F. Thompson, S. W. McCarthy, R. A. Scolyer, Angiotropism is an independent predictor of local recurrence and in-transit metastasis in primary cutaneous melanoma, Am. J. Surg. Pathol., 32 (2008), 1396-1403. doi: 10.1097/PAS.0b013e3181753a8e
    [7] L. Mervic, Time course and pattern of metastasis of cutaneous melanoma differ between men and women, PLoS One, 7 (2012), e32955.
    [8] N. R. Adler, A. Haydon, C. A. McLean, J. W. Kelly, V. J. Mar, Metastatic pathways in patients with cutaneous melanoma, Pigment Cell Melanoma Res., 30 (2017), 13-27. doi: 10.1111/pcmr.12544
    [9] I. J. Fiddler, Melanoma metastasis, Cancer Control, 2 (1995), 398-404.
    [10] C. Haqq, M. Nosrati, D. Sudilovsky, J. Crothers, D. Khodabakhsh, B. L. Pulliam, et al., The gene expression signatures of melanoma progression, Proc. Natl. Acad. Sci. U. S. A., 102 (2005), 6092-6097. doi: 10.1073/pnas.0501564102
    [11] S. Mandruzzato, A. Callegaro, G. Turcatel, S. Francescato, M. C. Montesco, V. Chiarion-Sileni, et al., A gene expression signature associated with survival in metastatic melanoma, J. Transl. Med., 4 (2006), 1479-5876.
    [12] B. Huang, W. Han, Z. F. Sheng, G. L. Shen, Identification of immune-related biomarkers associated with tumorigenesis and prognosis in cutaneous melanoma patients, Cancer Cell Int., 20 (2020), 020-01271. doi: 10.1186/s12935-020-1101-x
    [13] M. Liao, F. Zeng, Y. Li, Q. Gao, M. Yin, G. Deng, et al., A novel predictive model incorporating immune-related gene signatures for overall survival in melanoma patients, Sci. Rep., 10 (2020), 12462. doi: 10.1038/s41598-020-69330-2
    [14] O. Kabbarah, C. Nogueira, B. Feng, R. M. Nazarian, M. Bosenberg, M. Wu, et al., Integrative genome comparison of primary and metastatic melanomas, PLoS One, 5 (2010), 0010770. doi: 10.1371/journal.pone.0010770
    [15] A. I. Riker, S. A. Enkemann, O. Fodstad, S. Liu, S. Ren, C. Morris, et al., The gene expression profiles of primary and metastatic melanoma yields a transition point of tumor progression and metastasis, BMC Med. Genomics, 1 (2008), 1755-8794.
    [16] H. Cirenajwis, H. Ekedahl, M. Lauss, K. Harbst, A. Carneiro, Molecular stratification of metastatic melanoma using gene expression profiling: Prediction of survival outcome and benefit from molecular targeted therapy, Oncotarget, 6 (2015), 12297-12309. doi: 10.18632/oncotarget.3655
    [17] R. Cabrita, M. Lauss, A. Sanna, M. Donia, G. Jönsson, Tertiary lymphoid structures improve immunotherapy and survival in melanoma, Nature, 577 (2020), 561-565. doi: 10.1038/s41586-019-1914-8
    [18] V. Nicolaidou, C. Papaneophytou, C. Koufaris, Detection and characterisation of novel alternative splicing variants of the mitochondrial folate enzyme MTHFD2, Mol. Biol. Rep., 47 (2020), 1-8.
    [19] C. Qi, L. Hong, Z. Cheng, Q. Yin, Identification of metastasis-associated genes in colorectal cancer using metaDE and survival analysis, Oncol. Lett., 11 (2015), 568-574.
    [20] X. Wang, D. D. Kang, K. Shen, C. Song, S. Lu, L. C. Chang, et al., An R package suite for microarray meta-analysis in quality control, differentially expressed gene analysis and pathway enrichment detection, Bioinformatics, 28 (2012), 2534-2536. doi: 10.1093/bioinformatics/bts485
    [21] X. Zhai, Q. Xue, Q. Liu, Y. Guo, Z. Chen, Colon cancer recurrence-associated genes revealed by WGCNA coexpression network analysis, Mol. Med. Rep., 16 (2017), 6499-6505. doi: 10.3892/mmr.2017.7412
    [22] P. Langfelder, S. Horvath, WGCNA: an R package for weighted correlation network analysis, BMC Bioinf., 9 (2008), 1471-2105.
    [23] J. Cao, S. Zhang, A Bayesian extension of the hypergeometric test for functional enrichment analysis, Biometrics, 70 (2014), 84-94. doi: 10.1111/biom.12122
    [24] P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, et al., Cytoscape: a software environment for integrated models of biomolecular interaction networks, Genome Res., 13 (2003), 2498-2504. doi: 10.1101/gr.1239303
    [25] D. W. Huang, B. T. Sherman, R. A. Lempicki, Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources, Nat. Protoc., 4 (2009), 44-57. doi: 10.1038/nprot.2008.211
    [26] P. Wang, Y. Wang, B. Hang, X. Zou, J. H. Mao, A novel gene expression-based prognostic scoring system to predict survival in gastric cancer, Oncotarget, 7 (2016), 55343-55351. doi: 10.18632/oncotarget.10533
    [27] R. Tibshirani, The lasso method for variable selection in the Cox model, Stat. Med., 16 (1997), 385-395. doi: 10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
    [28] J. J. Goeman, L1 penalized estimation in the Cox proportional hazards model, Biom. J., 52 (2010), 70-84.
    [29] K. H. Eng, E. Schiller, K. Morrel, On representing the prognostic value of continuous gene expression biomarkers with the restricted mean survival curve, Oncotarget, 6 (2015), 36308-36318. doi: 10.18632/oncotarget.6121
    [30] W. Liang, L. Zhang, G. Jiang, Q. Wang, J. He, Development and validation of a nomogram for predicting survival in patients with resected non-small-cell lung cancer, J. Clin. Oncol., 33 (2015), 861-869. doi: 10.1200/JCO.2014.56.6661
    [31] C. Zhang, F. Wang, F. Guo, C. Ye, B. Yang, A 13-gene risk score system and a nomogram survival model for predicting the prognosis of clear cell renal cell carcinoma, Urol. Oncol., 38 (2020), 74.e1-74.e11.
    [32] A. Subramanian, P. Tamayo, V. K. Mootha, S. Mukherjee, B. L. Ebert, M. A. Gillette, et al., Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, Proc. Natl. Acad. Sci. U. S. A., 102 (2005), 15545-15550. doi: 10.1073/pnas.0506580102
    [33] X. Zhang, L. Yang, P. Szeto, G. K. Abali, Y. Zhang, A. Kulkarni, et al., The Hippo pathway oncoprotein YAP promotes melanoma cell invasion and spontaneous metastasis, Oncogene, 39 (2020), 5267-5281. doi: 10.1038/s41388-020-1362-9
    [34] Z. Kozovska, V. Gabrisova, L. Kucerova, Malignant melanoma: diagnosis, treatment and cancer stem cells, Neoplasma, 63 (2016), 510-517.
    [35] H. Moon, L. R. Donahue, E. Choi, P. O. Scumpia, W. E. Lowry, J. K. Grenier, et al., Melanocyte Stem Cell Activation and Translocation Initiate Cutaneous Melanoma in Response to UV Exposure, Cell Stem Cell, 21 (2017), 665-678. doi: 10.1016/j.stem.2017.09.001
    [36] E. Seroussi, D. Kedra, H. Q. Pan, M. Peyrard, C. Schwartz, P. Scambler, et al., Duplications on human chromosome 22 reveal a novel Ret Finger Protein-like gene family with sense and endogenous antisense transcripts, Genome Res., 9 (1999), 803-814. doi: 10.1101/gr.9.9.803
    [37] J. Bonnefont, T. Laforge, O. Plastre, B. Beck, S. Sorce, C. Dehay, et al., Primate-specific RFPL1 gene controls cell-cycle progression through cyclin B1/Cdc2 degradation, Cell Death Differ., 18 (2011), 293-303. doi: 10.1038/cdd.2010.102
    [38] X. Zhang, S. Sun, J. K. Pu, A. C. Tsang, D. Lee, V. O. Man, et al., Long non-coding RNA expression profiles predict clinical phenotypes in glioma, Neurobiol. Dis., 48 (2012), 1-8. doi: 10.1016/j.nbd.2012.06.004
    [39] M. Toss, I. Miligy, K. Gorringe, K. Mittal, R. Aneja, I. Ellis, et al., Prognostic significance of cathepsin V (CTSV/CTSL2) in breast ductal carcinoma in situ, J. Clin. Pathol., 73 (2020), 76-82. doi: 10.1136/jclinpath-2019-205939
    [40] C.-L. Lin, T.-W. Hung, T.-H. Ying, C.-J. Lin, Y.-H. Hsieh, C.-M. Chen, Praeruptorin B mitigates the metastatic ability of human renal carcinoma cells through targeting CTSC and CTSV expression, Int. J. Mol. Sci., 21 (2020), 2919. doi: 10.3390/ijms21082919
    [41] Q. L. Liu, Q. L. Liang, Z. Y. Li, Y. Zhou, W. T. Ou, Z. G. Huang, Function and expression of prolyl hydroxylase 3 in cancers, Arch. Med. Sci., 9 (2013), 589-593.
    [42] N. Pescador, Y. Cuevas, S. Naranjo, M. Alcaide, D. Villar, M. O. Landázuri, et al., Identification of a functional hypoxia-responsive element that regulates the expression of the egl nine homologue 3 (egln3/phd3) gene, Biochem. J., 390 (2005), 189-197. doi: 10.1042/BJ20042121
    [43] J. Rodriguez, A. Herrero, S. Li, N. Rauch, A. Quintanilla, K. Wynne, et al., PHD3 regulates p53 protein stability by hydroxylating proline 359, Cell Rep., 24 (2018), 1316-1329. doi: 10.1016/j.celrep.2018.06.108
    [44] J. M. Roda, Y. Wang, L. A. Sumner, G. S. Phillips, C. B. Marsh, T. D. Eubank, Stabilization of HIF-2α induces sVEGFR-1 production from tumor-associated macrophages and decreases tumor growth in a murine melanoma model, J. Immunol., 189 (2012), 3168-3177. doi: 10.4049/jimmunol.1103817
    [45] A. Reustle, M. Di Marco, C. Meyerhoff, A. Nelde, J. S. Walz, S. Winter, et al., Integrative -omics and HLA-ligandomics analysis to identify novel drug targets for ccRCC immunotherapy, Genome Med., 12 (2020), 32-32. doi: 10.1186/s13073-020-00731-8
    [46] Y. Wang, X. Li, W. Liu, B. Li, D. Chen, F. Hu, et al., MicroRNA-1205, encoded on chromosome 8q24, targets EGLN3 to induce cell growth and contributes to risk of castration-resistant prostate cancer, Oncogene, 38 (2019), 4820-4834. doi: 10.1038/s41388-019-0760-3
    [47] S. Li, J. Rodriguez, W. Li, P. Bullova, S. M. Fell, O. Surova, et al., EglN3 hydroxylase stabilizes BIM-EL linking VHL type 2C mutations to pheochromocytoma pathogenesis and chemotherapy resistance, Proc. Natl. Acad. Sci. U. S. A. , 116 (2019), 16997-17006. doi: 10.1073/pnas.1900748116
    [48] T. W. Bebee, J. W. Park, K. I. Sheridan, C. C. Warzecha, B. W. Cieply, A. M. Rohacek, et al., The splicing regulators Esrp1 and Esrp2 direct an epithelial splicing program essential for mammalian development, Elife, 15 (2015), 08954.
    [49] K. Horiguchi, K. Sakamoto, D. Koinuma, K. Semba, A. Inoue, S. Inoue, et al., TGF-β drives epithelial-mesenchymal transition through δEF1-mediated downregulation of ESRP, Oncogene, 31 (2012), 3190-3201. doi: 10.1038/onc.2011.493
    [50] J. Ueda, Y. Matsuda, K. Yamahatsu, E. Uchida, Z. Naito, M. Korc, et al., Epithelial splicing regulatory protein 1 is a favorable prognostic factor in pancreatic cancer that attenuates pancreatic metastases, Oncogene, 33 (2014), 4485-4495. doi: 10.1038/onc.2013.392
    [51] B. Wang, Y. Li, C. Kou, J. Sun, X. Xu, Mining database for the clinical significance and prognostic value of ESRP1 in cutaneous malignant melanoma, Biomed. Res. Int. , 5 (2020), 4985014.
    [52] A. Sawant, J. A. Hensel, D. Chanda, B. A. Harris, G. P. Siegal, A. Maheshwari, et al., Depletion of plasmacytoid dendritic cells inhibits tumor growth and prevents bone metastasis of breast cancer cells, J. Immunol. , 189 (2012), 4258-4265. doi: 10.4049/jimmunol.1101855
    [53] A. E. Boyce, J. A. McGrath, T. Techanukul, D. F. Murrell, C. W. Chow, L. McGregor, et al., Ectodermal dysplasia-skin fragility syndrome due to a new homozygous internal deletion mutation in the PKP1 gene, Australas. J. Dermatol. , 53 (2012), 61-65. doi: 10.1111/j.1440-0960.2011.00846.x
    [54] I. Hofmann, Plakophilins and their roles in diseased states, Cell Tissue Res. , 379 (2020), 5-12. doi: 10.1007/s00441-019-03153-0
    [55] P. Lee, S. Jiang, Y. Li, J. Yue, X. Gou, S. Y. Chen, et al., Phosphorylation of Pkp1 by RIPK4 regulates epidermal differentiation and skin tumorigenesis, Embo. J. , 36 (2017), 1963-1980. doi: 10.15252/embj.201695679
    [56] Y. Bao, Y. Guo, Y. Yang, X. Wei, S. Zhang, Y. Zhang, et al., PRSS8 suppresses colorectal carcinogenesis and metastasis, Oncogene, 38 (2019), 497-517. doi: 10.1038/s41388-018-0453-3
    [57] Y. Bao, Q. Wang, Y. Guo, Z. Chen, K. Li, Y. Yang, et al., PRSS8 methylation and its significance in esophageal squamous cell carcinoma, Oncotarget, 7 (2016), 28540-28555. doi: 10.18632/oncotarget.8677
    [58] A. Tamir, A. Gangadharan, S. Balwani, T. Tanaka, U. Patel, A. Hassan, et al., The serine protease prostasin (PRSS8) is a potential biomarker for early detection of ovarian cancer, J. Ovarian Res. , 9 (2016), 016-0228. doi: 10.1186/s13048-016-0226-y
    [59] A. Maurichi, R. Miceli, H. Eriksson, J. Newton-Bishop, J. Nsengimana, M. Chan, et al., Factors affecting sentinel node metastasis in thin (T1) cutaneous melanomas: development and external validation of a predictive nomogram, J. Clin. Oncol. , 38 (2020), 1591-1601. doi: 10.1200/JCO.19.01902
    [60] B. Hu, Q. Wei, C. Zhou, M. Ju, L. Wang, L. Chen, et al., Analysis of immune subtypes based on immunogenomic profiling identifies prognostic signature for cutaneous melanoma, Int. Immunopharmacol. , 6 (2020), 107162.
  • © 2021 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)