Achieving the highest efficiency in real-life photovoltaic power generation systems requires modeling, optimizing, and controlling them, which remains a challenge. These systems are dominated by photovoltaic models, whose performance depends on unknown parameters. However, the modeling equation of a photovoltaic model is nonlinear, which makes parameter extraction difficult. To extract the parameters of the photovoltaic model more accurately and efficiently, a chaotic self-adaptive JAYA algorithm, called AHJAYA, is proposed, which introduces several improvement strategies. First, self-adaptive coefficients are introduced to change the priority of information from the best search agent and the worst search agent. Second, combining the linear population reduction strategy with the chaotic opposition-based learning strategy improves the convergence speed of the algorithm and helps it avoid falling into local optima. To verify the performance of AHJAYA, four photovoltaic models are selected. The experimental results show that the proposed AHJAYA has superior performance and strong competitiveness.
Citation: Juan Zhao, Yujun Zhang, Shuijia Li, Yufei Wang, Yuxin Yan, Zhengming Gao. A chaotic self-adaptive JAYA algorithm for parameter extraction of photovoltaic models[J]. Mathematical Biosciences and Engineering, 2022, 19(6): 5638-5670. doi: 10.3934/mbe.2022264
Traditionally, gene expression analysis includes reverse transcription of mRNA into cDNA and probing of gene transcripts of interest by specific primers designed for target PCR amplification (gold standard), followed by quantitative, semi-quantitative (e.g. qRT-PCR), or electrophoresis (e.g. Southern blotting) detection methods. Based on efforts provided by the Human Genome Project [1,2] and studies on expressed sequence tags (ESTs) in mammalian genomes, cDNA hybridization array chips were originally designed to investigate deregulated mRNA expression of distinct and well-characterized gene transcripts in various diseases. Modern mRNA-microarray platforms apply one- or two-color fluorescence labeling (i.e. Cyanine3 / Cy3 for green and Cy5 for red dye fluorescence) for one or two samples to be loaded on the chip, respectively, and allow the detection of more than 47 000 transcripts. In contrast to two-color arrays (e.g. HuA1 by Agilent Technologies, Santa Clara, CA, USA), one-color arrays are most commonly used today (e.g. HG-U133 Plus 2.0 by Affymetrix, Inc., Santa Clara, CA, USA, or BeadArray HT-12v4, Illumina, Inc., San Diego, CA, USA) and represent the focus of this review.
The past few years have seen the advent of transcriptome sequencing (RNA-seq) based on next-generation sequencing (NGS) technology using high-throughput platforms, such as the GA IIx or HiSeq2000 sequencer from Illumina. RNA-seq does not require the prior design of specific probes, rendering it a highly versatile approach for gene expression profiling (GEP). Accordingly, a number of publications on the genomic landscape of various neoplasms have applied RNA-seq to investigate gene-specific aspects such as differential splicing and exon usage [3], hidden viral transcripts [4], and cancer-specific fusion transcripts [5]. However, published reports using RNA-seq in cancer often lack statistical power for comprehensive gene expression analyses due to a limited sample size. In contrast, mRNA-based microarrays have remained the initial method of choice for high-throughput analyses of gene expression in many laboratories. Reasons for this include the associated lower per-sample costs as well as the availability of already published microarray-derived GEP data in public databases. Many of these data sets were processed by established and widely used methods, thereby improving their comparability and their suitability for data-mining approaches.
Within this review, we present an overview of critical steps in the analysis of microarray-based GEP data (see overview in Figure 1) and the corresponding library and code information (summarized in Tables 1 and 2). We discuss step-by-step software and database query solutions that may be useful for data analysis, help to avoid analytic pitfalls, and provide an increased capability for clinical and biological interpretation of data. To illustrate the proposed analytic steps, we present analyses on exemplary data from previously published and our own GEP data sets, all obtained in patients with B- and T-cell leukemias or lymphomas.
There are various possibilities to apply basic steps of quality control (QC) prior to or during preprocessing of GEP raw data. In order to avoid false estimates of background intensities and false inputs for normalization, removal of potential problematic samples and probes before data preprocessing is essential towards a correct interpretation of data. Problematic samples often present as outliers in density distributions or in an unsupervised cluster analysis on global gene expression values (after data preprocessing). The latter, e.g. in form of dendrograms (Code 1) or principal component analyses (PCA; Code 2), is created by using the R [7] library arrayQualityMetrics from Bioconductor [8] with its informative HTML report per array.
Numerous methods and libraries for R are available for more specific quality assessments for each of the three major microarray platforms. Affymetrix arrays can be analyzed using the affyQCReport and simpleaffy libraries (see Table 1 for all library references), which normalize expression values using housekeeping genes (e.g. calculating the actin3/actin5 ratio), while the affyPLM library allows calculation of important quality measures such as the normalized unscaled standard error (NUSE) and relative log expression (RLE) as well as their plotting across samples (Code 3). The quality of data obtained with Illumina chips can be assessed by statistical standard measurements (mean and standard deviation) or outlier detection using the lumiQ function within the lumi library (Code 4). Possible slide inhomogeneities (i.e. scratches) or contamination on two-color arrays may be detected with the imageplot function of the limma library. This package also allows the calculation of the RNA Integrity Number (RIN) as a measure of mRNA degradation with a subsequent option to remove samples below a given threshold.
The first step in the standard analysis protocol of cDNA microarrays is usually the conversion of hybridization image spots obtained by array scanners into raw gene expression values. For Affymetrix chips this is normally done either by using the freeware Affymetrix Power Tools or the R library affy. For Illumina’s BeadChips the proprietary GenomeStudio software or manual decryption via the R library beadarray may be used. For two-color arrays, scanner output files, e.g. in TIF format, can easily be read with the read.maimages function from the limma R library.
In a second step, background correction is conducted to subtract technical noise from the measured signal, so that biological variation remains. This is accomplished by using e.g. RMA [9] for Affymetrix arrays or the bgAdjust function from the lumi R library for Illumina arrays, which employs an algorithm similar to that of GenomeStudio (Code 5). In order to account for outliers and to remove systematic variation, normalization of expression values is required. The most common procedures include quantile-normalization, which preserves the rank, but may eliminate small differences in expression values, and LOESS (locally weighted scatterplot smoothing)-normalization, which does the opposite. Robust spline normalization (RSN) aims to combine the advantages of both methods through a monotonic spline fit to one reference sample, while simple scaling normalization (SSN) forces samples to have the same scale and background. Both approaches are included in the lumi R library for Illumina arrays. For two-color arrays it may be essential to further account for dye biases in the normalization [10] and to normalize within the array itself (between both color-labeled samples) and between all two-color arrays of the cohort, e.g. by use of the limma R library. Variance-stabilizing normalization (VSN) constitutes another method for combining background correction and normalization [11], while preserving biological variation. It is implemented in the vsn library (Code 6), applicable to arrays of all major platforms. Within the normalization process raw intensities are usually transformed, either into a log2 scale or glog in case of VSN, in order to smooth extreme values.
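To make the rank-preserving behavior of quantile normalization concrete, the following is a minimal, language-neutral sketch in plain Python. It is not the implementation used by any of the R libraries named above; the function name `quantile_normalize`, the toy intensities, and the tie handling (ties broken by position) are assumptions made for illustration only.

```python
def quantile_normalize(columns):
    """Quantile-normalize samples given as equal-length lists of intensities.

    Every value is replaced by the mean of the values that hold the same
    rank in each sample, so all samples end up with one shared distribution.
    Ties are broken by position, which is a simplification.
    """
    n_probes = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [sum(col[k] for col in sorted_cols) / len(columns)
                 for k in range(n_probes)]
    normalized = []
    for col in columns:
        # Probe indices ordered from lowest to highest intensity.
        order = sorted(range(n_probes), key=lambda i: col[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


# Toy data (hypothetical): three samples, four probes each.
samples = [[5.0, 2.0, 3.0, 4.0],
           [4.0, 1.0, 4.5, 2.0],
           [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(samples)
```

Within each sample the ordering of probes is unchanged, while the set of values becomes identical across samples. This is why small between-sample differences in expression can be lost.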
Frequent impediments for GEP data analysis are missing array annotations or outdated annotation files provided by the manufacturers (e.g. frequently old GenBank predictions are included). Data-mining tools such as biomaRt [12] can be used to acquire up-to-date probe information (Code 7). They may also be helpful in assigning probes to transcripts, thereby enabling filtering for redundancies of probes, which map primarily to transcripts that are prone to nonsense-mediated mRNA decay (NMD) or to unprocessed pseudogenes. Deconvolution of genes with known transcript variants of differential function into probed isoforms may also be important for extrapolations on biological relevance. An example is the apoptosis regulator myeloid cell leukemia sequence 1 (MCL1), of which the longer isoform (MCL1-001) has been reported to enhance survival by inhibiting apoptosis, while its shorter isoform (MCL1-002) acts as a pro-apoptotic molecule [13].
Raw data preprocessing and QC are followed by the actual statistical analysis, usually in the form of probe-by-probe hypothesis tests for differential expression including: (1) two-group mean comparisons using a Student’s t-test (parametric, i.e. presuming a known statistical distribution), (2) empirical Bayes / moderated t-tests (for low sample size; e.g. n < 10; parametric), (3) Mann-Whitney-U tests (for samples with low variability; non-parametric) (Code 8), (4) multiple-group tests by means of an analysis of variance (ANOVA; parametric) (Code 9), or (5) a Jonckheere test (trend test; non-parametric). However, statistical testing of all genes / transcripts detected by an array requires correction for multiple testing, in order to avoid a substantial number of false-positive findings [14,15]. For example, using a significance level of 0.05 for each of 10,000 tests would result in approximately 0.05 × 10,000 = 500 significant rejections by chance, even if all null hypotheses of no differential expression were true. To this end, we can either control the family-wise error rate (FWER) to curtail the number of statistically significant results, e.g. by use of the (conservative) Bonferroni correction, in which the significance level for each probe-specific test equals the FWER (e.g. 0.05) divided by the total number of tested probes, or by some permutation / resampling approach. Alternatively, we can aim for controlling the false-discovery rate (FDR), i.e. the proportion of falsely rejected null hypotheses, e.g. using the Benjamini-Hochberg procedure, q-values, or other approaches. It should be noted, however, that control of the FDR, while very helpful in limiting the number of erroneously followed-up probes, does not imply a notion of statistical significance. The procedures by Bonferroni and by Benjamini-Hochberg are implemented in the multtest library [16], while the qvalue library provides an implementation for the rank-preserving q-value calculation (Code 10).
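The arithmetic behind the two correction strategies can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code of the multtest or qvalue libraries: the function names and the five example p-values are hypothetical, and the Benjamini-Hochberg step is written in its usual step-up form with adjusted p-values.

```python
def bonferroni(p_values, fwer=0.05):
    """Flag tests whose p-value passes the per-test level FWER / m."""
    threshold = fwer / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the adjusted
    # values stay monotone in the rank of the raw p-values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Expected number of chance rejections for 10,000 true null hypotheses.
expected_false_positives = 0.05 * 10_000

# Hypothetical p-values for five probes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.20]
bonferroni_calls = bonferroni(p_values)
bh_adjusted = benjamini_hochberg(p_values)
```

With these toy values the Bonferroni level is 0.05 / 5 = 0.01, so only the first two probes pass, whereas a Benjamini-Hochberg cut-off of 0.1 on the adjusted values would retain four of the five.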
Nominally differentially expressed probes (e.g. with a single-test level of p < 0.05) can also be filtered by multiple-testing correction, for example by applying a q-value / FDR cutoff (common cut-off, e.g. 0.1) to ensure a low proportion of false-positives in the set of probes to be subsequently followed up. To reduce time in the analysis, it may also be useful to exclude genes / probes that are not expected to be differentially expressed either due to biologically low variability in the investigated samples, or due to technically low detectability on the array. This can be achieved either by non-specific filtering of expression values restricted to a given range (e.g. the shortest interval containing half of the data by standard deviation (sd) or interquartile range) or by setting an empirical cut-off to the coefficient of variation (sd/mean), e.g. the top 10 percent or a fixed value of 0.6. Note, however, that this may increase the rate of false-negative findings (Code 11).
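Non-specific filtering by the coefficient of variation can be illustrated as follows. This is a minimal Python sketch, assuming non-log intensities and the fixed cut-off of 0.6 mentioned above; the probe names and values are hypothetical, and it does not reproduce Code 11.

```python
from statistics import mean, stdev


def filter_by_cv(expression, cv_cutoff=0.6):
    """Keep probes whose coefficient of variation (sd / mean) exceeds the cut-off.

    `expression` maps a probe name to its non-log intensities across samples.
    """
    kept = {}
    for probe, values in expression.items():
        m = mean(values)
        if m > 0 and stdev(values) / m > cv_cutoff:
            kept[probe] = values
    return kept


# Hypothetical intensities: probe_a barely varies, probe_b varies strongly.
expression = {
    "probe_a": [100.0, 105.0, 95.0, 102.0],
    "probe_b": [50.0, 400.0, 60.0, 500.0],
}
filtered = filter_by_cv(expression)
```

Only `probe_b` survives the filter here. A probe with a small but real group difference would be discarded in the same way, which is the source of the false-negative risk noted above.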
When comparing GEP data obtained in the same laboratory, but with two or more different batches of arrays, the results will deviate from one another beyond the expected biological and array-specific technical variation. Batch correction addresses this issue. Two approaches commonly considered to be performing best [17] are mean-centering and a Bayesian framework named ComBat [18] (Figure 2a-c; Code 12).
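Of the two approaches, mean-centering is simple enough to sketch directly. The Python snippet below handles one probe at a time and assumes log-scale values and biological groups that are balanced across batches; the function name and the toy values are hypothetical. ComBat additionally shrinks batch estimates across genes and is not reproduced here.

```python
def mean_center_batches(values, batches):
    """Subtract each batch's mean from one probe's log-scale values."""
    sums, counts = {}, {}
    for value, batch in zip(values, batches):
        sums[batch] = sums.get(batch, 0.0) + value
        counts[batch] = counts.get(batch, 0) + 1
    means = {batch: sums[batch] / counts[batch] for batch in sums}
    return [value - means[batch] for value, batch in zip(values, batches)]


# Hypothetical probe measured in two batches with a shift of one log2 unit.
values = [8.0, 8.4, 9.0, 9.4]
batches = ["A", "A", "B", "B"]
centered = mean_center_batches(values, batches)
```

After centering, both batches have a mean of zero for this probe. If a biological group were confined to a single batch, the same subtraction would remove the biological signal as well, which is why the balance assumption matters.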
A particular problem for cancer transcriptomics / genomics is the contamination of cancer tissues by normal cells (irrespective of whether these are considered actual milieu components) and vice versa. Even in lymphomas and lymphoid leukemias, such problems are encountered in lymph-node samples or in the seemingly ‘pure’ blood samples, as these are also of mostly multicellular composition. Tools like ESTIMATE [19] can weigh specific markers (e.g. indicating an immune or stromal cell origin) within gene expression profiles in the form of gene set enrichment analyses and thus evaluate the degree of purity. Unfortunately, due to intrinsic aberrations of ‘immune cell’ genes within tumor cells of leukemias / lymphomas, the immune gene set used within ESTIMATE is not reliable for the enrichment analysis within these malignancies (Figure 2d; Code 13). An alternative approach especially for leukemias / lymphomas might be CellMix [20], which uses gene sets from specific immune cell subsets, e.g. CD4+ and CD8+ T-lymphocytes, CD14+ monocytes, CD19+ B-lymphocytes, CD56+ natural killer cells, and CD66b+ granulocytes.
Two public databases are commonly used for the comparison of own microarray data with independent data sets, for example in a meta-analysis, namely the GEO (gene expression omnibus) database [21] (http://www.ncbi.nlm.nih.gov/geo) and the ArrayExpress database [22] (https://www.ebi.ac.uk/arrayexpress), with GEO featuring a larger number of integrated samples. Both platforms use distinct annotation / meta-data file systems. In GEO, samples are either described in MIAME Notation in Markup Language (MINiML; pronounced 'minimal') or SOFT formatted family files. In ArrayExpress, sample and data relationships (SDR) are described in the SDRF format, while protocol information is stored in the Investigation Description Format (IDF). Both databases offer processed numerical gene expression values (in the form of matrices) stored in regular text format (txt), or raw data in CEL or idat (for Affymetrix or Illumina chips) files. GEO and ArrayExpress also provide respective R libraries to automate queries and processing of differential expression analyses, namely GEOquery and ArrayExpress.
Analysis results for data sets within ArrayExpress are further integrated in the 'Gene expression atlas' of the EMBL / EBI (http://www.ebi.ac.uk/gxa). The latter provides information about gene and protein expression in animal and plant samples for different cell types, tissues, developmental stages, diseases, and other conditions from 1572 studies as of August 2015 [23]. The human data sets are currently exported into an RDF version accessible via a SPARQL Endpoint (http://www.ebi.ac.uk/rdf/services/atlas/sparql; accessed 02/21/2016).
Implemented queries include:
·“Query 1: Get experiments where the sample description contains diabetes”
·“Query 2: Get differentially expressed genes where factor is asthma”
·“Query 3: Show expression for ENSG00000129991 (TNNI3)”
·“Query 4: Show expression for ENSG00000129991 (TNNI3) with its GO annotations from Uniprot (Federated query to http://sparql.uniprot.org/sparql)”
·“Query 5: For the genes differentially expressed in asthma, get the gene products associated to a Reactome pathway”
·“Query 6: Get all mappings for a given probe e.g. A-AFFY-1/661_at”
Queries 2 and 5 can be further modified in order to compare gene dysregulation in other types of diseases, e.g. in lymphoid leukemias, such as chronic lymphocytic leukemia (CLL; Table 3). The user’s familiarity with the underlying ontologies (controlled vocabulary; [24]) is, however, necessary to construct queries.
For conceptualizing a pharmacologic compound (e.g. inhibitor) acting against a specific gene product or for designing specific gene-knockouts within a model organism, it may be particularly important to know in what conditions and disease subtypes expression of a distinct gene is up- or down-regulated and to which degree (basal or extreme). Integrative analyses of expression changes within a multitude of samples of the same entity, or model organism, or any other comparable biological system as well as across initially separately analyzed (and published) series (cohorts) are often called gene expression meta-analyses. In the following we describe multiple ways to conduct a meta-analysis of GEP data with their limitations and advantages.
The first approach includes construction and sending of specific queries to the EMBL / EBI RDF platform. Querying can further be semi-automated using the SPARQL R library, which allows the investigation of different data sets in a specific condition, e.g. comparisons of CLL vs. normal B-cells, or between distinct groups of tumor samples stratified by a characteristic of interest, e.g. immunoglobulin heavy chain (IGHV) gene mutated vs. unmutated CLL. Results are usually tabularized and fold-changes visualized within a heatmap (Figure 3a; Suppl. Table 1; Code 14).
Since not all 'ArrayExpress' data sets are yet integrated into the EMBL / EBI RDF platform and the GEO database contains additional data sets, the manual download, processing, and integration of such additional data is often necessary.
Therefore, a second, more hands-on approach to meta-analyses is a search by keyword, e.g. 'chronic lymphoid', within GEO and / or ArrayExpress (or any other public database). Once the data set has been picked, it is background-corrected and the annotated replicates can be combined with their original samples by calculating their mean. Afterwards all samples within the data set are normalized (e.g. quantile-normalized).
Probe sets of a gene which map to retained / dysfunctional transcripts (or which map to more retained / dysfunctional transcripts than other probe sets of the same gene) should be removed to obtain meaningful expression values (Suppl. Table 2). For example, BCL2L11 on Affymetrix HG-U133 Plus 2.0 chips has two probes: one hybridizes to two protein-coding and six NMD (nonsense-mediated decay) transcripts, the other hybridizes to two protein-coding and eight NMD transcripts. Thus, ambiguous expression values of this gene have to be evaluated with caution. The residual unambiguous probe sets assigned to a gene are then further summarized by calculation of average expression values per gene.
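The removal of ambiguous probe sets followed by per-gene averaging can be sketched as below. This is an illustrative Python snippet; the probe and gene identifiers are placeholders, and the set of ambiguous probes is assumed to have been determined beforehand from an annotation source.

```python
def summarize_per_gene(probe_values, probe_to_gene, ambiguous_probes):
    """Average expression per sample over the unambiguous probe sets of a gene."""
    grouped = {}
    for probe, values in probe_values.items():
        if probe in ambiguous_probes:
            continue  # Skip probes mapping mainly to retained / NMD transcripts.
        grouped.setdefault(probe_to_gene[probe], []).append(values)
    # zip(*rows) iterates over samples, yielding one value per remaining probe.
    return {gene: [sum(sample) / len(sample) for sample in zip(*rows)]
            for gene, rows in grouped.items()}


# Placeholder identifiers and log2 values for two samples.
probe_values = {
    "p1": [7.0, 8.0],
    "p2": [9.0, 10.0],
    "p3": [5.0, 6.0],   # flagged as ambiguous below
    "p4": [4.0, 4.5],
}
probe_to_gene = {"p1": "GENE_A", "p2": "GENE_A", "p3": "GENE_A", "p4": "GENE_B"}
gene_means = summarize_per_gene(probe_values, probe_to_gene, {"p3"})
```

A gene whose probe sets are all flagged as ambiguous simply disappears from the result, so such genes need to be tracked separately if they are of interest.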
For further evaluation of the GEP meta-analysis, three different techniques for integration can be used to observe gene expression patterns and entity clustering:
1) The first method quantile-normalizes a matrix of average gene expressions across entities from different experiments and finally gives a visual approximation. If there is also a tumor suppressor gene (very low expression) and an oncogene (very high expression) in the gene set to be evaluated, one can expect an expression range similar to the whole transcriptome. It should be noted that in previous Affymetrix sets, such as HG-U133A, some genes (e.g. BMF and BOK) are not covered by specific probes on the array and, therefore, need to be imputed by the median of the respective data set. This guarantees that in the heatmap (or PCA) these genes are not visualized as up- or down-regulated; they in fact can be manually labeled (blackened). Expression values from all data sets are merged into one matrix and again quantile-normalized to account for variability in platform specifications and noise. A more suitable approach than normalizing on each gene set separately might be to normalize on the whole combined transcriptome (intersection of all probed genes). However, this would disregard genes not covered by all platforms used. The resulting heatmap (generated by function heatmap.2, library gplots; Figure 3b) shows the expression of selected genes and transcripts in their respective data set and can be additionally subdivided by the different entities (median across samples of an entity).
2) Batch effects cannot be entirely excluded by using method 1) as may be observed by a bias in clustering of samples from the same experiment. Therefore, we recommend a novel method called inSilicoMerge [25], which combines data sets and removes their batch effect with a choice of various methods, such as the empirical Bayes method ComBat (Figure 3c).
Unfortunately, data sets from different platforms can only be combined gene-wise, meaning that e.g. MCL1 would not be deconvolutable into its isoforms MCL1-001 / MCL1-long and MCL1-002 / MCL1-short.
3) For an advanced evaluation, one can further perform differential expression analysis for data sets with different control samples (of varying quality, number, and specificity) available for comparison, such as ‘normal’ non-malignant cells or bulk tissue specimens. Fold-changes with a p-value < 0.1 (trend) or < 0.05 (significant) are extracted to compare normal-matched gene expression between different experiments and probe targets representing different gene transcripts or protein isoforms. The results are again visualized by a heatmap, either in the order obtained by hierarchical clustering (using Euclidean distance) or in order of rows sorted by gene name.
As exemplified by illustration of expression levels of Death-Associated Protein Kinase (DAPK) gene family members in subsets of CLL and normal B-cells (Figure 3d), this method allows different disease vs. ‘normal’ comparisons and facilitates the evaluation of which genes are exclusively down- or up-regulated and which show no clear pattern or which are specific to small subgroups. In the meta-analysis itself every differential expression analysis is further evaluated by statistical testing. The default setting is the Student’s t-test, except for low variation or non-normal distributions, for which the non-parametric Wilcoxon rank sum test is recommended.
In the abundance of genes obtained as significantly dysregulated, the role or function of a specific gene is often unknown and it is therefore encouraged to group them functionally by software tools often referred to as ‘pathway analysis’ or ‘enrichment’ tools. One of the most user-friendly, albeit costly, tools is QIAGEN’s Ingenuity® Pathway Analysis (IPA®, QIAGEN Redwood City, www.qiagen.com/ingenuity/). Users can upload their differential expression results in the format of Excel tables into the Java GUI (graphical user interface). Annotation in the form of chip design or symbol identifiers (such as Gene Symbol, Ensembl ID or GenBank ID) can be selected for a given column as well as statistical parameters in separate columns, such as p-values, fold-changes, q-values / FDRs or simply expression values (fluorescence in microarrays or FPKM (fragments per kilobase of exon per million reads mapped) for RNA-seq). The list can be further restricted to a given range (e.g. p-value < 0.05). The selected genes are subsequently assembled into manually curated biological or toxicological / pharmacological pathways provided with an E-value (chance of a random hit). One advantage of IPA compared to other tools is the easy visualization of results by intuitive geometric forms, i.e. nodes / genes are drawn as distinct geometric symbols and edges / protein modifications in distinct line types. Similar graphs can be drawn with igraph in R, but this is restricted to users who are more experienced in bioinformatics.
Other user-friendly and open-source alternatives include DAVID [26], gene set over-representation analysis (GSOA) by ConsensusPathDB [27] (Suppl. Figure 2), and gene set enrichment analysis (GSEA; Figure 4a) by the Broad Institute [28]. All three tools can be operated from web GUIs; the first two also offer an R implementation, and GSEA also offers a Java desktop application.
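Over-representation analysis is commonly formulated as an upper-tail hypergeometric test. The Python sketch below shows that formulation only; whether a given tool uses exactly this test, and how it defines the gene universe, are assumptions here, and the counts in the example are hypothetical.

```python
from math import comb


def overrepresentation_p(n_universe, n_in_set, n_selected, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_universe: number of genes measured on the array (background)
    n_in_set:   number of background genes annotated to the pathway
    n_selected: number of differentially expressed genes
    n_overlap:  number of differentially expressed genes in the pathway
    """
    total = comb(n_universe, n_selected)
    upper = min(n_in_set, n_selected)
    favorable = sum(
        comb(n_in_set, k) * comb(n_universe - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return favorable / total


# Small case that can be checked by hand: (6 * 6 + 4 * 1) / 120 = 1 / 3.
p_small = overrepresentation_p(10, 4, 3, 2)

# Hypothetical larger case: 12 of 100 selected genes fall into a 50-gene
# pathway, with 1000 genes in the background.
p_example = overrepresentation_p(1000, 50, 100, 12)
```

Because many pathways are tested at once, the resulting p-values need the same multiple-testing correction as probe-level tests.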
For more advanced users and those seeking to work with protein identifiers (complementary to the above-mentioned tools), STRINGdb10 [29] is a potential alternative. Within the R library, protein-protein interaction (PPI) graphs (nodes colored according to fold-change and also reachable via web link) and enrichments (including p-values and number of observed and expected interactions) are calculated (Figure 4b; Code 15). Therein, inputs are the corresponding proteins of the most significantly dysregulated probes in different gene expression comparisons. Edges between proteins are colored according to evidence level, e.g. co-expression, literature mining, or experimental assays such as yeast two-hybrid (Y2H). The same R library can also be used for KEGG and GO (gene ontology) enrichment analyses (Code 16). RNA-to-protein inference can, however, only be approximate due to different half-lives and decay rates as well as due to variable post-transcriptional and post-translational modifications.
Besides parameters of more established nature (routinely tested), e.g. in CLL those from clinical chemistry, such as β2-microglobulin [30], or from immunophenotyping, such as ZAP70 [31], the expression of a single gene or a gene set detected by microarray-based GEP can also serve, alone or as a scored combination, as a marker that predicts clinical outcomes. Such prognostic estimations are predominantly measured in subgroup differences of time-to-event metrics like overall survival (OS; from date of diagnosis or less correctly from first day of treatment or study randomization to last follow-up (FU) or death) or progression-free survival (PFS; from first day of treatment or randomization to disease progression or death). Other measurements include time-to-treatment (TTT; from diagnosis or randomization to first day of treatment), time-to-next-treatment (TTNT; end of first to beginning of next treatment), time-to-treatment-failure (TTF; time from diagnosis or randomization to treatment dismissal), or event-free survival (EFS; time from diagnosis or randomization to disease progression, death or treatment dismissal). These parameters are either right-censored (date of death or progression after study window, thus unknown) or left-censored (study entry is unknown) to deal with missing time points or events (death or progression). Here we focus on right-censored data.
A univariate analysis compares time-to-event parameters for two subgroups divided by a gene expression or other marker status (see [32] for an introduction). For multivariate analysis, multiple genes or markers are considered for a competing subset comparison (see [33] for an introduction). For the former there are standard methods implemented within the R library survival, with the functions survdiff to test the differences of survival times with the log-rank test [34] and survfit to plot the survival times with the Kaplan-Meier estimator [35] (Code 17). A multivariate analysis allows ranking of the most significant markers contributing to an adverse prognosis. It is usually conducted with the Cox Proportional Hazards [36] (CoxPH) model.
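How right-censoring enters the Kaplan-Meier estimator can be shown with a short product-limit calculation. The Python sketch below is a didactic re-implementation, not the code of the survival library; the function name and the five hypothetical patients are assumptions, and subjects censored at an event time are counted as still at risk at that time.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    events[i] is True for an observed event (e.g. death) and False for a
    right-censored observation (e.g. alive at last follow-up).
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect all subjects whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # Censored subjects leave the risk set silently.
    return curve


# Hypothetical follow-up times (months) and event indicators.
times = [5, 8, 8, 12, 15]
events = [True, True, False, True, False]
curve = kaplan_meier(times, events)
```

For these toy data the estimate steps from 0.8 at month 5 to 0.6 at month 8 and 0.3 at month 12. The censored patients never lower the curve themselves but shrink the risk set used for later steps.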
As evidence provided by different data sources and methods strengthens a given hypothesis, it is important to validate identified markers of prognosis in an independent patient cohort. However, this is often difficult due to a limited availability of reasonably-sized data sets for comparison. Possible causes may be a low disease incidence (e.g. notorious for mature T-cell lymphomas) or general difficulties in obtaining primary tumor samples (e.g. due to the need of invasive procedures to be consented by the patient). Another factor imposing limitations on sample size is the uniformity of received treatments, which must apply to a given patient cohort in order to reliably predict related outcomes. For GEP studies in such scenarios, we propose an alternative algorithm for the identification of prognostic gene expression signatures, which we demonstrate by the example of GEP data generated from peripheral blood tumor samples of patients with T-cell prolymphocytic leukemia (T-PLL) and CLL. We obtained gene expression profiles from 49 T-PLL samples with available OS status and from 58 chemoimmunotherapy-treated CLL patients with available PFS data, both from Illumina HumanHT-12 v4.0 Expression BeadChips.
In a first training set of 10 T-PLL cases, the 5 patients with the longest OS (time from diagnosis to death from disease, > 800 days) were compared to the 5 patients with the shortest OS (< 300 days) using ‘Significance analysis of microarrays’ (SAM) in survival mode via the R library samr [37]. We only considered expression profiles from patients in whom corresponding samples had been obtained within 6 months from diagnosis (ensuring similarities between specimen and clinical data) and who had presented with similar lymphocyte doubling times, as an indicator of disease kinetics, at the time of sampling. Starting from an initial, most informative index set of 5 differentially expressed probes (RAB25, KIAA1211L-probe1, KIAA1211L-probe2, GIMAP6, FXYD2; FDR < 0.1), linear regression [38] and removal of one outlier by setting OS < 200 days resulted in a second training set of nine cases. A subsequent SAM (survival mode) yielded a 2-gene / 3-probe set as the most robust combined predictor of OS. These probes were used to calculate an expression index via an additive model fit using Tukey's median polish procedure [39] (medpolish function within the standard stats library) on a test set of 40 uniformly treated T-PLL cases (the nine training cases excluded) fulfilling the criteria of available array data and OS information. Kaplan-Meier curves (with log-rank tests for differences) were created based on stratified per-patient values of this “2-gene / 3-probe prognostic expression index” (RAB25 and the two KIAA1211L transcripts either merged or separated; Figure 5a). When the cases were ranked solely by these expression indices, the five T-PLL cases with the lowest values indeed showed significantly superior OS over the five cases with the highest values (index fold-change (fc)=−2.37; Figure 5b) or the 35 cases with higher values (index fc=−1.62; Suppl. Figure 3a).
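The additive fit behind such an expression index can be illustrated with a small plain-Python sketch of Tukey's median polish. It is not a port of the R medpolish function (it uses a fixed number of sweeps instead of a convergence check). Taking the per-sample index as overall effect plus column effect, with probes as rows and samples as columns, is an assumption made here for illustration, and the matrix values are hypothetical.

```python
from statistics import median

def median_polish(matrix, n_iter=10):
    """Fit matrix[i][j] ~ overall + row[i] + col[j] by Tukey's median polish."""
    resid = [list(r) for r in matrix]
    n_rows, n_cols = len(resid), len(resid[0])
    overall, row, col = 0.0, [0.0] * n_rows, [0.0] * n_cols
    for _ in range(n_iter):
        # Sweep row medians out of the residuals.
        for i in range(n_rows):
            m = median(resid[i])
            resid[i] = [v - m for v in resid[i]]
            row[i] += m
        shift = median(col)
        col = [c - shift for c in col]
        overall += shift
        # Sweep column medians out of the residuals.
        for j in range(n_cols):
            m = median(resid[i][j] for i in range(n_rows))
            for i in range(n_rows):
                resid[i][j] -= m
            col[j] += m
        shift = median(row)
        row = [r - shift for r in row]
        overall += shift
    return overall, row, col, resid

# Hypothetical log2 intensities: 3 probes (rows) x 4 samples (columns).
probes = [[5, 7, 8, 10],
          [6, 8, 9, 11],
          [7, 9, 10, 12]]
overall, row_eff, col_eff, resid = median_polish(probes)
index = [overall + c for c in col_eff]             # assumed per-sample index
low = [j for j, v in enumerate(index) if v < median(index)]
print(index)  # [6.0, 8.0, 9.0, 11.0]
print(low)    # [0, 1]
```

Because medians rather than means are swept out, a single aberrant probe value in one sample ends up in the residuals and has little influence on that sample's index.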
A similar approach was used to identify signature genes associated with PFS in chemoimmunotherapy-treated CLL (Figure 5c; Suppl. Figure 3b; Code 18), resulting in a predictive 4-gene / 7-probe index (including GPD1L, TNFSF12, JHDM1D, TBCD, AARS2, MTG1, and TNIP). In both cohorts, the detected differential expression of signature genes and its association with clinical outcome require further validation, e.g. by qRT-PCR in independent samples, before these genes can be considered valid markers.
When dealing with large data sets (e.g. a gene expression matrix) that incorporate different clinical or molecular information (‘features’), and if a group status (‘class’) of clinical or biological interest (e.g. treatment responder vs. non-responder) is known, the application of discrimination (or supervised learning) methods can be considered. Such methods aim to train classifiers (logistic, linear, or non-linear) that are able to predict the status (e.g. treatment response) of future samples based on certain features. In general, it is important to validate classification rules obtained from training data in an independent test set, preferably obtained from another set of patients from a different laboratory / trial group, in order to avoid a biased data interpretation. When no independent set is available, an internal cross-validation can be performed. Therein, the available patient samples are repeatedly separated into a training set and a test set, and the average classification performance is assessed by the number of false positives and false negatives produced by the classifier.
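The internal cross-validation described above can be sketched as follows. The nearest-centroid classifier is only a stand-in chosen for brevity, not a method used in this review; the fold assignment, the function names and the two-feature toy data are hypothetical.

```python
def nearest_centroid_fit(X, y):
    """Return per-class mean vectors (a deliberately simple classifier)."""
    centroids = {}
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def nearest_centroid_predict(centroids, x):
    """Predict the class whose centroid is closest in squared Euclidean distance."""
    def dist2(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: dist2(centroids[label]))

def cross_validate(X, y, k=5, positive=1):
    """k-fold cross-validation; returns total (false positives, false negatives)."""
    fp = fn = 0
    for fold in range(k):
        test_idx = set(range(fold, len(X), k))     # every k-th sample is held out
        train = [i for i in range(len(X)) if i not in test_idx]
        model = nearest_centroid_fit([X[i] for i in train], [y[i] for i in train])
        for i in test_idx:
            pred = nearest_centroid_predict(model, X[i])
            if pred == positive and y[i] != positive:
                fp += 1
            elif pred != positive and y[i] == positive:
                fn += 1
    return fp, fn

# Toy expression values of two genes; 0 = non-responder, 1 = responder.
X = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [1.1, 2.1], [0.9, 1.9],
     [3.0, 4.0], [3.2, 3.8], [2.8, 4.2], [3.1, 4.1], [2.9, 3.9]]
y = [0] * 5 + [1] * 5
print(cross_validate(X, y))  # (0, 0)
```

Each sample is classified exactly once by a model that never saw it during training, so the returned counts estimate performance on future samples rather than the fit to the training data.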
A popular supervised learning approach is the support vector machine [40] (SVM; R libraries gmum.r or e1071). SVMs try to separate classes by projecting features and their interactions into a high-dimensional space and subsequently searching for a separating hyperplane there, which corresponds to either a linear (Figure 6a-b) or a non-linear (Figure 6c; Suppl. Figure 4) decision boundary in the original feature space (Code 19).
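The idea of projecting features and their interactions into a higher-dimensional space can be shown on a toy example. The sketch below is not an SVM: the hyperplane is chosen by hand, whereas an actual SVM would select it by margin maximization and would typically use a kernel instead of an explicit mapping. The function names and the data are hypothetical.

```python
def lift(x1, x2):
    """Map a 2-D sample to 3-D by appending the interaction term x1 * x2."""
    return (x1, x2, x1 * x2)

def linear_decision(z, w, b):
    """Side of the hyperplane w . z + b = 0 on which the point z lies (+1 or -1)."""
    return 1 if sum(wi * zi for wi, zi in zip(w, z)) + b > 0 else -1

# XOR-like toy data: class +1 when both features share a sign, else -1.
# No straight line separates these classes in the original 2-D space.
samples = [(1, 1), (-1, -1), (1, -1), (-1, 1)]
labels = [1, 1, -1, -1]

# In the lifted space the hyperplane z3 = 0 separates the classes.
w, b = (0.0, 0.0, 1.0), 0.0
predictions = [linear_decision(lift(*s), w, b) for s in samples]
print(predictions == labels)  # True
```

Mapped back to the original two features, the flat hyperplane z3 = 0 becomes the curved boundary x1 * x2 = 0, which is how a linear separator can yield a non-linear classification rule.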
Decision trees (R libraries rpart, tree or party; Code 20) can also divide samples according to a class variable into successively more informative binary portions of gene expression signatures (Figure 7a-b) or of other molecular features (i.e. mutational or cytogenetic strata in CLL) (Figure 7c-f; Suppl. Figure 5); the quality of a split is measured by ANOVA for numerical or by entropy for categorical values. When looking for a cut-off for adverse prognosis, they can further be used in the form of regression trees [41]. Different parameters can be controlled in this approach, such as the maximum size of a tree or the number of portions / bins. It is recommended to keep these relatively low in the training set to avoid “overfitting” and thus enable re-evaluation in the test set. Random forests [42] (as an ensemble of permuted decision trees) can be used to determine the chance of observing random tree branching (library randomForest) (Code 21). Both algorithms are also included in the rattle library, which offers a user-friendly GUI with interactive plots and a selection menu for class variable and covariates as well as algorithm and parameter choices. For a more detailed review of current machine learning algorithms in GEP, we refer to [43].
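How a single binary split is chosen by entropy can be sketched in a few lines of plain Python. This illustrates the general criterion only and is not the implementation of rpart, tree or party; the function names, expression values and class labels are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, information_gain) of the best binary cut on one feature."""
    parent = entropy(labels)
    best = (None, 0.0)
    cuts = sorted(set(values))
    for lo, hi in zip(cuts, cuts[1:]):
        thr = (lo + hi) / 2.0                      # midpoint between neighbours
        left = [lab for v, lab in zip(values, labels) if v <= thr]
        right = [lab for v, lab in zip(values, labels) if v > thr]
        child = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        if parent - child > best[1]:
            best = (thr, parent - child)
    return best

# Hypothetical expression values of one gene and a prognosis class per patient.
expression = [2.1, 2.4, 2.9, 5.0, 5.6, 6.2]
prognosis = ["good", "good", "good", "poor", "poor", "poor"]
threshold, gain = best_split(expression, prognosis)
print(round(threshold, 2), round(gain, 2))  # 3.95 1.0
```

A tree is grown by applying this search recursively to each resulting portion, so limiting the depth or the minimum portion size directly limits how closely the tree can follow the training data.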
In this review we discuss procedures to optimize GEP analyses. We highlight the importance of advanced preprocessing, such as batch correction and admixture modeling, but also appraise the versatility and sophistication of analysis and classification algorithms. Many of the presented methods, originally established for microarray data analysis, can also be applied to RNA-seq data (on the basis of read counts instead of fluorescence values). In addition to GEP, it is always desirable to obtain further genetic information, including (somatic) copy-number alterations, structural variation, and genotyping of nucleotide variants, for a most comprehensive genetic workup of the investigated cancer specimen. Epigenomic data, e.g. from methylome and ChIP-seq experiments, may be added as a second layer. Besides setting up one's own data repository in MySQL or RDF for managing internal data, one may also investigate the cBioPortal for Cancer Genomics [44]. TCGA (https://tcga-data.nci.nih.gov/tcga), ICGC (https://dcc.icgc.org), and other large curated data sets provide user-friendly search engines with multiple visualization options. Another helpful tool for combining gene expression data with available genomic knowledge in a network-based analysis is Expander [45]. Overall, this review and the attached source codes may provide guidance to both molecular biologists and bioinformaticians / biostatisticians to properly conduct GEP analyses from microarrays and to go beyond the application of standard analytic tools to optimally interpret the clinical and biological relevance of the obtained results.
M.H. (HE3553/3-1) and C.D.H. (SCHW1711/1-1) are funded by the German Research Foundation (DFG) as part of the collaborative research group on “Exploiting the DNA damage response in CLL” (KFO286). M.H. (HE3553/4-1) has been supported by the DFG as part of the collaborative research group on mature T-cell lymphomas “CONTROL-T” (FOR1961). Further support: German Cancer Aid (108029), CECAD, José Carreras Leukemia Foundation (R12/08) (all to M.H.); CLL Global Research Foundation (to M.H. and C.D.H.); Köln Fortune Program and Fritz Thyssen Foundation (10.15.2.034MN) (both to M.H. and A.S.).
We gratefully acknowledge all contributing centers and investigators enrolling patients into the trials and registry of the German CLL Study Group (GCLLSG) and at the UT M.D. Anderson Cancer Center (MDACC), Houston/TX, USA; the GCLLSG and UT MDACC staff and the patients with their families for their invaluable contributions.
Data analysis: G.C.; survival analyses: G.C., A.S., M.H.; experiments and conduction of GEP: A.S., C.D.H.; clinical data: M.H., C.D.H.; manuscript preparation: G.C., C.D.H., M.N., M.H.
There were no competing interests interfering with the unbiased conduct of this study.
Human tumor samples were obtained from patients under IRB-approved protocols following written informed consent according to the Declaration of Helsinki. Collection and use have been approved for research purposes by the ethics committee of the University Hospital of Cologne (#11-319) and UT M.D. Anderson Cancer Research Center. The cohorts were selected based on uniform front-line treatment as part of the TPLL1 [46] (NCT00278213) and TPLL2 (NCT01186640, unpublished) prospective clinical trials as well as FCR300 [47] or included in the nation-wide T-PLL and CLL registries of the German CLL Study Group (GCLLSG, IRB# 12-146).
[1] | S. Li, W. Gong, X. Yan, C. Hu, D. Bai, L. Wang, Parameter estimation of photovoltaic models with memetic adaptive differential evolution, Sol. Energy, 190 (2019), 465–474. https://doi.org/10.1016/j.solener.2019.08.022 |
[2] | Z. Liao, Q. Gu, S. Li, Z. Hu, B. Ning, An improved differential evolution to extract photovoltaic cell parameters, IEEE Access, 8 (2020), 177838–177850. http://doi.org/10.1109/ACCESS.2020.3024975 |
[3] | S. Li, Q. Gu, W. Gong, B. Ning, An enhanced adaptive differential evolution algorithm for parameter extraction of photovoltaic models, Energy Convers. Manage., 205 (2020), 112443. https://doi.org/10.1016/j.enconman.2019.112443 |
[4] | Z. Liao, Z. Chen, S. Li, Parameters extraction of photovoltaic models using triple-phase teaching-learning-based optimization, IEEE Access, 8 (2020), 69937–69952. https://doi.org/10.1109/ACCESS.2020.2984728 |
[5] | H. M. Ridha, H. Hizam, C. Gomes, A. A. Heidari, H. Chen, M. Ahmadipour, et al., Parameters extraction of three diode photovoltaic models using boosted LSHADE algorithm and Newton Raphson method, Energy, 224 (2021), 120136. https://doi.org/10.1016/j.energy.2021.120136 |
[6] | S. Li, W. Gong, Q. Gu, A comprehensive survey on meta-heuristic algorithms for parameter extraction of photovoltaic models, Renewable Sustainable Energy Rev., 141 (2021), 110828. https://doi.org/10.1016/j.rser.2021.110828 |
[7] | M. Abd Elaziz, D. Oliva, Parameter estimation of solar cells diode models by an improved opposition-based whale optimization algorithm, Energy Convers. Manage., 171 (2018), 1843–1859. https://doi.org/10.1016/j.enconman.2018.05.062 |
[8] | J. Liang, K. Qiao, M. Yuan, K. Yu, B. Qu, S. Ge, et al., Evolutionary multi-task optimization for parameters extraction of photovoltaic models, Energy Convers. Manage., 207 (2020), 112509. https://doi.org/10.1016/j.enconman.2020.112509 |
[9] | A. Askarzadeh, A. Rezazadeh, Parameter identification for solar cell models using harmony search-based algorithms, Sol. Energy, 86 (2012), 3241–3249. https://doi.org/10.1016/j.solener.2012.08.018 |
[10] | T. Kang, J. Yao, M. Jin, S. Yang, T. Duong, A novel improved cuckoo search algorithm for parameter estimation of photovoltaic (PV) models, Energies, 11 (2018), 1–31. https://doi.org/10.3390/en11051060 |
[11] | M. R. AlRashidi, M. F. AlHajri, K. M. El-Naggar, A. K. Al-Othman, A new estimation approach for determining the I–V characteristics of solar cells, Sol. Energy, 85 (2011), 1543–1550. https://doi.org/10.1016/j.solener.2011.04.013 |
[12] | A. Askarzadeh, A. Rezazadeh, Artificial bee swarm optimization algorithm for parameters identification of solar cell models, Appl. Energy, 102 (2013), 943–949. https://doi.org/10.1016/j.apenergy.2012.09.052 |
[13] | R. Ben Messaoud, Extraction of uncertain parameters of single-diode model of a photovoltaic panel using simulated annealing optimization, Energy Rep., 6 (2020), 350–357. https://doi.org/10.1016/j.egyr.2020.01.016 |
[14] | S. Li, W. Gong, L. Wang, X. Yan, C. Hu, A hybrid adaptive teaching–learning-based optimization and differential evolution for parameter identification of photovoltaic models, Energy Convers. Manage., 225 (2020), 113474. https://doi.org/10.1016/j.enconman.2020.113474 |
[15] | K. G. K. Harish, Modeling of solar cell under different conditions by Ant Lion Optimizer with LambertW function, Appl. Soft Comput., 71 (2018), 141–151. https://doi.org/10.1016/j.asoc.2018.06.025 |
[16] | H. M. Ridha, H. Hizam, S. Mirjalili, M. L. Othman, M. E. Ya'acob, L. Abualigah, A novel theoretical and practical methodology for extracting the parameters of the single and double diode photovoltaic models, IEEE Access, 10 (2022), 11110–11137. https://doi.org/10.1109/ACCESS.2022.3142779 |
[17] | A. A. Al-Shamma'a, H. O. Omotoso, F. A. Alturki, H. M. H. Farh, A. Alkuhayli, K. Alsharabi, et al., Parameter estimation of photovoltaic cell/modules using bonobo optimizer, Energies, 15 (2022), 140. https://doi.org/10.3390/en15010140 |
[18] | W. Zhou, P. Wang, A. A. Heidari, X. Zhao, H. Turabieh, M. Mafarja, et al., Metaphor-free dynamic spherical evolution for parameter estimation of photovoltaic modules, Energy Rep., 7 (2021), 5175–5202. https://doi.org/10.1016/j.egyr.2021.07.041 |
[19] | A. Farah, A. Belazi, F. Benabdallah, A. Almalaq, M. Chtourou, M. A. Abido, Parameter extraction of photovoltaic models using a comprehensive learning Rao-1 algorithm, Energy Convers. Manage., 252 (2022), 115057. https://doi.org/10.1016/j.enconman.2021.115057 |
[20] | J. Luo, J. Zhou, X. Jiang, A modification of the imperialist competitive algorithm with hybrid methods for constrained optimization problems, IEEE Access, 9 (2021), 161745–161760. https://doi.org/10.1109/ACCESS.2021.3133579 |
[21] | M. A. E. Sattar, A. Al Sumaiti, H. Ali, A. A. Z. Diab, Marine predators algorithm for parameters estimation of photovoltaic modules considering various weather conditions, Neural Comput. Appl., 33 (2021), 11799–11819. https://doi.org/10.1007/s00521-021-05822-0 |
[22] | S. Jiao, G. Chong, C. Huang, H. Hu, M. Wang, A. A. Heidari, et al., Orthogonally adapted Harris hawks optimization for parameter estimation of photovoltaic models, Energy, 203 (2020), 117804. https://doi.org/10.1016/j.energy.2020.117804 |
[23] | Y. Yu, K. Wang, T. Zhang, Y. Wang, C. Peng, S. Gao, A population diversity-controlled differential evolution for parameter estimation of solar photovoltaic models, Sustainable Energy Technol. Assess., 51 (2022), 101938. https://doi.org/10.1016/j.seta.2021.101938 |
[24] | S. Gao, K. Wang, S. Tao, T. Jin, H. Dai, J. Cheng, A state-of-the-art differential evolution algorithm for parameter estimation of solar photovoltaic models, Energy Convers. Manage., 230 (2021), 113784. https://doi.org/10.1016/j.enconman.2020.113784 |
[25] | R. V. Rao, Jaya: A simple and new optimization algorithm for solving constrained and unconstrained optimization problems, Int. J. Ind. Eng. Comput., 7 (2016), 19–34. http://dx.doi.org/10.5267/j.ijiec.2015.8.004 |
[26] | Y. Zhang, Z. Jin, Comprehensive learning Jaya algorithm for engineering design optimization problems, J. Intell. Manuf., 2021 (2021). https://doi.org/10.1007/s10845-020-01723-6 |
[27] | Y. Zhang, A. Chi, S. Mirjalili, Enhanced Jaya algorithm: A simple but efficient optimization method for constrained engineering design problems, Knowl. Based Syst., 233 (2021), 107555. https://doi.org/10.1016/j.knosys.2021.107555 |
[28] | M. Afifi, H. Rezk, M. Ibrahim, M. El-Nemr, Multi-objective optimization of switched reluctance machine design using jaya algorithm (MO-Jaya), Mathematics, 9 (2021), 1107. https://doi.org/10.3390/math9101107 |
[29] | S. Basak, B. Bhattacharyya, B. Dey, Combined economic emission dispatch on dynamic systems using hybrid CSA-JAYA Algorithm, Int. J. Syst. Assur. Eng. Manage., 2022 (2022). https://doi.org/10.1007/s13198-022-01635-z |
[30] | D. Saadaoui, M. Elyaqouti, K. Assalaou, D. B. hmamou, S. Lidaighbi, Multiple learning JAYA algorithm for parameters identifying of photovoltaic models, Mater. Today Proc., 52 (2022), 108–123. https://doi.org/10.1016/j.matpr.2021.11.106 |
[31] | M. F. Tefek, M. Arslan, Highway accident number estimation in Turkey with Jaya algorithm, Neural Comput. Appl., 34 (2022), 5367–5381. https://doi.org/10.1007/s00521-022-06952-9 |
[32] | J. Gholami, M. R. Kamankesh, S. Mohammadi, E. Hosseinkhani, S. Abdi, Powerful enhanced Jaya algorithm for efficiently optimizing numerical and engineering problems, Soft Comput., 2022 (2022). https://doi.org/10.1007/s00500-022-06909-z |
[33] | X. Jian, Y. Cao, A chaotic second order oscillation JAYA Algorithm for parameter extraction of photovoltaic models, Photonics, 9 (2022). https://doi.org/10.3390/photonics9030131 |
[34] | S. Belagoune, N. Bali, K. Atif, H. Labdelaoui, A discrete chaotic Jaya algorithm for optimal preventive maintenance scheduling of power systems generators, Appl. Soft Comput., 119 (2022), 108608. https://doi.org/10.1016/j.asoc.2022.108608 |
[35] | A. Aleti, I. Moser, A systematic literature review of adaptive parameter control methods for evolutionary algorithms, Assoc. Comput. Mach., 49 (2017), 1–35. https://doi.org/10.1145/2996355 |
[36] | Z. Lei, S. Gao, S. Gupta, J. Cheng, G. Yang, An aggregative learning gravitational search algorithm with self-adaptive gravitational constants, Exp. Syst. Appl., 152 (2020), 113396. https://doi.org/10.1016/j.eswa.2020.113396 |
[37] | R. Tanabe, A. S. Fukunaga, Improving the search performance of SHADE using linear population size reduction, in 2014 IEEE Congress on Evolutionary Computation (CEC), (2014), 1658–1665. https://doi.org/10.1109/CEC.2014.6900380 |
[38] | H. Yang, S. Gao, R. L. Wang, Y. Todo, A ladder spherical evolution search algorithm, IEICE Trans. Inf. Syst., 104 (2021), 461–464. http://doi.org/10.1587/transinf.2020EDL8102 |
[39] | X. Yu, X. Wu, W. Luo, Parameter identification of photovoltaic models by hybrid adaptive JAYA Algorithm, Mathematics, 10 (2022), 183. https://doi.org/10.3390/math10020183 |
[40] | Y. J. Zhang, Y. X. Yan, J. Zhao, Z. M. Gao, AOAAO: The hybrid algorithm of arithmetic optimization algorithm with aquila optimizer, IEEE Access, 10 (2022), 10907–10933. https://doi.org/10.1109/ACCESS.2022.3144431 |
[41] | J. Zhao, Z.-M. Gao, The chaotic slime mould algorithm with chebyshev map, in 2nd International Conference on Artificial Intelligence and Computer Science, 1631 (2020), 012071. https://doi.org/10.1088/1742-6596/1631/1/012071 |
[42] | K. Yu, B. Qu, C. Yue, S. Ge, X. Chen, J. Liang, A performance-guided JAYA algorithm for parameters identification of photovoltaic cell and module, Appl. Energy, 237 (2019), 241–257. https://doi.org/10.1016/j.apenergy.2019.01.008 |
[43] | Z. Yan, S. Li, W. Gong, An adaptive differential evolution with decomposition for photovoltaic parameter extraction, Math. Biosci. Eng., 18 (2021), 7363–7388. https://doi.org/10.1016/j.apenergy.2019.01.008 |
[44] | G. Xiong, J. Zhang, X. Yuan, D. Shi, Y. He, G. Yao, Parameter extraction of solar photovoltaic models by means of a hybrid differential evolution with whale optimization algorithm, Sol. Energy, 176 (2018), 742–761. https://doi.org/10.1016/j.solener.2018.10.050 |
[45] | S. Li, W. Gong, X. Yan, C. Hu, D. Bai, L. Wang, et al., Parameter extraction of photovoltaic models using an improved teaching-learning-based optimization, Energy Convers. Manage., 186 (2019), 293–305. https://doi.org/10.1016/j.enconman.2019.02.048 |
[46] | X. Chen, K. Yu, Hybridizing cuckoo search algorithm with biogeography-based optimization for estimating photovoltaic model parameters, Sol. Energy, 180 (2019), 192–206. https://doi.org/10.1016/j.solener.2019.01.025 |
[47] | L. M. P. Deotti, J. L. R. Pereira, I. C. Silva Júnior, Parameter extraction of photovoltaic models using an enhanced Lévy flight bat algorithm, Energy Convers. Manage., 221 (2020), 113114. https://doi.org/10.1016/j.enconman.2020.113114 |
[48] | G. Xiong, J. Zhang, D. Shi, L. Zhu, X. Yuan, Parameter extraction of solar photovoltaic models with an either-or teaching learning based algorithm, Energy Convers. Manage., 224 (2020), 113395. https://doi.org/10.1016/j.enconman.2020.113395 |
[49] | X. Yang, W. Gong, Opposition-based JAYA with population reduction for parameter estimation of photovoltaic solar cells and modules, Appl. Soft Comput., 104 (2021), 107218. https://doi.org/10.1016/j.asoc.2021.107218 |
[50] | K. M. Sallam, M. A. Hossain, R. K. Chakrabortty, M. J. Ryan, An improved gaining-sharing knowledge algorithm for parameter extraction of photovoltaic models, Energy Convers. Manage., 237 (2021), 114030. https://doi.org/10.1016/j.enconman.2021.114030 |
[51] | Z. Hu, W. Gong, S. Li, Reinforcement learning-based differential evolution for parameters extraction of photovoltaic models, Energy Rep., 7 (2021), 916–928. https://doi.org/10.1016/j.egyr.2021.01.096 |
[52] | K. Yu, J. J. Liang, B. Y. Qu, Z. Cheng, H. Wang, Multiple learning backtracking search algorithm for estimating parameters of photovoltaic models, Appl. Energy, 226 (2018), 408–422. https://doi.org/10.1016/j.apenergy.2018.06.010 |
[53] | N. Pourmousa, S. M. Ebrahimi, M. Malekzadeh, M. Alizadeh, Parameter estimation of photovoltaic cells using improved Lozi map based chaotic optimization algorithm, Sol. Energy, 180 (2019), 180–191. https://doi.org/10.1016/j.solener.2019.01.026 |
[54] | W. Long, T. Wu, M. Xu, M. Tang, S. Cai, Parameters identification of photovoltaic models by using an enhanced adaptive butterfly optimization algorithm, Energy, 229 (2021), 120750. https://doi.org/10.1016/j.energy.2021.120750 |
[55] | Y. Liu, A. A. Heidari, X. Ye, C. Chi, X. Zhao, C. Ma, et al., Evolutionary shuffled frog leaping with memory pool for parameter optimization, Energy Rep., 7 (2021), 584–606. https://doi.org/10.1016/j.egyr.2021.01.001 |
[56] | X. Chen, K. Yu, W. Du, W. Zhao, G. Liu, Parameters identification of solar cell models using generalized oppositional teaching learning based optimization, Energy, 99 (2016), 170–180. https://doi.org/10.1016/j.energy.2016.01.052 |
[57] | K. Yu, X. Chen, X. Wang, Z. Wang, Parameters identification of photovoltaic models using self-adaptive teaching-learning-based optimization, Energy Convers. Manage., 145 (2017), 233–246. https://doi.org/10.1016/j.enconman.2017.04.054 |
[58] | K. Yu, J. J. Liang, B. Y. Qu, X. Chen, H. Wang, Parameters identification of photovoltaic models using an improved JAYA optimization algorithm, Energy Convers. Manage., 150 (2017), 742–753. https://doi.org/10.1016/j.enconman.2017.08.063 |
[59] | X. Chen, B. Xu, C. Mei, Y. Ding, K. Li, Teaching–learning–based artificial bee colony for solar photovoltaic parameter estimation, Appl. Energy, 212 (2018), 1578–1588. https://doi.org/10.1016/j.apenergy.2017.12.115 |
[60] | S. M. Ebrahimi, E. Salahshour, M. Malekzadeh, F. Gordillo, Parameters identification of PV solar cells and modules using flexible particle swarm optimization algorithm, Energy, 179 (2019), 358–372. https://doi.org/10.1016/j.energy.2019.04.218 |
[61] | Y. Zhang, C. Huang, Z. Jin, Backtracking search algorithm with reusing differential vectors for parameter identification of photovoltaic models, Energy Convers. Manage., 223 (2020), 113266. https://doi.org/10.1016/j.enconman.2020.113266 |
[62] | Y. Zhang, M. Ma, Z. Jin, Comprehensive learning Jaya algorithm for parameter extraction of photovoltaic models, Energy, 211 (2020), 118644. https://doi.org/10.1016/j.energy.2020.118644 |
[63] | Y. Zhang, M. Ma, Z. Jin, Backtracking search algorithm with competitive learning for identification of unknown parameters of photovoltaic systems, Expert Syst. Appl., 160 (2020), 113750. https://doi.org/10.1016/j.eswa.2020.113750 |
[64] | D. Oliva, M. Abd El Aziz, A. Ella Hassanien, Parameter estimation of photovoltaic cells using an improved chaotic whale optimization algorithm, Appl. Energy, 200 (2017), 141–154. https://doi.org/10.1016/j.apenergy.2017.05.029 |
[65] | P. Lin, S. Cheng, W. Yeh, Z. Chen, L. Wu, Parameters extraction of solar cell models using a modified simplified swarm optimization algorithm, Sol. Energy, 144 (2017), 594–603. https://doi.org/10.1016/j.solener.2017.01.064 |
[66] | G. Xiong, J. Zhang, D. Shi, Y. He, Parameter extraction of solar photovoltaic models using an improved whale optimization algorithm, Energy Convers. Manage., 174 (2018), 388–405. https://doi.org/10.1016/j.enconman.2018.08.053 |
[67] | A. M. Beigi, A. Maroosi, Parameter identification for solar cells and module using a Hybrid Firefly and Pattern Search Algorithms, Sol. Energy, 171 (2018), 435–446. https://doi.org/10.1016/j.solener.2018.06.092 |
[68] | J. Liang, S. Ge, B. Qu, K. Yu, F. Liu, H. Yang, et al., Classified perturbation mutation based particle swarm optimization algorithm for parameters extraction of photovoltaic models, Energy Convers. Manage., 203 (2020), 112138. https://doi.org/10.1016/j.enconman.2019.112138 |
[69] | X. Lin, Y. Wu, Parameters identification of photovoltaic models using niche-based particle swarm optimization in parallel computing architecture, Energy, 196 (2020), 117054. https://doi.org/10.1016/j.energy.2020.117054 |