Research article

Comparative analysis of copper and zinc based agrichemical biocide products: materials characteristics, phytotoxicity and in vitro antimicrobial efficacy

  • In the past few decades, copper-based biocides have been used extensively to protect food crops, including citrus, small fruits, and garden vegetable production. Continuous and widespread use of copper-based biocides over decades has led to accumulation of the metal in the soil and the surrounding ecosystem. Toxic levels of copper and its derivatives in both the soil and the runoff pose serious environmental and public health concerns. The agriculture industry is in great need of alternatives to copper for producing food crops with minimal environmental risk. A biocide containing both copper and zinc, such as Nordox 30/30, or an improved zinc-only biocide would be a good alternative to copper-only products if efficacy can be maintained. To date, there is no published comparative study of the materials characteristics and phyto-compatibility of copper- and zinc-based commercial products that would allow the advantages and disadvantages of both types of pesticide to be evaluated. In this report, we compared commercially available copper hydroxide and zinc oxide based biocides, along with suitable control materials, to assess their efficacy as biocides. We present a detailed material characterization of the biocides, including morphological studies by electron microscopy, molecular structure studies by X-ray diffraction, phytotoxicity studies in a model plant (tomato), and antimicrobial studies with surrogate plant pathogens (Xanthomonas alfalfae subsp. citrumelonis, Pseudomonas syringae pv. syringae and Clavibacter michiganensis subsp. michiganensis). Zinc-based compounds were found to possess comparable or superior antimicrobial properties while exhibiting significantly lower phytotoxicity compared with copper-based products, suggesting their potential as an alternative.

    Citation: Parthiban Rajasekaran, Harikishan Kannan, Smruti Das, Mikaeel Young, Swadeshmukul Santra. Comparative analysis of copper and zinc based agrichemical biocide products: materials characteristics, phytotoxicity and in vitro antimicrobial efficacy[J]. AIMS Environmental Science, 2016, 3(3): 439-455. doi: 10.3934/environsci.2016.3.439



    Nowadays, we receive a large variety of information every day, and how to make better use of such rapidly changing information deserves careful thought. The development of computer science has promoted the popularity of knowledge graphs (KGs) [1,2,3], semantic networks [4] that reveal and describe the relationships between entities in the real world. In this paper, a knowledge graph means a set of triples. Entities are the most basic elements in a KG, and different entities may have different relationships. Let E represent a set of entities and R a set of relationships between entities. A knowledge graph can then be expressed as a collection of triples:

    $KG = \{(h, r, t) \mid h, t \in E,\ r \in R\}$ (1.1)

    where h represents the head entity, t represents the tail entity, and r denotes the relationship connecting them, e.g., <Obama, Place_of_Birth, Honolulu>. In essence, KGs are multi-relational graphs composed of entities (nodes) and relationships (edges); each edge consists of a head entity, a relationship, and a tail entity. The graph structure and large volume make KGs a useful way for regular users to access valuable information. There are many widely used knowledge bases on the Internet, such as Freebase [5], Wikidata [6], and DBpedia [7]. When a knowledge graph is applied to a natural language task, its correctness and coverage determine its contribution to that task. Common natural language processing (NLP) [8] tasks, such as question answering and information retrieval, often rely on a KG system for support. However, tasks based on KGs are often affected by incompleteness, which in this paper means that a triple is missing an entity or a relationship. It is therefore necessary to study Knowledge Graph Completion (KGC) [9,10,11] methods that complete the missing information, improving both the quality of KGs and the performance of real-world applications. KGC methods mainly use structured knowledge to infer new factual information from what is already covered in a KG. Table 1 shows that even ultra-large-scale knowledge graphs still lack a great deal of important information.

    Table 1.  Statistics of missing type and quantity.

    Dataset    Missing type      Proportion missing
    Freebase   place of birth    71%
    Freebase   nationality       75%
    Freebase   parents           94%
    DBpedia    place of birth    60%
    DBpedia    known for         58%


    In general, KGC comprises two subtasks: link prediction [12] and entity prediction [13]. Link prediction aims to automatically complete the missing relationships in knowledge graphs; entity prediction aims to automatically complete the missing entities. This article focuses on entity prediction tasks. The benefit of knowledge graph embedding (KGE) [14] methods in a variety of practical applications motivates us to explore their use in solving KGC problems. Extensive research [15] has applied KGE methods to KGC to fill in missing entities or relationships. KGE methods embed the entities and relationships of a knowledge graph into a continuous vector space and then support downstream KG applications such as KGC. Most existing KGE techniques perform entity prediction based on observed facts. An entity prediction task based on a KGE method first represents entities and relationships as real-valued vectors and then defines a scoring function to measure the plausibility of triples; the ultimate goal is to maximize the total plausibility of triples. The embedding vectors of entities and relationships are used for the interaction of the components of given triples. Taking the model ProjE [16] as an example, before applying the scoring function, the model simply uses diagonal matrices to combine entities and relationships. The refactoring formula is as follows:

    $e \oplus r = D_e e + D_r r$ (1.2)

    where $D_e$ and $D_r$ are the specific matrix combination operators. This combination method ignores the interaction between entities and relationships in the same embedding space, resulting in insufficient interaction. To bridge this gap, we explore how to handle the problem of insufficient interaction and use a simple and efficient neural network to perform KGC tasks, as illustrated by the sketch below.
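
    To make the limitation concrete, the following minimal NumPy sketch (the toy dimension and random values are ours, not from the paper) implements the combination in Eq. (1.2). Because $D_e$ and $D_r$ are diagonal, the combination is purely elementwise: dimension i of the output only ever sees dimension i of e and r, so no cross-dimension interaction between entity and relationship features takes place.

    ```python
    import numpy as np

    k = 4                     # embedding size (toy value)
    e = np.random.randn(k)    # entity embedding
    r = np.random.randn(k)    # relationship embedding
    d_e = np.random.randn(k)  # diagonal of D_e, stored as a vector
    d_r = np.random.randn(k)  # diagonal of D_r

    # Eq. (1.2): D_e e + D_r r reduces to elementwise products, so the
    # i-th output feature never mixes with the j-th input feature.
    e_comb = d_e * e + d_r * r
    print(e_comb.shape)       # (k,)
    ```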

    Inspired by the above observations, this paper proposes a new entity prediction model called FRS (Feature Refactoring Scoring), based on the ideas of shared parameters and knowledge graph embedding. The main characteristics of the proposed approach are:

    1. Instead of requiring a prerequisite or a pretraining process, FRS is a self-sufficient model over length-1 relationships and does not rely on expensive multi-hop paths through the knowledge graphs.

    2. FRS innovatively introduces feature engineering methods into knowledge graph completion models. In the feature processing layer, entities and relationships are aligned in the same feature space using the idea of a shared-parameter neural network. Experiments show that the feature processing layer alleviates the problem of insufficient interaction and provides a new idea for the KGC task.

    3. Through extensive experiments with FRS, we find that the embedding size and the negative candidate sampling probability affect the experimental results in opposite directions. This finding aids the fine-tuning of entity prediction models.

    4. Unlike entity prediction models with complex networks, FRS can be regarded as a simple three-layer neural network for entity prediction, yet it outperforms state-of-the-art KGC methods.

    This section introduces some of the basic concepts, definitions, and a few abbreviations in this article. In addition, we review different types of entity prediction algorithms and their score functions.

    Notations: Throughout this paper, we use uppercase bold letters to represent matrices (e.g., M, W) and lowercase bold letters to represent vectors (e.g., h, r, t). $\|x\|_{1/2}$ denotes either the $\ell_1$ or the $\ell_2$ norm. Let tanh(x) denote the hyperbolic tangent function and sigmoid(x) the sigmoid function. The score function measures the plausibility of triples.

    Definition 1. (Knowledge Graphs Embedding): a method of knowledge representation learning which embeds entities and relationships of knowledge graphs into continuous vector spaces.

    Definition 2. (Entity Prediction): given <r, t> or <h, r> as input, we treat entity prediction as a ranking problem in which the top-k candidates in the scoring list are the prediction results. The output list must follow this rule: for any two tail entities $e_i$ and $e_j$, if <h, r, $e_i$> exists in the KG and <h, r, $e_j$> does not, then $e_j$ should be placed after $e_i$.

    We summarize the mentioned abbreviations and concepts in this paper in Table 2.

    Table 2.  The important symbols and their definitions.

    Abbreviation or Concept   Definition or Explanation
    E                         a set of entities
    R                         a set of relationships
    <h, r, t>                 a triple, i.e., <head entity, relationship, tail entity>
    KG                        a knowledge graph
    KGE                       knowledge graphs embedding
    KGC                       knowledge graphs completion
    FRS                       Feature Refactoring Scoring
    KRL                       knowledge representation learning


    Entity prediction algorithms can be categorized into distance models and similarity matching models. The main difference is that different models use different scoring functions. The distance model uses a distance-based scoring function, while the similarity matching model uses a similarity-based semantic matching scoring function.

    The distance model usually uses a distance-based scoring function to measure the plausibility of triples. The unstructured model (UM) [17] and structured embedding (SE) [18] are early-stage models; their structures are simple, yet they can achieve good prediction results.

    UM [17] treats all triples as if they shared a single relationship, setting every relationship vector to zero (r = 0), so the scoring function is

    $f_r(h, t) = \|h - t\|_2^2$ (2.1)

    The model is simple and easy to extend; however, it does not distinguish well between the different relationships in a KG. To solve this problem, SE [18] learns relationship-specific matrices for entities. The score function can be defined as:

    $f_r(h, t) = \|M_{r,1} h - M_{r,2} t\|_1$ (2.2)

    where the specific matrices $M_{r,1}, M_{r,2} \in \mathbb{R}^{d \times d}$. SE transforms the entity vectors h and t through the matrices specific to the relationship r and then measures their similarity in the transformed space, which reflects the semantic correlation between head and tail entities in the relationship space. The SE model has a significant drawback: it projects the head and tail entities with two different matrices, which coordinate poorly, so it often cannot accurately describe the semantic relation between the two entities and the relationship. Although the UM and SE models have simple structures and achieved good entity prediction results early on, they are not effective in scenarios that depend directly on the relationship.

    With the development of distributed representations, researchers found that word vectors produced by the word2vec [19] algorithm can capture latent semantic information between words, as shown in the following:

    $v_{\mathrm{Tokyo}} - v_{\mathrm{Japan}} \approx v_{\mathrm{Berlin}} - v_{\mathrm{Germany}}$ (2.3)

    Since triple data carries explicit relationship information, the relationship r can be interpreted as a translation from h to t. This is the main idea of the most representative distance model, TransE [20]. TransE is inspired by translation invariance in the word vector space, where the relationship between words usually corresponds to a translation in the latent feature space. TransE expects that $h + r \approx t$ when measuring the plausibility of a triple. The score function can be defined as:

    $f_r(h, t) = \|h + r - t\|_{1/2}$ (2.4)

    where the norm can be either the $\ell_1$ or the $\ell_2$ norm. Compared with the previous models, TransE has fewer parameters and low computational complexity, and it can directly establish complex semantic relations between entities and relationships. However, because of its simplicity, TransE cannot model complex relationships. TransH [21] alleviates this problem by projecting h and t onto a relationship-specific hyperplane. The score function can be defined as:

    $f_r(h, t) = \left\|(h - w_r^\top h\, w_r) + d_r - (t - w_r^\top t\, w_r)\right\|_2^2$ (2.5)

    where $w_r$ is the normal vector of the relationship-specific hyperplane and $d_r$ is the relationship translation vector. These models performed well for one-to-many relationships. TransH lets entities have different representations under different relationships, but it still assumes that entities and relationships can be represented in a single semantic space, and this simple assumption leads to inaccurate modeling of entities and relationships.
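
    To illustrate the two distance-based scores above, here is a minimal NumPy sketch of Eqs. (2.4) and (2.5); the toy dimension and random vectors are ours, and lower scores indicate more plausible triples.

    ```python
    import numpy as np

    def transe_score(h, r, t, norm=1):
        """TransE score, Eq. (2.4): distance between h + r and t."""
        return np.linalg.norm(h + r - t, ord=norm)

    def transh_score(h, d_r, w_r, t):
        """TransH score, Eq. (2.5): translate on the relation hyperplane.

        w_r is the unit normal of the hyperplane, d_r the translation."""
        h_proj = h - np.dot(w_r, h) * w_r  # project h onto the hyperplane
        t_proj = t - np.dot(w_r, t) * w_r  # project t onto the hyperplane
        return np.linalg.norm(h_proj + d_r - t_proj) ** 2

    k = 8
    h, r, t = np.random.randn(k), np.random.randn(k), np.random.randn(k)
    w_r = np.random.randn(k)
    w_r /= np.linalg.norm(w_r)             # Eq. (2.5) assumes a unit normal
    print(transe_score(h, r, t), transh_score(h, r, w_r, t))
    ```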

    The similarity matching model usually uses a similarity-based semantic matching scoring function: it measures the plausibility of triples by matching the underlying semantic information of entities and relationships embodied in their vector space representations.

    RESCAL [22] is based on three-dimensional tensor factorization. Its score function can be defined as follows:

    $s(h, r, t) = h^\top M_r t$ (2.6)

    where $M_r$ is a $k \times k$ relationship matrix. RESCAL learns the embeddings of entities and relationships by minimizing the tensor reconstruction error and then completes the KG using the scores of the reconstructed tensor. The Neural Tensor Network (NTN) [23] replaces the linear transformation layer of traditional neural networks with a bilinear tensor and links the head and tail entity vectors across dimensions. The score function can be defined as:

    $f_r(h, t) = u_r^\top \tanh\left(h^\top M_r t + M_{r,1} h + M_{r,2} t + b_r\right)$ (2.7)

    where $M_r \in \mathbb{R}^{d \times d \times k}$ denotes a tensor, $M_{r,1}, M_{r,2} \in \mathbb{R}^{k \times d}$ are weight matrices, and $u_r$ is the relationship vector. NTN captures second-order correlations by introducing tensors to extend the single-layer neural network model. However, due to its number of parameters and its complexity, it is difficult to apply to large-scale KGs. ProjE [16] is designed to fill in missing information in KGs with a shared-variable neural network; its authors report that ProjE has a small parameter size and performs well on standard datasets. ProjE defines its function as:

    $h(e, r) = \mathrm{sigmoid}\left(W_c \tanh(e \oplus r) + b_p\right)$ (2.8)

    where $W_c$ is the candidate matrix.

    $e \oplus r = D_e e + D_r r + b_c$ (2.9)

    where $D_e$ and $D_r$ are diagonal matrices. However, sufficient feature processing of the triple data is overlooked before the specific matrix combination operators are applied. SENN [24] integrates the prediction of head entities, relationships, and tail entities into a single neural-network-based framework. The score function of head_pred(r, t) is defined as:

    $s(r, t) = v_h A_E = f\left(\cdots f\left(f([r; t]\, W_{h,1} + b_{h,1})\right) \cdots W_{h,n} + b_{h,n}\right) A_E$ (2.10)

    where f is an activation function, n is the number of neural network layers, the W are weight matrices, and the b are bias terms. SENN improves the results of both relation prediction and entity prediction. However, for a given triple, SENN must compute the head entity prediction label vector and the tail entity prediction label vector separately, which is more complicated than traditional models that use a single prediction label vector for the entity prediction task.

    The similarity matching models tend to combine entities and relationships with simple matrix operators, which is effective, as existing models such as ProjE [16] and RESCAL [22] demonstrate. However, the importance of feature processing for the triple data is overlooked. Feature processing is mainly used to enhance the interaction within triples, i.e., between entities and entities, relationships and relationships, and entities and relationships. Since similarity matching models are based on latent semantic similarity, the interaction among the triple data is key to the entity prediction model. In this paper, interaction relies on feature engineering. There are usually two approaches to feature engineering. One is manual processing, which requires substantial human intervention and a deep understanding of the task to build good features; this is feasible, but for complex tasks it costs considerable manpower and resources. The other is knowledge representation learning [25,26,27], which automatically learns new representations directly from the data through machine learning algorithms and can learn appropriate features for a specific task. In this paper, a variant of the feedforward neural network is used as the feature engineering method to alleviate the problem of insufficient interaction.

    We start from the basic feedforward network model with a single neuron. A feedforward neural network organizes all nodes into successive layers, with each node receiving input from nodes in earlier layers. Given inputs $x_i$, the output can be defined as:

    $y = f\left(\sum_{i=1}^{n} W_i x_i + b\right)$ (3.1)

    where the parameters $W_i$ fit the data, b is a bias term, and f is the activation function. In a feedforward network, the chain structure provides the interaction between layers, and the number of layers is the depth of the network. This composite mapping can be seen as the interaction of feature information. We therefore introduce a variant of the feedforward network, based on shared parameters and residuals [28], designed specifically for the feature representation of entities and relationships. The structure of the F network is shown in Figure 1.

    Figure 1.  The structure of the F network, where n is the number of layers, $x_i$ is the input, and $F^{(n)}(x_i)$ is the output.

    Given the input $x_i$, the output of the $n$th layer, $F^{(n)}(x_i)$, can be defined as:

    $F^{(n)}(x_i) = \sigma\left(F^{(n-1)}(x_i)\, W\right) + x_i$ (3.2)

    where n is the number of layers, σ denotes the mapping between the input data and the shared parameter W, and $x_i$ is a residual term. According to the structure in Figure 1 and Equation (3.2), we get:

    $F^{(1)}(x_i) = f(x_i W) + x_i$ (3.3)
    $F^{(2)}(x_i) = f(f(x_i W) W) + x_i$ (3.4)

    where f represents the activation function. The defining characteristic of the F network is that it uses shared parameters and residuals to perform feature processing on the input data.
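
    The following minimal NumPy sketch (the toy size and initialization are ours) implements the F network exactly as written in Eqs. (3.3) and (3.4): every layer reuses the single shared weight matrix W, and the input is added back once at the end as the residual term.

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def f_network(x, W, n_layers=2, f=sigmoid):
        """F network, Eqs. (3.3)-(3.4): n layers sharing one matrix W,
        with the original input x added back as a residual."""
        out = x
        for _ in range(n_layers):
            out = f(out @ W)   # every layer shares the same matrix W
        return out + x         # single residual connection, as in (3.4)

    k = 6
    W = np.random.randn(k, k) * 0.1   # one k-by-k matrix for all layers
    x = np.random.randn(k)
    print(f_network(x, W).shape)      # (k,)
    ```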

    This section presents the details of the FRS model proposed in this paper and briefly gives the loss function and the negative sampling method.

    A typical entity prediction model usually consists of two steps:

    step 1) Representation of entities and relationships.

    step 2) Definition of similarity scoring function.

    The first step embeds entities and relationships into continuous low-dimensional vector spaces. In the second step, the scoring function measures the plausibility of the triples. Following the structure of a typical entity prediction model, this paper proposes a new neural network model in which entities and relationships are modeled by a three-layer structure with a feature processing layer, a refactoring layer, and a candidate prediction layer. The feature processing layer and the refactoring layer handle the representation of entities and relationships, and the candidate prediction layer then computes the similarity function, in line with the construction steps of a typical entity prediction model. The explanation of FRS is as follows: given two embeddings as input, we treat entity prediction as a ranking problem in which the top-k candidates in the scoring list are the prediction results. To obtain this score list, we rank every possible candidate with a refactoring operator defined by the two input embeddings through the specific feature engineering method. Figure 2 takes the tail entity prediction task as an example: the input is <Leonardo da Vinci, Nationality, ?>, and the candidate entities are <Italy> and <U.S.A>. The yellow and brown nodes are row vectors from the entity embedding matrix, and the green nodes are row vectors from the relationship embedding matrix. It is worth noting that the three-layer network in this paper adopts the idea of shared parameters, which reduces the number of training parameters and alleviates insufficient interaction.

    Figure 2.  FRS architecture for entity prediction, using the tail entity prediction task as an example. The input is <Leonardo da Vinci, Nationality, ?>, and the two tail candidate entities are <Italy> and <U.S.A>. The FRS model can be seen as a three-layer neural network with a feature processing layer, a refactoring layer, and a candidate prediction layer; it outputs a list of scores over the candidate entities. The yellow node represents the head entity, the green node the relationship, and the brown node the tail entity.

    Recent models such as HolE [29] and ProjE [16] have shown that specific matrix combination operators can refactor entities and relationships. However, the importance of feature processing for the triple data is overlooked before those combination operators are applied. Insufficient interaction may limit the performance of a KGC model, so it is necessary to perform feature processing on the input data. To solve the insufficient interaction problem, the feature processing layer is built on the F network. Intuitively, multi-layer F networks learn abstract and general feature information more easily, but since the training burden increases layer by layer, we choose the two-layer F network as the feature processing layer. It can be defined as follows:

    $F(e) = f[f(eW)W] + e$ (4.1)
    $F(r) = f[f(rW)W] + r$ (4.2)

    where f represents the activation function used during training and W is a $k \times k$ square matrix of feature processing weights. In this layer, entities and relationships share the same weight matrix W, while the residual terms e and r are added back.

    From the perspective of interaction, entities and relationships are aligned by the shared parameter W. This alignment can be seen as shared attributes that entities and relationships acquire after passing through the same embedding space: W can be regarded as an embedding space through which entities and relationships complete their interaction. In summary, the feature processing layer uses shared parameters to perform implicit interaction between entities and relationships.

    The middle layer is the refactoring layer. Similar to most KGE models, this layer uses some specific matrix combination operators to combine entities and relationships. It can be defined as:

    $R(e, r) = [F(e) + F(r)]\, V$ (4.3)

    where V is a $k \times k$ refactoring weight matrix. The refactoring layer performs explicit interaction between entities and relationships through a shared parameter: the shared matrix V refactors entities and relationships while completing their interaction.

    The third layer is the candidate prediction layer, which is the output layer. It is based on the fact that the feature representation can capture the semantic similarity of triple data in the knowledge graphs. By operating on the output of the first two layers of FRS, an embedding representation close to the predicted result is obtained, and a similarity calculation is then performed against the real candidate entities. We can define the candidate prediction process as:

    $S(e, r) = f\left[f(R(e, r))\, C^\top + b\right]$ (4.4)

    where C is an $s \times k$ candidate entity matrix, s denotes the number of candidate entities, k denotes the embedding size, f represents the activation functions, and b is the candidate prediction bias. Since the candidates come from the entity set E and the FRS model shares parameters throughout, no additional variables are needed. This layer produces the final prediction results as a scoring list.

    In the algorithm of this paper, although the set of candidate entities does not increase the number of parameters (entity variables are shared), using the full entity set at every training step would lead to a huge amount of computation. Therefore, a candidate sampling method is used to reduce the size of the candidate entity set C. We follow the negative sampling rules of word2vec [19]: the set of candidate entities used for training consists of the entities of all positive cases plus the entities of a portion of the negative cases. We can simply use the binomial distribution $B(1, P_y)$ to indicate whether an entity in a negative instance is selected, where $P_y$ is the probability that the negative case is selected and $1 - P_y$ the probability that it is not. To learn the representation of entities and relationships, we need a loss function that maximizes the plausibility of triples. For a triple and a binary label vector, we can obtain the candidate prediction result from the positive and negative candidates. For all entities we apply a binary label vector in which entities in $E^-$ receive a score of 0 and entities in $E^+$ a score of 1. To maximize the agreement between the binary label vector and the candidate prediction results, we define the loss function accordingly:

    $L(e, r, y) = -\sum_{i \in \{i \mid y_i = 1\}} \log\left(S(e, r)_i\right) - \sum_{j=1,\ E_j \sim P_n}^{s} \log\left(1 - S(e, r)_j\right)$ (4.5)

    In the binary label vector y over the candidate set C, $y_i = 1$ indicates a positive label; s denotes the number of negative samples drawn as $E_j \sim P_n$. With these settings, the ranking score of the $i$th candidate entity is:

    $S(e, r)_i = \mathrm{sigmoid}\left[\tanh(R(e, r))\, C[i,:]^\top + b\right]$ (4.6)
    $R(e, r) = [F(e) + F(r)]\, V$ (4.7)
    $F(e) = \mathrm{sigmoid}[\mathrm{sigmoid}(eW)W] + e$ (4.8)
    $F(r) = \mathrm{sigmoid}[\mathrm{sigmoid}(rW)W] + r$ (4.9)
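
    Putting Eqs. (4.5)-(4.9) together, a minimal NumPy sketch of one FRS forward pass and its loss might look as follows; the toy shapes, random initialization, and small epsilon guard against log(0) are ours.

    ```python
    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def frs_scores(e, r, W, V, C, b):
        """One FRS forward pass following Eqs. (4.6)-(4.9).

        W: k x k shared feature-processing weights, V: k x k refactoring
        weights, C: s x k candidate-entity matrix, b: candidate bias.
        Returns a score in (0, 1) for each of the s candidates."""
        F_e = sigmoid(sigmoid(e @ W) @ W) + e   # Eq. (4.8)
        F_r = sigmoid(sigmoid(r @ W) @ W) + r   # Eq. (4.9)
        R = (F_e + F_r) @ V                     # Eq. (4.7)
        return sigmoid(np.tanh(R) @ C.T + b)    # Eq. (4.6), all i at once

    def frs_loss(scores, y):
        """Eq. (4.5): cross-entropy over positive (y = 1) and sampled
        negative (y = 0) candidates."""
        eps = 1e-12
        pos = -np.sum(np.log(scores[y == 1] + eps))
        neg = -np.sum(np.log(1.0 - scores[y == 0] + eps))
        return pos + neg

    k, s = 8, 5                                 # toy sizes
    e, r = np.random.randn(k), np.random.randn(k)
    W = np.random.randn(k, k) * 0.1
    V = np.random.randn(k, k) * 0.1
    C, b = np.random.randn(s, k), 0.0
    y = np.array([1, 0, 0, 1, 0])               # binary candidate labels
    scores = frs_scores(e, r, W, V, C, b)
    print(scores.shape, frs_loss(scores, y))
    ```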

    We use two evaluation metrics: the average rank of all correct entities (Mean Rank) and the proportion of correct entities appearing within the top-k elements (Hits@k). These metrics are called Raw when the other known correct answers are not taken into account during evaluation; when they are, the metrics become Filtered. For example, suppose the input triple is <Italy, Contained_by, ?>, the target entity is Toscana, and the entity prediction task returns the top-2 list Florence, Toscana. The Raw Mean Rank and Hits@1 would be 2 and 0, respectively, while the Filtered Mean Rank and Filtered Hits@1 would both be 1, because the Filtered setting ignores the other candidate entities even though they are themselves correct.
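
    The Raw versus Filtered computation can be sketched as follows, reproducing the Toscana example above; the helper function and toy scores are ours.

    ```python
    import numpy as np

    def rank_metrics(scores, target, known_true, k=10, filtered=False):
        """Rank of the target entity among the scored candidates.

        scores: candidate scores, higher = more plausible.
        known_true: other entities already known to complete the triple;
        the Filtered setting removes them before ranking."""
        order = list(np.argsort(-scores))            # best first
        if filtered:
            order = [i for i in order
                     if i == target or i not in set(known_true)]
        rank = order.index(target) + 1
        return rank, int(rank <= k)                  # (rank, Hits@k)

    scores = np.array([0.9, 0.8, 0.1])   # entity 0: Florence, 1: Toscana
    print(rank_metrics(scores, target=1, known_true=[0], k=1))
    # (2, 0): Raw rank 2, Raw Hits@1 = 0
    print(rank_metrics(scores, target=1, known_true=[0], k=1, filtered=True))
    # (1, 1): Filtered rank 1, Filtered Hits@1 = 1
    ```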

    For the FRS experiments, the remaining Adam parameter settings are: β1 = 0.9, β2 = 0.999, and ε = 1e-8. All experiments in this paper run for 100 epochs, and all parameters are initialized from the uniform distribution $U[-6/\sqrt{k}, 6/\sqrt{k}]$. The hyperparameter settings are: dropout probability $p_d$ = 0.5, negative sampling probability $p_n$ = 0.5, embedding size k = 200, and minibatch size b = 200. We manually set the learning rate schedule using the idea of learning rate decay during training, with an initial learning rate of 0.01.
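
    The setup above can be sketched as follows; the FB15K sizes come from Table 3, while the per-epoch decay factor is a hypothetical placeholder, since the text only states that the learning rate schedule was set manually from an initial value of 0.01.

    ```python
    import numpy as np

    k = 200                                       # embedding size
    bound = 6.0 / np.sqrt(k)                      # U[-6/sqrt(k), 6/sqrt(k)]
    n_entities, n_relations = 14951, 1345         # FB15K sizes (Table 3)

    # All trainable tensors are drawn from the stated uniform distribution.
    entity_emb = np.random.uniform(-bound, bound, (n_entities, k))
    relation_emb = np.random.uniform(-bound, bound, (n_relations, k))
    W = np.random.uniform(-bound, bound, (k, k))  # shared F-network weights
    V = np.random.uniform(-bound, bound, (k, k))  # refactoring weights

    p_dropout, p_negative = 0.5, 0.5              # dropout / neg. sampling
    lr, epochs, batch_size = 0.01, 100, 200
    for epoch in range(epochs):
        lr *= 0.99   # hypothetical decay factor; the paper tunes this by hand
    ```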

    This subsection focuses on three questions. First, what are the main similarities and differences between FRS and other KGC models? Second, how effective is FRS compared to other KGC models under traditional experimental settings and the same benchmarks? Third, what are the contributions and limitations of the FRS model proposed in this paper? In the tables, MR and FMR refer to Mean Rank and Filtered Mean Rank, respectively, and the capital letter F indicates Filtered results. The two tables contain different sets of models because the baseline results are taken from the original papers, some of which experiment with only one dataset, either FB15K or WN18; the tables strictly follow the originally reported results. Table 3 shows the statistics of FB15K and WN18. Table 4 shows the evaluation results on FB15K. Table 5 shows the evaluation results on WN18.

    Table 3.  Statistics of the experimental datasets.

    Dataset   Entities   Relationships   Training   Valid    Test
    FB15K     14,951     1,345           483,142    50,000   59,071
    WN18      40,943     18              141,442    5,000    5,000

    Table 4.  Entity prediction results of different models on FB15K.

    Model                   MR      FMR     Hits@10 (%)   FHits@10 (%)
    Unstructured [17]       1074    979     4.5           6.3
    SE [18]                 273     162     39.8          28.8
    TransE [20]             243     125     34.9          47.1
    TransH [21]             212     87      45.7          64.4
    TransR [30]             198     77      48.2          68.7
    TEKE_H [31]             212     108     51.2          73.0
    KG2E [32]               174     59      48.9          74.0
    TransD [33]             194     91      53.4          77.3
    lppTransD [34]          195     78      53.0          78.7
    SSP [35]                163     82      57.2          79.0
    TranSparse [36]         187     82      53.5          79.5
    TransG [37]             203     98      52.8          79.8
    TranSparse-DT [38]      188     79      53.9          80.2
    PTransE-RNN [39]        242     92      50.6          82.2
    PTransE-ADD [39]        207     58      51.4          84.6
    ProjE_pointwise [16]    174     104     56.5          86.6
    FRS (ours)              110.8   41.9    58.8          89.1

    Table 5.  Entity prediction results of different models on WN18.

    Model                   MR      FMR     Hits@10 (%)   FHits@10 (%)
    Unstructured [17]       315     304     35.3          38.2
    SE [18]                 1011    985     68.5          80.5
    TransE [20]             263     251     75.4          89.2
    TransH [21]             401     303     73.0          86.7
    TransR [30]             238     225     79.8          92.0
    KG2E [32]               342     331     80.2          92.8
    TEKE_H [31]             127     114     80.3          92.9
    TransD [33]             224     212     79.6          92.2
    SSP [35]                168     156     81.2          93.2
    TranSparse [36]         223     211     80.1          93.2
    TransG [37]             483     470     81.4          93.3
    lppTransD [34]          283     270     80.5          94.3
    TranSparse-DT [38]      234     221     81.4          94.3
    FRS (ours)              112.9   103.8   85.4          97.2


    We discuss the performance in detail to provide more insight into FRS. The results in the tables are sorted in ascending order of FHits@10. Except for FRS, all results in Tables 4 and 5 are taken from the originally published papers. Since the literature does not separate head entity prediction from tail entity prediction, the results reported for our model are likewise the better-performing set. All KGC models, including FRS, use low-dimensional embedding vectors to represent the entities and relationships in the knowledge graphs. FRS differs from the other models in the tables in that it innovatively introduces feature engineering into knowledge graph completion tasks and proposes a subtle feature processing method, the F network. As can be seen from Tables 4 and 5, Hits@10 rises while Mean Rank declines. Since Mean Rank is always at least 1 and the Hits@10 score always lies between 0.0 and 1.0, a lower Mean Rank and a higher Hits@10 score indicate better entity prediction performance. The model's advantage on these metrics is even more pronounced on the WN18 dataset, which may be because WN18 contains fewer relationship types among its fact triples, affecting the learning ability of the models. Although some models succeed only under some evaluation protocols, FRS achieves the best performance under all four evaluation protocols on both FB15K and WN18. This supports the idea that sufficient interaction should be implemented in KGC models. The FRS model alleviates the problem of insufficient interaction in KGC tasks and achieves the best prediction results without introducing redundant parameters and variables, thanks to its use of shared parameters.

    Hits@K measures whether correct entities appear within the top-k elements: the higher the Hits@K, the better the entity prediction performance. To better demonstrate the performance of FRS in terms of Hits@K, Table 6 reports the experimental results against representative baseline methods for these fine-grained evaluation indicators. As can be seen from Table 6, FRS consistently outperforms all baselines on all three indicators, demonstrating its effectiveness and superiority. On Hits@1, Hits@3, and Hits@10, FRS is 3.8, 2.7, and 2.1 percentage points higher, respectively, than the best baseline result.

    Table 6.  Experimental results of entity prediction in terms of Hits@{1, 3, 10} on FB15K.

    Model           Hits@1 (%)   Hits@3 (%)   Hits@10 (%)
    TransR [30]     21.8         40.4         58.2
    RESCAL [22]     23.5         40.9         58.7
    TransE [20]     29.7         57.8         74.9
    HolE [29]       40.2         61.3         73.9
    ComplEx [40]    59.9         75.9         84.0
    ProjE [16]      72.1         81.0         86.6
    SENN [24]       65.9         79.2         87.0
    FRS (ours)      75.9         83.7         89.1


    For our model, two pivotal hyperparameters influence the evaluation results: the embedding size k and the negative candidate sampling probability $p_n$. In this section, we examine the impact of these two hyperparameters on model performance using the control variable method. We choose the KGE model ProjE [16] as the reference for analyzing hyperparameter effects, because before introducing feature engineering, FRS combines entities and relationships with a specific matrix operation, consistent with the idea of ProjE. Because the FB15K dataset contains more relationship categories, it is more suitable for objectively analyzing the model and the impact of the parameters, so the experiments in this section use FB15K. Figure 3 shows the effect of the embedding size on FB15K; Figure 4 shows the effect of the negative candidate sampling probability on FB15K.

    Figure 4.  The effect of probability for negative candidate sampling. P refers to the ProjE model, and F refers to the FRS model.

    It can be seen from Figure 3(a) that both MR and FMR trend downward as the embedding size k increases, with FRS decreasing more than ProjE. Figure 3(b) shows that Hits@10 and FHits@10 both trend upward as the embedding size k increases, with FRS rising more than ProjE. We can therefore conclude that as the embedding size increases, the entity prediction performance of the KGE models improves, and Mean Rank is more sensitive to this factor than Hits@K. At the same embedding size, the prediction performance of FRS is better than that of ProjE. Both models change sharply at embedding sizes 100, 200, and 300, and prediction performance no longer improves significantly once the embedding size exceeds 400. This shows that an appropriate embedding size makes the model more expressive; however, once the embedding size grows beyond a certain threshold, the network learns unimportant features or even noise, which hurts performance and requires more computing resources.

    As can be seen from Figure 4(a), both MR and FMR rise as the probability $p_n$ increases, and Figure 4(b) shows that Hits@10 and FHits@10 both decrease. As $p_n$ increases, both FRS and ProjE are negatively affected, but FRS is less damaged than ProjE. Even under this negative influence, FRS still predicts better than ProjE, indicating that the model is more robust. From the experimental results in Figures 3 and 4, we can conclude that the embedding size and the negative sampling probability affect the entity prediction results in opposite directions, and that the embedding size has a greater influence on the model than the negative sampling probability. This also shows that we should restore as much semantic information as possible in the embedding space, that is, provide sufficient interaction between entities and relationships. In further experiments, FRS always predicted better than ProjE, even when the parameter settings affected the results unfavorably.

    In this paper, a shared-parameter knowledge graph embedding model called FRS is proposed for entity prediction tasks. FRS innovatively introduces the feature engineering method into entity prediction, which alleviates the problem of insufficient interaction in KGE models. In particular, the F network proposed in this paper aligns entities and relationships in the same feature space. Experiments show that FRS, with its feature processing layer, achieves consistently good prediction performance compared with traditional KGE models, and that its performance remains better whether the hyperparameter influence is favorable or unfavorable. The FRS model requires no pre-training, is self-sufficient over length-1 relationships, and obtains the best entity prediction results with a simple three-layer network. Although FRS uses shared parameters, a large number of parameters remain in the training process. Future work will investigate how to alleviate the problem of insufficient interaction while reducing the number of training parameters. Moreover, we will consider the temporal and logical relations between the components of triples, for example by applying RNNs or LSTMs.

    This work was supported in part by the National Natural Science Foundation of China under Grant 61373120. The work of this paper was supported by the Electronic Service Technology and Engineering Lab, Northwestern Polytechnical University, through a Ph.D. Scholarship.

    All authors declare no conflicts of interest in this paper.

  • © 2016 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)
