    X-ray security screening is a preeminent safeguard for public safety in venues such as airports, railway stations, undergrounds, post offices, and customs. To this end, X-ray scanning machines are deployed to detect and expose prohibited items concealed within baggage, luggage, or cargo. Human operators play a pivotal role in threat screening [1]. The expertise and proficiency of inspectors are indispensable for the accurate detection of threats; nevertheless, other factors, such as emotional fluctuation and physical fatigue, tend to divert their concentration. The need for a real-time, dependable, and automatic method for security screening is becoming increasingly pressing owing to the challenging conditions encountered during the checking process. Specifically, the random arrangement and extensive overlapping of contents in luggage lead to highly occluded images, which significantly complicates the task of identifying prohibited items amidst cluttered backgrounds for inspectors [2].

    Despite growing interest in computer-assisted techniques to enhance the alertness and detection capabilities of screeners, this area has been understudied owing to the scarcity of extensive datasets and sophisticated deep-learning algorithms. Previous studies have primarily focused on conventional image analysis [3,4,5] and machine learning. Most recently, deep-learning methods, especially convolutional neural networks (CNN), have demonstrated superior performance over conventional machine-learning methods for threat object classification [6,7,8], detection [9,10], and segmentation [11,12]. Nevertheless, most methods achieve high accuracy and recall only on specific datasets.

    The principle underlying X-ray imaging is the penetration of an X-ray beam through a scanned object and its subsequent detection using a photoelectric sensor. The detected X-ray intensity decreases with the density of the object material, thereby enabling determination of the internal structure of the object. The attenuation follows $I_x = I_0 e^{-\mu x}$, where $I_0$ is the initial intensity, $I_x$ is the attenuated intensity at depth $x$ cm, and $\mu$ is the linear attenuation coefficient [13]. The formulation implies that $I_x$ depends on the object thickness $x$ and the nature of its material. The influence of the object thickness is mitigated through the utilization of dual- or multi-energy imaging technology, which enables the determination of the object's density and properties, in particular, the effective atomic number $Z_{eff}$. Owing to the sensitivity of human perception to color, the density and effective atomic number are converted to pseudo-color images or other color modes using lookup tables. The process of generating X-ray inspection images is depicted in Figure 1. In pseudo-color mode, different materials exhibit different colors during X-ray imaging. Conventionally, blue, orange, and green refer to inorganic, organic, and mixed materials, respectively.
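    To make the attenuation model concrete, the following minimal Python sketch evaluates the Beer-Lambert law for two materials; the coefficients are hypothetical illustrative values, not measurements from this paper.

```python
import numpy as np

def attenuated_intensity(i0, mu, x):
    """Beer-Lambert law: I_x = I_0 * exp(-mu * x), with mu in 1/cm
    and thickness x in cm."""
    return i0 * np.exp(-mu * x)

# Hypothetical coefficients: a dense metal attenuates far more strongly
# than an organic material of the same thickness, which is why metals
# and organics map to different colors in pseudo-color mode.
print(attenuated_intensity(1.0, mu=0.2, x=5.0))  # organic-like: ~0.37
print(attenuated_intensity(1.0, mu=1.5, x=5.0))  # metal-like:   ~0.00055
```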

    Figure 1.  Scenario of X-ray image generation; the images can be displayed in four color modes: (a) pseudo, (b) gray, (c) hot-jet, and (d) reversed.

    Data generation is an effective strategy for improving the performance of deep-learning methods, particularly in the context of prohibited-object detection during X-ray security inspections, because acquiring and annotating a large number of X-ray images in a real-world setting is both expensive and time-consuming. Nevertheless, in recent years, researchers have made tremendous efforts to create large-scale X-ray inspection datasets; commonly cited benchmarks include GDXray, UCL TIP, SIXray, and OPIXray [14,15,16,17]. With the flourishing of contemporary generative adversarial network (GAN) technology, its outstanding image-generation capacity has received wide attention, and various attempts have been made to synthesize X-ray images of prohibited items using GANs. For instance, Yang et al. [18] proposed a generative model based on improvements to WGAN-GP to generate ten classes of prohibited items using real images. Zhu et al. [19] attempted to synthesize images using SAGAN and CycleGAN to enrich the diversity of the threat image database, although the quality of the generated images could be improved. Liu et al. [20] developed a comprehensive GAN-based framework to synthesize X-ray inspection images for data enhancement.

    Inspired by the concept of generative radiance fields (GRAF), we propose stylized generative radiance fields (SGRF), which enable controllable synthesis of 3D-aware images. The framework of the model is shown in Figure 2 and is described in detail in the following sections. The model can generate images of prohibited items based on label prompts and serves as an implicit representation database, producing a large number of images with rich diversity in class, color mode, pose, etc. More specifically, we make the following contributions.

    Figure 2.  The framework of the stylized generative radiance fields (SGRF).

    ● We implemented a stylized generative radiance field to learn the implicit representation of multiple objects with different color styles using a single multilayer perceptron (MLP).

    ● We attained explicit control over the generation of 3D-aware images by entangling and disentangling class and style labels into/from random latent codes.

    ● We utilized the model to significantly augment the X-ray inspection datasets and effectively increase the generalization ability of state-of-the-art detection algorithms in security screening.

    The rest of this paper is organized as follows. Section 2 presents a literature review of 2D and 3D image synthesis methods. Section 3 describes the stylized generative radiance field methodology. Section 4 describes the experimental setup and details. Section 5 presents the experimental results for the proposed model. Section 6 concludes the paper and provides recommendations for future research.

    GAN: In 2014, Goodfellow et al. [21] introduced GANs, deep generative models inspired by game theory. During training, generator G and discriminator D compete to reach a Nash equilibrium. The principle of G is to generate fake data that fit the potential distribution of the real data as closely as possible, whereas the principle of D is to correctly distinguish real data from fake data. The input of G is a random noise vector z (usually drawn from a uniform or normal distribution). The noise is mapped to a new data space via G to obtain a fake sample G(z), which is a multidimensional vector. The discriminator D is a binary classifier that takes both real samples from the dataset and fake samples generated by G as input, and its output represents the probability that a sample is real rather than fake. The architecture of a GAN is illustrated in Figure 3.

    Figure 3.  The GAN comprises a generator (G) and a discriminator (D), where z is the input noise, X is the real image, and G(z) is the fake image.
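    A minimal PyTorch sketch of this two-player game is given below; the network sizes, learning rates, and data shapes are our own illustrative assumptions, not those of the cited papers.

```python
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 784), nn.Tanh())
D = nn.Sequential(nn.Linear(784, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))
bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

def train_step(x_real):
    ones = torch.ones(x_real.size(0), 1)
    zeros = torch.zeros(x_real.size(0), 1)
    z = torch.randn(x_real.size(0), 64)
    x_fake = G(z)
    # D is trained to score real samples as 1 and fake samples G(z) as 0.
    loss_d = bce(D(x_real), ones) + bce(D(x_fake.detach()), zeros)
    opt_d.zero_grad()
    loss_d.backward()
    opt_d.step()
    # G is trained to make D score its samples as real.
    loss_g = bce(D(x_fake), ones)
    opt_g.zero_grad()
    loss_g.backward()
    opt_g.step()
```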

    Following this principle, various GANs have been developed for image generation and translation [22,23]. Initially, GANs adopted fully connected MLPs as the generator and discriminator. Taking advantage of CNNs, Radford et al. [24] proposed the deep convolutional generative adversarial network (DCGAN), which achieved superior performance in image generation. Because random latent vectors are used as inputs, unrestricted variables may cause the training process to collapse. To address this issue, conditional GANs, such as CGAN, ACGAN, and InfoGAN [25,26,27], incorporate conditional variables (including labels, text, or other relevant data) into both the generator and discriminator. These modifications result in a more robust training process and the ability to generate images based on specific conditions. Moreover, considerable effort has been devoted to optimizing the objective function to stabilize GAN training. For example, Arjovsky et al. [28] proposed the Wasserstein generative adversarial network (WGAN), first showing theoretically that the Earth-Mover (EM) distance produces better gradient behavior than other distance metrics. Gulrajani et al. [29] presented a gradient penalty, WGAN-GP, to enforce the Lipschitz constraint, which performs better than the original WGAN. Petzka et al. [30] introduced a new penalty term, WGAN-LP, to enforce the Lipschitz constraint.
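    For reference, the gradient penalty of WGAN-GP [29] can be sketched as follows; this is a generic rendering of the published idea (assuming flat feature inputs), not code from any of the cited works.

```python
import torch

def gradient_penalty(D, x_real, x_fake, lambda_gp=10.0):
    """WGAN-GP: penalize deviation of the critic's gradient norm from 1
    on random interpolations between real and fake samples."""
    eps = torch.rand(x_real.size(0), 1, device=x_real.device)
    x_hat = (eps * x_real + (1 - eps) * x_fake.detach()).requires_grad_(True)
    grad, = torch.autograd.grad(D(x_hat).sum(), x_hat, create_graph=True)
    return lambda_gp * ((grad.flatten(1).norm(2, dim=1) - 1) ** 2).mean()
```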

    Image translation, or converting images from one domain to another, is another primary task of GANs. Isola et al. [31] utilized the pix-to-pix GAN to perform image adaptation between pairs of images. Subsequently, the HD pix-to-pix GAN [32] enhanced the quality and resolution of the generated images up to 2048 × 1024 pixels. Owing to the difficulty of obtaining paired data in real-world scenarios, CycleGAN, DiscoGAN, and DualGAN [33,34,35] employ a cycle-consistency term to train the model using unpaired data. Choi et al. [36] proposed StarGAN, which accomplishes image translation across multiple domains using a single model.

    To date, contemporary GAN technology has achieved outstanding performance in synthesizing high-resolution photorealistic 2D images; however, GANs cannot synthesize novel views of objects, and the generated images are unable to maintain 3D consistency.

    NERF: Neural radiance fields [37] are powerful tools for learning implicit representations of 3D scenes, where the scene is represented as a continuous field stored in a neural network. A neural radiance field can render high-fidelity images from any perspective after training on a set of posed images. The implicit rendering process is illustrated in Figure 4. The inputs are the ray origin o, direction d(θ,ϕ), and the corresponding coordinates (x,y,z) of points along the ray from a given perspective; these are fed into the neural radiance field to obtain the volume density σ and color (r,g,b), and the final image is obtained through volume rendering. To obtain sharp images, positional encoding maps the 5D input to a high-dimensional space so that the network can represent high-resolution geometry and appearance. Additionally, a coarse-to-fine strategy is adopted for hierarchical volume sampling to increase rendering efficiency.

    Figure 4.  Implicit rendering process of NERFs, where x = (x,y,z) refers to the 3D location of a point, o and d(θ,ϕ) refer to the origin and direction of the ray, and c = (r,g,b) and σ refer to the color and density of the rendered point, respectively.

    NERF has achieved impressive success in view synthesis, image processing, controllable editing, digital humans, and multimodal applications. Its limitations are slow training and rendering, restriction to static scenes, lack of generalization, and the requirement for a large number of perspectives. To address these issues, Garbin et al. [38] proposed FastNeRF, which can render high-fidelity, realistic images at 200 Hz on high-end consumer GPUs. Li et al. [39] presented neural scene flow fields, which extend NERF to dynamic scenes by learning implicit representations of scenes from monocular videos. Because NERF requires retraining for each new scene and cannot be extended to unseen scenes, works such as pixelNeRF and IBRNet [40,41] address generalization across scenes. Moreover, NERF stores one scene in one fully connected multilayer perceptron (MLP), which restricts the representation of multiple scenes in a single model.

    GRAF: Taking advantage of both GAN and NERF, GRAF [42] designed a generative variant of NERF by integrating a neural field into a GAN generator. The model can generate 3D-aware images from a latent code (noise) and can be trained using a set of unposed images. In addition to changing the perspective, GRAF also allows modification of the shape and appearance of the generated objects. Although GRAF has demonstrated remarkable capabilities in generating high-resolution images with 3D consistency, its performance in more complex real-world settings is limited owing to its inability to effectively handle multi-object scenes. GIRAFFE [43] represents scenes as compositional neural feature fields that can control the camera pose and the position and angle of objects placed in the scene, as well as the shape and appearance of objects. GIRAFFE is a combination of multiple MLP networks, each of which represents a scene or an object. Our goal was to utilize a single MLP to store various objects or scenes. Inspired by these studies, we designed stylized generative radiance fields to accomplish this objective. Using this model, we exercise explicit control over the generation of objects based on class and style labels.

    GAN is a generative model primarily utilized for image generation and translation tasks, but the synthesized images are unable to maintain multi-view consistency. By contrast, NERFs enable the generation of 3D-aware images owing to the inherent nature of the radiance field. GRAF, a combination of GAN and NERF, has proven successful for novel view synthesis from a random latent code and also allows modification of the shape and appearance of the generated object based on the latent code. However, the shapes and appearances of the objects are generated randomly, without explicit control. Moreover, the GRAF prototype employs one MLP to learn the implicit representation of one object or scene, leading to high memory consumption in multi-scene cases. Therefore, we argue for representing multiple scenes using a single MLP and generating objects with explicit control through the SGRF, which evolved from GRAF. In the following sections, we briefly review the NERF and GRAF models, which form the basis of our model.

    Neural radiance fields [37] provide the foundation for various NERF-derived models. Their contributions include the implicit representation of 3D geometry and differentiable volume rendering.

    Implicit representation: Radiance fields provide implicit representations of 3D geometry. A radiance field is a continuous function that takes as input a 3D location x = (x,y,z) and viewing direction d = (θ,ϕ), and outputs an emitted color c = (r,g,b) and a volume density σ. The mapping from input to output is implemented using an MLP network $F_\Theta : (\mathbf{x}, \mathbf{d}) \to (\mathbf{c}, \sigma)$, whose weights are optimized to map each input 5D coordinate to its corresponding volume density and directional emitted color. A network $F_\Theta$ operating directly on the 5D input coordinates performs poorly at representing high-frequency variations in color and geometry because deep networks are biased toward learning low-frequency functions. Positional encoding is therefore used to map a 3D location $\mathbf{x} \in \mathbb{R}^3$ and viewing direction $\mathbf{d} \in \mathbb{S}^2$ into a higher-dimensional space, enabling the MLP to more easily approximate a higher-frequency function. Formally, the positional encoding function is defined in Eq (3.1) [37].

    $\gamma(p) = \left[\sin(2^{0}\pi p), \cos(2^{0}\pi p), \sin(2^{1}\pi p), \cos(2^{1}\pi p), \ldots, \sin(2^{L-1}\pi p), \cos(2^{L-1}\pi p)\right] \quad (3.1)$

    This function γ(·) is applied separately to each of the three coordinate values in x and to the three components of the Cartesian viewing-direction unit vector d. In the experiments, L = 10 for γ(x) and L = 4 for γ(d). In Eq (3.2) [37], the MLP network $F_\Theta(\cdot)$ maps the resulting features to a color value $\mathbf{c} \in \mathbb{R}^3$ and volume density $\sigma \in \mathbb{R}^{+}$.

    $F_\Theta : \mathbb{R}^{L_x} \times \mathbb{R}^{L_d} \to \mathbb{R}^3 \times \mathbb{R}^{+}, \quad (\gamma(\mathbf{x}), \gamma(\mathbf{d})) \mapsto (\mathbf{c}, \sigma) \quad (3.2)$
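    A PyTorch sketch of the positional encoding of Eq (3.1), with L = 10 for locations and L = 4 for directions as above (batch shapes are assumptions for illustration):

```python
import math
import torch

def positional_encoding(p, L):
    """Eq (3.1): map each coordinate of p to
    [sin(2^0*pi*p), cos(2^0*pi*p), ..., sin(2^(L-1)*pi*p), cos(2^(L-1)*pi*p)]."""
    freqs = (2.0 ** torch.arange(L)) * math.pi       # (L,)
    angles = p.unsqueeze(-1) * freqs                 # (..., 3, L)
    enc = torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
    return enc.flatten(start_dim=-2)                 # (..., 3 * 2L)

x = torch.rand(1024, 3)                              # 3D locations
d = torch.rand(1024, 3)                              # viewing directions
gamma_x = positional_encoding(x, L=10)               # (1024, 60) features
gamma_d = positional_encoding(d, L=4)                # (1024, 24) features
```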

    Volume rendering: To render a 2D image from the radiance field $F_\Theta(\cdot)$, the volume density σ(x) can be interpreted as the differential probability of a ray terminating at an infinitesimal particle at location x. The expected color C(r) of the camera ray r(t) = o + td, with near and far bounds $t_n$ and $t_f$, is calculated using Eq (3.3) [37].

    $C(\mathbf{r}) = \int_{t_n}^{t_f} T(t)\,\sigma(\mathbf{r}(t))\,\mathbf{c}(\mathbf{r}(t), \mathbf{d})\,dt, \quad \text{where } T(t) = \exp\left(-\int_{t_n}^{t} \sigma(\mathbf{r}(s))\,ds\right). \quad (3.3)$

    Rendering a view from a continuous NERF requires estimating the integral C(r) for a camera ray traced through each pixel. Numerically estimating this continuous integral with deterministic quadrature would limit the resolution of the representation, because the MLP would only be queried at a fixed discrete set of locations. Instead, a stratified sampling approach is used to partition $[t_n, t_f]$ into N evenly spaced bins and draw one sample uniformly at random from each bin. This approximation estimates the integral from a discrete set of samples. In Eq (3.4) [37], let $\{(\mathbf{c}_r^i, \sigma_r^i)\}_{i=1}^{N}$ denote the color and volume density values of N random samples along camera ray r. The rendering function π(·) maps these values to the pixel color $\mathbf{c}_r$, which is calculated using Eq (3.5) [37].

    $\pi : (\mathbb{R}^3 \times \mathbb{R}^{+})^N \to \mathbb{R}^3, \quad \{(\mathbf{c}_r^i, \sigma_r^i)\} \mapsto \mathbf{c}_r, \quad (3.4)$
    $\mathbf{c}_r = \sum_{i=1}^{N} T_r^i\,\alpha_r^i\,\mathbf{c}_r^i, \quad T_r^i = \prod_{j=1}^{i-1}\left(1 - \alpha_r^j\right), \quad \alpha_r^i = 1 - \exp\left(-\sigma_r^i \delta_r^i\right), \quad (3.5)$

    where $T_r^i$ and $\alpha_r^i$ denote the transmittance and alpha value of sample point i along ray r, and $\delta_r^i = \|\mathbf{x}_r^{i+1} - \mathbf{x}_r^i\|_2$ is the distance between neighboring sample points. The network $F_\Theta$ is trained with a set of posed images by minimizing the reconstruction loss between observations and predictions. In addition, a hierarchical volume sampling strategy is applied to improve rendering efficiency.
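    The discrete compositing of Eq (3.5) reduces to a cumulative product along each ray, as in the following PyTorch sketch (tensor shapes are illustrative assumptions):

```python
import torch

def composite(colors, sigmas, deltas):
    """Eq (3.5): alpha-composite N samples along each of R rays.
    colors: (R, N, 3); sigmas, deltas: (R, N)."""
    alpha = 1.0 - torch.exp(-sigmas * deltas)                    # alpha_r^i
    # Transmittance T_r^i = prod_{j<i} (1 - alpha_r^j), with T_r^1 = 1.
    trans = torch.cumprod(
        torch.cat([torch.ones_like(alpha[:, :1]), 1.0 - alpha[:, :-1]], dim=1),
        dim=1)
    weights = trans * alpha
    return (weights.unsqueeze(-1) * colors).sum(dim=1)           # (R, 3)
```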

    The GRAF [42] comprises a radiance field-based generator and a multi-scale patch discriminator. It differs from NERF in that it attempts to learn a model using unposed images rather than posed images.

    The generator takes the camera intrinsic matrix K, camera pose ξ, 2D sampling pattern ν, and shape/appearance codes $z_s$/$z_a$ as inputs, and predicts an image patch P′. The camera pose ξ = [R|t] is sampled from a pose distribution $p_\xi$, and ν = (u, s) determines the center $\mathbf{u} \in \mathbb{R}^2$ and scale $s \in \mathbb{R}^{+}$ of the virtual K×K patch, both drawn from uniform distributions. The shape and appearance variables $z_s$ and $z_a$ are drawn from the shape and appearance distributions $p_s$ and $p_a$, respectively.

    Ray sampling: The K×K patch P(u,s) is determined by a set of 2D image coordinates that describe the location of every pixel of the patch in the image domain Ω. As given in Eq (3.6) [42], the corresponding 3D rays are uniquely determined by P(u,s), the camera pose ξ, and the intrinsic matrix K.

    $\mathcal{P}(\mathbf{u}, s) = \left\{ (s x + u,\; s y + v) \;\middle|\; x, y \in \left\{ -\tfrac{K}{2}, \ldots, \tfrac{K}{2} - 1 \right\} \right\} \quad (3.6)$

    3D point sampling: As in NERF, stratified sampling is used to sample N points $\{\mathbf{x}_r^i\}_{i=1}^{N}$ along each ray r for the numerical integration of the radiance field.
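    A sketch of the stratified sampler (near/far bounds and counts are placeholder values):

```python
import torch

def stratified_samples(t_near, t_far, n_rays, n_samples):
    """Partition [t_near, t_far] into n_samples evenly spaced bins and
    draw one uniform random sample per bin for each ray."""
    edges = torch.linspace(t_near, t_far, n_samples + 1)
    lower, upper = edges[:-1], edges[1:]
    u = torch.rand(n_rays, n_samples)                # one draw per bin
    return lower + u * (upper - lower)               # (n_rays, n_samples)

t = stratified_samples(t_near=2.0, t_far=6.0, n_rays=1024, n_samples=64)
```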

    Conditional radiance fields: The radiance field is implemented as a fully connected neural network with parameters θ. In contrast to a regular radiance field, it is conditioned on two latent codes: a shape code $z_s$ and an appearance code $z_a$, which determine the object's shape and appearance. As shown in Figure 5, the shape encoding h is derived from the positional encoding γ(x) and the shape code $z_s$ and is then transformed to the volume density σ by the density head $\sigma_\theta$. The volume density is computed independently of the viewpoint d and appearance code $z_a$ to disentangle shape from appearance during inference. To predict the color c, the concatenation of the shape encoding h, the positional encoding γ(d), and the appearance code $z_a$ is passed to the color head $c_\theta$.

    Figure 5.  A conditional radiance field $g_\theta$, with shape encoding $h_\theta$, color head $c_\theta$, and density head $\sigma_\theta$. Here, $z_s$/$z_a$ refer to the shape/appearance codes.

    Volume rendering: Given the colors and volume densities $\{(\mathbf{c}_r^i, \sigma_r^i)\}_{i=1}^{N}$ of N points along a ray, the color $\mathbf{c}_r \in \mathbb{R}^3$ of the pixel corresponding to that ray is obtained using the volume rendering operator π. The predicted patch P′ is generated by combining the results of all sampled rays.

    The discriminator is implemented as a convolutional neural network with leaky ReLU activations. To accelerate training and inference, the discriminator compares the synthesized patch P′ with a patch P extracted from a real image I drawn from the data distribution $p_D$. To extract a K×K patch from a real image, the real patch P is sampled by querying the image at the 2D coordinates P(u, s) using bilinear interpolation, an operation referred to as Γ(I,ν). A discriminator with shared weights has been shown to be sufficient for all patches even though they are sampled at random locations with different scales. Because the scale determines the receptive field of the patch, training starts with patches of larger receptive fields to capture the global context and progressively samples patches with smaller receptive fields to refine local details.

    During adversarial training, the goal of generator G(θ) is to minimize the objective function, and that of discriminator D(ϕ) is to maximize the objective function. The non-saturating objective function V(θ,ϕ) with R1-regularization is defined in Eq (3.7) [42].

    $V(\theta, \phi) = \mathbb{E}_{z_s, z_a, \xi, \nu}\left[ f\left( D_\phi\left( G_\theta(z_s, z_a, \xi, \nu) \right) \right) \right] + \mathbb{E}_{I, \nu}\left[ f\left( -D_\phi(\Gamma(I, \nu)) \right) - \lambda \left\| \nabla D_\phi(\Gamma(I, \nu)) \right\|^2 \right], \quad (3.7)$

    where $f(t) = -\log(1 + \exp(-t))$, I denotes an image from the data distribution $p_D$, $p_\nu$ denotes the distribution over random patches, and λ controls the strength of the regularization. Spectral and instance normalization are used in the discriminator. RMSprop is used as the optimizer with a batch size of 8 and learning rates of 0.0005 and 0.0001 for the generator and discriminator, respectively.
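    Under the sign convention of Eq (3.7), the discriminator maximizes V, which is equivalent to minimizing −V; the PyTorch sketch below is our own rendering of this step, using −f(t) = softplus(−t).

```python
import torch
import torch.nn.functional as F

def discriminator_loss(D, patch_fake, patch_real, lam=10.0):
    """Minimize -V from Eq (3.7) w.r.t. the discriminator, where
    f(t) = -log(1 + exp(-t)) = -softplus(-t), with the gradient
    penalty evaluated on real patches."""
    patch_real = patch_real.detach().requires_grad_(True)
    score_real = D(patch_real)
    grad, = torch.autograd.grad(score_real.sum(), patch_real, create_graph=True)
    penalty = grad.flatten(1).pow(2).sum(dim=1).mean()
    return (F.softplus(-D(patch_fake.detach())).mean()   # -f(D(fake))
            + F.softplus(score_real).mean()              # -f(-D(real))
            + lam * penalty)
```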

    GRAF can synthesize novel views of an object from a latent code, although the shape and appearance of the object are generated randomly and lack explicit control. To address this issue, we propose SGRF, which enables the implicit representation of multiple 3D objects using a single MLP and provides precise control over 3D-aware image synthesis according to text prompts (labels).

    In the generator, the label codes of the class and style are entangled with a latent vector as the input of the conditional radiance field. This encourages the generator to use label-embedded latent vectors to synthesize objects with explicit control over class and style. The schema of label embedding is illustrated in Figure 6.

    Figure 6.  The schema for embedding label codes into latent vectors by multiplication (X).

    Input: The random latent code $z \sim N(0, 1)$ is split into a shape code $z_s$ and an appearance code $z_a$. The class and style labels are mapped through embedding layers, and the label-embedded codes $z'_s$ and $z'_a$ are produced by multiplying the shape code $z_s$ and appearance code $z_a$ with the corresponding label codes.

    Output: In contrast to GRAF, which outputs a single σ and c, the conditional radiance field g(θ) in our model produces a volume density array $[\sigma^{(i)}]_{i=0}^{N-1}$ and a color array $[c^{(j)}]_{j=0}^{M-1}$, both indexed by numerical labels. In practice, the number of classes was set to N = 5 and the number of styles to M = 4.
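    A sketch of the embedding-and-multiplication step in Figure 6 (the code dimension of 128 and the batch size are our own assumptions):

```python
import torch
import torch.nn as nn

n_classes, n_styles, dim = 5, 4, 128
class_emb = nn.Embedding(n_classes, dim)       # label codes for shape
style_emb = nn.Embedding(n_styles, dim)        # label codes for appearance

z = torch.randn(8, 2 * dim)                    # z ~ N(0, I)
z_s, z_a = z.split(dim, dim=1)                 # shape / appearance codes
cls = torch.randint(0, n_classes, (8,))        # class labels i
sty = torch.randint(0, n_styles, (8,))         # style labels j

# Entangle the labels into the latent codes by element-wise multiplication.
z_s_label = z_s * class_emb(cls)
z_a_label = z_a * style_emb(sty)
```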

    Conditional radiance field: The network structure of the conditional radiance field g(θ) is illustrated in Figure 7. The mechanism for predicting the volume density σ and color c is similar to that of GRAF; however, the input and output of g(θ) are manipulated to entangle and disentangle the label codes accordingly. In practice, a total of 1024 rays are sampled for one image with patch parameters P(u,s), camera pose ξ, and intrinsics K, and then 64 points are sampled along each ray r. Here, γ(x) and γ(d) are the positional encodings of the 3D coordinates x and ray direction d of a point, and $z'_s$ and $z'_a$ are the latent codes associated with class and style, respectively.

    Figure 7.  The structure of the conditional radiance field g(θ), with inputs $z'_s$, $z'_a$, γ(x), and γ(d), and outputs of the volume density array $[\sigma^{(i)}]_{i=0}^{4}$ and color array $[c^{(j)}]_{j=0}^{3}$.
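    One possible realization of these indexed outputs is sketched below; the layer widths and exact wiring are our own assumptions, since the text specifies only the inputs, outputs, and label indexing.

```python
import torch
import torch.nn as nn

class ConditionalField(nn.Module):
    """Sketch of g(theta): one MLP that outputs a density per class label
    and a color per style label; the integer labels index these arrays."""
    def __init__(self, dim_x=60, dim_d=24, dim_z=128, w=256, n_cls=5, n_sty=4):
        super().__init__()
        self.n_sty = n_sty
        self.shape_net = nn.Sequential(
            nn.Linear(dim_x + dim_z, w), nn.ReLU(),
            nn.Linear(w, w), nn.ReLU())
        self.sigma_head = nn.Linear(w, n_cls)              # [sigma^(i)], i = 0..4
        self.color_head = nn.Sequential(
            nn.Linear(w + dim_d + dim_z, w // 2), nn.ReLU(),
            nn.Linear(w // 2, n_sty * 3))                  # [c^(j)], j = 0..3

    def forward(self, gx, gd, zs, za, cls, sty):
        batch = gx.size(0)
        h = self.shape_net(torch.cat([gx, zs], dim=-1))    # shape encoding h
        sigma = self.sigma_head(h).gather(1, cls[:, None]).squeeze(1)
        c = self.color_head(torch.cat([h, gd, za], dim=-1))
        c = c.view(batch, self.n_sty, 3)
        return sigma, c[torch.arange(batch), sty]          # label-selected outputs
```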

    In addition to a regular discriminator D(ϕ) that compares the synthesized patch P′ with a real patch P, an auxiliary classifier $D_{vgg}$ based on VGG16 is added to distinguish the class and style of an object or scene. VGG16 [44] is a classical deep CNN that is typically used for multi-class classification tasks owing to its superior generalization ability. Initially, the discriminator $D_{vgg}$ is trained using annotated images I′, which are downsampled from the real images I.

    In the training phase, the loss function guides network optimization. The discriminator D(ϕ) is trained using a real patch P and a synthesized patch P′ with labels. The discriminator $D_{vgg}$ is trained using the rescaled real images I′. The loss function of G(θ) is expressed by Eq (3.8).

    $\mathcal{L}(G(\theta)) = \mathcal{L}_{adv}(D(\phi) \mid P'_{i,j}) + \lambda_1 \mathcal{L}_{cls}(D_{vgg} \mid P'_{i,j}) + \lambda_2 \mathcal{L}_{sty}(D_{vgg} \mid P'_{i,j}), \quad (3.8)$

    where $\mathcal{L}_{adv}$ is the adversarial loss between P and P′, $\mathcal{L}_{cls}$ and $\mathcal{L}_{sty}$ denote the classification and style losses of the generator, and $\lambda_1$ and $\lambda_2$ are the weights for $\mathcal{L}_{cls}$ and $\mathcal{L}_{sty}$. In this experiment, MSE and RMSprop were used as the adversarial loss function and optimizer, respectively, with parameters $\lambda_1$ = 2.0 and $\lambda_2$ = 3.0 and a batch size of 8. During the inference phase, class and style labels are embedded into the latent codes $z_s$ and $z_a$, respectively, for controllable synthesis of 3D-aware images.
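    A sketch of Eq (3.8) with the stated weights; MSE for the adversarial term follows the text, while using cross-entropy for the class and style terms of the VGG16 classifier is our assumption.

```python
import torch
import torch.nn.functional as F

def generator_loss(d_score_fake, cls_logits, sty_logits, cls_labels, sty_labels,
                   lam1=2.0, lam2=3.0):
    """Eq (3.8): adversarial term (MSE, per the experiments) plus weighted
    class and style terms from the auxiliary VGG16 discriminator."""
    l_adv = F.mse_loss(d_score_fake, torch.ones_like(d_score_fake))
    l_cls = F.cross_entropy(cls_logits, cls_labels)   # assumption: CE loss
    l_sty = F.cross_entropy(sty_logits, sty_labels)   # assumption: CE loss
    return l_adv + lam1 * l_cls + lam2 * l_sty
```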

    Owing to the difficulty of obtaining X-ray data in a real scene, we trained the model on a synthetic dataset. In 3D editing software such as Blender, an object was placed at the origin and the camera on the surface of the upper hemisphere, facing the origin of the coordinate system. By manipulating the camera pose, we captured natural images of the object from variable perspectives. The captured images were then binarized and converted into semantic maps as input to an HD pix-to-pix GAN. The four types of semantic maps correspond to the four display styles, respectively. The image translator is a pretrained HD pix-to-pix GAN responsible for the style transfer of the captured images. The dataset comprises five classes of images of prohibited items (guns, forks, pliers, knives, and scissors), and each class is displayed in four color modes: gray, pseudo, hot-jet, and reversed. The image synthesis pipeline is illustrated in Figure 8.

    Figure 8.  The pipeline for synthesizing the training dataset from 3D models.

    We trained the model on the synthesized dataset prepared in Section 4.1 and compared its results with those of the NERF, GRAF, and GIRAFFE models. NERF learns 3D geometry from posed images and synthesizes novel views using differentiable volume rendering. GRAF generates high-quality 3D-aware images from latent codes of shape and appearance without requiring posed images. GIRAFFE is a combination of multiple MLP networks that represent multiple objects in a given scene.

    Fréchet inception distance (FID) scores are widely adopted to evaluate the quality and diversity of generated images. We calculate the FID scores using Eq (4.1) [45].

    $\mathrm{FID} = d^2\left((m_r, C_r), (m_g, C_g)\right) = \|m_r - m_g\|_2^2 + \mathrm{Tr}\left(C_r + C_g - 2\left(C_r C_g\right)^{1/2}\right), \quad (4.1)$

    where the pair ($m_r$, $C_r$) corresponds to real images, the pair ($m_g$, $C_g$) corresponds to generated images, and m and C are the mean and covariance of the Inception features, respectively. We also use the kernel inception distance (KID) [46], which, unlike FID, does not assume a normal distribution and is an unbiased estimate. In addition, average precision (AP) and mean average precision (mAP) are the most popular metrics for evaluating object detection performance.
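    For reference, Eq (4.1) can be computed from Inception features as in the following sketch (the feature extraction itself is omitted):

```python
import numpy as np
from scipy import linalg

def fid(feat_real, feat_gen):
    """Eq (4.1) on Inception features of shape (N, D): squared distance of
    the means plus a trace term over the covariances."""
    m_r, m_g = feat_real.mean(axis=0), feat_gen.mean(axis=0)
    c_r = np.cov(feat_real, rowvar=False)
    c_g = np.cov(feat_gen, rowvar=False)
    covmean = linalg.sqrtm(c_r @ c_g).real        # matrix square root
    return np.sum((m_r - m_g) ** 2) + np.trace(c_r + c_g - 2.0 * covmean)
```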

    The model was trained on the synthetic dataset described in Section 4.1. We employed the RMSprop optimizer with learning rates of 0.0001 and 0.0005 for the discriminator and generator, respectively. During the inference phase, the inputs of the MLP are the label-embedded latent codes $z'_s$ and $z'_a$ and the view direction d(θ,ϕ). The average inference time was 0.7723 s per image at a size of 128 × 128. Figures 9 and 10 show X-ray images predicted from the class and style labels after 50,000 iterations. Figure 11 shows novel views of prohibited items synthesized by altering the pose of the virtual camera; here, θ and ϕ are the pitch and yaw angles of the camera in a spherical coordinate system.

    Figure 9.  Samples of the synthesized X-ray images of prohibited items with class labels (i = 0, 1, 2, 3, 4), style label (j = 0), and view direction d(θ,ϕ).
    Figure 10.  Samples of the synthesized X-ray images of prohibited items with class label (i = 0), style labels (j = 0, 1, 2, 3), and view direction d(θ,ϕ).
    Figure 11.  Novel view synthesis of prohibited items by manipulating camera pose. Here, θ,ϕ are the pitch and yaw angles of the camera, respectively.

    The FID score indicates the difference between the real and generated images; the lower the FID score, the better the generative model's performance. As shown in Figure 12, the 'gun' and 'pliers' categories presented superior quality and diversity compared with the other categories because they have more complex internal structural details.

    Figure 12.  FID scores of prohibited items grouped by category and color mode.

    Our model was derived from GRAF by modifying the inputs and outputs of the MLP in the prototype. Therefore, we performed ablation studies by comparing the results of our model with those of its counterpart. Table 1 quantitatively compares the FID and KID scores of GRAF and our model in the pseudo-color mode, indicating no significant difference between the two models.

    Table 1.  FID and KID comparison for each model in the pseudo-color mode.

    Class                            Gun     Fork    Knife   Pliers  Scissor
    GRAF (without labels)  FID       43.55   45.25   80.14   46.92   55.35
                           KID       0.038   0.029   0.084   0.042   0.048
    Ours (with labels)     FID       47.74   51.81   78.43   48.27   54.52
                           KID       0.042   0.032   0.079   0.039   0.050


    By fusing the prohibited-item images with benign background images, we synthesized X-ray inspection images using a pixel-by-pixel alpha-blending algorithm [20]. A schematic of the synthesis of X-ray inspection images is shown in Figure 13, and samples of the synthesized images are shown in Figure 14. The prohibited objects are marked with red bounding boxes.

    Figure 13.  Schema of the synthesis of X-ray inspection images based on the pixel-by-pixel alpha-blending algorithm.
    Figure 14.  Samples of the synthesized X-ray inspection images in the pseudo-color mode. Each image comprises (Ⅰ) one prohibited item, (Ⅱ) two prohibited items, or (Ⅲ) two or more overlapping prohibited items.
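    A simplified sketch of the compositing step, in the spirit of the pixel-by-pixel alpha blending of [20] (the exact blending used in the paper may differ):

```python
import numpy as np

def alpha_blend(item_rgba, background_rgb, top_left):
    """Paste an RGBA prohibited-item image onto a benign background:
    out = alpha * item + (1 - alpha) * background, per pixel."""
    h, w = item_rgba.shape[:2]
    y, x = top_left
    alpha = item_rgba[..., 3:4].astype(np.float32) / 255.0
    region = background_rgb[y:y + h, x:x + w].astype(np.float32)
    blended = alpha * item_rgba[..., :3].astype(np.float32) + (1.0 - alpha) * region
    out = background_rgb.copy()
    out[y:y + h, x:x + w] = blended.astype(np.uint8)
    return out
```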

    In the experiment, we first prepared two datasets: Dataset A, comprising 2000 images selected from the real dataset, and Dataset B, comprising 1000 images from the real dataset and 1000 synthesized images. Each dataset covers five classes of prohibited items: guns, forks, knives, pliers, and scissors. The two datasets were then split into training and validation sets at a ratio of 4:1. For validation, we employed YOLOv8 [47] as the object detection paradigm and trained it on Datasets A and B separately. The training curves demonstrated that the model trained on Dataset B outperformed the model trained on Dataset A, achieving improvements of approximately 4.4% in mAP@0.5 and 11.9% in mAP@0.5:0.95.
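    Using the Ultralytics YOLOv8 API, the two training runs can be reproduced along the following lines (the YAML file names and hyperparameters are placeholders, not values reported in the paper):

```python
from ultralytics import YOLO

# Hypothetical dataset configs: each YAML lists the train/val image paths
# and the five class names (gun, fork, knife, pliers, scissors).
model_a = YOLO("yolov8n.pt")
model_a.train(data="dataset_a.yaml", epochs=100, imgsz=640)

model_b = YOLO("yolov8n.pt")
model_b.train(data="dataset_b.yaml", epochs=100, imgsz=640)

metrics = model_b.val()   # reports precision, recall, mAP50, and mAP50-95
```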

    During the inference phase, Model A denotes the model pretrained on Dataset A and Model B the model pretrained on Dataset B. Both models were tested on the same test set, comprising 500 real images from an available dataset. Evaluating the mAP values of both models in Table 2 shows that Model B achieves superior performance over Model A, demonstrating that the augmented Dataset B effectively improves the detection accuracy and generalization ability of deep-learning algorithms. We expect to further enhance detection performance by adding more synthesized X-ray inspection images to the training set.

    Table 2.  The evaluation metrics of pretrained Models A and B on the test set.
    Model   Class     P      R      mAP (0.5)   mAP (0.5:0.95)
    A       Gun       0.79   0.81   0.85        0.56
            Fork      0.77   0.63   0.71        0.34
            Knife     0.62   0.52   0.54        0.30
            Pliers    0.72   0.68   0.73        0.47
            Scissor   0.71   0.46   0.60        0.42
            Mean      0.72   0.62   0.68        0.42
    B       Gun       0.93   0.91   0.92        0.67
            Fork      0.78   0.55   0.61        0.37
            Knife     0.65   0.43   0.56        0.38
            Pliers    0.82   0.65   0.83        0.51
            Scissor   0.74   0.51   0.62        0.44
            Mean      0.78   0.61   0.71        0.47


    In this study, we proposed a novel stylized generative radiance field for the controllable synthesis of 3D-aware images. By manipulating the input and output of the conditional radiance field (MLP) and incorporating an additional discriminator (VGG16), we enabled the entanglement and disentanglement of class and style labels into and from random latent vectors, thereby achieving explicit control over the generation of prohibited items. Moreover, our model can generate images absent from the training set through transfer learning. The main advantages of this model are that it can learn multiple objects using a single MLP and synthesize novel views according to class and style labels. The experimental results reveal that our image generation model significantly increases the quantity and diversity of images of prohibited items, and the augmented dataset effectively promotes the accuracy and generalization of object detection algorithms. Furthermore, the proposed model has the potential to be extended to other fields, such as medical image generation. In the future, we plan to add more subclasses to each class, such as different types of guns within the gun class, enriching the diversity of the 3D geometry of prohibited items within one category.

    The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.

    This work has been supported by Henan Province Program for Science and Technology Development under Grant No. 162102210009.

    The authors declare there are no conflicts of interest.


