In recent years, advances in science and technology have produced increasingly powerful computing devices, and deep learning (DL), built on this foundation, has achieved success in many fields. The success of deep learning also relies on the support of large-scale datasets, which provide models with a wide variety of images. The rich information in these images helps a model learn the characteristics of many image categories, thereby improving its classification performance and generalization ability. In real application scenarios, however, most tasks cannot collect a large number of images, or even enough images, for model training, which restricts the performance of the trained model. How to train a high-performance model from limited samples therefore becomes the key question. To address this problem, the few-shot learning (FSL) strategy was proposed, which aims to obtain a model with strong performance from a small amount of data. FSL can therefore play to its strengths in real-world tasks where large amounts of training data cannot be obtained. In this review, we mainly introduce DL-based FSL methods for image classification, which fall into four categories: methods based on data augmentation, metric learning, meta-learning and the addition of auxiliary tasks. First, we introduce classic and state-of-the-art FSL methods by category. Second, we introduce datasets commonly used to test the performance of FSL methods and report the performance of classical and recent FSL methods on two common datasets. Finally, we discuss the current challenges and future prospects of this field.
Citation: Wu Zeng, Zheng-ying Xiao. Few-shot learning based on deep learning: A survey[J]. Mathematical Biosciences and Engineering, 2024, 21(1): 679-711. doi: 10.3934/mbe.2024029
Emerging nanoscale resistive switches provide possible solutions for creating future computing architectures that are faster, less expensive and more energy-efficient by exploiting their intrinsic switching characteristics [1,2]. Recent progress in the fabrication of atomic switches [3], as one class of emerging nanodevices, allows the random assembly of these devices into larger networks [4,5,6]. In [7] it was argued that the complexity of these networks resembles that of biological brains, in which the complex morphology and interactions between heterogeneous network elements are responsible for powerful and energy-efficient information processing [8]. Contrary to designed computation, where each device has a specified role, computation in random resistive switch networks does not rely on specific devices, but is encoded in the collective nonlinear dynamic switching behavior as a result of an applied input signal.
Atomic switches, as well as memristive devices [9,10], are history-dependent resistive switches. Application of a bias voltage can change the conductance of the device. The nonlinear relationship between an applied input and the resulting conductance change allows these devices to perform nonlinear transformations of the input, similar to the dynamics found in biological synapses [11,12,13,14].
Harnessing the intrinsic nonlinear characteristics of these emerging nanodevices assembled in random structures has been shown for the memristor [15,16,17] as well as for the atomic switch [4,5,6,7]. Simulated random memristor networks have been employed to implement reservoir computing (RC) [18,19]. RC is a computational approach initially inspired by cortical microcircuits, in which computation takes place by translating the intrinsic dynamics of an excited medium, called a reservoir, into a desired output. By observing the nonlinear responses of a random resistive switch network to different input signals, it was possible to perform simple pattern classification [15,17] as well as computationally more demanding tasks by using multiple independent random assemblies [16]. In contrast to the simulation-based results, work on atomic switch networks (ASN) has demonstrated physical random assemblies and discussed fabrication parameters that determine the network morphology [7]. Based on these fundamental device and network characteristics that resemble the complexity of biological brains, it was argued that these networks are viable candidates for the physical realizations of brain-inspired information processing, such as reservoir computing.
Here we present a modeling and simulation framework that enables a detailed analysis of resistive switch network morphologies as determined by nanowire lengths, distributions, and density. The computational capabilities of different network morphologies are analyzed with respect to the compressibility of the measured network signals. We put the computational capabilities into perspective by comparing them with corresponding energy-consumption data. This comparison outlines a trade-off between computation and energy consumption. A modular approach is presented that provides a more energy-efficient architecture while achieving higher computational capabilities. Our results demonstrate what constitutes computationally useful networks with respect to network morphology, density, and signal amplitudes. Based on the modeling parameters used, future fabrication of random resistive switch networks can be guided to achieve a desired trade-off between computational capacity and energy consumption.
Atomic switch networks have so far been the only physical implementation of random resistive switch networks, and were demonstrated in the context of reservoir computing. In this section we will establish the functional similarities between atomic switches, as one type of resistive switch, and other device types, and argue that the methods and results presented here are valid for a broader range of resistive switches. Detailed comparisons of different implementations of resistive switches can be found in [1,20].
Resistive switches with a metal-insulator-metal (MIM) structure, where the insulator is typically a metal-oxide such as TiO2 or WOx [1,10], establish changes in the device conductance by redistributing oxygen vacancies within the metal-oxide. In the absence of an input bias, the oxygen vacancies remain at their positions, which leads to the history-dependent conductance of the device. Volatile behavior was demonstrated for a WOx device [21]. Here spontaneous diffusion of the oxygen vacancies causes a dissolution of conducting channels in the WOx thin-film and a return to a low-conductance ground state of the device.
The atomic switch, as a gap-type device based on crossing Ag2S and Pt wires, achieves conductance changes by changing the concentration of Ag+ cations, which allows metal protrusions of Ag atoms to grow and eventually form a bridge between the two wires. The width of that bridge determines the conductance of the device [20]. In the absence of an input bias, the thermodynamically unstable atomic bridge eventually dissolves and the atomic switch returns to a low-conductance equilibrium state [13].
An important functional similarity between MIM and gap-type devices is the nonlinear relation between the applied input and the resulting device state and current. Strukov and Williams presented an exponential ionic drift model that describes how an applied electric field changes the effective activation barrier and velocity for ionic drift within the metal-oxide [22]. Similarly, Tamura et al. observed an exponential dependence between applied bias and switching time and argued that this is caused by the minimum activation energy required to form a metal bridge within the atomic switch [23].
Throughout this paper we will refer to the modeled devices by their general category of resistive switches. This more general discussion will allow our presented concepts to be translated to a broader range of devices that adhere to the discussed qualitative similarities relevant for nonlinear computing.
In this section we will describe the device model used in our simulations. Our focus is on capturing the fundamental switching characteristics of the discussed devices, not on precisely reproducing empirical data obtained from a specific device. As outlined in the preceding section, resistive switches are characterized by history-dependent nonlinear conductance changes and state decay.
We adopt a memristor model as presented for WOx devices in [1,24]. In the original model the conductance as well as the state change are defined as follows:
@G = \left[ (1 - w)\,\epsilon \left[ 1 - \exp(-\theta V) \right] + w\,\gamma \sinh(\delta V) \right] \frac{1}{V}@ | (1) |
@\frac{{dw}}{{dt}} = \lambda \sinh \left( {\eta V} \right) - \frac{w}{\tau }@ | (2) |
Here the internal device state is modeled by w, V is the applied input bias, and @\epsilon@, @\theta@, @\gamma@, @\delta@, @\lambda@, @\eta@, @\tau@ are model parameters fitted to experimental data (the detailed experimental data are the property of the authors of [1,24] and have not been publicly released). The nonlinear switching, as explained by the exponential ionic drift model [22], is modeled by the sinh term. This model captures well the nonlinear switching as a function of the applied input and the current state.
However, such a first-order model, which uses only one variable to describe the device conductance, does not capture effects such as changes in the Ag+ cation concentration. Such a change affects a device's response to future bias signals but is not reflected in the actual device conductance. In [20] this difference was described as the memristor using a single variable to model the size of an ion-doped area, while the atomic switch uses two variables: one to model the height of an Ag protrusion, and another to model the width of the atomic bridge that emerges after the Ag protrusion has reached a sufficient height. Recently a second-order memristor model was presented that follows a modeling approach similar to that of the atomic switch [12]. Here the second variable, which functions as an enabler for a subsequent change in conductance, is the internal device temperature. Applying a bias signal increases the internal device temperature due to Joule heating, which in turn affects the drift and diffusion processes described earlier.
To account for effects such as Joule heating or Ag protrusions, we use equation 2 to model an internal state w' that is not directly reflected in the device conductance. Furthermore, we extend the model to implement state-dependent decay. As shown in [13,21], the rate of dissolution of the atomic bridge, or of the diffusion of ions back to an equilibrium state, is state-dependent and enables short- and long-term memory within a single device. We adapt equation 2 into equation 3 and describe the state variable w, which models the device conductance, as:
@\frac{{dw'}}{{dt}} = \lambda \sinh \left( {\eta V} \right) - \frac{{w'}}{\tau }\left( {1 - w'} \right)@ | (3) |
@w = f\left( {w'} \right)@ | (4) |
For our simulations we employ a binary switching function f that thresholds w' to create two distinct conductances. Binary behavior of atomic switches was shown in [13] and is also found in some memristive devices [25]. Different levels of sub-surface Ag+ concentrations are required to establish or dissolve an atomic bridge (length and width of Ag protrusion) [26]. This implies different threshold values for the internal state variable w' to perform device switching. We model this by applying a hysteresis function to w'.
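As a concrete illustration, the following Python sketch integrates equations 1–4 with a simple forward-Euler step. This is a minimal sketch, not the authors' simulator: the integration scheme, class layout, parameter values and hysteresis thresholds are placeholder assumptions, since the fitted parameter values of [1,24] have not been released.

```python
import numpy as np

class ResistiveSwitch:
    """Sketch of the device model of equations 1-4. All parameter values
    and hysteresis thresholds are illustrative placeholders."""

    def __init__(self, lam=0.5, eta=4.0, tau=10.0,
                 eps=1e-4, theta=2.0, gamma=1e-3, delta=3.0,
                 w_on=0.6, w_off=0.4):
        self.lam, self.eta, self.tau = lam, eta, tau                   # equation 3
        self.eps, self.theta, self.gamma, self.delta = eps, theta, gamma, delta  # equation 1
        self.w_on, self.w_off = w_on, w_off  # hysteresis thresholds on w'
        self.wp = 0.0   # internal state w' (equation 3), not directly visible in G
        self.w = 0.0    # binary conductance state w = f(w') (equation 4)

    def step(self, V, dt):
        # Forward-Euler update of equation 3: sinh-driven switching minus
        # a state-dependent decay toward the equilibrium state.
        dwp = self.lam * np.sinh(self.eta * V) - (self.wp / self.tau) * (1.0 - self.wp)
        self.wp = float(np.clip(self.wp + dwp * dt, 0.0, 1.0))
        # Equation 4: hysteresis thresholding yields two distinct conductance states.
        if self.wp > self.w_on:
            self.w = 1.0
        elif self.wp < self.w_off:
            self.w = 0.0

    def conductance(self, V):
        # Equation 1: diode-like OFF branch blended with a sinh ON branch, G = I/V.
        V = V if abs(V) > 1e-12 else 1e-12   # avoid division by zero
        i = ((1.0 - self.w) * self.eps * (1.0 - np.exp(-self.theta * V))
             + self.w * self.gamma * np.sinh(self.delta * V))
        return i / V
```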
As random assemblies of resistive switch networks will exhibit variations in nanowire diameters, and hence in the resulting device sizes (e.g., the gap size of atomic switches), variation is an integral system characteristic [7]. In [5], device parameter ranges for ASN were reported; these ranges are not simply described by a Gaussian distribution around a mean value but can span several orders of magnitude. For our experiments we apply a simplified model to create device variation, in which we draw device parameters uniformly from their respective ranges.
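A minimal sketch of this simplified variation model, building on the ResistiveSwitch class above; the parameter ranges are assumed placeholders, not the measured ranges of [5]:

```python
def random_device(rng):
    # Simplified variation model: draw each device parameter uniformly
    # from an assumed range (placeholders; the ranges reported in [5]
    # are device-specific and can span orders of magnitude).
    return ResistiveSwitch(lam=rng.uniform(0.1, 1.0),
                           eta=rng.uniform(2.0, 6.0),
                           tau=rng.uniform(5.0, 50.0))
```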
While the application of memristive networks to reservoir computing was simulation-based [15,16,17], physical assemblies of atomic switch networks were reported in [4,5,6,7,27]. A simple modeling approach with focus on localized conductance changes of such networks was shown in the supplementary documentation of [5].
Here we expand the modeling of such networks with the aim of investigating the relation between network morphologies, computational capabilities, and energy consumption. Network morphology is defined by the average length of the nanowires; network density can be controlled by the underlying copper seed posts. In [5] the effects of the copper seed posts' shape, size, and pitch on the network's nanowire length and density were described. Smaller seed posts lead to long wires, while larger seed posts lead to more fractal local structures. The pitch of the seed posts is a control parameter for the network density.
The dependence of the random nanowire assembly on the size of the seed posts implies that the distribution of nanowire lengths can be described by a probability density function. Large seed posts cause more localized connections, while small seed posts result in long-range connections. Hence, we use a probability density function (PDF) that can capture these different distributions. For our purposes we use a beta distribution, which produces values in the interval [0,1] and is controlled by the parameters α and β that define the mean value and the skewness around the mean.
@P(x, \alpha, \beta) = \frac{1}{B(\alpha, \beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1} \quad \text{(PDF)}@
@\mu = \frac{\alpha}{\alpha + \beta} \quad \text{(Mean value)}@
@S = \frac{2(\beta - \alpha)\sqrt{\alpha + \beta + 1}}{(\alpha + \beta + 2)\sqrt{\alpha\beta}} \quad \text{(Skewness)}@ | (5) |
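For illustration, normalized wire lengths can be sampled directly with numpy; the snippet below draws from a beta distribution with the N2-style parameters of Table 1 and checks the sample mean against the analytic mean and skewness of equation 5.

```python
import numpy as np

rng = np.random.default_rng(0)
alpha, beta = 1.0, 10.0                     # N2-style: mostly short connections
lengths = rng.beta(alpha, beta, size=1000)  # normalized wire lengths in [0, 1]

mu = alpha / (alpha + beta)                 # mean value from equation 5
S = (2 * (beta - alpha) * np.sqrt(alpha + beta + 1)
     / ((alpha + beta + 2) * np.sqrt(alpha * beta)))  # skewness from equation 5
print(f"empirical mean {lengths.mean():.3f}, analytic mean {mu:.3f}, skewness {S:.3f}")
```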
For the modeling of the underlying seed post grid we define two post types. Interface seed posts allow the application and reading of network voltages; these posts are established at a more coarse-grained pitch. The second grid is a supporting grid, more fine-grained than the interface grid. This supporting grid improves the formation of fractal structures for localized wire growth [7] and the establishment of more complex structures between the interface nodes; without a supporting grid, the establishment of multiple devices between two interface nodes is not favorable, which limits the morphological diversity. The density of a network is defined by the indegree @\xi@ of a node. The parameter @\xi@ defines the average number of devices connected to each grid node. The number of resulting devices, and hence the density, is then given by the total number of nodes in the supporting grid multiplied by @\xi@. The fabrication of networks with different densities was demonstrated in [5]. In Fig.1b we show an example network model with 16 interface nodes, a supporting grid at a third of the pitch of the interface grid, and a mix of short- and long-range connections. For all simulations we have used 16 interface nodes arranged in a 4 × 4 grid as shown in Fig.1b. If not mentioned otherwise, we have used a supporting grid at half the pitch of the interface grid. Throughout this paper we will refer to this network as of size 16₁, meaning that it has 16 interface nodes and 1 supporting node between two interface nodes (half the pitch).
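The following sketch shows one possible construction of such a network model: a 4 × 4 interface grid, a supporting grid at half the interface pitch, and @\xi@ devices per grid node whose lengths follow the beta distribution. The rule that maps a sampled length to a concrete node pair (nearest-length matching) is our assumption for illustration; the physical assembly process is not prescribed this way.

```python
import numpy as np

def build_network(alpha, beta, xi, n_interface=4, pitch_div=2, rng=None):
    """Sketch of a 16-interface-node network (size 16_1 for pitch_div=2)."""
    rng = rng or np.random.default_rng(0)
    n = (n_interface - 1) * pitch_div + 1           # grid side incl. supporting nodes
    nodes = np.array([(x, y) for x in range(n) for y in range(n)], dtype=float)
    interface = [i for i, (x, y) in enumerate(nodes)
                 if x % pitch_div == 0 and y % pitch_div == 0]  # coarse 4x4 grid
    d_max = np.linalg.norm(nodes.max(axis=0) - nodes.min(axis=0))
    edges = []
    for i in range(len(nodes)):
        for _ in range(xi):                          # xi devices per grid node
            target = rng.beta(alpha, beta) * d_max   # desired wire length (equation 5)
            dists = np.linalg.norm(nodes - nodes[i], axis=1)
            dists[i] = np.inf                        # no self-loops
            j = int(np.argmin(np.abs(dists - target)))  # node closest to target length
            edges.append((i, j, random_device(rng)))
    return nodes, interface, edges
```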
For a more intuitive understanding of how the defined network parameters interrelate, we give five examples and describe the resulting network morphologies (Table 1). In Fig.2 and 3, we added the network IDs to the presented results to illustrate the relation between network morphology and information processing capacity.
ID | @\alpha@ | @\beta@ | @\xi@ | Connection length | Density |
N1 | 1 | 1 | 2 | uniform | sparse |
N2 | 1 | 10 | 8 | short | dense |
N3 | 10 | 1 | 4 | long | semi-sparse |
N4 | 10 | 10 | 6 | medium (small @\sigma@) | semi-sparse |
N5 | 5 | 5 | 2 | medium (medium @\sigma@) | sparse |
We simulate the resistive switch networks by treating them as temporarily stationary resistive networks that can be solved efficiently using the modified nodal analysis (MNA) algorithm [28]. After calculating one time step using the MNA, we update the memristive devices based on the node voltages present in the network to account for the dynamic state changes of memristors.
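A reduced sketch of this time-stepping loop is shown below. It assumes a single voltage source and imposes the source and ground nodes through constraint rows rather than the full MNA source stamping; `branch_v` carries the branch voltages of the previous step, at which the nonlinear device conductances are evaluated.

```python
import numpy as np

def simulate_step(n_nodes, edges, branch_v, v_in, in_node, gnd_node, dt):
    """One quasi-static step: solve the instantaneous resistive network for
    the node voltages, then update the device states (a reduced form of MNA)."""
    A = np.zeros((n_nodes, n_nodes))
    for k, (i, j, dev) in enumerate(edges):
        g = dev.conductance(branch_v[k])   # conductance at last branch voltage
        A[i, i] += g; A[j, j] += g
        A[i, j] -= g; A[j, i] -= g
    b = np.zeros(n_nodes)
    for node, value in ((in_node, v_in), (gnd_node, 0.0)):
        A[node, :] = 0.0
        A[node, node] = 1.0                # constrain the node to a fixed voltage
        b[node] = value
    v = np.linalg.solve(A, b)              # node voltages for this time step
    for k, (i, j, dev) in enumerate(edges):
        branch_v[k] = v[i] - v[j]
        dev.step(branch_v[k], dt)          # dynamic state change (equations 3, 4)
    return v
```

A full run initializes `branch_v` to zeros and calls `simulate_step` repeatedly with samples of the input waveform, recording the voltages at the interface nodes into a state matrix for the analysis below.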
As was outlined by Demis et al. [7], the structural similarities between ASN and biological brains (i.e., fractal branching similar to dendritic trees) suggest that such complex random assemblies provide hardware platforms for efficient brain-inspired computing. A characteristic feature of cognitive architectures is the nonlinear transformation of an input signal into a high-dimensional representation more suitable for information processing [29,30]. In particular, for N > M, the input signal @u \in \mathbb{R}^M@ is transformed to @x \in \mathbb{R}^N@ by the dynamics of the cortical microcircuits driven by sensory inputs. In the case of random resistive switch networks this means that the signals measured at the interface nodes are ideally nonlinearly related to the input signal and can then provide a platform for brain-inspired computing.
A suitable transformation of the input requires rich dynamics that can preserve relevant distinctions between different input signals in the high-dimensional space [19]. This property is attributed to the dynamics of a system in the critical regime, where the distinctions between states neither diverge nor converge [31,32,33]. To provide such rich dynamics, the activity of the nodes should show as little redundancy as possible. In other words, the dynamics of the nodes should be as uncorrelated as possible. The question is how one can measure this and how such a measure relates to the parameters of the system.
Here we introduce a simple measure that can describe the dynamics of random network signals in a way that is meaningful in the context of information processing. We study the compressibility of the network dynamics as a proxy for its richness. Specifically, we use principal component analysis (PCA) to transform the system dynamics into a principal component space [34].
The distribution of variations in the principal component space indicates the amount of redundancy between the different dimensions of the original system. The variation in each principal component is given by the corresponding normalized eigenvalue of the covariance matrix of the system. To calculate these, we record the network dynamics on all interface nodes as a result of an applied network input. All network signals are expressed in the network state matrix X. The covariance matrix of the network dynamics is then given by @C = X^T X@, and the eigenvalues are obtained by diagonalizing, @C = U \Lambda U^{-1}@. The diagonal elements of @\Lambda@, @\{\Lambda_1, \Lambda_2, \ldots, \Lambda_N\}@, are the eigenvalues of the corresponding dimensions and can be normalized as @{\lambda _i} = \frac{{{\Lambda _i}}}{{\sum\nolimits_{i = 1}^N {{\Lambda _i}} }}@. Since the @{\lambda _i}@ are normalized as a probability measure, we can describe their evenness using a single number @H = - \sum\nolimits_{i = 1}^N {{\lambda _i}} {\log _2}\left( {{\lambda _i}} \right)@. This is an entropy measure and describes how evenly the @{\lambda _i}@ are distributed. In one extreme case, where the system nodes are all maximally correlated, only one eigenvalue will be 1 and the rest will be zero, and the resulting entropy will be H = 0. In the other extreme case, where the nodes are maximally uncorrelated, the eigenvalues will be identical and equal to @\frac{1}{N}@, and the resulting entropy will be @H = \log_2 N@. In the latter case, every node in the network represents something unique about the properties of the input signal that cannot be described by a combination of the rest of the nodes.
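The measure is straightforward to compute; the sketch below follows the definitions above, with X holding one row per recorded time step and one column per interface node.

```python
import numpy as np

def state_entropy(X):
    """Entropy of the normalized eigenvalue spectrum of C = X^T X, where X
    holds one row per time step and one column per interface node."""
    C = X.T @ X                            # covariance matrix of the dynamics
    Lam = np.linalg.eigvalsh(C)            # eigenvalues of the symmetric matrix
    lam = Lam / Lam.sum()                  # normalize to a probability measure
    lam = lam[lam > 1e-15]                 # drop numerically zero modes
    return float(-(lam * np.log2(lam)).sum())  # H in [0, log2 N]
```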
In the following we present entropy measurements as a function of the network morphology. As outlined in section 2.3, the morphology is modeled based on a beta distribution with parameters @\alpha@ and @\beta@ as well as the indegree @\xi@ that defines the device density. We apply a 5 Hz sine wave to the upper left interface node of a 16₁ network and connect the lower right node to ground (0 V). Fig.2 shows the network entropies as a function of @\alpha@, @\beta@, and @\xi@. Across all densities, the highest entropies, and hence the least linearly dependent network states, are achieved when a majority of the nanowires are equal to or longer than half the normalized maximum Euclidean distance (lower left triangles in Fig.2, where @\alpha@ ≥ @\beta@). Long-range connections spatially distribute the input signal across the network without much voltage drop. Hence, different areas of the network experience a sufficient bias voltage to exhibit switching dynamics useful to information processing. Increasing the network density, i.e., the number of devices per area, creates more signal paths and, due to device parallelism, also more highly conductive connections. This also leads to a better distribution of the input signal and to an expansion of the range of morphologies for which large entropies can be achieved.
Besides the network morphology, the amplitude v of the input signal also greatly affects the network dynamics, due to the exponential dependence between applied bias and either the activation energy to form an atomic bridge (for atomic switches) or the velocity of ionic drift (for memristive devices). In Fig.3 we show the entropy as a function of @\alpha@, @\beta@, and v for networks with an indegree @\xi@ = 6. It can be seen that the average entropy increases as we increase v. This is because larger voltages enable more devices to exhibit switching activity and hence affect the network dynamics. The similarity of these plots to Fig.2 implies that increased input voltages also allow better spatial distribution of the input and more areas of the network to exhibit switching activity. Note that these plots present qualitative results on the dependence between v and entropy; the absolute values of v are device-dependent and can change for devices with different threshold behavior.
We also studied network sizes 16₀ and 16₂. The 16₀ networks showed very similar results to the 16₁ networks presented here. The morphological similarity between the 16₀ and the more globally connected 16₁ networks is that neither forms many local fractal structures. In contrast, the 16₂ networks are characterized by more complex morphologies between the interface nodes. While this might more closely resemble complex structures in biological brains, in the context of passive electrical networks these fine-grained fractal structures, with many devices between interface nodes, lead to diminished individual switching dynamics. While this could be circumvented in simulations by increasing the input amplitude v, in practice this is not a viable approach, for reasons of safe operation and energy consumption.
In the previous section we have shown that the best computational capabilities are achieved for dense, globally connected networks (@\alpha@ ≥ @\beta@) and larger signal amplitudes v. The viability of these results has to be evaluated with respect to the energy that would be consumed by such networks. Based on the application of a 5 Hz sine wave with amplitude v, we calculate the total energy consumption of a network over time T as @E = \int_0^T V\left( t \right) I\left( t \right) dt@, with V(t) being the time-dependent signal amplitude and I(t) the current drawn by the network. As the energy consumption is very application-specific, we consider our findings as qualitative measures that highlight the general relations between the network parameters and the consumed energy. The absolute energy values presented here are not of primary relevance; what matters is how drastically the energy consumption changes with the network parameters. Fig.4a shows the resulting energy consumption for different @\alpha@ and @\beta@, averaged over different @\xi@ and v. The distribution of the energy data across the @\alpha@-@\beta@ plane resembles the entropy distributions seen in Fig.2 and 3, which confirms that high entropies come at the cost of high energy consumption (relative to the consumption at low entropies), caused by highly conducting paths in denser networks. This finding is further supported by plotting energy vs. entropy (Fig.4b). Here we plot the averaged energy against the averaged entropy for corresponding setups. We can see how entropy grows with energy. However, as the energy grows exponentially, increasing entropy can lead to over-proportional energy consumption.
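A sketch of the corresponding energy bookkeeping, assuming the network current is obtained by summing the branch currents at the input node and the integral is approximated by the trapezoidal rule over the sampled run:

```python
import numpy as np

def source_current(edges, branch_v, in_node):
    # Current drawn by the network: sum of branch currents at the input node.
    i_total = 0.0
    for k, (i, j, dev) in enumerate(edges):
        if in_node in (i, j):
            i_b = dev.conductance(branch_v[k]) * branch_v[k]
            i_total += i_b if i == in_node else -i_b
    return i_total

def network_energy(v_src, i_src, dt):
    # E = integral of V(t) I(t) dt, approximated by the trapezoidal rule.
    p = np.asarray(v_src) * np.asarray(i_src)
    return float(0.5 * dt * (p[:-1] + p[1:]).sum())
```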
As the computational performance of a single random network is limited by exponentially increasing energy consumption, or by strong linear dependence at low energies, we outline an approach to increase entropy with only linear growth in energy. In [27] a hierarchical approach to ASN was presented that combined multiple small-world networks on a single chip. Similarly, we have shown a hierarchical approach that embedded independent networks (of sizes similar to those presented here) in a reservoir computing architecture and demonstrated that an application demanding both memory and computation could be solved [16].
The concept relies on extracting only a subset of signals from each independent network. This allows harnessing the different processing caused by the different random structures of the independent networks. Furthermore, in [16] we extracted a differential signal from each network. By retrieving a network output as the difference of two network nodes, differences in the spatial and temporal dynamics within networks can be captured better than with signals measured with respect to a common ground. In Fig.5 we show two examples of 16 network signals. Fig.5b shows signals obtained from a single random network (qualitatively comparable to the signals presented in [7]). Readouts from a single network show some nonlinearities; however, they are mostly characterized by strong linear dependence across the signals (a low entropy of 0.37). In contrast, with 16 independent networks exposed to the same input, we can significantly increase the richness of the measured signals. In this example the entropy is 1.79. The total energy consumption of the hierarchical independent networks scales linearly with the number of networks. Considering the gain in entropy, this approach is much more viable for brain-inspired information processing than relying on single networks with high density and high signal amplitudes. A comparison of how energy and entropy relate in the two presented setups is shown in Fig.6. The plotted data represent data collected from single networks as well as from the 16-network approach with @\xi@ = 4 and increasing v. The circled areas approximately mark the energy and entropy data obtained with a 2 V input signal. The average energy difference is a factor of 16, which corresponds to the number of networks used. However, compared with single networks that exhibit similar energy consumption (around v = 4 to 6), significantly higher entropies can be achieved.
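A sketch of this differential multi-network readout, with placeholder node indices: one differential signal per independent network is stacked into a combined state matrix and scored with the same entropy measure as before.

```python
import numpy as np

def differential_readouts(network_states, node_a, node_b):
    # network_states: list of T-by-N state matrices, one per independent
    # network; node_a/node_b are placeholder indices of the two interface
    # nodes whose difference forms each network's single output signal.
    cols = [X[:, node_a] - X[:, node_b] for X in network_states]
    return np.column_stack(cols)            # T x (number of networks) matrix

# H_single = state_entropy(X_single)                             # cf. 0.37 above
# H_multi = state_entropy(differential_readouts(states, 5, 10))  # cf. 1.79
```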
We have presented a detailed analysis of how the morphology of random resistive switch networks affects their computational capacity and energy consumption. How much entropy is required of the individual networks, and what energy cost one is willing to pay for it, is an application-specific choice. The hierarchical approach and its better entropy-per-energy ratio are possible because we utilize different independent networks. This implies that cognitive computing is not merely the product of sufficient excitation of network elements but is also rooted in heterogeneous morphologies across networks, as demonstrated for neural populations in cortical microcircuits [35]. This ability to harness randomness is a strong argument for the future fabrication of nanoscale resistive switch networks. We found that networks with strong fractal structures (i.e., 16₂ networks) performed worse. However, in biological brains such fractal structures (dendrites) are an integral part of information processing. Biological dendrites are active components with voltage- and calcium-gated ion channels that regulate synaptic responses [36], while the fractal structures in resistive switch networks are purely passive elements. Further investigation of these functional differences might provide better fabrication techniques and methods to utilize fractal resistive switch networks.
The nonlinear dependence between input bias and state transitions in resistive switches allows random assemblies of such devices to exhibit complex behavior useful to computation. The design parameters used to control the simulated network morphologies relate to design parameters for the physical fabrication of the networks. We have shown how network morphologies correlate with computational capacities and energy consumption. A hierarchical approach was presented that allows us to increase computational capacities more energy-efficiently than by increasing information processing in single networks. These findings provide insights into the abilities and limitations of random nanoscale resistive switch networks and should serve as a design guide for the investigation and fabrication of future nanoscale computing architectures.
This work was supported by the National Science Foundation under awards # 1028378, # 1028238, and # 1318833, and by DARPA under award # HR0011-13-2-0015. The views expressed are those of the author(s) and do not reflect the official policy or position of the Department of Defense or the U.S. Government. Approved for Public Release, Distribution Unlimited.
The authors declare no conflict of interest in this paper.