

    One of the most important branches of speech processing is improving recognition of noisy signals, i.e., speech enhancement and robust speech recognition. Reducing noise in a speech signal is a complex process. The main objective of speech enhancement is to find optimal estimates of speech features. Wavelet transforms are well suited to this task because they are among the most prominent techniques for analyzing non-stationary speech signals in both the time and frequency domains.

    Using wavelets [1], noise can be reduced by appropriately selecting the wavelet coefficient threshold: the threshold values are subtracted from the noisy wavelet coefficients to obtain a noise-reduced signal. Because the features are computed from scalograms, they are more prominent than features obtained with the short-time Fourier transform.

    Wavelet transforms are of two types: the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT).

    The discrete wavelet transform decomposes the signal into approximation and detail components by shifting and scaling copies of a basic wavelet to the required level. The bionic wavelet transform (BWT) is proposed and used in the present work because it resembles the auditory model of the human cochlea [2,3,4,5,6,7] and can be easily correlated with the MFCC feature extraction process. This helps in extracting the prominent features of the noisy speech signal.
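    The decompose/threshold/reconstruct idea can be sketched in Python with a one-level Haar transform. This is a simplification for illustration only: the paper works with db11-family and bionic wavelets in Matlab, not Haar.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar DWT: split a signal into approximation and detail coefficients."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse one-level Haar DWT."""
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x

def denoise(x, thr):
    """Soft-threshold the detail coefficients, then reconstruct the signal."""
    a, d = haar_dwt(x)
    d = np.sign(d) * np.maximum(np.abs(d) - thr, 0.0)  # soft thresholding
    return haar_idwt(a, d)
```

    The same pattern carries over to deeper decompositions and other mother wavelets; only the filter pair changes.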

    The CWT provides simultaneous time-frequency analysis. It is preferred here because it is based on the auditory model of the human cochlea [2,3,4,5,6,7].

    In this paper, we propose an optimal feature selection procedure using BWT and MFCC for recognizing words in convoluted noisy speech data. To compute the optimal features, the central frequencies of the Morlet [7], Daubechies, biorthogonal, and Coiflet mother wavelets are adapted to the BWT together with thresholding and central-frequency techniques.

    Thresholds for the BWT are calculated using the following selection methods [8]: ⅰ) Stein's unbiased risk estimate rule (SURE), ⅱ) the heuristic threshold selection rule, ⅲ) the fixed selection rule, ⅳ) minimax, and ⅴ) the sqtwolog threshold. To handle noise in the signal, the SURE threshold selection procedure is adopted for the BWT when estimating recognition accuracy.

    The paper is organized as follows: Section 2 discusses work in the literature using bionic wavelets. Section 3 introduces the continuous bionic wavelet. Section 4 presents the procedure for converting the continuous wavelet to a discrete wavelet. Section 5 describes the datasets used for experimentation. The proposed system model is discussed in Section 6 together with results. The performance of the different classifiers is discussed in Section 7. Section 8 presents observations made during the simulation process. The last section presents conclusions and future enhancements.

    Extracting optimal features plays a major role in classification and recognition. Many studies show that bionic wavelets with a Morlet base are used to de-noise the speech signal by enhancing the signal component. Features can currently be extracted in three ways: 1) time-domain features, 2) frequency-domain features, and 3) features from the raw wave file. MFCC is the most popular frequency-domain method, and the third approach is gaining ground in machine learning models. MFCC is well suited to clean speech; making it more robust to noisy data is also addressed in this paper. To that end, bionic wavelets are used for de-noising, and the MFCC is made robust to convoluted noisy speech data.

    The bionic wavelet is made adaptive in several ways: by changing the 'K' factor, by using different hard/soft thresholding methods, and by applying various base/central frequencies. The following is related work on the application of bionic wavelets to de-noising speech data. A. Garg and O. P. Sahu [9] proposed a method to discretize the bionic wavelet using the CWT and inverse CWT with Morlet as the mother wavelet.

    Fei Chen [10] proposed an adaptive discrete BWT by changing the T-function of the BWT and splitting the dyadic tiling map of the DWT, which uses quadrature-mirror filters, into a DBWT tiling map for decomposition. M. Talbi [11] proposed an entropy technique for the BWT to identify, for each coefficient, the two sub-bands having minimal entropy.

    Cao Bin-Fang [12] proposed a bionic wavelet method with hierarchical thresholds based on particle swarm optimization (PSO): the noisy speech signal is decomposed with the bionic wavelet transform, and PSO is used to optimize the thresholds. The high-frequency noise separated by the bionic wavelet transform is fed as input to an adaptive filter. The experimental work illustrates speech enhancement under various SNR conditions.

    A detailed analysis was made by Yang Xi, Liu et al. of the behavior of the bionic wavelet with additive noise at various dB levels. It clearly explains the use of the bionic wavelet with Morlet as the mother wavelet for removing noise of various levels from a speech signal. Yao and Zhang proposed an adaptive bionic wavelet with a Morlet mother-wavelet base frequency ω0 of 15165.4 Hz, which suits the human auditory system.

    Mourad [13] used MSS-MAP for the wavelet transform and applied four different tests, namely SNR, segmental SNR, Itakura, and perceptual evaluation, for various types and levels of noise. A new speech enhancement procedure based on improved correlation-function processing of bionic wavelet coefficients was proposed by WU Li-ming [14].

    Speech recognition for Arabic words is demonstrated by Ben-Nasr [15]. Feature extraction is done using MFCC with the bionic wavelet; to increase the recognition rate, delta-delta coefficients are used, and classification is done with a feedforward back-propagation neural network. Zehtabian [16] proposed a speech enhancement technique using the BWT and singular value decomposition, and illustrates that SVD is better than the BWT at higher SNRs.

    Liu Yan [17] proposed a de-noising algorithm based on sub-band spectral entropy with the bionic wavelet transform. They showed that the sub-band spectrum is good at detecting the end points of the speech signal and hence at distinguishing speech from noise. Their experiments demonstrate that the sub-band entropy de-noising method is superior to the Wiener filter algorithm. Pritamdas [18] focuses on the continuous wavelet transform with adaptive thresholds and wavelet scales for speech enhancement.

    From the literature survey, it is observed that much of the reported work on the bionic wavelet for speech enhancement uses thresholding and rescaling to convert continuous to discrete wavelet coefficients, and addresses additive noise only. In this paper, a procedure to convert the continuous wavelet to a discrete one based on the central frequency is proposed, together with a new feature extraction technique and a procedure to reduce convoluted noise.

    To the best of our knowledge, this work is unique in de-noising convoluted noise at various levels. The next section describes the characteristics of the bionic wavelet.

    The WT is an alternative to the STFT [19,20,21,22]. When the two are compared visually, WT scalograms better represent the formant frequencies and harmonic structure of speech; hence the WT is one of the prominent methods for handling non-stationary signals. The CWT is fixed with a base scale [23] of 2^(1/m), where m is an integer greater than 1 giving the number of "voices per octave". Different scales are obtained by raising this base scale to positive integer powers, i.e., 2^(k/m) for k = 1, 2, 3, …. The translation parameter of the CWT is discretized to integer values, represented by l. The resulting discretized wavelets are given by Eq. 1.

    ψ_{k,l}(n) = (1/√(2^{k/m})) Ψ((n − l)/2^{k/m})    (1)

    Bionic wavelet transform (BWT) is an adaptive wavelet transform based on a model of the active biological auditory system [24]. The decomposition of BWT [2] is perceptually scaled and adaptive. It has the following properties:

    ⅰ) High sensitivity and selectivity

    ⅱ) Signal with determined energy distribution

    ⅲ) Can be reconstructed

    The resolution of the bionic wavelet transform can be adjusted according to the signal frequency and the instantaneous amplitude with its first-order differential values.

    This section discusses the mechanism adopted to convert the continuous wavelet to a discrete wavelet. The conversion uses discrete thresholding together with the central (base) frequencies of different mother wavelets.

    The db11, coif5 and bior3.5 wavelets are considered, with central frequencies of 0.67, 0.68 and 1.04 Hz respectively, calculated using the centfrq function of Matlab.

    ω_m = ω_0 / (1.1623)^m, where m varies from 1 to 22 for the Morlet wavelet. For the other wavelets, the centfrq function of Matlab is used.

    All the wavelets possess different characteristics; hence, in addition to Morlet, the following wavelets are considered:

    Db11: asymmetric, orthogonal, bi-orthogonal.

    Coif 5: symmetric, orthogonal, bi-orthogonal.

    Bior3.5: symmetric, not-orthogonal, bi-orthogonal.

    Twenty-two scales are considered for the bionic wavelet irrespective of the center frequency. These wavelets are preferred because they mimic the mel-scale mapping of the MFCC procedure [26] and are designed to match basilar-membrane spacing, i.e., they are based on a nonlinear perceptual model of the auditory system.
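    As an illustration, the 22 per-scale frequencies given by the formula ω_m = ω_0/(1.1623)^m can be computed directly. This is a minimal Python sketch using the base frequency quoted above; the scale count and ratio are taken from the text.

```python
import numpy as np

# Base frequency of the Morlet-based bionic mother wavelet, from the text.
OMEGA_0 = 15165.4  # Hz

def bionic_scale_frequencies(n_scales=22, ratio=1.1623):
    """Centre frequency of each BWT scale: omega_m = omega_0 / ratio**m."""
    m = np.arange(1, n_scales + 1)
    return OMEGA_0 / ratio ** m

freqs = bionic_scale_frequencies()
```

    The frequencies decrease geometrically from scale to scale, mirroring the logarithmic spacing of the cochlear model.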

    This parameter determines the number of levels used to reduce redundant information in the CWT during discretization of the wavelet. The following thresholding mechanisms are considered, with the levels chosen by trial and error as listed in Table 1; levels are fixed based on the thresholds obtained for the signal. The ways of calculating the thresholds are discussed below.

    Table 1.  SNR for various thresholding levels.

    Sl. No | Thresholding      | Levels (from thresholding) | SNR before | SNR after
    1      | SURE              | 2                          | -12.52     | -3.3
    2      | Heuristic variant | 5                          | -12.52     | -11.96
    3      | Sqtwolog          | 4                          | -12.62     | -11.96
    4      | Minimaxi          | 4                          | -12.52     | -10.24


    Sqtwolog:

    thr = σ_k √(2 log p)

    where σ_k is estimated from the median absolute deviation (MAD) and p is the length of the noisy signal. The MAD estimate is

    σ_k = MAD_k / 0.6745 = median(|ω|) / 0.6745

    where ω are the wavelet coefficients and k is the wavelet scale.

    Rigrsure:

    th_k = σ_k √(ω_c)

    where ω_c is the c-th squared wavelet coefficient (the coefficient at minimal risk), chosen from the vector ω = [ω_1, ω_2, …, ω_c], and σ_k is the standard deviation of the noisy signal.

    Heursure: Heursure threshold selection rule is a combination of Sqtwolog and Rigrsure methods.

    Minimaxi:

    th_k = σ (0.3936 + 0.1829 log2(M))  for M > 32
    th_k = 0                            for M ≤ 32

    where ω is the vector of wavelet coefficients at a given scale and M is the length of the signal.
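    The sqtwolog and minimax rules above are straightforward to implement. A Python sketch follows; the 0.1829 minimax constant matches Matlab's thselect convention, which is an assumption about the garbled original.

```python
import numpy as np

def sigma_mad(coeffs):
    """Noise-level estimate from the median absolute deviation of the coefficients."""
    return np.median(np.abs(coeffs)) / 0.6745

def thr_sqtwolog(coeffs):
    """Universal (sqtwolog) threshold: sigma * sqrt(2 * log(p))."""
    p = len(coeffs)
    return sigma_mad(coeffs) * np.sqrt(2.0 * np.log(p))

def thr_minimax(coeffs):
    """Minimax threshold; zero for short signals (M <= 32)."""
    M = len(coeffs)
    if M <= 32:
        return 0.0
    return sigma_mad(coeffs) * (0.3936 + 0.1829 * np.log2(M))
```

    Either threshold can then be applied to the detail coefficients with hard or soft shrinkage before reconstruction.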

    Algorithm

    Steps for Discretizing Bionic Wavelet.

    Step 1: Read the speech signal

    Step 2: Multiply each value by ‘K’ as shown in Eq. 2

    BWT_f(a, τ) = K · WT_f(a, τ)    (2)

    Step 3: Thresholding function is selected with high SNR using Matlab function thselect.

    Step 4: Base/central frequencies of various mother wavelets is applied using centfrq (wname).

    Step 5: The modified bionic wavelet coefficients are divided by the 'K' factor to recover the coefficients, and reconstruction is done by taking the inverse continuous wavelet transform, where the K factor is approximated using Eq. 3

    K = 1.7772 T_0 / (T^2 + 1)    (3)

    Step 6: Compute the inverse continuous transform.

    Step 7: Obtain the Mel-frequency cepstral coefficients for the de-noised signal [26].

    Step 8: The Bionic-MFCC features obtained are listed below.

    Step 9: Classify these features using the SVM, ANN and LSTM classifiers.

    The following tables show that the sample features are better than bare MFCC features.

    Noisy speech signal: the following table presents the MFCC coefficients computed without wavelets for the noisy signal.


    Coefficients after applying Wavelets


    Clean speech Signal

    The following table presents the MFCC features computed without wavelets for the clean signal.


    Coefficients after applying Wavelets


    The above presents the weighted features obtained from steps 1 to 8 of the algorithm. From this it is clear that the wavelet-weighted feature values are better for both clean and noisy speech signals.

    Two datasets are considered: the free spoken digit dataset (FSDD) [27] and a Kannada dataset (Table 2), with recordings of spoken digits and words sampled at 8 kHz and 16 kHz respectively. The recordings are trimmed so that they have near-minimal silence at the beginning and end. The FSDD consists of English pronunciations of the numbers one to nine by four different speakers; in total 900 signals are collected, 100 per digit. The second dataset consists of isolated Kannada words, shown in Table 3. These signals are sampled at 16 kHz from 30 speakers (20 male and 10 female), and 1000 word samples are collected across both genders. The signals are artificially convoluted with street noise [28] at SNRs of 5, 10 and 15 dB to create convoluted noisy speech signals.
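    Creating a noisy version of a recording at a target SNR amounts to scaling the noise before combining it with the clean signal. The sketch below uses simple additive mixing, which is an assumption for illustration; the paper convolves street noise with the signals rather than adding it.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean/noise power ratio equals `snr_db` dB, then add."""
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(clean)]
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    scale = np.sqrt(p_clean / (p_noise * 10.0 ** (snr_db / 10.0)))
    return clean + scale * noise
```

    Repeating this for 5, 10 and 15 dB yields the three corruption levels used in the experiments.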

    Table 2.  Kannada dataset.

    Table 3.  English dataset.


    The obtained features are modeled for classification and recognition using machine learning models, namely SVM [29,30,31,32], ANN [15,33] and LSTM [34,35]. The overall data flow of the proposed work is shown in Figure 1.

    Figure 1.  Flowchart of the proposed work.

    General experimental setup:

    The obtained features of all the signals are grouped into training and testing samples. The signals are convoluted with street noise [28] at 5 dB, 10 dB and 15 dB. The same dataset is used by all the models for training and testing so that their recognition accuracies can be compared. The results are discussed at two levels: ⅰ) the signal-to-noise ratio before and after application of the bionic wavelet, and ⅱ) the recognition accuracies of the models compared with existing models, where available.

    The SNR is a good indicator of noise interference in a given signal. It is computed using the following formulas.

    snr_before = mean(orig_signal^2) / mean(noise^2)
    snr_before_dB = 10 · log10(snr_before)    % in dB
    snr_after = mean(enhanced_speech^2) / mean(residual_noise^2)
    snr_after_dB = 10 · log10(snr_after)    % in dB
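    These formulas reduce to a single power-ratio helper; a minimal numpy sketch:

```python
import numpy as np

def snr_db(signal, noise):
    """10 * log10 of the signal-to-noise power ratio, in dB."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

# SNR before enhancement: original signal vs. the noise component.
# SNR after enhancement: enhanced speech vs. the residual noise
# (enhanced minus clean), mirroring the formulas above.
```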

    Table 4 presents the effect of applying different central frequencies to the bionic wavelet to reduce noise levels. On average, about 2 dB of noise is removed.

    Table 4.  SNR for various central frequency with their mother wavelets.


    Table 5 depicts the application of bionic wavelets for convoluted noise considering the 22 scales as mentioned in the literature.

    Table 5.  SNR for convoluted noise using bionic 22 scales.


    Comparing Tables 4 and 5, the SNR level for convoluted noise is better in Table 4; hence noise reduction with the central-frequency approach (Table 4) is better than with the 22 scales (Table 5).

    In our earlier works [36,37], the experiments were carried out on clean and noisy speech datasets with plain MFCC features. The current feature extraction procedure applies bionic wavelets to extract better features for the datasets specified above. Hence, in this paper the new Bionic-MFCC features are used for recognition, with the noise reduced using discrete bionic wavelets. Experiments are performed on a standard benchmark dataset (FSDD) and the Kannada dataset. The models and their parameters are as follows.

    Since speech features are non-linear in nature, they need to be mapped to a high-dimensional space. The basic idea is that the input space is mapped into a high-dimensional feature space by a nonlinear transformation, and the optimal hyperplane is found in the new space. The optimal hyperplane must not only discriminate the categories correctly but also maximize the margin between them; this strengthens the generalization capability of the support vector machine. The objective function of the nonlinearly separable support vector machine is:

    min (1/2) ω^T ω + C Σ_{i=1}^{N} ξ_i
    s.t.  y_i(ω^T x_i + b) ≥ 1 − ξ_i,  ξ_i ≥ 0,  i = 1, 2, …, N

    where ω is the weight coefficient vector and b a constant. C denotes the penalty coefficient, controlling the penalty for misclassified samples and balancing model complexity against loss error. ξ_i is the relaxation (slack) factor, adjusting the number of misclassified samples allowed during classification.

    When the SVM is used for classification problems, two strategies can be adopted: one-vs-all and one-vs-one. In this paper the one-vs-all method is applied for multi-class classification. Kernel functions are also key to the SVM; polynomial and radial basis function (RBF) kernels are considered here.
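    A minimal sketch of this setup with scikit-learn (an assumption; the paper's toolchain is Matlab), comparing RBF and polynomial kernels on synthetic data standing in for 12-dimensional Bionic-MFCC features:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Hypothetical toy data: 12 features per sample, 3 word classes.
X, y = make_classification(n_samples=300, n_features=12, n_informative=8,
                           n_classes=3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for kernel in ("rbf", "poly"):
    # One-vs-rest decision shape, mirroring the one-vs-all strategy in the text.
    clf = SVC(kernel=kernel, C=1.0, decision_function_shape="ovr")
    clf.fit(X_tr, y_tr)
    scores[kernel] = clf.score(X_te, y_te)
```

    Swapping the kernel is a one-line change, which makes the dataset-dependent kernel comparison described above easy to reproduce.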

    Table 6 and Figure 2 depict the recognition performance of the SVM model, implemented with RBF (r) and polynomial (p) kernel functions. The Bionic-MFCC features classify the noisy signals well compared to clean speech [30]. The SVM performs better with the RBF kernel on the standard dataset, whereas it fails on the Kannada dataset; the polynomial kernel performs better for the Kannada dataset, as shown in Figure 2. This indicates that kernel performance depends on the dataset.

    Table 6.  Classification accuracy of SVM.

    Figure 2.  Classification accuracy of SVM.

    In the literature, bionic wavelets with a Morlet base frequency have been applied to additive-noise Arabic speech recognition systems [15,34,38] using neural networks. Here, bionic wavelets are tried on convoluted noisy speech data to assess the level of noise reduction and the feature weights for recognition accuracy. The standard dataset has a better recognition rate than the Kannada dataset; the lower performance is due to variable word lengths and ambiguity in the speakers' utterances. Table 7 and Figure 3 show the recognition accuracies obtained.

    Table 7.  Recognition accuracy of NN.

    Figure 3.  Recognition accuracy of NN.

    NN Implemented Procedure:

    The neural network model has 9 input nodes, each fed with 12 Bionic-MFCC features. Two hidden layers are used, and the output layer has 9 nodes, one for each word/digit. A softmax activation function is applied at the top of the network to obtain output class-label probabilities. The model is optimized with the Adadelta optimizer, which adapts the learning rate over a moving window.

    Learning continues until the network has learnt all updates. The model uses categorical cross-entropy for multi-class classification.

    Procedure: the MFCC features are fed to the input layer to build a basic LSTM cell. Each layer is wrapped in a dropout layer with probability 0.5 for learning in each iteration. The group of dropout-wrapped LSTMs is fed to a MultiRNN cell to group the layers together.

    The CTC model learns to label a variable-length sequence when the input-output alignment is not known. Consider the features m = (m1, m2, …, mT) and the label n = (n1, n2, …, nU). The CTC is trained to maximize probability; its loss function is computed as

    l_CTC = −ln P(n|m) ≈ −ln Σ_{π ∈ Φ} Π_{t=1}^{T} P(k = π_t | m)

    where π ranges over the set Φ of all possible expanded CTC path alignments of length T, and P(k = π_t|m) is the label distribution at time step t.

    Finally, stacked LSTM layers are embedded. The CTC loss function [39,40] and the Adadelta optimizer are used, and a single fully connected layer with softmax activation yields the labeled predictions. The activation function is given below:

    P_t(k|m) = exp(h_t^L(k)) / Σ_{i=1}^{K+1} exp(h_t^L(i))
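    This is the standard softmax over the K+1 output units (labels plus the CTC blank); a numerically stable numpy version:

```python
import numpy as np

def softmax(h):
    """Output-layer activation: P(k) = exp(h(k)) / sum_i exp(h(i))."""
    h = np.asarray(h, dtype=float)
    e = np.exp(h - h.max())  # subtract the max for numerical stability
    return e / e.sum()
```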

    The Adadelta optimizer minimizes the loss by feeding the predictions to a mean-squared-error loss function, and an accuracy metric is used for training and testing.

    In the literature, Bi-LSTM and LSTM models have been used for speech classification [34] with 95% and 96.58% accuracy on clean speech. T. Goehring et al. [41] use a recurrent neural network for feature extraction with babble noise at 5 dB and 10 dB, obtaining recognition accuracies of 78% and 82%, as shown in Table 8.

    Table 8.  Classification accuracy of LSTM.


    In the proposed work, the LSTM model is applied to convoluted noisy speech data; its performance, shown in Table 8 and Figure 4, is better than that reported in the literature. Using Bionic-MFCC features, recognition accuracy improves by 1% over the Bi-LSTM model. Among the SVM, ANN and LSTM models, the LSTM best models the convoluted speech data using the db11 mother wavelet.

    Figure 4.  Classification accuracy of LSTM.

    Performance measures:

    The word classification accuracy and error rate are computed as follows:

    Classification Accuracy = (no. of correctly classified audio files) / (total no. of audio files)
    Classification Error Rate = (no. of incorrectly classified audio files) / (total no. of audio files)
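    These measures are trivial to compute; for example:

```python
def classification_accuracy(correct, total):
    """Fraction of audio files classified correctly."""
    return correct / total

def classification_error_rate(correct, total):
    """Fraction of audio files classified incorrectly (complement of accuracy)."""
    return 1.0 - correct / total
```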

    This section discusses observations made on the models used for classification and recognition.

    SVM:

    ● In SVM the classification rate can be improved by applying different normalization methods.

    ● SVM performance varies with the choice of kernel function

    ● Non-linear SVM kernels are well suited for classification of speech data

    ANN:

    ● Recognition accuracy can be increased by using large data set and the selection of appropriate optimizer function

    ● Increasing the number of hidden nodes improves the learning phase

    LSTM:

    The LSTM works on par with the ANN, except for requiring a proper choice of the CTC loss function; a suitable cost function helps yield a good recognition rate. The LSTM requires fewer features than the SVM and ANN to model the data.

    In general, the SVM and ANN perform equally well, but not as well as the LSTM. This is due to the optimality of the weighted Bionic-MFCC features. The results of the LSTM model on the FSDD dataset are best with db11 because the FSDD is a fine-tuned dataset, and the results for the db11 wavelet at 15 dB are best because of the higher signal-to-noise ratio of that data.

    In this work a discretization procedure for the continuous bionic wavelet has been proposed for convoluted noisy speech recognition. The obtained bionic wavelet features are used to reduce the noise level in the speech data and are then used in MFCC to obtain the Bionic-MFCC speech features, improving on plain MFCC features. The results show that the LSTM with the db11 wavelet at 15 dB SNR outperforms the other models. It is also observed that recognition accuracy depends on the nature of the dataset.

    This is a unique application of the continuous bionic wavelet for feature extraction using the central frequencies of the db11, coif5 and bior3.5 wavelets on convoluted noisy speech data. The work also demonstrates that even basic mother-wavelet features can be adopted when converting continuous to discrete wavelets. Convoluted noisy speech is tedious to handle because the noise overlaps the original data and its frequency must be identified within the convolution of signal and noise. According to our study, additive noise can be removed almost completely using filters, but convoluted noise cannot; hence this approach reduces the noise using the continuous wavelet for isolated word recognition. The LSTM model classifies best, improving recognition accuracy to 96% with a 4% word error rate. The bionic wavelet thus holds up well and can be made adaptive by applying various thresholding concepts. The study also shows that the central frequency and the thresholding play major roles both in noise reduction and in converting the continuous wavelet to a discrete one. For the Kannada dataset, the word error rate is high because of variation in speakers' pronunciations, whereas the FSDD has a good recognition rate because it is fine-tuned.

    Future enhancements:

    Apart from thresholding, genetic algorithms can be adopted for feature reduction. The central frequencies of other wavelets can also be tried for discretization. The performance of the above models can be verified for different noise types at various noise levels. The models can also be extended to sentence-level recognition, and DWT trees can be used for speech enhancement by noise reduction.

    We are thankful to all the persons who helped in formulating this paper. The authors remain grateful to Dr. S.K. Katti for all his support.

    All authors declare no conflicts of interest in this paper.

  • © 2017 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)