Research article

Analyzing temporal coherence for deepfake video detection

  • Received: 03 January 2024 Revised: 01 March 2024 Accepted: 20 March 2024 Published: 29 March 2024
  • Current facial image manipulation techniques achieve impressive quality and have raised public concern. However, these techniques mostly synthesize videos frame by frame and pay little attention to the highly discriminative temporal artifacts between frames. Detecting deepfake videos through temporal modeling therefore remains a challenge. To address this issue, we present a novel deepfake video detection framework with two levels: temporal modeling and coherence analysis. At the first level, to capture temporal coherence over the entire video, we devise an efficient temporal facial pattern (TFP) mechanism that explores the color variations of forgery-sensitive facial areas by providing global and local-successive temporal views. The second level presents a temporal coherence analyzing network (TCAN) that combines global temporal self-attention, high-resolution fine and low-resolution coarse feature extraction, and aggregation mechanisms, with the aims of modeling long-range relationships from a local-successive temporal perspective within a TFP and capturing the dynamic incoherence that is vital for robust detection. Thorough experiments on large-scale datasets, including FaceForensics++, DeepFakeDetection, DeepFake Detection Challenge, CelebDF-V2, and DeeperForensics, show that our approach surpasses current methods and remains effective on unseen types of deepfake videos.

    Citation: Muhammad Ahmad Amin, Yongjian Hu, Jiankun Hu. Analyzing temporal coherence for deepfake video detection[J]. Electronic Research Archive, 2024, 32(4): 2621-2641. doi: 10.3934/era.2024119




    1. Introduction

    The focus of this "interim" review is to summarize the literature pairing working memory (WM) tasks with transcranial direct current stimulation (tDCS) and other neuromodulatory techniques, and to propose some contemporary recommendations going forward. A primary challenge for this work is the enduring problem of optimizing a broad parameter space: the number of sessions, session spacing, online/offline stimulation, electrode placement, stimulus type and duration, task and transfer task selection, and individual differences. The goal is to help future researchers make better experimental design decisions by clarifying where there are guideposts and where there are none. This article joins recent reviews of single-session [1,2,3] and longitudinal [4,5] tDCS effects on cognitive tasks, with a narrower focus on recent work applying longitudinal tDCS to enhance WM. It also continues the thread of 'lessons learned' begun in a previous article detailing recommendations for tDCS studies with a single session per stimulation condition [6]. For those beginning tDCS-related work, a number of helpful tutorials [7] and recent methodological reviews [2,3,8,9,10,11] provide greater context. In the following sections, we provide some background for longitudinal studies using tDCS to improve WM. Our own work serves as an example of the logistical challenges and limitations.


    2. Single-Session tDCS and Working Memory

    Over the last decade, tDCS has emerged as a popular cognitive neuroscience research tool following the seminal, and continuing, findings from the Nitsche and Paulus labs [12,13,14,15]. To provide some historical framework, early studies pairing WM with tDCS tested structure-function relationships. For example, one of our early projects demonstrated that tDCS over parietal regions replicated an unexpected pattern of behavior observed in neuropsychological patients with parietal lesions [16,17,18]. Namely, cathodal tDCS over the posterior parietal cortex selectively impaired visual working memory probed by recognition, but had no effect on trials probed by free recall [19]. This work provided convergent evidence complementing the patient research. In subsequent work, we applied tDCS to healthy young and older adults prior to episodic memory or WM tasks. We found striking differences in performance, with nearly equal and opposite responses as a function of factors such as level of education or WM capacity [6,20,21]. These data reflecting the importance of individual differences are not oddities, yet they are often overlooked in studies applying tDCS [22]. Other researchers report that measuring the initial response to tDCS can predict different patterns of tDCS-linked performance changes [23,24,25,26,27,28]. Another rarely noted issue is that differences in brain morphology shape an individual's response to tDCS [29], as does skull thickness [30], amid a range of other individual differences [31]. This individual variability in responsiveness to tDCS contributes to variability in meta-analyses and reviews, with some reporting null effects and others modest or notable effects in single-session paradigms [32,33,34,35,36,37]; but see [38,39,40,41]. In short, single sessions of tDCS, across a variety of montages, intensities, and WM tasks, have variable, difficult-to-predict, and modest effects on WM performance.
    This makes it difficult to produce a single all-purpose protocol for general use. It is a cautionary tale, particularly in light of a growing do-it-yourself community and greater use of commercially available products, including foc.us, which has been shown to significantly impair WM performance [42]. It may not be an overstatement to say that tDCS-related cognitive research is in danger of losing legitimacy given exuberant industry claims and marketing.


    3. Longitudinal tDCS Consistently Benefits WM

    Despite the aforementioned limitations associated with single sessions of stimulation, there is a clear translational appeal in testing whether tDCS can sustain or improve cognitive performance over longer time periods: tDCS is well-tolerated, affordable, and offers some participants significant cognitive benefits. WM is valuable to explore because it is an executive function essential for wide-ranging cognitive tasks, yet it is strictly capacity limited; thus, any WM improvement can be a meaningful quality-of-life improvement. Furthermore, there is a large WM training literature to draw on for guidance in developing protocols (see recent reviews touching on this topic and current debates in the WM training field: [43,44,45,46,47,48,49,50,51,52]). Perhaps surprisingly, given the variability in the broader WM training literature, there is marked consistency in the small literature in which WM training is paired with tDCS. This consistency is notable because these studies use different WM training tasks, tDCS protocols, numbers of sessions, and participant populations; see Table 1. WM benefits are associated with longitudinal tDCS in younger [2] and older adults [53,54,55,56,57], in special populations (vascular dementia [58]; schizophrenia [59]; stroke [60]; PTSD [61]), and across verbal and visuospatial WM tasks [3,24,28,32,62,63,64]. Other forms of cognitive training paired with tDCS also show WM benefits in healthy and special populations (major depression) [65]. To the best of our knowledge, these are all of the published studies pairing WM training with longitudinal tDCS that appear on PubMed (March 2017) using the search terms "tDCS, working memory training" and "transcranial direct current stimulation, working memory training", and the same terms with working memory replaced by short-term memory.
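The search strategy above crosses each stimulation term with each memory-training term. A minimal sketch of that query-building step is below; the helper name `build_queries` is ours, and actually submitting the strings to PubMed (e.g., via the E-utilities API) is left out.

```python
# Build the PubMed query strings for the literature search described above:
# each stimulation term is crossed with each memory-training term.
# (Helper and variable names are illustrative, not from the original study.)

STIM_TERMS = ["tDCS", "transcranial direct current stimulation"]
TRAINING_TERMS = ["working memory training", "short-term memory training"]

def build_queries(stim_terms, training_terms):
    """Return every 'stimulation, training' search-string combination."""
    return [f"{s}, {t}" for s in stim_terms for t in training_terms]

queries = build_queries(STIM_TERMS, TRAINING_TERMS)
for q in queries:
    print(q)
```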

    Table 1. Longitudinal tDCS studies showing improved WM.
| Study | N | Sessions (#) | mA/min | A/C placement | Stim | Trained task | Result | Transfer |
|---|---|---|---|---|---|---|---|---|
| **Healthy adults** | | | | | | | | |
| [66] | 54 ya | 10 | 2/30 | F3/R. Delt. | D | Verbal | + | Y− |
| [64] | 58 ya | 10 | 1.5/15 | F3/F4 | D | Verbal | + | Y+ |
| [55] | 40 oa | 10 | 2/30 | F3, F4/Arm | D | Aud/Vis | + | − |
| [54] | 72 oa | 10 | 1.5/10 | F4 or P4 or F4–P4/Contra. cheek | D, B | Vis | + | Y+ |
| [57] | 90 oa | 5 | 1 or 2/15 | F4/Contra. cheek | D, B | Vis | − | + |
| [28] | 30 ya | 3 | 1/20 | F3/R. Supra. | D | Verbal | + (faster learning) | Y− |
| [62],[24] | 62 ya | 7 | 2/25 | F3 or F4/Contra. Supra. | D | Adaptive n-back | + for spacing b/w 3rd–4th sessions | Y+ (active groups) |
| **Special populations** | | | | | | | | |
| [60] | 11 stroke | 18.5 | 2/30 | F3, F4/Arm | D | Aud/Vis | + | |
| [67] | 23 TBI | 15 | 1/10 | F3/R. Supra. | B | Att., mem. | − | Y− |
| [61] | 4 PTSD | 5 | 1/10 | F3/R. Supra. | B | | − | n |
| [58] | 21 VasD | 4 | 2/20 | F3/ | | Verbal | − | Y+ |
    Abbreviations: A: anode placement in 10–20 system; Aud: auditory; B: before training task (offline); b/w: between; C: cathode placement in 10–20 system; Contra: contralateral; D: during training task (online); Delt: deltoid; N: number of participants; oa: older adults; PTSD: posttraumatic stress disorder; R.: right; Supra: supraorbital; TBI: traumatic brain injury; VasD: vascular dementia; Vis: visual WM training task; ya: young adults; Y: yes, tested; #: number of sessions; −: null effect; +: significant positive effect.

    Next, we provide a more detailed summary of several of our longitudinal experiments. We paired visual WM training with tDCS in a longitudinal study of healthy, well-educated older adults [54]. After baseline assessment, 72 participants completed 10 sessions of visual WM training after receiving 10 minutes of 1.5 mA anodal tDCS to the right PFC (F4), the right PPC (P4), alternating between those sites, or sham (20 s ramp up/down). We selected 1.5 mA because our prior work had shown that this intensity was effective at disrupting WM tested by recognition [19]. It is noteworthy that some evidence indicates that tDCS dosage effects can be non-linear [68,69], and future work is needed to comprehensively characterize dosage × task interactions. In this study, the WM training tasks required retention of object identity or location, and performance was tested by recall or recognition. Participants returned for a follow-up session after a 1-month period of no contact. We also included measures of near transfer to untrained WM tasks, including the perennial n-back task, to explore the generalizability of WM-related benefits. Transfer tasks are typically categorized as near or far to reflect how similar the task is to the training task. In this instance, near transfer tasks would be other, untrained WM tasks, whereas far transfer tasks would fall under other cognitive domains, such as an episodic memory task or another executive function task. By the end of training, all participants had improved similarly on the trained WM tasks, with greater improvement on more challenging WM training tasks. In other words, for this WM-focused training task, stimulation of any or both nodes in this frontoparietal network was beneficial, so all groups who received tDCS were collapsed into an "active tDCS" group. However, at follow-up testing one month later, significant tDCS-linked benefits emerged.
First, the active tDCS group, stimulated at either site, maintained their performance gains, whereas the sham group had lost ground. Second, at follow-up the active tDCS group also showed significantly higher performance on a set of unpracticed near transfer WM tasks. Thus, there were notable visuospatial WM benefits in this group of older adults that emerged only after an extended delay. This late emergence may explain why some groups fail to detect significant differences when they do not include follow-up testing. The nature of these benefits was to perpetuate training gains rather than to show continued improvement. For practical purposes, the timeline of tDCS-linked cognitive changes can be protracted and can be overlooked if no follow-up testing takes place.
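For concreteness, the design parameters of this study can be captured in a small configuration sketch. Field and class names are our own; the values follow the description above (10 sessions, 1.5 mA, 10 minutes, 20 s ramps, anode at F4, P4, alternating, or sham).

```python
from dataclasses import dataclass
from itertools import cycle, islice

@dataclass
class TDCSProtocol:
    """Longitudinal tDCS + WM-training design (values from the study text)."""
    n_sessions: int = 10        # training sessions after baseline
    intensity_ma: float = 1.5   # anodal current in mA
    duration_min: int = 10      # stimulation duration per session
    ramp_s: int = 20            # ramp up/down, used for active and sham
    condition: str = "F4"       # "F4", "P4", "alternating", or "sham"

    def site_schedule(self):
        """Anode site per session; alternates between F4 and P4 if requested."""
        if self.condition == "alternating":
            return list(islice(cycle(["F4", "P4"]), self.n_sessions))
        return [self.condition] * self.n_sessions

proto = TDCSProtocol(condition="alternating")
print(proto.site_schedule())
```

A schedule generator like this makes the collapsing of the three active conditions into one "active tDCS" group explicit: each condition differs only in the per-session anode site.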

    The logistical challenges associated with a 10-session training design [55,64,65] prompted us to reduce the number of training sessions to 5 in subsequent work [61]. We were also interested in tDCS intensity. It remains debated whether higher-intensity tDCS leads to stronger effects, with some studies showing that the impact is non-linear, meaning higher intensity is not always better [68,70,71,72,73,74,75]. In short, older adults completed 5 WM training sessions and received either 1 mA, 2 mA, or sham tDCS targeting the right DLPFC [57]. We also measured far transfer to computer-based laboratory tasks (processing speed, cognitive flexibility, arithmetic) and to measures with more ecological validity: the Occupational Therapy—Driver Off Road Assessment [76] and the Weekly Calendar Planning Activity [77]. It is worth noting that the quest for far transfer may turn out to be the cognitive training analogue of the great white whale (recently reviewed in [78]), although a single session of either left or right PFC tDCS during a spatial or verbal WM task showed both near and far transfer [79]. This study also showed evidence of far transfer, but no significant training effect and no significant near transfer [57].

    In addition, we combined DNA data from the Jones et al. (2015) and Stephens and Berryhill (2016) studies to test how a key single-point mutation in the COMT gene (val158met) relates to tDCS-linked WM training outcomes [80]. The COMT gene is important in WM because it encodes the enzyme responsible for dopamine degradation in frontal synapses. Furthermore, the val158met mutation reflects functional changes in the rate of enzyme activity, such that those with val alleles have a faster-acting enzyme and those with more met alleles have a slower-acting enzyme [81]. Genotype predicts WM performance, such that those with more met alleles perform better when the WM task rules remain consistent, as in change detection or n-back tasks [82,83,84]. We found an interaction between COMT genotype, WM improvement, and tDCS such that moderate tDCS (1.5 mA) enhanced WM performance where it had been weakest prior to training [80]. These data indicate that there is a "Goldilocks" tDCS intensity, such that too much or too little is suboptimal. However, the genotype × tDCS intensity findings suffer from low power, and fitting the pieces into a comprehensive mechanistic understanding limits the predictive power over a given protocol. These observations point toward a challenging set of parameters to optimize for individually tailored protocols, which will require very large samples to achieve sufficient power for genotype × tDCS protocol × task interactions.
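Statistically, the genotype × tDCS-intensity interaction described above is a non-additive pattern across the cells of a two-factor design. A minimal sketch with hypothetical cell means is below; the numbers are invented purely for illustration and are not the study's data.

```python
# Hypothetical mean WM-gain scores (arbitrary units) for a 2 x 2 slice of a
# genotype x tDCS-intensity design. An interaction exists when the effect of
# intensity differs between genotypes (non-zero interaction contrast).
cell_means = {
    ("val/val", "sham"):  0.10, ("val/val", "1.5mA"): 0.60,
    ("met/met", "sham"):  0.40, ("met/met", "1.5mA"): 0.45,
}

def interaction_contrast(means, a_levels, b_levels):
    """(A1B2 - A1B1) - (A2B2 - A2B1): zero iff the two factors are additive."""
    a1, a2 = a_levels
    b1, b2 = b_levels
    return (means[(a1, b2)] - means[(a1, b1)]) - (means[(a2, b2)] - means[(a2, b1)])

c = interaction_contrast(cell_means, ("val/val", "met/met"), ("sham", "1.5mA"))
print(f"interaction contrast = {c:+.2f}")  # non-zero: genotype moderates tDCS
```

In these made-up means, val/val carriers (the weaker performers at baseline in the text's account) gain most from 1.5 mA, which is the qualitative pattern the paragraph describes.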

    In addition to the number of sessions and tDCS intensity, session spacing may be important. A recent study including 7 sessions of anodal tDCS to the right or left PFC showed lasting WM benefits compared to sham on a visuospatial n-back WM task [62]. Unexpectedly, when the weekend fell as a two-day gap between the 4th and 5th sessions, the benefits were significantly smaller than when the weekend fell between the 3rd and 4th sessions [62]. In a recent re-analysis with added follow-up testing, the same group reported that the tDCS-linked improvements to verbal WM remained evident a year after training ended [24]. However, a recent meta-analysis of inter-session spacing found no systematic benefit of greater spacing between tDCS training sessions in cognitive tasks [1]. Thus, questions remain regarding the optimization of session spacing in longitudinal designs.
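The weekend-placement effect above is a scheduling property that is easy to make concrete: for 7 consecutive weekday sessions, the start day determines which pair of sessions straddles the weekend. A small date-logic sketch (nothing here is from the original analysis; the helper names are ours):

```python
from datetime import date, timedelta

def weekday_schedule(start, n_sessions):
    """Dates of n consecutive weekday (Mon-Fri) sessions from `start`."""
    sessions, d = [], start
    while len(sessions) < n_sessions:
        if d.weekday() < 5:       # 0-4 = Mon-Fri
            sessions.append(d)
        d += timedelta(days=1)
    return sessions

def weekend_gap_after(sessions):
    """1-based index of the session after which the weekend falls, or None."""
    for i in range(len(sessions) - 1):
        if (sessions[i + 1] - sessions[i]).days > 1:
            return i + 1
    return None

# Starting Wednesday: the weekend falls between the 3rd and 4th sessions.
wed_start = weekday_schedule(date(2017, 3, 1), 7)   # 2017-03-01 was a Wednesday
# Starting Tuesday: the weekend falls between the 4th and 5th sessions.
tue_start = weekday_schedule(date(2017, 2, 28), 7)  # 2017-02-28 was a Tuesday
print(weekend_gap_after(wed_start), weekend_gap_after(tue_start))
```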

    The observations from the visual WM training literature are consistent with reports from verbal WM training paired with tDCS [28,64,85]. Although the training tasks differ (Martin et al. (2013) and Talsma et al. (2016) used n-back tasks, whereas Richmond et al. (2014) used an adaptive verbal and spatial span WM task), the results show a benefit of tDCS to the left DLPFC. Importantly, Richmond et al. (2014) also found near transfer benefits to other WM tasks; see Table 1. Talsma et al. (2016) included three sessions and found that the benefit of tDCS was to bring participants to their end state faster, rather than a main effect of training. These three studies reveal a heterogeneity of findings that may be attributed to different experimental designs, and such differences can only be resolved with further study. Nonetheless, they speak to the generalizability of performance benefits when WM training is paired with anodal tDCS targeting the right or left DLPFC.


    4. MIA: A Complete Understanding of the Mechanism Underlying tDCS

    Progress is being made in identifying the mechanisms responsible for tDCS-linked cognitive performance benefits. The combination of a cognitive task [86] and tDCS likely strengthens task-relevant networks via some form of LTP-like neuroplastic change [87], particularly in task-relevant networks [8,11], and especially when tDCS is applied 'online' (concurrently with the task) rather than 'offline' (with the task after tDCS ends) [88]. However, arriving at a fully fleshed-out understanding of this seemingly straightforward perspective is non-trivial. One challenge is that mechanism can be studied from the molecular to the network level; understanding the literature at each of these levels is difficult, and assembling research teams able to pursue inquiries at each level is non-trivial. Physiological data indicate that tDCS induces changes in all evaluated neurotransmitter systems [8]. Neuroimaging techniques, including EEG and fMRI paired with tDCS, reveal that tDCS has a number of effects: blood flow changes in functionally connected neural networks [89,90], enhanced BOLD signal [91], enhanced resting-state connectivity [92,93,94,95], enhanced functional connectivity [96], greater neural synchrony [72,97,98,99], and modulated oscillatory activity [100,101]. Current-flow modeling using realistic human head models reveals that tDCS modulates neural activity between the anodal and cathodal electrodes, making it difficult to predict the extent of stimulation [102,103,104,105,106,107]. In essence, beyond the various experimental parameters that must be better characterized, the consequences of longitudinal tDCS remain only partially understood. This limitation hampers our ability to design useful interventions with predictable outcomes.


    5. Other Emerging Noninvasive Brain Stimulation Techniques and Working Memory

    Several other experimental techniques are worth noting. Recent work suggests that transcranial alternating current stimulation (tACS) might be a beneficial approach for WM, because participants showed superior performance after tACS compared to performance after tDCS or sham [108]. tACS also modulates oscillations during WM tasks, with theta band (6 Hz) stimulation improving WM performance [109]. Furthermore, gamma band (80–100 Hz) tACS applied during the peaks of ongoing theta tACS improved WM performance further. This approach faces its own consistency challenges, as a second study examining WM performance after multiple tACS sessions reported no benefit [110]. Transcranial random noise stimulation (tRNS) has been included in several longitudinal studies showing benefits for tinnitus [110], amblyopia [111], motor learning [112], approximate number sense [113], and arithmetic performance [114]. To clarify, the approximate number sense task requires people to estimate and compare the magnitudes of two quantities, such as reporting which of two sets is larger [115]. The only published study of WM training involving tRNS failed to show a performance benefit [116]. Evidence is emerging piecemeal across techniques, further complicating optimization.
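The gamma-on-theta-peak stimulation described above can be sketched as a waveform: a continuous 6 Hz theta carrier with gamma (here 90 Hz, within the 80–100 Hz band quoted in the text) added only near the theta peaks. The gating rule and amplitudes below are our illustrative choices, not parameters from the cited studies.

```python
import math

FS = 1000          # samples per second
THETA_HZ = 6       # theta tACS frequency (from the text)
GAMMA_HZ = 90      # within the 80-100 Hz gamma band quoted in the text
GATE = 0.7         # add gamma only where the theta cosine exceeds this

def theta_gamma_waveform(duration_s=1.0):
    """Theta carrier plus gamma bursts gated to the theta peaks."""
    wave = []
    for n in range(int(FS * duration_s)):
        t = n / FS
        theta = math.cos(2 * math.pi * THETA_HZ * t)
        gamma = 0.3 * math.sin(2 * math.pi * GAMMA_HZ * t) if theta > GATE else 0.0
        wave.append(theta + gamma)
    return wave

wave = theta_gamma_waveform()
```

The key property is that the composite signal exceeds the theta amplitude only around the theta peaks, mirroring the peak-coupled design of the stimulation protocol.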


    6. Translational Applications in Cognition

    Many of us share a growing awareness of time and its cognitive consequences. Age-related cognitive decline, or worse, dementia, looms on the horizon. For researchers interested in cognitive performance in aging, there is a desire to examine how to stave off cognitive decline. Importantly, the aging population, as well as the Alzheimer's population, is growing and interested in trying noninvasive approaches. Non-invasive brain stimulation might serve in an adjuvant capacity to prolong quality of life in the aging population, as part of a multipronged approach addressing diet, exercise [117], and social support [118].

    In addition, other special populations have demonstrated improved performance after tDCS. This includes cognitive training and improved WM in those with vascular dementia [58], TBI [67], posttraumatic stress disorder [61], and stroke [60]. Single session work shows benefits in those with Parkinson's disease [119], depression and epilepsy [120], schizophrenia [59], and pain [121].


    7. Lessons Learned

    Longitudinal studies are heavily resource intensive. The optimal values for various paradigm settings (tDCS montage/intensity/duration, task selection, transfer task selection) when pairing tDCS with cognitive training have not been systematically studied and merit further research. It is possible that filling that gap in knowledge would make it feasible to tailor a paradigm to a particular individual. Below are several points where some limited consistency has emerged in WM performance benefits after multiple sessions of anodal tDCS targeting the PFC. These issues may be useful to consider in future studies:

    •    tDCS targeting different nodes (e.g., PFC, PPC) in a task-relevant network can produce similar effects, suggesting that the widespread stimulation associated with tDCS may be helpful, but nonspecific. These networks influence each other and the faster ventral network oscillations via cross-frequency phase synchrony (CFPS), which tDCS can modulate.

    •    Testing performance changes after a long (>1 month) delay can reveal effects of tDCS consistent with a prolonging of training-related performance benefits.

    •    Including near and far transfer tasks, and testing them at follow-up.

    •    Collecting independent measures of performance on a different task to evaluate individual differences.


    Acknowledgments

    This research was funded by NSF OIA 1632849, and NSF OIA 1632738.


    Conflict of Interest

    All authors declare no conflicts of interest pertaining to this paper.




    [1] M. Kowalski, Deepfakes. Available from: https://www.github.com/MarekKowalski/FaceSwap/.
    [2] K. Liu, I. Perov, D. Gao, N. Chervoniy, W. Zhou, W. Zhang, Deepfacelab: integrated, flexible and extensible face-swapping framework, Pattern Recognit., 141 (2023), 109628. https://doi.org/10.1016/j.patcog.2023.109628 doi: 10.1016/j.patcog.2023.109628
    [3] D. Afchar, V. Nozick, J. Yamagishi, I. Echizen, MesoNet: a compact facial video forgery detection network, in 2018 IEEE International Workshop on Information Forensics and Security (WIFS), (2018), 1–7. https://doi.org/10.1109/WIFS.2018.8630761
    [4] F. Matern, C. Riess, M. Stamminger, Exploiting visual artifacts to expose deepfakes and face manipulations, in 2019 IEEE Winter Applications of Computer Vision Workshops (WACVW), (2019), 83–92. https://doi.org/10.1109/WACVW.2019.00020
    [5] Y. Qian, G. Yin, L. Sheng, Z. Chen, J. Shao, Thinking in frequency: face forgery detection by mining frequency-aware clues, in ECCV 2020: Computer Vision – ECCV 2020, Springer-Verlag, (2020), 86–103. https://doi.org/10.1007/978-3-030-58610-2_6
    [6] H. Liu, X. Li, W. Zhou, Y. Chen, Y. He, H. Xue, et al., Spatial-phase shallow learning: rethinking face forgery detection in bfrequency domain, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 772–781.
    [7] S. Chen, T. Yao, Y. Chen, S. Ding, J. Li, R. Ji, Local relation learning for face forgery detection, in Proceedings of the AAAI Conference on Artificial Intelligence, 35 (2021), 1081–1088. https://doi.org/10.1609/aaai.v35i2.16193
    [8] Q. Gu, S. Chen, T. Yao, Y. Chen, S. Ding, R. Yi, Exploiting fine-grained face forgery clues via progressive enhancement learning, in Proceedings of the AAAI Conference on Artificial Intelligence, 36 (2022), 735–743. https://doi.org/10.1609/aaai.v36i1.19954
    [9] X. Li, Y. Lang, Y. Chen, X. Mao, Y. He, S. Wang, et al., Sharp multiple instance learning for DeepFake video detection, in Proceedings of the 28th ACM International Conference on Multimedia, (2020), 1864–1872. https://doi.org/10.1145/3394171.3414034
    [10] Z. Gu, Y. Chen, T. Yao, S. Ding, J. Li, F. Huang, et al., Spatiotemporal inconsistency learning for DeepFake video detection, in Proceedings of the 29th ACM International Conference on Multimedia, (2021), 3473–3481. https://doi.org/10.1145/3474085.3475508
    [11] S. A. Khan, H. Dai, Video transformer for deepfake detection with incremental learning, in Proceedings of the 29th ACM International Conference on Multimedia, (2021), 1821–1828. http://doi.org/10.1145/3474085.3475332
    [12] D. H. Choi, H. J. Lee, S. Lee, J. U. Kim, Y. M. Ro, Fake video detection with certainty-based attention network, in 2020 IEEE International Conference on Image Processing (ICIP), (2020), 823–827. http://doi.org/10.1109/ICIP40778.2020.9190655
    [13] E. Sabir, J. Cheng, A. Jaiswal, W. Abdalmageed, I. Masi, P. Natarajan, Recurrent convolutional strategies for face manipulation detection in videos, Interfaces (GUI), (2019), 80–87.
    [14] A. Chintha, B. Thai, S. J. Sohrawardi, K. Bhatt, A. Hickerson, M. Wright, et al., Recurrent convolutional structures for audio spoof and video deepfake detection, IEEE J. Sel. Top. Signal Process., 14 (2020), 1024–1037. http://doi.org/10.1109/JSTSP.2020.2999185 doi: 10.1109/JSTSP.2020.2999185
    [15] A. Haliassos, K. Vougioukas, S. Petridis, M. Pantic, Lips don't lie: a generalisable and robust approach to face forgery detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 5039–5049.
    [16] Y. Zheng, J. Bao, D. Chen, M. Zeng, F. Wen, Exploring temporal coherence for more general video face forgery detection, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 15044–15054.
    [17] Z. Gu, Y. Chen, T. Yao, S. Ding, J. Li, L. Ma, Delving into the local: dynamic inconsistency learning for DeepFake video detection, in Proceedings of the AAAI Conference on Artificial Intelligence, 36 (2022), 744–752. http://doi.org/10.1609/aaai.v36i1.19955
    [18] X. Zhao, Y. Yu, R. Ni, Y. Zhao, Exploring complementarity of global and local spatiotemporal information for fake face video detection, in ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (2022), 2884–2888. http://doi.org/10.1109/ICASSP43922.2022.9746061
    [19] R. Shao, T. Wu, Z. Liu, Detecting and recovering sequential DeepFake manipulation, in ECCV 2022: Computer Vision – ECCV 2022, Springer-Verlag, (2022), 712–728. http://doi.org/10.1007/978-3-031-19778-9_41
    [20] A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, et al., An image is worth 16 x 16 words: transformers for image recognition at scale, preprint, arXiv: 2010.11929.
    [21] A. Arnab, M. Dehghani, G. Heigold, C. Sun, M. Lučić, C. Schmid, ViViT: a video vision transformer, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 6836–6846.
    [22] Y. Zhang, X. Li, C. Liu, B. Shuai, Y. Zhu, B. Brattoli, et al., VidTr: Video transformer without convolutions, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 13577–13587.
    [23] L. He, Q. Zhou, X. Li, L. Niu, G. Cheng, X. Li, et al., End-to-end video object detection with spatial-temporal transformers, in Proceedings of the 29th ACM International Conference on Multimedia, (2021), 1507–1516. http://doi.org/10.1145/3474085.3475285
    [24] Z. Xu, D. Chen, K. Wei, C. Deng, H. Xue, HiSA: Hierarchically semantic associating for video temporal grounding, IEEE Trans. Image Process., 31 (2022), 5178–5188. http://doi.org/10.1109/TIP.2022.3191841
    [25] O. de Lima, S. Franklin, S. Basu, B. Karwoski, A. George, Deepfake detection using spatiotemporal convolutional networks, preprint, arXiv: 2006.14749.
    [26] D. Güera, E. J. Delp, Deepfake video detection using recurrent neural networks, in 2018 15th IEEE International Conference on Advanced Video and Signal Based Surveillance (AVSS), (2018), 1–6. http://doi.org/10.1109/AVSS.2018.8639163
    [27] I. Masi, A. Killekar, R. M. Mascarenhas, S. P. Gurudatt, W. AbdAlmageed, Two-branch recurrent network for isolating deepfakes in videos, in ECCV 2020: Computer Vision – ECCV 2020, Springer-Verlag, (2020), 667–684. http://doi.org/10.1007/978-3-030-58571-6_39
    [28] Y. Yu, R. Ni, Y. Zhao, S. Yang, F. Xia, N. Jiang, et al., MSVT: Multiple spatiotemporal views transformer for DeepFake video detection, IEEE Trans. Circuits Syst. Video Technol., 33 (2023), 4462–4471. http://doi.org/10.1109/TCSVT.2023.3281448
    [29] H. Cheng, Y. Guo, T. Wang, Q. Li, X. Chang, L. Nie, Voice-face homogeneity tells deepfake, ACM Trans. Multimedia Comput. Commun. Appl., 20 (2023), 1–22. http://doi.org/10.1145/3625231
    [30] W. Yang, X. Zhou, Z. Chen, B. Guo, Z. Ba, Z. Xia, et al., AVoiD-DF: Audio-visual joint learning for detecting deepfake, IEEE Trans. Inf. Forensics Secur., 18 (2023), 2015–2029. http://doi.org/10.1109/TIFS.2023.3262148
    [31] M. Liu, J. Wang, X. Qian, H. Li, Audio-visual temporal forgery detection using embedding-level fusion and multi-dimensional contrastive loss, IEEE Trans. Circuits Syst. Video Technol., 2023. http://doi.org/10.1109/TCSVT.2023.3326694
    [32] Q. Yin, W. Lu, B. Li, J. Huang, Dynamic difference learning with spatio-temporal correlation for deepfake video detection, IEEE Trans. Inf. Forensics Secur., 18 (2023), 4046–4058. http://doi.org/10.1109/TIFS.2023.3290752
    [33] Y. Wang, C. Peng, D. Liu, N. Wang, X. Gao, Spatial-temporal frequency forgery clue for video forgery detection in VIS and NIR scenario, IEEE Trans. Circuits Syst. Video Technol., 33 (2023), 7943–7956. http://doi.org/10.1109/TCSVT.2023.3281475
    [34] D. E. King, Dlib-ml: A machine learning toolkit, J. Mach. Learn. Res., 10 (2009), 1755–1758. Available from: http://www.jmlr.org/papers/volume10/king09a/king09a.pdf.
    [35] A. Rossler, D. Cozzolino, L. Verdoliva, C. Riess, J. Thies, M. Niessner, FaceForensics++: Learning to detect manipulated facial images, in 2019 IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 1–11. http://doi.org/10.1109/ICCV.2019.00009
    [36] E. Xie, W. Wang, Z. Yu, A. Anandkumar, J. M. Alvarez, P. Luo, SegFormer: Simple and efficient design for semantic segmentation with transformers, in Advances in Neural Information Processing Systems, 34 (2021), 12077–12090.
    [37] W. Wang, E. Xie, X. Li, D. P. Fan, K. Song, D. Liang, et al., Pyramid vision transformer: a versatile backbone for dense prediction without convolutions, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (2021), 568–578.
    [38] X. Chu, Z. Tian, B. Zhang, X. Wan, C. Shen, Conditional positional encodings for vision transformers, preprint, arXiv: 2102.10882.
    [39] M. A. Islam, S. Jia, N. D. B. Bruce, How much position information do convolutional neural networks encode? preprint, arXiv: 2001.08248.
    [40] N. Dufour, A. Gully, P. Karlsson, A. V. Vorbyov, T. Leung, J. Childs, et al., Contributing data to Deepfake detection research by Google Research & Jigsaw, 2019. Available from: http://blog.research.google/2019/09/contributing-data-to-deepfake-detection.html.
    [41] B. Dolhansky, J. Bitton, B. Pflaum, J. Lu, R. Howes, M. Wang, et al., The DeepFake detection challenge (DFDC) dataset, preprint, arXiv: 2006.07397.
    [42] Y. Li, X. Yang, P. Sun, H. Qi, S. Lyu, Celeb-DF: A large-scale challenging dataset for DeepFake forensics, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 3207–3216.
    [43] L. Jiang, R. Li, W. Wu, C. Qian, C. C. Loy, DeeperForensics-1.0: A large-scale dataset for real-world face forgery detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 2889–2898.
    [44] A. P. Bradley, The use of the area under the ROC curve in the evaluation of machine learning algorithms, Pattern Recognit., 30 (1997), 1145–1159. http://doi.org/10.1016/S0031-3203(96)00142-2
    [45] P. Micikevicius, S. Narang, J. Alben, G. F. Diamos, E. Elsen, D. Garcia, et al., Mixed precision training, preprint, arXiv: 1710.03740.
    [46] Z. Zhang, M. R. Sabuncu, Generalized cross entropy loss for training deep neural networks with noisy labels, in Advances in Neural Information Processing Systems, 31 (2018).
    [47] L. van der Maaten, G. Hinton, Visualizing data using t-SNE, J. Mach. Learn. Res., 9 (2008), 2579–2605.
    [48] I. Radosavovic, R. P. Kosaraju, R. Girshick, K. He, P. Dollár, Designing network design spaces, in 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2020), 10425–10433. http://doi.org/10.1109/CVPR42600.2020.01044
    [49] Z. Liu, H. Hu, Y. Lin, Z. Yao, Z. Xie, Y. Wei, et al., Swin transformer v2: Scaling up capacity and resolution, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2022), 12009–12019.
    [50] H. Bao, L. Dong, S. Piao, F. Wei, BEiT: BERT pre-training of image transformers, preprint, arXiv: 2106.08254.
    [51] W. Yu, M. Luo, P. Zhou, C. Si, Y. Zhou, X. Wang, et al., MetaFormer is actually what you need for vision, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2022), 10819–10829.
  • © 2024 the Author(s), licensee AIMS Press. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0)