1 Introduction

Mental fatigue has become one of the most important factors impairing people’s work in the fast-paced, high-pressure environment of modern society. It is often caused by excessive mental or physical activity and results in a decline of psychological and cognitive function [1]. It manifests as an inability to maintain a certain level of psychological operation, and is usually caused by short-term high-intensity work or long-term monotonous repetitive work [2]. Besides reducing an individual’s work efficiency, mental fatigue may also lead to life-threatening accidents in safety-critical professions. Effective monitoring of mental fatigue is therefore of great significance for reducing its adverse impact on work and life, and a scientific classification of mental fatigue is the key to monitoring the fatigue state effectively.

Psychological fatigue, whether caused by short-term high-intensity work or long-term monotonous repetitive work, is mainly due to overloaded information processing under limited cognitive resources [1,2,3,4,5,6]. Psychological fatigue caused by visual information processing may therefore differ from that caused by auditory information processing, because the processes and mechanisms of information processing differ between the visual and auditory channels [7,8,9]. Hence, this study explores the consistency of visual and auditory information processing and, on that basis, carries out the classification and validation of visual and auditory fatigue.

Working memory, as one of the important links in information processing, is a psychological process in which people process and store information at the same time [10, 11]. It has been shown that the capacity of working memory, i.e. the amount of information processed simultaneously by the working memory system, reflects the load level of information processing [12,13,14]. We therefore assume that fatigue from short-term high-intensity work and from long-term low-intensity work can be simulated by adjusting the intensity and duration of information processing, which would induce different levels of fatigue and ultimately allow fatigue to be classified and verified.

2 Experiment I: Cognitive Processing Consistency Test Between Audio and Visual Information

2.1 Method

2.1.1 Experimental Material

Visual Material

Ten meaningless pictures were screened from a larger set collected on the internet through multi-round expert evaluation. The ten pictures were then evaluated by six subjects from the same group described in Sect. 2.1.1 on six aspects: the difference between pictures, aesthetics, ease of association, likability, meaning (whether the picture has a special meaning), and pleasure. Pictures were rated on a 10-point scale from 0 to 9. The smaller the number, the smaller the difference between two pictures, the less beautiful the picture, the less easily it evoked associations, the less the subjects liked it, the less it seemed to have a specific meaning, and the unhappier the subjects felt on seeing it. After several rounds of evaluation, nine pictures were selected as the visual stimulus materials in this experiment according to the scores on the different dimensions and the results of cluster analysis.

Auditory Material

Under the condition of the same total energy, nine sounds were generated by changing the frequency, line spectrum and attenuation. The nine sounds were then evaluated by seven subjects from the same group described in Sect. 2.1.1 on six aspects: the difference between sounds, loudness, sharpness, pleasure, associativity and enjoyment. The smaller the number, the smaller the difference between two sounds, the smaller the loudness, the lower the sound, the less easily it evoked associations, and the unhappier it sounded. According to the results of cluster analysis, the frequency, broadband spectrum, attenuation and energy of the nine sounds were adjusted, and the adjusted sounds were finally selected as the auditory stimulus materials in this experiment.

2.1.2 Experimental Procedure

The experiment was conducted in a quiet, sound-insulated environment at a room temperature of about 20 ℃, with suitable humidity and lighting. Visual and auditory working memory tasks were used to examine the consistency of information processing between the visual and auditory channels. The experimental procedure is shown in Fig. 1.

Fig. 1. Experimental procedure for working memory task

Visual Working Memory Task

2-back:

When the experiment starts, a “×” mark appears in the center of the screen for 200 ms to remind the subject that the task is about to begin. A “+” mark then appears in the center of the screen for 1000 ms, followed by a picture, the visual stimulus, shown for 500 ms. The subject judges whether the picture is the same as the one presented two pictures earlier, pressing the “F” key if it is the same and the “J” key if it is not. For example, if the stimuli are presented as “Picture 1, Picture 2, Picture 3, Picture 4…”, then when the third picture appears the subject presses “F” if it is the same as the first picture and “J” otherwise. There are 64 trials, 22 for practice and 42 for the test. The first two pictures after the beginning of the experiment do not need to be judged, and the accuracy in the practice trials must reach 60% before the formal experiment can start. Before the experiment, the subjects were told to react as quickly and accurately as possible. It takes about 5–6 min to complete the practice and the formal experiment.

3-back:

The experimental procedure was basically the same as in the visual 2-back working memory task, except that the subjects judged whether the current picture was the same as the one presented three pictures earlier.

Auditory Working Memory Task

2-back:

When the experiment starts, a “×” mark appears in the center of the screen for 200 ms to remind the subject that the task is about to begin. A “+” mark then appears in the center of the screen for 1000 ms, followed by a sound, the auditory stimulus, played for 500 ms. The subject judges whether the sound is the same as the one played two sounds earlier, pressing the “F” key if it is the same and the “J” key if it is not. There are 64 trials, 22 for practice and 42 for the test. The first two sounds after the beginning of the experiment do not need to be judged, and the accuracy in the practice trials must reach 60% before the formal experiment can start. Before the experiment, the subjects were told to react as quickly and accurately as possible. It takes about 5–6 min to complete the practice and the formal experiment.

3-back:

The procedure is basically the same as in the auditory 2-back working memory task. The difference is that the subjects respond to whether the sound currently playing is the same as the one played three sounds earlier.
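The n-back judgment used in both channels can be sketched in a few lines of Python. This is a minimal illustration only, not the software used in the experiment; the function names `expected_keys` and `accuracy` and the integer stimulus codes are assumptions made for the example.

```python
from typing import List, Optional


def expected_keys(stimuli: List[int], n: int) -> List[Optional[str]]:
    """Return the correct key for each trial of an n-back sequence.

    "F" means the stimulus matches the one presented n trials earlier,
    "J" means it does not. The first n trials cannot be judged (None).
    """
    keys: List[Optional[str]] = []
    for i, stim in enumerate(stimuli):
        if i < n:
            keys.append(None)
        else:
            keys.append("F" if stim == stimuli[i - n] else "J")
    return keys


def accuracy(stimuli: List[int], responses: List[Optional[str]], n: int) -> float:
    """Proportion of judged trials answered with the correct key."""
    keys = expected_keys(stimuli, n)
    judged = [(k, r) for k, r in zip(keys, responses) if k is not None]
    if not judged:
        raise ValueError("sequence is too short to contain judged trials")
    correct = sum(1 for k, r in judged if k == r)
    return correct / len(judged)


# Hypothetical sequence of picture (or sound) codes drawn from 1..9.
sequence = [1, 2, 1, 4, 1, 4]
print(expected_keys(sequence, 2))  # [None, None, 'F', 'J', 'F', 'F']
print(expected_keys(sequence, 3))  # [None, None, None, 'J', 'J', 'J']
```

The same scoring function serves the 2-back and 3-back conditions; only `n` changes, which mirrors the statement that the two procedures are otherwise identical.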

2.2 Results

The response time and accuracy of the 11 subjects who completed the visual and auditory working memory tasks are shown in Figs. 2 and 3.

Fig. 2. Response time for visual and auditory working memory

Fig. 3. Accuracy for visual and auditory working memory

Within-subject analysis of variance was performed with response time and accuracy as dependent variables and working memory capacity and channel type as independent variables. The results, shown in Table 1, were as follows:

Table 1. T-test and correlation analysis results for visual-auditory working memory task

In terms of accuracy, the main effect of capacity is significant (F(1, 11) = 15.03, p < 0.05, η2 = 0.60): the accuracy of subjects under the 2-back condition is significantly higher than that under the 3-back condition. The main effect of channel and the interaction between capacity and channel are not significant.

Using response time and accuracy as dependent variables and information channel as the independent variable, paired t-tests and correlation analyses were carried out. The results showed significant correlations between the visual and auditory channels in response time under both the 2-back and the 3-back task. Accuracy was also significantly correlated between the visual and auditory channels under the 3-back task, but not under the 2-back task.
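For readers who want to reproduce this kind of channel comparison, the following is a minimal sketch of a paired t statistic and a Pearson correlation using only the Python standard library. The response times are hypothetical values invented for the example, not data from this study, and the function names are assumptions; obtaining a p value would additionally require the t distribution, which is left out here.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Return the paired-samples t statistic and its degrees of freedom."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    diffs = [a - b for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation of the paired differences
    if sd == 0:
        raise ValueError("all paired differences are identical")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Return the Pearson correlation coefficient of two matched samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two matched samples with at least two pairs")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample with zero variance has no correlation")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-subject mean response times in ms (illustration only).
visual_rt = [612.0, 655.0, 701.0, 580.0, 640.0]
auditory_rt = [650.0, 700.0, 760.0, 600.0, 690.0]

t_value, df = paired_t(visual_rt, auditory_rt)
print(f"t({df}) = {t_value:.2f}, r = {pearson_r(visual_rt, auditory_rt):.2f}")
```

The two statistics answer different questions: the t statistic tests whether the channels differ on average, while the correlation asks whether subjects who are fast in one channel are also fast in the other.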

2.3 Discussion

At the same capacity, there was no significant difference in accuracy between the visual and auditory working memory tasks, indicating that the auditory and visual channels were similar in processing accuracy. For response time, the main effect of capacity was significant: response time in the 3-back task was significantly longer than in the 2-back task. The main effect of channel was not significant, and the interaction was marginally significant. Simple effect analysis showed no significant difference in response time between the visual and auditory channels under the 2-back task, but a significant difference under the 3-back task. That is, at lower working memory load the two channels are processed at the same level, whereas at higher load the processing time of the auditory channel is longer than that of the visual channel. This may reflect that auditory information is more difficult to process than visual information, or that auditory information is processed more slowly. In conclusion, the working memory of the visual and auditory channels can achieve the same level of accuracy in information processing, but when the working memory load is high, the processing time of the auditory channel is longer than that of the visual channel.

3 Experiment II: Fatigue Classification Experiment for Visual and Auditory Channel

The results of Experiment I show that there are differences in information processing between the visual and auditory working memory tasks, which may lead to differences in the fatigue trend between visual and auditory information processing. Therefore, Experiment II explores the fatigue classification parameters of the visual and auditory channels separately. Previous studies have confirmed that, besides subjective questionnaires, cognitive behavioral indicators such as digital decoding, short-term memory and critical flicker fusion frequency can also be used to assess fatigue [15]. Critical flicker fusion frequency reflects a basic perceptual ability of human beings, while digital decoding, as a problem-solving ability, belongs to advanced cognitive function. Changes in these two indicators show that both basic perceptual ability and advanced cognitive function decrease significantly under fatigue. However, compared with problem-solving ability, attention is affected more directly by fatigue. It is difficult to maintain attention and alertness under fatigue, which is one of the important reasons for many safety accidents. Therefore, in Experiment II, a subjective scale combined with alertness and basic reaction time was selected as the criterion of fatigue classification.

3.1 Method

3.1.1 Subjects

Eleven healthy male adults with an average age of 24 (SD = 3.12) participated. All were right-handed and had normal or corrected-to-normal vision. They were paid a certain amount after the experiment.

3.1.2 Experimental Procedure

The experiment was conducted in a quiet environment with sound insulation, room temperature of about 20 ℃, as well as suitable humidity and lighting. Visual and auditory working memory tasks were used to induce visual and auditory cognitive fatigue, respectively. The fatigue self-assessment questionnaire and attention-alert tasks were used as fatigue criteria. The NASA load scale was utilized as work load evaluation index to test the validity of the fatigue classification and its relevant parameter. Specific fatigue classification and related task parameters determination procedures are as follows:

Classification of Fatigue Grade and Preliminary Determination of Task Parameters

We assume that the relationship between fatigue level and cognitive task performance follows a growth curve model. To better distinguish the different fatigue levels, the lower inflection point, the middle value and the upper inflection point of the curve can be selected as the low, middle and high fatigue levels, which are marked as red points in Fig. 4. The working memory task parameters for inducing the different fatigue levels, namely the number of memory targets, the task duration and the number of trials, were then determined preliminarily; see Table 2.

Fig. 4. Growth model for fatigue prediction (Color figure online)
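To make the three reference points concrete, the sketch below locates them on a logistic curve. The text does not state which growth function was used, so the three-parameter logistic form is an assumption, as is the reading of the lower and upper inflection points as the places where the curve bends most sharply out of the lower plateau and into the upper one (taken here as the zeros of the third derivative). The parameter values are hypothetical, not fitted values from this study.

```python
import math
from typing import List, Tuple


def logistic(t: float, upper: float, rate: float, midpoint: float) -> float:
    """Logistic growth curve: upper / (1 + exp(-rate * (t - midpoint)))."""
    return upper / (1.0 + math.exp(-rate * (t - midpoint)))


def key_points(upper: float, rate: float, midpoint: float) -> List[Tuple[float, float]]:
    """Return (t, value) for the lower bend, the middle and the upper bend.

    The middle point is where the second derivative is zero (t = midpoint).
    The lower and upper bends are where the third derivative is zero,
    which for a logistic curve is midpoint -/+ ln(2 + sqrt(3)) / rate.
    """
    if rate <= 0:
        raise ValueError("rate must be positive for a growth curve")
    offset = math.log(2.0 + math.sqrt(3.0)) / rate
    times = (midpoint - offset, midpoint, midpoint + offset)
    return [(t, logistic(t, upper, rate, midpoint)) for t in times]


# Hypothetical parameters: fatigue ceiling of 9, growth rate 0.1 per minute,
# midpoint at 30 min of task time.
for label, (t, value) in zip(("low", "middle", "high"), key_points(9.0, 0.1, 30.0)):
    print(f"{label}: t = {t:.1f} min, predicted fatigue = {value:.2f}")
```

Under this assumption the three points always sit at about 21%, 50% and 79% of the curve's ceiling, whatever the rate and midpoint, which is one way to see why they separate the fatigue levels well.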

Table 2. Working memory task parameters corresponding to different fatigue levels

Hierarchical Design Experiment

Since it is not yet possible to determine whether the fatigue levels, induced by the working memory task levels corresponding to the low, middle and high fatigue levels described in Table 2, can fit the lower, middle and upper inflection points in the prediction model, the experiment is carried out by using the method of hierarchical experimental design to determine the task parameters corresponding to different fatigue levels.

Stage I:

Using the working memory task parameters of middle and low levels described in Table 2, the fatigue scores induced by the corresponding tasks were obtained, and the levels of fatigue at different times, loads and channels were determined.

The fatigue grade was determined by taking polynomial response time, the subjective scale and the attention alertness test as fatigue criteria. We used repeated within-subject measurements to obtain the criterion data of the same subject before and after completing tasks with different durations, load levels and channels. Based on the criterion data, the model was fitted to determine the task-induced fatigue level. Considering the time cost of the test, the endurance of the subjects and the cumulative fatigue that a pre-test might cause in subsequent tasks, we did not collect pre-test data for each task, but used the data from a single pre-test, collected before all tasks, as the reference value for every task. Because multiple tests were run on the same day, adequate rest was given between them to reduce the possible impact of fatigue from earlier tests on later ones.

Stage II:

According to the experimental results of the Stage I, the experimental parameters of three kinds of fatigue grades are determined. If the fatigue induced by 5-min task in Stage I is low-grade fatigue and that induced by 30-min task is medium-grade fatigue, 45 min or 60 min would be used as the experimental task parameters of high-level fatigue induction. Based on the experimental results of Stage I, the fatigue grade can be determined. Selected experimental parameters are shown in Table 3.

Table 3. Optional set of fatigue classification test parameters based on the Stage I

3.2 Results

3.2.1 Results for Working Memory Task

The response time and accuracy of the visual and auditory working memory tasks with durations of 5 min and 30 min are shown in Figs. 5 and 6. Within-subject analysis of variance was performed with response time and accuracy as dependent variables and task duration, working memory load and channel type as independent variables. For response time, the main effect of channel was significant (F(1, 10) = 24.22, p < 0.001, η2 = 0.71); the other main effects and the interactions were not significant, as shown in Fig. 5. For accuracy, the main effect of load was significant (F(1, 10) = 14.65, p < 0.01, η2 = 0.59), as was the interaction between time and channel (F(1, 10) = 10.27, p < 0.01, η2 = 0.51); the main effect of channel and the other interactions were not significant.
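The effect sizes in this paragraph can be cross-checked from the F values and their degrees of freedom. The sketch below assumes that the reported η2 is partial eta squared, which the text does not state explicitly; the function name is an assumption made for the example.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared recovered from an F statistic.

    eta_p^2 = (F * df_effect) / (F * df_effect + df_error)
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# F statistics reported in this paragraph: (F, df_effect, df_error).
reported = [(24.22, 1, 10), (14.65, 1, 10), (10.27, 1, 10)]
for f, d1, d2 in reported:
    print(f"F({d1}, {d2}) = {f}: partial eta squared = {partial_eta_squared(f, d1, d2):.2f}")
# prints 0.71, 0.59 and 0.51, matching the reported values
```

The agreement with the three reported values supports the partial eta squared reading, although it does not prove that this is how the values were originally computed.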

Fig. 5. Response time for fatigue classification experiment

Fig. 6. Accuracy for fatigue classification experiment

3.2.2 Results for Cognitive Behavior Performance

The results of polynomial response time showed that the simple response time increased after 5 min of cognitive task and continued to increase after 30 min of cognitive task. The results were consistent with the hypothesis, see Fig. 7.

Fig. 7. Simple reaction time under different fatigue condition

According to signal detection theory, crossing whether the signal appears with whether the subject reports it yields four outcomes, summarized as the hit rate, false report (false alarm) rate, missed report rate and correct negative rate. Because the hit rate and the missed report rate sum to 1, and the correct negative rate and the false report rate sum to 1, the hit rate and the false alarm rate were selected as the fatigue classification criteria for statistical analysis.
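The complementary relationship among the four rates can be illustrated with a short sketch. The trial counts below are hypothetical and the function name `detection_rates` is an assumption made for the example; the point is only that two of the four rates carry all the information.

```python
from typing import Dict


def detection_rates(hits: int, misses: int,
                    false_alarms: int, correct_rejections: int) -> Dict[str, float]:
    """Convert the four signal-detection outcome counts into rates.

    Hit and miss rates are computed over signal trials, false alarm and
    correct rejection rates over no-signal trials, so each pair sums to 1.
    """
    signal_trials = hits + misses
    noise_trials = false_alarms + correct_rejections
    if signal_trials == 0 or noise_trials == 0:
        raise ValueError("need at least one signal trial and one no-signal trial")
    return {
        "hit_rate": hits / signal_trials,
        "miss_rate": misses / signal_trials,
        "false_alarm_rate": false_alarms / noise_trials,
        "correct_rejection_rate": correct_rejections / noise_trials,
    }


# Hypothetical counts from one vigilance session (illustration only).
rates = detection_rates(hits=18, misses=2, false_alarms=3, correct_rejections=27)
print(rates)
```

Because each pair sums to 1, reporting the hit rate and the false alarm rate is enough to reconstruct the other two.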

Repeated-measures analysis of variance was carried out with task time, task channel and task load as independent variables and the hit rate and false alarm rate in the attention alertness task as dependent variables. For hit rate, the main effects of task time, task channel and task load and all second-order and third-order interactions were not significant. For false alarm rate, the main effect of task time was significant (F(1, 5) = 20.61, p < 0.01, η2 = 0.81): the false alarm rate in the vigilance task after 30 min of the working memory task was significantly higher than that after 5 min. The remaining main effects and interactions were not significant for false alarm rate; see Fig. 8.

Fig. 8. False alarm under different fatigue condition

The results above showed that the alertness of the subjects gradually decreased as the fatigue level increased: the false alarm rate after the 5-min working memory task was higher than the baseline level, and the false alarm rate after the 30-min working memory task was higher than that after the 5-min working memory task.

3.2.3 Results for Subjective Evaluation

The results showed that the positive emotions measured by the PASSS Emotion Scale decreased after 5 min of the working memory task. After 30 min of the working memory task, the positive emotions were lower than both the baseline level and the level after the 5-min task, which accorded with the research hypothesis; see Fig. 9. Negative emotions showed no significant difference between the 5-min and the 30-min working memory task, as shown in Fig. 10.

Fig. 9. Positive emotions under different fatigue condition

Fig. 10. Negative emotions under different fatigue condition

The Fatigue Scale score (see Fig. 11) showed that the fatigue induced by 30-min working memory task was significantly greater than that induced by 5-min working memory task, especially for the auditory channel. However, the 5-min working memory task induced less fatigue than the baseline level.

Fig. 11. Fatigue scale score

The experimental results show that the workload of 30-min working memory task is greater than that of 5-min working memory task (Fig. 12), and the main difference is mental workload (Fig. 13). This shows that working memory task mainly induces mental workload, and with the increase of task time, the workload increases, which is consistent with the research expectations.

Fig. 12. NASA-TLX score

Fig. 13. Mental demand

3.3 Discussion

The false alarm rate in the vigilance task after the 30-min working memory task was significantly higher than that after the 5-min working memory task, which means that the vigilance of the subjects was lower after the 30-min task than after the 5-min task. This indicates that the fatigue induced by the 30-min working memory task was significantly greater than that induced by the 5-min task. Therefore, working memory tasks of different durations can be used as fatigue-inducing tasks. According to the scores of the fatigue scale and the workload scale, a 30-min working memory task can induce moderate fatigue. Hence, 60 min was chosen as the working memory task duration for inducing high-level fatigue.

4 Experiment III: Comprehensive Verification of Fatigue Classification

4.1 Method

4.1.1 Subjects

Five healthy male adults with an average age of 24.40 (SD = 5.51) participated. All were right-handed and had normal or corrected-to-normal vision. They were paid a certain amount after the experiment.

4.1.2 Experimental Procedure

Based on the results of the fatigue grading experiment, visual and auditory 2-back working memory tasks with a duration of 60 min were selected to induce fatigue. The subjects performed the working memory tasks in the visual and auditory channels in separate sessions at least one day apart. They were asked to ensure adequate rest on the day before the experiment. The order of the visual and auditory tasks was counterbalanced among the subjects.

To monitor and evaluate the subjects’ subjective fatigue state in real time during the experiment, the 60-min working memory task was divided into 10 blocks. After each block, the subjects scored their current state, and the next block started immediately without any rest. State scores range from 0 to 9: 0 represents very unhappy or no fatigue at all, and 9 represents very happy or extreme fatigue. The closer the score is to 0, the unhappier or less fatigued the subject; the closer it is to 9, the happier or more fatigued. The experimental process is shown in Fig. 14.

Fig. 14. Fatigue comprehensive verification of audiovisual channel

4.2 Results

4.2.1 Subjective Evaluation Result

Subjective Fatigue Score

The subjects were asked to rate their subjective fatigue on a 0–9 scale every five minutes. The results showed that the subjective fatigue score increased gradually with time spent on the working memory task, as shown in Fig. 15.

Fig. 15. Changing trend for mental fatigue

Subjective Emotion Score

Whether in visual or auditory channels, the subjective emotions showed a gradual downward trend with the increase of working time, see Fig. 16.

Fig. 16. Changing trend for emotions

4.2.2 Results for Cognitive Behavior

Response Time

The experimental results showed that the working memory response time of the subjects on the visual channel decreases continuously with the elapse of the task time, while that on the auditory channel decreases first with the elapse of the task time, then maintains a relatively stable level after the third time period, and declines again at the sixth time period, and then increases slowly, as shown in Fig. 17.

Fig. 17. Changing trend for response time

Accuracy Rate

In both the visual and the auditory channel, the accuracy of the working memory task decreased gradually with task time in a stepwise downward trend, as shown in Fig. 18. In the auditory channel, the accuracy rate is relatively high in the first and second time periods (5–10 min). In the fourth and sixth time periods (20–30 min), it decreases and stabilizes at a certain level, and it reaches its lowest point in the eighth time period. In the ninth and tenth time periods, the accuracy rate increases slightly. In the visual channel, the trend of working memory task accuracy is similar to that of the auditory channel, but differs in some respects from the results of Gergelyfi’s research [12]. Gergelyfi’s 120-min working memory task in the visual channel showed that accuracy increased during 24–48 min, even exceeding the average accuracy of 0–24 min, and then decreased continuously. In this study, although the accuracy rate also increased during 25–30 min, the average accuracy rate of 25–45 min was not significantly higher than that of 0–25 min.

Fig. 18. Changing trend for accuracy

4.3 Discussion

Based on the research above, this study used 5-min, 30-min and 60-min working memory tasks to induce low-level, medium-level and high-level fatigue, respectively. The subjects used a 0–9 score to evaluate subjective fatigue. If the 0–9 score is divided into five grades, scores of 0–1 form the first level, representing almost no fatigue or low fatigue; 2–3 the second level, representing lower fatigue; 4–5 the third level, representing middle fatigue; 6–7 the fourth level, representing high fatigue; and 8–9 the fifth level, representing extremely high fatigue.

At the beginning of the working memory experiment including visual and auditory channel, the fatigue was at lower level. When the task lasted for about 30 min, i.e. the 5th and 6th periods, the fatigue was at a moderate level. At the end of the experiment, the fatigue was at a higher level. The change trend of subjective fatigue of the visual channel was in line with the expectation.
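The five-grade division of the 0–9 score described in this section amounts to pairing adjacent scores, which can be written as a one-line mapping. The function name and the label table below are illustrative assumptions; the level boundaries are taken from the text.

```python
# Labels follow the five grades described in the text.
LEVEL_LABELS = {
    1: "almost no fatigue or low fatigue",
    2: "lower fatigue",
    3: "middle fatigue",
    4: "high fatigue",
    5: "extremely high fatigue",
}


def fatigue_level(score: int) -> int:
    """Map a 0-9 subjective fatigue score to one of five levels.

    Scores 0-1 give level 1, 2-3 level 2, 4-5 level 3, 6-7 level 4, 8-9 level 5.
    """
    if not 0 <= score <= 9:
        raise ValueError("score must be an integer between 0 and 9")
    return score // 2 + 1


for score in (1, 5, 8):
    level = fatigue_level(score)
    print(f"score {score} -> level {level} ({LEVEL_LABELS[level]})")
```

With this mapping, a block score of 4 or 5 around the 30-min mark corresponds to the third, moderate level described above.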

5 General Discussion

This study explored the consistency of visual and auditory information processing through visual and auditory working memory tasks. On this basis, the fatigue classification and validation of the visual and auditory channels were carried out. The results of Experiment I showed no significant difference in the accuracy of working memory tasks between the visual and auditory channels, which indicates that information processing in the two channels can achieve similar accuracy. In terms of processing time, when the workload is low there is no significant difference between the visual and auditory channels, but when the workload is high the working memory response time of the auditory channel is significantly longer than that of the visual channel, which indicates a certain difference in processing time between the two channels. The longer processing time in the auditory channel may reflect that auditorily presented information is more difficult to process than visually presented information, or that auditory information is processed more slowly; this needs further study and discussion.

In Experiment II, it was determined that low, medium and high levels of fatigue could be induced by 2-back tasks of 5 min, 30 min and 60 min, and the fatigue-inducing tasks were comprehensively validated in Experiment III. At the same time, the experimental results also show that the fatigue caused by visual information processing is different from that caused by auditory information processing. Fatigue accumulates more easily in the visual channel than in the auditory channel, which may be related to differences in the characteristics and ways of information processing between the two channels.

Mental fatigue, whether it is caused by short-term high-intensity work or long-term monotonous repetitive work, mainly comes from overloaded information processing. The 2-back working memory task is a low-difficulty task, hence the lower, middle and higher levels of fatigue induced in Experiment II and Experiment III were similar to the fatigue caused by monotonous repetitive mental work. Researchers generally believe that 3-back tasks are more difficult than 2-back tasks [13, 16]. However, in Experiment II, there is no significant difference between 2-back tasks and 3-back tasks in response time, but the accuracy of 2-back tasks is significantly lower than that of 3-back tasks (Note: The average accuracy of 3-back tasks across channels and durations is more than 72.9%, which is consistent with previous studies [12]). Moreover, there was no significant difference between the fatigue induced by the 3-back task and that induced by the 2-back task, no matter whether the task time was 5 min or 30 min, which indicates that although the 3-back task was more difficult than the 2-back task, the fatigue it caused was similar. Thus, the 3-back task cannot be used as a fatigue-inducing task to simulate short-term high-intensity work. If necessary, the difficulty of the work can be increased further, for example by choosing a 4-back working memory task. However, there may be a risk that the participants will not be able to complete the task well, or will even give up the task, because of a significant decline in the quality of task completion [17]. This risk exists not only when working memory tasks are used, but also when other difficult tasks are used to induce fatigue. Therefore, in future studies, to avoid the risk of participants completing tasks poorly or even giving up, it is better to choose low-intensity, long-duration tasks for fatigue induction.

6 Conclusion

(1) There are differences in information processing between the visual channel and the auditory channel. The processing accuracy of the two channels is similar, but the processing time of the auditory channel is longer than that of the visual channel at a higher workload level.

(2) 2-back tasks with durations of 5 min, 30 min and 60 min can be used as low-, medium- and high-level fatigue induction tasks.

(3) When a 2-back working memory task with a duration of 60 min is used to induce the fatigue caused by long-term low-intensity repetitive work, the visual channel is more likely to accumulate fatigue than the auditory channel.