From a presentation by Dr. Kakuichi Shiomi
This is a detailed technical presentation on SiCECA, which is used as a fatigue and drowsiness predictor. The predictor is an application of chaos theory, a field of modern mathematics. It can measure a speaker's degree of tiredness from the human voice. Strictly speaking, it measures the cerebral activity of the speaker.
In 1999, I tried to clean up noisy voice recordings from a Cockpit Voice Recorder so that they could be analyzed. Although I could not make the noisy voice clear, my collaborator found that the chaotic characteristics of the voice showed some variation. Concretely, he found that the time average of the first Lyapunov exponent of the human voice might indicate the cerebral activity of the speaker. In those days, it was very difficult for us to calculate the chaotic characteristics of a voice time series: processing one second of uttered voice took several hours or more.
Today, this research and development is a cooperative effort between the Electronic Navigation Research Institute and several private companies. Cray Inc., a supercomputer maker, is an important partner in this development, since calculating the chaotic characteristics of a voice signal needs a lot of computing resources.
First, I hope that all of you will be able to get some feel for chaos theory, since it is too difficult for me to explain it both rigorously and simply. In modern mathematics, chaos is a kind of deterministic order, and it is quite different from randomness or nonsense.
This example of chaos is from a famous book written by Edward Lorenz, a father of chaos theory.
The curved lines shown in Figure 1 are the paths of a sliding snowboard. Although the snowboard starts from almost exactly the same point each time, it reaches quite different points because of the tiny differences that, strictly speaking, always exist. Chaos theory is a theory for understanding such extraordinary complexity.
Figure 2 shows the waveforms of voiced "gah" sounds. The left figure shows the normal waveform of my voice, and the right figure shows the fatigued waveform after more than one hour of reading stories aloud. Looking at these figures, you may think that it is easy to discriminate between a normal voice and a fatigued voice. But there are many people whose normal waveform looks like that of my fatigued voice in the right figure. Every waveform is always different from the others.
The waveform in the right figure can easily be produced artificially from the waveform in the left figure by adding some kind of low-frequency noise.
Figure 3 shows the power spectra of the normal and the fatigued voiced "gah" sounds. In the right figure, the first and second peaks are higher, since there is low-frequency noise in the fatigued voice.
But the positions of the peaks are unchanged.
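The observation about peak positions can be tried on synthetic data. The sketch below is only an illustration under stated assumptions: the "voice" is three hypothetical harmonics, the low-frequency noise is replaced by a single slow sinusoid, and the names `magnitude_spectrum` and `top_peaks` are invented here. It checks only that the strongest peaks stay at the same frequency bins; it does not reproduce the change in peak height seen in Figure 3.

```python
import cmath
import math

def magnitude_spectrum(samples):
    """Plain DFT magnitudes for bins 0..N/2 (slow, but dependency-free)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * t / n)
                for t, s in enumerate(samples)))
        for k in range(n // 2 + 1)
    ]

def top_peaks(spectrum, count):
    """Indices of the `count` largest bins, sorted by frequency."""
    ranked = sorted(range(len(spectrum)), key=lambda k: spectrum[k], reverse=True)
    return sorted(ranked[:count])

N = 256
HARMONICS = {8: 1.0, 16: 0.6, 24: 0.4}   # hypothetical bin -> amplitude

normal = [sum(a * math.sin(2 * math.pi * k * t / N) for k, a in HARMONICS.items())
          for t in range(N)]
# Stand-in for the low-frequency noise: one slow sinusoid at bin 2 (an assumption).
disturbed = [v + 0.2 * math.sin(2 * math.pi * 2 * t / N)
             for t, v in enumerate(normal)]

normal_peaks = top_peaks(magnitude_spectrum(normal), 3)
disturbed_peaks = top_peaks(magnitude_spectrum(disturbed), 3)
print(normal_peaks, disturbed_peaks)
```

Both lists name the same three bins, so a method that looks only at where the peaks sit cannot tell the two signals apart.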
Therefore, it is quite impossible to discriminate between a normal voice and a fatigued voice with a method based on frequency spectrum analysis. I think that it is impossible to measure the fluctuations of the human voice with any conventional method. The degree of fluctuation of the voice increases as the speaker gets tired.
The two figures in Figure 4 are important pictures for SiCECA. They show the chaotic characteristics of the human voice. These two figures are called Strange Attractors or "Takens' Plots". They show that "there are more fluctuations in the fatigued voice than in the normal voice".
The many spikes in the right figure mean that the fatigued voice contains more noise. The width of the locus indicates the degree of fluctuation. The fatigued voice has a higher degree of fluctuation. In chaos theory, Lyapunov exponents are used to indicate the stability of the locus, in this case the level of noise and fluctuation of the human voice.
The brain activity level, as we have named it, is derived from the Lyapunov exponents explained here.
Figure 5 shows how to calculate the Lyapunov exponents, which can be regarded as a measure of the human brain activity level.
First, we have to find neighborhood points on the locus. After building a set of neighborhood points, we calculate the rate at which every pair of neighborhood points separates. Usually there are many sets of neighborhood points. In the case of the strange attractor of the human voice, there are more than ten thousand sets of neighborhood points per second.
Consequently, it takes several seconds or more to calculate the average of the cerebral exponents from one second of uttered voice, even when the newest and fastest PC is used for the calculation.
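The neighborhood procedure described above can be sketched in a few lines. This is a generic neighbor-divergence estimate of the first Lyapunov exponent, not the SiCECA algorithm itself, whose details are not given here. Everything in it is an assumption made for illustration: the test signal (the logistic map, whose exponent is known to be ln 2), the 2-dimensional embedding, the neighborhood `radius`, and the function names.

```python
import math

def logistic_series(n, x0=0.3, r=4.0):
    """Test signal: the chaotic logistic map, whose exponent is known to be ln 2."""
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def delay_embed(series, dim, delay):
    """Turn a scalar series into points (x[i], x[i+delay], ...)."""
    count = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(count)]

def first_lyapunov(points, radius, min_gap=10):
    """Average log separation rate of neighbouring points, per time step."""
    usable = len(points) - 1              # each point needs a successor
    per_point_rates = []
    for i in range(usable):
        logs = []
        for j in range(usable):
            if abs(i - j) < min_gap:      # skip points that are merely close in time
                continue
            d0 = math.dist(points[i], points[j])
            if 0.0 < d0 < radius:         # j belongs to the neighbourhood set of i
                d1 = math.dist(points[i + 1], points[j + 1])
                if d1 > 0.0:
                    logs.append(math.log(d1 / d0))
        if logs:
            per_point_rates.append(sum(logs) / len(logs))
    return sum(per_point_rates) / len(per_point_rates)

points = delay_embed(logistic_series(1000), dim=2, delay=1)
estimate = first_lyapunov(points, radius=0.01)
print(f"estimated exponent {estimate:.2f}, ln 2 = {math.log(2):.2f}")
```

The double loop over all pairs of points is what makes the calculation expensive: the work grows with the square of the number of embedded points, which is consistent with the long processing times mentioned above.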
Figure 6 shows how to make a Takens Plot. In this example, a sine wave is embedded into 2-dimensional space.
The x and y coordinates of each point on the Takens Plot are taken from the sine wave as shown in the left figure. The x and y coordinates of each point are separated by a constant difference in time, which is called the embedding delay time. The embedding interval is also constant, as shown in the left figure.
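As a minimal sketch of this construction, the code below pairs each sample of a sine wave with the sample one delay later. The sampling rate, the delay of a quarter period, and the name `takens_plot_2d` are assumptions chosen for illustration; with that delay the plot becomes a circle, because sin(t + π/2) = cos(t).

```python
import math

def takens_plot_2d(series, delay):
    """Pair each sample with the one `delay` samples later: (x[i], x[i + delay])."""
    return [(series[i], series[i + delay]) for i in range(len(series) - delay)]

SAMPLES_PER_PERIOD = 100                      # assumed sampling of the sine wave
sine = [math.sin(2 * math.pi * i / SAMPLES_PER_PERIOD) for i in range(300)]

# The embedding delay time is a quarter period; the embedding interval is
# one sample, so consecutive points are one sample apart.
points = takens_plot_2d(sine, delay=SAMPLES_PER_PERIOD // 4)
radii = [math.hypot(x, y) for x, y in points]
print(len(points), min(radii), max(radii))
```

Other delays give an ellipse instead of a circle, and a delay of half a period collapses the plot onto a straight line, so the choice of delay affects how well the locus is spread out.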
In the case of the strange attractor of the human voice, every point on the attractor is defined as shown in the left figure of Figure 7.
Every point on the voice waveform represents the values of more than 4 constituents, and each constituent has its own, quite different delay time. In this case, the embedded point moves from Point(n) to Point(n+1), from Point(n+1) to Point(n+2), and from Point(n+2) to Point(n+3), as shown in the right figure of Figure 7.
Finally, the movement of the point traces out a strange attractor of the human voice.
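One possible reading of this description is an embedding in which every coordinate has its own delay. The sketch below follows that reading. The five delay values, the stand-in waveform, and the name `embed_with_delays` are hypothetical and are not taken from SiCECA.

```python
def embed_with_delays(series, delays):
    """Point(n) = (x[n + d] for d in delays); one coordinate per delay."""
    count = len(series) - max(delays)
    return [tuple(series[n + d] for d in delays) for n in range(count)]

# Hypothetical delays, in samples, for a 5-dimensional embedding.
DELAYS = (0, 3, 7, 12, 20)

# Stand-in waveform; a real application would use sampled voice data.
waveform = [((7 * n) % 23) / 23.0 for n in range(40)]

trajectory = embed_with_delays(waveform, DELAYS)
for n in range(3):
    print(f"Point({n}) -> Point({n + 1}):", trajectory[n], "->", trajectory[n + 1])
```

Stepping `n` forward by one sample moves the embedded point along the trajectory, which is the Point(n) to Point(n+1) movement described above.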
Figure 8 shows the case of white noise. White noise cannot form a strange attractor. Given an infinitely long time, the movement of the embedded point will cover the whole space.
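The space-filling behavior can be checked roughly by counting how many cells of a grid the embedded points visit. The sketch below compares uniform white noise with a sine wave in a 2-dimensional embedding. The grid size, the delay, the series length, and the name `occupied_fraction` are assumptions made for illustration, and a finite run can only approximate the infinite-time statement.

```python
import math
import random

def occupied_fraction(series, delay, cells=20):
    """Fraction of a cells x cells grid over [-1, 1]^2 visited by (x[i], x[i+delay])."""
    def cell(value):
        return min(cells - 1, int((value + 1.0) / 2.0 * cells))
    visited = {(cell(series[i]), cell(series[i + delay]))
               for i in range(len(series) - delay)}
    return len(visited) / (cells * cells)

random.seed(0)                                # fixed seed so runs are repeatable
LENGTH, DELAY = 5000, 25
noise = [random.uniform(-1.0, 1.0) for _ in range(LENGTH)]
sine = [math.sin(2 * math.pi * i / 100) for i in range(LENGTH)]

noise_fill = occupied_fraction(noise, DELAY)
sine_fill = occupied_fraction(sine, DELAY)
print(f"white noise covers {noise_fill:.0%} of the cells, the sine wave {sine_fill:.0%}")
```

The noise spreads over almost the whole grid, while the sine wave stays on a thin closed curve and leaves most cells empty.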
Figure 9 shows the historic and ideal case that was found by Professor Lorenz of MIT. Before Lorenz's discovery, a waveform like the one shown in the left figure was considered a kind of random noise. After the discovery, scientists came to think that chaos is not a kind of randomness but a kind of cosmos. Chaos is an order too complex to allow long-term forecasting.
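Why such an order defeats long-term forecasting can be seen numerically. The sketch below assumes that the case in question is the standard Lorenz system with its classic parameters (sigma = 10, rho = 28, beta = 8/3). It integrates two copies whose starting points differ by one part in 10^8, using a simple Runge-Kutta scheme. The step size, the run length, and the starting point are choices made only for this illustration.

```python
import math

def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz equations with the classic parameters."""
    x, y, z = state
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

def rk4_step(state, dt):
    """One fourth-order Runge-Kutta step."""
    def shifted(base, slope, scale):
        return tuple(b + scale * s for b, s in zip(base, slope))
    k1 = lorenz(state)
    k2 = lorenz(shifted(state, k1, dt / 2))
    k3 = lorenz(shifted(state, k2, dt / 2))
    k4 = lorenz(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

a = (1.0, 1.0, 1.0)
b = (1.0 + 1e-8, 1.0, 1.0)                    # differs by one part in 10^8
initial_gap = math.dist(a, b)
largest_gap = initial_gap
for _ in range(4000):                         # 40 time units at dt = 0.01
    a, b = rk4_step(a, 0.01), rk4_step(b, 0.01)
    largest_gap = max(largest_gap, math.dist(a, b))

print(f"initial gap {initial_gap:.1e}, largest gap {largest_gap:.1e}")
```

Both runs obey exactly the same deterministic rule, yet the tiny initial gap grows until the two trajectories are as far apart as the attractor allows. The system is fully ordered, but any measurement error eventually ruins the forecast.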
Please proceed to the explanation of our experimental data.
The last update 07/03/05