Technical Information

From a presentation by Dr. Kakuichi Shiomi

This is a detailed technical presentation about SiCECA, which is used as a fatigue and drowsiness predictor. The predictor is an application of chaos theory, a field of modern mathematics. It can measure the degree of tiredness from the human voice. Strictly speaking, it measures the cerebral activity of the speaker.

In 1999, I was trying to clean up noisy voice recorded by a Cockpit Voice Recorder so that it could be analyzed. Although I could not make the noisy voice clear, my collaborator found that the chaotic characteristics of the voice showed some variation. Concretely, he found that the time average of the first Lyapunov exponent of the human voice might indicate the cerebral activity of the speaker. In those days, it was very difficult for us to calculate the chaotic characteristics of a voice time series: processing one second of uttered voice took several hours or more.

Today, this research and development is a cooperative effort between the Electronic Navigation Research Institute and several private companies. Cray Inc., the supercomputer maker, is an important partner in this development, since calculating the chaotic characteristics of a voice signal needs a lot of computing resources.

First, I hope that all of you will be able to get some sense of chaos theory, since it is too difficult for me to explain it both rigorously and simply. In modern mathematics, chaos is a kind of deterministic order, and it is quite different from randomness or nonsense.

This example of chaos is from a famous book written by Edward Lorenz, who is regarded as a father of chaos theory.

The curved lines shown in Figure 1 are the tracks of a sliding snowboard. Although the snowboard starts from almost exactly the same point each time, it reaches quite different points because of the tiny differences that, strictly speaking, exist between the starting points. Chaos theory is the theory for understanding such extraordinary complexity.
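
This sensitivity to tiny differences can be reproduced with a few lines of Python. The sketch below does not model the snowboard; as an assumption for illustration it uses the logistic map x -> 4x(1-x), a standard toy chaotic system, and the name logistic_orbit is hypothetical. Two starting values that differ by only 1e-10 are followed side by side.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r*x*(1-x); a toy chaotic system, not the snowboard model."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

# Two starting points that differ only in the tenth decimal place.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)

gaps = [abs(p - q) for p, q in zip(a, b)]
max_gap = max(gaps)

# Print the separation every ten steps to watch it grow.
for n in range(0, 61, 10):
    print(n, gaps[n])
```

The separation starts at about 1e-10 and grows until it is as large as the values themselves, although the rule that produces both orbits is completely deterministic.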

Figure 1.

Figure 2 shows the waveforms of voiced "gah" sounds. The left figure shows the normal waveform of my voice, and the right figure shows the fatigued waveform after more than one hour of reading a story aloud. Looking at these figures, you may think that it is easy to discriminate between a normal voice and a fatigued voice. But there are many people whose normal waveform looks like that of my fatigued voice in the right figure. Every waveform is different from the others.

The waveform in the right figure can easily be produced artificially from the waveform in the left figure by adding some kind of low-frequency noise.

Figure 2.

Figure 3 shows the power spectra of the normal and the fatigued voiced "gah" sounds. In the right figure, the first and second peaks are higher, because there was low-frequency noise in the fatigued voice.

But the positions of the peaks did not change.

Therefore, it is impossible to discriminate between a normal voice and a fatigued voice with a method based on frequency spectrum analysis. I think that it is impossible to measure the fluctuations of the human voice with any conventional method. The degree of fluctuation of the human voice increases as the speaker gets tired.
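
The point about peak positions can be checked on a synthetic signal. The sketch below is only an illustration under stated assumptions: the "voice" is a hypothetical sum of three harmonics, the "fatigue" is a slow additive wobble, and the helper names power_spectrum and peak_bins are made up for this example. It does not reproduce the raised peaks of Figure 3; it shows only the narrower fact that an added low-frequency component changes the waveform without moving the harmonic peaks.

```python
import cmath
import math

def power_spectrum(signal):
    """Naive DFT power for bins 0..N/2 (adequate for a short illustrative signal)."""
    n = len(signal)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(acc) ** 2 / n)
    return spectrum

def peak_bins(spectrum, floor):
    """Indices of local maxima whose power exceeds `floor`."""
    return [k for k in range(1, len(spectrum) - 1)
            if spectrum[k] > floor
            and spectrum[k] > spectrum[k - 1]
            and spectrum[k] > spectrum[k + 1]]

N = 256
# Hypothetical "normal voice": three harmonics of a fundamental at bin 16.
clean = [math.sin(2 * math.pi * 16 * t / N)
         + 0.6 * math.sin(2 * math.pi * 32 * t / N)
         + 0.3 * math.sin(2 * math.pi * 48 * t / N) for t in range(N)]
# Hypothetical "fatigued voice": the same signal plus a slow wobble at bin 2.
wobble = [0.8 * math.sin(2 * math.pi * 2 * t / N) for t in range(N)]
tired = [c + w for c, w in zip(clean, wobble)]

clean_peaks = peak_bins(power_spectrum(clean), floor=1.0)
tired_peaks = peak_bins(power_spectrum(tired), floor=1.0)
print(clean_peaks)
print(tired_peaks)
```

In this toy case the harmonic peaks stay at bins 16, 32 and 48 in both spectra; the only difference is extra power at the low-frequency bin of the wobble.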

Figure 3.

The two figures in Figure 4 are important pictures for SiCECA. They show the chaotic characteristics of the human voice. These two figures are called strange attractors or "Takens' Plots". They show that there are more fluctuations in the fatigued voice than in the normal voice. The many spikes in the right figure mean that the fatigued voice contains more noise. The width of the locus indicates the degree of fluctuation, and the fatigued voice has a higher degree of fluctuation. In chaos theory, Lyapunov exponents are used to indicate the stability of the locus, which in this case reflects the level of noise and fluctuation in the human voice.

The "brain activity level", as we have named it, is derived from the Lyapunov exponents explained here.

Figure 4.

Figure 5 shows how to calculate the Lyapunov exponents, which can be regarded as a measure of the human brain activity level.

First, we have to find neighboring points on the locus. After forming a set of neighboring points, we calculate the ratio by which the separation of every pair of neighboring points changes as the points move along the locus. Usually, there are many sets of neighboring points. In the case of the strange attractor of the human voice, there are more than ten thousand sets of neighboring points per second.

Consequently, it takes several seconds or more to calculate the average of the cerebral exponents from one second of uttered voice, even when the newest and fastest PC is used for the calculation.
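
The neighbor-and-ratio idea can be sketched in Python as follows. This is not the SiCECA algorithm, whose details are not given here; it is a minimal nearest-neighbor estimate of the largest Lyapunov exponent, and the function name largest_lyapunov and the exclude parameter are assumptions made for this example. As stand-in data it uses the logistic map x -> 4x(1-x), because its exponent is known to be ln 2, so the estimate can be checked.

```python
import math

def largest_lyapunov(points, exclude=10):
    """Average one-step log growth of the separation between each point and
    its nearest neighbour on the locus (natural-log units per step)."""
    n = len(points) - 1                 # the last point has no successor
    total, count = 0.0, 0
    for i in range(n):
        best_j, best_d = -1, float("inf")
        for j in range(n):
            if abs(i - j) <= exclude:   # skip points that are close only in time
                continue
            d = math.dist(points[i], points[j])
            if 0.0 < d < best_d:
                best_j, best_d = j, d
        if best_j < 0:
            continue
        # Separation of the same pair one step later along the locus.
        d_next = math.dist(points[i + 1], points[best_j + 1])
        if d_next > 0.0:
            total += math.log(d_next / best_d)
            count += 1
    return total / count

# Stand-in data (not voice): the logistic map, whose exponent is ln 2.
x, series = 0.3, []
for _ in range(800):
    series.append(x)
    x = 4.0 * x * (1.0 - x)
points = [(v,) for v in series]         # one-dimensional points on the locus

estimate = largest_lyapunov(points)
print(estimate, math.log(2))
```

The brute-force neighbor search compares every point with every other point, so the work grows with the square of the number of points. That gives a feeling for why the calculation is expensive when there are more than ten thousand sets of neighboring points per second of voice.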

Figure 5.

Figure 6 shows how to make a Takens Plot. In this example, a sine wave is embedded into a 2-dimensional space.

Each pair of x and y coordinates of a point on the Takens Plot is taken from the sine wave, as shown in the left figure. The x and y coordinates of each point are separated by a constant difference in time, which is called the embedding delay time. The embedding interval is also constant, as shown in the left figure.
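
The construction can be written down directly. In the sketch below, the function name takens_2d, the period of 100 samples and the quarter-period delay are illustrative assumptions, not values taken from Figure 6.

```python
import math

def takens_2d(signal, delay):
    """Pair each sample with the sample `delay` steps later: (x(t), x(t + delay))."""
    return [(signal[t], signal[t + delay]) for t in range(len(signal) - delay)]

period = 100                                  # samples per sine cycle (illustrative)
sine = [math.sin(2 * math.pi * t / period) for t in range(3 * period)]
plot = takens_2d(sine, delay=period // 4)     # embedding delay of a quarter period

# With a quarter-period delay, y is the cosine of the same phase,
# so every embedded point lies on the unit circle.
radii = [math.hypot(x, y) for x, y in plot]
print(len(plot), min(radii), max(radii))
```

With this delay the sine wave is drawn as a closed circle. Other delays give an ellipse, and a delay of a whole period collapses the plot onto a diagonal line, which is why the choice of delay time matters.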

Figure 6.

In the case of the strange attractor of the human voice, every point on the attractor is defined as shown in the left figure of Figure 7.

Every point on the voice waveform represents the values of more than 4 constituents, and each constituent has its own, quite different delay time. In this case, the embedded point moves from Point(n) to Point(n+1), from Point(n+1) to Point(n+2), and from Point(n+2) to Point(n+3), as shown in the right figure of Figure 7.

Finally, the movement of the point traces out a strange attractor of the human voice.
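
An embedding with a separate delay for each constituent can be sketched as follows. Everything specific here is an assumption for illustration: the function name embed, the five delay values and the synthetic two-tone signal are hypothetical and are not the values used by SiCECA.

```python
import math

def embed(signal, delays):
    """Build points (x(t + d0), x(t + d1), ...) with one delay per constituent."""
    span = max(delays)
    return [tuple(signal[t + d] for d in delays) for t in range(len(signal) - span)]

# Hypothetical stand-in for a voiced sound: two sinusoids with unrelated periods.
wave = [math.sin(0.11 * t) + 0.5 * math.sin(0.37 * t) for t in range(500)]
delays = [0, 3, 7, 12, 20]      # illustrative, unevenly spaced delays
points = embed(wave, delays)

# points[n], points[n + 1], ... play the role of Point(n), Point(n+1), ...;
# joining them in order traces the locus of the embedded point.
steps = [math.dist(points[n], points[n + 1]) for n in range(len(points) - 1)]
print(len(points), len(points[0]), max(steps))
```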

Figure 7.

Figure 8 shows the case of white noise. White noise cannot make any strange attractor. The movement of the embedded point will cover the whole space after an infinitely long time.
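
The difference between an ordered signal and white noise can be made visible by counting how much of the plane the embedded points cover. The sketch below is an illustration under assumptions: it uses uniformly distributed white noise from Python's random module, a 20 x 20 grid, and hypothetical helper names occupied_fraction and takens_2d.

```python
import math
import random

def takens_2d(signal, delay):
    """Pair each sample with the sample `delay` steps later."""
    return [(signal[t], signal[t + delay]) for t in range(len(signal) - delay)]

def occupied_fraction(points, cells=20):
    """Fraction of cells in a cells x cells grid over [-1, 1]^2 that contain a point."""
    hit = set()
    for x, y in points:
        col = min(int((x + 1.0) / 2.0 * cells), cells - 1)
        row = min(int((y + 1.0) / 2.0 * cells), cells - 1)
        hit.add((col, row))
    return len(hit) / (cells * cells)

rng = random.Random(0)          # fixed seed so the run is repeatable
n = 20000
noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
sine = [math.sin(2 * math.pi * t / 100) for t in range(n)]

noise_cover = occupied_fraction(takens_2d(noise, 25))
sine_cover = occupied_fraction(takens_2d(sine, 25))
print(noise_cover, sine_cover)
```

The embedded noise spreads over the whole square, while the embedded sine wave stays on a thin closed curve and touches only a small fraction of the cells.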

Figure 8.

Figure 9 shows the historic and ideal case that was found by Professor Lorenz of MIT. Before Lorenz's discovery, a waveform like the one shown in the left figure was considered to be a kind of random noise. After the discovery, scientists came to think that chaos is not a kind of randomness but a kind of cosmos, that is, of order. Chaos is an order that is too complex to allow long-term forecasts.
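
Both halves of this statement can be tried out numerically. The sketch below integrates the Lorenz equations with their classic parameter values (sigma = 10, rho = 28, beta = 8/3) using a fourth-order Runge-Kutta step. The step size, the starting point and the helper names lorenz_step and x_waveform are assumptions for illustration, and the result is not meant to reproduce Figure 9 exactly.

```python
def lorenz_step(state, dt, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """One fourth-order Runge-Kutta step of the Lorenz equations."""
    def deriv(s):
        x, y, z = s
        return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)

    def shifted(s, k, h):
        return tuple(a + h * b for a, b in zip(s, k))

    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, dt / 2))
    k3 = deriv(shifted(state, k2, dt / 2))
    k4 = deriv(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def x_waveform(start, steps=5000, dt=0.01):
    """Record the x coordinate along one trajectory."""
    state, xs = start, []
    for _ in range(steps):
        xs.append(state[0])
        state = lorenz_step(state, dt)
    return xs

first = x_waveform((1.0, 1.0, 1.0))
again = x_waveform((1.0, 1.0, 1.0))
nudged = x_waveform((1.0, 1.0, 1.0 + 1e-9))

same = first == again   # deterministic: the same start gives the same waveform
late_gap = max(abs(a - b) for a, b in zip(first[4000:], nudged[4000:]))
print(same, late_gap)
```

The irregular-looking waveform is reproduced exactly when the starting point is the same, which is the order. A change of 1e-9 in the starting point leads to a completely different waveform later on, which is why a long-term forecast is not possible.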

Figure 9.

Please proceed to the explanation of our experimental data.

Last updated 07/03/05