Rolling bearing fault diagnosis based on quantum LS-SVM

Rolling bearings are indispensable components of the contemporary industrial system, and their working condition affects the state of the entire system. Researching and improving fault diagnosis technology for rolling bearings therefore has great engineering value. However, because a bearing interacts with the whole machine, a large quantity of data is needed to support accurate fault diagnosis, while classical machine learning algorithms process such big data inefficiently and demand a huge amount of computing resources. To solve this problem, this paper combines the HHL algorithm from quantum computing with the LS-SVM algorithm from machine learning and proposes a fault diagnosis model based on a quantum least squares support vector machine (QSVM). Based on experiments run on a simulated quantum computer, we demonstrate that fault diagnosis based on QSVM is feasible and can offer a far superior advantage over the classical algorithm in the context of big data.


Introduction
With the development of science and technology, the modern industrial system has entered a new era of integration, precision and intelligence. These characteristics not only merge mechanical equipment organically into a whole but also give the modern industrial system higher production efficiency. On the other hand, with increasing operation time and equipment aging, mechanical malfunctions are inevitable. The failure of any part of an industrial production line may have a great impact on the entire industrial system, bringing serious economic losses to enterprises and factories and, in serious cases, even causing major safety accidents. The rolling bearing has always been one of the essential key parts of mechanical equipment: according to some research, about 30% of mechanical failures in rotating machinery are caused by rolling bearings [1]. Therefore, it is very important to conduct research on the fault diagnosis of rolling bearings [2].
In recent years, with the improvement of machine learning theory, more and more researchers have applied these artificial intelligence algorithms to the fault diagnosis of rolling bearings and achieved good results [3][4][5][6]. However, it should also be noted that classical machine learning algorithms have gradually reached a bottleneck in computing power when dealing with high-dimensional and massive data. Finding algorithms that are more efficient in processing big data will be the focus of research on rolling bearing fault diagnosis [7].
Quantum is an important concept in modern physics: a quantum is the smallest indivisible unit of a physical quantity, so quantum characteristics appear mainly in the microscopic world. Quantum mechanics was proposed to describe the physical laws of the microscopic world. This theory often runs contrary to the experience and common sense of the macroscopic world, with phenomena such as quantum superposition, quantum entanglement and quantum coherence. The science and technology developed from quantum mechanics are called quantum technology. After decades of development, quantum technology has made great progress and has gradually entered interdisciplinary applied research [8]. Quantum computing is one of the most important branches of quantum technology and the one most likely to be put into practice in the foreseeable future [9]. Compared with classical computing methods, quantum computing can achieve even exponential acceleration on specific problems; as soon as the theory was put forward, it attracted the close attention of many scholars [10]. This super-strong computing power makes quantum computing one of the methods most likely to break through the existing computing bottleneck. Therefore, using quantum computing to solve the rolling bearing fault diagnosis problem in the context of big data will be one of the development directions of the future.
The HHL algorithm is a quantum algorithm for solving linear equations, proposed by Harrow, Hassidim and Lloyd in 2008 [11]. Compared with classical solution methods, the HHL algorithm can achieve exponential acceleration in theory, and its proposal drove the rapid development of quantum machine learning (QML) and spurred scholars' research on quantum machine learning algorithms. Later, Childs et al. improved the HHL algorithm by using a Chebyshev polynomial method to represent the operator, avoiding the phase estimation step of the original algorithm and enhancing its universality. Wiebe et al. first proposed a quantum linear regression algorithm based on the HHL algorithm in 2012. Solving the least squares support vector machine with the HHL algorithm can be divided into three steps. First, the classical data are represented by qubits and stored in quantum random access memory. Then, the phase estimation algorithm is used to solve for the parameters of the least squares support vector machine, and the quantum states corresponding to these parameters are applied to the classification of test samples. Finally, the final quantum state is measured via a coherent term, and the category of the test sample is judged from the resulting expectation value.
SVM is one of the most classical algorithms in traditional machine learning. Its basic principle is to separate two classes of data completely with a hyperplane. Unlike black-box algorithms such as neural networks, SVM has a complete theoretical foundation and excellent generalization performance. In recent years, many scholars have applied it to the fault diagnosis of rolling bearings and achieved good results [12][13][14][15]. The solving process of the standard SVM does not involve linear equations, but its derivative algorithm, the least squares support vector machine (LS-SVM), does [16]. For small-scale linear systems, constructing the LS-SVM model is fast, but as the scale of the system grows the cost rises sharply, and the system may even become impossible to solve. Therefore, this paper combines the HHL algorithm with the LS-SVM algorithm to propose a fault diagnosis model based on a quantum least squares support vector machine (QSVM).
Since quantum hardware with enough coherence time to demonstrate our proposed QSVM algorithm is not available at present, verifying the feasibility of QSVM by simulation costs far more time and computing resources than the classical LS-SVM. Nevertheless, our research provides theoretical guidance and empirical results that can inform further theoretical work and guide the development of practical applications in the near future.
In addition, the contributions of this paper are as follows: (1) We combine the HHL algorithm with the LS-SVM algorithm to propose a fault diagnosis model based on a quantum least squares support vector machine (QSVM), which has great engineering application value.
(2) We use QSVM to realize three-class fault diagnosis on small-scale data (classical simulation of quantum computing is very resource-intensive), achieve 100% fault diagnosis accuracy, and show that the fault diagnosis model based on QSVM is feasible.

Theory of the least squares support vector machine
Suppose that the training set contains p samples, denoted as

$$T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_p, y_p)\} \tag{1}$$

In Eq. (1), $x_i \in \mathbb{R}^q$ represents the q-dimensional input vector, and $y_i \in \{1, -1\}$ is the sample category. SVM tries to find an optimal hyperplane that completely separates the two types of data. The optimal partition is the one for which the sample points closest to the hyperplane lie as far from it as possible; the points that determine the hyperplane are called support vectors. For each training point, its geometric distance to the hyperplane is

$$d_i = \frac{|W^T x_i + b|}{\|W\|} \tag{2}$$

where $d_i$ is the distance from the i-th training point to the hyperplane, and W and b are the parameters of the hyperplane. According to the theory of SVM, we need to find the training point closest to the hyperplane:

$$d = \min_{i} d_i \tag{3}$$
According to Eqs. (2) and (3), the optimization problem of SVM becomes

$$\max_{W,b} d \quad \text{s.t.} \quad \frac{y_i(W^T x_i + b)}{\|W\|} \ge d, \quad i = 1, \ldots, p \tag{4}$$

To facilitate the solution, Eq. (4) can be rewritten as

$$\min_{W,b} \frac{1}{2}\|W\|^2 \quad \text{s.t.} \quad y_i(W^T x_i + b) \ge 1, \quad i = 1, \ldots, p \tag{5}$$

LS-SVM transforms the inequality constraints of Eq. (5) into equality constraints:

$$\min_{W,b,e} \frac{1}{2}\|W\|^2 + \frac{\lambda}{2}\sum_{i=1}^{p} e_i^2 \quad \text{s.t.} \quad y_i\bigl(W^T \varphi(x_i) + b\bigr) = 1 - e_i, \quad i = 1, \ldots, p \tag{6}$$

where $e_i$ is the relaxation variable and $\lambda$ is the regularization parameter. For nonlinear classification problems, the feature map $\varphi(\cdot)$ induced by a kernel function sends a training sample $x_i$ from the original space to a higher-dimensional feature space.
Construct the Lagrange function of Eq. (6):

$$L = \frac{1}{2}\|W\|^2 + \frac{\lambda}{2}\sum_{i=1}^{p} e_i^2 - \sum_{i=1}^{p} \alpha_i \bigl[ y_i\bigl(W^T \varphi(x_i) + b\bigr) - 1 + e_i \bigr] \tag{7}$$

where $\alpha_i$ is the Lagrange multiplier corresponding to sample $x_i$. Taking the partial derivative of Eq. (7) with respect to each variable, setting them to zero and eliminating W and e yields the linear system

$$\begin{bmatrix} 0 & \mathbf{1}^T \\ \mathbf{1} & K + \lambda^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix} \tag{8}$$

where K is the kernel matrix of order p, and the values of $\alpha$ and b can be obtained by solving this linear system.
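As a concrete reference point, the block system of Eq. (8) can be assembled and solved classically in a few lines of numpy. This is a minimal sketch, not the paper's implementation; the function names and the RBF kernel choice are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """K_ij = exp(-gamma * ||a_i - b_j||^2) (illustrative kernel choice)."""
    sq = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=2)
    return np.exp(-gamma * sq)

def lssvm_train(X, y, lam=1.0, gamma=0.5):
    """Assemble and solve the (p+1)x(p+1) block system of Eq. (8)."""
    p = X.shape[0]
    F = np.zeros((p + 1, p + 1))
    F[0, 1:] = 1.0                                  # top row [0, 1^T]
    F[1:, 0] = 1.0                                  # left column [1]
    F[1:, 1:] = rbf_kernel(X, X, gamma) + np.eye(p) / lam
    sol = np.linalg.solve(F, np.concatenate(([0.0], y)))
    return sol[0], sol[1:]                          # offset b, multipliers alpha

def lssvm_predict(X_train, alpha, b, X_test, gamma=0.5):
    """Decision rule sign(sum_i alpha_i k(x, x_i) + b)."""
    return np.sign(rbf_kernel(X_test, X_train, gamma) @ alpha + b)

# sanity check on two well-separated 1-D clusters
X = np.array([[0.0], [0.1], [2.0], [2.1]])
y = np.array([-1.0, -1.0, 1.0, 1.0])
b, alpha = lssvm_train(X, y, lam=10.0)
assert np.all(lssvm_predict(X, alpha, b, X) == y)
```

This classical solver is exactly the step that the HHL algorithm replaces in the quantum version.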
LS-SVM needs to use all the training data, so its time complexity is polynomial in the sample number p and feature number q, denoted as $O(\mathrm{poly}(p, q))$. When p and q are large, the computational cost is extremely high, so we use the HHL algorithm to replace the classical method of solving the linear system.

HHL algorithm

3.1 The solution form of HHL algorithm
The HHL algorithm is a quantum method for solving linear equations and is the key that allows the quantum support vector machine to solve the LS-SVM linear system quickly. The algorithm first describes the system of linear equations in quantum notation: assuming A is a Hermitian operator on an N-dimensional state space and $|b\rangle$ is a state vector in this space, solving the system of linear equations can be expressed as finding the $|x\rangle$ that satisfies $A|x\rangle = |b\rangle$.
The two core steps of the algorithm are sparse Hamiltonian simulation and phase estimation. When the data matrix is a sparse Hermitian matrix with a small condition number, the time complexity of the HHL algorithm for solving linear equations is $O(\log N)$, compared with $O(N)$ for the best known classical algorithm, achieving exponential acceleration. The HHL algorithm has promoted research on quantum machine learning algorithms, especially for problems that reduce to algebraic operations on a data matrix. In the HHL algorithm, the solution form of the linear system is expressed as

$$A|x\rangle = |b\rangle \tag{10}$$

where A is an N-order Hermitian matrix and $|x\rangle$ and $|b\rangle$ are column vectors in Hilbert space.
The Hermitian matrix can be decomposed as

$$A = \sum_{i=1}^{N} \mu_i |u_i\rangle\langle u_i| \tag{11}$$

where $\mu_i$ are the eigenvalues of A and $|u_i\rangle$ is the eigenvector corresponding to $\mu_i$.
Taking the $|u_i\rangle$ as basis vectors, we can expand $|b\rangle$ as

$$|b\rangle = \sum_{i=1}^{N} b_i |u_i\rangle \tag{12}$$

According to Eqs. (10) and (12),

$$|x\rangle = A^{-1}|b\rangle = \sum_{i=1}^{N} \frac{b_i}{\mu_i} |u_i\rangle \tag{13}$$

where $|x\rangle$ is the target to be solved by HHL. The HHL algorithm makes use of quantum phase estimation (QPE) and the quantum Fourier transform (QFT).
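The spectral form of Eq. (13) is easy to check numerically. The following sketch (plain numpy, matrix values illustrative) verifies that summing $b_i/\mu_i$ over the eigenbasis reproduces the direct solution:

```python
import numpy as np

# Verify Eq. (13): for Hermitian A, x = sum_i (b_i / mu_i) u_i
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])          # small Hermitian (real symmetric) matrix
b = np.array([1.0, 0.0])

mu, U = np.linalg.eigh(A)           # eigenvalues mu_i, eigenvectors as columns
b_coeff = U.T @ b                   # coordinates b_i of b in the eigenbasis
x_spec = U @ (b_coeff / mu)         # sum_i (b_i / mu_i) |u_i>

assert np.allclose(A @ x_spec, b)   # agrees with the direct solution of Ax = b
```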

Quantum Fourier transform
Similar to the classical Fourier transform, the QFT converts one quantum state into another; its quantum circuit is shown in Fig. 1.
H stands for the Hadamard gate, which can be expressed as

$$H = \frac{1}{\sqrt{2}}\begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \tag{14}$$

For an input basis state $|x\rangle = |x_1 x_2 \cdots x_n\rangle$, applying the H-gate to the first qubit $|x_1\rangle$ gives

$$|\psi_1\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle + e^{2\pi i\, 0.x_1}|1\rangle\bigr)|x_2 \cdots x_n\rangle$$

$R_k$ stands for the controlled rotation gate

$$R_k = \begin{bmatrix} 1 & 0 \\ 0 & e^{2\pi i/2^k} \end{bmatrix}$$

Applying the $R_2$-gate controlled by $|x_2\rangle$, $|\psi_2\rangle$ is expressed as

$$|\psi_2\rangle = \frac{1}{\sqrt{2}}\bigl(|0\rangle + e^{2\pi i\, 0.x_1 x_2}|1\rangle\bigr)|x_2 \cdots x_n\rangle$$

Similarly, after applying $R_3$ to $R_n$, the phase of the first qubit accumulates to $0.x_1 x_2 \cdots x_n$. Repeating these steps on the remaining qubits, the final state can be written as

$$|\psi\rangle = \frac{1}{\sqrt{2^n}}\bigl(|0\rangle + e^{2\pi i\, 0.x_n}|1\rangle\bigr)\bigl(|0\rangle + e^{2\pi i\, 0.x_{n-1}x_n}|1\rangle\bigr)\cdots\bigl(|0\rangle + e^{2\pi i\, 0.x_1 x_2 \cdots x_n}|1\rangle\bigr) \tag{22}$$

Eq. (22) shows that the original quantum state $|x\rangle$ has been transformed into the state $|k\rangle$, which completes the QFT. From the circuit, the total number of quantum gates used by the QFT is $n + (n-1) + \cdots + 1 = n(n+1)/2$, so the computational complexity is $O(n^2)$. In the classical algorithm, the computational complexity of the Fourier transform over $2^n$ points is $O(n2^n)$.
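Since the QFT is simply a unitary matrix acting on $2^n$ amplitudes, its action can be checked classically for small n. This numpy sketch (illustrative, not part of the original experiments) verifies unitarity and the phase ramp the QFT produces on a basis state:

```python
import numpy as np

def qft_matrix(n):
    """Unitary of the n-qubit QFT: F_jk = e^{2*pi*i*jk/N} / sqrt(N), N = 2^n."""
    N = 2 ** n
    j, k = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
    return np.exp(2j * np.pi * j * k / N) / np.sqrt(N)

F = qft_matrix(3)
# unitarity check: F^dagger F = I
assert np.allclose(F.conj().T @ F, np.eye(8))

# acting on a basis state |x> gives the expected phase ramp over |k>
x = np.zeros(8)
x[5] = 1.0
psi = F @ x
assert np.allclose(psi, np.exp(2j * np.pi * 5 * np.arange(8) / 8) / np.sqrt(8))
```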

Quantum phase estimation
In QPE, the eigenvalue equation is expressed as

$$U|u\rangle = e^{2\pi i\theta}|u\rangle \tag{23}$$

where $e^{2\pi i\theta}$ is the eigenvalue, and the function of QPE is to estimate $\theta$. QPE's quantum circuit is shown in Fig. 2.
The first register of QPE contains t qubits, all initialized to $|0\rangle$. The second register contains the eigenvector $|u\rangle$ of matrix A. From Fig. 2, U is the controlled unitary, whose powers $U^{2^j}$ act on $|u\rangle$ according to Eq. (23). After the Hadamard gates and the controlled-U stage, we get

$$|\psi_2\rangle = \frac{1}{\sqrt{2^t}}\sum_{x=0}^{2^t-1} e^{2\pi i x\theta}|x\rangle|u\rangle \tag{26}$$

Then applying the inverse QFT to the first register gives

$$|\psi_3\rangle = \frac{1}{2^t}\sum_{x,y=0}^{2^t-1} e^{-2\pi i xy/2^t} e^{2\pi i x\theta}|y\rangle|u\rangle \tag{28}$$

From Eq. (28), it can be seen that the probability amplitude corresponding to $|2^t\theta\rangle$ is the largest; that is, after measuring $|\psi_3\rangle$, the quantum state is most likely to collapse to $|2^t\theta\rangle$. Likewise, if we measure $|\psi_3\rangle$ enough times, it collapses to $|2^t\theta\rangle$ most frequently. Since $2^t\theta/2^t = \theta$, we obtain $\theta$. Note that t is the number of qubits in register 1: the larger t is, the more accurate the estimate of $\theta$. Considering that too many qubits would require a lot of computing resources on a classical computer, only three qubits are used in this paper (t = 3).
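The behaviour described above can be reproduced by classically simulating the two QPE stages. The sketch below (plain numpy, function name illustrative) builds the post-inverse-QFT measurement distribution for t = 3 and confirms the peak at $|2^t\theta\rangle$ when $\theta$ is exactly representable:

```python
import numpy as np

def qpe_distribution(theta, t=3):
    """Measurement distribution of a t-qubit QPE register for phase theta."""
    T = 2 ** t
    x = np.arange(T)
    # state after the controlled-U stage: (1/sqrt(T)) sum_x e^{2 pi i x theta} |x>
    psi = np.exp(2j * np.pi * x * theta) / np.sqrt(T)
    # inverse QFT: amplitude of |y> is (1/T) sum_x e^{-2 pi i x y / T} e^{2 pi i x theta}
    amps = np.array([np.sum(np.exp(-2j * np.pi * x * y / T) * psi)
                     for y in range(T)]) / np.sqrt(T)
    return np.abs(amps) ** 2

probs = qpe_distribution(theta=0.375, t=3)   # 0.375 = 3/8 is exactly representable
assert np.argmax(probs) == 3                 # collapses to |2^t * theta> = |3>
```

For a $\theta$ that is not a multiple of $1/2^t$, the distribution spreads over neighbouring basis states, which is exactly the eigenvalue error discussed in Sect. 3.4.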
In Fig. 3, $|y\rangle$ is the auxiliary qubit, and $|x_0\rangle$ to $|x_{m-1}\rangle$ are the qubits that store the eigenvalues. All of these qubits are initialized to $|0\rangle$.
Apply QPE to $|x_0\rangle, \ldots, |x_{m-1}\rangle$ and $|b\rangle$, where $|b\rangle$ is expressed as in Eq. (12); we get

$$|\psi_2\rangle = \sum_{i} b_i |\mu_i\rangle|u_i\rangle$$

Apply the SWAP-gate stage to $|x_0\rangle, \ldots, |x_{m-1}\rangle$; its function is to compute the reciprocals of the eigenvalues. Then apply the controlled rotation, whose function is to transfer the eigenvalue information from $|\mu_i\rangle$ into the probability amplitude of the auxiliary qubit:

$$|\psi_4\rangle = \sum_{i} b_i |\mu_i\rangle|u_i\rangle\left(\sqrt{1 - \frac{C^2}{\mu_i^2}}\,|0\rangle + \frac{C}{\mu_i}|1\rangle\right)$$

where C is a constant satisfying $C \le \min_i |\mu_i|$. Apply QPE$^{-1}$, whose function is to disentangle $|x_0\rangle, \ldots, |x_{m-1}\rangle$ from $|\psi_4\rangle$. Finally, measure the ancilla qubit; if the result is 0 we recalculate, repeating until the result is 1, which leaves

$$|\psi_6\rangle \propto \sum_{i} \frac{b_i}{\mu_i} |u_i\rangle$$

We can observe that $|\psi_6\rangle$ is proportional to $|x\rangle$ in Eq. (13). Thus, the solution of the linear system is complete.
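The amplitude bookkeeping of the controlled rotation and the post-selection can be emulated classically. This numpy sketch (a 2×2 toy example with illustrative values, not the paper's data) confirms that the post-selected state is proportional to the true solution:

```python
import numpy as np

# classical emulation of the HHL amplitude bookkeeping for a 2x2 Hermitian A
A = np.array([[1.5, 0.5],
              [0.5, 1.5]])
b = np.array([0.6, 0.8])            # normalized, plays the role of |b>

mu, U = np.linalg.eigh(A)
beta = U.T @ b                      # coefficients b_i in the eigenbasis
C = np.min(np.abs(mu))              # constant with C <= min |mu_i|

# the controlled rotation writes C/mu_i into the |1> branch of the ancilla;
# measuring the ancilla as 1 keeps amplitudes beta_i * C / mu_i
post = beta * C / mu
p_success = np.sum(np.abs(post) ** 2)        # probability of measuring 1
x_hhl = U @ (post / np.linalg.norm(post))    # normalized post-selected state

x_true = np.linalg.solve(A, b)
assert np.allclose(x_hhl, x_true / np.linalg.norm(x_true))
```

Note that, as in the real algorithm, only the direction of $|x\rangle$ is recovered; the norm is absorbed into the success probability of the ancilla measurement.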
The mathematical derivation of the HHL algorithm is extremely complicated, so simulating HHL on a classical computer is very difficult; a quantum computer, by contrast, does not perform these mathematical operations explicitly but simply rotates qubits in Hilbert space. We can demonstrate the superiority of the HHL algorithm by analyzing its time complexity: the time complexity of HHL is $O(\log(N)s^2\kappa^2/\varepsilon)$, where N is the order of the matrix, $\kappa$ is the condition number of the linear system, s is the sparsity of the matrix, and $\varepsilon$ is the precision of the solution. Compared with classical algorithms, HHL can theoretically achieve exponential acceleration, thus greatly improving the efficiency of LS-SVM when dealing with huge quantities of data.
Finally, the HHL solution can contain errors, whose main source is the eigenvalues estimated by QPE. As mentioned in Sect. 3.2, the accuracy of the eigenvalues depends on the number of qubits, while increasing the number of qubits increases the time complexity of the HHL algorithm; how to balance accuracy against time complexity is an area for further research on QSVM in fault diagnosis.

Rolling bearing fault diagnosis experiment

4.1 Data source
The experimental data selected in this paper come from the XJTU-SY Bearing Datasets [17], which include the outer race fault, inner race fault, cage fault and normal state of rolling bearings. Details are shown in Table 1.
The computer used in the experiments has an i5-9300H CPU clocked at 2.4 GHz and 16 GB of memory; the programming language is Python, the quantum programming framework is Qiskit, and the quantum simulator is statevector.

Data preprocessing
Rolling bearing fault diagnosis generally consists of two steps: feature extraction and fault identification. An appropriate and effective feature extraction method can effectively improve the accuracy of fault diagnosis.
In general, we reconstruct the original data into a signal matrix. The horizontal vibration data of the Bearing1_1 and Bearing2_1 datasets were selected in this paper (see Figs. 4 and 5).
The Bearing1_1 dataset was sampled at 35 kHz over 123 minutes and contains 4 million sample points in total; the outer race fault occurs at about 4896 seconds. (We use the Pauta criterion to determine the point in time at which the fault occurred [18].)
The Bearing2_1 dataset was sampled at 37.5 Hz over 491 minutes and contains 16.08 million sample points in total; the inner race fault occurs at about 28,038 seconds. We then reconstructed the two datasets into two signal matrices. The Bearing1_1 matrix contains a total of 1953 samples, each composed of 2048 consecutively sampled points; normal samples are labeled H1 and outer race fault samples H2. In the same way, the Bearing2_1 matrix contains a total of 7851 samples, each composed of 2048 consecutively sampled points; normal samples are labeled H1 and inner race fault samples H3 (see Figs. 6 and 7).
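The reconstruction into a signal matrix amounts to a reshape of the raw vibration record. A minimal sketch (function name illustrative) is:

```python
import numpy as np

def to_signal_matrix(signal, sample_len=2048):
    """Cut a 1-D vibration record into non-overlapping samples of sample_len points."""
    n_samples = len(signal) // sample_len       # drop the incomplete tail
    return signal[: n_samples * sample_len].reshape(n_samples, sample_len)

# e.g. a record of 4,000,000 points yields 1953 samples of length 2048,
# matching the Bearing1_1 matrix described above
sig = np.zeros(4_000_000)
assert to_signal_matrix(sig).shape == (1953, 2048)
```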
After the sample division is completed, the next step is to extract data features. The commonly used feature extraction methods include time domain, frequency domain and time-frequency domain. Considering that the rolling bearing data used in this paper is collected in the laboratory with less external interference and no complex feature extraction process is required, only the time domain features of the data are extracted in this paper.
Time domain feature extraction refers to the calculation of various time-domain statistical parameters from the original vibration signal. Commonly used time-domain statistical parameters include the root mean square, crest factor, kurtosis and waveform factor, etc. These statistical parameters change with the running status of the rolling bearing, so analyzing them reflects that status to a considerable extent.
Kurtosis is one of the most widely used statistical parameters in the field of rolling bearing fault diagnosis. When a rolling bearing runs normally, the amplitude of its vibration signal approximately follows a Gaussian distribution, and the kurtosis value is approximately equal to 3. When a fault occurs, the distribution curve becomes distorted and, correspondingly, the kurtosis value increases. The mathematical formula of kurtosis is

$$K_v = \frac{\frac{1}{n}\sum_{i=1}^{n}(X_i - X_{mean})^4}{\left[\frac{1}{n}\sum_{i=1}^{n}(X_i - X_{mean})^2\right]^2} \tag{35}$$

In Eq. (35), $X_i$ represents a sample point, $X_{mean}$ represents the average value of the samples over a certain period, and n represents the total number of sample points.
The crest factor is sensitive to faults involving surface damage and wear; its mathematical formula is

$$C_f = \frac{\max_i |X_i|}{X_{rms}} \tag{36}$$

In Eq. (36), $X_{rms}$ represents the root mean square value of the samples over a certain period of time.
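Both time-domain features are one-liners in numpy. The sketch below (function names illustrative) also checks that the kurtosis of a Gaussian-like signal is close to 3, as stated above:

```python
import numpy as np

def kurtosis(x):
    """Normalized fourth moment of Eq. (35); ~3 for a Gaussian signal."""
    xc = x - np.mean(x)
    return np.mean(xc ** 4) / np.mean(xc ** 2) ** 2

def crest_factor(x):
    """Peak value over RMS value, Eq. (36)."""
    rms = np.sqrt(np.mean(x ** 2))
    return np.max(np.abs(x)) / rms

rng = np.random.default_rng(0)
healthy = rng.normal(size=100_000)          # Gaussian-like "healthy" vibration
assert abs(kurtosis(healthy) - 3.0) < 0.1   # close to 3 in the normal state
```

A faulted signal with periodic impacts would raise both statistics, which is what makes this pair of features separable for the classifier.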
Specifically, we calculate the kurtosis and crest factor of all the data in each sample and take them as the input features of QSVM. Meanwhile, imitating quantum computing on a classical computer consumes massive computing resources and time, so a total of 20 samples were selected across the normal, outer race fault and inner race fault classes. Among them, 70% (14 samples) were used as training data and the remaining 30% (6 samples) as test data. According to the mathematical derivation in Sect. 2, the linear system is constructed and solved by the HHL algorithm.
Traditional QSVM can only complete binary classification tasks. Taking Eq. (8) as an example, after solving for $\alpha$ and b, the QSVM classification can be expressed as

$$y(x) = \mathrm{sign}\left(\sum_{i=1}^{p} \alpha_i\, k(x, x_i) + b\right)$$

where $k(x, x_i)$ represents the kernel function between the test data and the training data.
To implement classification of the three fault types, we use the training data to train three QSVM classifiers: H1 vs. H2 (1 and -1), H1 vs. H3 (1 and -1), and H2 vs. H3 (1 and -1). The test data are then input into the three classifiers, and each test sample is assigned the label that receives the most votes. For example, if the results of the three QSVM classifiers are "1", "1" and "-1" respectively, the test sample is labeled H1, and so on.
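The one-vs-one voting described above can be sketched as follows (labels taken from the text, function name illustrative):

```python
# one-vs-one voting over the three pairwise QSVM classifiers;
# each pair records which label the +1 and -1 outputs stand for
PAIRS = [("H1", "H2"), ("H1", "H3"), ("H2", "H3")]

def vote(signs):
    """signs: the three +1/-1 classifier outputs, in the order of PAIRS."""
    tally = {"H1": 0, "H2": 0, "H3": 0}
    for (pos, neg), s in zip(PAIRS, signs):
        tally[pos if s == 1 else neg] += 1
    return max(tally, key=tally.get)

assert vote([1, 1, -1]) == "H1"   # the example given in the text
assert vote([-1, 1, 1]) == "H2"
```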
The decision boundary learned by QSVM from the training data is shown in Fig. 8. It can be observed from Fig. 9 and Fig. 10 that there is a certain deviation between the decision boundaries of QSVM and LS-SVM; the reasons were explained in Sect. 3. The overall deviation is small, however, and QSVM also achieves 100% fault diagnosis accuracy. This shows that the fault diagnosis model based on QSVM is feasible and can offer a far superior advantage over the classical algorithm in the context of big data.

Conclusion
To solve the problem of fault diagnosis in the context of big data, this paper proposes a fault diagnosis model based on QSVM. Compared with traditional algorithms, QSVM can theoretically achieve exponential acceleration and handle high-dimensional data that traditional algorithms cannot process. With the rapid development of quantum hardware, QSVM and other quantum machine learning algorithms will soon be able to run on real quantum computers to demonstrate their quantum advantage. Fault diagnosis algorithms based on quantum machine learning will be among the best choices in the context of big data and will have a profound impact on the field of fault diagnosis.