Review on Welding Process Monitoring Based on Deep Learning using Time-Series Data
Abstract
The quality of welds significantly affects the performance and reliability of the final products. Therefore, to guarantee high product quality, technologies that monitor welding processes using time-series data measured by various sensors are required. Because the time-series data measured during welding processes exhibit nonlinear and nonstationary characteristics, deep learning techniques, which can automatically learn the features of such signals through deep network structures, have recently gained recognition as a new monitoring method. This review introduces recent research that applied deep learning models to time-series data measured during welding in order to monitor the welding process. It also discusses the types of time-series data and deep learning model structures that are predominantly used for monitoring tasks such as predicting penetration states and identifying welding defects. Lastly, based on the research cases discussed herein, future research directions and the prospects of deep learning-based welding process monitoring technology that uses time-series data are discussed.
1. Introduction
Welding is an essential process in various manufacturing sectors, such as the automotive, shipbuilding, and aerospace industries, where the quality of the welds is a significant factor that directly influences the mechanical properties of the final products1,2). Consequently, there is a growing demand in industrial settings for welding process monitoring technologies that can observe and analyze the physical phenomena occurring during welding in real-time to manage the weld quality3).
Common methods for monitoring welding processes typically involve camera-based vision monitoring technologies that inspect the molten pool or weld bead4). However, these technologies are limited: although they can capture information related to the external appearance of the weld, they cannot identify internal defects, such as porosity or cracks. Additionally, under welding conditions in which excessive fumes are generated by flux or high-intensity plasma, the performance of these monitoring methods is significantly degraded5). Consequently, welding process monitoring technologies that utilize current, voltage, acoustic, and optical sensors are gaining attention. The signals measured by these sensors form time-series data characterized by continuous values over time6). Because these time-series data contain information on the physical phenomena occurring during the welding process, they are used for various real-time welding process monitoring purposes, such as predicting the weld penetration states and diagnosing the internal defects in welds.
The techniques employed to analyze time-series data for process monitoring usually involve direct signal processing in either the time or frequency domain to generate manually defined features7). However, during signal processing, the range for feature extraction heavily depends on the expertise and knowledge of the operator, and extracting features from welding signals characterized by nonlinear and nonstationary properties is difficult8,9). To overcome this limitation, recent research has focused on Artificial Intelligence (AI) technologies that can automatically extract features with nonlinear and nonstationary characteristics without extensive signal preprocessing. Among these methods, deep learning is particularly noteworthy for its ability to process large volumes of data measured by various sensors rapidly and economically, in contrast with other methods that rely on manually defined features. In addition, deep learning can effectively learn the characteristics of nonlinear and nonstationary signals through its deep network architecture, making it a suitable approach for analyzing the time-series data obtained from the welding processes.
Consequently, deep learning models based on time-series data are being researched for welding process monitoring technologies, such as predicting the weld penetration depth and identifying welding defects. In this review, we introduce research cases in which deep learning models based on time-series data have been applied to monitor welding processes and summarize the characteristics of the time-series data and deep learning models used in this context. We also explain methods for analyzing the time-series data measured during the welding process using deep learning technology, and we compare and analyze the types of time-series data and deep learning model structures used for various monitoring purposes, such as predicting the penetration state, diagnosing welding defects, and identifying welding conditions.
2. Methods for Analyzing Deep Learning-Based Time-Series Data
Time-series signals measured during the welding process, such as current, voltage, and acoustic signals, are fundamentally one-dimensional (1D) in the time domain and exhibit the characteristics of nonstationary signals, which means that their statistical properties change over time6). In addition, these signals exhibit nonlinearity and complex dynamic characteristics depending on the welding process variables and time dependency, where the data at the current time point are related to the data at previous time points10). Therefore, to utilize these 1D time-series data to monitor welding processes, deep learning models must be capable of effectively analyzing the nonlinear and nonstationary characteristics of the signals and the time dependency within the data.
Among the deep learning models capable of learning the characteristics of nonlinear and nonstationary signals, Recurrent Neural Networks (RNNs) are particularly suited for learning time dependencies within 1D time-series signals. As shown in the basic structure of an RNN and the RNN cell in Fig. 1(a, b), RNNs have a structure in which the hidden state from the previous time step is used along with the current input to derive the hidden state at the current time step. The hidden state in an RNN contains information from the previous time steps, and this information, combined with the current input, forms the hidden state of the current cell. This mechanism allows RNNs to effectively learn the dynamic characteristics over time from time-series data with nonlinear and nonstationary properties. In the output layer, the learned dynamic characteristics are used to derive predictions for regression or classification tasks from the hidden state. Among the RNN family of models, Long Short-Term Memory (LSTM) is a deep learning model that can learn long-term dependencies by addressing the vanishing gradient problem that occurs during the training process of traditional RNNs. Because of this capability, LSTM is primarily used in welding process monitoring, which involves large volumes of time-series data. As depicted in Fig. 1(c), the LSTM has a cell state that includes information from the initial time step through previous steps, in addition to the hidden state transferred from the immediately preceding cell. This allows LSTM to effectively learn long-term dependencies. The LSTM cell is composed of three gates: forget, input, and output. First, the forget gate determines the proportion of information from the previous cell state that is retained in the current state. 
It applies a sigmoid (σ) activation function to the current input and hidden state to produce an output between 0 and 1, which is then multiplied by the previous cell state to selectively preserve past information. Next, the input gate adds new information to the current cell state. It applies a hyperbolic tangent (tanh) activation function to the current input and hidden state to filter new information and then it uses a sigmoid activation function to selectively update the current cell state with some of this new information. Finally, the output gate uses a hyperbolic tangent activation function to filter the information from the current cell state to be passed to the next cell, and the sigmoid function determines how much of this information is transferred to the hidden state of the next cell. Through this structure, the LSTM can selectively learn important information from the initial time step to the present, thereby effectively mitigating the vanishing gradient problem encountered by traditional RNN models. This enables LSTM to learn efficiently from large volumes of time-series data, making it highly effective for welding process monitoring.
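The gate operations described above can be sketched as a single LSTM time step in NumPy. This is a minimal illustration under stated assumptions, not the implementation used in any of the reviewed studies: the function name `lstm_step`, the stacked weight layout, the random weights, and the input sizes are all hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, b):
    """One LSTM time step (hypothetical helper).

    W has shape (4 * hidden, input_dim + hidden) and b has shape (4 * hidden,).
    The four row blocks hold the forget, input, candidate, and output weights.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([x_t, h_prev]) + b
    f = sigmoid(z[0 * hidden:1 * hidden])  # forget gate: share of c_prev to keep
    i = sigmoid(z[1 * hidden:2 * hidden])  # input gate: share of the candidate to add
    g = np.tanh(z[2 * hidden:3 * hidden])  # candidate information (tanh-filtered)
    o = sigmoid(z[3 * hidden:4 * hidden])  # output gate: share of the cell state exposed
    c_t = f * c_prev + i * g               # updated cell state
    h_t = o * np.tanh(c_t)                 # updated hidden state
    return h_t, c_t

# Assumed toy setup: 10 time steps of a 3-channel signal, 4 hidden units.
rng = np.random.default_rng(0)
input_dim, hidden = 3, 4
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim + hidden))
b = np.zeros(4 * hidden)
h, c = np.zeros(hidden), np.zeros(hidden)
for x_t in rng.normal(size=(10, input_dim)):
    h, c = lstm_step(x_t, h, c, W, b)
```

Because the cell state is updated additively (`f * c_prev + i * g`) rather than being repeatedly squashed, gradients can flow across many time steps, which is the mechanism behind the mitigation of the vanishing gradient problem.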
Recently, various methods based on LSTM models have been studied to improve the performance of welding process monitoring. One representative technique involves preprocessing 1D time-series data to extract features in the time or frequency domain before learning the time dependencies using an LSTM model. In the case of acoustic signals measured during the welding process, which contain a mixture of various frequency components, important features that are not visible in the time domain can appear in the frequency domain through frequency-component decomposition. Learning the nonlinear and nonstationary characteristics from these features with an LSTM can enhance the accuracy of monitoring the welding process. In addition to preprocessing and learning 1D time-series data, efforts have been devoted to improving the structure of LSTM models to enhance not only the monitoring performance but also the computational efficiency and interpretability. As shown in Fig. 2, deep learning techniques, such as 1D convolution layers and attention mechanisms, are widely used in combination with LSTM models for welding process monitoring. As shown in Fig. 2(a), a hybrid model with a 1D convolution layer added to the input stage of the LSTM model has the advantage of being able to extract superior features with noise removed from the time-series data before learning the time dependencies. Additionally, 1D convolution operations can reduce the size of a large time-series dataset, thereby decreasing the amount of data input to the LSTM and increasing the computational efficiency. Furthermore, as shown in Fig. 2(b), a hybrid model that combines an attention mechanism at the output stage of the LSTM has also been used to monitor welding processes. The attention mechanism analyzes the similarity between features using a dot product and determines the features on which the model should focus during the prediction process. 
By assigning differential weights to features according to their contribution, the hybrid model with LSTM and the attention mechanism can improve the prediction accuracy while also allowing for the interpretation of the prediction results based on the contribution of each feature. The next section describes in detail the cases of welding process monitoring that use these hybrid models.
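As a rough illustration of dot-product attention applied to the hidden states of an LSTM, the sketch below scores each time step against a query vector and forms a weighted summary. The function name `attention_pool`, the use of the last hidden state as the query, and the scaling by the square root of the feature dimension are assumptions made for this example; the reviewed studies may define these differently.

```python
import numpy as np

def attention_pool(H, query):
    """H: (T, d) hidden states, query: (d,). Returns the context vector and weights."""
    scores = H @ query / np.sqrt(H.shape[1])  # scaled dot-product similarity per time step
    scores = scores - scores.max()            # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over time steps
    context = weights @ H                     # weighted sum of hidden states
    return context, weights

# Assumed toy input: 20 time steps of 8-dimensional hidden states.
rng = np.random.default_rng(1)
H = rng.normal(size=(20, 8))
context, weights = attention_pool(H, query=H[-1])
```

The `weights` vector is what makes the prediction interpretable: plotting it against time shows which parts of the signal the model relied on.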
Another method for learning welding signals with nonlinear and nonstationary characteristics involves converting 1D time-series data in the time domain into two-dimensional (2D) time-frequency images, as shown in Fig. 3. This enables the simultaneous analysis of information in both the time and frequency domains. A representative method for analyzing the frequency domain from 1D time-series data is the Short-Time Fourier Transform (STFT). The STFT divides time-series data into fixed-size windows and performs a Fourier transform on each window to analyze the frequency components. This results in a spectrogram, which is a 2D image representing the frequency characteristics over time. Another technique for analyzing the frequency domain from 1D time-series data is the Wavelet Transform (WT). Unlike the STFT, the WT uses variable time windows, making it suitable for analyzing aperiodic signals and allowing the analysis of a range of frequency components. The scalogram obtained from the WT analysis provides a better trade-off between time and frequency resolution, which is advantageous for capturing the characteristics of signals with nonlinear and nonstationary properties. The 2D time-frequency images obtained using these techniques effectively represent the complex nonlinear and nonstationary characteristics within welding signals by showing the frequency and amplitude characteristics over time4).
A suitable model for extracting features from such 2D time-frequency images is a Convolutional Neural Network (CNN), which can extract spatial features from images based on positional information11). Because the images have time and frequency axes, this positional information enables a CNN to learn signal characteristics in specific frequency regions at specific times. This capability makes CNNs useful for improving the accuracy of welding process monitoring by identifying changes in the signals in specific frequency regions at specific times when abnormal events occur during the welding process. This strategy relies on information learned from normal welding processes. Fig. 3(b) illustrates the process by which the CNN learns from time-frequency images. The images are fed to the CNN and first pass through a convolutional layer. In this layer, filters scan the images and perform convolution operations involving element-wise multiplication and summation. This process results in the output of feature maps that contain the time-frequency characteristics. The generated feature maps are then passed through a pooling layer, where their size is reduced, and only the essential features are forwarded to the next layer. These extracted features are then passed through a fully connected layer, where they are processed using nonlinear activation functions. The final output layer computes the probabilities of belonging to each class to produce the prediction results. Through this series of steps, CNNs can effectively learn the time-frequency domain characteristics of time-frequency images, making CNNs highly useful in various applications for monitoring welding processes, such as welding quality control and defect detection.
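The windowing and per-window Fourier transform described above can be sketched in a few lines of NumPy. The function name `stft_spectrogram`, the Hann window, the window and hop lengths, and the synthetic two-tone test signal are assumptions chosen for illustration only.

```python
import numpy as np

def stft_spectrogram(x, fs, win_len=256, hop=128):
    """Magnitude spectrogram of x with shape (frequency bins, time frames)."""
    window = np.hanning(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    # Cut the signal into fixed-size, overlapping, windowed frames.
    frames = np.stack([x[k * hop:k * hop + win_len] * window for k in range(n_frames)])
    spec = np.abs(np.fft.rfft(frames, axis=1)).T
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + win_len / 2) / fs  # frame centres in seconds
    return freqs, times, spec

# Synthetic nonstationary signal: 500 Hz for the first half second, then 2000 Hz.
fs = 8000
t = np.arange(fs) / fs
x = np.where(t < 0.5, np.sin(2 * np.pi * 500 * t), np.sin(2 * np.pi * 2000 * t))
freqs, times, spec = stft_spectrogram(x, fs)
```

With a fixed window, the frequency resolution here is `fs / win_len` (31.25 Hz) at every frequency, which is the limitation that motivates the variable windows of the WT.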
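The convolution, pooling, and classification steps described above can be traced with a deliberately small NumPy sketch. It uses a single filter, a ReLU activation, 2x2 max pooling, and one fully connected layer feeding a softmax over three classes; all of these choices, the random weights, and the 16x16 input are assumptions for illustration, and real models stack many such layers.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Slide the kernel over the image; multiply element-wise and sum at each position."""
    kh, kw = kernel.shape
    out = np.empty((img.shape[0] - kh + 1, img.shape[1] - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(img[r:r + kh, c:c + kw] * kernel)
    return out

def max_pool2x2(fmap):
    """Keep the largest value in each non-overlapping 2x2 block."""
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(2)
image = rng.random((16, 16))                 # stand-in for a time-frequency image
kernel = rng.normal(size=(3, 3))             # one (untrained) filter
fmap = np.maximum(conv2d_valid(image, kernel), 0.0)  # feature map after ReLU, 14x14
pooled = max_pool2x2(fmap)                   # reduced to 7x7
W_fc = rng.normal(scale=0.1, size=(3, pooled.size))  # fully connected layer, 3 classes
probs = softmax(W_fc @ pooled.ravel())       # class probabilities
```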
3. Deep Learning-Based Welding Process Monitoring Research
3.1 Prediction of the Weld Penetration State
The quality of welds is influenced by several factors, such as the weld penetration state, which is a crucial determinant of weld zone strength. Consequently, it is essential to monitor the weld penetration state in real-time during the welding process to ensure a sufficiently strong weld zone. Many studies have adopted deep learning models for real-time monitoring, which are summarized in Table 1.
Luo et al.12) proposed a hybrid model combining a 1D convolution layer and LSTM to monitor the penetration state in real-time based on acoustic signals in a laser welding process. For this purpose, acoustic signals corresponding to the laser power were measured using a microphone during the butt joint pulsed laser welding of an aluminum alloy. To clearly distinguish the penetration state from the measured acoustic signals, the nonperiodic characteristics of the signals occurring during the welding process were analyzed in the frequency domain using Variational Mode Decomposition (VMD). This technique can decompose a signal into multiple modes with different frequency bands; accordingly, the acoustic signals were decomposed into nine frequency bands using VMD. A 1D convolution layer was used to extract the energy distribution features from each frequency band of the decomposed signals. To predict the penetration state over time, the 1D convolution layer was combined with the LSTM model. After classifying the penetration states into three categories (insufficient, partial, and full penetration), the 1D convolution layer-LSTM model was trained on the acoustic signals measured under the given welding conditions. The results demonstrated that the model could predict all three penetration states with an accuracy of 99.8%.
Zhao et al.13) proposed a hybrid model combining LSTM and an attention mechanism to classify and predict three similar penetration states (partial penetration, normal penetration, and burn-through) in pulsed Gas Tungsten Arc Welding (GTAW) processes based on acoustic signals. The acoustic signals corresponding to the welding current were measured using a microphone during the butt joint pulsed GTAW of the aluminum alloy. The measured acoustic signals were decomposed into three segments corresponding to the ignition and extinction of the arc to analyze the time-domain data. Nine time-domain features, such as mean amplitude and impulse, were extracted from these segments. Additionally, to analyze nonlinear and nonstationary characteristics not evident in the time-domain data, a WT was performed, decomposing the acoustic signals into six frequency bands, with the short-time average energy extracted as a feature from each band. A model combining LSTM and an attention mechanism was constructed to predict the penetration state over time, with the attention mechanism assigning weights to features that were highly correlated with the penetration state. The model was trained using 15 features extracted from both the time and frequency domains. When applied to acoustic signals measured under welding conditions producing the three distinct penetration states, the LSTM-attention mechanism model achieved a prediction accuracy of 95.36%, surpassing that of traditional machine-learning-based models for predicting the penetration state, such as the Back Propagation Neural Network (BPNN) and Support Vector Machine (SVM).
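To make the notion of hand-extracted time-domain features concrete, the following sketch computes a few common ones for a signal segment, plus a short-time average energy. The specific feature definitions (mean absolute amplitude, root mean square, and impulse factor as peak over mean absolute amplitude), the function names, and the frame length are assumptions; the source does not list the exact nine features or their formulas, and the wavelet band decomposition is omitted here.

```python
import numpy as np

def time_features(seg):
    """Return [mean amplitude, RMS, impulse factor] for one signal segment."""
    abs_seg = np.abs(seg)
    mean_amp = abs_seg.mean()
    rms = np.sqrt(np.mean(seg ** 2))
    impulse = abs_seg.max() / mean_amp   # peak relative to mean absolute amplitude
    return np.array([mean_amp, rms, impulse])

def short_time_energy(x, frame=64):
    """Average energy in consecutive non-overlapping frames of `frame` samples."""
    n = len(x) // frame
    return np.mean(x[:n * frame].reshape(n, frame) ** 2, axis=1)

# Assumed toy segment: a noisy decaying burst.
rng = np.random.default_rng(3)
seg = np.exp(-np.linspace(0, 5, 512)) * rng.normal(size=512)
features = time_features(seg)
energy = short_time_energy(seg)
```

Feature vectors of this kind, computed per segment and per frequency band, would form the sequence that is passed to the LSTM-attention model.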
Instead of using LSTM to learn time dependencies, research has also been conducted to develop hybrid models that can extract superior features from the time domain by fusing multiple signals. Cao et al.14) proposed a method for monitoring the penetration state during the laser welding process without first analyzing the time and frequency domains using a hybrid model based on a 1D CNN and a cross-attention mechanism. In their study, three penetration states (full, partial, and excessive penetration) were distinguished based on the laser power and welding speed during the laser welding of the aluminum alloy. The acoustic and optical time-series data were measured using a microphone and photodiode, respectively, under each set of welding parameters. To train the two types of time-series data, a 1D CNN model was used to extract features from each signal, and then a cross-attention mechanism was employed to combine the two features, resulting in a Cross-Attention Fusion Network (CAFNet) model capable of extracting opto-acoustic features. To evaluate the penetration state prediction performance of the CAFNet model trained with acoustic and optical data, its accuracy was compared with those of six CNN models without the attention mechanism under single- and multi-sensor conditions. The proposed model showed the best performance, with a prediction accuracy of 99.73%; further, even when the proportion of training data was reduced to 20%, it could still predict the penetration state with over 90% accuracy.
As the studies above show, 1D time-series data can be analyzed directly with deep learning models such as hybrid models combining LSTM with 1D convolution layers and attention mechanisms. Alternatively, as described earlier, another method involves converting 1D time-series data in the time domain into 2D time-frequency images and using both time- and frequency-domain information for process monitoring. For example, Gao et al.15) converted arc acoustic signals occurring during the DC GTAW process into time-frequency images using STFT. Then, they proposed a method for monitoring the penetration state of a weld using the image data in conjunction with a CNN model. In the DC GTAW process of mild steel, acoustic signals were measured using a microphone to predict how the penetration state varied with welding conditions (current and welding speed). To remove noise from the measured acoustic signals, the signals were amplified or suppressed, and the processed acoustic signals were decomposed into several frequency bands using STFT. The frequency content of each band, together with the time-domain data, was transformed into a spectrogram, a 2D time-frequency image. The spectrograms were classified into three penetration states (partial, full, and excessive penetration) and were used to train a CNN model based on VGG16. This model showed a classification accuracy of over 95% when the signal-to-noise ratio was less than 5 dB.
Ren et al.16) monitored the penetration state in pulsed GTAW using a CNN model with time-frequency images of arc acoustic signals transformed through STFT as the input. In the pulsed GTAW process of aluminum alloy plates, arc acoustic signals were measured using a microphone to monitor the penetration state under various preheating and processing (milling and hole drilling) conditions. Because the measured arc acoustic signals exhibited nonstationary characteristics, frequency data were extracted using STFT, which is advantageous for analyzing nonstationary signals, and 2D spectrograms in the form of time-frequency images were generated. To classify the penetration states during the GTAW process into four cases (partial, normal, and excessive penetration as well as burn-through), the transformed 2D images were used as inputs to train the CNN model. The performance of the trained CNN model in predicting the penetration states was compared with that of four other machine-learning techniques, including the BPNN and SVM. The results showed that the CNN method achieved a high accuracy of 98.2% in predicting the penetration state, with a low accuracy variance of 0.21, thus outperforming conventional methods.
Similarly, Zhao et al.17) converted arc acoustic signals collected during the pulsed GTAW process into time-frequency images using STFT and monitored the penetration state using a CNN model. To enhance the monitoring performance of the CNN model, they added a Squeeze-Excitation (SE) module, a type of attention mechanism, to the output layer of the CNN structure to evaluate the contribution of each feature map. Based on this evaluation, weights were applied differentially, enabling the deep learning model to learn more significant features. In addition, to prevent overfitting, the fully connected layer was replaced with either global average pooling or global max pooling. Using these methods, the model developed by Zhao et al. classified three penetration states (partial penetration, normal penetration, and burn-through) with an accuracy of 98.25%, demonstrating superior performance compared with those in the abovementioned studies.
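The channel-weighting idea behind a Squeeze-Excitation module can be sketched as follows: each feature map is summarized by global average pooling, a small two-layer network turns the summaries into weights between 0 and 1, and each map is rescaled by its weight. The function name `squeeze_excite`, the reduction from 4 channels to 2, and the random weights are assumptions for illustration, not the configuration used by Zhao et al.

```python
import numpy as np

def squeeze_excite(fmaps, W1, W2):
    """fmaps: (C, H, W). Returns the reweighted maps and the per-channel weights."""
    s = fmaps.mean(axis=(1, 2))              # squeeze: global average pooling per channel
    z = np.maximum(W1 @ s, 0.0)              # excitation, bottleneck layer with ReLU
    w = 1.0 / (1.0 + np.exp(-(W2 @ z)))      # sigmoid gives channel weights in (0, 1)
    return fmaps * w[:, None, None], w

# Assumed toy input: 4 feature maps of size 5x5, bottleneck of 2 units.
rng = np.random.default_rng(4)
fmaps = rng.random((4, 5, 5))
W1 = rng.normal(size=(2, 4))
W2 = rng.normal(size=(4, 2))
scaled, channel_weights = squeeze_excite(fmaps, W1, W2)
```

The same global average pooling used in the squeeze step is what replaces the fully connected layer in the overfitting countermeasure mentioned above, since it has no trainable parameters.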
Meanwhile, Zhang et al.18) proposed a method for monitoring the welding process using a CNN model with 2D time-frequency images obtained by applying STFT to acoustic emission signals propagating inside the material rather than acoustic signals measured with a microphone. Specifically, piezoelectric sensors were used to measure acoustic emission signals, which are elastic waves propagating inside the material, during the laser-Tungsten Inert Gas (TIG) hybrid welding process of 316 L stainless steel to predict changes in the penetration state according to the welding current, laser power, and welding speed. The measured acoustic emission signals were converted into spectrograms containing time-frequency domain information using STFT. The converted time-series images were then used to train a CNN model to classify the penetration state into three cases (insufficient, normal, and excessive penetration) during the welding process. The prediction performance of the trained model was compared with those of other models, such as SVM, 1D-CNN, Recurrence Plot (RP)-CNN, and Fast Fourier Transform (FFT)-CNN, among which it achieved the highest accuracy of 99.62%.
The aforementioned studies primarily used the STFT technique to convert 1D time-series data into frequency-domain data, generating 2D time-frequency images in the form of spectrograms. Instead of STFT, some studies have used WTs to form 2D time-frequency data for training. In a representative study, Jin et al.19) applied WT techniques to welding current signals measured in a Gas Metal Arc Welding (GMAW) process to extract 2D time-frequency images and predict the formation of back beads using a CNN model. In their butt joint GMAW experiments on galvanized steel sheets, the welding current signals were measured under various conditions (wire feed speed and the existence of a root gap) to predict the formation of back beads. Because setting appropriate time and frequency resolutions is crucial when converting nonperiodic signals, such as welding current signals, into time-frequency domain images, they used the Morlet Wavelet Transform (MWT), a type of WT, to convert the signals into frequency-domain data. This resulted in 2D time-frequency images in the form of scalograms. The 2D images obtained under conditions that both led to and avoided the formation of back beads were used to train the CNN model, which predicted the formation of the back beads with an accuracy of approximately 93.5%.
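A scalogram of the kind described above can be approximated by correlating the signal with complex Morlet wavelets whose width shrinks as the analysis frequency rises. The sketch below is an illustrative direct implementation, not the procedure of Jin et al.; the function name `morlet_scalogram`, the centre parameter `w0 = 6`, the truncation at four scale widths, the normalization, and the 50 Hz test tone are assumptions.

```python
import numpy as np

def morlet_scalogram(x, fs, freqs, w0=6.0):
    """Magnitude of a Morlet wavelet transform, shape (len(freqs), len(x)).

    Assumes every wavelet is shorter than the signal.
    """
    out = np.empty((len(freqs), len(x)))
    for k, f in enumerate(freqs):
        s = w0 / (2 * np.pi * f)             # scale in seconds with centre frequency f
        half = int(np.ceil(4 * s * fs))      # truncate the wavelet at +/- 4 scale widths
        tau = np.arange(-half, half + 1) / fs
        wavelet = np.exp(1j * w0 * tau / s) * np.exp(-0.5 * (tau / s) ** 2)
        wavelet /= np.sqrt(s * fs)           # keep wavelet energy comparable across scales
        # Correlation with the wavelet, written as convolution with its reversed conjugate.
        out[k] = np.abs(np.convolve(x, np.conj(wavelet)[::-1], mode="same"))
    return out

fs = 1000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 50 * t)               # assumed test tone at 50 Hz
scal = morlet_scalogram(x, fs, freqs=[20.0, 50.0, 120.0])
```

Because the window length is proportional to `1 / f`, low frequencies are analyzed with long windows (fine frequency resolution) and high frequencies with short ones (fine time resolution), unlike the fixed window of the STFT.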
3.2 Detection of Weld Defects
As mentioned in Chapter 3.1, the strength of the weld zone is influenced by the penetration state. Similarly, various weld defects that occur during the welding process can also affect the penetration state and the strength of the weld, thereby compromising the weld quality. To detect these weld defects, various studies have developed deep learning models that can extract features with high contributions to defect detection from various types of time-series data, as summarized in Table 2.
Wang et al.20) proposed an LSTM-based deep learning model using the power signals of a welding machine to predict the fracture modes observed during shear tensile tests of ultrasonically welded metal materials. Thin sheets of aluminum and copper were ultrasonically welded under various process parameters (welding energy, amplitude, and pressure). During the shear tensile tests of the welded specimens, the power signals corresponding to the two fracture modes (interfacial and pull-out fracture) were measured. These measured power signals were used as inputs to the LSTM model, which can understand the characteristics of nonlinear and nonstationary data, to analyze the correlation between the power signals and the fracture modes of the weld. To enhance the learning efficiency and improve the model accuracy, the data were divided into blocks to train the LSTM model. The trained model predicted the fracture modes of ultrasonic welds with a high validation accuracy of 92.9%.
As described in Chapter 2, researchers are enhancing the performance of deep learning models by adding 1D convolution layers or attention mechanisms to traditional LSTM models. Ji et al.21) proposed a hybrid model combining LSTM with 1D convolution layers and an attention mechanism to develop a digital twin system for robotic GMAW processes, aiming to monitor defects in real-time by measuring arc acoustic signals. To detect the four types of welding defects associated with robot movements in GMAW (insufficient penetration, excessive penetration, undercut, and burn-through), the arc acoustic signals were measured under each defect condition. To extract features related to welding defects from these arc acoustic signals, an SE-CNN model was constructed, wherein multiple 1D convolution layers were followed by the SE module, which is a type of attention mechanism. Additionally, an LSTM model was used to learn the long-term dependencies from the features extracted by the SE-CNN model to classify the four types of welding defects. To quantitatively evaluate the performance of the trained SE-CNN-LSTM model, it was compared with other methods, such as CNN-LSTM and CNN-Bidirectional Long Short-Term Memory (BiLSTM), using six performance metrics, including accuracy, precision, recall, area under the curve, receiver operating characteristic and F1 score. The results showed that the SE-CNN-LSTM model outperformed all the other methods in every aspect.
Chang et al.22) applied an attention mechanism to a 1D convolution-LSTM deep learning model to monitor nugget defects during the resistance spot welding process. For this purpose, current and voltage signals were measured for three cases (insufficient, normal, and overburning welding), and the dynamic resistance was calculated from these two signals. To extract key features from the current and dynamic resistance data, a 1D convolution layer was introduced. The LSTM structure was then used to learn the time-dependent correlations between the extracted features. Additionally, an attention mechanism was implemented to select the most useful features for nugget defect detection, assigning higher weights to features with a larger contribution to defect detection, thereby improving the performance of the model. The trained model monitored the nugget defects with an accuracy of 98.5%. The weights calculated using the attention mechanism were visualized to interpret the monitoring results of the proposed model.
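As a simple illustration of deriving dynamic resistance from measured current and voltage, the sketch below divides windowed root-mean-square voltage by windowed root-mean-square current. This is only one plausible definition and is an assumption here; the source does not state how Chang et al. computed the dynamic resistance. The function name, sampling rate, window length, and signal amplitudes are likewise made up for the example.

```python
import numpy as np

def dynamic_resistance(voltage, current, window):
    """RMS voltage divided by RMS current in consecutive windows of `window` samples."""
    n = len(current) // window
    v = voltage[:n * window].reshape(n, window)
    i = current[:n * window].reshape(n, window)
    v_rms = np.sqrt(np.mean(v ** 2, axis=1))
    i_rms = np.sqrt(np.mean(i ** 2, axis=1))
    return v_rms / i_rms

# Assumed toy signals: 50 Hz current sampled at 10 kHz through a constant 100 micro-ohm load.
fs = 10_000
t = np.arange(2000) / fs
current = 8000.0 * np.sin(2 * np.pi * 50 * t)   # amperes
voltage = 1e-4 * current                        # volts
resistance = dynamic_resistance(voltage, current, window=200)  # one value per 50 Hz cycle
```

In a real weld the resulting curve varies over the weld time, and it is this variation that the 1D convolution layer and LSTM are trained on.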
Peng et al.23) proposed a method for monitoring welding defects during the laser welding process by applying an attention mechanism to a model combining a Multi-Scale Convolutional Neural Network (MSCNN) and BiLSTM. They aimed to predict nine types of welding defects during the laser welding of SUS304 stainless steel (defocus 3 mm, defocus -3 mm, deformation, cracks, repetition, lack of weld, drift, base material tilt, and watermarks). To that end, they measured the plasma intensity, light intensity, and temperature signals under both defect and non-defect conditions. An MSCNN with a parallel structure of 1D convolution layers using different kernel sizes was employed to extract various features related to defects from multisignal data. The BiLSTM model, which performs computations in both the forward and backward directions, was used to learn the temporal arrangement of the features extracted from each 1D convolution layer. To enhance the monitoring performance by selecting the most useful features for defect detection, an attention mechanism was applied to the model combining MSCNN and BiLSTM. The MSCNN-BiLSTM-attention mechanism model was compared with other methods, including MSCNN-BiLSTM, CNN-BiLSTM-attention mechanism, CNN, and SVM; among them, MSCNN-BiLSTM-attention demonstrated superior performance when monitoring welding defects, showing the highest accuracy of 99.38%.
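The parallel multi-scale convolution idea can be sketched by filtering the same signal with kernels of different lengths and stacking the results. The function names, the kernel sizes (3, 5, and 7), the averaging kernels, and the ReLU activation are assumptions for illustration; in a trained MSCNN the kernel values are learned.

```python
import numpy as np

def conv1d_same(x, kernel):
    """1D cross-correlation with zero padding so the output keeps the input length."""
    return np.convolve(x, kernel[::-1], mode="same")

def multi_scale_features(x, kernels):
    """Apply each kernel in parallel and stack the ReLU-activated outputs."""
    return np.stack([np.maximum(conv1d_same(x, k), 0.0) for k in kernels])

rng = np.random.default_rng(5)
x = rng.normal(size=100)                        # stand-in for one sensor signal
kernels = [np.ones(n) / n for n in (3, 5, 7)]   # short kernels see fine detail, long ones trends
features = multi_scale_features(x, kernels)     # shape (3, 100)
```

A bidirectional LSTM would then read `features` along the time axis once forward and once in reverse, concatenating the two hidden states at each step.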
In addition to using hybrid models that combine LSTM with various deep learning techniques to learn 1D time-series data, deep learning models that use 2D time-frequency images obtained through STFT, WT, and other techniques have also been applied to monitor and detect welding defects. Yang et al.24) proposed a method for detecting burn-through and a lack of fusion in welds using time-frequency images of arc acoustic signals transformed using STFT and a CNN model. To identify defects in the gaps and alignment between the base materials during the butt welding process of aluminum sheets using a Cold Metal Transfer (CMT) welder, arc acoustic signals were measured with a microphone. To prevent the loss of time and frequency information of the arc acoustic signals during the conversion to 2D images, first and second differential operations were applied, forming a three-channel time-frequency spectrum through the STFT technique. The multichannel time-frequency spectrum was used to train a CNN model to detect defects during the welding process. The defect detection performance of the trained CNN model was evaluated, and it was found to outperform methods such as SVM, BPNN, and decision tree. Furthermore, using multiple channels was confirmed to result in a higher defect detection accuracy (91.49%) compared with the use of a single channel.
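One way to build a three-channel time-frequency input from first and second differences is sketched below. It assumes the differences are taken along the time axis of the spectrogram, in the manner of delta features in speech processing; the source does not specify whether Yang et al. differentiated the spectrogram or the raw signal, so this is an illustrative interpretation. The function name and the toy spectrogram are hypothetical.

```python
import numpy as np

def three_channel_spectrum(spec):
    """spec: (freq bins, time frames). Returns an array of shape (3, freq bins, time frames)."""
    d1 = np.gradient(spec, axis=1)   # first difference along time
    d2 = np.gradient(d1, axis=1)     # second difference along time
    return np.stack([spec, d1, d2])

# Assumed toy spectrogram: magnitude rising linearly with time in every frequency bin.
spec = np.tile(np.arange(10.0), (4, 1))
channels = three_channel_spectrum(spec)
```

The stacked array has the same layout as an RGB image, so it can be fed to a standard image CNN without changing the network's input layer.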
Shevchik et al.25) proposed a method for detecting welding process instabilities in laser welding by training a CNN model with time-frequency images extracted through WT. To monitor the process state based on the pulse duration and laser power during the laser welding of aluminum alloys, laser back-reflection and acoustic emission signals were measured using an optical system and acoustic sensors, respectively. X-ray transmission tests were performed during the welding process to observe the process instabilities that cause welding defects, and the process state of laser welding was classified into five categories (conduction welding, stable keyhole, unstable keyhole, blowout, and pores). The back-reflection and acoustic signals measured during the welding process were transformed into frequency-domain data using a WT and subsequently converted into 2D time-frequency images in the form of scalograms. These extracted 2D images were used to train a CNN model for the real-time monitoring of the process state and the detection of instabilities in the laser welding process. The CNN model demonstrated high accuracy in classifying the process state into five categories, with the keyhole state, which is a major cause of defects, being classified into stable and unstable states with accuracies of 93% and 94%, respectively.
Zhang et al.26) applied WT techniques to signals measured during the laser welding process, decomposing them into various frequency resolutions and reconstructing the extracted features into image-like maps for input to a CNN model. To determine the presence of defects based on the laser power, focal position, shielding gas flow rate, and welding speed, they used photodiode sensors to measure the laser radiation and visible light radiation reflected from the keyhole. To analyze the frequency characteristics of the measured signals in detail, the laser and visible light radiation signals were decomposed into 128 frequency bands using a WT, resulting in a total of 256 features, as the energy of each frequency band was calculated as a feature. To extract the key features, the 256 initially extracted features were reconstructed into a 2D feature distribution map. The CNN model used this reconstructed feature distribution map as an input to detect defects in laser welding. The CNN model classified laser welding defects into four categories (good, blowout, hump, and undercut) with an average accuracy of 96.4%, outperforming traditional models such as backpropagation, logistic regression, SVM, and random forest.
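The feature construction described above (band energies from two signals arranged as a 2D map) can be sketched under several assumptions that are not stated in the source: a Haar wavelet, a full wavelet packet tree of depth 7 to obtain 128 bands per signal, and a 16 x 16 layout for the 256 energies. The function names and the synthetic signals are hypothetical.

```python
import math


def haar_packet_energies(x, depth):
    """Energy of each of the 2**depth leaf bands of a full Haar wavelet packet tree.

    Assumes len(x) is divisible by 2**depth. Leaves are returned in the natural
    tree order, which is not strictly sorted by frequency.
    """
    nodes = [list(x)]
    s = math.sqrt(2.0)
    for _ in range(depth):
        next_nodes = []
        for node in nodes:
            approx = [(node[i] + node[i + 1]) / s for i in range(0, len(node) - 1, 2)]
            detail = [(node[i] - node[i + 1]) / s for i in range(0, len(node) - 1, 2)]
            next_nodes += [approx, detail]
        nodes = next_nodes
    return [sum(c * c for c in node) for node in nodes]


def feature_map(sig_a, sig_b, depth=7, width=16):
    """Concatenate the band energies of two signals and reshape them into rows of `width`."""
    feats = haar_packet_energies(sig_a, depth) + haar_packet_energies(sig_b, depth)
    return [feats[r:r + width] for r in range(0, len(feats), width)]


# Synthetic stand-ins for the laser and visible light radiation signals.
n = 256
laser = [math.sin(0.3 * i) for i in range(n)]
visible = [math.cos(0.05 * i) + 0.1 * math.sin(1.1 * i) for i in range(n)]
fmap = feature_map(laser, visible)  # 16 rows x 16 columns = 256 energy features
```

Because the Haar transform is orthonormal, the band energies of each signal sum to the energy of the original signal, which gives a quick sanity check on the decomposition.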
3.3 Monitoring of the Welding Conditions (Workpiece Alignment and Shielding Gas)
The aforementioned studies monitored the weld penetration state or the occurrence of welding defects in real time. However, to ensure high welding quality, it is also necessary to monitor whether the alignment of the base materials and the set welding conditions are maintained throughout the welding process. Consequently, the use of deep learning technologies based on 1D time-series data to monitor these aspects has also been investigated, as listed in Table 3.
Kim et al.27) proposed a Deep Neural Network (DNN)-based deep learning technique to detect the misalignment between the electrode and rivet using the dynamic resistance generated by the pre-pulse current during the multi-pulse Resistance Element Welding (REW) process. For this purpose, they performed REW on three layers (Al6061-T6, SPFC590DP, and SABC1470). During the preheating process using the pre-pulse, the dynamic resistance waveforms were measured and analyzed using a Rogowski coil and a voltage measurement module for various electrode eccentric distances. The measured signals were used as training data for the DNN model to detect the electrode-rivet misalignment before the actual process. The validation of the trained DNN model showed that the misalignment level between the electrode and rivet was predicted with 100% accuracy.
Liu et al.28) studied the effect of the shielding gas flow rate on weld quality in the CMT welding process, with the aim of monitoring this effect. The current, voltage, and acoustic signals were measured during the welding process under varying shielding gas flow rates. To analyze the influence of the shielding gas flow on the welding process, the power was calculated from the current and voltage signals, and the frequency-domain data were extracted from the acoustic signals using FFT. Features were extracted from the time frames of the power signals and frequency data of the acoustic signals, and these features were used as input data to train the BiLSTM model. When trained with both arc acoustic and electrical signals, the BiLSTM model detected abnormal welding process states with an error rate of 0.295, outperforming the same model trained with arc acoustic signals alone.
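The preprocessing described above (power from current and voltage, frequency content from the acoustic signal, features per time frame) can be sketched as follows. The specific features chosen here, mean power and the strongest non-DC acoustic bin per frame, are assumptions for illustration, as are the frame length, the function names, and the synthetic signals; the source does not list the features that were actually used.

```python
import cmath
import math


def instantaneous_power(current, voltage):
    """Sample-wise electrical power, P = I * V."""
    return [i * v for i, v in zip(current, voltage)]


def dft_magnitude(x):
    """Magnitude of a naive DFT for the non-negative frequency bins."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


def frame_features(current, voltage, acoustic, frame_len):
    """Per non-overlapping frame: [mean power, index of the strongest non-DC acoustic bin]."""
    power = instantaneous_power(current, voltage)
    rows = []
    for start in range(0, len(power) - frame_len + 1, frame_len):
        p = power[start:start + frame_len]
        mag = dft_magnitude(acoustic[start:start + frame_len])
        peak_bin = max(range(1, len(mag)), key=lambda k: mag[k])
        rows.append([sum(p) / frame_len, peak_bin])
    return rows


# Synthetic signals: a step in current and a pure acoustic tone (3 cycles per frame).
current = [100.0] * 16 + [120.0] * 16
voltage = [20.0] * 32
acoustic = [math.sin(2 * math.pi * 3 * t / 16) for t in range(32)]
rows = frame_features(current, voltage, acoustic, frame_len=16)
```

Each row corresponds to one time step of the sequence that a recurrent model such as a BiLSTM would receive.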
In addition to technologies for monitoring the welding process, systems that control the process in real-time based on monitoring results have also been investigated. Jin et al.29) developed a system for monitoring the welding gap and controlling a welding robot to maintain a consistent weld bead shape during the T-fillet Flux-Cored Arc Welding (FCAW) process of SS400 steel plates. First, welding experiments were conducted at various welding speeds based on the gap size, and the welding current and voltage were measured during the welding process. To understand the changes in the welding current and voltage due to variations in the welding gap, both the time- and frequency-domain data were analyzed. Statistical indicators were applied to the data from both domains (extracted using FFT), resulting in 24 feature variables. A DNN model was trained using these data to classify the welding gap into three sizes (0, 2, and 4 mm). The trained model was validated with both a constant welding gap and a continuously varying welding gap, achieving accuracies of 87.7% and 93.7%, respectively. In addition, when the welding speed of the robot was controlled using the trained model under the continuously varying gap conditions, the weld bead shape remained consistent, thereby demonstrating the effectiveness of the developed real-time process monitoring and control system.
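A feature vector of the kind described above can be sketched as follows. The source does not list which statistical indicators were used, so the six statistics below are assumptions; they were chosen only because six statistics over two signals and two domains gives 24 values, matching the reported feature count. The function names and the synthetic current and voltage signals are hypothetical.

```python
import cmath
import math


def stats(x):
    """Six summary statistics: mean, std, RMS, peak-to-peak, skewness, kurtosis."""
    n = len(x)
    mean = sum(x) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    rms = math.sqrt(sum(v * v for v in x) / n)
    p2p = max(x) - min(x)
    if std == 0:
        skew = kurt = 0.0  # higher moments are undefined for a constant signal
    else:
        skew = sum((v - mean) ** 3 for v in x) / (n * std ** 3)
        kurt = sum((v - mean) ** 4 for v in x) / (n * std ** 4)
    return [mean, std, rms, p2p, skew, kurt]


def spectrum(x):
    """Magnitude of a naive DFT for the non-negative frequency bins."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
            for k in range(n // 2 + 1)]


def gap_features(current, voltage):
    """Time-domain and frequency-domain statistics for both signals (2 x 2 x 6 = 24)."""
    feats = []
    for sig in (current, voltage):
        feats += stats(sig) + stats(spectrum(sig))
    return feats


# Synthetic stand-ins for measured welding current (A) and voltage (V).
current = [200.0 + 10.0 * math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
voltage = [25.0 + 2.0 * math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
feats = gap_features(current, voltage)
```

A vector of this form would be the input to a classifier that assigns one of the gap classes; the classifier itself is omitted here.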
The aforementioned studies utilized DNN and LSTM models for monitoring various aspects of welding, such as the misalignment between electrodes and rivets, the effect of the shielding gas flow rate on weld quality, and real-time process control. However, few of these studies have developed and applied hybrid models of the kind discussed in Sections 3.1 and 3.2, which combine techniques such as CNN and attention mechanisms to improve the performance of deep learning models. Therefore, future research is expected to explore the application of hybrid models, such as CNN-LSTM-attention, to monitoring welding conditions, including the alignment of base materials and the use of shielding gas. This will enhance the overall quality management of the welding process.
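As a rough illustration of what an attention mechanism adds at the output stage of a recurrent model, the following sketch applies dot-product attention to a short sequence of hidden states. It is a simplified assumption rather than the formulation of any study cited here: the hidden states and the query vector are made-up numbers, whereas in a trained model both would come from learned parameters.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Score each time step against the query, normalize, and form a weighted sum."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)]
    return weights, context


# Three time steps of 2D hidden states (hypothetical values); the second step dominates.
hidden_states = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
weights, context = attention_pool(hidden_states, query)
```

The weights sum to one, and the time step with the largest score receives the largest weight, so the pooled context vector emphasizes the parts of the sequence that contribute most to the prediction.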
4. Conclusion
This review examined research on deep learning-based welding process monitoring that utilizes time-series data measured during the welding process. The main findings are summarized as follows.
1) When monitoring welding processes using 1D time-series data, LSTM models are primarily utilized, because they can effectively learn the nonlinearity, nonstationary characteristics, and long-term dependencies. To improve the accuracy of welding process monitoring, hybrid models that combine a 1D convolution layer at the input stage of the LSTM or an attention mechanism at the output stage have been studied. These models can enhance prediction accuracy by extracting key features from time-series data to learn time dependencies or by differentially assigning weights to features in the order of their contribution at the output stage of the model.
2) Another method for learning the nonlinear and nonstationary characteristics of welding signals involves analyzing the time-frequency domain by converting 1D time-series data into 2D time-frequency images. These images are then used as inputs for CNN-based deep learning models, and this approach has been actively researched for monitoring welding processes. A representative method for converting 1D time-series signals into 2D time-frequency images is STFT, which analyzes the frequency components at fixed time intervals. Additionally, the WT, which analyzes the frequency components over various time intervals and improves the tradeoff between time and frequency resolution, has also been applied.
3) Time-series data-based deep learning technologies are utilized for various purposes, such as predicting weld penetration states, diagnosing defects, and monitoring welding conditions. Acoustic signals are primarily used to predict weld penetration states. To diagnose defects, current and voltage signals are commonly used in arc and resistance spot welding, whereas optical signals are used in laser welding. Additionally, in both arc and resistance spot welding, current, voltage, and resistance signals are employed to monitor the welding process conditions, such as the alignment of base materials and the flow rate of the shielding gas, as well as to control the process in real-time.
Recently, as the reliability of the results from welding process monitoring using deep learning models has become increasingly important, research attention has also been devoted to incorporating explainable AI (XAI) technologies to interpret the predictions of deep learning models. For instance, using SHapley Additive exPlanations (SHAP), features can be ranked according to their contribution to predicting welding quality, allowing the selection of the most relevant features for welding process monitoring. Using Gradient-weighted Class Activation Mapping (Grad-CAM), the rationale behind model predictions can be visually represented when predicting welding quality from time-frequency images using CNN models30). These technologies enhance the trustworthiness of the results by interpreting the decision-making process of the model, thereby significantly contributing to the management of welding quality using deep learning-based monitoring technologies. Additionally, recently highlighted Physics-Informed Neural Networks (PINNs), which combine physical governing equations with AI algorithms, can be used to predict complex physical phenomena during welding processes, such as flow and temperature changes in the molten pool. In the future, technologies such as XAI and PINNs are expected to be actively adopted for monitoring welding processes. These technologies are expected not only to monitor and control welding processes but also to become essential tools for achieving welding process automation.
Acknowledgement
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. NRF-2023R1A2C1004272).