Title: AIS: A nonlinear activation function for industrial safety engineering

Abstract: In deep-learning-based Chinese named entity recognition, the activation function plays an irreplaceable role: it introduces nonlinearity into the neural network so that the fitted model can be applied to various tasks. However, industrial safety analysis text has a relatively high information density, and the pieces of information it contains are strongly correlated and similar, which easily causes high deviation and high standard deviation in the model. No activation function has been designed specifically for this setting in previous studies, and traditional activation functions suffer from the gradient vanishing and negative region problems, which also prevent the recognition accuracy of the model from improving further. To solve these problems, this paper proposes a novel activation function, AIS. AIS is an activation function for industrial safety engineering and is composed of two piecewise nonlinear functions. In the positive region, a structure combining an exponential function and a quadratic function is used to alleviate the deviation and standard deviation problems, and a linear function is added to correct it, which makes the whole activation function smoother and overcomes the gradient vanishing problem. In the negative region, a cubic function structure is used to solve the negative region problem and accelerate the convergence of the model. The performance of AIS is evaluated on a BERT-BiLSTM-CRF deep learning model. The results show that, compared with other activation functions, AIS overcomes the gradient vanishing and negative region problems, reduces the deviation of the model, speeds up model fitting, and improves the model's ability to extract industrial entities.
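
The abstract describes only the shape of AIS, not its formula. As a rough illustration of that shape, the sketch below builds a piecewise activation with an exponential-times-quadratic term plus a linear term for non-negative inputs and a cubic term for negative inputs. The function names, the specific form `x + alpha * x**2 * exp(-x)`, and the coefficients `alpha` and `beta` are all assumptions made for illustration; this is not the AIS definition from the paper.

```python
import math


def piecewise_activation(x: float, alpha: float = 1.0, beta: float = 0.1) -> float:
    """Hypothetical piecewise activation (NOT the paper's AIS formula).

    x >= 0: linear term plus a quadratic term damped by an exponential.
    x <  0: cubic term, so negative inputs still produce nonzero outputs.
    alpha and beta are illustrative coefficients, not values from the paper.
    """
    if x >= 0.0:
        return x + alpha * x * x * math.exp(-x)
    return beta * x ** 3


def piecewise_activation_grad(x: float, alpha: float = 1.0, beta: float = 0.1) -> float:
    """Analytic derivative of piecewise_activation with respect to x."""
    if x >= 0.0:
        # d/dx [x + alpha * x^2 * e^(-x)] = 1 + alpha * x * (2 - x) * e^(-x)
        return 1.0 + alpha * x * (2.0 - x) * math.exp(-x)
    # d/dx [beta * x^3] = 3 * beta * x^2
    return 3.0 * beta * x * x


if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0, 10.0):
        print(value, piecewise_activation(value), piecewise_activation_grad(value))
```

With these assumed defaults, the linear term keeps the gradient bounded away from zero for large positive inputs, and the cubic branch gives negative inputs a nonzero gradient everywhere except at the origin. That is one way to read the abstract's remarks on gradient vanishing and the negative region.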
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2111.13861 [cs.LG]
  (or arXiv:2111.13861v1 [cs.LG] for this version)

Submission history

From: Zhenhua Wang
[v1] Sat, 27 Nov 2021 10:22:29 GMT (843kb)
