(2) Initializing the model’s parameters;
(3) Implementing the forward propagation as well as computing the cost
function;
(4) Implementing the backward propagation and using the gradient descent algorithm to update the model parameters;
(5) Making predictions and evaluating the classification accuracy; a rough code sketch of these steps is given below.
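As a rough illustration only (not the authors' implementation), these steps can be sketched for a one-hidden-layer network as follows. NumPy is assumed, and all names (init_params, forward, cost, backward_and_update, predict, n_x, n_h, n_y) are hypothetical.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_params(n_x, n_h, n_y, seed=0):
    # Step (2): small random weights, zero biases
    rng = np.random.default_rng(seed)
    return {"W1": rng.normal(scale=0.01, size=(n_h, n_x)), "b1": np.zeros((n_h, 1)),
            "W2": rng.normal(scale=0.01, size=(n_y, n_h)), "b2": np.zeros((n_y, 1))}

def forward(X, p, hidden_act=np.tanh):
    # Step (3): forward propagation through one hidden layer; the output layer uses sigmoid
    A1 = hidden_act(p["W1"] @ X + p["b1"])
    A2 = sigmoid(p["W2"] @ A1 + p["b2"])
    return A1, A2

def cost(A2, Y):
    # Step (3): binary cross-entropy cost over m examples
    m = Y.shape[1]
    return -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m

def backward_and_update(X, Y, A1, A2, p, lr, hidden_grad):
    # Step (4): back-propagation for a sigmoid output layer, then one gradient-descent step
    m = Y.shape[1]
    dZ2 = A2 - Y
    dW2 = dZ2 @ A1.T / m
    db2 = dZ2.sum(axis=1, keepdims=True) / m
    dZ1 = (p["W2"].T @ dZ2) * hidden_grad(A1)   # hidden_grad gives g'(z) in terms of a = g(z)
    dW1 = dZ1 @ X.T / m
    db1 = dZ1.sum(axis=1, keepdims=True) / m
    for k, g in zip(("W1", "b1", "W2", "b2"), (dW1, db1, dW2, db2)):
        p[k] -= lr * g

def predict(X, p, hidden_act=np.tanh):
    # Step (5): label 1 if the predicted probability exceeds 0.5
    return (forward(X, p, hidden_act)[1] > 0.5).astype(int)

Here X is assumed to have shape (n_x, m) and Y shape (1, m); the hidden activation and its derivative are passed in as arguments so that the same sketch covers both hidden-layer choices discussed below.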
In an ANN model, the hyperparameters include the choice of activation function, the number of hidden layers, the number of hidden neurons (n_h), the learning rate, and so on. Selecting a suitable activation function for the hidden layer is important if an ANN model is to learn and make sense of complicated problems. Activation functions can be classified as linear (equivalent to using no activation function) and non-linear, such as tanh, sigmoid, the rectified linear unit (ReLU) and leaky ReLU. The sigmoid and tanh functions are the most popular in classification problems. An ANN without an activation function, or with a linear one, would simply be a linear regression model, which has very limited power and cannot perform well in most cases. Non-linear activation functions perform a non-linear transformation of the input features. In this research, we create two models, both with one hidden layer, to predict bank clients' choices regarding the term deposit. The model activated by the tanh function in the hidden layer and the sigmoid function in the output layer is referred to as Model I. The other model, activated by the sigmoid function in both the hidden layer and the output layer, is referred to as Model II.
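Continuing the hypothetical sketch above, the only change between the two models would be the hidden-layer activation and the corresponding derivative used in back-propagation, for example:

# Model I: tanh in the hidden layer, sigmoid in the output layer
act_I,  grad_I  = np.tanh, lambda a: 1.0 - a ** 2      # tanh'(z) written via a = tanh(z)

# Model II: sigmoid in both the hidden layer and the output layer
act_II, grad_II = sigmoid, lambda a: a * (1.0 - a)     # sigmoid'(z) written via a = sigmoid(z)

# One illustrative gradient-descent step for Model I (X, Y, params are placeholders):
# A1, A2 = forward(X, params, hidden_act=act_I)
# backward_and_update(X, Y, A1, A2, params, lr=0.01, hidden_grad=grad_I)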
2.1 Sigmoid function
In a broad sense, a sigmoid function is a mathematical function having a characteristic "S"-shaped curve. In neural networks, the sigmoid function usually refers to the logistic function, which can be expressed as follows:
\sigma(x) = \frac{1}{1 + e^{-x}} \quad (1)
where σ(x) ranges from 0 to 1 no matter what value x takes. The argument x can be regarded as a linear function of the inputs. The first derivative of the sigmoid function is shown in Equation (2),
\sigma'(x) = \frac{e^{-x}}{(1 + e^{-x})^{2}} \quad (2)
which can be used in backward propagation. To make the relationship between the sigmoid function and its derivative clearer, Equation (2) can be rewritten as follows:
\sigma'(x) = \sigma(x)\,(1 - \sigma(x)) \quad (3)
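The agreement between Equations (2) and (3) can be checked numerically; a minimal sketch, again assuming NumPy:

import numpy as np

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
x = np.linspace(-5.0, 5.0, 11)
d_eq2 = np.exp(-x) / (1.0 + np.exp(-x)) ** 2   # derivative as written in Equation (2)
d_eq3 = sigmoid(x) * (1.0 - sigmoid(x))        # rewritten form in Equation (3)
print(np.allclose(d_eq2, d_eq3))               # prints True: the two forms agree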
2.2 Tanh function
The tanh function is also sigmoidal but, unlike the sigmoid, its output values are zero-centered, ranging from -1 to 1. It can be expressed as follows: