Page 423 - Contributed Paper Session (CPS) - Volume 2

CPS1915 Han G. et al.
                (2)  Initializing the model’s parameters;
                (3)  Implementing the forward propagation as well as computing the cost
                   function;
                (4)  Implementing the backward propagation and using the gradient descent
                   algorithm to update the model parameters;
                (5)  Predicting the accuracy of the classification problem.
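
                Steps (2) to (5) can be sketched in plain Python as follows. This is a
            minimal illustration, not the authors' implementation: the toy data
            (logical AND), the tanh hidden layer with a sigmoid output, the
            cross-entropy cost, and every name and value (`init_params`,
            `gradient_step`, `n_h=4`, `lr=0.5`, 2000 iterations) are assumptions
            made for this example.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_params(n_x, n_h, seed=0):
    """Step (2): small random weights, zero biases."""
    rng = random.Random(seed)
    return {
        "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_x)] for _ in range(n_h)],
        "b1": [0.0] * n_h,
        "W2": [rng.uniform(-0.5, 0.5) for _ in range(n_h)],
        "b2": 0.0,
    }

def forward(p, x):
    """Step (3): hidden activations (tanh) and output (sigmoid) for one sample."""
    a1 = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
          for row, b in zip(p["W1"], p["b1"])]
    a2 = sigmoid(sum(w * a for w, a in zip(p["W2"], a1)) + p["b2"])
    return a1, a2

def cost(p, X, Y):
    """Step (3): mean cross-entropy cost over all samples."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for x, y in zip(X, Y):
        _, a2 = forward(p, x)
        total -= y * math.log(a2 + eps) + (1 - y) * math.log(1 - a2 + eps)
    return total / len(X)

def gradient_step(p, X, Y, lr):
    """Step (4): backward propagation, then one batch gradient descent update."""
    m, n_h, n_x = len(X), len(p["W2"]), len(X[0])
    dW1 = [[0.0] * n_x for _ in range(n_h)]
    db1 = [0.0] * n_h
    dW2 = [0.0] * n_h
    db2 = 0.0
    for x, y in zip(X, Y):
        a1, a2 = forward(p, x)
        dz2 = a2 - y  # gradient of the cost w.r.t. the output pre-activation
        db2 += dz2 / m
        for j in range(n_h):
            dW2[j] += dz2 * a1[j] / m
            dz1 = dz2 * p["W2"][j] * (1.0 - a1[j] ** 2)  # tanh'(z) = 1 - tanh(z)^2
            db1[j] += dz1 / m
            for i in range(n_x):
                dW1[j][i] += dz1 * x[i] / m
    for j in range(n_h):
        p["W2"][j] -= lr * dW2[j]
        p["b1"][j] -= lr * db1[j]
        for i in range(n_x):
            p["W1"][j][i] -= lr * dW1[j][i]
    p["b2"] -= lr * db2

def accuracy(p, X, Y):
    """Step (5): share of samples whose thresholded output matches the label."""
    hits = sum((forward(p, x)[1] > 0.5) == bool(y) for x, y in zip(X, Y))
    return hits / len(X)

# Toy data (logical AND), assumed for illustration only.
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
Y = [0, 0, 0, 1]

params = init_params(n_x=2, n_h=4)
initial_cost = cost(params, X, Y)
for _ in range(2000):
    gradient_step(params, X, Y, lr=0.5)
final_cost = cost(params, X, Y)
train_accuracy = accuracy(params, X, Y)
print(initial_cost, final_cost, train_accuracy)
```

                The loops are written out sample by sample so that each gradient term
            stays visible; a vectorized version would normally be preferred in practice.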
                In an ANN model, the hyperparameters include the choice of activation
            function, the number of hidden layers, the number of hidden neurons (n_h),
            the learning rate and so on. Selecting a suitable activation function for
            the hidden layer is important for an ANN model to learn complicated
            problems. Activation functions can be classified as linear (equivalent to
            having no activation function) and non-linear, such as tanh, sigmoid,
            rectified linear unit (ReLU) and leaky ReLU. The sigmoid and tanh functions
            are the most popular in classification problems. An ANN without an
            activation function, or with a linear one, would simply be a linear
            regression model, which has very limited power and cannot perform well in
            most cases. Non-linear activation functions apply a non-linear
            transformation to the input features. In this research, we create two
            models, both with one hidden layer, to predict the bank clients’ choice on
            the term deposit. The model activated by the tanh function in the hidden
            layer and the sigmoid function in the output layer is referred to as
            Model I. The other model, activated by the sigmoid function in both the
            hidden layer and the output layer, is referred to as Model II.
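
                The statement that a network with only linear activations reduces to a
            linear model can be checked numerically. The sketch below composes two
            linear layers and compares the result with a single linear layer whose
            weights are W2·W1 and whose bias is W2·b1 + b2. The weights, biases, and
            helper names (`matvec`, `matmul`, `two_linear_layers`, `one_linear_layer`)
            are made-up values for illustration and do not come from the paper.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Hypothetical parameters: 2 inputs -> 3 hidden units -> 1 output.
W1 = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
b1 = [0.5, -1.0, 0.0]
W2 = [[1.0, -1.0, 2.0]]
b2 = [0.25]

def two_linear_layers(x):
    """Hidden layer with identity (linear) activation, then a linear output."""
    h = [z + b for z, b in zip(matvec(W1, x), b1)]
    return [z + b for z, b in zip(matvec(W2, h), b2)]

# Equivalent single layer: W = W2 W1, b = W2 b1 + b2.
W = matmul(W2, W1)
b = [z + c for z, c in zip(matvec(W2, b1), b2)]

def one_linear_layer(x):
    return [z + c for z, c in zip(matvec(W, x), b)]

samples = [[0.0, 0.0], [1.0, 2.0], [-3.0, 0.5]]
max_gap = max(
    abs(u - v)
    for x in samples
    for u, v in zip(two_linear_layers(x), one_linear_layer(x))
)
print(max_gap)  # difference between the two formulations
```

                However many linear layers are stacked, the same collapse applies, which
            is why a non-linear activation is needed in the hidden layer.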

            2.1 Sigmoid function
                In a broad sense, a sigmoid function is a mathematical function with a
            characteristic “S”-shaped curve. In neural networks, the sigmoid function
            usually refers to the logistic function, which can be expressed as
            follows:
                            $$\sigma(z) = \frac{1}{1 + e^{-z}} \tag{1}$$
            where $\sigma(z)$ lies between 0 and 1 whatever value $z$ takes, and $z$
            can be regarded as a linear function of the inputs. The first derivative of
            the sigmoid function is shown in Equation (2),
                            $$\sigma'(z) = \frac{e^{-z}}{(1 + e^{-z})^{2}}, \tag{2}$$
            which can be used in backward propagation. To make the relationship
            between the sigmoid function and its derivative clearer, Equation (2) can
            be rewritten as follows:
                            $$\sigma'(z) = \sigma(z)\,(1 - \sigma(z)) \tag{3}$$
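
                The equivalence of Equations (2) and (3) follows from the fact that one
            minus the logistic function equals e^{-z}/(1 + e^{-z}). As a quick check,
            the sketch below evaluates both forms and compares them with a central
            finite-difference estimate. The function names, the argument name `z`,
            the test points, and the step size `h` are assumptions made for this
            example.

```python
import math

def sigmoid(z):
    """Logistic function, Equation (1)."""
    return 1.0 / (1.0 + math.exp(-z))

def dsigmoid_eq2(z):
    """Derivative in the form of Equation (2)."""
    return math.exp(-z) / (1.0 + math.exp(-z)) ** 2

def dsigmoid_eq3(z):
    """Derivative in the form of Equation (3): sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

def dsigmoid_numeric(z, h=1e-6):
    """Central finite-difference approximation of the derivative."""
    return (sigmoid(z + h) - sigmoid(z - h)) / (2.0 * h)

zs = [-4.0, -1.0, 0.0, 1.0, 4.0]
gap_eq2_eq3 = max(abs(dsigmoid_eq2(z) - dsigmoid_eq3(z)) for z in zs)
gap_numeric = max(abs(dsigmoid_eq3(z) - dsigmoid_numeric(z)) for z in zs)
print(gap_eq2_eq3, gap_numeric)
```

                Equation (3) is convenient in backward propagation because the forward
            pass has already computed the sigmoid output, so the derivative needs no
            further exponentials.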

            2.2 Tanh function
                The tanh function is also sigmoidal, but unlike the sigmoid function, its
            output values are zero-centered, ranging from -1 to 1. It can be expressed
            as follows:


                                                               412 | I S I   W S C   2 0 1 9