enerzyme.models.activation.Swish
- class enerzyme.models.activation.Swish(dim_feature: int = 1, initial_alpha: float = 1.0, initial_beta: float = 1.702, learnable: bool = True)
Bases: BaseScaledTemperedActivation
Swish activation function with learnable feature-wise parameters:
f(x) = alpha*x * sigmoid(beta*x)
sigmoid(x) = 1/(1 + exp(-x))
For beta -> 0 : f(x) -> 0.5*alpha*x
For beta -> inf: f(x) -> max(0, alpha*x)
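The formula and its two limits can be checked with a few lines of plain Python. This is a minimal scalar sketch of the math only, not the enerzyme implementation: the helper names `sigmoid` and `swish` are hypothetical, and the feature-wise, learnable aspect of the class is left out.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + exp(-z)), written to avoid overflow."""
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # underflows to 0.0 for very negative z
    return e / (1.0 + e)


def swish(x: float, alpha: float = 1.0, beta: float = 1.702) -> float:
    """f(x) = alpha * x * sigmoid(beta * x), as in the docstring above."""
    return alpha * x * sigmoid(beta * x)


# Small beta: the sigmoid is close to 0.5, so f(x) approaches 0.5 * alpha * x.
near_linear = swish(2.0, alpha=1.0, beta=1e-9)

# Large beta: the sigmoid approaches a step, so f(x) approaches max(0, alpha * x).
near_relu_pos = swish(2.0, alpha=1.0, beta=1e3)
near_relu_neg = swish(-2.0, alpha=1.0, beta=1e3)
```

With these inputs, `near_linear` sits next to 0.5 * 2.0, while the two large-beta values sit next to max(0, 2.0) and max(0, -2.0) respectively.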
- Arguments:
- dim_feature (int):
Dimension of the feature space.
- initial_alpha (float):
Initial “scale” alpha of the “linear component”.
- initial_beta (float):
Initial “temperature” beta of the “sigmoid component”. The default value of 1.702 initializes Swish to an approximation of the Gaussian Error Linear Unit (GELU) activation function from Hendrycks, Dan, and Gimpel, Kevin. “Gaussian error linear units (GELUs).”
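The GELU remark on the default temperature can be illustrated numerically. The sketch below compares the exact GELU, x * Phi(x) with Phi the standard normal CDF, against alpha*x * sigmoid(beta*x) at alpha = 1.0 and beta = 1.702. The function names and the evaluation grid are assumptions made for this illustration and are not part of enerzyme.

```python
import math


def gelu(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi expressed through the error function."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def swish(x: float, alpha: float = 1.0, beta: float = 1.702) -> float:
    """alpha * x * sigmoid(beta * x); safe for the moderate |x| used here."""
    return alpha * x / (1.0 + math.exp(-beta * x))


# Evaluate both functions on a grid over [-5, 5] with step 0.01.
grid = [i / 100.0 for i in range(-500, 501)]

# Largest absolute deviation between the two curves on that grid.
max_gap = max(abs(gelu(x) - swish(x)) for x in grid)
print(f"max |GELU - Swish(beta=1.702)| on [-5, 5]: {max_gap:.4f}")
```

The gap stays at a few hundredths or less across the grid, which is the sense in which the default initialization approximates GELU. Once alpha and beta are trained, the activation is free to move away from that starting point.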
- __init__(dim_feature: int = 1, initial_alpha: float = 1.0, initial_beta: float = 1.702, learnable: bool = True) -> None
Initializes the Swish class.