enerzyme.models.activation.Swish

class enerzyme.models.activation.Swish(dim_feature: int = 1, initial_alpha: float = 1.0, initial_beta: float = 1.702, learnable: bool = True)

Bases: BaseScaledTemperedActivation

Swish activation function with learnable feature-wise parameters:

f(x) = alpha * x * sigmoid(beta * x)

sigmoid(x) = 1 / (1 + exp(-x))

For beta -> 0: f(x) -> 0.5 * alpha * x

For beta -> inf: f(x) -> max(0, alpha * x)
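The two limits stated above can be checked numerically. The following is a minimal scalar sketch using only the Python standard library; it is not the enerzyme implementation, and the helper names `sigmoid` and `swish` are hypothetical, chosen for illustration. It assumes alpha > 0, which is what makes the large-beta limit equal max(0, alpha * x).

```python
import math


def sigmoid(x: float) -> float:
    """Logistic function 1 / (1 + exp(-x)), evaluated without overflow."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def swish(x: float, alpha: float = 1.0, beta: float = 1.702) -> float:
    """Scalar form of f(x) = alpha * x * sigmoid(beta * x) (illustrative helper)."""
    return alpha * x * sigmoid(beta * x)


# beta -> 0: sigmoid(beta * x) -> 0.5, so f(x) -> 0.5 * alpha * x
small_beta = swish(2.0, alpha=1.0, beta=1e-9)

# beta -> inf: sigmoid(beta * x) approaches a step function,
# so f(x) -> max(0, alpha * x) for alpha > 0
large_beta_pos = swish(2.0, alpha=1.0, beta=1e6)
large_beta_neg = swish(-2.0, alpha=1.0, beta=1e6)
```

With these inputs, `small_beta` should sit very close to 0.5 * 2.0, while `large_beta_pos` and `large_beta_neg` should sit very close to 2.0 and 0.0 respectively.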

Arguments:
dim_feature (int):

Dimension of the feature space (written as num_features in the tensor shapes below).

initial_alpha (float):

Initial “scale” alpha of the “linear component”.

initial_beta (float):

Initial “temperature” beta of the “sigmoid component”. The default value of 1.702 initializes Swish to an approximation of the Gaussian Error Linear Unit (GELU) activation function from Hendrycks, Dan, and Gimpel, Kevin. “Gaussian error linear units (GELUs).”
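To see how close the default initial_beta = 1.702 brings Swish to GELU, the sketch below compares the two on a grid. It is a standalone illustration using the standard library, not code from enerzyme; the function names `swish` and `gelu_exact` and the sampling interval [-4, 4] are assumptions made for this example. The exact GELU is taken to be x * Phi(x), with Phi the standard normal CDF.

```python
import math


def swish(x: float, alpha: float = 1.0, beta: float = 1.702) -> float:
    """f(x) = alpha * x * sigmoid(beta * x), written with a single exp call."""
    return alpha * x / (1.0 + math.exp(-beta * x))


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


# Sample the interval [-4, 4] in steps of 0.01 and record the largest absolute gap.
grid = [i / 100.0 for i in range(-400, 401)]
max_gap = max(abs(swish(x) - gelu_exact(x)) for x in grid)
print(f"max |swish - gelu| on [-4, 4]: {max_gap:.4f}")
```

On this grid the gap stays small compared with the function values, which is the sense in which the default is an approximation of GELU rather than an exact match.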

__init__(dim_feature: int = 1, initial_alpha: float = 1.0, initial_beta: float = 1.702, learnable: bool = True) → None

Initializes the Swish class.

activation_fn(x: Tensor) → Tensor

Evaluate the activation function on the input features x. Here num_features is the dimension of the feature space.

Arguments:
x (FloatTensor [:, num_features]):

Input features.

Returns:
y (FloatTensor [:, num_features]):

Activated features.
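The shape contract above (input and output both [:, num_features], with one alpha and one beta per feature) can be sketched as follows. Plain Python lists stand in for tensors here so the example runs without any dependencies; `swish_featurewise` is a hypothetical helper for illustration and does not reflect how the class stores or applies its parameters internally.

```python
import math
from typing import List


def swish_featurewise(
    x: List[List[float]], alpha: List[float], beta: List[float]
) -> List[List[float]]:
    """Apply alpha[j] * x[i][j] * sigmoid(beta[j] * x[i][j]) to every entry."""
    out = []
    for row in x:
        if len(row) != len(alpha) or len(row) != len(beta):
            raise ValueError("each row needs one value per feature")
        out.append(
            [a * v / (1.0 + math.exp(-b * v)) for v, a, b in zip(row, alpha, beta)]
        )
    return out


# Two samples, three features; each feature column has its own alpha and beta.
x = [[-1.0, 0.0, 2.0], [0.5, 1.5, -3.0]]
y = swish_featurewise(x, alpha=[1.0, 1.0, 2.0], beta=[1.702, 1.0, 0.5])
```

The output `y` has the same shape as `x`: the leading (batch) dimension is free, and the parameters vary only along the feature dimension.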

simple_activation_fn(x: Tensor) → Tensor