AlibiBias layer
- Original Link : https://keras.io/api/keras_nlp/modeling_layers/alibi_bias/
- Last Checked at : 2024-11-26
AlibiBias
class
keras_nlp.layers.AlibiBias(alibi_bias_max=8, **kwargs)
A layer that adds the alibi bias to attention scores.
This layer adds the alibi bias to the attention scores. Alibi bias is a linear, non-learned bias. Defined and formalized in Train Short, Test Long: Attention with Linear Biases Enables Input Length Extrapolation.
This layer takes as input the attention scores. and returns the attention scores after adding the alibi bias to it. The output will have the same shape as the input.
Arguments
- alibi_bias_max: int. This value will be used to compute the slope of
each head. The heads’ slopes are a geometric sequence that starts at
2**(-alibi_bias_max/num_heads)
and uses that same value as its ratio. Defaults to 8. - **kwargs: other keyword arguments passed to
keras.layers.Layer
, includingname
,trainable
,dtype
etc.
Call arguments
- attention_scores: The result of multipying the query and the key of the
multi-head attention layer of the transformer to add alibi bias to
it. With shape
(batch_size, num_heads, query_length, key_length)
.
Example
query_length = 10
key_length = 10
num_heads = 4
batch_size = 2
hidden_dim = 8
# Create new alibi layer.
alibi_layer = keras_hub.layers.AlibiBias()
query = np.zeros((batch_size, num_heads, query_length, hidden_dim))
key = np.zeros((batch_size, num_heads, hidden_dim, key_length))
attention_scores = keras.ops.matmul(query, key)
# Add alibi bias to attention scores.
attention_scores = alibi_layer(attention_scores)
References