BatchNormTraining // Compute mean and variance from the input.
Description
Inputs
Name | Element Type | Shape |
---|---|---|
input | real | \((\bullet, C, \ldots)\) |
gamma | same as input | \((C)\) |
beta | same as input | \((C)\) |
Attributes
Name | Type | Notes |
---|---|---|
epsilon | double | Small bias added to variance to avoid division by 0. |
Outputs
Name | Element Type | Shape |
---|---|---|
 | same as input | Same as input |
batch_mean | same as input | \((C)\) |
batch_variance | same as input | \((C)\) |
The batch_mean and batch_variance outputs are computed per-channel from input.
Mathematical Definition
The axes of the input fall into two categories: positional and channel, with channel being axis 1. For each position, there are \(C\) channel values, each normalized independently.
Normalization of a channel sample is controlled by two values:
the batch_mean \(\mu\), and
the batch_variance \(\sigma^2\);
and by two per-channel inputs: the scale \(\gamma\) and the bias \(\beta\).
The values for \(\mu\) and \(\sigma^2\) come from computing the mean and variance of input separately for each channel, over all other axes.
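The per-channel computation described here can be written out explicitly. The following is a sketch using the standard batch normalization formulas rather than a statement of the exact implementation: \(x_i\) ranges over the \(N\) values of input that belong to one channel, \(y_i\) is the corresponding normalized value, and \(\epsilon\) is the small constant added to the variance. The symbols \(x_i\), \(y_i\), and \(N\) are notation introduced for this sketch, and the use of the biased (divide by \(N\)) variance is an assumption.

\[
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i, \qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N} (x_i - \mu)^2, \qquad
y_i = \gamma\,\frac{x_i - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta
\]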
Backprop
C++ Interface
class BatchNormTraining : public ngraph::op::Op
Batchnorm for training operation.
Subclassed by ngraph::op::gpu::BatchNormTrainingWithStats
Public Functions
const std::string &description() const
Get the string name for the type of the node, such as Add or Multiply. The class name must not contain spaces, as it is used for codegen.
Return
A const reference to the node’s type name
BatchNormTraining(const Output<Node> &input, const Output<Node> &gamma, const Output<Node> &beta, double epsilon)
Parameters
input: Must have rank >= 2, [., C, …]
gamma: Scaling applied to the normalized value, [C]
beta: Bias added to the scaled normalized value, [C]
epsilon: Avoids division by 0 if input has 0 variance
BatchNormTraining(double eps, const Output<Node> &gamma, const Output<Node> &beta, const Output<Node> &input)
In this version of BatchNorm:
MEAN AND VARIANCE: computed directly from the content of ‘input’.
OUTPUT VALUE: A tuple with the following structure:
[0] - The normalization of ‘input’.
[1] - The per-channel means of (pre-normalized) ‘input’.
[2] - The per-channel variances of (pre-normalized) ‘input’.
AUTODIFF SUPPORT: yes: ‘generate_adjoints(…)’ works as expected.
SHAPE DETAILS:
gamma: must have rank 1, with the same span as input’s channel axis.
beta: must have rank 1, with the same span as input’s channel axis.
input: must have rank >= 2. The second dimension represents the channel axis and must have a span of at least 1.
output[0]: shall have the same shape as ‘input’.
output[1]: shall have rank 1, with the same span as input’s channel axis.
output[2]: shall have rank 1, with the same span as input’s channel axis.
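To make the three-element output concrete, the following is a minimal standalone sketch of the computation in plain C++. It does not use the nGraph API and is not the library's implementation. The names `BatchNormTrainingResult` and `batch_norm_training_sketch` are hypothetical, the input is assumed to be stored row-major as [N, C, S] with all trailing positional axes collapsed into S, and the variance is assumed to be the biased (divide by count) estimate.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical result holder mirroring the three outputs listed above.
struct BatchNormTrainingResult
{
    std::vector<double> normalized;     // output[0]: same shape as input
    std::vector<double> batch_mean;     // output[1]: one value per channel
    std::vector<double> batch_variance; // output[2]: one value per channel
};

// Sketch only: input is row-major [n, c, s], where s is the product of all
// positional axes after the channel axis.
// Assumption: variance is the biased estimate (sum of squares / count).
BatchNormTrainingResult batch_norm_training_sketch(const std::vector<double>& input,
                                                   std::size_t n,
                                                   std::size_t c,
                                                   std::size_t s,
                                                   const std::vector<double>& gamma,
                                                   const std::vector<double>& beta,
                                                   double epsilon)
{
    if (n == 0 || c == 0 || s == 0 || input.size() != n * c * s ||
        gamma.size() != c || beta.size() != c)
    {
        throw std::invalid_argument("shape mismatch");
    }

    BatchNormTrainingResult result;
    result.normalized.resize(input.size());
    result.batch_mean.assign(c, 0.0);
    result.batch_variance.assign(c, 0.0);

    const double count = static_cast<double>(n * s);
    for (std::size_t ch = 0; ch < c; ++ch)
    {
        // Per-channel mean over the batch and positional axes.
        double sum = 0.0;
        for (std::size_t b = 0; b < n; ++b)
            for (std::size_t p = 0; p < s; ++p)
                sum += input[(b * c + ch) * s + p];
        const double mean = sum / count;

        // Per-channel variance around that mean.
        double sum_sq = 0.0;
        for (std::size_t b = 0; b < n; ++b)
            for (std::size_t p = 0; p < s; ++p)
            {
                const double d = input[(b * c + ch) * s + p] - mean;
                sum_sq += d * d;
            }
        const double variance = sum_sq / count;

        // Normalize, then apply the per-channel scale and bias.
        const double inv_std = 1.0 / std::sqrt(variance + epsilon);
        for (std::size_t b = 0; b < n; ++b)
            for (std::size_t p = 0; p < s; ++p)
            {
                const std::size_t idx = (b * c + ch) * s + p;
                result.normalized[idx] = (input[idx] - mean) * inv_std * gamma[ch] + beta[ch];
            }

        result.batch_mean[ch] = mean;
        result.batch_variance[ch] = variance;
    }
    return result;
}
```

For example, calling the sketch on the values {1, 10, 3, 30} with n = 2, c = 2, s = 1 gives per-channel means of 2 and 20 and per-channel variances of 1 and 100, which matches the rank-1, span-C shape stated for output[1] and output[2].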
void validate_and_infer_types()
Throws if the node is invalid.
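As an illustration of the kind of checks implied by the SHAPE DETAILS listed for the constructor, the sketch below validates the three input shapes and derives the three output shapes. It is a standalone approximation, not the body of validate_and_infer_types: the `Shape` alias, the `InferredShapes` struct, and the function `infer_batch_norm_training_shapes` are hypothetical names introduced here, and element-type checks are left out.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a shape type: one span per axis.
using Shape = std::vector<std::size_t>;

// Hypothetical holder for the three inferred output shapes.
struct InferredShapes
{
    Shape normalized;     // output[0]: same shape as input
    Shape batch_mean;     // output[1]: rank 1, span C
    Shape batch_variance; // output[2]: rank 1, span C
};

// Sketch only: throws std::invalid_argument when a documented shape rule is violated.
InferredShapes infer_batch_norm_training_shapes(const Shape& input,
                                                const Shape& gamma,
                                                const Shape& beta)
{
    if (input.size() < 2)
        throw std::invalid_argument("input must have rank >= 2");

    // Axis 1 is the channel axis.
    const std::size_t channels = input[1];
    if (channels < 1)
        throw std::invalid_argument("channel axis must have a span of at least 1");

    if (gamma.size() != 1 || gamma[0] != channels)
        throw std::invalid_argument("gamma must have rank 1 and span C");

    if (beta.size() != 1 || beta[0] != channels)
        throw std::invalid_argument("beta must have rank 1 and span C");

    return InferredShapes{input, Shape{channels}, Shape{channels}};
}
```

With an input shape of {2, 3, 4, 4} and gamma and beta shapes of {3}, the sketch returns {2, 3, 4, 4} for output[0] and {3} for output[1] and output[2]; a rank-1 input or a gamma whose span differs from C causes it to throw.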
-
const std::string &