Basic Data Structures¶

Variable Metatypes¶

class pysigma.defs.VariableMetatype(value)[source]¶

Enum class for Variable metatypes.

Indexing = 0¶

Relational = 1¶

Random = 2¶

Parameter = 3¶

Variable¶

class pysigma.defs.Variable(name, metatype, size, value_constraints=None)[source]¶

Variable represented by variable nodes in the graphical architecture. Stores information such as a variable’s meta-type, dimension size, and value constraints (if the variable has Random meta-type).

The equality testing is used for matching variables in Alpha-Beta subgraphs. Two variables are equal if and only if ALL of their fields are equal.

Parameters

name (str) – The name of the variable.
metatype ({VariableMetatype.Indexing, VariableMetatype.Relational, VariableMetatype.Random, VariableMetatype.Parameter}) – The meta-type of this variable.
size (int) – The size of the message dimension this variable corresponds to.
value_constraints (iterable of torch.distributions.constraints.Constraint) – The set of value constraints that determine the value range (support) of this variable. Specify if and only if variable’s metatype is VariableMetatype.Random.

name¶

Variable name.

Type: str

metatype¶

The meta-type of this variable.

Type: {VariableMetatype.Indexing, VariableMetatype.Relational, VariableMetatype.Random, VariableMetatype.Parameter}

size¶

The size of the message dimension this variable corresponds to.

Type: int

constraints¶

The set of value constraints that determine the value range (support) of this variable.

Type: set of torch.distributions.constraints.Constraint

Message Types¶

class pysigma.defs.MessageType(value)[source]¶

Enum class to represent message types

The True-valued boolean relationship between types, using the in operator:

Undefined in Undefined == Undefined in Parameter == Undefined in Particles == Undefined in Both == True
Parameter in Parameter == Parameter in Both == True
Particles in Particles == Particles in Both == True

All other relations are False.
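The relations can be reproduced with a small stand-alone enum. This is only a sketch: `MsgType` is a hypothetical stand-in, not PySigma's class. It assumes that Undefined is contained in every type, that Parameter and Particles are each contained in themselves and in Both, and that every relation not listed (including Both in Both) is False.

```python
from enum import Enum


class MsgType(Enum):
    """Hypothetical stand-in for MessageType, used only to illustrate `in`."""
    Undefined = 0
    Parameter = 1
    Particles = 2
    Both = 3

    def __contains__(self, item):
        # Evaluated for the expression `item in self`.
        if item is MsgType.Undefined:
            return True        # Undefined is contained in every type
        if item is MsgType.Both:
            return False       # not among the listed True relations
        return self is item or self is MsgType.Both
```

With this definition, `MsgType.Parameter in MsgType.Both` evaluates to True while `MsgType.Parameter in MsgType.Particles` evaluates to False.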

Undefined = 0¶

Parameter = 1¶

Particles = 2¶

Both = 3¶

Message¶

The Message class constitutes the most fundamental data structure in PySigma Graphical Architecture.

class pysigma.defs.Message(msg_type, batch_shape=torch.Size([]), param_shape=torch.Size([]), sample_shape=torch.Size([]), event_shape=torch.Size([]), parameter=0, particles=None, weight=1, log_densities=None, device='cpu', **kwargs)[source]¶

Message to be propagated between nodes in the graphical architecture.

The Message class is the most fundamental data structure in PySigma that carries the knowledge of a batch of distributions to be processed by downstream graphs.

Parameters

msg_type ({MessageType.Undefined, MessageType.Parameter, MessageType.Particles, MessageType.Both}) – The type of this message.
batch_shape (torch.Size, optional) – The size of the batch dimensions. Must be specified, with a length of at least 1, unless the message represents an identity. See the Notes section below for more details on identity messages.
param_shape (torch.Size, optional) – The size of the parameter dimension of parameter. Must be specified, with a length of exactly 1, if msg_type is MessageType.Parameter. Default to an empty shape torch.Size([]).
sample_shape (torch.Size, optional) – The size of the sample dimensions of each particle tensor in particles respectively in order. Must specify if message type is MessageType.Particles, with a length equal to the number of particle tensors. Default to an empty shape torch.Size([]).
event_shape (torch.Size, optional) – The size of the event dimensions of each particle tensor in particles respectively in order. Must specify if message type is MessageType.Particles, with a length equal to the number of particle tensors. Default to an empty shape torch.Size([]).
parameter (torch.Tensor or an int of 0, optional) – The parameter tensor to the batch of distributions this message is encoding. Must specify if the message type is MessageType.Parameter. A torch.Tensor of shape batch_shape + param_shape if the parameters do not represent the identity in the parameter vector space. Alternatively, can be an int of 0 to specify the identity, in which case it is not necessary to specify batch_shape. Default to an int of 0.
particles (iterable of torch.Tensor, optional) – The list of particles representing events w.r.t. each random variable respectively whose collective joint distribution this message is encoding. Must specify if the message type is MessageType.Particles, unless weight is 1, in which case the message represents a universal identity in the particles space. The jth entry of the iterable should have shape sample_shape[j] + event_shape[j].
weight (torch.Tensor or an int of 1, optional) – The importance weight tensor that, when multiplied with the exponential of the cross product of the log sampling densities in log_densities, yields the pdf of each combined particle w.r.t. the target distribution that this message is encoding. Must specify if the message type is MessageType.Particles. If the weights are non-uniform, must be a positively valued tensor of shape batch_shape + sample_shape. The supplied tensor will be normalized during initialization so that it sums to 1 across the subspace spanned by the sample dimensions. Alternatively, can be an int of 1 to specify the identity (uniform weight), in which case it is not necessary to specify batch_shape. Default to 1.
log_densities (iterable of torch.Tensor, optional) – The jth entry in the iterable represents the log pdf of the jth particle in particles w.r.t. the (marginal) sampling distribution from which the jth particle was originally drawn. Must specify if the message type is MessageType.Particles, unless weight is 1, in which case the message represents a universal identity in the particles space. The jth entry must have shape sample_shape[j].
device (str, optional) – The device where the tensor components are hosted. .to(device) will be called on the tensor arguments during initialization. Defaults to ‘cpu’.
**kwargs – Other keyword arguments that specify special attributes of the message. Will be deep copied when the message is cloned. Note that any dist_info required by DistributionServer regarding the specification of the parameters should be associated with the key "dist_info".
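The shape requirements above can be summarized in a few lines of plain Python. This is a sketch rather than PySigma code: `expected_shapes` is a hypothetical helper, and plain tuples stand in for torch.Size.

```python
def expected_shapes(batch_shape, param_shape, sample_shape, event_shape):
    """Tensor shapes implied by the constructor's shape arguments (illustrative)."""
    return {
        # parameter: batch dimensions followed by the single parameter dimension
        "parameter": tuple(batch_shape) + tuple(param_shape),
        # j-th particle tensor: (sample_shape[j], event_shape[j])
        "particles": [(s, e) for s, e in zip(sample_shape, event_shape)],
        # weight: batch dimensions followed by all sample dimensions
        "weight": tuple(batch_shape) + tuple(sample_shape),
        # j-th log density tensor: (sample_shape[j],)
        "log_densities": [(s,) for s in sample_shape],
    }


shapes = expected_shapes(batch_shape=(4,), param_shape=(2,),
                         sample_shape=(10, 20), event_shape=(1, 3))
```

Here two random variables with 10 and 20 particles give a weight tensor of shape (4, 10, 20), while the particle tensors themselves carry no batch dimension.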

type¶

Message type.

Type: {MessageType.Undefined, MessageType.Parameter, MessageType.Particles, MessageType.Both}

b_shape¶

Batch shape.

Type: torch.Size

p_shape¶

Parameter shape.

Type: torch.Size

s_shape¶

Sample shape.

Type: torch.Size

e_shape¶

Event shape.

Type: torch.Size

parameter¶

Parameter tensor

Type: torch.Tensor or None

particles¶

Tuple of particle value tensors

Type: list of torch.Tensor or None

weight¶

Particle weight tensor

Type: torch.Tensor or None

log_densities¶

Tuple of particles log sampling tensors

Type: list of torch.Tensor or None

num_rvs¶

The number of random variables. Inferred from the length of particles.

Type: int

device¶

The device where the tensor components are hosted.

Type: str

attr¶

Miscellaneous optional attributes, specified by **kwargs in the constructor.

Type: dict

Notes

In PySigma Graphical Architecture, a message can represent not only a single joint distribution w.r.t. multiple random variables, but a batch of such joint distribution instances. The distribution instances in the batch are mutually independent, but may or may not be identically distributed. This batch is managed and indexed by the batch dimensions, specified by batch_shape.

Depending on how each distribution instance is represented, a message can be roughly categorized into two types: Parameter type or Particles type.

Parameter type: a message of this type encodes a batch of distributions by holding their parameter tensors. The semantics of the parameters depends on the context, e.g. whether they are natural parameters to exponential family distributions or conventional parameters to PyTorch distribution class. For the latter one, the semantics may even be distribution class dependent.

Specifying the parameter argument only in the constructor is sufficient in terms of the message contents.
Particles type: a message of this type encodes a batch of distributions by a particle list, with the particles importance weighted to correctly reflect their pdf w.r.t. each target distribution in the batch. Conceptually, each entry in the particle list is a 3-tuple (x, w_x, log_p(x)), where x is the event value, log_p(x) is the log pdf of x w.r.t. its sampling distribution P(x), and w_x is the ratio of Q(x), the target distribution pdf, over P(x). Therefore, the target pdf of x can be recovered by:
```
Q(x) = w_x * exp(log_p(x))
log Q(x) = log(w_x) + log_p(x)
```
Note that a message uses a single list of particles to encode and approximate each and every distribution in the batch. In other words, the set of event values used to represent each distribution instance is the same, but the importance weights assigned to each event value by different distribution instances are different. This is the reason that weight tensor should include batch dimensions, whereas particle tensors in particles and log sampling density tensors in log_densities should not.

When there are multiple random variables, each distribution instance in the batch is a joint distribution over all random variables. In this case, each entry in the provided particles holds events w.r.t. one random variable only. To represent the joint distributions, a list of joint particles is formed by concatenating the event tensors in particles combinatorially, that is, by taking their tensor product. Accordingly, the log sampling density vectors in log_densities are combined by cross product to form a higher-dimensional sampling density tensor. In this way, the joint particles are effectively arranged in a lattice in the joint event space, which eases marginalization: summarizing over one dimension has the effect of marginalizing over the corresponding random variable.

To support the above semantics and computations, all of the arguments particles, weight, and log_densities must be specified in the constructor.
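The lattice arrangement described above can be illustrated with plain Python lists. This is a minimal sketch with made-up numbers, not PySigma code: nested lists stand in for tensors, a single distribution instance is used (no batch dimension), and the joint log sampling density is assumed to be the outer sum of the marginal log densities.

```python
import math

# Marginal log sampling densities for two random variables (hypothetical values).
log_px = [math.log(0.5), math.log(0.3), math.log(0.2)]   # 3 particles of X
log_py = [math.log(0.6), math.log(0.4)]                  # 2 particles of Y

# Joint log sampling density on the 3 x 2 lattice: outer sum of the marginals.
joint_log_p = [[lx + ly for ly in log_py] for lx in log_px]

# Importance weights of one distribution instance, normalized over the lattice.
raw_w = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
total = sum(sum(row) for row in raw_w)
weight = [[w / total for w in row] for row in raw_w]

# Recover the target log pdf of each joint particle: log Q = log w + log p.
log_q = [[math.log(w) + lp for w, lp in zip(w_row, lp_row)]
         for w_row, lp_row in zip(weight, joint_log_p)]

# Summing the weight lattice over the Y dimension leaves a weight vector over X.
weight_x = [sum(row) for row in weight]
```

Because the joint particles sit on a regular lattice, dropping Y only requires a reduction over the second dimension of the weight lattice.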

A message can encode both types of contents, in which case the message type is MessageType.Both.

Both types of messages are assumed to reside in certain vector space, and thus the appropriate arithmetic operations – Addition and Scalar Multiplication – are defined and implemented:

For Parameter messages,
- Addition operation is defined as arithmetic addition on the parameter tensors.
- Scalar multiplication is defined as arithmetic scalar multiplication with the parameter tensors.
- 0 is treated as the identity element.
For Particles messages:
- The following two operations are defined as operations on the particle weights, and are meaningful only for Particles messages that share the same particle values and the same sampling log densities of the particles, except for the message that represents the particle identity (see more below). In addition, results from these two operations are normalized so that the weight tensor sums to 1 across the sample dimensions.
- Addition operation is defined as element-wise multiplication of particle weights tensors, up to a normalization factor.
- Scalar Multiplication is defined as taking elements of the particle weights tensor to the power of the scalar, up to a normalization factor.
- 1 is treated as the identity element for the operations.
- Note that it is provably correct that the weights with the above operations form a vector space. The proof idea is to consider the log quotient space over one dimension, which reduces to standard real space with one less dimension.
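The weight arithmetic and the log quotient idea can be checked numerically. The following is a sketch in plain Python, not PySigma's implementation; `add`, `scalar_mul` and `log_quotient` are hypothetical helper names, and a single weight vector without batch dimensions is assumed.

```python
import math


def normalize(w):
    total = sum(w)
    return [x / total for x in w]


def add(w1, w2):
    """Message addition on weights: element-wise product, renormalized."""
    return normalize([a * b for a, b in zip(w1, w2)])


def scalar_mul(w, c):
    """Scalar multiplication on weights: element-wise power, renormalized."""
    return normalize([a ** c for a in w])


def log_quotient(w):
    """Coordinates log(w_i / w_last), with the last (always zero) entry dropped."""
    return [math.log(a / w[-1]) for a in w[:-1]]


a = normalize([1.0, 2.0, 3.0])
b = normalize([4.0, 1.0, 5.0])
identity = normalize([1.0, 1.0, 1.0])   # uniform weights act as the identity
```

In the log quotient coordinates, `add` becomes ordinary vector addition and `scalar_mul` becomes ordinary scaling, which is why a weight vector with n entries behaves like a point in (n - 1)-dimensional real space.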

Regarding the identity messages:

The MessageType.Parameter type identity message is one whose parameter field is 0.
The MessageType.Particles type identity message is one whose weight field is 1, regardless of its particle values particles or sampling log densities log_densities.
The MessageType.Both type identity message is the composition of the above two identity messages.

Accordingly, the ‘+’ and ‘*’ operators are overloaded according to the specifications above.

property isid¶: Whether self is an identity message.

property shape¶: Shape of the message. Equivalent to calling size()

__add__(other)[source]¶

Overloads the addition operation +.

Implements the semantics of addition operation as in vector spaces. The computational operations used to implement the semantics are different for different message contents. See Message class notes on arithmetic structures for more details.

Only messages with compatible types can be added. This means a MessageType.Parameter type message can only be added with one of type MessageType.Parameter or MessageType.Both, and similarly a MessageType.Particles type message can only be added with one of type MessageType.Particles or MessageType.Both. MessageType.Both type message can be added with any other type except MessageType.Undefined, and in any case a MessageType.Undefined type message cannot be added.

There are more restrictions for MessageType.Particles type messages. Messages of such type can only be added together if their particles and log_densities fields are equal, unless one (or both) is the identity Particles message.

If two messages with compatible but not identical types are added together, the resulting message will have the smaller type, meaning only the common components will be added. For example, the result of adding a MessageType.Parameter type message with a MessageType.Both type message is a MessageType.Parameter type message. But if two MessageType.Both type messages are added, the resulting message will also have type MessageType.Both, containing both parameter and particles components.

Note that the identity messages (Parameter message with parameter == 0, Particles message with weight == 1, or Both message with both conditions) are assumed universal, i.e., they can be added with any other message that has a compatible type but may or may not have a compatible shape. The resulting message will be the other message itself. If both self and other are identity messages, the returning message will be the identity message with the larger type.
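The type rule for adding two non-identity messages can be written as a small function. This sketch uses plain strings for the type names, and `added_type` is a hypothetical helper; it raises ValueError for illustration, whereas the documented method raises AssertionError.

```python
def added_type(t1, t2):
    """Type of `m1 + m2` for non-identity messages, with types given as strings."""
    if "Undefined" in (t1, t2):
        raise ValueError("an Undefined message cannot be added")
    if t1 == t2:
        return t1
    if t1 == "Both":
        return t2      # only the components shared with t2 are added
    if t2 == "Both":
        return t1      # only the components shared with t1 are added
    raise ValueError(f"incompatible message types: {t1} and {t2}")
```

For example, `added_type("Parameter", "Both")` returns `"Parameter"`, while mixing `"Parameter"` with `"Particles"` is rejected.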

Parameters

other (Message) – The other message instance to be added together with self. It should have a compatible message type with self.

Returns

The new message as a result of the summation.

Return type

Message

Raises

AssertionError – If other’s message type is incompatible with self.
AssertionError – If either self or other’s message type is MessageType.Undefined.
AssertionError – If contents of self and other have conflicting shapes, when both self and other are not identity messages.
AssertionError – If self and other have particles message contents to be added, but their particle values do not match, or their log sampling density tensors do not match.

Warning

The attribute dictionaries self.attr and other.attr from the two messages will be merged. However, if there exist conflicting entries, some would be overwritten by the other. In general, it is the last operand in the expression, i.e., other, whose attribute entries persist, but this behavior should not be counted on.

__iadd__(other)[source]¶: Overloads self-addition operator +=.

See also

__add__()

__mul__(other)[source]¶

Overloads multiplication operator *.

Implements the semantics of scalar multiplication operation as in vector spaces. The computational operations used to implement the semantics are different for different message contents. See Message class notes regarding arithmetic structures for more details.

Message of type MessageType.Undefined cannot be scalar multiplied.

If self is an identity message, returns self unchanged directly.

Parameters: other (int, float, or torch.Tensor) – The scalar multiplier. If a torch.Tensor, it can be a singleton tensor representing a single scalar, or a tensor of shape batch_shape representing a batch of scalars that assigns a different scalar value to each distribution instance in the batch.
Returns: The new message as a result of the scalar multiplication.
Return type: Message
Raises: AssertionError – Attempting to scalar multiply a message of type MessageType.Undefined.

__imul__(other)[source]¶: Overloads self-multiplication operator *=.

See also

__mul__()

static compose(msg1, msg2)[source]¶

Composes a MessageType.Particles message with a MessageType.Parameter message to return a MessageType.Both message that contains all components from both messages.

Neither msg1 nor msg2 can be an identity message.

Parameters

msg1 (Message) – The first message to be composed. Its type must be either MessageType.Particles or MessageType.Parameter, but must be different from that of msg2.
msg2 (Message) – The second message to be composed. Its type must be either MessageType.Particles or MessageType.Parameter, but must be different from that of msg1.

Returns

A message with type MessageType.Both that contains all components from both msg1 and msg2.

Return type

Message

Raises

AssertionError – If msg1 and msg2 have conflicting attributes, such as batch shape.

Warning

The attribute dictionaries msg1.attr and msg2.attr will be merged. If there exist conflicting entries (key-value pairs with the same key but different values), those from msg2.attr will overwrite those from msg1.attr.
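The overwrite rule matches ordinary Python dictionary merging. In the sketch below the dictionaries and their contents are made up for illustration; only the key "dist_info" is taken from this page.

```python
# Hypothetical attribute dictionaries of two messages being composed.
attr1 = {"dist_info": {"family": "normal"}, "tag": "from msg1"}
attr2 = {"tag": "from msg2", "step": 3}

# Later entries win on key conflicts, so msg2's value for "tag" persists.
merged = {**attr1, **attr2}
```

If an entry from msg1 must survive, give it a key that msg2 does not use before composing.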

static identity(msg_type=<MessageType.Both: 3>)[source]¶

Returns a minimum identity message (without declaration of shapes) of the specified type.

Parameters: msg_type (MessageType) – Target message type. Defaults to MessageType.Both.
Returns: The identity message.
Return type: Message

size()[source]¶

Returns a tuple of the message’s shapes: (batch_shape, param_shape, sample_shape, event_shape)

Returns: A tuple of the message’s shapes
Return type: tuple of torch.Size

same_particles_as(other)[source]¶

Check if self has the same particles as the other message. This includes checking the list of particle value tensors as well as the list of particle log sampling density tensors.

Note

Will always return False if self or other is not a Particles message.

Note

Will always return True if both self and other are Particles messages and one (or both) is the identity.

Parameters: other (Message) – The other message.
Returns: True if self has the same particles as the other message.
Return type: bool

diff_param(other)[source]¶

Compute the difference between the parameters of self and other.

Returns a batch average L2 distance between the two parameters. Since parameters have shape batch_shape + param_shape, with param_shape of exactly length 1, the L2 distance is calculated along dim=-1.

Parameters: other (Message) – The other message.
Returns: The batch average L2 distance.
Return type: torch.Tensor or 0
Raises: AssertionError – If self and/or other do not have parameters.

See also

The L2 norm computation: torch.norm().
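The batch average L2 distance can be sketched without tensors. This is an illustration in plain Python, not the library's implementation; `batch_avg_l2` is a hypothetical name and nested lists stand in for parameter tensors with one batch dimension.

```python
import math


def batch_avg_l2(p1, p2):
    """Mean over the batch of the L2 distance taken along the last dimension."""
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(row1, row2)))
             for row1, row2 in zip(p1, p2)]
    return sum(dists) / len(dists)


# Two parameter "tensors" with batch_shape (2,) and param_shape (2,).
p_self = [[0.0, 0.0], [1.0, 1.0]]
p_other = [[3.0, 4.0], [1.0, 1.0]]
distance = batch_avg_l2(p_self, p_other)
```

The first batch entry contributes a distance of 5 and the second contributes 0, so the batch average is 2.5.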

diff_weight(other)[source]¶

Compute the difference between the weight of self and other.

Returns a mean element-wise absolute value difference between the two weight tensors.

Note that calculating the difference of weights only makes sense if both messages have the same particle value tensors and particle log sampling density tensors. Therefore, same_particles_as() will first be called for a sanity check. An assertion error will be raised if same_particles_as() returns False.

Parameters

other (Message) – The other message

Returns

The batch average cosine similarity.

Return type

torch.Tensor or 0

Raises

AssertionError – If self and/or other do not have particles.
AssertionError – If self does not have the same particle values and log sampling densities as other.

See also

The cosine similarity computation: torch.nn.functional.cosine_similarity().

reduce_type(msg_type)[source]¶

Returns a reduced msg_type type message from self, where components in self that are irrelevant w.r.t. msg_type are removed, and only relevant components are retained and cloned.

The target message type must be either MessageType.Parameter or MessageType.Particles.

Parameters: msg_type ({MessageType.Parameter, MessageType.Particles}) – The message type of the returned reduced message.
Returns: The reduced message from self.
Return type: Message
Raises: AssertionError – If target message type specified by msg_type is not compatible with self type.

clone()[source]¶

Return a cloned message from self.

Guarantees that every content is deep-copied. Tensors will be cloned and dictionaries will be deep-copied.

Returns: A cloned and deep-copied message of self.
Return type: Message

to_device(device)[source]¶

Returns a version of self where the tensor components are hosted on the specified device.

Per PyTorch design, the original tensors will be returned without copying if target device is the current device, otherwise a copied version will be returned.

Note

Any tensor stored in the optional attribute dictionary self.attr will NOT be inspected or moved to the target device.

Parameters: device (str) – The target device
Returns: self on target device.
Return type: Message

batch_permute(target_dims)[source]¶

Returns a permuted message whose tensor contents that include batch dimensions (e.g. the parameter tensor and the particle weight tensor) are permuted w.r.t. target_dims.

The dimensions specified in target_dims are relative to the batch dimensions only. Its values should be in range [-len(batch_shape), len(batch_shape) - 1]

contiguous() will be called so that the returning message’s tensor contents are contiguous

Parameters: target_dims (list of ints) – The desired ordering of the target batch dimensions. Must have the same length as the message’s batch shape.
Returns: The permuted message from self.
Return type: Message

See also

This method is a mimic of torch.Tensor.permute().

batch_unsqueeze(dim)[source]¶

Returns a new message with a dimension of size one inserted at the target batch dimension specified by dim.

The target dimension is relative to the batch dimensions only. It should be in range [-len(batch_shape) - 1, len(batch_shape)].

Parameters: dim (int) – The position where the new singleton dimension (a dim of size 1) is to be inserted.
Returns: The unsqueezed message from self.
Return type: Message

See also

This method is a mimic of torch.unsqueeze()

batch_index_select(dim, index)[source]¶

Returns a message that is a concatenation of the slices from self along the dim batch dimension and indexed by index.

In other words, along dim dimension, the i th slice of the returned message is the index[i] th slice of self. Consequently, the size of the dim dimension of the returned message equals the length of index array.

A dim value within the range [-len(batch_shape), len(batch_shape) - 1] can be used. Note that dim is relative to the batch dimension only.

Parameters

dim (int) – The dimension along which entries will be selected according to index.
index (torch.LongTensor) – The array of indices of entries along dim to be selected. Entries must be non-negative.

Returns

The returned index-selected and concatenated message from self.

Return type

Message

See also

This method is a mimic of torch.index_select()

batch_index_put(dim, index)[source]¶

Returns a message whose entries along the dimension dim are slices from self message and are put into the positions along the axis specified by indices in index.

In other words, along dim dimension, the index[i] th slice of the returned message is the i th slice of self. Consequently, the size of the dim dimension of the returned message equals the maximum value in the index array plus 1.

For slices in the new message not referenced by index, they will be filled with identity values. For parameter tensor, the identity value is 0, and for particle weight tensor, the identity value is a positive uniform constant such that the sum across the sample dimensions is 1.

A dim value within the range [-len(batch_shape), len(batch_shape) - 1] can be used. Note that dim is relative to the batch dimension only.

Parameters

dim (int) – The dimension along which entries will be put according to index.
index (torch.LongTensor) – The array of indices of entries along dim to be put. Entries must be non-negative.

Returns

The returned index-put message of self.

Return type

Message

See also

batch_index_select(): The inverse of batch_index_put(). There is no direct counterpart to this method in PyTorch.
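The relationship between the two indexing methods can be sketched on a single batch dimension. This is plain Python with hypothetical helper names, not PySigma code; it assumes 0-based indices, so the output of the put operation needs max(index) + 1 slots, and it uses 0 as the identity value (as for parameter tensors).

```python
def index_select(slices, index):
    """out[i] = slices[index[i]]"""
    return [slices[j] for j in index]


def index_put(slices, index, identity=0.0):
    """out[index[i]] = slices[i]; unreferenced positions hold the identity."""
    out = [identity] * (max(index) + 1)
    for i, j in enumerate(index):
        out[j] = slices[i]
    return out


src = [1.5, 2.5, 3.5]      # three slices along the chosen batch dimension
index = [4, 0, 2]
put = index_put(src, index)
recovered = index_select(put, index)
```

Selecting with the same index array after putting returns the original slices, which is the sense in which the two methods are inverses.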

batch_diagonal(dim1=0, dim2=1)[source]¶

Returns a partial view of self with its diagonal elements with respect to dim1 and dim2 appended as a dimension at the end of the shape.

dim values in the range [-len(batch_shape), len(batch_shape) - 1] can be used. Note that dim1 and dim2 are relative to the batch dimensions only. The appended dimension will be placed as the last batch dimension, but before any sample or param dimensions.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters

dim1 (int, optional) – The first dimension of the 2D subspace where diagonal entries will be taken. Defaults to 0, the first batch dimension.
dim2 (int, optional) – The second dimension of the 2D subspace where diagonal entries will be taken. Defaults to 1, the second batch dimension.

Returns

The diagonalized message of self.

Return type

Message

See also

This method is a mimic of torch.diagonal(), with offset defaulting to 0.

batch_diag_embed(diag_dim=- 1, target_dim1=- 2, target_dim2=- 1)[source]¶

Returns a message whose diagonals of certain 2D planes (dimensions specified by target_dim1 and target_dim2) are filled by slices of self along the dimension specified by diag_dim.

The last dimension of self is chosen by default as the diagonal entries to be filled, and the last two dimensions of the new message are chosen by default as the 2D planes where the diagonal entries will be filled in.

The 2D planes will be shaped as square matrices, with the size of each dimension matching the size of the diag_dim dimension in self.

The length of returned message’s batch shape will be the length of original message’s batch shape plus 1.

For slots not on the diagonals of the resulting message, they will be filled with identity values. For parameter tensor, the identity value is 0, and for particle weight tensor, the identity value is a positive uniform constant such that the sum across the sample dimensions is 1.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters

diag_dim (int, optional) – The dimension of self along which slices will be selected. Defaults to -1.
target_dim1 (int, optional) – The first dimension of the target 2D planes in the target message. Defaults to -2.
target_dim2 (int, optional) – The second dimension of the target 2D planes in the target message. Defaults to -1.

Returns

The diagonally embedded message from self.

Return type

Message

See also

This method is a mimic of torch.diag_embed(), with offset defaulting to 0, plus an additional diag_dim argument.
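For a message with a single batch dimension, the embedding and its counterpart batch_diagonal() reduce to the following sketch. It is written in plain Python with hypothetical helper names; one scalar per batch entry stands in for a parameter slice, and 0 is used as the identity fill value.

```python
def diag_embed(values, identity=0.0):
    """Place values[i] at position (i, i) of a square grid; fill the rest with identity."""
    n = len(values)
    return [[values[i] if i == j else identity for j in range(n)]
            for i in range(n)]


def diagonal(grid):
    """Collect the entries at positions (i, i) of a square grid."""
    return [grid[i][i] for i in range(len(grid))]


params = [1.0, 2.0, 3.0]       # one parameter value per batch entry
embedded = diag_embed(params)  # batch shape grows from (3,) to (3, 3)
```

Taking the diagonal of the embedded grid returns the original values, and the batch shape gains exactly one dimension.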

batch_narrow(dim, length)[source]¶

Returns a message that is a narrowed version of self along the dimension specified by dim.

Effectively, this method selects the chunk spanning [:length] along the dimension dim of self. The returned message and self share the same underlying storage.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters

dim (int) – The dimension along which self will be narrowed.
length (int) – The length of the message chunk to select. It must be no greater than the size of the dim dimension in self.

Returns

A narrowed message of self.

Return type

Message

See also

This method is a mimic of torch.narrow(), with start defaulting to 0.

batch_broaden(dim, length)[source]¶

Returns a message that is a broadened version of self along the dimension specified by dim, with identity values filled in [dim_size:length] along the dimension dim in the returned message.

In other words, this method is concatenating an identity message to self along dimension dim so that the resulting dimension size is length.

For parameter tensor, the identity value is 0, and for particle weight tensor, the identity value is a positive uniform constant such that the sum across the sample dimensions is 1.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters

dim (int) – The dimension of self which will be broadened in the returned message.
length (int) – The length of the broadened dimension of the returned message. It must be greater than the size of the dim dimension in self.

Returns

A broadened message of self.

Return type

Message

See also

batch_narrow(): The inverse of batch_broaden(). There is no direct counterpart to this method in PyTorch.
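On a single batch dimension, narrowing and broadening behave like list slicing and padding. The sketch below is plain Python with hypothetical helper names and uses 0 as the identity value, as for parameter tensors.

```python
def narrow(slices, length):
    """Keep the chunk [:length] along the dimension."""
    return slices[:length]


def broaden(slices, length, identity=0.0):
    """Pad with identity values so that the dimension has size `length`."""
    if length <= len(slices):
        raise ValueError("length must exceed the current dimension size")
    return slices + [identity] * (length - len(slices))


original = [1.0, 2.0]
broadened = broaden(original, 4)
restored = narrow(broadened, len(original))
```

Narrowing the broadened list back to its original length recovers the input, mirroring the inverse relationship noted above.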

batch_summarize(dim)[source]¶

Implements the default Sum-Product summarization semantics. Summarizes over the batch dimension specified by dim. Returns a message with one less dimension.

For a Parameter message, the summarization is realized by taking the mean of the parameter tensor along dimension dim. For a Particles message, it is realized by applying the addition defined for particle weights along dimension dim, a.k.a. the factor product.

Parameters: dim (int) – The dimension of self to be summarized over.
Returns: The summarized message from self.
Return type: Message
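Both summarization rules can be sketched for a message with one batch dimension. This is plain Python with hypothetical helper names, not the library's implementation; nested lists stand in for the parameter tensor (batch by parameter) and the weight tensor (batch by sample).

```python
def summarize_parameter(params):
    """Mean of the parameter vectors along the summarized batch dimension."""
    n = len(params)
    return [sum(col) / n for col in zip(*params)]


def summarize_weight(weights):
    """Element-wise product of the weight vectors along the batch dimension, renormalized."""
    prod = [1.0] * len(weights[0])
    for w in weights:
        prod = [p * x for p, x in zip(prod, w)]
    total = sum(prod)
    return [p / total for p in prod]


params = [[1.0, 2.0], [3.0, 6.0]]      # batch of 2, parameter size 2
weights = [[0.5, 0.5], [0.2, 0.8]]     # batch of 2, 2 particles
param_summary = summarize_parameter(params)
weight_summary = summarize_weight(weights)
```

Since the first weight vector is uniform (the identity), the weight summary equals the second weight vector.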

batch_flatten(dims=None)[source]¶

Flattens the set of batch dimensions specified by dims and appends the flattened dimension as the last batch dimension. If dims is None, will flatten all batch dimensions.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters: dims (iterable of ints, optional) – The set of batch dimensions to be flattened. Defaults to None.
Returns: The flattened message of self.
Return type: Message

batch_reshape(new_batch_shape)[source]¶

Returns a message with the same underlying data as self, but with the specified new_batch_shape.

Parameters: new_batch_shape (iterable of int, or torch.Size) – The target batch shape.
Returns: A reshaped message from self with new batch shape.
Return type: Message

See also

This method is a mimic of torch.reshape()

batch_expand(new_batch_shape)[source]¶

Returns a new view of self with singleton batch dimensions expanded to a larger size.

Passing a -1 as the size for a batch dimension means not changing the size of that batch dimension.

Expanding self would not allocate new memory for self’s tensor contents, but would create a new view on the existing tensors. Any dimension of size 1 can be expanded to an arbitrary value without allocating new memory.

Note that new_batch_shape is relative to the batch dimensions only.

contiguous() will be called so that the returning content tensors are contiguous.

Parameters: new_batch_shape (iterable of int, or torch.Size) – The target expanded batch shape. Must have the same length as self’s current batch shape.
Returns: An expanded message from self.
Return type: Message

See also

This method is a mimic of torch.Tensor.expand().

event_transform(trans)[source]¶

Applies a transformation on the self’s event values. Returns the transformed message.

self contents will be cloned before being passed to the transformed message.

Note

For now, only Particles messages support transformations. reduce_type() will first be called to eliminate the parameter components before performing the transformation.

The adjustment made to the particle values and log sampling densities:

Apply the transformation directly on the particle tensors in self.particles.
Log sampling density tensors in self.log_densities will be adjusted by adding the log absolute determinant of the Jacobian of the inverse transformation:
```
log P(Y) = log P(X) + log |det (dX / dY)|
```
Weights are kept the same, but the tensor will be cloned.

Parameters: trans (torch.distributions.transforms.Transform) – The transformation object
Returns: The transformed message.
Return type: Message
Raises: AssertionError – If dist_info attribute is not present in self.attr

See also

torch.distributions.Transform
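The density adjustment can be checked on a single marginal with plain torch.distributions objects. The sketch below is not the PySigma implementation; the proposal distribution and variable names are chosen only for illustration.

```python
import torch
from torch.distributions import LogNormal, Normal
from torch.distributions.transforms import ExpTransform

torch.manual_seed(0)

# Hypothetical marginal particles X drawn from a standard normal proposal.
proposal = Normal(0.0, 1.0)
x = proposal.sample((100,))
log_p_x = proposal.log_prob(x)

# Transform the particle values: Y = exp(X).
trans = ExpTransform()
y = trans(x)

# log P(Y) = log P(X) + log|det(dX/dY)| = log P(X) - log|det(dY/dX)|.
# log_abs_det_jacobian(x, y) returns the forward term log|det(dY/dX)|.
log_p_y = log_p_x - trans.log_abs_det_jacobian(x, y)

# Sanity check: exp of a standard normal is log-normal, so the densities should agree.
reference = LogNormal(0.0, 1.0).log_prob(y)
```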

event_reweight(target_log_prob)[source]¶

Returns a new message of the same type as self, with the same particle values and log sampling densities but a different weight tensor. The new weights are derived by importance weighting target_log_prob against the log sampling density tensors stored in self.log_densities.

self’s type must be either MessageType.Particles or MessageType.Both to support this method.

Parameters: target_log_prob (torch.Tensor) – The batched log pdf of self’s particles w.r.t. the batched target distributions the new message is to encode. Should have shape (self.b_shape + self.s_shape).
Returns: A new importance-reweighted message with the same type and components as self except the importance weight.
Return type: Message
Raises: AssertionError – If self has neither type MessageType.Particles nor type MessageType.Both.

Notes

The importance weighting procedure can be summarized in two steps:

log_ratio = target_log_prob - joint_log_density
new_weight = normalize(exp(log_ratio))

Some remarks:

joint_log_density here refers to the joint log sampling density of the combinatorially concatenated marginal event particles in self.particles. Therefore, if there are multiple random variables, this quantity is derived by first expanding each marginal log sampling density tensor in self.log_densities to the full sampling dimensions, then taking the sum over all such expanded log density tensors.
The last step guarantees that new_weight sums to 1 across sampling dimensions. Note that this step is not explicitly implemented in this method; we assume it is taken care of by the Message class constructor.
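The two steps and the construction of joint_log_density can be sketched on plain tensors as follows. The shapes are assumptions made for the example (one 1-D log density tensor per random variable, a single batch dimension), and the normalization is written out explicitly here even though the method itself leaves it to the Message constructor.

```python
import torch

torch.manual_seed(0)
b_shape, s_shape = (2,), (3, 4)

# Hypothetical marginal log sampling densities, one 1-D tensor per random variable.
log_densities = [torch.randn(3), torch.randn(4)]

# Hypothetical batched target log pdf with shape b_shape + s_shape.
target_log_prob = torch.randn(*b_shape, *s_shape)

# Expand each marginal log density to the full sampling dimensions, then sum.
joint_log_density = torch.zeros(s_shape)
for i, log_q in enumerate(log_densities):
    view_shape = [1] * len(s_shape)
    view_shape[i] = -1
    joint_log_density = joint_log_density + log_q.view(view_shape).expand(s_shape)

# Step 1: log ratio of target density to joint sampling density (broadcast over batch).
log_ratio = target_log_prob - joint_log_density

# Step 2: exponentiate and normalize across the sampling dimensions.
sample_dims = tuple(range(len(b_shape), len(b_shape) + len(s_shape)))
unnormalized = torch.exp(log_ratio)
new_weight = unnormalized / unnormalized.sum(dim=sample_dims, keepdim=True)
```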

event_marginalize(event_dim)[source]¶

Returns a message from self where the event dimension specified by event_dim is marginalized, corresponding to marginalizing the corresponding random variable.

Only messages with particles support this operation. If self’s message type is MessageType.Both, a MessageType.Particles type message will be returned, where the parameter of self is discarded.

Parameters

event_dim (int) – Which event dimension / random variable to be marginalized over. Can accept a value in the range [-len(event_shape), len(event_shape) - 1].

Returns

A MessageType.Particles type message where the event_dim-th event dimension is marginalized over.

Return type

Message

Raises

AssertionError – If self does not contain particles.
AssertionError – If self’s len(event_shape) is 1, i.e., the method is called to marginalize the only remaining event dimension.

Notes

Regarding the implementation:

Marginalization of the particles is implemented by discarding the target particle value tensor together with its corresponding log sampling density tensor, and summing the target prob tensor over the target event dimension. The target prob tensor is recovered by multiplying the weight tensor with the cross product of all the marginal sampling density tensors.

Note that the target prob tensor recovered in this way is NOT the actual probability w.r.t. the target distributions, but one that is proportional to that up to a normalization constant factor.
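A minimal sketch of the recovery and summation described in these notes, on plain tensors. The shapes (one batch dimension, two random variables with 3 and 4 particles, 1-D marginal log densities) are assumptions for the example, and the step that turns the marginalized tensor back into message weights is deliberately left out because the notes do not describe it.

```python
import torch

torch.manual_seed(0)

# Hypothetical setup: batch shape (2,), two random variables with 3 and 4 particles.
weight = torch.rand(2, 3, 4)
weight = weight / weight.sum(dim=(1, 2), keepdim=True)
log_densities = [torch.randn(3), torch.randn(4)]

# Cross product of the marginal sampling densities over the sampling dimensions.
q0, q1 = torch.exp(log_densities[0]), torch.exp(log_densities[1])
joint_density = q0[:, None] * q1[None, :]

# Recover a tensor proportional to the target probability.
target_prob = weight * joint_density

# Marginalize the second random variable: sum over its sampling dimension.
# Batch dimensions come first, so the tensor dimension is offset by one.
event_dim = 1
marginal_prob = target_prob.sum(dim=1 + event_dim)

# Discard the marginalized variable's log sampling density (and, likewise, its particles).
remaining_log_densities = [d for i, d in enumerate(log_densities) if i != event_dim]
```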

event_concatenate(cat_event_dims, target_event_dim=-1)[source]¶

Concatenate the particle events corresponding to the event dimensions specified by cat_event_dims. The new concatenated events will be placed at target_event_dim dimension.

To concatenate events means to

combinatorially concatenate the particle value tensors,
take the cross product of associated marginal sampling density tensors and flatten it,
reshape the weight tensor into correct flattened shape.

Note that the event dimensions will be concatenated in the order given by cat_event_dims.

Only messages with particles support this operation. If self’s message type is MessageType.Both, a MessageType.Particles type message will be returned, where the parameter of self is discarded.

Parameters

cat_event_dims (iterable of int) – The list of event dimensions to be concatenated. Must have length at least 2. Each should be in range [-len(event_shape), len(event_shape) - 1].
target_event_dim (int) – The target event dimension where the concatenated event will be placed. Should be in range [-len(event_shape) + k, len(event_shape) - k - 1], where k equals len(cat_event_dims).

Returns

A MessageType.Particles type message where the specified event dimensions are concatenated.

Return type

Message

Raises

AssertionError – If self does not contain particles.
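The three operations that make up a concatenation can be sketched on plain tensors for two random variables. All shapes and names below are hypothetical, and the densities are handled in log space, where the cross product becomes a broadcast sum.

```python
import torch

torch.manual_seed(0)

# Hypothetical marginals: 3 particles of event size 2, and 4 particles of event size 1.
p1, p2 = torch.randn(3, 2), torch.randn(4, 1)
log_q1, log_q2 = torch.randn(3), torch.randn(4)
weight = torch.rand(5, 3, 4)  # batch shape (5,), sample shape (3, 4)

# 1. Combinatorially concatenate the particle values: every pairing of p1[i] with p2[j].
pairs = torch.cat(
    [p1[:, None, :].expand(3, 4, 2), p2[None, :, :].expand(3, 4, 1)], dim=-1
)
cat_particles = pairs.reshape(3 * 4, 2 + 1)

# 2. Cross product of the marginal sampling densities, flattened (a sum in log space).
cat_log_q = (log_q1[:, None] + log_q2[None, :]).reshape(3 * 4)

# 3. Flatten the weight's sampling dimensions in the same row-major order.
cat_weight = weight.reshape(5, 3 * 4)
```

With row-major flattening, the pairing of p1[i] with p2[j] lands at flat index i * 4 + j in all three results, which keeps values, densities, and weights aligned.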

LinkData¶

class pysigma.graphical.basic_nodes.LinkData(vn, fn, to_fn, msg_shape, epsilon=0.0001, **kwargs)[source]¶

Identifies the data of a directed link between a factor node and a variable node. Stores intermediate messages in its message memory.

Note that links are directional, and two such links with opposite directions should be specified to represent a bidirectional link between a factor node and a variable node.

During construction of the graph, its instance should be passed to NetworkX methods as the edge data to instantiate an edge.

Parameters

vn (VariableNode) – VariableNode instance that this link is incident to.
fn (FactorNode) – FactorNode instance that this link is incident to.
to_fn (bool) – True if this link is pointing toward the factor node.
msg_shape (tuple of torch.Size) – The shape of the message to carry. Used for sanity check of message shapes. Should be in the format (batch_shape, param_shape, sample_shape, event_shape). An empty shape torch.Size([]) should be used as the default none shape.
epsilon (float, optional) – Epsilon upper bound for checking message difference.

memory¶

The message memory buffer.

Type: Message or None

new¶

Indicates if this link-data has received a new message in the current decision phase.

Type: bool

vn¶

The incident variable node.

Type: VariableNode

fn¶

The incident factor node.

Type: FactorNode

msg_shape¶

The allowable message shape, in the format (batch_shape, param_shape, sample_shape, event_shape).

Type: tuple of torch.Size
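For illustration, two message shapes in this four-part format might look as follows. The concrete sizes are made up, and the convention of one sample and one event entry per random variable is an assumption of this example.

```python
import torch

# Hypothetical parameter-only message: batch shape (2,), parameter shape (3,), no particles.
param_msg_shape = (torch.Size([2]), torch.Size([3]), torch.Size([]), torch.Size([]))

# Hypothetical particles-only message: two random variables with 10 and 20 particles
# and event sizes 1 and 2; the unused parameter shape is the empty shape.
particles_msg_shape = (torch.Size([2]), torch.Size([]), torch.Size([10, 20]), torch.Size([1, 2]))
```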

to_fn¶

Indicates if this link-data is pointing towards a factor node.

Type: bool

epsilon¶

Epsilon upper bound for checking message difference.

Type: float

attr¶

Additional special attributes specified via kwargs in the constructor.

Type: dict

pretty_log¶

Pretty logging for front-end visualization.

Type: dict

reset_shape(msg_shape)[source]¶

Resets the message shape that this link-data accepts.

Parameters: msg_shape (tuple of torch.Size) – The target message shape, in the format (batch_shape, param_shape, sample_shape, event_shape). An empty shape torch.Size([]) should be used as the default none shape.

Warning

This method will clear the memory buffer self.memory and set self.new to False.

write(new_msg, check_diff=True, clone=False)[source]¶

Writes to the link message memory with the new message specified via new_msg. Once a new message is written, self.new will be set to True.

The message shape new_msg.shape will first be checked against self.msg_shape to ensure that the message is compatible in shape. See compatible_shape() for more details.

If check_diff is True, the new message is first compared against the existing one, and replaces it only if the two are different.

If clone is True, new_msg is cloned first and the clone is stored in the memory buffer.

Parameters

new_msg (Message) – The new message to be stored in this link-data.
check_diff (bool, optional) – Whether to compare new_msg against the stored message before deciding whether to accept new_msg and set self.new to True.
clone (bool, optional) – Whether to clone new_msg before storing it in the memory buffer.

Raises

AssertionError – If new message’s shape is not compatible.

Notes

Messages will be deemed different in the following cases:

If they are of different types.
If the new message has MessageType.Undefined type.
If they both have parameters and the batch average L2 distance between the two parameter tensors is larger than epsilon.
If they both have particles and either their particle value tensors or their log sampling density tensors are different.
If they both have particles with the same particle value tensors and the same log sampling density tensors, but the batch average cosine distance between the two particle weight tensors is larger than epsilon.

Note

When both messages have type MessageType.Both, the parameters are compared rather than the particles to determine message difference.

Note

To write a message whose type differs from that of the message currently in memory, call reset_shape() first so that the shape check works for the new message.
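The two distance-based criteria from the notes above could look roughly like this on plain tensors. This is a sketch under assumptions, not the library's code: the helper names are hypothetical, a single leading batch dimension is assumed, and "cosine distance" is taken to mean one minus cosine similarity.

```python
import torch
import torch.nn.functional as F

EPSILON = 1e-4  # matches the default epsilon in the LinkData signature


def params_differ(old, new, eps=EPSILON):
    """Hypothetical helper: batch average L2 distance between parameter tensors."""
    # L2 distance over the last (parameter) dimension, averaged over the batch.
    return (old - new).norm(dim=-1).mean().item() > eps


def weights_differ(old, new, eps=EPSILON):
    """Hypothetical helper: batch average cosine distance between weight tensors."""
    # Flatten the sampling dimensions so each batch element is one vector.
    cos = F.cosine_similarity(old.flatten(start_dim=1), new.flatten(start_dim=1), dim=1)
    return (1.0 - cos).mean().item() > eps
```

Because cosine similarity ignores scale, two weight tensors that differ only by a positive constant factor count as the same under the second criterion.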

read(clone=False)[source]¶

Returns the current content stored in memory. Sets self.new to False to indicate that this link message has been read in the current decision phase.

Parameters: clone (bool) – Whether to return a cloned message of the memory.
Returns: The current memory message.
Return type: Message
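The interplay between write(), read(), and the new flag can be summarized with a small stand-in class. LinkBuffer below is hypothetical and is not PySigma's LinkData: it omits shape checks and cloning, and it uses plain equality in place of the message difference test.

```python
class LinkBuffer:
    """Hypothetical stand-in for the memory/new bookkeeping of a link."""

    def __init__(self):
        self.memory = None
        self.new = False

    def write(self, new_msg, check_diff=True):
        # With check_diff, a message equal to the stored one is not accepted,
        # so `new` keeps its current value.
        if check_diff and self.memory is not None and new_msg == self.memory:
            return
        self.memory = new_msg
        self.new = True

    def read(self):
        # Reading marks the message as consumed for the current decision phase.
        self.new = False
        return self.memory


link = LinkBuffer()
link.write("msg-1")          # first write: accepted, link.new is True
first = link.read()          # read clears the flag
link.write("msg-1")          # same content: not accepted, flag stays False
unchanged_flag = link.new
link.write("msg-2")          # different content: accepted, flag set again
changed_flag = link.new
```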