tensor_t

tensor_t is the main tensor class in MatX used for tensor operations.

template<typename T, int RANK, typename Storage = DefaultStorage<T>, typename Desc = DefaultDescriptor<RANK>>
class matx::tensor_t : public matx::tensor_impl_t<T, RANK, DefaultDescriptor<RANK>>

View of an underlying tensor data object

The tensor_t class provides multiple ways to view the data inside of a matxTensorData_t object. Views do not modify the underlying data; they simply present a different way to look at the data. This includes where the data begins and ends, the stride, the rank, etc. Views are very lightweight, and any number of views can be generated from the same data object. Since views represent different ways of looking at the same data, it is the responsibility of the user to ensure that proper synchronization is done when using multiple views on the same data. Failure to do so can result in race conditions on the device or host.
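As a rough illustration of what "view" means here, the following plain C++ sketch shows two lightweight views over one buffer. It is not MatX code: View1D, its members, and shared_write_demo are hypothetical names chosen for this example. A write through one view is visible through the other, which is why synchronization is left to the user.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for a tensor view: it does not own memory,
// it only records where to start, how many elements to expose, and the stride.
struct View1D {
  float *data;         // first element seen by this view
  std::size_t size;    // number of elements exposed
  std::size_t stride;  // distance (in elements) between consecutive entries

  float &operator()(std::size_t i) const {
    assert(i < size);
    return data[i * stride];
  }
};

// Builds two views over one buffer and writes through the first one.
// Returns the value the second view observes at its index 1.
inline float shared_write_demo() {
  std::vector<float> buf{0.0f, 1.0f, 2.0f, 3.0f, 4.0f, 5.0f, 6.0f, 7.0f};
  View1D all{buf.data(), 8, 1};    // every element
  View1D evens{buf.data(), 4, 2};  // elements 0, 2, 4, 6
  all(2) = 42.0f;                  // write through the first view
  return evens(1);                 // evens(1) aliases buf[2]
}
```

Neither view copies or owns the buffer, so creating more of them costs only a pointer and a few integers each.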

Public Functions

__MATX_HOST__ inline void Shallow(const tensor_t<T, RANK, Storage, Desc> &rhs) noexcept

Perform a shallow copy of a tensor view

An alternative to operator=, which is reserved for lazy evaluation. This function performs a shallow copy of a tensor view: the data pointer of the copy points to the same location as the right-hand side’s data.

Parameters

rhs – Tensor to copy from

inline tensor_t(Storage s, Desc &&desc, T *ldata)

Construct a new tensor_t object. Used to copy an existing storage object for proper reference counting.

Parameters
  • s – Storage object to copy

  • desc – Tensor descriptor

  • ldata – Local data pointer

template<typename D2 = Desc, typename = typename std::enable_if_t<is_matx_descriptor_v<D2>>>
__MATX_INLINE__ inline tensor_t(Desc &&desc)

Constructor for a rank-1 and above tensor.

Parameters

desc – Tensor descriptor

__MATX_INLINE__ inline tensor_t(const typename Desc::shape_type (&shape)[RANK])

Constructor for a rank-1 and above tensor.

Parameters

shape – Tensor shape

__MATX_INLINE__ __MATX_HOST__ inline auto operator=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator=(const T2 &op)

Lazy assignment operator=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object
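The idea of a deferred "set" object can be sketched in plain C++ as below. This is only an illustration of the pattern under assumed names (LazySet, LazyVec, and run() are hypothetical) and says nothing about how MatX implements or executes its set objects: the assignment operator records the destination and source, and no data moves until the returned object is explicitly run.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical deferred-assignment object: holds the destination and the
// source, and does nothing until run() is called.
struct LazySet {
  std::vector<float> *dst;
  const std::vector<float> *src;

  void run() const {
    assert(dst->size() == src->size());
    for (std::size_t i = 0; i < src->size(); ++i) (*dst)[i] = (*src)[i];
  }
};

// Hypothetical tensor-like wrapper whose operator= is lazy.
struct LazyVec {
  std::vector<float> vals;

  // Returns a LazySet instead of copying immediately.
  LazySet operator=(const LazyVec &rhs) { return LazySet{&vals, &rhs.vals}; }
};
```

With this shape, `auto op = (a = b);` leaves `a` untouched, and `op.run();` performs the copy. Separating the two steps lets the caller decide when and where the work executes.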

__MATX_INLINE__ __MATX_HOST__ inline auto operator+=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator+=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator+=(const T2 &op)

Lazy assignment operator+=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator-=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator-=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator-=(const T2 &op)

Lazy assignment operator-=. Used to create a “set” object for deferred execution on a device

Template Parameters

T2 – Type of operator

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator*=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator*=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator*=(const T2 &op)

Lazy assignment operator*=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator/=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator/=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator/=(const T2 &op)

Lazy assignment operator/=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator<<=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator<<=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator<<=(const T2 &op)

Lazy assignment operator<<=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator>>=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator>>=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator>>=(const T2 &op)

Lazy assignment operator>>=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator|=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator|=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator|=(const T2 &op)

Lazy assignment operator|=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator&=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator&=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator&=(const T2 &op)

Lazy assignment operator&=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator^=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator^=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator^=(const T2 &op)

Lazy assignment operator^=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

__MATX_INLINE__ __MATX_HOST__ inline auto operator%=(const tensor_t<T, RANK, Storage, Desc> &op)

Lazy assignment operator%=. Used to create a “set” object for deferred execution on a device

Parameters

op – Tensor view source

Returns

set object containing the destination view and source object

template<typename T2>
__MATX_INLINE__ __MATX_HOST__ inline auto operator%=(const T2 &op)

Lazy assignment operator%=. Used to create a “set” object for deferred execution on a device

Parameters

op – Operator or scalar type to assign

Returns

set object containing the destination view and source object

template<typename M = T, int R = RANK, typename Shape>
__MATX_INLINE__ inline auto View(Shape &&shape)

Get a view of the tensor from the underlying data using a custom shape

Returns a view based on the shape passed in. Both the rank and the dimensions can be increased or decreased from the original data object as long as they fit within the bounds of the memory allocation. This function only allows a contiguous view of memory, regardless of the shape passed in. For example, if the original shape is {8, 2} and a view of {2, 1} is requested, the data in the new view would be the last two elements of the last dimension of the original data.

The function is similar to MATLAB and Python’s reshape(), except it does NOT make a copy of the data, whereas those languages may, depending on the context. It is up to the user to understand any existing views on the underlying data that may conflict with other views.

While this function is similar to Slice(), it does not allow slicing a particular start and end point as slicing does, and slicing also does not allow increasing the rank of a tensor as View(shape) does.

Note that the data type of the tensor can also change from that of the original data. This may be useful in situations where a union of data types could be used in different ways. For example, a complex<float> tensor could be reshaped into a float tensor that has twice as many elements, so that operations can be done on floats instead of complex types.

Template Parameters
  • M – New type of tensor

  • R – New rank of tensor

Parameters

shape – New shape of tensor

Returns

A view of the data with the appropriate strides and dimensions set
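For a contiguous reshape of this kind, the strides follow directly from the new shape. The sketch below is plain C++ and assumes row-major layout; row_major_strides, num_elements, and flat_index are hypothetical helper names, not MatX functions. It computes the strides for a shape and the flat offset of an index, which is the bookkeeping a reshaped view needs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major (C-order) strides for a contiguous shape: the last dimension has
// stride 1 and each earlier dimension strides over everything after it.
inline std::vector<std::size_t> row_major_strides(const std::vector<std::size_t> &shape) {
  std::vector<std::size_t> strides(shape.size(), 1);
  for (std::size_t d = shape.size(); d-- > 1;) {
    strides[d - 1] = strides[d] * shape[d];
  }
  return strides;
}

// Total number of elements covered by a shape.
inline std::size_t num_elements(const std::vector<std::size_t> &shape) {
  std::size_t n = 1;
  for (std::size_t s : shape) n *= s;
  return n;
}

// Flat element offset of a multi-dimensional index under the given strides.
inline std::size_t flat_index(const std::vector<std::size_t> &idx,
                              const std::vector<std::size_t> &strides) {
  assert(idx.size() == strides.size());
  std::size_t off = 0;
  for (std::size_t d = 0; d < idx.size(); ++d) off += idx[d] * strides[d];
  return off;
}
```

A new shape is valid for a contiguous view only if num_elements of that shape does not exceed the number of elements in the allocation.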

__MATX_INLINE__ inline auto View()

Make a copy of a tensor and maintain all refcounts.

Returns

Copy of the tensor view

__MATX_INLINE__ inline void PrefetchDevice(cudaStream_t const stream) const noexcept

Prefetch the data asynchronously from the host to the device.

All copies are done asynchronously in a stream. The copy is ordered with respect to other work in the same stream, but the time at which the transfer actually occurs is not guaranteed.

Parameters

stream – The CUDA stream to prefetch within

__MATX_INLINE__ inline void PrefetchHost(cudaStream_t const stream) const noexcept

Prefetch the data asynchronously from the device to the host.

All copies are done asynchronously in a stream. The copy is ordered with respect to other work in the same stream, but the time at which the transfer actually occurs is not guaranteed.

Parameters

stream – The CUDA stream to prefetch within

template<typename U = T>
__MATX_INLINE__ inline auto RealView() const noexcept

Create a view of only real-valued components of a complex array

Only available on complex data types.

Returns

tensor view of only real-valued components

__MATX_INLINE__ inline auto GetStorage() noexcept

Return the storage container from the tensor.

Returns

storage container

template<typename U = T>
__MATX_INLINE__ inline auto ImagView() const noexcept

Create a view of only imaginary-valued components of a complex array

Only available on complex data types.

Returns

tensor view of only imaginary-valued components
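One way to picture real-only and imaginary-only views is as stride-2 float views over the complex buffer. The plain C++ sketch below uses std::complex<float> and hypothetical names (ComponentView, real_view, imag_view); it illustrates the idea and is not a description of MatX's internals.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical strided view over one float component of a complex buffer.
struct ComponentView {
  float *base;       // first real part (offset 0) or first imaginary part (offset 1)
  std::size_t size;  // number of complex elements

  float &operator()(std::size_t i) const {
    assert(i < size);
    return base[2 * i];  // stride of two floats per complex element
  }
};

// std::complex<float> is stored as two adjacent floats (real, then imaginary),
// and the C++ standard permits accessing an array of them through float*.
inline ComponentView real_view(std::vector<std::complex<float>> &v) {
  return {reinterpret_cast<float *>(v.data()), v.size()};
}

inline ComponentView imag_view(std::vector<std::complex<float>> &v) {
  return {reinterpret_cast<float *>(v.data()) + 1, v.size()};
}
```

Because both views alias the original storage, writing through imag_view changes the imaginary parts of the complex elements in place.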

__MATX_INLINE__ inline tensor_t Permute(const uint32_t (&dims)[RANK]) const

Permute the dimensions of a tensor

Accepts any order of permutation. The number of dimensions must match the RANK of the tensor.

Parameters

dims – Order in which to permute the dimensions

Returns

permuted view of the tensor
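A permutation of this kind needs no data movement: only the per-dimension sizes and strides are reordered. The plain C++ sketch below assumes the convention that output dimension i takes the size and stride of input dimension dims[i]; ViewDesc and permute are hypothetical names used for illustration only.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical descriptor: one size and one stride per dimension.
template <std::size_t RANK>
struct ViewDesc {
  std::array<std::size_t, RANK> shape;
  std::array<std::size_t, RANK> stride;
};

// Output dimension i takes the size and stride of input dimension dims[i].
// The underlying data is not touched.
template <std::size_t RANK>
ViewDesc<RANK> permute(const ViewDesc<RANK> &in,
                       const std::array<std::size_t, RANK> &dims) {
  ViewDesc<RANK> out{};
  for (std::size_t i = 0; i < RANK; ++i) {
    assert(dims[i] < RANK);
    out.shape[i] = in.shape[dims[i]];
    out.stride[i] = in.stride[dims[i]];
  }
  return out;
}
```

For a contiguous 2 x 3 x 4 tensor with strides {12, 4, 1}, permuting with {2, 0, 1} gives shape {4, 2, 3} and strides {1, 12, 4}.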

__MATX_INLINE__ inline auto PermuteMatrix() const

Permute the last two dimensions of a matrix

Utility function to permute the last two dimensions of a tensor. This is useful in the numerous operations that take a permuted matrix as input, but we don’t want to permute the inner dimensions of a larger tensor.

Returns

tensor view with last two dims permuted

__MATX_HOST__ __MATX_INLINE__ inline T *Data() const noexcept

Get the underlying local data pointer from the view

Returns

Underlying data pointer of type T

template<typename ShapeType, std::enable_if_t<!std::is_pointer_v<typename remove_cvref<ShapeType>::type>, bool> = true>
__MATX_HOST__ __MATX_INLINE__ inline void Reset(T *const data, ShapeType &&shape) noexcept

Set the underlying data pointer from the view

Decrements any reference-counted memory and potentially frees before resetting the data pointer. If refcnt is not nullptr, the count is incremented.

Template Parameters

ShapeType – Shape type

Parameters
  • data – Data pointer to set

  • shape – Shape of tensor

__MATX_HOST__ __MATX_INLINE__ inline void Reset(T *const data) noexcept

Set the underlying data pointer from the view

Decrements any reference-counted memory and potentially frees before resetting the data pointer. If refcnt is not nullptr, the count is incremented.

Parameters

data – Data pointer to set

__MATX_HOST__ __MATX_INLINE__ inline void Reset(T *const data, T *const ldata) noexcept

Set the underlying data and local data pointer from the view

Decrements any reference-counted memory and potentially frees before resetting the data pointer. If refcnt is not nullptr, the count is incremented.

Parameters
  • data – Allocated data pointer

  • ldata – Local data pointer offset into allocated

__MATX_INLINE__ __MATX_HOST__ inline Desc::stride_type Stride(uint32_t dim) const

Get the stride of a single dimension of the tensor

Parameters

dim – Desired dimension

Returns

Stride (in elements) in dimension

__MATX_INLINE__ __MATX_HOST__ inline auto GetRefCount() const noexcept

Get the reference count

Returns

Reference count or 0 if not tracked

__MATX_INLINE__ inline auto OverlapView(std::initializer_list<typename Desc::shape_type> const &windows, std::initializer_list<typename Desc::stride_type> const &strides) const

Create an overlapping tensor view

Creates an overlapping tensor view where an existing tensor can be repeated into a higher rank with overlapping elements. For example, the 1D tensor [1 2 3 4 5] could be cloned into a 2D tensor with a window size of 2 and an overlap of 1, resulting in:

[1 2
 2 3
 3 4
 4 5]

Currently this only works on 1D tensors going to 2D, but may be expanded for higher dimensions in the future. Note that if the window size does not divide evenly into the existing column dimension, the view may chop off the end of the data to make the tensor rectangular.

Parameters
  • windows – Window size (columns in output)

  • strides – Strides between data elements

Returns

Overlapping view of data
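The indexing behind an overlapping view can be sketched in plain C++ as follows. The names OverlapView2D and overlap are hypothetical, and the sketch assumes that the stride between rows equals the window size minus the overlap. Row r starts at element r * row_stride of the original data, and only complete windows are kept.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D overlapping view over a 1D buffer: row r starts at
// data[r * row_stride] and exposes `window` consecutive elements.
struct OverlapView2D {
  const float *data;
  std::size_t rows;
  std::size_t window;
  std::size_t row_stride;

  float operator()(std::size_t r, std::size_t c) const {
    assert(r < rows && c < window);
    return data[r * row_stride + c];
  }
};

// Only complete windows are kept, so trailing elements may be dropped.
inline OverlapView2D overlap(const std::vector<float> &v, std::size_t window,
                             std::size_t row_stride) {
  assert(window > 0 && row_stride > 0 && v.size() >= window);
  std::size_t rows = (v.size() - window) / row_stride + 1;
  return {v.data(), rows, window, row_stride};
}
```

With the five-element example above, a window of 2 and a row stride of 1 yield four rows, the last of which is [4 5].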

template<int N>
__MATX_INLINE__ inline auto Clone(const typename Desc::shape_type (&clones)[N]) const

Clone a tensor into a higher-dimension tensor

Clone() allows a copy-less method to clone data into a higher dimension tensor. The underlying data does not grow or copy, but instead the indices of the higher-ranked tensor access the original data potentially multiple times. Clone is similar to MATLAB’s repmat() function where it’s desired to take a tensor of a lower dimension and apply an operation with it to a tensor in a higher dimension by broadcasting the values.

For example, in a rank=2 tensor that’s MxN, and a rank=1 tensor that’s 1xN, Clone() can take the rank=1 tensor and broadcast to an MxN rank=2 tensor, and operations such as the Hadamard product can be performed. In this example, the final operation will benefit heavily from device caching since the same 1xN rank=1 tensor will be accessed M times.

Parameters

clones – List of sizes of each dimension to clone. Parameter length must match rank of tensor. A special sentinel value of matxKeepDim should be used when the dimension from the original tensor is to be kept.

Returns

Cloned view representing the higher-dimension tensor
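The copy-less broadcast described above amounts to giving the cloned dimension a stride of zero. The plain C++ sketch below illustrates this with hypothetical names (ClonedRows, hadamard); it is not MatX code and assumes a row-major M x N matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical rank-2 view of a rank-1 buffer: the row stride is 0, so every
// row aliases the same N elements and no data is copied.
struct ClonedRows {
  static constexpr std::size_t row_stride = 0;
  const float *data;
  std::size_t rows;
  std::size_t cols;

  float operator()(std::size_t r, std::size_t c) const {
    assert(r < rows && c < cols);
    return data[r * row_stride + c];
  }
};

// Element-wise (Hadamard) product of an M x N row-major matrix with the
// broadcast vector.
inline std::vector<float> hadamard(const std::vector<float> &mat, const ClonedRows &b) {
  assert(mat.size() == b.rows * b.cols);
  std::vector<float> out(mat.size());
  for (std::size_t r = 0; r < b.rows; ++r)
    for (std::size_t c = 0; c < b.cols; ++c)
      out[r * b.cols + c] = mat[r * b.cols + c] * b(r, c);
  return out;
}
```

The same N values are read once per row, which is the access pattern that makes caching effective in the example above.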

template<int M = RANK, std::enable_if_t<M == 0, bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(T const &val) noexcept

Rank-0 value setting

Parameters

val – Value to assign to the rank-0 tensor

template<int M = RANK, std::enable_if_t<(!is_cuda_complex_v<T> && M == 1) || (is_cuda_complex_v<T> && M == 0), bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(const std::initializer_list<T> &vals) noexcept

Rank-1 non-complex or rank-0 complex initializer list setting

Parameters

vals – 1D initializer list of values

template<int M = RANK, std::enable_if_t<(!is_cuda_complex_v<T> && M == 2) || (is_cuda_complex_v<T> && M == 1), bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(const std::initializer_list<const std::initializer_list<T>> &vals) noexcept

Rank-2 non-complex or rank-1 complex initializer list setting

Parameters

vals – Initializer list of values, nested two levels deep

template<int M = RANK, std::enable_if_t<(!is_cuda_complex_v<T> && M == 3) || (is_cuda_complex_v<T> && M == 2), bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(const std::initializer_list<const std::initializer_list<const std::initializer_list<T>>> vals) noexcept

Rank-3 non-complex or rank-2 complex initializer list setting

Parameters

vals – Initializer list of values, nested three levels deep

template<int M = RANK, std::enable_if_t<(!is_cuda_complex_v<T> && M == 4) || (is_cuda_complex_v<T> && M == 3), bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(const std::initializer_list<const std::initializer_list<const std::initializer_list<const std::initializer_list<T>>>> &vals) noexcept

Rank-4 non-complex or rank-3 complex initializer list setting

Parameters

vals – Initializer list of values, nested four levels deep

template<int M = RANK, std::enable_if_t<is_cuda_complex_v<T> && M == 4, bool> = true>
__MATX_INLINE__ __MATX_HOST__ inline void SetVals(const std::initializer_list<const std::initializer_list<const std::initializer_list<const std::initializer_list<const std::initializer_list<T>>>>> &vals) noexcept

Rank-4 complex initializer list setting

Parameters

vals – Initializer list of values, nested five levels deep
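To show how a nested initializer list maps onto tensor elements, here is a plain C++ sketch of a rank-2 setter. The function set_vals_2d is a hypothetical stand-in that writes into a row-major std::vector; it is not the MatX implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical rank-2 setter: copies a nested initializer list into a
// row-major buffer of shape rows x cols.
inline void set_vals_2d(std::vector<float> &buf, std::size_t rows, std::size_t cols,
                        std::initializer_list<std::initializer_list<float>> vals) {
  assert(vals.size() == rows && buf.size() == rows * cols);
  std::size_t r = 0;
  for (const auto &row : vals) {
    assert(row.size() == cols);
    std::size_t c = 0;
    for (float v : row) buf[r * cols + c++] = v;
    ++r;
  }
}
```

Each inner list becomes one row, so `set_vals_2d(buf, 2, 3, {{1, 2, 3}, {4, 5, 6}})` places 5 at row 1, column 1.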

template<int N = RANK>
__MATX_INLINE__ inline auto Slice([[maybe_unused]] const typename Desc::shape_type (&firsts)[RANK], [[maybe_unused]] const typename Desc::shape_type (&ends)[RANK], [[maybe_unused]] const typename Desc::stride_type (&strides)[RANK]) const

Slice a tensor either within the same dimension or to a lower dimension

Slice() allows a copy-less method to extract a subset of data from one or more dimensions of a tensor. This includes completely dropping an unwanted dimension, or simply taking a piece of a wanted dimension. Slice() is very similar to indexing operations in both Python and MATLAB.

Parameters
  • firsts – List of starting index into each dimension. Indexing is 0-based.

  • ends – List of ending index into each dimension. Indexing is 0-based. Two special sentinel values can be used: 1) matxEnd is used to indicate the end of that particular dimension without specifying the size. This is similar to “end” in MATLAB and leaving off an end in Python “a[1:]”. 2) matxDropDim is used to slice (drop) a dimension entirely. This results in a tensor with a smaller rank than the original.

  • strides – List of strides for each dimension. A special sentinel value of matxKeepStride is used to keep the existing stride of the dimension.

Returns

Sliced view of tensor
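The per-dimension arithmetic of a copy-less slice can be sketched in plain C++ as below. All names (kEnd, Slice1D, slice_dim) are hypothetical, and the sketch assumes the end index is exclusive, as in Python; it does not model dropping a dimension.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical sentinel standing in for "up to the end of the dimension".
constexpr std::size_t kEnd = std::numeric_limits<std::size_t>::max();

// Result of slicing one dimension: where the view starts (in elements from the
// original start), how many entries it exposes, and its stride in elements.
struct Slice1D {
  std::size_t offset;
  std::size_t size;
  std::size_t stride;
};

// Slices [first, end) with step `step` from a dimension of `dim_size` entries
// whose existing stride is `dim_stride`. The end index is treated as exclusive
// here; that is an assumption of this sketch.
inline Slice1D slice_dim(std::size_t dim_size, std::size_t dim_stride,
                         std::size_t first, std::size_t end, std::size_t step) {
  if (end == kEnd) end = dim_size;
  assert(step > 0 && first <= end && end <= dim_size);
  std::size_t count = (end - first + step - 1) / step;  // ceil((end - first) / step)
  return {first * dim_stride, count, step * dim_stride};
}
```

The sliced view is described entirely by a new offset, size, and stride, so the original data is never copied.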

template<int N = RANK>
__MATX_INLINE__ inline auto Slice(const typename Desc::shape_type (&firsts)[RANK], const typename Desc::shape_type (&ends)[RANK]) const

Slice a tensor either within the same dimension or to a lower dimension

Slice() allows a copy-less method to extract a subset of data from one or more dimensions of a tensor. This includes completely dropping an unwanted dimension, or simply taking a piece of a wanted dimension. Slice() is very similar to indexing operations in both Python and MATLAB.

Parameters
  • firsts – List of starting index into each dimension. Indexing is 0-based.

  • ends – List of ending index into each dimension. Indexing is 0-based. Two special sentinel values can be used: 1) matxEnd is used to indicate the end of that particular dimension without specifying the size. This is similar to “end” in MATLAB and leaving off an end in Python “a[1:]”. 2) matxDropDim is used to slice (drop) a dimension entirely. This results in a tensor with a smaller rank than the original.

Returns

Sliced view of tensor

__MATX_INLINE__ inline auto GetIdxFromAbs(typename Desc::shape_type abs)

Returns an N-D coordinate as an array corresponding to the absolute index abs.

Parameters

abs – Absolute index

Returns

std::array of indices
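Converting an absolute index into per-dimension coordinates is a repeated divide-and-remainder over the shape. The plain C++ sketch below assumes a contiguous, row-major layout; idx_from_abs is a hypothetical free function written for this example, not the member function itself.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Converts a flat, row-major element index into per-dimension indices.
template <std::size_t RANK>
std::array<std::size_t, RANK> idx_from_abs(std::size_t abs,
                                           const std::array<std::size_t, RANK> &shape) {
  std::array<std::size_t, RANK> idx{};
  for (std::size_t d = RANK; d-- > 0;) {
    assert(shape[d] > 0);
    idx[d] = abs % shape[d];  // position within this dimension
    abs /= shape[d];          // the quotient indexes the remaining outer dimensions
  }
  return idx;
}
```

For a 2 x 3 x 4 shape, absolute index 17 maps to {1, 1, 1}, since 1 * 12 + 1 * 4 + 1 = 17.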