sceptr
SCEPTR is a small, fast, and performant TCR representation model for alignment-free TCR analysis. The root module provides easy access to SCEPTR through a functional API which uses the default model.
- sceptr.calc_cdist_matrix(anchors: DataFrame, comparisons: DataFrame) → ndarray
Generate a cdist matrix between two collections of TCRs.
- Parameters:
anchors (DataFrame) – DataFrame specifying the first (anchor) collection of input TCRs. It must be in the prescribed format.
comparisons (DataFrame) – DataFrame specifying the second (comparison) collection of input TCRs. It must be in the prescribed format.
- Returns:
A 2D numpy ndarray representing a cdist matrix between TCRs from anchors and comparisons. The returned array will have shape \((X, Y)\) where \(X\) is the number of TCRs in anchors and \(Y\) is the number of TCRs in comparisons.
- Return type:
ndarray
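The orientation of the returned matrix (rows index anchors, columns index comparisons) can be illustrated with a small sketch. It does not call SCEPTR: the helper `nearest_anchor` is hypothetical, and the toy values stand in for a real \((X, Y)\) result. It assumes only that the input can be iterated row by row, which holds for both nested lists and 2D numpy arrays.

```python
def nearest_anchor(cdist_matrix):
    """For each comparison TCR (column), return the row index of the closest anchor TCR.

    Hypothetical helper for illustration; not part of the sceptr API.
    """
    rows = [list(row) for row in cdist_matrix]  # one row per anchor
    n_anchors = len(rows)
    n_comparisons = len(rows[0])                # one column per comparison
    return [
        min(range(n_anchors), key=lambda x: rows[x][y])
        for y in range(n_comparisons)
    ]


# Made-up distances with shape (X=2 anchors, Y=3 comparisons).
toy = [
    [0.1, 0.9, 0.5],
    [0.8, 0.2, 0.4],
]
result = nearest_anchor(toy)
print(result)
```

The result has one entry per comparison TCR, so its length equals \(Y\), the second dimension of the matrix.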
- sceptr.calc_pdist_vector(instances: DataFrame) → ndarray
Generate a pdist vector of distances between each pair of TCRs in the input data.
- Parameters:
instances (DataFrame) – DataFrame specifying the input TCRs. It must be in the prescribed format.
- Returns:
A 1D numpy ndarray representing a pdist vector of distances between each pair of TCRs in instances. The returned array will have shape \((\frac{1}{2}N(N-1),)\), where \(N\) is the number of TCRs in instances.
- Return type:
ndarray
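To look up the distance for one specific pair, the position of that pair in the vector is needed. The sketch below assumes the vector follows the condensed ordering used by SciPy's `pdist`, that is (0,1), (0,2), ..., (0,N-1), (1,2), and so on. The name "pdist vector" suggests this, but it is an assumption rather than something stated here. Both helper functions are hypothetical and not part of the sceptr API.

```python
def condensed_length(n: int) -> int:
    """Number of unordered pairs among n TCRs: n(n-1)/2."""
    return n * (n - 1) // 2


def condensed_index(i: int, j: int, n: int) -> int:
    """Position of the pair (i, j) in a condensed distance vector.

    Assumes SciPy-style pdist ordering (an assumption, see the text above).
    """
    if not (0 <= i < n and 0 <= j < n):
        raise IndexError("i and j must lie in range(n)")
    if i == j:
        raise ValueError("a condensed vector has no entry for i == j")
    if i > j:
        i, j = j, i  # distances are symmetric, so order the pair
    return n * i - i * (i + 1) // 2 + (j - i - 1)


# With N = 4 TCRs the vector has 6 entries; enumerate all pair positions.
n = 4
positions = [condensed_index(i, j, n) for i in range(n) for j in range(i + 1, n)]
print(condensed_length(n), positions)
```

If SciPy is available, `scipy.spatial.distance.squareform` converts such a vector into a full \((N, N)\) matrix, which may be more convenient when many lookups are needed.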
- sceptr.calc_residue_representations(instances: DataFrame) → ResidueRepresentations
Map each TCR to a set of amino acid residue-level representations. The residue-level representations are the output of the penultimate self-attention layer, as also used by the average_pooling() variant when generating receptor-level representations.
- Parameters:
instances (DataFrame) – DataFrame specifying the input TCRs. It must be in the prescribed format.
- Returns:
An array of representation vectors for each amino acid residue in the tokenised forms of the input TCRs. For details on how to interpret or use this output, please refer to the documentation for ResidueRepresentations.
- Return type:
ResidueRepresentations
- sceptr.calc_vector_representations(instances: DataFrame) → ndarray
Map TCRs to their corresponding vector representations.
- Parameters:
instances (DataFrame) – DataFrame specifying the input TCRs. It must be in the prescribed format.
- Returns:
A 2D numpy ndarray object where every row vector corresponds to a row in instances. The returned array will have shape \((N, 64)\) where \(N\) is the number of TCRs in instances.
- Return type:
ndarray