Usage

Model variants

The sceptr.variant submodule gives users access to a variety of non-default SCEPTR model variants for TCR analysis. The submodule exposes functions that return Sceptr objects with the model state of the chosen variant loaded. These model instances expose the same functions as the functional API, so you can just plug and play. For example:

>>> from sceptr import variant
>>> sceptr_tiny = variant.tiny()
>>> tiny_reps = sceptr_tiny.calc_vector_representations(tcrs)
>>> print(tiny_reps.shape)
(4, 16)

Hardware acceleration / device selection

By default, SCEPTR detects any available devices with hardware-acceleration capabilities and automatically loads models onto them. Currently, CUDA-enabled devices are supported; MPS support is pending better upstream PyTorch support. To explicitly limit SCEPTR to the CPU, call sceptr.disable_hardware_acceleration().
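As a rough sketch of how the CPU-only switch might be wired into a script: the only SCEPTR call used here is sceptr.disable_hardware_acceleration(), as named above. The SCEPTR_FORCE_CPU environment variable and the cpu_only_requested helper are hypothetical names invented for this illustration and are not part of SCEPTR. Making the call before any representations are computed is likewise an assumption about ordering.

```python
import importlib.util
import os


def cpu_only_requested(environ=os.environ):
    """Hypothetical switch (not a SCEPTR feature): SCEPTR_FORCE_CPU=1 asks for CPU-only execution."""
    return environ.get("SCEPTR_FORCE_CPU", "0") == "1"


# The find_spec guard lets this snippet run even where sceptr is not installed.
if cpu_only_requested() and importlib.util.find_spec("sceptr") is not None:
    import sceptr

    # Documented call: limit SCEPTR to the CPU.
    sceptr.disable_hardware_acceleration()
```

With this arrangement, the same script can run unchanged on a GPU workstation and on a CPU-only machine, with the choice made at launch time.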

Mus musculus support (Experimental)

Caution

This is an experimental feature!

You can now use SCEPTR to generate representations of Mus musculus TCRs. See sceptr.setup() for more details.

Prescribed data format

SCEPTR expects to receive TCR data as pandas DataFrames that comply with the pyrepseq standard format. All TCR data should be represented as a DataFrame with the following structure and data types. The column order is irrelevant. Each row should represent one TCR. Incomplete rows are allowed (e.g. only beta chain data available) as long as the SCEPTR variant being used has at least some partial information to go on. Extra columns are also allowed.

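===========  ===============  =====================================================================================================
Column name  Column datatype  Column contents
===========  ===============  =====================================================================================================
TRAV         str              IMGT symbol for the alpha chain V gene
CDR3A        str              Amino acid sequence of the alpha chain CDR3, including the first C and last W/F residues, in all caps
TRBV         str              IMGT symbol for the beta chain V gene
CDR3B        str              Amino acid sequence of the beta chain CDR3, including the first C and last W/F residues, in all caps
===========  ===============  =====================================================================================================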
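
To make the format above concrete, here is a minimal sketch that builds a small DataFrame with the four columns and runs a light sanity check on it. The gene symbols and CDR3 sequences are example values only. The find_format_problems helper is hypothetical (it is not part of SCEPTR or pyrepseq), and two choices in it are assumptions of this sketch: missing values are represented with None, and all four columns are expected to be present even when a chain is missing.

```python
import re

import pandas as pd

# The four columns named in the table above.
TCR_COLUMNS = ["TRAV", "CDR3A", "TRBV", "CDR3B"]

# Upper-case letters only, first residue C, last residue W or F.
CDR3_PATTERN = re.compile(r"C[A-Z]*[WF]")

# Example values only. The third row carries beta chain data alone; using
# None for the missing alpha chain fields is an assumption of this sketch.
tcrs = pd.DataFrame(
    {
        "TRAV": ["TRAV38-1*01", "TRAV3*01", None],
        "CDR3A": ["CAHRSAGGGTSYGKLTF", "CAVDNARLMF", None],
        "TRBV": ["TRBV2*01", "TRBV25-1*01", "TRBV9*01"],
        "CDR3B": ["CASSEFQGDNEQFF", "CASSDGSFNEQFF", "CASSVGDLLTGELFF"],
    }
)


def find_format_problems(df):
    """Hypothetical helper: list ways df departs from the format above."""
    absent = [col for col in TCR_COLUMNS if col not in df.columns]
    if absent:
        return [f"missing columns: {absent}"]

    problems = []
    for col in ("CDR3A", "CDR3B"):
        # Missing values are skipped, since incomplete rows are allowed.
        for row, seq in df[col].dropna().items():
            if not CDR3_PATTERN.fullmatch(str(seq)):
                problems.append(f"row {row}: {col}={seq!r} is malformed")

    # A row with nothing in any of the four columns carries no TCR data.
    empty = df[TCR_COLUMNS].isna().all(axis=1)
    problems.extend(f"row {row}: no TCR data" for row in df.index[empty])
    return problems


print(find_format_problems(tcrs))  # prints []
```

The check is deliberately loose: it does not verify that the V gene symbols are valid IMGT names, only that the CDR3 strings follow the C ... W/F, all-caps convention from the table and that no row is completely empty.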