MIST

MIST outperforms the standard projection layer in 93.75% of model-task configurations, with an average gain of 3.5%. Because it modifies neither the encoder nor the aggregator, MIST is reusable across any frozen foundation model and composable with any MIL aggregator. Its net parameter increase is less than 0.09M for seven of the eight evaluated architectures. MIST places molecular supervision at the projection layer rather than in the encoder, using paired spatial transcriptomics data only during training to construct molecular prototypes; at inference time, it requires no transcriptomics data. It uses sigmoid rather than softmax affinities so that a patch can activate multiple prototypes simultaneously, reflecting mixed cellular programs. Its residual formulation preserves the original morphological patch signal regardless of prototype alignment, analogous to how staining leaves the underlying tissue accessible.
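The sigmoid affinities and the residual formulation can be illustrated with a minimal NumPy sketch. This is not the MIST implementation. The function name `prototype_residual_projection`, the dot-product affinity, and the choice to add an affinity-weighted sum of prototypes back to the projected feature are all assumptions made for illustration, since the summary does not specify the exact layer.

```python
import numpy as np


def sigmoid(x):
    """Elementwise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def softmax(x):
    """Row-wise softmax, shown only for comparison with sigmoid."""
    z = x - x.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def prototype_residual_projection(features, w_proj, prototypes):
    """Hypothetical prototype-augmented projection layer (assumed form).

    features:   (n_patches, d_in) outputs of a frozen encoder
    w_proj:     (d_in, d) standard linear projection
    prototypes: (k, d) molecular prototypes, fixed after training
    """
    h = features @ w_proj                   # standard projection
    logits = h @ prototypes.T               # assumed dot-product affinity
    affinities = sigmoid(logits)            # each prototype scored independently
    enriched = h + affinities @ prototypes  # residual: h is always retained
    return enriched, affinities


# Toy input: one patch that matches two prototypes equally well.
features = np.array([[2.0, 2.0]])
w_proj = np.eye(2)
prototypes = np.array([[1.0, 0.0], [0.0, 1.0]])

enriched, affinities = prototype_residual_projection(features, w_proj, prototypes)
softmax_affinities = softmax((features @ w_proj) @ prototypes.T)
```

In this toy case both sigmoid affinities are well above 0.5, so the patch activates both prototypes at once, whereas the softmax affinities must share a total mass of 1 across prototypes. If the prototypes are set to zero, the output reduces to the plain projection `h`, which is the sense in which a residual form keeps the morphological signal available. No transcriptomics input appears in the forward pass, consistent with the claim that such data is needed only during training.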