Predictive Markers Require Thorough Analytic Validation
· 2019
· Open Access
· DOI: https://doi.org/10.5858/arpa.2019-0112-le
· OA: W2963816221
To the Editor: The recent editorial by Erik Thunnissen, “How to Validate Predictive Immunohistochemistry Testing in Pathology? A Practical Approach Exploiting the Heterogeneity of Programmed Death Ligand-1 Present in Non–Small Cell Lung Cancer,”1 propagates misunderstandings regarding immunohistochemistry (IHC) assay validation that could be deleterious to accurate predictive marker IHC test development, especially given the emerging impact of programmed death ligand-1 (PD-L1) testing.

Thunnissen considers whether an immunohistochemical slide with heterogeneous staining might be considered a “composite of several hundreds or thousands of analytes.” We disagree with this premise. First, if 2 assays were to be compared using serial sections cut from the same block, it would be impossible to directly compare cell-to-cell results, even with sophisticated image analysis capabilities. We consider one slide or case to be one analyte for validation purposes. Furthermore, one cannot assume that heterogeneity is due only to variable protein expression in a given area of tissue, particularly in “critical samples, which have an epitope concentration close to the threshold of the validated assay.”1,2 Besides the stated differences in protein expression, these “critical samples” are more likely to yield heterogeneous results for multiple reasons, including (but not limited to) uneven chemical exposures during fixation and processing, small variations in the thickness of cut sections (even between adjacent sections from the same ribbon), uncontrolled variations in ambient conditions between runs, the effects of storage on unstained sections, and the tendency for most automated platforms to occasionally leave small understained or overstained zones relative to immediately adjacent fields due to local variations in reagent delivery or rinse.
Because the result is averaged over the tissue section as a whole, the testing of multiple cases from an internal archive helps to mitigate the impact of these variables.

Thunnissen states, “In lung cancer, the only compendium diagnostic test is the 22C3 assay from Agilent, coupled to pembrolizumab.” However, there are 3 US Food and Drug Administration (FDA)–approved companion-complementary assays for non–small cell lung cancer using the antibodies 22C3, 28-8, and SP142, with SP263 FDA-approved for other indications.3–6

We would also like to emphasize that the proposed comparator assay used for validation of a predictive marker laboratory-developed test (LDT) must itself be validated. Thus, it would not be acceptable to purchase the FDA-approved 22C3 assay “kit,” run it out of the box on a set of cases, and then use the resultant slides to validate an LDT. The performance of the “kit” assay would first have to be verified in a given laboratory. In the setting of validation of a new predictive IHC assay, such as PD-L1, it might be most judicious to compare “the new test's results with the results of testing the same tissue validation set in another laboratory using a validated assay.”7 Unfortunately, for PD-L1, comparator molecular assays are not available as they are for ERBB2 (HER2) amplification, ALK translocation, and BRAF mutation, for example. This comparison needs to be executed and evaluated whether the new IHC assay is an LDT or an unmodified “compendium” diagnostic kit.

Titration of an antibody is part of assay optimization and is traditionally performed prior to validation, rather than after staining 20 or more cases. Optimization is generally performed on a small number of cases and/or designated control tissues, followed by full validation to total 20+/20− for predictive markers. A subset of the validation cohort could be tested as an intermediate step, to include a range of expression levels.
Reoptimization is performed if the new assay does not show satisfactory results. On modern automated platforms, changes in staining characteristics can be accomplished not only by increasing or decreasing antibody concentration, but also by altering antibody incubation time, antigen retrieval time, heat, etc. However, changing parameters beyond the manufacturer's specifications converts a “kit” assay into an LDT (laboratory-modified test).

We would like to re-emphasize that the College of American Pathologists (CAP) guideline on IHC validation7 indeed refers to analytic validation, not clinical validation, and it provides reasonably explicit instructions on validating predictive markers, whether LDT or not. Initial assay validation does not ensure that an LDT will perform appropriately across all intended uses; revalidation may be necessary as test conditions change and new applications emerge (eg, PD-L1 in cancers other than lung).8 Although the guidelines also confer latitude to laboratory directors in tailoring validation plans to their practice, we think that Thunnissen's approach to PD-L1 LDT assay validation would be insufficient. Although guidelines for analytic validation of complementary/companion diagnostic tests do not currently exist, a reasonable approach might be to follow the published CAP guidelines for predictive markers (90% concordance over a validation set of 40 total cases: 20 negative, 20 positive)7 until further data emerge.
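
As an illustration of the arithmetic behind the 90% concordance figure, the following minimal Python sketch tallies agreement between a new assay and a validated comparator on a 20-positive/20-negative set, treating each case as a single positive or negative result. It rests on assumptions that are not part of the guideline text: concordance is read here as overall percent agreement, and the function names (concordance, meets_threshold) and the example data are hypothetical.

```python
from typing import Dict, Sequence


def concordance(new_assay: Sequence[bool], comparator: Sequence[bool]) -> Dict[str, float]:
    """Agreement between per-case calls (True = positive) of two assays.

    Assumption: one call per case, with the comparator taken as the reference.
    """
    if len(new_assay) != len(comparator) or len(comparator) == 0:
        raise ValueError("both assays must report the same nonzero number of cases")

    pairs = list(zip(new_assay, comparator))
    n_pos = sum(1 for _, ref in pairs if ref)           # comparator-positive cases
    n_neg = len(pairs) - n_pos                          # comparator-negative cases
    agree_pos = sum(1 for new, ref in pairs if ref and new)
    agree_neg = sum(1 for new, ref in pairs if not ref and not new)

    return {
        "overall": (agree_pos + agree_neg) / len(pairs),
        "positive": agree_pos / n_pos if n_pos else float("nan"),
        "negative": agree_neg / n_neg if n_neg else float("nan"),
    }


def meets_threshold(result: Dict[str, float], threshold: float = 0.90) -> bool:
    """Assumption: the 90% figure is applied to overall agreement."""
    return result["overall"] >= threshold


# Hypothetical validation set: 20 comparator-positive and 20 comparator-negative cases.
comparator_calls = [True] * 20 + [False] * 20
# Hypothetical new-assay calls: 2 positives missed, 1 negative called positive.
new_calls = [True] * 18 + [False] * 2 + [True] * 1 + [False] * 19

result = concordance(new_calls, comparator_calls)
passed = meets_threshold(result)
```

In this made-up example, 37 of 40 cases agree, so the set would clear a 90% overall threshold; reporting positive and negative agreement separately shows where the discordant cases fall.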