Interestingly, the rapid pace of clinical trial enrollment and of accelerated approvals by regulatory agencies has left many unresolved issues to explore in the next wave of immuno-oncology trials. Specifically, relevant unanswered questions concern the optimal study design, endpoints and statistical methods for evaluating immunotherapeutic drugs, the appropriate radiological assessment of antitumor responses, the development of predictive biomarkers and the harmonization of the assays used to test these biomarkers in large patient populations. Most of these issues are related to the intrinsic mechanism of action and kinetics of immune checkpoint inhibitors (ICI). Unlike chemotherapy and targeted agents, ICI induce a continuum of biological events that starts early with immune system activation and continues until a (sometimes) delayed clinical benefit is obtained. This peculiar feature should be carefully considered when designing clinical trials with ICI, and innovative study methodologies should be applied to appropriately assess the delayed effect of immunotherapeutic agents in terms of responses and survival benefit.
Although it is commonly believed that survival is the main endpoint for regulatory agency approvals, 15 (60%) of 25 FDA approvals for ICI were based on ORR as the primary endpoint (2). Interestingly, in some patients treated with ICI, initial disease progression assessed by conventional tumor response criteria, such as WHO criteria (17) or RECIST 1.1 (3), may be followed by prolonged clinical stabilization or partial/complete responses. This phenomenon, defined as pseudoprogression, is caused by T-cell tumor infiltration as a result of immune activation and has been described both with anti-PD-1/PD-L1 and anti-CTLA-4 agents in advanced melanoma patients (18,19) and with PD-1/PD-L1 inhibitors in advanced renal cell carcinoma (RCC) (20) and NSCLC patients (5,21,22). The emergence of pseudoprogression and dissociated responses to ICI led to the development, in 2009, of the immune-related response criteria (irRC) (7). The key differences compared to RECIST criteria were the introduction of bidimensional measurements (sum of products of the two largest perpendicular diameters), the inclusion of new lesions [usually classified as progressive disease (PD) according to RECIST 1.1] in the total tumor burden and the requirement that PD be confirmed on two consecutive scans at least 4 weeks apart. Subsequently, unidimensional irRC (irRECIST), which used the longest diameter measurements as in RECIST, demonstrated high concordance with bidimensional irRC while bypassing the methodological issues linked to the use of bidimensional measurements (8). Finally, the RECIST working group has recently developed a guideline for the use of a modified RECIST (named iRECIST) in order to establish a common framework for the management of data from clinical trials with ICI (9). Like irRECIST, iRECIST introduced the concept of immune unconfirmed PD (iUPD), which allows the bar to be reset if RECIST progression is followed at the next assessment by tumor shrinkage.
Basically, the main difference between irRECIST and iRECIST concerns new lesions, which are incorporated into the sum of target lesions in irRECIST but are recorded separately in iRECIST. However, high concordance between irRECIST and iRECIST has recently been reported in a retrospective study including advanced NSCLC patients treated with anti-PD-1/PD-L1 agents. Interestingly, irRECIST and iRECIST disagreed for only 4% of NSCLC patients, in whom the iRECIST interpretation as iUPD led to unnecessary continuation of immunotherapy (5). To date, few clinical trials have used irRC/irRECIST as secondary response endpoints (23-25) and none has used iRECIST as the response criteria to define its endpoints. Therefore, the regulatory agencies continue to base the approvals of new ICI on RECIST 1.1-defined outcomes. In the future, the integration and validation of irRECIST/iRECIST in clinical trials will be of paramount importance in order to provide immuno-oncologists with a practical and reliable tool to face the dilemma of whether and when to continue immunotherapy beyond progression.
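The difference between bidimensional and unidimensional tumor burden described above can be illustrated with a minimal Python sketch. The `Lesion` class, the function names and the lesion sizes are hypothetical and chosen only for illustration; the sketch does not implement the full irRC, irRECIST or iRECIST rule sets (response thresholds, confirmation scans and lesion selection are omitted).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Lesion:
    longest_mm: float        # longest diameter of the lesion
    perpendicular_mm: float  # longest diameter perpendicular to it
    is_new: bool = False     # True if the lesion appeared after baseline


def bidimensional_burden(lesions: List[Lesion]) -> float:
    """irRC-style burden: sum of the products of the two perpendicular
    diameters, with new lesions added to the total."""
    return sum(lesion.longest_mm * lesion.perpendicular_mm for lesion in lesions)


def unidimensional_burden(lesions: List[Lesion]) -> float:
    """irRECIST-style burden: sum of the longest diameters, with new
    lesions added to the total."""
    return sum(lesion.longest_mm for lesion in lesions)


def percent_change(baseline: float, follow_up: float) -> float:
    """Relative change in tumor burden from baseline, in percent."""
    return 100.0 * (follow_up - baseline) / baseline


# Hypothetical patient: two target lesions shrink, one new lesion appears.
baseline_scan = [Lesion(40, 30), Lesion(20, 15)]
follow_up_scan = [Lesion(30, 20), Lesion(15, 10), Lesion(10, 8, is_new=True)]

bi_change = percent_change(bidimensional_burden(baseline_scan),
                           bidimensional_burden(follow_up_scan))
uni_change = percent_change(unidimensional_burden(baseline_scan),
                            unidimensional_burden(follow_up_scan))
print(f"bidimensional change: {bi_change:.1f}%")
print(f"unidimensional change: {uni_change:.1f}%")
```

In this made-up example the total burden decreases under both measurements (from 1500 to 830 mm² and from 60 to 55 mm) even though a new lesion has appeared, whereas the new lesion alone would usually be classified as PD under RECIST 1.1.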
Another emerging challenge for immunotherapy trials is the evaluation of accelerated tumor growth under ICI, a phenomenon known as hyperprogression (HPD) and recently described in 9% of advanced cancer patients (26), in 29% of patients with head and neck cancers (27) and in 14% of NSCLC patients treated with ICI (28). Although each study used a different methodology to assess HPD, all of them highlighted the importance of measuring tumor growth speed on consecutive computed tomography (CT) scans, before the start of and during immunotherapy. Retrospective evaluation of HPD in published randomized studies is difficult because CT scan data from before the start of immunotherapy are usually not captured. Therefore, a prospective assessment of HPD in adequately designed clinical trials, which collect CT scans before and during ICI and adopt innovative radiologic tools to quantify tumor kinetics and dynamics over time, will provide confirmatory evidence regarding this rapid and atypical phenomenon. Finally, the use of ORR as a surrogate endpoint for OS in trials with ICI remains an unresolved question. A meta-regression analysis of seventeen randomized trials testing ICI showed a weak but statistically significant correlation between the treatment effect on ORR and the treatment effect on survival outcomes (i.e., OS and PFS) and suggested that the activity of ICI in terms of ORR explains 50% of the effects detected in survival (29). Conversely, a systematic review of ten clinical trials evaluating PD-1/PD-L1 inhibitors in advanced NSCLC failed to show a significant correlation between response and survival (30). Considering that ICI activity potentially leads to prolonged disease stabilization and/or unconventional responses, it is likely that disease control rate (DCR), including both responses and tumor stabilization for at least 6 months of treatment (clinical benefit), may be a more clinically relevant surrogate endpoint for survival than ORR.
The potential future validation of ORR or clinical benefit as surrogate endpoints for survival may allow an earlier analysis of trial data, making studies shorter and less expensive and, most of all, allowing patients with progressive disease to be rapidly directed towards other treatments.
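The comparison of tumor growth speed before and during immunotherapy, mentioned above for HPD, can be sketched in Python as follows. This is only one possible formulation under stated assumptions: the growth rate is taken as the change in the sum of longest diameters per month, and the cut-off `min_ratio` is a hypothetical parameter, since the studies cited above each used a different methodology.

```python
def growth_rate(sum_start_mm: float, sum_end_mm: float, months: float) -> float:
    """Change in tumor burden (sum of longest diameters) per month
    between two CT scans."""
    if months <= 0:
        raise ValueError("the two scans must be separated by a positive interval")
    return (sum_end_mm - sum_start_mm) / months


def flags_possible_hpd(rate_before: float, rate_during: float,
                       min_ratio: float = 2.0) -> bool:
    """Flag accelerated growth on treatment.

    min_ratio is a hypothetical cut-off used for illustration only.
    Tumors that were not growing before treatment are not flagged by
    this simple rule.
    """
    return rate_before > 0 and rate_during >= min_ratio * rate_before


# Hypothetical patient with three consecutive CT scans.
pre_baseline_mm = 50.0   # scan 2 months before the start of ICI
baseline_mm = 56.0       # scan at the start of ICI
on_treatment_mm = 74.0   # first scan, 2 months after the start of ICI

rate_before = growth_rate(pre_baseline_mm, baseline_mm, months=2)
rate_during = growth_rate(baseline_mm, on_treatment_mm, months=2)
print(rate_before, rate_during, flags_possible_hpd(rate_before, rate_during))
```

The sketch makes explicit why the pre-treatment scan is needed: without `pre_baseline_mm` the reference growth rate cannot be computed, which is the limitation of retrospective analyses noted above.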
Traditionally, OS is considered the gold standard among efficacy endpoints in clinical trials and median OS is often quoted as the primary or secondary endpoint of interest. However, median OS may not be the best endpoint for therapies with potential long-term benefit. This observation was first reported in clinical trials evaluating cancer vaccines, such as the phase III study comparing sipuleucel-T, an autologous active cellular immunotherapy, to placebo in advanced prostate cancer patients, where the effect on survival was not evident for the first 8 months of treatment (44). Similarly, phase III trials of CTLA-4 or PD-1/PD-L1 inhibitors in advanced melanoma (45) and NSCLC (38,42) patients showed delayed separation of the survival curves, or even a crossing of the curves with an initially better survival outcome for chemotherapy compared to ICI, as observed in Checkmate 057, a phase III trial comparing nivolumab versus docetaxel in pretreated non-squamous NSCLC patients (38). Recently, an update of the phase I CA209-003 trial testing nivolumab in 129 previously treated NSCLC patients showed that 5-year OS was 16% for squamous and 15% for non-squamous patients (46); however, the median OS of 9.9 months did not adequately reflect the durable benefit demonstrated by the plateaus in the tails of the survival curves. Figure 1 shows the hypothetical survival curve of a treatment (i.e., immunotherapy) that leads to long-term survival in a small proportion of patients (green line) compared to a standard therapy, potentially a cytotoxic agent (red line), not associated with a prolonged survival benefit. Median OS, calculated as the time point after initiation of the treatment at which 50% of patients are still alive, clearly does not provide any information concerning the small proportion of patients who occupy the tail of the curves (cure fraction).
Therefore, median OS neither distinguishes the proportions of patients alive or dead after 50% of patients have died nor reflects the survival time of the patients who are alive after the median OS is reached. In addition, the delayed clinical effect observed with ICI leads to a loss of statistical power if the trial is designed on the basis of the conventional proportional hazards assumption (12). In the presence of a delayed treatment effect, the HR is equal to 1 in the first part of the curves (early HR) and differs from 1 only after the separation of the curves (delayed HR), which violates the proportional hazards assumption. To demonstrate a statistically significant difference in OS, the difference (delta) between these two HRs should be large, because the HR after the separation of the curves must compensate for the lack of separation during the first months of treatment (47) (Figure 1). However, the number of events required also increases, and the study risks being underpowered. In this regard, a recent report by the Institute for Clinical and Economic Review (ICER) highlighted the difficulty of using a proportional hazards model in studies evaluating ICI in advanced NSCLC patients (48). In particular, the ICER analysis stated that the existence of two populations in the immuno-oncology arms of the trials, a majority who do not respond to ICI and have a high hazard of death and a minority with sustained responses and a low hazard for progression and mortality, makes it difficult to use proportional hazards models for survival analysis. Notably, the survival curve statistic that optimally captures the benefit offered by a particular therapy can differ according to the class of drugs or the clinical context (35).
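The early and delayed HRs discussed above can be made concrete with a small Python sketch. It assumes, purely for illustration, an exponential control arm and a piecewise-constant HR in the experimental arm; the function names and all numerical values (monthly hazard, 4-month delay, delayed HR of 0.5) are hypothetical and are not taken from any of the cited trials.

```python
import math


def control_survival(t: float, hazard: float) -> float:
    """Survival probability at time t under a constant hazard."""
    return math.exp(-hazard * t)


def delayed_effect_survival(t: float, hazard: float, delay: float,
                            late_hr: float) -> float:
    """Survival in the experimental arm when the HR versus control is 1
    up to `delay` (early HR) and `late_hr` afterwards (delayed HR)."""
    if t <= delay:
        return math.exp(-hazard * t)
    cumulative_hazard = hazard * delay + late_hr * hazard * (t - delay)
    return math.exp(-cumulative_hazard)


# Hypothetical scenario: control median OS of 10 months, no treatment
# effect for the first 4 months, HR of 0.5 afterwards.
monthly_hazard = math.log(2) / 10
for month in (3, 12, 24):
    s_control = control_survival(month, monthly_hazard)
    s_treated = delayed_effect_survival(month, monthly_hazard,
                                        delay=4, late_hr=0.5)
    print(f"month {month}: control {s_control:.3f}, treated {s_treated:.3f}")
```

In this scenario the two curves are identical at 3 months and separate only later (roughly 19% versus 38% alive at 24 months), so deaths occurring during the first 4 months carry no information about the treatment effect. This is the intuition behind the loss of power of tests that assume a constant HR.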
As an example, traditional statistical methods (log-rank test and Cox model) and survival measures (median OS and HRs) can usually be applied to drugs that start to work early (the OS curves separate from the beginning) and continue to be more active than the control arm for as long as the treatment is administered, under the assumption that anything affecting the hazard does so by the same ratio at all times (Figure 2A). Median OS, but not HR, could be used for non-proportional hazards models without long-term survivors, as observed in trials evaluating targeted agents (Figure 2B). In fact, the initial large benefit driven by the targeted agent is entirely captured by the median survival; however, with the emergence of resistance this difference disappears and the survival curves cross at a certain time, so that the assumption of proportional hazards no longer applies. For drugs with delayed benefit that lead to prolonged survival in a relatively small subset of patients, following a non-proportional hazards model (Figure 2C), neither median OS nor HR is appropriate, and alternative statistical methods and survival measures should be reported.
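Why median OS can miss a long-term survival plateau is easy to show numerically. The following Python sketch assumes an exponential control arm and a simple mixture cure model for the experimental arm, in which a fraction of patients (the cure fraction) is never at risk; this model and all parameter values are illustrative assumptions, not the method used in the cited studies.

```python
import math


def exponential_survival(t: float, hazard: float) -> float:
    """Survival at time t under a constant hazard (no long-term survivors)."""
    return math.exp(-hazard * t)


def mixture_cure_survival(t: float, cure_fraction: float,
                          hazard_uncured: float) -> float:
    """Survival at time t when `cure_fraction` of patients are long-term
    survivors and the rest follow an exponential distribution."""
    return cure_fraction + (1 - cure_fraction) * math.exp(-hazard_uncured * t)


def exponential_median(hazard: float) -> float:
    """Time at which exponential survival equals 50%."""
    return math.log(2) / hazard


def mixture_cure_median(cure_fraction: float, hazard_uncured: float) -> float:
    """Time at which mixture cure survival equals 50%."""
    if cure_fraction >= 0.5:
        raise ValueError("median is not reached when the cure fraction is >= 50%")
    return -math.log((0.5 - cure_fraction) / (1 - cure_fraction)) / hazard_uncured


# Hypothetical arms tuned to share the same median OS of 10 months.
control_hazard = math.log(2) / 10
cure_fraction = 0.20
uncured_hazard = -math.log(0.375) / 10

print("median OS, control:", round(exponential_median(control_hazard), 2))
print("median OS, experimental:",
      round(mixture_cure_median(cure_fraction, uncured_hazard), 2))
print("5-year OS, control:",
      round(exponential_survival(60, control_hazard), 3))
print("5-year OS, experimental:",
      round(mixture_cure_survival(60, cure_fraction, uncured_hazard), 3))
```

Both hypothetical arms have a median OS of 10 months, yet about 20% of patients are alive at 5 years in the experimental arm against fewer than 2% in the control arm. A landmark survival rate reveals the tail that the median cannot show.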