Surprising Findings That Should Shape the Manuscript

A systematic search does not just confirm what you expected to find -- it surfaces patterns, absences, and contradictions that reshape the argument.

The following findings emerged from the search process itself and should influence the structure, emphasis, and argumentation of the manuscript.

0. The Feature Selection Choice Determines Everything

The single most surprising methodological finding: across 25 multi-omics lung cancer studies, the downstream results -- including the number of subtypes, their biological interpretation, and their clinical utility -- are determined more by the feature selection strategy than by the integration algorithm. Studies that use Cox-based supervised filtering invariably find 2 subtypes; studies that use variance-based unsupervised filtering find 4-5. The integration method (MOVICS, iCluster, SNF) is secondary. Yet feature selection methodology is rarely discussed, justified, or compared within individual studies.
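To make the contrast concrete, the following is a minimal sketch (not the pipeline used by any of the reviewed studies) of how the two filtering philosophies can hand different features to the same downstream clustering step. The data are hand-made toy values, the gene names are hypothetical, and the supervised filter uses a univariate Cox score test as a simple stand-in for the Wald or likelihood-ratio p-values that studies typically obtain from a fitted Cox model.

```python
import math
from statistics import pvariance


def cox_score_pvalue(x, time, event):
    """P-value of the score test for beta = 0 in a univariate Cox model.

    Ties are handled Breslow-style: each event uses the full risk set at its time.
    """
    u = 0.0  # score: sum over events of (x_i - risk-set mean)
    v = 0.0  # information: sum over events of the risk-set variance
    for i, (t_i, d_i) in enumerate(zip(time, event)):
        if not d_i:  # censored patients contribute only through risk sets
            continue
        risk = [x[j] for j in range(len(x)) if time[j] >= t_i]
        mean = sum(risk) / len(risk)
        u += x[i] - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    if v == 0.0:
        return 1.0
    # u^2 / v is chi-square (1 df) under the null; its tail is erfc(sqrt(stat / 2)).
    return math.erfc(math.sqrt(u * u / v / 2.0))


def select_by_cox(features, time, event, alpha=0.05):
    """Supervised filter: keep features associated with survival."""
    return {name for name, x in features.items()
            if cox_score_pvalue(x, time, event) < alpha}


def select_by_variance(features, top_k):
    """Unsupervised filter: keep the top_k most variable features."""
    ranked = sorted(features, key=lambda name: pvariance(features[name]), reverse=True)
    return set(ranked[:top_k])


# Toy cohort (assumed values): 8 patients, all with an observed event.
time = [1, 2, 3, 4, 5, 6, 7, 8]
event = [1] * 8
features = {
    "gene_A": [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1],  # low variance, tracks survival
    "gene_B": [10, -10, 10, -10, 10, -10, 10, -10],      # high variance, not prognostic
    "gene_C": [0.1, -0.1, 0.1, -0.1, 0.1, -0.1, 0.1, -0.1],  # low variance, not prognostic
}

by_cox = select_by_cox(features, time, event)
by_variance = select_by_variance(features, top_k=1)
```

In this toy cohort the two filters return disjoint feature sets: the Cox filter keeps only the survival-associated gene, while the variance filter keeps only the most variable one. Any integration algorithm run afterwards sees different inputs depending on which filter was applied.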

Furthermore, 72% of studies use the same framework (MOVICS), 88% use the same data (TCGA), and the majority apply the same feature selection (univariate Cox p<0.05). This represents a methodological monoculture that risks producing convergent but potentially biased conclusions.

Implication: This should be a central argument in the review, not a methods footnote. The manuscript should explicitly compare the two philosophies, demonstrate what biology is lost under each, and propose the hierarchical approach (biological clustering first, clinical stratification second) as a resolution. See Chapter 13.
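One way to read the hierarchical approach is sketched below. This is an illustrative assumption, not a published method: stage 1 labels are taken as given (from any unsupervised clustering), and stage 2 splits patients into high and low risk at the median risk score within each subtype. The patient IDs, subtype labels, and risk scores are hypothetical.

```python
from collections import defaultdict
from statistics import median


def hierarchical_strata(subtype, risk_score):
    """Stage 2 of a two-stage scheme: split each biological subtype by risk.

    subtype:    {patient: subtype label from unsupervised clustering (stage 1)}
    risk_score: {patient: clinical risk score, e.g. from a survival model}
    Returns     {patient: (subtype label, "high" or "low")}
    """
    by_subtype = defaultdict(list)
    for patient, label in subtype.items():
        by_subtype[label].append(patient)

    strata = {}
    for label, patients in by_subtype.items():
        # The cutoff is computed within the subtype, not across the whole cohort.
        cutoff = median(risk_score[p] for p in patients)
        for p in patients:
            strata[p] = (label, "high" if risk_score[p] > cutoff else "low")
    return strata


# Hypothetical inputs for illustration only.
subtype = {"pt1": "S1", "pt2": "S1", "pt3": "S1", "pt4": "S2", "pt5": "S2"}
risk_score = {"pt1": 0.2, "pt2": 0.9, "pt3": 0.5, "pt4": 1.5, "pt5": 1.1}

strata = hierarchical_strata(subtype, risk_score)
```

Because the cutoff is subtype-specific, pt5 is labelled low risk within S2 despite scoring higher than pt2, who is high risk within S1. The biological grouping is preserved and the clinical split is layered on top of it.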

1. The Field Is Converging

A total of 1,928 papers span three or more themes. The most cross-cutting papers (spanning 7-8 themes) were all published in 2024-2026, and their titles combine multi-omics, AI/ML, immunotherapy, and biomarker discovery within single studies. This pattern is unlikely to be a coincidence: it points to a genuine convergence in the field toward integrative, multi-modal approaches.
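The cross-theme count can be reproduced with a simple tally, sketched here under the assumption that each theme's corpus is available as a list of paper identifiers. The theme names and identifiers below are placeholders, not the actual search output.

```python
from collections import Counter


def theme_span_counts(theme_to_papers):
    """Return {paper_id: number of theme corpora that contain it}."""
    counts = Counter()
    for papers in theme_to_papers.values():
        counts.update(set(papers))  # set() so duplicates within a theme count once
    return counts


def cross_cutting(theme_to_papers, min_themes=3):
    """Sorted paper IDs that appear in at least min_themes theme corpora."""
    counts = theme_span_counts(theme_to_papers)
    return sorted(p for p, n in counts.items() if n >= min_themes)


# Placeholder corpora for illustration only.
themes = {
    "multi_omics": ["p1", "p2", "p3"],
    "ai_ml": ["p1", "p2"],
    "immunotherapy": ["p1", "p4"],
    "never_smokers": ["p2", "p1", "p1"],
}

spanning = cross_cutting(themes, min_themes=3)  # ["p1", "p2"]
```

Grouping the same counts by publication year would show whether the papers spanning the most themes cluster in recent years.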

Implication: The review's central thesis -- that multi-omics integration combined with AI/ML represents the future of lung cancer research -- is supported not just by the content of individual papers but by the structural pattern of the literature itself. Use this convergence trend as evidence in the introduction.

2. Never-Smoker Literature Is Thinner Than Expected

Although lung cancer in never-smokers accounts for 10-25% of global cases and is rising in incidence, the molecular characterization literature for this population is surprisingly thin compared to that for smoking-associated disease. Theme 06 collected only 2,689 papers (the second-smallest corpus), of which only 2 were rated Tier A and 76 Tier B. Its intersection with multi-omics (Theme 03) and spatial omics (Theme 12) is essentially empty.

Implication: The gap IS the finding. The relative scarcity of never-smoker research should be foregrounded as a call to action, not buried as a limitation. Frame the review's emphasis on never-smokers as addressing a critical blind spot.

3. PM2.5 Does Not Mutate -- It Promotes

The mechanistic insight from Martinez-Ruiz et al. [PMID: 37046093] that air pollution promotes EGFR-driven lung adenocarcinoma in never-smokers through inflammatory IL-1β signaling -- expanding pre-existing EGFR-mutant clones rather than causing new mutations -- represents a genuine paradigm shift. For decades, the default model of environmental carcinogenesis assumed mutagenesis. This promotional mechanism demands a reframing.

Implication: Dedicate a subsection to this finding. It has direct implications for prevention (anti-inflammatory interventions), screening (targeting individuals with chronic PM2.5 exposure), and molecular understanding (why never-smoker LUAD is EGFR-enriched in polluted regions).

4. Foundation Models Are Rewriting the Rules

Since 2023, pathology foundation models -- large-scale vision models pre-trained on millions of histopathology images -- have emerged as the fastest-growing subfield in AI/ML oncology. The current manuscript outline under-emphasizes this revolution. Models like those from Wang et al. [PMID: 39232164] and Yang et al. [PMID: 40064883] represent a shift from task-specific architectures to general-purpose visual encoders that can be fine-tuned for a wide range of pathology tasks.

Implication: Add a dedicated subsection on foundation models. This is where the field is heading, and a 2026 review that does not discuss them will read as dated.

5. Drug Repurposing Needs Reframing

The spec emphasized dimethyl fumarate (DMF) for KEAP1-mutant cancers, but the corresponding query produced only 13 results. Synthetic lethality approaches (targeting KRAS, STK11, and KEAP1 dependencies) have substantially more literature support. Long et al. [PMID: 36512628] demonstrated that PARP inhibition induces both synthetic lethality AND adaptive immunity in LKB1-mutant (STK11-mutant) lung cancer -- a dual mechanism with both therapeutic and immunological implications.

Implication: Broaden the drug repurposing section beyond DMF. Anchor it in synthetic lethality and pharmacogenomic database approaches (GDSC, CCLE, PRISM, DGIdb). DMF can be mentioned as one candidate among several.

6. SCLC Transformation Demands Mention

Even in an NSCLC-focused review, NSCLC-to-SCLC histologic transformation under EGFR TKI pressure is too clinically important to omit. It represents a well-documented mechanism of acquired resistance with distinct molecular underpinnings, and it connects the NSCLC and SCLC molecular subtyping literatures.

Implication: Include a brief discussion of lineage plasticity and histologic transformation as a resistance mechanism, even if the review's primary focus is NSCLC.

7. Liquid Biopsy Is Ready for Its Close-Up

ctDNA analysis and cfDNA methylation profiling are approaching clinical utility for early detection, treatment monitoring, and minimal residual disease assessment. Abbosh et al. [PMID: 37055640] and Black et al. [PMID: 41205598] represent cutting-edge applications. The current outline does not include a dedicated liquid biopsy section.

Implication: Consider adding a subsection on liquid biopsy technologies, particularly cfDNA methylation-based approaches that bridge epigenomics (Theme 08) with clinical diagnostics (Theme 12).

8. The Epigenetics Reproducibility Crisis Is Quantifiable

The 76 retracted papers in a single theme -- concentrated in miRNA and lncRNA prognostic signature studies -- are not noise. They are a finding. This pattern is consistent with known concerns about irreproducible non-coding RNA biomarker research and should be discussed in the manuscript.

Implication: Include a paragraph in the epigenetics section (or in a broader "Challenges and Limitations" section) discussing reproducibility concerns in ncRNA biomarker studies, citing the retraction pattern as quantitative evidence. Recommend pre-registered validation frameworks.
