
The "Nodule Map Startup" project for lung cancer diagnostics or therapeutics
A biomedical engineering deep dive into “Big data”, Artificial Intelligence, and machine learning. Discover a theoretical, creative, and scientific model approach to lung cancer diagnostics or therapeutics, as it pertains to the incidence of lung cancer in the general public.
Winnie Sung
Sep 26, 2023
Summary
NoduleMap is a theoretical company focused on creating Artificial Intelligence (AI) and Machine Learning (ML) tools for lung cancer diagnosis. The overarching mission of the company is to enable accurate, early diagnosis of lung cancer so that appropriate interventions can be provided to patients and their survival rate increased. Its AI tool, the Nodule Mapper, supports imaging analysis and clinical decision-making, and NoduleMap’s vision is to integrate it into the clinical workflow. Ultimately, the Nodule Mapper aims to act as a secondary evaluation of patient chest scans, detecting and classifying potentially malignant lung nodules in support of the work of radiologists and pulmonologists. An important feature of the Nodule Mapper will be the explainability of its decision-making when analyzing image features. This will give medical professionals an easy way to understand the decision-making process of the AI tool and to better integrate it as a second opinion in their evaluations. NoduleMap’s designs are ready to move into progressively larger-scale research and development, with the goal of building an accurate tool that is validated by medical professionals and by data from diverse patient pools. Through commercialization and integration into hospital settings, specifically radiology and pulmonary departments, NoduleMap hopes to redefine early lung cancer diagnosis.
Significance
Lung cancer persists as the leading cause of cancer-related deaths. Across the globe, there are approximately two million new cases of lung cancer each year [1]. In the US, the five-year survival rate for patients with lung cancer is only 18%. The low survival rate is partially attributed to late-stage diagnosis, at which point treatments are less effective. The American Cancer Society (ACS) states that detection and appropriate treatment of lung cancer in its early stage can improve the patient survival rate to as much as 90%. Developing screening modalities that can accurately identify and classify lung cancer in its early stage is therefore very important.
Another burden created by inaccurate lung cancer screening falls on patients who receive false positive results. In this case, patients face unnecessary prolonged testing and follow-up, such as biopsies, which have associated risks and financial costs. Around 30% of biopsies performed on suspected malignant tumors turn out to be benign. As a result, patients and their families also face psychological impact from the possibility of being diagnosed with lung cancer. With a tool that can enhance decision-making about the malignancy of detected nodules, fewer patients will face the impact of false positive lung cancer diagnosis.

Background
There are two main forms of lung cancer. While non-small cell lung cancer (NSCLC) is more common, small cell lung cancer (SCLC) has a higher growth rate and is more difficult to treat [2]. NSCLC is classified into stages 0 to 4 based on tumor size, location, and its effects on the lungs. Meanwhile, SCLC has two stages: the limited stage, in which the cancer is confined to one area of the chest, and the extensive stage, in which the cancer has spread throughout the lungs and metastasized to the lymph nodes and other locations.
Currently, the recommended screening tool for lung cancer is low-dose computed tomography (LDCT) because it produces high-resolution imaging while using a low radiation dose. To identify early-stage cancer, LDCT aims to detect pulmonary nodules, which typically present as small, round spots in the lungs [3]. Size is taken into account when classifying nodules as benign or malignant. Nodules with a diameter of less than 3 cm are more likely to be benign; potential causes include inflammation and injury that led to calcification. Lesions with a diameter greater than 3 cm are more likely to be malignant. Nodules are further classified as solid, sub-solid, or ground-glass. Solid nodules have well-defined edges due to higher calcification levels, whereas pure ground-glass nodules appear hazy and are difficult to see. The intermediate, sub-solid nodules are also hard to visualize, as they present relatively hazy structures that are heterogeneous and not well-defined. On LDCT scans, sub-solid nodules have been mistaken for blood vessels. To assist radiologists and pulmonologists in the detection and classification of nodules, more research is focusing on ways to fit AI into the lung cancer workflow.
State-of-the-art computer-aided diagnosis (CAD) systems for interpreting CT scans include the following general steps: lung segmentation, nodule detection and segmentation, nodule classification, feature analysis, and diagnosis [4]. In lung segmentation, various approaches are used to separate the lungs from the rest of the body. For example, a machine learning-based method trains a model to label lung and non-lung regions by extracting features from each pixel and predicting boundaries for the lungs. Effective nodule detection must first detect candidate nodules and then discard the false positives. Deep learning models are trained on CT scans from public databases to discriminate between nodules and lung parenchyma. 3D convolutional neural networks (CNNs) are advantageous because they can draw on several kinds of feature analysis, such as local shape analysis, data-driven local contextual feature learning, and geometric and intensity statistical features [5]. Subsequently, nodule classification using classifiers such as support vector machines has also shown high accuracy, sensitivity, and specificity.
In the clinical workflow, a radiologist first reviews patient LDCT scans using the Lung Reporting and Data System (Lung-RADS) classification system. Lung-RADS provides structured reporting that radiologists use to classify lung nodules into categories ranging from 0 to 4 [6]. The classification is used to determine whether subsequent interventions are needed, such as additional imaging, biopsy, or specialist referral. Current models are built to analyze LDCT scans based on patient data and this classification system, and to output a lung cancer risk score. The largest collection of LDCT data for patients at high risk of lung cancer comes from the National Lung Screening Trial (NLST) [7]. Reported in 2011, this was the first multicenter randomized controlled trial (RCT), and it included 53,000 participants. The second largest RCT was the Dutch-Belgian NELSON trial, which included 15,789 participants [8]. Both trials showed that lung cancer screening decreased mortality among high-risk patients by around 20%.

Innovation
The Nodule Mapper presents the novel idea of a deep-learning model with explainability functions for its decisions about whether a patient’s LDCT scans suggest a significant risk of lung cancer. The model will analyze LDCT scans to detect and classify lung nodules in two classification steps. Each identified nodule will first be sorted into the solid, sub-solid, or ground-glass group, and a further binary classification within that group will determine whether the nodule is benign or malignant. A gradient-based map will be generated to highlight which regions of the scan led to the model’s decision. Integrating explainability will create software that can better aid radiologists and pulmonologists in evaluating LDCT scans. In contrast to current software that outputs a lung cancer prediction score, the explainability aspect of the Nodule Mapper will help it integrate better into clinical workflows. Medical professionals will be able to compare their decision-making with that of the software, determining which features of the scans were prioritized and led to the subsequent classification of lung nodules. Understanding how an AI model came to its decision will make medical professionals more comfortable working with the system and using it as a second scan reader. The model will also aim to detect lung cancer nodules with lower false positive rates and higher detection rates for sub-solid and ground-glass nodules, which tend to carry higher risks of malignancy [9].
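The two-step classification described above can be sketched as a small cascade: one classifier picks the nodule type, and a type-specific binary classifier then decides benign versus malignant. This is only a structural sketch. The names `classify_nodule`, `NoduleReport`, and the feature keys `mean_hu` and `diameter_mm` are hypothetical, and the threshold rules are placeholders with no clinical meaning that stand in for trained models.

```python
from dataclasses import dataclass

NODULE_TYPES = ("solid", "sub-solid", "ground-glass")


@dataclass
class NoduleReport:
    nodule_type: str
    malignant: bool


def classify_nodule(features, type_classifier, malignancy_classifiers):
    """Step 1 picks the nodule type; step 2 runs that type's binary classifier."""
    nodule_type = type_classifier(features)
    if nodule_type not in NODULE_TYPES:
        raise ValueError(f"unexpected nodule type: {nodule_type!r}")
    malignant = malignancy_classifiers[nodule_type](features)
    return NoduleReport(nodule_type, bool(malignant))


# Placeholder rule (no clinical meaning); a real step 1 would be a trained model.
def toy_type_classifier(features):
    if features["mean_hu"] > -300:
        return "solid"
    if features["mean_hu"] > -600:
        return "sub-solid"
    return "ground-glass"


# Placeholder rules (no clinical meaning); one binary classifier per nodule type.
toy_malignancy_classifiers = {
    "solid": lambda f: f["diameter_mm"] >= 15,
    "sub-solid": lambda f: f["diameter_mm"] >= 8,
    "ground-glass": lambda f: f["diameter_mm"] >= 20,
}

report = classify_nodule(
    {"mean_hu": -450, "diameter_mm": 10},
    toy_type_classifier,
    toy_malignancy_classifiers,
)
print(report)
```

Keeping a separate binary classifier per nodule type lets each one be tuned for the group it sees, which is one plausible reading of the two-step design.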
Study Design
To design the CAD system, the following steps will be considered. For lung segmentation, a hybrid approach will be used: a machine learning-based algorithm will be developed to isolate the pulmonary regions of the scan, and a region-based method will be used to correct the segmentation. In the region-based method, the similarity between neighboring pixels is evaluated as a region grows, to determine whether the next pixel meets a predetermined criterion for being labeled as part of the lungs [10]. This method was chosen because it suits early-stage lung nodule development, which does not create extensive abnormalities in lung appearance. To identify and classify the nodules, radiomics-based feature analysis will be conducted [11]. The first step will identify and segment all nodule candidates. The second step will classify each nodule as solid, sub-solid, or ground-glass. The third step will classify the nodule as malignant or benign. Data will be extracted from the images, transformed into numeric form, and grouped into feature vectors [12]. A classifier will be used to sort the feature vectors. The features considered for distinguishing between different types of nodules include morphology, texture, histogram, gradient, and spatial features. A secondary, custom convolutional neural network (CNN)-based deep learning model will also be developed to correct the radiomics-based classifications.
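The region-based step can be illustrated with a minimal region-growing sketch. It assumes a 2D slice stored as a list of rows of intensity values, a seed pixel known to lie inside the lung, and a similarity criterion of "within `tolerance` of the current region mean". The function name, the 4-connected neighborhood, the criterion, and the toy intensity values are illustrative assumptions, not details from the study design.

```python
from collections import deque


def region_grow(image, seed, tolerance):
    """Grow a region from `seed`, adding 4-connected neighbors whose
    intensity is within `tolerance` of the running region mean."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    total = image[seed[0]][seed[1]]  # running sum of region intensities
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue  # outside the image
            if (nr, nc) in region:
                continue  # already labeled
            mean = total / len(region)
            if abs(image[nr][nc] - mean) <= tolerance:
                region.add((nr, nc))
                total += image[nr][nc]
                queue.append((nr, nc))
    return region


# Toy slice: air-filled lung around -800, surrounding tissue around 40.
toy_slice = [
    [40,   40,   40,   40, 40],
    [40, -800, -790,   40, 40],
    [40, -810, -805, -795, 40],
    [40,   40, -800,   40, 40],
    [40,   40,   40,   40, 40],
]
lung = region_grow(toy_slice, seed=(1, 1), tolerance=50)
print(sorted(lung))
```

In the hybrid approach described above, a mask like `lung` would be used to correct the output of the machine learning-based segmentation rather than replace it.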
Alongside this, an attribution-based explainability model will be developed and incorporated to explain the decisions of the deep learning model in identifying and classifying nodules. There will be a focus on backpropagation-based methods, which compute the attribution of specific image features by passing signals back through the network layers [13]. Grad-CAM is a technique that can be used in nodule detection for its ability to create gradient-weighted class activation maps from the gradients flowing into the convolutional layers of the CNN [14]. This is useful for highlighting the specific regions of the LDCT scan where the model detected nodules, explaining the features that led to its decision. A saliency map can be created to visualize which regions of each nodule contributed to the model’s classification [15]. This extends the previous step, as it uses a gradient-based approach to determine the attribution of each pixel to the decision. Layer-wise relevance propagation (LRP) will also be used to explain which nodule features were prioritized and led to the classification.
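To make the Grad-CAM step concrete, the sketch below shows only how the heatmap is combined once the feature maps and their gradients are available: each channel is weighted by the average of its gradients, the weighted maps are summed, and a ReLU keeps the positive evidence. In a real model both inputs would come from a forward and backward pass through the CNN; the hand-written 2x2 arrays and the function name `grad_cam` are assumptions for illustration.

```python
def grad_cam(activations, gradients):
    """Combine K feature maps (each H x W) into one Grad-CAM heatmap.

    activations[k][i][j]: output of channel k of the last convolutional layer.
    gradients[k][i][j]:   d(class score) / d(activations[k][i][j]).
    """
    height, width = len(activations[0]), len(activations[0][0])
    # Channel weight = global average of that channel's gradients.
    weights = [
        sum(sum(row) for row in grad) / (height * width)
        for grad in gradients
    ]
    heatmap = [[0.0] * width for _ in range(height)]
    for weight, fmap in zip(weights, activations):
        for i in range(height):
            for j in range(width):
                heatmap[i][j] += weight * fmap[i][j]
    # ReLU keeps only regions that push the class score up.
    return [[max(0.0, value) for value in row] for row in heatmap]


# Two toy channels: the first supports the class, the second argues against it.
activations = [
    [[1.0, 0.0], [0.0, 2.0]],
    [[0.0, 3.0], [1.0, 0.0]],
]
gradients = [
    [[0.5, 0.5], [0.5, 0.5]],
    [[-1.0, -1.0], [-1.0, -1.0]],
]
heatmap = grad_cam(activations, gradients)
print(heatmap)
```

The resulting map has the spatial size of the feature maps, so in practice it is upsampled to the scan resolution before being overlaid on the LDCT slice.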
To obtain data, this study will apply for access to LDCT scans and follow-up data from 40,000 NLST participants [16]. The images will be processed to standardize pixel values for comparison. The data set will be split into training, development, and test sets, and the models will be trained on the training set. Each LDCT will be considered a unique data point, and the thinnest CT image slice will be analyzed. LDCTs that resulted in a lung cancer diagnosis, confirmed by biopsy within a five-year period, will be considered positive for malignancy. Ten expert radiologists, each with more than ten years of experience, will also annotate the LDCT scans to classify the lung nodules. In evaluating the prediction model, the detections, classifications, and subsequent explanations made by both the AI and the radiologists will be measured by sensitivity, specificity, and accuracy. Following this, three validation studies with distinct and diverse patient pools will be conducted. Once the model receives clearance from the FDA, the company intends to partner with imaging device manufacturers to integrate the model into clinical settings.
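The three evaluation measures named above follow directly from the confusion matrix. The sketch below assumes binary labels (1 for malignant, 0 for benign) and that both classes appear in the ground truth; the function name and the ten toy labels are made up for illustration and are not study data.

```python
def screening_metrics(truth, predicted):
    """Sensitivity, specificity, and accuracy for binary labels (1 = malignant).

    Assumes `truth` contains at least one positive and one negative case.
    """
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    return {
        "sensitivity": tp / (tp + fn),  # share of cancers that were caught
        "specificity": tn / (tn + fp),  # share of benign cases cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy example: 4 biopsy-confirmed cancers and 6 benign cases.
truth     = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = screening_metrics(truth, predicted)
print(metrics)
```

The same function can score the model and each radiologist against the biopsy-confirmed labels, which keeps the comparison between the two on identical terms.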
Market
In 2022, the global lung cancer screening market was valued at US$3.03 billion, and this value is projected to increase to US$6.60 billion by 2032 [17]. While the Nodule Mapper is applicable in markets across the world, we will aim to enter the US market first because it has the highest commercial potential, with high volumes of medical imaging devices suitable for the Nodule Mapper. The ACS currently recommends annual lung cancer screening by LDCT for current or former smokers between the ages of 50 and 80 [18]. Figures from 2023 showed that approximately 650,000 people in the US have been diagnosed with lung cancer and that around 240,000 people will be diagnosed by the end of 2023. Due to various risk factors for lung cancer, namely smoking, air pollution, radiation, and genetic factors, up to 8 million people in the US are at high risk of developing lung cancer [19]. With increasing awareness of the benefit of early lung cancer detection and a global push for regular lung cancer screening using LDCT, the Nodule Mapper will be marketed to radiologists and pulmonologists in clinical settings as a secondary evaluator that supports their reading of LDCTs.
According to the US National Cancer Institute, the total budget for lung cancer research was US$459 million in 2021. In recent years, both government and private investment in lung cancer AI imaging tools has increased significantly, and NoduleMap aims to attract investment from both sectors. We will apply for Lung Cancer Research Foundation (LCRF) grants; suitable programs include the LCRF Leading Edge Research Grant Program and the LCRF Research Grant on Early Detection in Lung Cancer [20]. Our product demonstrates value in supporting the healthcare workforce to reduce errors associated with lung cancer diagnosis and enhance early detection. We will build credibility by working with leading institutions and planning partnerships with medical imaging device companies.
After development and clinical validation, the downstream business model will be to provide the Nodule Mapper to hospitals on a subscription basis. Annual subscriptions minimize capital investment while offering flexibility and keeping the software up to date. A secondary source of revenue will be licensing the models developed for the Nodule Mapper to individual businesses for one-time fees or ongoing royalties.
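As a quick check on the market projection cited in this section, the growth from US$3.03 billion in 2022 to US$6.60 billion in 2032 can be turned into an implied compound annual growth rate. This is a sketch using the standard CAGR formula; the function name is ours and the rate is derived here, not quoted from the cited source.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Constant yearly growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1 / years) - 1


# Market values in billions of US dollars, taken from the figures above.
cagr = compound_annual_growth_rate(3.03, 6.60, years=2032 - 2022)
print(f"Implied CAGR: {cagr:.1%}")
```

The result works out to roughly 8% per year, which is the growth the subscription and licensing plans would be entering into if the projection holds.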

Competition
The Optellum Virtual Nodule Clinic software is a convolutional neural network model for lung cancer prediction that received premarket notification clearance from the FDA in 2021, demonstrating its safety and effectiveness [21]. This deep learning model was trained on a large volume of lung CT scans and corresponding diagnoses from patients across the UK, the US, and Europe. Currently, it can analyze a lung CT scan and output a lung cancer prediction score from one to ten for the risk of malignancy. Clinical studies have demonstrated that radiologists and pulmonologists showed improvements in predicting the malignancy risk of pulmonary nodules when aided by Optellum’s predictive AI. Optellum now collaborates with GE HealthCare to integrate the software into clinical settings, aiming to incorporate its predictive tool into GE HealthCare’s imaging devices to enhance accurate lung cancer diagnosis in the early stages.
A separate deep learning model, Sybil, predicts the future risk of lung cancer development. Sybil was trained on data from the National Lung Screening Trial (NLST), with LDCTs allocated to training, development, and test sets [22]. Scans were considered positive for lung cancer if they corresponded to a diagnosis confirmed by biopsy within six years. The model was also validated on data from patients at Massachusetts General Hospital in Boston and Chang Gung Memorial Hospital in Taiwan. For predicting cancer in the year after the scan, Sybil achieved an area under the receiver operating characteristic curve (AUC) of 0.86 to 0.94 across the three datasets.
References
[1] Sharma, Rajesh. “Mapping of Global, Regional and National Incidence, Mortality and Mortality-to-Incidence Ratio of Lung Cancer in 2020 and 2050.” International Journal of Clinical Oncology, vol. 27, no. 4, 2022, pp. 665–75. PubMed Central, https://doi.org/10.1007/s10147-021-02108-2.
[2] “Lung Cancer Stages and Survival Rate.” City of Hope, 5 Oct. 2018, https://www.cancercenter.com/cancer-types/lung-cancer/stages.
[3] Savitha, G., and P. Jidesh. “A Holistic Deep Learning Approach for Identification and Classification of Sub-Solid Lung Nodules in Computed Tomographic Scans.” Computers & Electrical Engineering, vol. 84, June 2020, p. 106626. ScienceDirect, https://doi.org/10.1016/j.compeleceng.2020.106626.
[4] Fahmy, Dalia, et al. “How AI Can Help in the Diagnostic Dilemma of Pulmonary Nodules.” Cancers, vol. 14, no. 7, Apr. 2022, p. 1840. PubMed Central, https://doi.org/10.3390/cancers14071840.
[5] Schreuder, Anton, et al. “CT-Detected Subsolid Nodules: A Predictor of Lung Cancer Development at Another Location?” Cancers, vol. 13, no. 11, June 2021, p. 2812. PubMed Central, https://doi.org/10.3390/cancers13112812.
[6] Pinsky, Paul F., et al. “Performance of Lung-RADS in the National Lung Screening Trial.” Annals of Internal Medicine, vol. 162, no. 7, Apr. 2015, pp. 485–91. PubMed Central, https://doi.org/10.7326/M14-2086.
[7] “The National Lung Screening Trial: Overview and Study Design.” Radiology, vol. 258, no. 1, Jan. 2011, pp. 243–53. PubMed Central, https://doi.org/10.1148/radiol.10091808.
[8] Zhao, Ying Ru, et al. “NELSON Lung Cancer Screening Study.” Cancer Imaging, vol. 11, no. 1A, Oct. 2011, pp. S79–84. PubMed Central, https://doi.org/10.1102/1470-7330.2011.9020.
[9] Hammer, Mark M., and Hiroto Hatabu. “Subsolid Pulmonary Nodules: Controversy and Perspective.” European Journal of Radiology Open, vol. 7, Sept. 2020, p. 100267. PubMed Central, https://doi.org/10.1016/j.ejro.2020.100267.
[10] Arlova, Alena, et al. “Artificial Intelligence-Based Tumor Segmentation in Mouse Models of Lung Adenocarcinoma.” Journal of Pathology Informatics, vol. 13, Jan. 2022, p. 100007. PubMed Central, https://doi.org/10.1016/j.jpi.2022.100007.
[11] Binczyk, Franciszek, et al. “Radiomics and Artificial Intelligence in Lung Cancer Screening.” Translational Lung Cancer Research, vol. 10, no. 2, Feb. 2021, pp. 1186–99. PubMed Central, https://doi.org/10.21037/tlcr-20-708.
[12] Nageswaran, Sharmila, et al. “Lung Cancer Classification and Prediction Using Machine Learning and Image Processing.” BioMed Research International, vol. 2022, Aug. 2022, p. 1755460. PubMed Central, https://doi.org/10.1155/2022/1755460.
[13] Singh, Amitojdeep, et al. “Explainable Deep Learning Models in Medical Image Analysis.” Journal of Imaging, vol. 6, no. 6, June 2020, p. 52. PubMed Central, https://doi.org/10.3390/jimaging6060052.
[14] Neal Joshua, Eali Stephen, et al. “3D CNN with Visual Insights for Early Detection of Lung Cancer Using Gradient-Weighted Class Activation.” Journal of Healthcare Engineering, vol. 2021, Mar. 2021, p. 6695518. PubMed Central, https://doi.org/10.1155/2021/6695518.
[15] Pertuz, Said, et al. “Saliency of Breast Lesions in Breast Cancer Detection Using Artificial Intelligence.” Scientific Reports, vol. 13, Nov. 2023, p. 20545. PubMed Central, https://doi.org/10.1038/s41598-023-46921-3.
[16] Liu, Mingsi, et al. “The Value of Artificial Intelligence in the Diagnosis of Lung Cancer: A Systematic Review and Meta-Analysis.” PLOS ONE, vol. 18, no. 3, Mar. 2023, p. e0273445. PubMed Central, https://doi.org/10.1371/journal.pone.0273445.
[17] Philipson, Tomas J., et al. “The Aggregate Value of Cancer Screenings in the United States: Full Potential Value and Value Considering Adherence.” BMC Health Services Research, vol. 23, Aug. 2023, p. 829. PubMed Central, https://doi.org/10.1186/s12913-023-09738-4.
[18] Wolf, Andrew M. D., et al. “Screening for Lung Cancer: 2023 Guideline Update from the American Cancer Society.” CA: A Cancer Journal for Clinicians, Nov. 2023, p. caac.21811. DOI.org (Crossref), https://doi.org/10.3322/caac.21811.
[19] Thandra, Krishna Chaitanya, et al. “Epidemiology of Lung Cancer.” Contemporary Oncology, vol. 25, no. 1, 2021, pp. 45–52. PubMed Central, https://doi.org/10.5114/wo.2021.103829.
[20] “Funding Opportunities 2023.” Lung Cancer Research Foundation, https://www.lungcancerresearchfoundation.org/research/funding-opportunities-2023/. Accessed 29 Nov. 2023.
[21] Massion, Pierre P., et al. “Assessing the Accuracy of a Deep Learning Method to Risk Stratify Indeterminate Pulmonary Nodules.” American Journal of Respiratory and Critical Care Medicine, vol. 202, no. 2, July 2020, pp. 241–49. DOI.org (Crossref), https://doi.org/10.1164/rccm.201903-0505OC.
[22] Mikhael, Peter G., et al. “Sybil: A Validated Deep Learning Model to Predict Future Lung Cancer Risk From a Single Low-Dose Chest Computed Tomography.” Journal of Clinical Oncology: Official Journal of the American Society of Clinical Oncology, vol. 41, no. 12, Apr. 2023, pp. 2191–200. PubMed, https://doi.org/10.1200/JCO.22.01345.