Abstract

Computer vision (CV) in medicine, propelled by deep learning and robust data resources, has revolutionized the healthcare landscape. AI models, primarily built on convolutional neural networks (CNNs), have advanced various diagnostic and prognostic tasks in healthcare. These breakthroughs owe their success to the convergence of deep learning, increased GPU-driven compute power, and abundant labeled datasets. In the field of computer vision, progress has been profound, particularly in object classification, localization, and detection, with the ImageNet competition playing a pivotal role. These strides extend to medicine, where AI models tackle tasks from disease classification to medical scene interpretation. They have the potential to rival or surpass expert physician accuracy, especially when trained on sizable medical datasets. Overcoming the challenge of limited medical data involves leveraging techniques like transfer learning and synthetic data generation and tapping into open-source image annotations to build robust medical algorithms. However, the availability of medical data remains a pivotal concern, raising ethical and legal questions about data ownership and re-identification risks. In such cases, federated learning emerges as a solution, allowing centralized algorithms to train on distributed data securely. Moreover, computer vision’s growth in medicine has influenced other domains like multimodal learning, 3D vision, and video analysis, proving invaluable in clinical settings. The technology has made significant strides in medical imaging, particularly in radiology, with AI models obtaining regulatory approvals and shaping clinical practices. This transformation extends to cardiology, pathology, dermatology, ophthalmology, and even medical videos, impacting diagnostic accuracy and patient care. However, the responsible deployment of these technologies in clinical settings necessitates addressing data access, ethical concerns, and community involvement, emphasizing data quality, model transparency, and clinical trials. It’s a promising revolution, yet one that requires careful navigation to ensure equity and trust in healthcare.

References

  1. Szeliski R (2010) Computer vision: algorithms and applications

    Google Scholar 

  2. Lecun Y, Bengio Y, Hinton G (2015) Deep learning. Nature. Nature Publishing Group, pp 436–444. Available at: https://doi.org/10.1038/nature14539

  3. Esteva A, Robicquet A, Ramsundar B, Kuleshov V, DePristo M, Chou K, Cui C, Corrado G, Thrun S, Dean J (2019) A guide to deep learning in healthcare. Nature medicine. Nature Publishing Group, pp 24–29. Available at: https://doi.org/10.1038/s41591-018-0316-z

  4. Topol EJ (2019) High-performance medicine: the convergence of human and artificial intelligence. Nature medicine. Nature Publishing Group, pp 44–56. Available at: https://doi.org/10.1038/s41591-018-0300-7

  5. Esteva A, Kuprel B, Novoa RA, Ko J, Swetter SM, Blau HM, Thrun S (2017) Dermatologist-level classification ofskin cancer with deep neural networks. Nature 542(7639):115–118. Available at: https://doi.org/10.1038/nature21056

  6. Yeung S, Rinaldo F, Jopling J, Liu B, Mehra R, Downing NL, Guo M, Bianconi GM, Alahi A, Lee J, Campbell B, Deru K, Beninati W, Fei-Fei L, Milstein A (2019) A computer vision system for deep learning-based detection of patient mobilization activities in the ICU. NPJ Digital Med 2(1). Available at: https://doi.org/10.1038/s41746-019-0087-z

  7. Russakovsky O, Deng J, Su H, Krause J, Satheesh S, Ma S, Huang Z, Karpathy A, Khosla A, Bernstein M, Berg AC, Fei-Fei L (2014) ImageNet large scale visual recognition challenge. Available at: http://arxiv.org/abs/1409.0575

  8. Krizhevsky, A, Sutskever I, Hinton GE (2012) ImageNet classification with DeepConvolutional neural networks. In: Pereira F, Burges CJC, Bottou L, Weinberger KQ (ed) Advances in neural information processing systems vol 25. Curran Associates, Inc., pp 1097–1105

    Google Scholar 

  9. Sermanet P, Eigen D, Zhang X, Mathieu M, Fergus R, LeCun Y (2013) OverFeat: integrated recognition, localization and detection using convolutional networks. Available at: http://arxiv.org/abs/1312.6229

  10. Simonyan K, Zisserman A (2014a) Very deep convolutional networks for large scale image recognition. Available at: http://arxiv.org/abs/1409.1556

  11. Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions

    Google Scholar 

  12. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. Available at: http://image-net.org/challenges/LSVRC/2015/

  13. Gebru T, Hoffman J, Fei-Fei L (2017) Fine-grained recognition in the wild: a multi-task domain adaptation approach

    Google Scholar 

  14. Gulshan V, Rajan RP, Widner K, Wu D, Wubbels P, Rhodes T, Whitehouse K, Coram M, Corrado G, Ramasamy K, Raman R, Peng L, Webster DR (2019) Performance of a deep-learning algorithm versus manual grading for detecting diabetic retinopathy in India. JAMA Ophthalmol 137(9):987–993. Available at: https://doi.org/10.1001/jamaophthalmol.2019.2004

  15. Ronneberger O, Fischer P, Brox T (2015) U-Net: convolutional networks for biomedical image segmentation. Available at: http://arxiv.org/abs/1505.04597

  16. Isensee F, Petersen J, Klein A, Zimmerer D, Jaeger PF, Kohl S, Wasserthal J, Koehler G, Norajitra T, Wirkert S, Maier-Hein KH (2018) NNU-Net: self-adapting framework for U-net-based medical image segmentation. Available at: http://arxiv.org/abs/1809.10486

  17. Shouno H et al (2017) Deep convolution neural network with 2-stage transfer learning for medical image classification. Brain Neural Network 24(1):3–12. Available at: https://doi.org/10.3902/jnns.24.3

  18. Deng J (2009) A large-scale hierarchical image database. Proceeding of IEEE computer vision and pattern recognition

    Google Scholar 

  19. Cubuk ED, Zoph B, Mane D, Vasudevan V, Le QV (2018) AutoAugment: learning augmentation policies from data. Available at: http://arxiv.org/abs/1805.09501

  20. Goodfellow IJ, Pouget-Abadie J, Mirza M, Xu B, Warde-Farley D, Ozair S, Courville A, Bengio Y (2014) Generative adversarial nets. Available at: http://www.github.com/goodfeli/adversarial

  21. Ørting S, Doyle A, van Hilten A, Hirth M, Inel O, Madan CR, Mavridis P, Spiers H, Cheplygina V (2019) A survey of crowdsourcing in medical image analysis. Available at: http://arxiv.org/abs/1902.09159

  22. Créquit P, Mansouri G, Benchoufi M, Vivot A, Ravaud P (2018b) Mapping of crowdsourcing in health: systematic review. J Med Internet Res. JMIR Publications Inc. Available at: https://doi.org/10.2196/jmir.9330

  23. Jing L, Tian Y (2020) Self-supervised visual feature learning with deep neural networks: a survey. IEEE Trans Pattern Anal Mach Intell 43(11):4037–4058

    Article  Google Scholar 

  24. McMahan B, Moore E, Ramage D, Hampson S, yArcas BA (2017) Communication-efficient learning of deep networks from decentralized data. In: Artificial intelligence and statistics. PMLR, pp 1273–1282

    Google Scholar 

  25. Karpathy A, Fei-Fei L (2015) Deep visual-semantic alignments for generating image descriptions

    Google Scholar 

  26. Lv D, Ying X, Cui Y, Song J, Qian K, Li M (2017) Research on the technology of LIDAR data processing

    Google Scholar 

  27. Lillo I, Niebles JC, Soto A (2017) Sparse composition of body poses and atomic actions for human activity recognition in RGB-D videos. Image Vision Comput 59:63–75. Available at: https://doi.org/10.1016/j.imavis.2016.11.004

  28. Haque A, Guo M, Alahi A, Yeung S, Luo Z, Rege A, Jopling J, Downing L, Beninati W, Singh A, Platchek T, Milstein A, Fei-Fei L (2017) Towards vision based smart hospitals: a system for tracking and monitoring hand hygiene compliance

    Google Scholar 

  29. Caba Heilbron F, Escorcia V, Ghanem B, Niebles JC (2015) ActivityNet: a large-scale video benchmark for human activity understanding. In: Proceedings of the IEEE conference on computer vision and pattern recognition, pp 961–970

    Google Scholar 

  30. Wang K, Lin YA, Weissmann B, Savva M, Chang AX, Ritchie D (2019) Planit: planning and instantiating indoor scenes with relation graph and spatial prior networks. ACM Trans Graph 38(4). Available at: https://doi.org/10.1145/3306346.3322941

  31. Singh A, Haque A, Alahi A, Yeung S, Guo M, Glassman JR, Beninati W, Platchek T, Fei-Fei L, Milstein A (2020) Automatic detection of hand hygiene using computer vision technology. J Am Med Inform Assoc 27(8):1316–1320. Available at: https://doi.org/10.1093/jamia/ocaa115

  32. Litjens G, Kooi T, Bejnordi BE, Setio AAA, Ciompi F, Ghafoorian M, van der Laak JAWM, van Ginneken B, Sánchez CI (2017) A survey on deep learning in medical image analysis. Available at: https://doi.org/10.1016/j.media.2017.07.005

  33. Zhou ZH, Zhang ML (2007) Solving multi-instance problems with classifier ensemble based on constructive clustering. Knowl Inf Syst 11:155–170. Available at: https://doi.org/10.1007/s10115-006-0029-3

  34. Singh SP, Wang L, Gupta S, Goli H, Padmanabhan P, Gulyás B (2020) 3d deep learning on medical images: a review. Sens (Switz). MDPI AG, pp 1–24. Available at: https://doi.org/10.3390/s20185097

  35. Ouyang D, He B, Ghorbani A, Yuan N, Ebinger J, Langlotz CP, Heidenreich PA, Harrington RA, Liang DH, Ashley EA, Zou JY (2020) Video-based AI for beat-to-beat assessment of cardiac function. Nature 580(7802):252–256. Available at: https://doi.org/10.1038/s41586-020-2145-8

  36. Benjamens S, Dhunnoo P, Meskó B (2020) The state of artificial intelligence based FDA-approved medical devices and algorithms: an online database. NPJ Digital Med 3(1). Available at: https://doi.org/10.1038/s41746-020-00324-0

  37. Beede E, Baylor E, Hersch F, Iurchenko A, Wilcox L, Ruamviboonsuk P, Vardoulakis LM (2020) A human-centered evaluation of a deep learning system deployed in clinics for the detection of diabetic retinopathy. In: Conference on human factors in computing systems—proceedings. Association for Computing Machinery. Available at: https://doi.org/10.1145/3313831.3376718

  38. Viz.ai (2020) Granted medicare new technology add-on payment. PR Newswire https://www.prnewswire.com/news-releases/vizai-granted-medicare-newtechnologyadd-on-payment-301123603.html. Accessed 20 Dec 2023

  39. Crowson MG, Ranisau J, Eskander A, Babier A, Xu B, Kahmke RR, Chen JM, Chan TCY (2020) A contemporary review of machine learning in otolaryngology—head and neck surgery. Laryngoscope. Wiley Inc., pp 45–51. Available at: https://doi.org/10.1002/lary.27850

  40. Livingstone D, Talai AS, Chau J, Forkert ND (2019) Building an otoscopic screening prototype tool using deep learning. J Otolaryngol—Head Neck Surg 48(1). Available at: https://doi.org/10.1186/s40463-019-0389-9

  41. Chen PHC, Gadepalli K, MacDonald R, Liu Y, Kadowaki S, Nagpal K, Kohlberger T, Dean J, Corrado GS, Hipp JD, Mermel CH, Stumpe MC (2019) An augmented reality microscope with real-time artificial intelligence integration for cancer diagnosis. Nat Med 25(9):1453–1457. Available at: https://doi.org/10.1038/s41591-019-0539-7

  42. Gunčar G, Kukar M, Notar M, Brvar M, Černelč P, Notar M, Notar M (2018) An application of machine learning to haematological diagnosis. Sci Rep 8(1). Available at: https://doi.org/10.1038/s41598-017-18564-8

  43. Alam MM, Islam MT (2019) Machine learning approach of automatic identification and counting of blood cells. Healthc Technol Lett 6(4):103–108. Available at: https://doi.org/10.1049/htl.2018.5098

  44. El Hajjar A, Rey JF (2020) Artificial intelligence in gastrointestinal endoscopy: general overview. Chin Med J 326–334. Lippincott Williams and Wilkins. Available at: https://doi.org/10.1097/CM9.0000000000000623

  45. Horie Y, Yoshio T, Aoyama K, Yoshimizu S, Horiuchi Y, Ishiyama A, Hirasawa T, Tsuchida T, Ozawa T, Ishihara S, Kumagai Y, Fujishiro M, Maetani I, Fujisaki J, Tada T (2019) Diagnostic outcomes of esophageal cancer by artificial intelligence using convolutional neural networks. Gastrointest Endosc 89(1):25–32. Available at: https://doi.org/10.1016/j.gie.2018.07.037

  46. Hirasawa T, Aoyama K, Tanimoto T, Ishihara S, Shichijo S, Ozawa T, Ohnishi T, Fujishiro M, Matsuo K, Fujisaki J, Tada T (2018) Application of artificial intelligence using a convolutional neural network for detecting gastric cancer in endoscopic images. Gastric Cancer 21(4):653–660. Available at: https://doi.org/10.1007/s10120-018-0793-2

  47. Kubota K, Kuroda J, Yoshida M, Ohta K, Kitajima M (2012) Medical image analysis: computer-aided diagnosis of gastric cancer invasion on endoscopic images. Surg Endosc 26:1485–1489. Available at: https://doi.org/10.1007/s00464-011-2036-z

  48. Itoh T, Kawahira H, Nakashima H, Yata N (2018) Deep learning analyzes helicobacter pylori infection by upper gastrointestinal endoscopy images. Endosc Int Open 06(02):E139–E144. Available at: https://doi.org/10.1055/s-0043-120830

  49. He JY, Wu X, Jiang YG, Peng Q, Jain R (2018) Hookworm detection in wireless capsule endoscopy images with deep learning. IEEE Trans Image Proc 27(5):2379–2392. Available at: https://doi.org/10.1109/TIP.2018.2801119

  50. Park SM, Won DD, Lee BJ, Escobedo D, Esteva A, Aalipour A, Ge TJ, Kim JH, Suh S, Choi EH, Lozano AX, Yao C, Bodapati S, Achterberg FB, Kim J, Park H, Choi Y, Kim WJ, Yu JH, Bhatt AM, Lee JK, Spitler R, Wang SX, Gambhir SS (2020) A mountable toilet system for personalized health monitoring via the analysis of excreta. Nat Biomed Eng 4(6):624–635. Available at: https://doi.org/10.1038/s41551-020-0534-9

  51. Ver Milyea M, Hall JMM, Diakiw SM, Johnston A, Nguyen T, Perugini D, Miller A, Picou A, Murphy AP, Perugini M (2021) Development of an artificial intelligence-based assessment model for prediction of embryo viability using static images captured by optical light microscopy during IVF. Hum Reprod 35(4):770–784. Available at: https://doi.org/10.1093/HUMREP/DEAA013

  52. Choy G et al (2018) Current applications and future impact of machine learning in radiology. Radiology 288(2):318–328. Available at: https://doi.org/10.1148/radiol.2018171820

  53. Saba L et al (2019) The present and future of deep learning in radiology. Eur J Radiol 114:14–24. https://doi.org/10.1016/j.ejrad.2019.02.038

    Article  Google Scholar 

  54. Mazurowski MA et al (2018) Deep learning in radiology: an overview of the concepts and a survey of the state of the art with focus on MRI. J Magn Reson Imaging 49(4):939–954. https://doi.org/10.1002/jmri.26534

    Article  Google Scholar 

  55. Johnson AEW et al (2019a) MIMIC-CXR, a de-identified publicly available database of chest radiographs with free-text reports. Sci Data 6(1). https://doi.org/10.1038/s41597-019-0322-0

  56. Irvin J et al (2019) CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. Proc AAAI Conf Artif Intell 33(01):590–597. https://doi.org/10.1609/aaai.v33i01.3301590

  57. Wang X et al (2017) ChestX-Ray8: hospital-scale chest X-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases. http://openaccess.thecvf.com/content_cvpr_2017/html/Wang_ChestXray8_Hospital-Scale_Chest_CVPR_2017_paper.html

  58. Chilamkurthy S et al (2018) Deep learning algorithms for detection of critical findings in head CT scans: a retrospective study. The Lancet 392(10162):2388–2396

    Article  Google Scholar 

  59. Weston AD et al (2019) Automated abdominal segmentation of CT scans for body composition analysis using deep learning. Radiology 290(3):669–679. https://doi.org/10.1148/radiol.2018181432

    Article  Google Scholar 

  60. Ding J et al (2017) Accurate pulmonary nodule detection in computed tomography images using deep convolutional neural networks. In: Lecture notes in computer science, pp 559–567. https://doi.org/10.1007/978-3-319-66179-7_64

  61. Tan LK et al (2017) Convolutional neural network regression for short-axis left ventricle segmentation in cardiac cine MR sequences. Med Image Anal 39:78–86. https://doi.org/10.1016/j.media.2017.04.002

    Article  Google Scholar 

  62. Zhang J et al (2020) Viral pneumonia screening on chest X-rays using confidence aware anomaly detection. IEEE Trans Med Imaging 40(3):879–890

    Article  Google Scholar 

  63. Zhang X et al (2020) CT super-resolution using multiple dense residual block based GAN. SIViP 15(4):725–733. https://doi.org/10.1007/s11760-020-01790-5

    Article  Google Scholar 

  64. Papolos A et al (2016) U.S. hospital use of echocardiography. J Am Coll Cardiol 67(5):502–511. https://doi.org/10.1016/j.jacc.2015.10.090

    Article  Google Scholar 

  65. Heart Flow NXT—heart flow analysis of coronary blood flow using coronary CT angiography—study results—ClinicalTrials.gov. https://clinicaltrials.gov/ct2/show/results/NCT01757678. Accessed 22 Dec 2023

  66. Madani A et al (2018) Fast and accurate view classification of echocardiograms using deep learning. NPJ Digital Med 1(1). https://doi.org/10.1038/s41746-017-0013-1

  67. Zhang J et al (2018) Fully automated echocardiogram interpretation in clinical practice. Circulation 138(16):1623–1635. https://doi.org/10.1161/circulationaha.118.034338

    Article  Google Scholar 

  68. Rajpurkar A et al (2022) Evaluation of a deep learning model for automated quantification of left ventricular ejection fraction from echocardiograms in a multicenter setting. JAMA Cardiology 7(1):43–53

    Google Scholar 

  69. Li Y et al (2023) A deep learning framework for robust and efficient coronary artery segmentation in CTA images. IEEE Trans Med Imaging 42(2):704–716

    Google Scholar 

  70. Lu Z et al (2022) Deep learning-based risk stratification for major adverse cardiovascular events using CoronaryCT angiography. Circ: Cardiovasc Imaging 15(10):011924

    Google Scholar 

  71. Ghorbani A et al (2020) Deep learning interpretation of echocardiograms. NPJ Digital Med 3(1). https://doi.org/10.1038/s41746-019-0216-8

  72. Madani A, Ong JR et al (2018) Deep echocardiography: data-efficient supervised and semi-supervised deep learning towards automated diagnosis of cardiac disease. NPJ Digital Med 1(1). https://doi.org/10.1038/s41746-018-0065-x

  73. Zhang Y et al (2023) Deep learning-based real-time segmentation of coronary arteries for minimally invasive cardiac surgery. Nat Sci Rep 13(1):8351

    Google Scholar 

  74. Hu Y et al (2022) Blood loss estimation in cardiac surgery using deep-learning based image segmentation. Artif Intell Med 136:106785

    Google Scholar 

  75. Gao X et al (2023) Machine learning model for predicting response to cardiac resynchronization therapy using echocardiographic data. Comput Cardiol 50(1):1–4

    MathSciNet  Google Scholar 

  76. Perkins CL, Balma D, Garcia R (2007) Why current breast pathology practices must be evaluated. A Susan G. Komen for the cure white paper: June 2006. Breast J 13(5):443–447. https://doi.org/10.1111/j.1524-4741.2007.00463.x

  77. Brimo F, Schultz L, Epstein JI (2010) The value of mandatory second opinion pathology review of prostate needle biopsy Interpretation before radical prostatectomy. J Urol 184(1):126–130. https://doi.org/10.1016/j.juro.2010.03.021

    Article  Google Scholar 

  78. Elmore JG et al (2015) Diagnostic concordance among pathologists interpreting breast biopsy specimens. JAMA 313(11):1122. https://doi.org/10.1001/jama.2015.1405

    Article  Google Scholar 

  79. Evans A et al (2018) US food and drug administration approval of whole slide imaging for primary diagnosis: a key milestone is reached and new questions are raised. Arch Pathol Lab Med 142(11):1383–1387. https://doi.org/10.5858/arpa.2017-0496-cp

    Article  Google Scholar 

  80. Srinidhi CL, Ciga O, Martel AL (2021) Deep neural network models for computational histopathology: a survey. Med Image Anal 67:101813. https://doi.org/10.1016/j.media.2020.101813

    Article  Google Scholar 

  81. Bera K et al (2019) Artificial intelligence in digital pathology—new tools for diagnosis and precision oncology. Nat Rev Clin Oncol 16(11):703–715. https://doi.org/10.1038/s41571-019-0252-y

    Article  Google Scholar 

  82. Cireşan D et al (2013) Mitosis detection in breast cancer histology images with deep neural networks. In: Lecture notes in computer science, pp 411–418. https://doi.org/10.1007/978-3-642-40763-5_51

  83. Wang H, Cruz-Roa A, Basavanhally A, Gilmore H, Shih N, Feldman M, Tomaszewski J, Gonzalez F, Madabhushi A (2014) Mitosis detection in breast cancer pathology images by combining handcrafted and convolutional neural network features. J Med Imaging 1(3):034003–034003

    Article  Google Scholar 

  84. Handcrafted features with convolutional neural networks for detection of tumor cells in histology images (2016). https://ieeexplore.ieee.org/abstract/document/7493441/

  85. Wang D et al (2016) Deep learning for identifying metastatic breast cancer. https://arxiv.org/abs/1606.05718

  86. Esteva A et al (2021b) Deep learning-enabled medical computer vision. NPJ Digital Med 4(1). https://doi.org/10.1038/s41746-020-00376-2

  87. Chen H et al (2017) DCAN: deep contour-aware networks for object instance segmentation from histology images. Med Image Anal 36:135–146. https://doi.org/10.1016/j.media.2016.11.004

    Article  Google Scholar 

  88. Gland instance segmentation using deep multichannel neural networks (2017). https://ieeexplore.ieee.org/abstract/document/7885586/

  89. Litjens G et al (2016) Deep learning as a tool for increased accuracy and efficiency of histopathological diagnosis. Sci Rep 6(1). https://doi.org/10.1038/srep26286

  90. Coudray N et al (2018) Classification and mutation prediction from non-small cell lung cancer histopathology images using deep learning. Nat Med 24(10):1559–1567. https://doi.org/10.1038/s41591-018-0177-5

    Article  Google Scholar 

  91. Campanella G et al (2019) Clinical-grade computational pathology using weakly supervised deep learning on whole slide images. Nat Med 25(8):1301–1309. https://doi.org/10.1038/s41591-019-0508-1

    Article  Google Scholar 

  92. Mobadersany P et al (2018) Predicting cancer outcomes from histology and genomics using convolutional networks. Proc National Acad Sci United States America 115(13). https://doi.org/10.1073/pnas.1717139115

  93. Courtiol P et al (2019) Deep learning-based classification of mesothelioma improves prediction of patient outcome. Nat Med 25(10):1519–1525. https://doi.org/10.1038/s41591-019-0583-3

    Article  Google Scholar 

  94. Rawat RR et al (2020) Deep learned tissue fingerprints classify breast cancers by ER/PR/Her2 status from H&E images. Sci Rep 10(1). https://doi.org/10.1038/s41598-020-64156-4

  95. Dietterich TG, Lathrop RH, Lozano-Pérez T (1997) Solving the multiple instance problem with axis-parallel rectangles. Artif Intell 89(1–2):31–71. https://doi.org/10.1016/s0004-3702(96)00034-3

    Article  Google Scholar 

  96. Christiansen EM, Yang SJ, Ando DM, Javaherian A, Skibinski G, Lipnick S, Mount E, O’neil A, Shah K, Lee AK, Goyal P (2018) In silico labeling: predicting fluorescent labels in unlabeled images. Cell 173(3):792–803

    Google Scholar 

  97. Egeva A et al (2023) Deep learning-based detection of prostate cancer in whole-slide images: a prospective multicentre study. Lancet Digital Health 5(7):573–582

    Google Scholar 

  98. Bylund J et al (2022) Deep learning detection of suspicious regions of interest in breast biopsy images: a multicenter study. JAMA Netw Open 5(12):2241052

    Google Scholar 

  99. Yuan Y et al (2023) Deep learning predicts response to immunotherapy in colorectal cancer. Nat Med 29(1):142–151

    Google Scholar 

  100. Wang X et al (2022) Deep learning for predicting recurrence risk in patients with head and neck squamous cell carcinoma. Cancers 14(19):4888

    Google Scholar 

  101. Bai H et al (2023) Deep learning model for urgency stratification of digital pathology slides in a large, multi-institutional prostate cancer cohort. BMC Med Inform Decis Mak 23(1):74

    Google Scholar 

  102. Liu Y et al (2022) Automated cell counting and morphometric analysis of gastrointestinal stromal tumors using deep learning-based image analysis. Arch Pathol Lab Med 146(10):1015–1023

    Google Scholar 

  103. Zhou J et al (2023) A novel virtual reality training system for pathology education: development and preliminary evaluation. Stud Health Technol Inform 321:583–588

    Google Scholar 

  104. Aeffner F et al (2022) Development and validation of interactive digital pathology cases with embedded quizzes: a new approach to pathology education. Hum Pathol 119:28–36

    Google Scholar 

  105. Esteva A, Topol E (2019) Can skin cancer diagnosis be transformed by AI? The Lancet 394(10211):1795

    Article  Google Scholar 

  106. Haenssle HA et al (2018) Man against machine: diagnostic performance of a deep learning convolutional neural network for dermoscopic melanoma recognition in comparison to 58 dermatologists. Ann Oncol 29(8):1836–1842. https://doi.org/10.1093/annonc/mdy166

    Article  Google Scholar 

  107. Brinker TJ et al (2019) Deep learning outperformed 136 of 157 dermatologists in a head-to-head dermoscopic melanoma image classification task. Eur J Cancer 113:47–54. https://doi.org/10.1016/j.ejca.2019.04.001

    Article  Google Scholar 

  108. Liu Y, Jain A, Eng C, Way DH, Lee K, Bui P, Kanada K, de Oliveira Marinho G, Gallegos J, Gabriele S, Gupta V (2020) A deep learning system for differential diagnosis of skin diseases. Nat Med 26(6):900–908

    Article  Google Scholar 

  109. Yap J, Yolland W, Tschandl P (2018) Multimodal skin lesion classification using deep learning. Exp Dermatol 27(11):1261–1267. https://doi.org/10.1111/exd.13777

    Article  Google Scholar 

  110. Marchetti MA et al (2018) Results of the 2016 international skin imaging collaboration international symposium on biomedical imaging challenge: comparison of the accuracy of computer algorithms to dermatologists for the diagnosis of melanoma from dermoscopic images. J Am Acad Dermatol 78(2):270-277.e1. https://doi.org/10.1016/j.jaad.2017.08.016

    Article  Google Scholar 

  111. Li Y, Esteva A, Kuprel B, Novoa R, Ko J, Thrun S (2017) Skin cancer detection and tracking using data synthesis and deep learning. In: Workshops at the thirty-first AAAI conference on artificial intelligence

    Google Scholar 

  112. Yu L et al (2023) An augmented reality tool for skin cancer education. Nat Med 29(2):346–352

    Google Scholar 

  113. Afaq A et al (2022) Deep learning-based algorithm for classification of skin biopsy images. J Pathol Inform 13(1):33

    Google Scholar 

  114. Ting DSW et al (2018) Artificial intelligence and deep learning in ophthalmology. Springer eBooks 103(2):167–175. https://doi.org/10.1136/bjophthalmol-2018-313173

    Article  Google Scholar 

  115. Keane PA, Topol EJ (2018b) With an eye to AI and autonomous diagnosis. NPJ Digital Med 1(1). https://doi.org/10.1038/s41746-018-0048-y

  116. Keane P, Topol E (2019) Reinventing the eye exam. The Lancet 394(10215):2141

    Article  Google Scholar 

  117. De Fauw J et al (2018) Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med 24(9):1342–1350. https://doi.org/10.1038/s41591-018-0107-6

    Article  Google Scholar 

  118. Kern C et al (2019) Implementation of a cloud-based referral platform in ophthalmology: making telemedicine services a reality in eye care. Br J Ophthalmol 104(3):312–317. https://doi.org/10.1136/bjophthalmol-2019-314161

    Article  Google Scholar 

  119. Gulshan V et al (2016) Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA 316(22):2402. https://doi.org/10.1001/jama.2016.17216

    Article  Google Scholar 

  120. Ruamviboonsuk P, Krause J, Chotcomwongse P, Sayres R, Raman R, Widner K, Campana BJ, Phene S, Hemarat K, Tadarati M, Silpa-Archa S (2019) Deep learning versus human graders for classifying diabetic retinopathy severity in a nationwide screening program. NPJ Digital Med 2(1):25

    Article  Google Scholar 

  121. Abràmoff MD, Lou Y, Erginay A, Clarida W, Amelon R, Folk JC, Niemeijer M (2016) Improved automated detection of diabetic retinopathy on a publicly available dataset through integration of deep learning. Invest Ophthalmol Visual Sci 57(13):5200–5206. Available at: https://doi.org/10.1167/iovs.16-19964

  122. Ting DSW, Cheung CYL, Lim G, Tan GSW, Quang ND, Gan A, Hamzah H, Garcia-Franco R, San Yeo IY, Lee SY, Wong EYM (2017) Development and validation of a deep learning system for diabetic retinopathy and related eye diseases using retinal images from multiethnic populations with diabetes. JAMA 318(22):2211–2223

    Article  Google Scholar 

  123. Abràmoff MD, Lavin PT, Birch M, Shah N, Folk JC (2018) Pivotal trial of an autonomous AI-based diagnostic system for detection of diabetic retinopathy in primary care offices. NPJ Digital Med 1(1). Available at: https://doi.org/10.1038/s41746-018-0040-6

  124. Varadarajan AV, Bavishi P, Ruamviboonsuk P, Chotcomwongse P, Venugopalan S, Narayanaswamy A, Cuadros J, Kanai K, Bresnick G, Tadarati M, Silpa Archa S (2020) Predicting optical coherence tomography-derived diabetic macular edema grades from fundus photographs using deep learning. Nat Commun 11(1):130

    Article  Google Scholar 

  125. Yim J, Chopra R, Spitz T, Winkens J, Obika A, Kelly C, Askham H, Lukic M, Huemer J, Fasler K, Moraes G (2020) Predicting conversion to wet age-related macular degeneration using deep learning. Nat Med 26(6):892–899

    Article  Google Scholar 

  126. Li Z, He Y, Keel S, Meng W, Chang RT, He M (2018b) Efficacy of a deep learning system for detecting glaucomatous optic neuropathy based on color fundus photographs. Ophthalmology 125(8):1199–1206. Available at: https://doi.org/10.1016/j.ophtha.2018.01.023

  127. Yousefi S, Kiwaki T, Zheng Y, Sugiura H, Asaoka R, Murata H, Lemij H, Yamanishi K (2018) Detection of longitudinal visual field progression in glaucoma using machine learning. Am J Ophthalmol 193:71–79

    Article  Google Scholar 

  128. Brown JM, Campbell JP, Beers A, Chang K, Ostmo S, Chan RVP, Dy J, Erdogmus D, Ioannidis S, Kalpathy-Cramer J, Chiang MF (2018) Automated diagnosis of plus disease in retinopathy of prematurity using deep convolutional neural networks. JAMA Ophthalmol. Am Med Assoc 803–810. Available at: https://doi.org/10.1001/jamaophthalmol.2018.1934

  129. Poplin R, Varadarajan AV, Blumer K, Liu Y, Mcconnell MV, Corrado GS, Peng L, Webster DR (2017) Predicting cardiovascular risk factors from retinal fundus photographs using deep learning. arXiv 2017. arXiv preprint arXiv:1708.09843

  130. Mitani A, Liu Y, Huang A, Corrado GS, Peng L, Webster DR, Hammel N, Varadarajan AV (no date) Detecting Anemia from Retinal Fundus Images

    Google Scholar 

  131. Mitani A, Huang A, Venugopalan S, Corrado GS, Peng L, Webster DR, Hammel N, Liu Y, Varadarajan AV (2020) Detection of anaemia from retinal fundus images via deep learning. Nat Biomed Eng 4(1):18–27

  132. Sabanayagam C, Xu D, Ting DSW, Nusinovici S, Banu R, Hamzah H, Lim C, Tham YC, Cheung CY, Tai ES, Wang YX, Jonas JB, Cheng CY, Lee ML, Hsu W, Wong TY (2020) A deep learning algorithm to detect chronic kidney disease from retinal photographs in community-based populations. Lancet Digit Health 2(6):e295–e302. Available at: https://doi.org/10.1016/S2589-7500(20)30063-7

  133. Vasudevan S (2022) Digital biomarkers: convergence of digital health technologies and biomarkers, pp 3–5. Available at: https://doi.org/10.1038/s41746-022-00583-z

  134. Tan TF et al (2023) Artificial intelligence and digital health in global eye health: opportunities and challenges, pp 1432–1443. Available at: https://doi.org/10.1016/S2214-109X(23)00323-6

  135. Maier-Hein L, Vedula SS, Speidel S, Navab N, Kikinis R, Park A, Eisenmann M, Feussner H, Forestier G, Giannarou S, Hashizume M, Katic D, Kenngott H, Kranzfelder M, Malpani A, März K, Neumuth T, Padoy N, Pugh C, Schoch N, Stoyanov D, Taylor R, Wagner M, Hager GD, Jannin P (2017) Surgical data science for next-generation interventions. Nat Biomed Eng 691–696. Nature Publishing Group. Available at: https://doi.org/10.1038/s41551-017-0132-7

  136. Garcia-Peraza-Herrera LC, Li W, Fidon L, Gruijthuijsen C, Devreker A, Attilakos G, Deprest J, Vander Poorten E, Stoyanov D, Vercauteren T, Ourselin S (2017) ToolNet: holistically-nested real-time segmentation of robotic surgical tools. Available at: https://doi.org/10.1109/IROS.2017.8206462

  137. Zia A, Sharma Y, Bettadapura V, Sarin EL, Essa I (2017) Video and accelerometer-based motion analysis for automated surgical skills assessment. Available at: http://arxiv.org/abs/1702.07772

  138. Sarikaya D, Corso JJ, Guru KA (2020) Detection and localization of robotic tools in robot-assisted surgery videos using deep neural networks for region proposal and detection. Available at: https://doi.org/10.1109/TMI.2017.2665671

  139. Jin A, Yeung S, Jopling J, Krause J, Azagury D, Milstein A, Fei-Fei L (2018) Tool detection and operative skill assessment in surgical videos using region based convolutional neural networks. Available at: http://arxiv.org/abs/1802.08774

  140. Twinanda AP, Shehata S, Mutter D, Marescaux J, de Mathelin M, Padoy N (2016) EndoNet: a deep architecture for recognition tasks on laparoscopic videos. Available at: http://arxiv.org/abs/1602.03012

  141. Lin HC, Shafran I, Yuh D, Hager GD (2006) Towards automatic skill evaluation: detection and segmentation of robot-assisted surgical motions. Comput Aided Surg 11(5):220–230. Available at: https://doi.org/10.3109/10929080600989189

  142. Khalid S, Goldenberg M, Grantcharov T, Taati B, Rudzicz F (2020) Evaluation of deep learning models for identifying surgical actions and measuring performance. JAMA Network Open 3(3):e201664. Available at: https://doi.org/10.1001/jamanetworkopen.2020.1664

  143. Vassiliou MC, Feldman LS, Andrew CG, Bergman S, Leffondré K, Stanbridge D, Fried GM (2005) A global assessment tool for evaluation of intraoperative laparoscopic skills. Am J Surg 190(1):107–113. Available at: https://doi.org/10.1016/j.amjsurg.2005.04.004

  144. Jin Y, Dou Q, Chen H, Yu L, Qin J, Fu CW, Heng PA (2018) SV-RCNet: workflow recognition from surgical videos using recurrent convolutional network. IEEE Trans Med Imaging 37(5):1114–1126. Available at: https://doi.org/10.1109/TMI.2017.2787657

  145. Azari DP, Frasier LL, Quamme SRP, Greenberg CC, Pugh CM, Greenberg JA, Radwin RG (2019) Modeling surgical technical skill using expert assessment for automated computer rating. Ann Surg 269(3):574–581. Available at: https://doi.org/10.1097/SLA.0000000000002478

  146. Ma AJ, Rawat N, Reiter A, Shrock C, Zhan A, Stone A, Rabiee A, Griffin S, Needham DM, Saria S (2017) Measuring patient mobility in the ICU using a novel noninvasive sensor. Crit Care Med 45(4):630–636. Available at: https://doi.org/10.1097/CCM.0000000000002265

  147. Davoudi A, Malhotra KR, Shickel B, Siegel S, Williams S, Ruppert M, Bihorac E, Ozrazgat-Baslanti T, Tighe PJ, Bihorac A, Rashidi P (2019) Intelligent ICU for autonomous patient monitoring using pervasive sensing and deep learning. Sci Rep 9(1). Available at: https://doi.org/10.1038/s41598-019-44004-w

  148. Chakraborty I, Elgammal A, Chakraborty I (2013) Video based activity recognition in a trauma center. In: 2013 10th IEEE international conference and workshops on automatic face and gesture recognition (FG), pp 1–8. Available at: https://doi.org/10.7282/T3VH5SBX

  149. Twinanda AP, Alkan EO, Gangi A, de Mathelin M, Padoy N (2015) Data driven spatio-temporal RGBD feature encoding for action recognition in operating rooms. Int J Comput Assist Radiol Surg 10(6):737–747. Available at: https://doi.org/10.1007/s11548-015-1186-1

  150. Kaplan RS, Porter ME (2011) How to solve the cost crisis in health care. Harv Bus Rev 89(9):46–52

  151. Wang S, Chen L, Zhou Z, Sun X, Dong J (2016) Human fall detection in surveillance video based on PCANet. Multimedia Tools Appl 75(19):11603–11613. Available at: https://doi.org/10.1007/s11042-015-2698-y

  152. Núñez-Marcos A, Azkune G, Arganda-Carreras I (2017) Vision-based fall detection with convolutional neural networks. Wirel Commun Mobile Comput 2017. Available at: https://doi.org/10.1155/2017/9474806

  153. Luo Z, Balachandar N, Li L-J, Hsieh J-T, Yeung S, Pusiol G, Luxenberg J, Li G, Downing NL, Milstein A, Fei-Fei L (2018) Computer vision-based descriptive analytics of seniors' daily activities for long-term health monitoring. Proceedings of Machine Learning Research. Available at: https://www.researchgate.net/publication/338558227

  154. Zhang C, Tian Y (2012) RGB-D camera-based daily living activity recognition. J Comput Vision Image Proces 2(4):12

  155. Pirsiavash H, Ramanan D (2012) Detecting activities of daily living in first-person camera views. In: 2012 IEEE conference on computer vision and pattern recognition. IEEE, pp 2847–2854

  156. Kishore PVV, Prasad MVD (2015) Optical flow hand tracking and active contour hand shape features for continuous sign language recognition with artificial neural networks. Int J Softw Eng Appl 9(12):231–250. Available at: https://doi.org/10.14257/ijseia.2015.9.12.21

  157. Webster D, Celik O (2014) Systematic review of Kinect applications in elderly care and stroke rehabilitation. J NeuroEngineering Rehabil 11:108. Available at: http://www.jneuroengrehab.com/content/11/1/108

  158. Chen W, McDuff D (2018) DeepPhys: video-based physiological measurement using convolutional attention networks. In: Proceedings of the European conference on computer vision (ECCV), pp 349–365

  159. Moazzami B, Razavi-Khorasani N, Dooghaie Moghadam A, Farokhi E, Rezaei N (2020) COVID-19 and telemedicine: immediate action required for maintaining healthcare providers well-being. J Clin Virol 126. Available at: https://doi.org/10.1016/j.jcv.2020.104345

  160. Gerke S, Yeung S, Cohen IG (2020) Ethical and legal aspects of ambient intelligence in hospitals. JAMA 601–602. American Medical Association. Available at: https://doi.org/10.1001/jama.2019.21699

  161. Young AT, Xiong M, Pfau J, Keiser MJ, Wei ML (2020) Artificial intelligence in dermatology: a primer. J Invest Dermatol 1504–1512. Elsevier B.V. Available at: https://doi.org/10.1016/j.jid.2020.02.026

  162. Huang H, Li Z, Wang L, Chen S, Dong B, Zhou X (2020) Feature space singularity for out-of-distribution detection. Available at: http://arxiv.org/abs/2011.14654

  163. Schaekermann M, Cai CJ, Huang AE, Sayres R (2020) Expert discussions improve comprehension of difficult cases in medical image assessment. In: Proceedings of the 2020 CHI conference on human factors in computing systems, pp 1–13

  164. Caruana R, Pratt L, Thrun S (1997) Multitask learning. Kluwer Academic Publishers

  165. Wulczyn E, Steiner DF, Xu Z, Sadhwani A, Wang H, Flament-Auvigne I, Mermel CH, Chen PHC, Liu Y, Stumpe MC (2020) Deep learning-based survival prediction for multiple cancer types using histopathology images. PLoS ONE 15(6). Available at: https://doi.org/10.1371/journal.pone.0233678

  166. Dusenberry MW, Tran D, Choi E, Kemp J, Nixon J, Jerfel G, Heller K, Dai AM (2020) Analyzing the role of model uncertainty for electronic health records. In: ACM CHIL 2020—proceedings of the 2020 ACM conference on health, inference, and learning. Association for Computing Machinery, Inc, pp 204–213. Available at: https://doi.org/10.1145/3368555.3384457

  167. Obermeyer Z, Powers B, Vogeli C, Mullainathan S (2019) Dissecting racial bias in an algorithm used to manage the health of populations. Science 366(6464):447–453. Available at: https://doi.org/10.1126/science.aax2342

  168. Liu X, Cruz Rivera S, Moher D, Calvert MJ, Denniston AK, Ashrafian H, Beam AL, Chan AW, Collins GS, Darzi A, Deeks JJ, ElZarrad MK, Espinoza C, Esteva A, Faes L, Ferrante di Ruffano L, Fletcher J, Golub R, Harvey H, Haug C, Holmes C, Jonas A, Keane PA, Kelly CJ, Lee AY, Lee CS, Manna E, Matcham J, McCradden M, Monteiro J, Mulrow C, Oakden-Rayner L, Paltoo D, Panico MB, Price G, Rowley S, Savage R, Sarkar R, Vollmer SJ, Yau C (2020a) Reporting guidelines for clinical trial reports for interventions involving artificial intelligence: the CONSORT-AI extension. The Lancet Digital Health. Elsevier Ltd, pp e537–e548. Available at: https://doi.org/10.1016/S2589-7500(20)30218-1

  169. Asan O, Bayrak AE, Choudhury A (2020) Artificial intelligence and human trust in healthcare: focus on clinicians. J Med Internet Res. JMIR Publications Inc. Available at: https://doi.org/10.2196/15154

  170. McKinney M, Sieniek MT, Godbole V, Godwin J, Ashrafian H, Back T, Chesus M, Corrado GC, Darzi A, Etemadi M, Garcia-Vicente F, Gilbert FJ, Halling-Brown M, Jansen S, Karthikesalingam A, Kelly CJ, King D, Ledsam R, Melnick D, Mostofi H, Romera-Paredes B, Peng L, Reicher JJ, Sidebottom R, Suleyman M, Tse D, De Fauw J, Shetty S (no date) International evaluation of an AI system for breast cancer screening

  171. Kamulegeya L, Bwanika J, Okello M, Rusoke D, Nassiwa F, Lubega W, Musinguzi D, Börve A (2023) Using artificial intelligence on dermatology conditions in Uganda: a case for diversity in training data sets for machine learning. Afr Health Sci 23(2):753–763. Available at: https://doi.org/10.4314/ahs.v23i2.86

  172. Widya AR, Monno Y (2019) Whole stomach 3D reconstruction and frame localization from monocular endoscope video. IEEE J Transl Eng Health Med 7:1–10. https://doi.org/10.1109/jtehm.2019.2946802

  173. Makki K, Chandelon K, Bartoli A (2023) Elliptical specularity detection in endoscopy with application to normal reconstruction. Int J Comput Assist Radiol Surg 18(7):1323–1328. https://doi.org/10.1007/s11548-023-02904-3

  174. Azagra P, Sostres C, Ferrández Á, Riazuelo L, Tomasini C, Barbed OL, Morlana J, Recasens D, Batlle VM, Gómez-Rodríguez JJ, Elvira R (2023) Endomapper dataset of complete calibrated endoscopy procedures, pp 1–16. Available at: https://doi.org/10.1038/s41597-023-02564-7

  175. Zhao S, Wang C, Wang Q, Liu Y, Zhou SK (2022) 3D endoscopic depth estimation using 3D surface-aware constraints, pp 1–11

  176. He K, Girshick R, Dollar P (2019) Rethinking ImageNet pre-training. 2019 IEEE/CVF international conference on computer vision (ICCV). Seoul, Korea (South), pp 4917–4926. https://doi.org/10.1109/ICCV.2019.00502

  177. Sanders J, Kandrot E, Jacoboni E (2011) CUDA by example: an introduction to general purpose GPU programming. Pearson

  178. Padoy N, Blum T, Ahmadi S-A, Feussner H, Berger M-O, Navab N (2012) Statistical modeling and recognition of surgical workflow. Med Image Analysis 16(3):632–641. Available at: http://www.brainlab.com

  179. De Sousa TF, Camilo CG (2023) HDEEP: hierarchical deep learning combination for detection of diabetic retinopathy. Procedia Comput Sci 222:425–434. https://doi.org/10.1016/j.procs.2023.08.181

  180. Lee RS, Gimenez F, Hoogi A, Miyake KK, Gorovoy M, Rubin DL et al (2017) Data descriptor: a curated mammography data set for use in computer-aided detection and diagnosis research. Nature Publishing Group, pp 1–9. Available at: https://doi.org/10.1038/sdata.2017.177

  181. Codella NC, Gutman D, Celebi ME, Helba B, Marchetti MA, Dusza SW, Kalloo A, Liopyris K, Mishra N, Kittler H, Halpern A (2018) Skin lesion analysis toward melanoma detection: a challenge at the 2017 International symposium on biomedical imaging (ISBI), hosted by the international skin imaging collaboration (ISIC). 2018 IEEE 15th international symposium on biomedical imaging (ISBI 2018). Washington, DC, USA, pp 168–172. https://doi.org/10.1109/ISBI.2018.8363547

  182. Irvin J, Rajpurkar P, Ko M, Yu Y, Ciurea-Ilcus S, Chute C, Marklund H, Haghgoo B, Ball R, Shpanskaya K, Seekins J, Mong DA, Halabi SS, Sandberg JK, Jones R, Larson DB, Langlotz CP, Patel BN, Lungren MP, Ng AY (2019) CheXpert: a large chest radiograph dataset with uncertainty labels and expert comparison. Proc AAAI Conf Artif Intell 33(01):590–597. https://doi.org/10.1609/aaai.v33i01.3301590

  183. Chen X, Liu J, Liu Z, Wan L, Lan X, Zheng N. Knowledge graph enhancement for fine-grained zero-shot learning on ImageNet21K. In: IEEE transactions on circuits and systems for video technology. https://doi.org/10.1109/TCSVT.2024.3396215

  184. Aoki N et al (2021b) 3rd place solution to the Google Universal Image Embedding competition on Kaggle. Available at: https://www.kaggle.com/competitions/google-universal-image-embedding

  185. Mullenbach J, Pruksachatkun Y, Adler S, Seale J, Swartz J, McKelvey TG, Dai H, Yang Y, Sontag D (2021) CLIP: a dataset for extracting action items for physicians from hospital discharge notes

  186. Ranftl R, Lasinger K, Hafner D (2022) Towards robust monocular depth estimation: mixing datasets for zero-shot cross-dataset transfer. IEEE Trans Pattern Anal Mach Intell 44(3):1623–1637. Available at: https://doi.org/10.1109/TPAMI.2020.3019967

  187. Kirillov A, Mintun E, Ravi N, Mao H, Rolland C, Gustafson L, Xiao T, Whitehead S, Berg AC, Lo WY, Dollár P (2023) Segment anything. In: Proceedings of the IEEE/CVF international conference on computer vision, pp 4015–4026

  188. Zou X, Yang J, Zhang H, Li F, Li L, Wang J, Wang L, Gao J, Lee YJ (2024) Segment everything everywhere all at once, pp 1–13

  189. Oquab M, Darcet T, Moutakanni T, Vo H, Szafraniec M, Khalidov V, Fernandez P, Haziza D, Massa F, El-Nouby A, Assran M (2024) DINOv2: learning robust visual features without supervision, pp 1–32

  190. Radford A, Kim JW, Hallacy C, Ramesh A, Goh G, Agarwal S, Sastry G, Askell A, Mishkin P, Clark J, Krueger G (2021) Learning transferable visual models from natural language supervision

  191. Li J, Li D, Xiong C, Hoi S (2022) BLIP: bootstrapping language-image pre-training for unified vision-language understanding and generation

  192. Touvron H et al (2021) Training data-efficient image transformers and distillation through attention. In: Meila M, Zhang T (eds) Proceedings of the 38th international conference on machine learning. PMLR (Proceedings of machine learning research), pp 10347–10357. Available at: https://proceedings.mlr.press/v139/touvron21a.html

  193. He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. 2016 IEEE conference on computer vision and pattern recognition (CVPR). Las Vegas, NV, USA, pp 770–778. https://doi.org/10.1109/CVPR.2016.90

  194. He K, Chen X, Xie S, Li Y, Dollár P, Girshick R (2022) Masked autoencoders are scalable vision learners

  195. Liu Z, Mao H, Wu CY, Feichtenhofer C, Darrell T, Xie S (2022) A ConvNet for the 2020s

  196. Liu Z, Hu H, Lin Y, Yao Z, Xie Z, Wei Y, Ning J, Cao Y, Zhang Z, Dong L, Wei F (2022) Swin transformer V2: scaling up capacity and resolution

  197. Dosovitskiy A, Beyer L, Kolesnikov A, Weissenborn D, Zhai X, Unterthiner T, Dehghani M, Minderer M, Heigold G, Gelly S, Uszkoreit J (2021) An image is worth 16×16 words: transformers for image recognition at scale. ICLR 2021. Available at: https://arxiv.org/pdf/2010.11929.pdf

  198. Rombach R, Blattmann A, Lorenz D (2022) High-resolution image synthesis with latent diffusion models. In: Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pp 10684–10695

Corresponding author

Correspondence to Shilpa S. Borkar.

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Deshpande, A.H., Borkar, S.S., Baheti, J.R. (2025). Medical Computer Vision. In: Dulhare, U.N., Houssein, E.H. (eds) Deep Learning and Computer Vision: Models and Biomedical Applications. Algorithms for Intelligent Systems. Springer, Singapore. https://doi.org/10.1007/978-981-96-3648-8_3
