References & Citations

Dataset, metric, backbones, and methods — with the license obligations

Dataset

Important: Cite both, per the CC BY-NC 4.0 license

The SLICE-3D dataset is licensed CC BY-NC 4.0 (non-commercial, attribution required). If you use it, you must cite both the dataset paper (Kurtansky et al. 2024) and the challenge release (International Skin Imaging Collaboration 2024).

Metric
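
The challenge metric cited for this project (ISIC Research and Kurtansky 2024) is the partial AUC above 80% TPR. As a rough illustration of what that quantity measures, the sketch below integrates the part of the ROC curve that lies above the line TPR = 0.80, so the score ranges from 0 to 0.2. The function name `pauc_above_tpr` and its arguments are hypothetical, labels are assumed to be 0/1, and this is not the official implementation; use the challenge repository for any reported numbers.

```python
def pauc_above_tpr(y_true, y_score, min_tpr=0.80):
    """Area under the ROC curve that lies above the line TPR = min_tpr.

    Illustrative sketch only; the result ranges from 0 to (1 - min_tpr).
    """
    n_pos = sum(1 for y in y_true if y == 1)
    n_neg = len(y_true) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")

    # Rank by descending score; tied scores are grouped into one ROC step.
    ranked = sorted(zip(y_score, y_true), key=lambda p: p[0], reverse=True)
    points = [(0.0, 0.0)]  # (FPR, TPR) pairs, starting at the origin
    tp = fp = 0
    i = 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((fp / n_neg, tp / n_pos))
        i = j

    # Trapezoidal integration of max(TPR - min_tpr, 0) over FPR.
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx = x1 - x0
        lo, hi = y0 - min_tpr, y1 - min_tpr
        if dx == 0 or hi <= 0:
            continue  # vertical step, or segment entirely below the line
        if lo >= 0:
            area += dx * (lo + hi) / 2  # segment entirely above the line
        else:
            # Segment crosses the line: keep only the triangle above it.
            area += dx * (hi / (hi - lo)) * hi / 2
    return area


if __name__ == "__main__":
    labels = [0, 0, 1, 1]
    scores = [0.1, 0.2, 0.8, 0.9]
    print(pauc_above_tpr(labels, scores))
```

A perfect ranking, as in the usage example, reaches the maximum of approximately 0.2, while a ranking that never exceeds 80% TPR before the last negative scores 0.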

Gradient-boosted decision trees

Image backbones (the frontier sweep)

  • ConvNeXt-V2 + FCMAE — Woo, S. et al. (2023) (Woo et al. 2023) — the primary backbone (convnextv2_nano, FCMAE then ImageNet-22k→1k fine-tune; ImageNet weights only).
  • ViT / AugReg — Dosovitskiy, A. et al. ICLR (2021) (Dosovitskiy et al. 2021); Steiner, A. et al. (2021) (Steiner et al. 2021).
  • EVA-02 — Fang, Y. et al. (2023) (Fang et al. 2023).
  • MobileNetV4 — Qin, D. et al. (2024) (Qin et al. 2024).
  • MobileNetV5 / Gemma-3n encoder — Google / Wightman (2025) (Google and Wightman 2025) — the heavy anchor of the frontier (~300 M params).
  • Swin Transformer V2 — Liu, Z. et al. CVPR (2022) (Liu et al. 2022).
  • EfficientViT — Cai, H. et al. CVPR (2023) (Cai et al. 2023).
  • timm (PyTorch Image Models) — Wightman, R. (2019) (Wightman 2019) — source of the backbones and the RTX-class inference-timing benchmarks for the latency axis.

Imbalance & (partial) AUC optimization

Cite this work

Please cite the repository (CITATION.cff renders a “Cite this repository” button on GitHub) and also the SLICE-3D dataset above.

@software{pakistanai_isic2024,
  title   = {ISIC-2024 SLICE-3D: an efficiency-frontier, single-dataset,
             no-external-data baseline (Pakistan.AI)},
  author  = {Raja, Muhammad Junaid Ali Asif and Sultan, Adil and Hassan, Shahzaib Ahmed},
  year    = {2026},
  url     = {https://github.com/junaidaliop/isic2024-tbp},
  note    = {Neural Networks course project, National Yunlin University of Science and Technology}
}

Full bibliography

Cai, Han, Chuang Gan, and Song Han. 2023. “EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention.” CVPR.
Chen, Tianqi, and Carlos Guestrin. 2016. “XGBoost: A Scalable Tree Boosting System.” KDD.
Dosovitskiy, Alexey et al. 2021. “An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT).” ICLR.
Fang, Yuxin, Quan Sun, Xinggang Wang, Tiejun Huang, Xinlong Wang, and Yue Cao. 2023. EVA-02: A Visual Representation for Neon Genesis. https://arxiv.org/abs/2303.11331.
Gao, Wei, and Zhi-Hua Zhou. 2012. On the Consistency of AUC Pairwise Optimization. https://arxiv.org/abs/1208.0645.
Google, and Ross Wightman. 2025. MobileNetV5 / Gemma 3n Vision Encoder.
Hasan, Md Zahid, and Fahmid Yousuf Rifat. 2025. Hybrid Ensemble of Segmentation-Assisted Classification and GBDT for Skin Cancer Detection. https://arxiv.org/abs/2506.03420.
International Skin Imaging Collaboration. 2024. SLICE-3D 2024 Challenge Dataset. https://doi.org/10.34970/2024-slice-3d.
ISIC Research, and Nicholas R. Kurtansky. 2024. ISIC 2024 Challenge Metrics: Partial AUC Above 80% TPR. https://github.com/ISIC-Research/Challenge-2024-Metrics.
Ke, Guolin, Qi Meng, Thomas Finley, et al. 2017. “LightGBM: A Highly Efficient Gradient Boosting Decision Tree.” Advances in Neural Information Processing Systems (NeurIPS).
Kurtansky, Nicholas R. et al. 2024. “The SLICE-3D Dataset: 400,000 Skin Lesion Image Crops Extracted from 3D TBP for Skin Cancer Detection.” Scientific Data, ahead of print. https://doi.org/10.1038/s41597-024-03743-w.
Kurtansky, Nicholas R. et al. 2025. “Automated Triage of Cancer-Suspicious Skin Lesions with 3D Total-Body Photography.” npj Digital Medicine.
Lin, Tsung-Yi, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. 2017. “Focal Loss for Dense Object Detection.” ICCV.
Liu, Mingrui, Zhuoning Yuan, Yiming Ying, and Tianbao Yang. 2019. Stochastic AUC Maximization with Deep Neural Networks. https://arxiv.org/abs/1908.10831.
Liu, Ze et al. 2022. “Swin Transformer V2: Scaling up Capacity and Resolution.” CVPR.
Prokhorenkova, Liudmila, Gleb Gusev, Aleksandr Vorobev, Anna Veronika Dorogush, and Andrey Gulin. 2018. “CatBoost: Unbiased Boosting with Categorical Features.” Advances in Neural Information Processing Systems (NeurIPS).
Qin, Danfeng et al. 2024. MobileNetV4: Universal Models for the Mobile Ecosystem. https://arxiv.org/abs/2404.10518.
Steiner, Andreas, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, and Lucas Beyer. 2021. How to Train Your ViT? Data, Augmentation, and Regularization in Vision Transformers (AugReg). https://arxiv.org/abs/2106.10270.
Wightman, Ross. 2019. PyTorch Image Models (timm). https://github.com/huggingface/pytorch-image-models.
Woo, Sanghyun, Shoubhik Debnath, Ronghang Hu, et al. 2023. ConvNeXt V2: Co-Designing and Scaling ConvNets with Masked Autoencoders (FCMAE). https://arxiv.org/abs/2301.00808.
Yuan, Zhuoning, Yan Yan, Milan Sonka, and Tianbao Yang. 2021. “Large-Scale Robust Deep AUC Maximization: A New Surrogate Loss and Empirical Studies on Medical Image Classification.” ICCV.