Age and Gender Prediction Using Swin Transformer and Multitasking Learning

B Sailendra Reddy; B Aakash Vishal Raj; B Tharun Raju; B Jahnavi; T Anusha

doi:10.5281/zenodo.15129759

Vol. 4 No. 1 (2025): January

RESEARCH ARTICLES

B Sailendra Reddy
Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India.

B Aakash Vishal Raj
Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India.

B Tharun Raju
Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India.

B Jahnavi
Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India.

T Anusha
Department of Computer Science and Engineering, Annamacharya Institute of Technology and Sciences, Kadapa, Andhra Pradesh, India.

DOI: https://doi.org/10.5281/zenodo.15129759

Published 2025-04-03

Keywords

Age Prediction,
Gender Classification,
Swin Transformer,
Multi-task Learning,
Facial Analysis,
Attention Mechanism

How to Cite

B Sailendra Reddy, B Aakash Vishal Raj, B Tharun Raju, B Jahnavi, & T Anusha. (2025). Age and Gender Prediction Using Swin Transformer and Multitasking Learning. International Journal of Computational Learning & Intelligence, 4(1), 374–382. https://doi.org/10.5281/zenodo.15129759

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

Abstract

Age and gender prediction from facial images is an essential task in applications such as security systems, human-computer interaction, and personalized recommendations. However, variations in facial features due to lighting, expressions, and aging effects make it a challenging problem. Traditional convolutional neural networks (CNNs) often struggle with generalization, whereas transformer-based models have shown superior performance by capturing long-range dependencies through self-attention mechanisms. A multi-task learning approach, in which age estimation, gender classification, and contextual age positioning are trained together, enhances feature representation and improves accuracy. Incorporating feature reweighting techniques allows the model to focus on critical facial attributes, refining predictions dynamically. Additionally, leveraging contextual learning, such as relative age positioning, strengthens the model's ability to understand relationships between different age groups. Evaluations using benchmark datasets with diverse demographic distributions demonstrate the effectiveness of such an approach, with performance measured through metrics like mean absolute error (MAE) for age estimation and classification accuracy for gender prediction. Future research can further enhance these models by integrating domain adaptation techniques and optimizing computational efficiency for real-time applications in biometric authentication, healthcare, and social media analytics.
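
The evaluation metrics and the joint training objective named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The abstract does not state the loss functions or task weights, so everything below is an assumption made for illustration: L1 loss for age, binary cross-entropy for gender, "contextual age positioning" treated as a coarse age-group classification, and the weights `w_age`, `w_gender`, and `w_group`. All function names are hypothetical.

```python
import math


def mean_absolute_error(true_ages, predicted_ages):
    """MAE: average absolute difference, in years, between true and predicted ages."""
    if len(true_ages) != len(predicted_ages) or not true_ages:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_ages, predicted_ages)) / len(true_ages)


def accuracy(true_labels, predicted_labels):
    """Fraction of predicted labels that equal the true labels."""
    if len(true_labels) != len(predicted_labels) or not true_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(true_labels, predicted_labels)) / len(true_labels)


def binary_cross_entropy(label, prob, eps=1e-7):
    """Binary cross-entropy for one sample; prob is the predicted P(label == 1)."""
    prob = min(max(prob, eps), 1.0 - eps)  # clip to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def cross_entropy(class_index, probs, eps=1e-7):
    """Categorical cross-entropy for one sample given class probabilities."""
    return -math.log(max(probs[class_index], eps))


def multitask_loss(age_true, age_pred, gender_true, gender_prob,
                   group_true, group_probs,
                   w_age=1.0, w_gender=1.0, w_group=0.5):
    """Weighted sum of three per-sample task losses.

    Assumptions (not from the paper): L1 loss for age, binary cross-entropy
    for gender, cross-entropy over coarse age groups as a stand-in for
    contextual age positioning, and the default weights.
    """
    age_loss = abs(age_true - age_pred)
    gender_loss = binary_cross_entropy(gender_true, gender_prob)
    group_loss = cross_entropy(group_true, group_probs)
    return w_age * age_loss + w_gender * gender_loss + w_group * group_loss


if __name__ == "__main__":
    # Toy values, invented purely to exercise the functions.
    print(mean_absolute_error([25, 40, 61], [27, 38, 65]))
    print(accuracy([0, 1, 1, 0], [0, 1, 0, 0]))
    print(multitask_loss(30, 33.5, 1, 0.9, 1, [0.1, 0.8, 0.1]))
```

In a real model the three predictions would come from separate output heads on a shared backbone, so gradients from all tasks update the same features. The task weights would typically need tuning, since the age loss is measured in years while the classification losses are unitless.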
