Nasir Mahmood - Data Scientist and Mentor

Hi, I am Nasir Mahmood

AWS Certified Machine Learning Engineer

I bring 15+ years of data science and machine learning experience, with 8 years in banking, insurance, and marketing.

I help businesses leverage the power of DATA to reduce COST and increase EFFICIENCY, leading to ACCELERATED GROWTH.

I live in Canada with my wife, two sons, and daughter. They are the inspiration and motivation that drive me every day to push the limits of what I can achieve.

What I do

DS & ML Services





Sheridan College

Predictive Analytics, Machine Learning & Big Data

  • Supervised Learning: linear regression, logistic regression, decision trees, decision tree ensembles, K-nearest neighbors
  • Unsupervised Learning: k-means clustering, Bayesian classification, hierarchical clustering
  • Model Validation: bias, overfitting, cross-validation, feature importance, ethical AI
  • Natural Language Processing (NLP): text preprocessing, feature representation, similarity measures (TF, TF-IDF, cosine similarity, word2vec)

Sep 2019 - May 2021


Sun Life Canada

Senior Data Scientist

Leading strategy and development of state-of-the-art machine learning methods with a focus on client acquisition and loyalty.

Feb 2018 - Sep 2019


Royal Bank of Canada (RBC)

Data Scientist - NLP, Machine Learning & Advanced Analytics

At RBC, I completed customer attrition prediction and OSFI market share projects. I also worked on a project involving NLP-driven hierarchical clustering for operational intelligence.



Bell Network Big Data

Data Science and Machine Learning

  • Anomaly Detection: using descriptive statistics and linear regression
  • Network Health Prediction: multivariate Gaussian models of network metrics



American Express Canada

Data Science and Analytics

  • Recommender Systems: using regression methods and Bayesian classification techniques
  • Exploratory Data Analysis: analyzing and interpreting key data characteristics
  • Text Analytics: converting words into variables to build predictive models
  • Regression Analysis: modeling and analyzing the relationship between output and independent variables



Hamburg Center for Bioinformatics

Research Scientist

  • Protein structure prediction: extension of an existing algorithm by incorporating a building-blocks feature
  • Selecting discriminatory features of 3D protein building blocks
  • Unsupervised classification of protein building blocks using Bayesian statistics
  • Extending statistical model and Monte Carlo simulations for blind protein structure predictions



Technical University Berlin

Postdoctoral Fellow

  • Extension and improvement of a model-based search prediction algorithm
  • Developed and managed a blind prediction framework for the biennial CASP competition
  • Developed a model assessment framework for the biennial CASP competition
  • Proposed concept of protein building blocks: dynamically adapting to an appropriate resolution of structural representation, hence making statistical modeling and conformational search space manageable
  • Developed an algorithm for extraction of building blocks by applying advanced data mining techniques
  • Model validation framework to test whether 1) the anticipated building blocks exist at all, and 2) if they exist, whether a representative set of building blocks can be used to build models of protein structures



Hamburg Center for Bioinformatics

Research Scientist

  • Protein structure prediction using Bayesian classification with Monte Carlo simulated annealing (MCSA)
  • Implemented interplay between Cartesian coordinates and dihedral angles of protein structures
  • Extracted water-molecule interaction features from known protein structures and performed unsupervised Bayesian re-classification of constituent structures along with existing statistical models
  • Incorporated water interaction (with protein structures) into existing Bayesian classification models
  • Implemented an algorithm to calculate hydrogen bonding energies in protein structures




Data Science Specialization

A 9-course specialization by Johns Hopkins University on Coursera. Specialization Certificate earned on December 22, 2014.

15.071x: The Analytics Edge

A course of study offered by MITx, an online learning initiative of The Massachusetts Institute of Technology through edX.

Statistics in Medicine

A course of study offered by Stanford Online, an online learning initiative of Stanford University, through OpenEdX, the leading open source online learning platform.

Matrix Algebra and Linear Models

A course of study offered by HarvardX, an online learning initiative of Harvard University through edX.

Statistics and R for the Life Sciences

A course of study offered by HarvardX, an online learning initiative of Harvard University through edX.



University of Hamburg, Germany

PhD, Computational Biology

During my PhD research, I worked on protein structure prediction, a classical problem in computational biology. I developed low-resolution, coarse-grained force fields that do not involve a physical model or Boltzmann statistics. Instead, they use a mixture of Bayesian probabilities from normal and discrete distributions of features of known proteins. Consequently, a ratio of probabilities provides the acceptance criterion for the Monte Carlo method in prediction simulations. For more details, please check out my doctoral dissertation and presentation slides.
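
As a rough illustration of that acceptance rule, here is a minimal Python sketch. It is not the dissertation code: the single angle-like feature, the two-component normal mixture with its weights, means, and standard deviations, and all function names are assumptions made up for this example. It only shows how a ratio of probabilities, rather than a Boltzmann factor of an energy difference, can decide whether a Monte Carlo move is accepted.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_probability(x, components):
    """Density of x under a weighted mixture of normal distributions.

    components: list of (weight, mean, sd) tuples (hypothetical values).
    """
    return sum(w * normal_pdf(x, m, s) for w, m, s in components)


def accept(p_old, p_new, rng):
    """Accept a move with probability min(1, p_new / p_old)."""
    if p_new >= p_old:
        return True
    return rng.random() < p_new / p_old


def sample(components, start, steps, step_size, seed=0):
    """Random-walk Monte Carlo driven only by probability ratios."""
    rng = random.Random(seed)
    x = start
    p_x = mixture_probability(x, components)
    trajectory = [x]
    for _ in range(steps):
        # Propose a small symmetric perturbation of the current state.
        candidate = x + rng.uniform(-step_size, step_size)
        p_candidate = mixture_probability(candidate, components)
        if accept(p_old=p_x, p_new=p_candidate, rng=rng):
            x, p_x = candidate, p_candidate
        trajectory.append(x)
    return trajectory


# Hypothetical two-component mixture over a single angle-like feature.
COMPONENTS = [(0.6, -60.0, 15.0), (0.4, 60.0, 20.0)]
trajectory = sample(COMPONENTS, start=0.0, steps=5000, step_size=30.0)
```

Moves toward more probable feature values are always accepted, while moves toward less probable values are accepted only with probability equal to the ratio, so no temperature or energy function appears anywhere in the sketch.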



Cologne University Bioinformatics Center, Germany

Postgrad. Diploma, Applied Bioinformatics

During this 1-year intensive program, I worked on a mandatory research project (in addition to coursework) focused on protein structural alignment, an NP-hard problem. I analyzed the performance of an existing algorithm and then modified it by improving the optimization step. My contribution led to a significant increase in overall algorithm performance. I also benchmarked the new algorithm against existing methods. For more details, please check out my project report and presentation slides.



Otto-von-Guericke University, Magdeburg, Germany

M.Sc. Computational Visualistics

During my research thesis, I implemented and benchmarked a document retrieval system for a handwriting device, the PC Notes Taker (PCNT). The PCNT device is believed to be competitive with other devices that we had already tested and benchmarked. I extended the features of the document retrieval system by implementing a subtype of triangular grid features and compared its results with those of the existing features. The new features were found to perform slightly worse, but closer to square grid features. For more details, please check out my master's thesis and presentation slides.



University of Central Punjab, Lahore, Pakistan

MS Computer Science



Bahauddin Zakariya University, Multan, Pakistan

B.Sc. (Hons) Agri. Entomology

Download resume

Get in touch


Feel free to contact me. I love to interact with people who have interesting ideas or questions, and I always reply to non-spam emails.