Skills Overview

Machine Learning:

  • Building data pipelines, engineering features, and modelling and deploying regression models to predict the sale price of used industrial machinery in multiple territories.
  • Segmenting customers (k-means clustering) into behaviour-based personas to personalise a web service.
  • Natural Language Processing (NLP) of raw text to extract sentiment, keyword pairs and relevant information.
  • Using matrix factorisation to predict user responses from existing information.
  • Using hierarchical clustering to identify cancer sub-populations with altered response to chemotherapy.
  • Predicting quality changes in a manufacturing process, utilising local regression (LOESS regression).
  • Determining drug response and binding affinity of complex molecules using non-linear regression.

Statistical Analytics:

  • Created a multi-phase financial model that predicts the monthly pricing, initial debt, profit and growth rate of commercial projects and can be adjusted to a project's payment terms.
  • Optimising experimental time and costs using design of experiments (DoE) approaches.
  • Using A/B testing techniques to compare cohorts against control groups.
  • Quantifying population differences using a range of statistical methods, such as ROC curves and AUC.
  • Performed network and relationship analysis of users and the strength of their interactions (R and Neo4j).

Data Science and Analytics:

  • Utilising SQL for querying databases and integrating MySQL within R and Python for parameterised queries.
  • Creating and deploying SQL databases (both locally and in the cloud) for storage of datasets, using Docker.
  • Making interactive analytics dashboards for stakeholders (Power BI, R Shiny, Dash and Tableau).
  • Deploying machine learning within an interactive dashboard for users to make predictions (Dash & Power BI).
  • Proficient in cleaning, filtering and mining datasets, web scraping and interacting with APIs.
  • Explored, determined and normalised variations within a service to produce more efficient costing models.
  • Developed a Pricing Index to show pricing trends over time, using regression to normalise high-variance items.

Data and Software Engineering:

  • Migrating Jupyter notebooks to automated CI/CD data pipelines in production, complete with data preparation, model training and performance reporting.
  • Developing ML APIs with multiple endpoints, designed to serve model predictions for different purposes.
  • Building simple data lakes on AWS and using AWS Glue crawlers to query multiple CSV files as a SQL database.
  • Re-architecting an on-prem legacy data warehouse for modern infrastructure (data warehouse/data mart and data lake solutions).
  • Scripting using Python (PyCharm and Jupyter Notebooks), R (including R Markdown and Shiny), SQL and Bash.
  • Experience using: Git, Docker, AWS services (including SageMaker and Athena), DVC, Neo4j, SPSS and Flask.

Organisation:

  • Possess excellent time management skills with a proven track record in project management.
  • Extensive remote working experience as part of multinational teams, including teams with no fixed geographical location.
  • Adept at working well individually and as part of larger teams to deliver projects to required deadlines.
  • Assisting agile projects, enabling teams to use data-driven approaches to pivot quickly towards useful areas.
  • Capable of managing the logistics, budget and operations of complex projects (e.g. clinical research trials).
  • Experience leading international collaborations with large research groups.
  • Proficient in delivering training courses to colleagues and external visitors.
  • Extensive experience with Microsoft Office, with advanced knowledge of Excel.

Communication:

  • Possess outstanding written and verbal communication skills, disseminating technical information to board members and ensuring conclusions are comprehensible to both technical and non-technical personnel.
  • Presenting and report writing in both industrial and academic settings and publishing in scientific journals.
  • Running workshops and presenting talks at meetups.
  • Moderating and debating topics at meetups.