Machine Learning:
- Building data pipelines, engineering features, and modelling and deploying regression models to predict the sale price of used industrial machinery in multiple territories.
- Segmenting customers (k-means clustering) into behaviour-based personas to personalise a web service.
- Natural Language Processing (NLP) of raw text to extract sentiment, keyword pairs and relevant information.
- Using matrix factorisation to predict user responses based on existing information.
- Using hierarchical clustering to identify cancer sub-populations with altered response to chemotherapy.
- Predicting quality changes in a manufacturing process using local regression (LOESS).
- Determining drug response and binding affinity of complex molecules using non-linear regression.
Statistical Analytics:
- Created a multi-phase financial model that predicts the monthly pricing, initial debt, profit and growth rate of commercial projects and can be adjusted to each project's payment terms.
- Optimising experimental time and costs using design of experiments (DoE) approaches.
- Using A/B testing techniques to compare cohorts against control groups.
- Quantifying differences between populations using a range of statistical methods, such as ROC curves and AUC.
- Performed network and relationship analysis of users and the strength of their interactions (R and Neo4j).
Data Science and Analytics:
- Utilising SQL for querying databases and integrating MySQL within R and Python for parameterised queries.
- Creating and deploying SQL databases (both locally and in the cloud) to store datasets, using Docker.
- Making interactive analytics dashboards for stakeholders (Power BI, R Shiny, Dash and Tableau).
- Deploying machine learning within an interactive dashboard for users to make predictions (Dash & Power BI).
- Proficient in cleaning, filtering and mining datasets, web scraping and interacting with APIs.
- Explored, determined and normalised variations within a service to produce more efficient costing models.
- Developed a Pricing Index to show pricing trends over time, using regression to normalise high-variance items.
Data and Software Engineering:
- Migrating Jupyter notebooks to automated CI/CD data pipelines in production, complete with data preparation, model training and performance reporting.
- Developing ML APIs with multiple endpoints, designed to serve model predictions for different purposes.
- Building simple data lakes on AWS and using AWS Glue crawlers to query multiple CSV files as a SQL database.
- Re-architecting an on-premises legacy data warehouse for modern infrastructure (data warehouse/data mart and data lake solutions).
- Scripting using Python (PyCharm and Jupyter Notebooks), R (including R Markdown and Shiny), SQL and Bash.
- Experience using Git, Docker, AWS services (including SageMaker and Athena), DVC, Neo4j, SPSS and Flask.
Organisation:
- Possess excellent time management skills with a proven track record in project management.
- Extensive remote working experience as part of multinational teams, including teams with no fixed geographical location.
- Adept at working well individually and as part of larger teams to deliver projects to required deadlines.
- Supporting agile projects, enabling teams to use data-driven approaches to pivot quickly towards useful areas.
- Capable of managing the logistics, budget and operations of complex projects (e.g. clinical research trials).
- Experience leading international collaborations with large research groups.
- Proficient in delivering training courses to colleagues and external visitors.
- Extensive experience with Microsoft Office, with advanced knowledge of Excel.
Communication:
- Possess outstanding written and verbal communication skills; experienced in presenting technical information to board members and ensuring conclusions are comprehensible to both technical and non-technical personnel.
- Presenting and report writing in both industrial and academic settings and publishing in scientific journals.
- Running workshops and presenting talks at meetups.
- Moderating and debating topics at meetups.