Our data scientists employ advanced mathematics, statistics, and computer science (machine learning and algorithms), together with industry knowledge, to tackle business problems for clients ranging from governments to Silicon Valley tech giants. We also serve as an R&D function for the company, developing internal tools and proofs of concept to keep us ahead of the market.
The team is technology-agnostic and has the software engineering skills to deploy advanced ML/AI methodologies at scale.
We work on complex, high-profile investigations and develop advanced models, leveraging a high-performance cloud-based tech stack, with engagements including:
- automated document classification using NLP
- geospatial modelling of electric vehicle demand
- communication analytics and social media modelling
- identifying sanctions breaches with unsupervised learning
- graph analysis of financial data
- anomaly detection in API call data
The Data Scientist should have a pragmatic approach to problem-solving and be capable of developing solutions that clients can rely on in production. They should be able to challenge their own work and to strike a balance between perfection and practicality.
Furthermore, as a Senior Data Science Consultant you will be expected to research alternative approaches and deliver high-quality solutions with little guidance from Directors. You will be expected to mentor junior colleagues and lead sub-projects on larger pieces of work.
At the data collection stage of the project, the Data Scientist:
- performs data availability/quality assessment
- owns ETL processes: writes SQL queries or scripts to query APIs; researches and acquires data from public sources; liaises with clients to ingest unstructured data; chooses appropriate storage schemas, in conjunction with engineers / database architects
- identifies issues with the data (missing values, near-duplicates, ambiguous named-entity resolution, non-standardised addresses and dates, etc.) and performs data cleaning
- prepares replicable processing scripts/notebooks in project repository, and documents steps taken
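The cleaning steps above can be sketched in pandas; the column names, values, and date formats below are purely illustrative, not taken from any real engagement:

```python
from datetime import date, datetime

import pandas as pd

# Hypothetical raw extract with the kinds of issues listed above:
# inconsistent casing, trailing whitespace, mixed date formats, a null.
raw = pd.DataFrame({
    "name": ["Acme Ltd", "ACME LTD ", "Beta Corp", None],
    "address": ["1 High St", "1 High St ", "42 Low Rd", "9 Side Ln"],
    "date": ["2021-01-05", "05/01/2021", "2021-02-10", "2021-03-01"],
})

def parse_date(s: str) -> date:
    """Normalise the date formats seen in this (hypothetical) extract."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(s, fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognised date: {s!r}")

# Drop rows with missing names, then standardise text fields so that
# near-duplicates become exact duplicates and can be removed.
clean = raw.dropna(subset=["name"]).copy()
clean["name"] = clean["name"].str.strip().str.upper()
clean["address"] = clean["address"].str.strip()
clean["date"] = clean["date"].map(parse_date)
clean = clean.drop_duplicates()
```

After normalisation the two "Acme" rows collapse into one, leaving two distinct entities; in a real project this script would live in the project repository alongside documentation of each step.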
At the proof of concept stage of the project, the Data Scientist:
- performs exploratory data analysis—how are data distributed and what hypotheses might be supported?—and visualises key findings
- identifies possible model features and applies appropriate transformations or normalisation
- develops a “quick and dirty” approach, likely using off-the-shelf libraries and a reduced sample of available data, to produce initial insights or demonstrate feasibility of a model
- communicates outcomes to non-data scientists, and actively feeds into discussion of what is achievable within project timelines and what other areas of opportunity might exist
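A "quick and dirty" feasibility check of the kind described above might look like the following sketch, using an off-the-shelf scikit-learn model on a reduced sample; the synthetic data stands in for project data, and no tuning is attempted at this stage:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for a reduced sample of cleaned project data:
# the label depends mostly on the first two features.
X = rng.normal(size=(500, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=500) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Off-the-shelf baseline with default hyperparameters: the point is a
# fast feasibility signal, not a production model.
baseline = LogisticRegression().fit(X_train, y_train)
acc = accuracy_score(y_test, baseline.predict(X_test))
```

If a default model comfortably beats chance, that is evidence worth communicating to the client before investing in the production stage.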
At the production stage of the project, the Data Scientist:
- engineers variables into usable features
- develops robust models and statistical analyses, employing appropriate measures of fit to validate their own and colleagues' work
- builds and owns complete pipelines, potentially working alongside engineers/developers, and testing that these work at scale
- finds and fixes bugs in pipeline components (ideally before they go into production!)
- optimises code for time/space efficiency, as needed
- takes responsibility for running new iterations of data (retraining the model, or new inputs for prediction) through the existing pipeline as required, validates the results, and prepares client deliverables
- communicates findings to the client, and incorporates their feedback and domain knowledge to tune the model
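A minimal sketch of a validated, reusable pipeline of the kind described above, assuming scikit-learn and synthetic stand-in data (the preprocessing and model choices are illustrative only):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)

# Synthetic stand-in data: label determined by two of five features.
X = rng.normal(size=(400, 5))
y = (X[:, 0] - X[:, 2] > 0).astype(int)

# Preprocessing and model travel together, so re-running a new
# iteration of data through the pipeline is a single fit/predict call.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", GradientBoostingClassifier(random_state=0)),
])

# Appropriate measure of fit: k-fold cross-validated accuracy rather
# than a single train/test split.
scores = cross_val_score(pipe, X, y, cv=5)
mean_acc = scores.mean()
```

Packaging the steps as one `Pipeline` object also makes it straightforward to test at scale and to hand to engineers as a single artefact.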
At the platform development stage, the Data Scientist:
- exposes model functionality to the development team, e.g. as a package or an API
- supports engineers / developers / UI team in building the end-to-end solution, and advocates for the needs of the client and the data science team
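One lightweight way to expose model functionality as a package, sketched below; the class and field names are hypothetical, and in practice the same object could sit behind an HTTP endpoint:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ModelService:
    """Illustrative wrapper: the development team only sees predict(),
    while model internals (here, a toy linear scorer) stay behind it."""
    coef: List[float]
    intercept: float

    def predict(self, features: List[float]) -> int:
        """Score one observation and return a binary label."""
        score = self.intercept + sum(
            c * f for c, f in zip(self.coef, features)
        )
        return int(score > 0)

# Hypothetical fitted parameters handed over as a versioned artefact.
service = ModelService(coef=[0.8, -0.3], intercept=0.1)
label = service.predict([1.0, 0.0])
```

Keeping the interface this narrow lets the UI and engineering teams integrate the model without depending on the data science stack's internals.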
The ideal candidate will meet the following requirements:
- Educated to degree level in a technical discipline
- Experience in a Data Science role at both proof-of-concept and production stages
- Proficient in Python and a SQL-like language
- Confident at data wrangling/cleaning in a dataframe package (e.g. Python’s pandas)
- Knowledge of general ML concepts (training/test, cross-validation, supervised/unsupervised learning, regression and classification, clustering, over/under-fitting, accuracy/precision/recall, ensemble methods, dimension reduction and feature extraction)
- Detailed understanding of how at least a few specific models actually work (e.g. decision trees, random forests, neural networks, gradient boosting, SVMs, k-means, though not necessarily all of these), including the algorithms involved
- Reasonable knowledge of fundamental computer science concepts (time/space complexity, hashing, simple algorithms for sorting, searching, etc.)
- Able to work with Linux/*nix systems, bash scripting, and SSH with minimal instruction
- Experience developing modularised code, using version control with git—other DevOps knowledge, such as containerisation using Docker, highly desirable
- Ideally, familiarity with an ML framework like Torch or TensorFlow
- Ideally, familiarity with a distributed framework like Spark
- This role requires travel to clients and FTI offices
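To illustrate two of the ML concepts listed above (the distinction between precision and recall), a worked toy example with hypothetical labels:

```python
from sklearn.metrics import precision_score, recall_score

# Toy ground truth and predictions (hypothetical, for illustration):
# 3 true positives exist; the model finds 2 of them and raises 1 false alarm.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

precision = precision_score(y_true, y_pred)  # TP / (TP + FP) = 2 / 3
recall = recall_score(y_true, y_pred)        # TP / (TP + FN) = 2 / 3
```

In investigative work such as sanctions screening, the trade-off between the two (how many flagged items are genuine vs. how many genuine items are missed) is exactly the kind of judgement the role requires.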
FTI Consulting is an equal opportunity employer and does not discriminate on the basis of race, color, national origin, ancestry, citizenship status, protected veteran status, religion, physical or mental disability, marital status, sex, sexual orientation, gender identity or expression, age, or any other basis protected by law, ordinance, or regulation.