
Principal Data Scientist (Digital Twin)

Given the increased focus on data and analytics, we need an additional Principal Data Scientist to work on, and lead, cutting-edge AI/ML projects with a strong focus on the productionisation of Digital Twins.

In the emerging data and analytics world, AI/ML plays a key role, and the Principal Data Scientist must be a creative thinker who proposes innovative ways to look at problems using data. Expect our data to be messy and full of hidden treasures that can only be found with smart data wrangling and machine learning strategies. He/She will need to validate their findings using an experimental and iterative process. These approaches involve trial and error and may require multiple cycles, so they are best addressed using iterative development methods such as agile or lean. We expect the Principal Data Scientist to bring a wealth of experience delivering production-level projects and to understand the difficulties of moving from POC to product. We work mainly with teams in Manufacturing, Quality, and Supply Chain.

Job Purpose:

The Principal Data Scientist is accountable for delivering projects with a high risk / reward ratio. He / She needs to ensure that they are focused on delivering value at optimal cost while ensuring that GSK Tech is looking far enough into the future to guide the business. Leading virtual teams as the lead expert is key to this role. You will help guide the team towards the optimal AI/ML strategies for the problems we are solving. As Principal Data Scientist you are also responsible for the productionisation of our Digital Twins AI/ML capabilities platform, which uses IoT, Feature Store, edge computing, and cloud.

You are accountable for all the business processes identified in the scope below and for contributing to the development of our AI/ML capabilities strategy. In your role, you will collaborate with other functional teams and the digital innovation team to ensure that the optimal portfolio of applications exists to support the business.

  • Manage AI/ML projects and programs to deliver on time and avoid scope creep.
  • Ensure the right level of AI/ML complexity is picked to solve business demands. Avoid overcomplicated solutions and drive for robust AI/ML solutions.
  • Co-create with the business the use cases that will benefit from AI/ML.
  • Given the broad diversity and significant size of the information used in the analysis, assessing the validity of findings can be challenging, and the chance of misleading results is correspondingly greater. The Principal Data Scientist will need to present findings back to the business, exposing assumptions and validation work in a way that business counterparts can easily understand. A strong understanding of which visualisations to use is key to success.

    Business process areas in scope:

  • Supply Chain
  • GIO/Manufacturing
  • Commercial
    Your responsibilities:

  • Identify critical business problems and create analytical/modeling solutions while maintaining the right balance between speed to market and analytical soundness when designing solutions
  • Product Roadmap and strategy for Digital Twins
  • Influence machine learning strategy for a program/project; explore design options to assess efficiency and impact, and develop approaches to improve robustness and rigour
  • Be a key contributor to the planning and direction of a project and effectively prioritize goals
  • Lead discussions at peer review and use quantitative skills to positively influence decision making
  • Represent GSK externally to advance technical capability across the industry
  • Identify opportunities to apply the latest advancements in Machine Learning and Artificial Intelligence to the fields of biology, chemistry, and medicine
  • Create algorithms to extract information from large, multiparametric data sets
  • Deploy your algorithms to production to identify actionable insights from large databases
  • Compare results from various methodologies and recommend the best techniques to stakeholders
  • Design, develop and implement analytical solutions using a variety of commercial and open-source tools (common tools include Python, R, TensorFlow)
  • Develop and embed automated processes for predictive model validation, deployment, and implementation
  • Connect and collaborate with subject matter experts, GIO-Q, Commercial and Medical
  • Make impactful contributions to internal discussions on emerging machine learning methodologies
  • Educate the organization, from both the IT and the business perspectives, on these new approaches, such as testing hypotheses and statistical validation of results
  • Provide thought leadership: facilitate cross-geography fertilization of ideas and implement key principles, best practices, and guidelines across the categories
  • Enable the future BI / Analytics infrastructure and self-service model
  • Demonstrate a combination of business focus, strong analytical and problem-solving skills, and programming knowledge to cycle hypotheses quickly through the discovery phase of a project, together with excellent written and communication skills to report findings in a clear, structured manner.

    Why you?

    Basic Qualifications:

    We are looking for professionals with these required skills to achieve our goals:

  • A higher degree in Engineering, Statistics, Data Science, Applied Mathematics, Computer Science, Physics, Bioinformatics, Computational Biology, Computational Chemistry or related quantitative field
  • 5+ years of experience in at least two AI domains, such as NLP, NLG, Digital Twins, or Computer Vision
  • Knowledge of cloud (Azure, AWS, or Google Cloud) and on-prem platform infrastructure to run and optimize distributed data applications
  • Product mindset: the ability to move from POC to product

    Preferred Qualifications:

    If you have the following characteristics it would be a plus:

  • PhD preferred
  • Experience with Quantum AI/ML, federated AI/ML, Production Digital Twins, or Encrypted AI/ML is a big plus
  • Outcome-driven experience; a consulting background is a plus
  • Experience with, and understanding of, a programming language such as Python
  • Experience with at least one deep learning framework such as TensorFlow, Keras, or PyTorch
  • Experience/Familiarity with standard deep learning algorithms (CNN, LSTM, etc.)
  • Experience with AI/ML DevOps and CI/CD
  • Experience producing high-quality code, tests, and documentation
  • Experience working in global virtual teams
  • Fluency in French



    GSK, Wavre
    Data Engineer
    Degree level: