NLP Data Scientist, Real World Data (RWD) - Any, Saudi Arabia - Agilite Global Solutions Company


    Posted 2 weeks ago

    Description
    Role: NLP Data Scientist
    Location: India, Remote (Work from Anywhere in India)
    Minimum Qualification: B.Tech. (or equivalent) from an accredited institution
    Indicative Experience: 7 years
    Salary: Best in the market
    Domain: Preferably Life Sciences/Pharma
    Customer Profile: Captive research and development pods for a $500 million group of pharma data research companies that help patients gain access to lifesaving therapies. We help our clients navigate the complexities at each step of the drug development life cycle, from pipeline to patient.
    Other benefits: Health Insurance, Provident Fund, Life Insurance, Reimbursement of Certification Expenses, Gratuity, 24x7 Health Desk
    About the Company:
    We are headquartered in Pittsburgh, USA, with locations across the globe. We are a team of thoughtful experts driven by the power of our clients' unique ideas. We also have micro-offices in Hyderabad, Chennai, Bengaluru, and Delhi NCR in India. While technical expertise is ingrained in Agilite's DNA, we are more than just engineers and developers; we are trusted product strategists. We pride ourselves on being a ready resource for critical market insights, with the knowledge and experience required to design, build, and scale big ideas to serve our growing list of customers in the USA and Europe. Our preferred working model is Work from Anywhere (WFA). In addition, you can decide on your own work schedule. All we need is the outcome. Our people-centric culture is built on the belief that extraordinary employees create amazing things. Work with us and attain your Ikigai in a place where your aspirations and business objectives intersect.
    Job Description:
    We are seeking a skilled NLP data scientist with a focus on language models to join our AI and Life Sciences Solutions team. Your expertise in processing and understanding natural language data, along with your knowledge of Electronic Health Records (EHR) and laboratory report analysis, will be instrumental in driving our data science initiatives and innovations, particularly in the development of rich, multimodal real-world datasets to expedite RWD-driven drug development in pharma.

    Responsibilities:
    1. Use NLP and open-source Large Language Models (LLMs) such as Llama 2, Mixtral, and BERT to extract, process, and interpret unstructured medical data from diverse sources such as EHRs, medical notes, and laboratory reports.
    2. Collaborate with clinical scientists and data scientists to create efficient NLP models for healthcare, demonstrating an understanding of both the technical and medical aspects of the data.
    3. Conduct data cleaning, preprocessing, and validation to maintain the accuracy and reliability of insights gathered from NLP processes.
    4. Validate and present data findings to stakeholders, demonstrating clear and effective communication skills.

    Required Skills/Qualifications:
    • Master's or Ph.D. degree in Computer Science, Data Science, Computational Linguistics, or a related analytical field.
    • Deep understanding of, and direct experience (2 years) in, handling and interpreting electronic health records (EHR) and laboratory test results is a must.
    • Proven experience (2 years) in NLP, with strong knowledge of NLP techniques such as Named Entity Recognition (NER), text summarization, and topic modelling, and their applied use in healthcare.
    • Expert-level understanding of, and practical experience (1 year) with, Large Language Models (LLMs), e.g. inference and fine-tuning.
    • Proficient in Python and SQL, with strong experience in NLP libraries such as NLTK, spaCy, and Hugging Face Transformers, and deep learning libraries such as PyTorch and TensorFlow.
    • Familiarity with common data science and ML practices, e.g. version control systems, agile methodologies, and documentation.
    • Experience working with the AWS cloud environment and large databases (e.g. AWS Redshift).
    • Experience in managing the ML lifecycle using open-source tools (e.g. MLflow).
    • Detail-oriented, with strong analytical and problem-solving abilities.
    • Excellent verbal and written communication skills, with the ability to present complex data to non-technical audiences.

    Preferred Qualifications:
    • Experience dealing with protected health information (PHI) and familiarity with healthcare-related data privacy laws such as HIPAA.
    • Familiarity with standard healthcare codes and terminologies such as ICD-10, CPT, LOINC, and SNOMED CT.
    • Experience in RAG (Retrieval-Augmented Generation) and vector storage for storing and querying a large volume of unstructured healthcare documents.
