Job

Statistical Single Cell Genomics Fellowship - Computational Biologist / Machine Learning Scientist (Postdoctoral Fellow)

Join a dynamic and interdisciplinary team pushing the frontiers of single cell genomics and machine learning under the leadership of Dr. Yun Renee Zhang in the Division of Intramural Research (DIR) at the National Library of Medicine (NLM), located on the NIH Campus in Bethesda, Maryland.

About the position

NLM is the world’s largest biomedical library and a leader in research, development, and training in biomedical informatics and health information technology. NLM is legislatively mandated to support the essential work of acquiring, organizing, preserving, and disseminating biomedical information, a field that is changing at a more rapid pace than ever before. NLM plays a pivotal role in translating biomedical research into practice. NLM’s research and information services support scientific discovery, health care, and public health, enabling researchers, clinicians, and the public to use the vast wealth of biomedical data to improve health. NLM’s cutting-edge research and training programs, with a focus on artificial intelligence (AI), machine learning, computational biology, and biomedical informatics and health data standards, help catalyze basic biomedical science, data-driven discovery, and health care delivery. The NLM Division of Intramural Research (DIR) develops and applies computational approaches to a broad range of information problems in biology, biomedicine, and human health.

Dr. Zhang’s laboratory at the NLM is looking for highly motivated Postdoctoral Fellow(s) to work in the field of single cell genomics. Projects will focus on integrative computational analyses of single cell multi-omics data, including single cell/nucleus RNA-seq (scRNA-seq) data and spatial transcriptomics data. These projects will span multiple organs and tissues, with the aim of establishing generalizable and robust data analysis strategies that enhance data reusability and yield trustworthy data insights, and ultimately of delivering rigorous scientific findings that help expedite health discoveries. One of the ongoing efforts is to build computational tools to improve the fundamental knowledge about cell phenotypes in health and disease. This effort has created great synergy with one of the broader Strategic Priority Areas at the NLM: establishing the National Center for Biotechnology Information (NCBI) Cell Knowledgebase as a definitive public reference resource of information about cell phenotypes, including cell types, cell states, and developmental trajectories. This Strategic Priority Area is led by the NLM Scientific Director.

Given the rapid adoption of high-throughput single cell omics platforms, Explainable Artificial Intelligence and Machine Learning (XAI/ML) have emerged as powerful tools to capture, organize, and manage the data and information about cell phenotypes identified in the wide variety of experimental and pathological contexts being explored. Our team has developed a suite of computational algorithms, NS-Forest and FR-Match, using a random forest machine learning model and non-parametric statistical testing to enable the accurate identification of cell type classification marker genes and robust matching of cell types across datasets. These tools have become the building blocks of the computational workflow at the backend of the NCBI Cell Knowledgebase.

The Postdoctoral Fellow position offers unique training opportunities in the NIH environment, including being part of an inclusive, diverse, and data-skilled workforce and access to computational resources at the NIH, such as the Biowulf cluster and GPUs. Being at NLM, one of the 27 institutes and centers at the NIH, provides great potential for branching out into new research areas and forming collaborations to advance your research and career growth. Additional information about NIH postdoctoral fellowships: https://www.training.nih.gov/research-training/pd/.

About the Zhang Lab

Dr. Zhang's lab is interested in the development and application of novel computational and machine learning methods based on advanced biostatistics/statistics techniques to analyze large-scale multi-omics single cell data from human and other species in health and disease and to identify data-driven biomarkers for disease diagnostics and therapeutics. Her group carries out dry-lab (computational) research and closely collaborates with wet-lab (experimental) investigators to understand and characterize the diverse cell phenotypes at single cell resolution utilizing various single cell genomics approaches.

Specific tasks include:

  • Lead the methodology development utilizing machine learning and statistical techniques in the context of single cell genomics analysis.
  • Process large-scale datasets from public data portals using scripting languages or APIs.
  • Perform end-to-end scRNA-seq data analysis, including data preprocessing, quality control, dimensionality reduction, finding nearest neighbors, community detection clustering, embedding and visualization.
  • Implement refinements and optimizations to existing algorithms to meet the evolving needs from a wide range of applications and use cases.
  • Identify and translate the real data problems encountered in different use cases of off-the-shelf machine learning algorithms or other computational methods into modeling frameworks that can be assessed using fundamental techniques of data science and statistical theory.
  • Construct innovative solutions to address pragmatic difficulties with novel metrics and models.
  • Conduct simulation studies to investigate the properties of models and metrics under designed scenarios with synthetic data.
  • Conduct systematic benchmarking studies to understand method performance, with rigorous validation across the examined conditions.
  • Work with internal and external collaborators to deliver data analysis insights to peer researchers and domain experts in the scope of larger collaborative projects.
  • Communicate the methodology design and implementation, real data analysis results, and method evaluation results to technical and non-technical audiences.
  • Keep up to date with the latest literature in the methodology and application domains.
  • Provide data intelligence support to data service, product, or project teams.

Salary/Benefits: This is a full-time postdoctoral fellow position, renewable on a yearly basis. The initial appointment will be for 1 year, with extensions up to 5 years. The NIH offers a competitive salary and comprehensive health insurance. The NIH is dedicated to building a diverse community in its training and employment programs as well as the continued education and career development of all its research staff. This position is subject to background checks.

Apply for this vacancy

What you'll need to apply

Prospective candidates are encouraged to submit the following application materials to Dr. Betsy Clark at betsy.clark@nih.gov, with a copy to nlmdirtrainingoffice@mail.nlm.nih.gov. Please include “Statistical Single Cell Genomics Fellowship inquiry” and your last name in the subject line of the email.

  • Curriculum vitae (You are encouraged to include your GitHub handle or code repository URLs to support your application.)
  • Cover letter or statement of research interest.
  • Contact information for 3 references (Please include the full name with titles, institute, email address and phone number of each reference.)

Contact name

Betsy Clark

Contact email

betsy.clark@nih.gov

Qualifications

  • Doctoral degree in statistics, biostatistics, bioinformatics, data science, or a related biomedical science field.
  • Expertise in high-dimensional data analysis, statistical inference, linear models, and supervised and unsupervised machine learning.
  • Experience with non-parametric statistical testing and multiple hypothesis testing is preferred.
  • Experience with functional data analysis and pseudotime trajectory analysis is preferred.
  • Experience with minimum spanning trees and graph theory is preferred.
  • Experience working with gene expression data and/or other omics data modalities.
  • Knowledge of single cell technologies, e.g., scRNA-seq, scATAC-seq, and spatial transcriptomics, is preferred.
  • Experience with dimensionality reduction and multi-omics data integration is preferred.
  • Experience with differential analysis and gene set enrichment analysis is preferred.
  • Proficiency in the programming languages Python and/or R, as well as associated libraries for data science and bioinformatics, e.g., numpy, scipy, pandas, scikit-learn, and Bioconductor packages.
  • Familiarity with single cell computational tools, e.g., cellranger, Seurat, Scanpy, Monocle3, and slingshot, is preferred.
  • Familiarity with single cell data objects and single cell data portals is preferred.
  • Familiarity with software development and deployment tools, e.g., Git, GitHub, Docker.
  • Working knowledge of Linux systems and high-performance computing (HPC).
  • Working knowledge of scientific, biomedical research, and health-related terminology.
  • Demonstrable skills in interpersonal communication, oral and written communication, and an ability to work collaboratively in cross-functional working groups.
  • Experience in collaborating with experimental and computational scientists in a multi-disciplinary project team.
  • Experience in establishing and maintaining collaborations and/or partnerships with other labs and groups working in complementary fields.
  • Capability to handle multiple projects concurrently, with meticulous attention to detail and adaptability to changing work requirements.
  • High level of initiative and ability to use sound judgment to effectively solve problems within the scope of the position.
  • A demonstrated ability to generate and pursue independent research ideas.

Disclaimer/Fine Print

COMMITMENT TO DIVERSITY AND EQUAL EMPLOYMENT OPPORTUNITY: NIH NLM is dedicated to building a diverse community in its training and employment programs and encourages the application and nomination of qualified women, minorities, and individuals with disabilities. The United States Government does not discriminate in employment based on race, color, religion, sex (including pregnancy and gender identity), national origin, political affiliation, sexual orientation, marital status, disability, genetic information, age, membership in an employee organization, retaliation, parental status, military service, or other non-merit factors. NIH NLM will provide reasonable accommodations to applicants with disabilities as appropriate. If you require reasonable accommodation during any part of the application and hiring process, please notify us. HHS, NIH, and NLM are equal opportunity employers.