Research Article

AI model that predicts antibiotic resistance from bacterial genomes (AMR) using open sequencing data

Authors

Abstract

Antimicrobial resistance (AMR) prediction from bacterial whole-genome sequencing can shorten time to effective therapy, strengthen surveillance, and reduce reliance on slow culture-based susceptibility testing. However, many genomic AMR workflows still depend on curated gene and mutation catalogs, which can miss emerging mechanisms, vary in coverage across species, and rarely provide calibrated uncertainty. We present a phenotype-first framework that trains models to predict resistant, intermediate, or susceptible outcomes directly from genome sequences using open repositories. Genomes and linked phenotypes are assembled from NCBI SRA and GenBank records and harmonized with BV-BRC (formerly PATRIC) antibiotic panels. We compare three modeling families: catalog-based baselines (AMRFinderPlus and ResFinder), k-mer linear models with stability selection, and sequence transformers fine-tuned on long k-mer tokens. To support clinical decision-making, predicted probabilities are calibrated with temperature scaling and wrapped with conformal prediction to yield distribution-free confidence sets. Interpretability is addressed by mapping influential k-mers and attention-based attributions to genes and known resistance determinants in CARD and MEGARes, producing mutation importance summaries that can be reviewed by microbiologists. We define evaluation protocols that avoid leakage through lineage structure, including temporal splits and site-stratified cross-validation, and report discrimination, calibration, and abstention metrics. The result is a reproducible template for AMR phenotype modeling that complements rule-based tools while providing transparent evidence and quantified uncertainty. The manuscript targets publication in November 2025 and emphasizes open science, auditability, and transferability across pathogens and antibiotics in public health practice.
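
The calibration step named in the abstract (temperature scaling followed by conformal prediction) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the temperature is fitted or which nonconformity score is used, so the grid search, the score 1 - p(true class), the class order S/I/R, the function names, and the synthetic logits below are all assumptions made for illustration.

```python
import numpy as np

def softmax(logits, T=1.0):
    """Row-wise softmax of logits divided by temperature T."""
    z = logits / T
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels, grid=np.linspace(0.25, 5.0, 96)):
    """Pick the temperature on a grid that minimizes held-out negative log-likelihood.
    (Assumption: a grid search stands in for whatever optimizer the authors use.)"""
    idx = np.arange(len(labels))
    def nll(T):
        p = softmax(logits, T)
        return -np.mean(np.log(p[idx, labels] + 1e-12))
    return float(min(grid, key=nll))

def conformal_threshold(cal_probs, cal_labels, alpha=0.1):
    """Split conformal: return the ceil((n+1)(1-alpha))-th smallest calibration score,
    where the score is 1 - p(true class) (an assumed, commonly used choice)."""
    n = len(cal_labels)
    scores = np.sort(1.0 - cal_probs[np.arange(n), cal_labels])
    k = int(np.ceil((n + 1) * (1 - alpha)))
    return float("inf") if k > n else float(scores[k - 1])

def prediction_sets(probs, qhat, classes=("S", "I", "R")):
    """Keep every class whose score 1 - p does not exceed the threshold."""
    return [[c for c, p in zip(classes, row) if 1.0 - p <= qhat] for row in probs]

# Synthetic, deliberately overconfident logits (hypothetical data, not from the paper).
rng = np.random.default_rng(0)
classes = ("S", "I", "R")
n = 300
y = rng.integers(0, 3, size=n)
logits = rng.normal(size=(n, 3))
logits[np.arange(n), y] += 2.0  # make the true class more likely
logits *= 3.0                   # inflate confidence so that T > 1 is useful

# Separate splits: fit T, calibrate the conformal threshold, then predict.
fit, cal, test = slice(0, 100), slice(100, 200), slice(200, 300)
T = fit_temperature(logits[fit], y[fit])
probs_cal, y_cal = softmax(logits[cal], T), y[cal]
qhat = conformal_threshold(probs_cal, y_cal, alpha=0.1)
sets = prediction_sets(softmax(logits[test], T), qhat, classes)
```

An isolate whose set contains more than one label, or none, is a natural candidate for abstention and confirmatory culture-based testing. In this sketch the temperature and the conformal threshold are fitted on separate splits so that neither reuses the data on which sets are finally predicted.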

Article information

Journal

Frontiers in Computer Science and Artificial Intelligence

Volume (Issue)

4 (3)

Pages

33-43

Published

2025-04-25

How to Cite

Hasan, M. N., Bhuyain, M. M. H., & Chowdhury, F. (2025). AI model that predicts antibiotic resistance from bacterial genomes (AMR) using open sequencing data. Frontiers in Computer Science and Artificial Intelligence, 4(3), 33-43. https://doi.org/10.32996/fcsai.2025.4.3.4

Keywords:

Antimicrobial resistance; AMR prediction; bacterial whole-genome sequencing; genotype–phenotype modeling; antibiotic susceptibility; k-mer embeddings; sequence transformers; genome foundation models; BV-BRC (PATRIC); NCBI SRA; GenBank; AMRFinderPlus; ResFinder; CARD; MEGARes; interpretable machine learning; mutation importance; calibrated uncertainty; temperature scaling; conformal prediction; deep ensembles; lineage-aware validation; temporal split; public health surveillance.