AMRrules Specification

Rule Specification

This section details how interpretive rules should be encoded in the AMRrules format. The current version of the AMRrules Specification is v1.0, for use with the AMRrules software package v1.0. The syntax for specifying different types of variants to which a rule should be applied is given in the next section.

On this page you will find the full list of fields (indicating which external databases or ontologies apply to each field, along with a description and guidance on defining/interpreting each field), as well as bespoke AMRrules-specific controlled vocabulary for some fields.

AMRrules template (Google sheet)

The v1.0 rule specification is also available in a Google sheet that includes the AMRrules template, with allowed values encoded in drop-down menus, to facilitate rule curation.

Full list of fields

The full list of fields is below, with guidelines on how each field should be specified and interpreted.


Required fields

status

description

reference standard

reference link

guidance

rationale

ruleID

required

unique identifier for this rule {values listed in ‘organism subgroup codes’}

AMRrules

‘organism subgroup codes’ tab

Combination of 3-letter code (to indicate the organism subgroup who curated the rule, see tab ‘organism subgroup codes’) followed by 4-digit number (assigned by the subgroup).

Each rule needs a unique identifier, so that combinatorial rules can be defined as combinations of component parts. These need to be unique across the entire AMRrules set, but assigned and managed within the subgroups who are defining the individual and combinatorial rules.
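As an illustration of the ruleID format described above, the following is a minimal sketch of a validity and uniqueness check. It assumes the 3-letter code is uppercase ASCII and the number is exactly four digits; the names `split_rule_id` and `find_duplicates` are hypothetical and are not part of the AMRrules software.

```python
import re

# Assumption: three uppercase ASCII letters followed by exactly four digits.
RULE_ID_PATTERN = re.compile(r"[A-Z]{3}[0-9]{4}")


def split_rule_id(rule_id: str) -> tuple[str, int]:
    """Return (subgroup code, rule number), or raise ValueError if malformed."""
    if not RULE_ID_PATTERN.fullmatch(rule_id):
        raise ValueError(f"not a valid ruleID: {rule_id!r}")
    return rule_id[:3], int(rule_id[3:])


def find_duplicates(rule_ids: list[str]) -> list[str]:
    """Return the sorted ruleIDs that occur more than once in the list."""
    seen: set[str] = set()
    dupes: set[str] = set()
    for rule_id in rule_ids:
        if rule_id in seen:
            dupes.add(rule_id)
        else:
            seen.add(rule_id)
    return sorted(dupes)
```

A check like this could be run across the merged rule files of all subgroups, since identifiers must be unique across the entire AMRrules set.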

txid

required

taxonomy ID of the species that this rule applies to

NCBI Taxonomy

https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/

There should be one row per species/marker combination, for clarity of interpretation and parsing the rules files, and for clarity of recording evidence for each rule and its relevance to a given species. The primary taxonomy identifier for AMRrules is the NCBI Taxonomy; this field should contain a valid taxid for a species or genus. Note that these identifiers are stable, even when the species or genus name changes.

organism

required

organism that this rule applies to, normally a species {scientific name}

NCBI Taxonomy

https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/

Indicate the name of the organism the rule applies to. Include the prefix ‘s_’, ‘g_’ etc to indicate the taxonomic level (species, genus). E.g. ‘s_Klebsiella pneumoniae’ indicates species Klebsiella pneumoniae; ‘g_Klebsiella’ indicates genus Klebsiella. This should usually be the value of the ‘current name’ field associated with the taxid in the NCBI Taxonomy. However, if there are issues with the current name, e.g. if it does not match the organism nomenclature used by EUCAST to define a breakpoint, you may use a different organism name.
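The prefix convention above can be sketched as a small parser. This is an illustrative example only: it handles just the two prefixes named in the text (‘s_’ and ‘g_’) and rejects anything else, because the other prefixes implied by ‘etc’ are not listed here. The function name `parse_organism` is hypothetical.

```python
# Only the two prefixes named in the text are mapped; others are rejected
# here because this page does not enumerate them.
RANK_PREFIXES = {"s": "species", "g": "genus"}


def parse_organism(value: str) -> tuple[str, str]:
    """Split an organism value such as 's_Klebsiella pneumoniae' into (rank, name)."""
    prefix, sep, name = value.partition("_")
    name = name.strip()
    if not sep or not name:
        raise ValueError(f"missing rank prefix or name: {value!r}")
    rank = RANK_PREFIXES.get(prefix)
    if rank is None:
        raise ValueError(f"unsupported rank prefix {prefix!r} in {value!r}")
    return rank, name
```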

gene

required

name of the gene that this rule applies to {node ID, or gene symbol if node ID not available} OR a logical expression describing a combination of other ruleIDs {logical}

refgene, NCBI Gene Hierarchy

https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt

If the gene is in the NCBI hierarchy, specify the node ID. If it is not in the NCBI hierarchy, indicate the gene or allele name in NCBI refgene. If it is not in NCBI refgene, use the gene symbol (e.g. ‘mexB’): if the gene is present in CARD, use the gene symbol present there, otherwise try to identify the most suitable gene symbol, and be sure to include refseq and ARO accessions for clarity. For combinatorial rules, this should be a logical expression based on other single-marker rules, which when evaluated as TRUE means this rule should be applied. E.g. “ECO001 & ECO002” means this rule should be applied when both rule ECO001 and rule ECO002 apply (i.e. when the markers defined by these rules are both detected). “(ECO001 | ECO003) & ECO002” means this rule should be applied when either one or both of rules ECO001 or ECO003 apply and ECO002 also applies. Syntax should use ‘&’ for logical AND and ‘|’ for logical OR. If the rule is intended to convey an unexplained mechanism of expected resistance, gene should be set to ‘unknown’, with context ‘core’, phenotype ‘wildtype’, clinical category ‘R’, and breakpoint standard ‘EUCAST Expected Resistant Phenotypes vX (year)’ (all gene identifier fields should be ‘-’, and the curation note should explain the reasoning). If the rule is intended to convey an expected resistance due to lack of the drug target, the same applies but the gene should be set to ‘none’.
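To illustrate how a combinatorial expression might be evaluated, here is a minimal recursive-descent sketch. It is not the AMRrules software's implementation. It assumes that ‘&’ binds more tightly than ‘|’ when no parentheses are given (the specification above does not state a precedence), and that component ruleIDs are letters followed by digits. The function names and the `detected` argument (the set of ruleIDs whose markers were found) are hypothetical.

```python
import re

# A token is a ruleID (letters then digits), an operator, or a parenthesis.
TOKEN = re.compile(r"\s*([A-Za-z]+[0-9]+|[&|()])")


def tokenize(expr: str) -> list[str]:
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expr: str, detected: set[str]) -> bool:
    """Evaluate a rule expression; a ruleID is true if it is in `detected`."""
    tokens = tokenize(expr)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        pos += 1
        return tokens[pos - 1]

    def parse_or():
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # always parse the right side before combining
            value = value or rhs
        return value

    def parse_and():  # assumption: '&' has higher precedence than '|'
        value = parse_atom()
        while peek() == "&":
            take()
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        tok = take()
        if tok == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("expected ')'")
            return value
        if tok in {"&", "|", ")"}:
            raise ValueError(f"unexpected token {tok!r}")
        return tok in detected

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token {peek()!r}")
    return result
```

For example, `evaluate("(ECO001 | ECO003) & ECO002", {"ECO003", "ECO002"})` is true, matching the second example in the guidance above. Curators who want a different grouping can always make it explicit with parentheses.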

nodeID

uniquely identify the gene using AT LEAST ONE NCBI accession: nodeID (preferred) or refseq protein or GenBank protein or HMM (for protein-coding genes); or nucleotide accession with coordinates (for nucleotide variants e.g. 23S or promoter regions)

name of the gene that this rule applies to {node ID in NCBI gene hierarchy}

NCBI Gene Hierarchy

https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt

Can be a leaf node or internal node in the NCBI Gene Hierarchy. Where a rule applies to multiple leaf nodes and/or all descendants of an internal node, it is recommended to specify one row per node, and provide evidence for each one (unless the number of leaf nodes is large and all have the same categorization and evidence).

protein accession

refseq protein accession for the gene this rule applies to

refseq or GenBank protein sequence accession

https://www.ncbi.nlm.nih.gov/refseq/ https://www.ncbi.nlm.nih.gov/genbank/

Indicate the refseq (preferred) or GenBank protein accession for the most appropriate protein sequence. Wherever possible this should match that used in the NCBI Pathogens refgene database.

HMM accession

HMM accession for the gene this rule applies to (suitable for internal nodes in the NCBI Gene Hierarchy)

HMM accession

https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/hmm/

Indicate the HMM accession for the most appropriate protein sequence; this is mainly relevant for internal nodes in the NCBI Gene Hierarchy.

nucleotide accession

nucleotide sequence accession and coordinates defining the gene this rule applies to (suitable for e.g. rRNA genes or promoter variants)

nucleotide sequence accession

https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/refseq/

For variants defined by nucleotide sequences rather than proteins (e.g. 23S, or promoter mutations), indicate the nucleotide sequence accession and coordinates of the relevant gene within that sequence, in the format: accession:start-stop (for genes encoded on forward strand) and accession:stop-start (for genes encoded on reverse strand). The refgene AMR database gives the relevant accessions and coordinates for AMR variants included in AMRFinderPlus.

ARO accession

optional

Antibiotic Resistance Ontology (ARO) identifier for the gene this rule applies to

ARO gene ID

https://card.mcmaster.ca

Optional. Note AROs are not associated with specific sequences, so are insufficient to define a rule.

Useful for harmonization with CARD (for drug dictionary and other things) and for annotation of genotypes generated using other databases/tools based on CARD (which can be mapped to ARO using the argNorm tool).

mutation

required (set to ‘-’ if non-specific)

specific mutation in this gene to which the rule applies

HGVS (with some AMRrules modifications)

https://hgvs-nomenclature.org/stable/ interpretAMR/AMRrulesCuration

Indicate the mutation relative to the gene in ‘gene’. Typically this will be a protein mutation (in the format ‘p.Ser83Tyr’) or a nucleotide mutation in a coding sequence (in the format ‘c.25T’). For more complex examples see interpretAMR/AMRrulesCuration

variation type

required

explanation of the type of variation this rule applies to {values listed in ‘variation type’ tab}

AMRrules

‘variation type’ tab

Indicate the type of variation this rule applies to. Allowed values are in the ‘variation type’ tab. Most common examples are ‘Gene presence detected’, ‘Protein variant detected’, ‘Nucleotide variant detected’ or ‘Combination’.

Based on the ‘variant type’ column in hAMRonization; helps to clarify the nature of the variation to which the rule applies.

gene context

required

indicates the genomic context for this gene in this species {core, acquired, unknown}

AMRrules

Indicate the genomic context of this gene within this species, i.e. whether the gene is ‘core’ or ‘acquired’. The working definition of ‘core’ is: present (>90% identity, >90% length) in the chromosome of >95% of genomes of this species, and of >95% of those that have wildtype AST profiles. Note that a resistance-associated mutation in a core gene (e.g. Ser83Phe in chromosomal GyrA) should be coded as ‘core’. A mutation in an acquired gene should be coded as ‘acquired’.

drug

optional (need drug OR drug class)

name of drug for which the rule applies {ARO term}

ARO term

https://card.mcmaster.ca

Indicate the name of the drug that the rule applies to. Where rules apply to multiple drugs, they should be specified in separate rows (i.e. as separate rules), with individual references for each gene-drug combination. Alternatively, if the rule applies to all drugs in a defined drug class, leave this blank and indicate the ‘drug class’ field instead. Allowed values are all CARD ARO entries of type ‘antibiotic’ (which includes disinfectant agents) or ‘adjuvant’ (which includes inhibitors).

drug class

optional (need drug OR drug class)

name of drug class for which the rule applies (ONLY if the rule is consistent across the entire drug class) {ARO term}

ARO term

https://card.mcmaster.ca

Indicate the name of the drug class that the rule applies to. This field should be completed ONLY IF there is evidence that the gene has activity against all drugs in the class. Note that CARD defines five classes of cephalosporins (first-generation cephalosporin, second-generation cephalosporin, third-generation cephalosporin, fourth-generation cephalosporin and other cephalosporin), as well as penam.

Useful as there are likely to be a lot of determinants that apply across a whole drug class.

phenotype

required

indicates whether members of this species with this gene are expected to fall in the wildtype or non-wildtype part of the reference MIC distribution; this is equivalent to identifying whether the MIC is expected to fall below or above the ECOFF, if one is defined {wildtype, nonwildtype}

EUCAST distribution

mic.eucast.org

Indicates whether isolates of this species, with this gene, are considered to have a wildtype or nonwildtype susceptibility phenotype, equivalent to being below vs above the MIC ECOFF if one is defined. If the gene is a core gene, the expected phenotype should generally be ‘wildtype’, unless the rule refers to a specific variant of the core gene for which there is evidence of a nonwildtype phenotype.

clinical category

required

expected clinical category for members of this species with this gene {S, I, R, NS}

EUCAST

https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes

Indicates the categorization associated with this gene, for members of this species {S, I, R, NS} using the breakpoint standard indicated. If the drug this rule applies to appears on the EUCAST Expected Resistances list for this organism, and the gene is a core gene, the expected phenotype should be ‘wildtype’ and the category should be ‘R’. If the gene is identified as a core gene but the drug does not appear on the EUCAST Expected Resistances list for this organism, and there are no EUCAST Expert Rules recommending reporting as R, there should be strong evidence from literature and/or matched genome/phenotype data to support the assignment of ‘R’. Note that ‘NS’ is only an allowed value for CLSI, not EUCAST, and has a specific meaning that is only relevant when there is a breakpoint for S but not for I or R.

breakpoint

required

indicate the breakpoint that was used to define the expected phenotype category (note this is ‘not applicable’ if rule is specified for a drug class rather than a single drug)

EUCAST

https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes

Give the breakpoint used to define the indicated category for the specified drug (please enter ‘not applicable’ if the rule applies to a drug class). E.g. for categorization as ‘R’, the breakpoint should be given in the form ‘MIC >X [units]’ or ‘disk zone <X mm’; for categorization as ‘S’, use ‘MIC <=X [units]’ or ‘disk zone >X mm’; for categorization as ‘I’, use ‘MIC range, >X and <=Y [units]’. For bug/drug combinations with wildtype ‘I’, the S breakpoint may be arbitrarily set to 0.001 (MIC) or 50 (disk); in this case the breakpoint for ‘I’ should not be defined as a range, e.g. use ‘MIC <=X [units]’ rather than ‘MIC range, >0.001 and <=X [units]’. If the rule is defined on the basis of an ECOFF, indicate the threshold used in the same manner as for a breakpoint. If it is an Expected (intrinsic) resistance, the breakpoint is irrelevant (and usually undefined) so enter ‘not applicable’. If the rule applies to a drug class, enter ‘not applicable’, but consider whether it would be more informative to set specific rules for individual drugs. If there is no breakpoint or ECOFF, enter ‘not available’.

As genotype interpretations are defined relative to clinical categorizations, and there are multiple sources for these and they are updated continuously, we need to record which standard was used to define each rule. This also facilitates accommodating multiple breakpoints for the same bug-drug combination, using different standards or clinical indications (e.g. EUCAST sometimes has different breakpoints for IV vs oral, or for treatment of specific syndromes). This approach also facilitates using an ECOFF in the absence of a breakpoint, and specifying rules defined against other standards such as CLSI or veterinary standards.
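The MIC breakpoint strings described for this field can be sketched with a small formatter. This is an illustrative example, not part of the AMRrules software: the function name, its arguments, and the default units of ‘mg/L’ are assumptions. Here `s_bp` is the S breakpoint (S if MIC <= `s_bp`) and `r_bp` is the R breakpoint (R if MIC > `r_bp`).

```python
def format_mic_breakpoint(category: str, s_bp: float, r_bp: float,
                          units: str = "mg/L") -> str:
    """Build the 'breakpoint' string for an MIC-based S, I or R category."""
    if category == "S":
        return f"MIC <={s_bp:g} {units}"
    if category == "R":
        return f"MIC >{r_bp:g} {units}"
    if category == "I":
        if s_bp == 0.001:
            # Arbitrary S breakpoint (wildtype 'I'): do not report a range.
            return f"MIC <={r_bp:g} {units}"
        return f"MIC range, >{s_bp:g} and <={r_bp:g} {units}"
    raise ValueError(f"unsupported category: {category!r}")
```

For instance, with an S breakpoint of 0.25 and an R breakpoint of 0.5, category ‘I’ gives ‘MIC range, >0.25 and <=0.5 mg/L’, whereas an arbitrary S breakpoint of 0.001 gives ‘MIC <=0.5 mg/L’.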

breakpoint standard

required

indicate the AST phenotyping standard used to interpret this rule

EUCAST

https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes

In the format ‘[Name] [version] ([year])’, e.g. ‘EUCAST v15.0 (2025)’ or ‘ECOFF (May 2025)’ (as ECOFFs at mic.eucast.org are not versioned, indicate month and year). If it is an Expected (intrinsic) resistance, there will not typically be a breakpoint; in this case indicate the version of the expert rules, e.g. ‘EUCAST Expected Resistant Phenotypes v1.2 (2023)’ or ‘EUCAST Salmonella Expert Rules v3.2 (2019)’. If the rule is defined based on an informal breakpoint defined in a paper, indicate the PubMed identifier for the relevant paper in this field as: ‘PMID xxx’.
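A value in this field can be split into its parts with a regular expression. The sketch below is illustrative only and makes some assumptions: the version, when present, starts with ‘v’ and comes last before the bracketed date, and the ‘PMID xxx’ form contains only digits after the prefix. The function name `parse_breakpoint_standard` is hypothetical.

```python
import re

# '[Name] [version] ([year])', where the version part is optional (e.g. ECOFF).
STANDARD = re.compile(
    r"(?P<name>.+?)(?: (?P<version>v[0-9]+(?:\.[0-9]+)*))? \((?P<date>[^()]+)\)"
)
PMID = re.compile(r"PMID (?P<pmid>[0-9]+)")


def parse_breakpoint_standard(value: str) -> dict:
    """Return the name, version, date and pmid parts of a breakpoint standard."""
    value = value.strip()
    match = PMID.fullmatch(value)
    if match:
        return {"name": "PMID", "version": None, "date": None,
                "pmid": match.group("pmid")}
    match = STANDARD.fullmatch(value)
    if match is None:
        raise ValueError(f"unrecognised breakpoint standard: {value!r}")
    return {"name": match.group("name"), "version": match.group("version"),
            "date": match.group("date"), "pmid": None}
```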

breakpoint condition

optional

indicate the specific conditions for this breakpoint, if relevant (e.g. meningitis, uncomplicated UTI, iv, oral)

EUCAST

https://www.eucast.org/clinical_breakpoints

If different breakpoints are defined for different conditions, indicate the conditions relevant to the breakpoint used to define this rule. For example, different breakpoints may be given for different infection types (meningitis, uncomplicated UTI) or therapy types (iv, oral). If all breakpoints are the same, or all result in the same interpretation for this gene, it is preferable to specify a single rule without conditions. If multiple breakpoints are defined, and the interpretation is different using the different breakpoints, it is preferable to define separate interpretive rules for each condition. If the stated purpose of a condition-specific breakpoint is to screen for likely resistance mechanisms (e.g. ciprofloxacin for meningitis), or to enforce reporting of all isolates as ‘I’ for a specific condition, then a condition-specific rule is not needed as this is better managed in downstream reporting logic. Wherever possible, use the controlled vocabulary in the sheet (see dropdown menu and ‘breakpoint condition values’ tab), which includes all such terms used in the EUCAST or CLSI 2025 breakpoints table.

PMID

required

PubMed identifier/s for literature supporting the rule (comma-separated list)

PubMed

https://pubmed.ncbi.nlm.nih.gov/

Provide PubMed identifier for the ‘best’ peer-reviewed research article/s providing specific evidence that this gene is associated with this phenotype category for this drug in this species (separate multiple entries with ‘, ‘). Literature demonstrating evidence in other species, or related drugs, should not be included.

evidence code

required

indicate the nature of the evidence that supports the rule {ECO code; select from controlled list, multiple selections allowed in comma-separated list}

ECO

https://www.evidenceontology.org/

Indicate the nature of the evidence supporting the rule. More than one can be listed; please include all forms of evidence available to support the rule (separate multiple entries with ‘, ‘). In principle any codes in the Evidence and Conclusion Ontology can be used, but in most cases it will be most appropriate to choose from the subset listed in the ‘evidence codes’ tab of this spreadsheet (also provided as a dropdown selection in the main data entry tab of this spreadsheet). The source for each type of evidence should be given in the ‘PMID’ field.

If you want to use an ECO code not yet included in the dropdown list, please let ESGEM-AMR chairs know so that we can add it to the specification as others may find this helpful also. If you feel something is missing from ECO, please also let us know so that we can discuss, and potentially work together to request the addition of new terms to the ontology.

evidence grade

required

expert curators’ overall assessment of the level of support provided by all evidence considered {high, moderate, low, very low}

AMRrules

Indicate the expert curators’ overall assessment of the level of support provided by all evidence considered.

There will often be a need to specify a rule for which the evidence is not yet conclusive. It is important to flag these and give some indication of what is lacking. Allowed terms and their definitions are given in the ‘evidence grades’ tab. Note that if no experimental evidence is available, the rule should NOT be graded as ‘high’, even if there is good evidence of statistical association between genotype and phenotype in natural populations. (Future updates will include additional fields to record quantitative details of genotype/phenotype associations.)

evidence limitations

optional

expert curators’ assessment of the key limitations of the available evidence {values listed in ‘evidence grades’ tab}

AMRrules

‘evidence grades’ tab

This should be completed for all rules with evidence grades other than ‘high’. Use the values listed in the ‘evidence grades’ tab (separate multiple entries with ‘, ‘).

rule curation note

optional

short explanatory note describing the mechanism and/or reasoning for the rule

free text

Highly recommended to complete for all core genes, or combinatorial rules, to explain why this results in susceptibility or resistance.


Controlled vocabularies

Variation type

Specifies the type of variation to which the rule applies. Based on the ‘variant type’ column in the hAMRonization AMR detection specification scheme, with additional terms from the NCIT ontology.

| Values allowed in variation type column | The specified AMRrule applies if… | Notes or source |
| --- | --- | --- |
| Gene presence detected | …the gene specified in the ‘gene’ column is detected as being present. | hAMRonization |
| Protein variant detected | …the protein variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | hAMRonization |
| Nucleotide variant detected | …the nucleotide variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | hAMRonization |
| Promoter variant detected | …the promoter variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | NCIT:C190205 |
| Inactivating mutation detected | …the gene specified in the ‘gene’ column is inactivated by any type of mechanism (e.g. frameshift, internal stop, deletion, truncation), in the amino acid range specified in the ‘mutation’ column (or anywhere in the gene, if the ‘mutation’ column is blank, i.e. ‘-’). | NCIT:C178119 |
| Gene truncation detected | …the gene specified in the ‘gene’ column is truncated, within the amino acid range specified in the ‘mutation’ column. | |
| Gene copy number variant detected | …the gene specified in the ‘gene’ column is detected in at least the minimum number of copies specified in the ‘mutation’ column. | NCIT:C189957 |
| Nucleotide variant detected in multi-copy gene | …the gene specified in the ‘gene’ column is a gene that is normally present in multiple copies (e.g. rRNA genes), and the nucleotide variant specified in the ‘mutation’ column is detected in at least the minimum number of alleles specified in the ‘mutation’ column. | |
| Low frequency variant detected | …the read data support a mixed population, in which at least the minimum fraction of reads specified in the ‘mutation’ column supports the presence of the nucleotide variant specified in the ‘mutation’ column in the gene specified in the ‘gene’ column (currently intended for TB only). | |
| Combination | …the logical expression in the ‘gene’ column, which expresses a combination of component rules identified by their ‘ruleID’, evaluates as true. | |


Evidence codes

Specified using the Evidence and Conclusion Ontology (ECO), this field indicates the nature of the evidence supporting the rule. More than one can be listed, and the field should include all forms of evidence available to support the rule (multiple entries separated with ‘, ‘).

Any ECO codes can be used, but curators are encouraged to choose from the subset listed here, which covers the types of evidence typically available to support resistance mechanisms in bacteria. Note the literature source for each type of evidence noted here should be indicated in the PMID field.

ECO:0001091

knockout phenotypic evidence

ECO:0001091 knockout phenotypic evidence

E.g. evidence that knocking out the proposed AMR gene in a phenotypically resistant strain results in loss of resistance

ECO:0000012

functional complementation evidence

ECO:0000012 functional complementation evidence

E.g. evidence that, when a gene knockout results in change from R to S, the phenotype is reversed (resistance is restored) when the gene is reintroduced

ECO:0001113

point mutation phenotypic evidence

ECO:0001113 point mutation phenotypic evidence

E.g. for a mutation, evidence that this specific mutation is associated with a change in susceptibility phenotype

ECO:0000024

protein-binding evidence

ECO:0000024 protein-binding evidence

E.g. evidence that the gene product binds to this drug

ECO:0001034

crystallography evidence

ECO:0001034 crystallography evidence

E.g. structural evidence from crystallography that the mutated position in this gene product interacts with the drug

ECO:0000005

enzymatic activity assay evidence

ECO:0000005 enzymatic activity assay evidence

E.g. evidence that the gene product has enzymatic activity against the drug

ECO:0000042

gain-of-function mutant phenotypic evidence

ECO:0000042 gain-of-function mutant phenotypic evidence

E.g. for a mutation, evidence that introducing this specific mutation into a wildtype background is associated with a change in susceptibility phenotype

ECO:0007000

high throughput mutant phenotypic evidence

ECO:0007000 high throughput mutant phenotypic evidence

E.g. evidence from a transposon mutant library that mutation or loss of a gene in a phenotypically resistant strain results in loss of resistance

ECO:0001103

natural variation mutant evidence

ECO:0001103 natural variation mutant evidence

E.g. for an acquired gene or mutation, evidence that natural variation in presence vs absence is associated with susceptibility to the drug (genotype-phenotype association in a natural population)

ECO:0005027

genetic transformation evidence

ECO:0005027 genetic transformation evidence

E.g. evidence that transfer of the gene into a susceptible recipient strain results in resistance

ECO:0000020

protein inhibition evidence

ECO:0000020 protein inhibition evidence

E.g. evidence that a mutation inhibits protein function to reduce the effect of the drug and confer resistance

ECO:0006404

experimentally evolved mutant phenotypic evidence

ECO:0006404 experimentally evolved mutant phenotypic evidence

E.g. evidence that the mutation arises in response to drug exposure during experimental evolution, resulting in resistant mutants

ECO:0000054

double mutant phenotype evidence

ECO:0000054 double mutant phenotype evidence

E.g. evidence resulting from an experiment typically constructed to determine if two different genes have an observable genetic interaction (functional connection) as the result of a mutation occurring in the alleles of the two genes of interest

ECO:0000154

heterologous protein expression evidence

ECO:0000154 heterologous protein expression evidence

E.g. a type of protein expression evidence where a gene from one cell is inserted into a cell that does not typically contain that gene and heterologous protein expression is assessed

ECO:0000006

experimental evidence

ECO:0000006 experimental evidence

Experimental evidence not otherwise classified

ECO:0001583

small interfering RNA knockdown evidence

ECO:0001583 small interfering RNA knockdown evidence

A type of anti-sense experiment evidence where gene expression is disrupted through the introduction of double-stranded RNA molecules, 20-25 base pairs in length, which operate within the RNA interference pathway


Evidence grade

This field indicates the expert curators’ overall assessment of the level of support provided by all evidence considered. It is modelled on the GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) approach to assessing the certainty of evidence to guide decision making in healthcare.

AMRrules aims to provide rules to interpret all markers that have been detected in a given species, but in many cases the evidence can be quite limited. The evidence grade field gives users an overall guide to the strength of evidence, and the evidence limitations field highlights what kind of evidence is lacking.

Note that if no experimental evidence is available, the rule should NOT be graded as ‘high’, even if there is strong evidence of statistical association between genotype and phenotype in natural populations. (Future updates to the rule specification will include additional fields to record quantitative details of genotype/phenotype associations.)

There are four possible ‘grades’ in AMRrules. These are listed below with guidance on what they mean in the context of AMRrules (modelled on the GRADE framework).

| Evidence grade | What it means | Use this when |
| --- | --- | --- |
| high | The curators are confident in the categorisation, and believe that the likelihood that the effect will be substantially different from this is low. | Experimental evidence provides strong support for the interpretation of this gene/variant in this species for this drug. If there is statistical geno/pheno evidence available, it supports this interpretation. |
| moderate | The curators believe that the categorisation most likely reflects the true effect, and the likelihood that the effect will be substantially different is moderate. | There is good evidence to support the interpretation of this gene/variant in this species for this drug, but there is some uncertainty (e.g. lack of direct evidence in this organism although evidence from related organisms is convincing; or there is good statistical geno/pheno evidence but no experimental evidence of mechanism). |
| low | The curators believe that the categorisation might not reflect the true effect, and the likelihood that the effect will be substantially different is high. | There is evidence supporting a link between this gene/variant and this drug, but the interpretation in this species is unclear (e.g. lack of evidence in this organism or related organisms; statistical geno/pheno evidence is lacking, or does not support a clear effect; or there are trustworthy but conflicting reports). |
| very low | The curators have no confidence that the categorisation reflects the true effect, and the likelihood that the effect will be substantially different is high. | There is no trustworthy evidence as to the effect in this organism, or there is conflicting evidence. The categorical interpretation is based on assumptions made from unrelated organisms and may be wrong. |


Evidence limitations

This field highlights what kind of evidence is lacking to support interpretation of this marker in this organism. All rules with an evidence grade other than ‘high’ should have at least one limitation recorded.

Evidence limitations

lacks evidence for this species

lacks evidence for this genus

lacks evidence for this allele

lacks evidence of the degree to which MIC is affected

low clinical relevance

unknown clinical relevance

statistical geno/pheno evidence but no experimental evidence

conflicting evidence

lacks formal breakpoints

lacks evidence for this drug
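The relationship between the evidence grade and evidence limitations fields can be checked mechanically. The following is a minimal sketch, assuming the limitations are supplied as a single string with entries separated by ‘, ‘ and that an empty string means no limitation was recorded. The function name `check_evidence_fields` is hypothetical and not part of the AMRrules software.

```python
ALLOWED_GRADES = {"high", "moderate", "low", "very low"}

ALLOWED_LIMITATIONS = {
    "lacks evidence for this species",
    "lacks evidence for this genus",
    "lacks evidence for this allele",
    "lacks evidence of the degree to which MIC is affected",
    "low clinical relevance",
    "unknown clinical relevance",
    "statistical geno/pheno evidence but no experimental evidence",
    "conflicting evidence",
    "lacks formal breakpoints",
    "lacks evidence for this drug",
}


def check_evidence_fields(grade: str, limitations: str) -> list[str]:
    """Return a list of problems found; an empty list means the fields are consistent."""
    problems = []
    if grade not in ALLOWED_GRADES:
        problems.append(f"unknown evidence grade: {grade!r}")
    # Split the comma-separated field and ignore empty entries.
    entries = [e.strip() for e in limitations.split(",") if e.strip()]
    for entry in entries:
        if entry not in ALLOWED_LIMITATIONS:
            problems.append(f"unknown evidence limitation: {entry!r}")
    if grade in ALLOWED_GRADES and grade != "high" and not entries:
        problems.append("grades other than 'high' need at least one limitation")
    return problems
```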


Breakpoint condition

EUCAST, CLSI and others sometimes assign different breakpoints for different clinical conditions, infection sites, or drug delivery routes (e.g. intravenous vs oral). In such cases, this field is used to indicate which specific breakpoint the rule was defined against. This will often be blank, indicating that the rule is not specific to any particular type of infection or delivery route.

The list of allowed terms is taken from the EUCAST and CLSI 2025 Breakpoints, sourced from the digitized versions in the AMR R package using this command: ` clinical_breakpoints %>% filter(guideline=="CLSI 2025" | guideline=="EUCAST 2025") %>% group_by(site) %>% count() `

Endocarditis

Endocarditis with combination treatment

Extraintestinal

Intravenous

Intravenous, Oral

Investigational agent

Liposomal, Inhaled

Mammary gland

Mastitis

Meningitis

Meningitis, Endocarditis

Metritis

Non-endocarditis

Non-meningitis

Non-meningitis, Non-endocarditis

Non-pneumonia

Oral

Oral, Infections originating from the urinary tract

Oral, Other indications

Oral, Uncomplicated urinary tract infection

Parenteral

Pneumonia

Prophylaxis

Respiratory

Respiratory, genital

Respiratory, soft tissue

Screen

Skin

Skin, respiratory

Skin, soft tissue

Skin, soft tissue, respiratory

Skin, soft tissue, respiratory, uncomplicated urinary tract infection

Skin, soft tissue, respiratory, uncomplicated urinary tract infection, genital

Skin, soft tissue, uncomplicated urinary tract infection

Skin, uncomplicated urinary tract infection

Uncomplicated urinary tract infection

Uncomplicated urinary tract infection, Investigational agent

Wounds, abscesses

Wounds, abscesses, uncomplicated urinary tract infection


Organism code

Each rule is assigned a ruleID, which starts with a 3-letter code to indicate the organism subgroup who curated the rule. The list of available organism subgroup codes is below.

| Organism | Prefix for ‘ruleID’ |
| --- | --- |
| Achromobacter xylosoxidans | AXY |
| Acinetobacter | ACI |
| Aeromonas | AER |
| Anaerobes | ANA |
| Bordetella | BOR |
| Brucella | BRU |
| Burkholderia cepacia complex | BCC |
| Burkholderia pseudomallei | BPM |
| Campylobacter jejuni | CAJ |
| Campylobacter fetus | CAF |
| Campylobacter coli | CAC |
| Chryseobacterium indologenes | CIN |
| Corynebacterium diphtheriae | CDP |
| E. coli/Shigella | ECO |
| Edwardsiella | EDW |
| Enterobacter cloacae complex | ECC |
| Enterococcus | ENT |
| Haemophilus influenzae | HIN |
| Helicobacter | HEL |
| Klebsiella pneumoniae | KPN |
| Legionella | LEG |
| Listeria | LIS |
| Mycobacterium non-Tb | MYC |
| Mycobacterium tuberculosis | MTB |
| Mycoplasma pneumoniae | MPN |
| Neisseria commensals | NEI |
| Neisseria gonorrhoeae | NGO |
| Neisseria meningitidis | NMN |
| Pasteurella | PAS |
| Proteus mirabilis | PRM |
| Pseudomonas aeruginosa | PSA |
| Salmonella | SAL |
| Serratia | SER |
| Shewanella | SHW |
| Staphylococcus aureus | STA |
| Staphylococcus epidermidis | STE |
| Staphylococcus saprophyticus | STS |
| Stenotrophomonas maltophilia | STM |
| Streptococcus | STR |
| Treponema | TRE |
| Vibrio | VIB |
| Yersinia | YER |
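
The ruleID format described above (a 3-letter organism subgroup code followed by a 4-digit number) can be checked mechanically. The following is a minimal sketch, not part of the AMRrules software: the function name `is_valid_rule_id` and the constant names are hypothetical, and the prefix set is simply copied from the organism code list above.

```python
import re

# Prefixes copied from the organism subgroup code list above.
ORGANISM_PREFIXES = {
    "AXY", "ACI", "AER", "ANA", "BOR", "BRU", "BCC", "BPM", "CAJ", "CAF",
    "CAC", "CIN", "CDP", "ECO", "EDW", "ECC", "ENT", "HIN", "HEL", "KPN",
    "LEG", "LIS", "MYC", "MTB", "MPN", "NEI", "NGO", "NMN", "PAS", "PRM",
    "PSA", "SAL", "SER", "SHW", "STA", "STE", "STS", "STM", "STR", "TRE",
    "VIB", "YER",
}

# Three uppercase letters followed by exactly four ASCII digits.
RULE_ID_PATTERN = re.compile(r"([A-Z]{3})([0-9]{4})")


def is_valid_rule_id(rule_id: str) -> bool:
    """Return True if rule_id is a listed prefix plus a 4-digit number."""
    match = RULE_ID_PATTERN.fullmatch(rule_id)
    return match is not None and match.group(1) in ORGANISM_PREFIXES
```

For example, `is_valid_rule_id("KPN0001")` is accepted, while `"KPN001"` (only three digits) and `"XYZ0001"` (prefix not in the list) are rejected. Uniqueness across the whole rule set is a separate check that this sketch does not attempt.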

Variant Specification#

The AMRrules specification needs to be able to encode interpretive rules for all types of genetic variants relevant to AMR in bacteria.

In 2024, the ESGEM-AMR working group collated and reviewed examples of known variants across diverse bacteria, and identified the following types of AMR variants:

  • Gene presence detected

  • Amino acid substitution or insertion

  • Nucleotide substitution or insertion

  • Gene truncated (loss of function)

  • Mutation in promoter region (substitution, deletion or insertion, including IS)

  • Gene copy number changes

  • Mutations in multi-copy genes (e.g. 23S rRNA)

  • Low frequency variants (i.e. heterozygosity)

It was concluded that all such variants could be adequately addressed using a combination of three fields: ‘gene’, ‘mutation’ and ‘variation type’.

Specific examples of each AMR variant are shown below, with proposed mutation syntax and variation types for each (note that other fields required for rule definition, like organism, refseq accession, context, PMID are not included here for simplicity, as they are not essential to illustrate how to define a specific kind of variation):

| ID | gene | mutation | variation type | drug | category |
| --- | --- | --- | --- | --- | --- |
| KPN0001 | blaSHV | - | Gene presence detected | ampicillin | wt R |
| KPN0002 | gyrA | p.Ser83Tyr | Protein variant detected | ciprofloxacin | nwt I |
| KPN0003 | parC | p.Ser80Ile | Protein variant detected | ciprofloxacin | nwt I |
| KPN0004 | ompK36 | c.25C>T | Nucleotide variant detected | meropenem | nwt S |
| KPN0005 | ompK36 | p.114_115insGlyAsp | Protein variant detected | meropenem | nwt I |
| KPN0006 | mgrB | p.(1_100) | Gene truncation detected | colistin | nwt R |
| ECO0001 | ampC | c.-11C>T | Promoter variant detected | ceftriaxone | nwt R |
| ECO0002 | ampC | c.-14_-13insGT | Promoter variant detected | ceftriaxone | nwt R |
| ACI0001 | blaOXA-58 | c.(-35_1)ins[ISAba125:inv] | Promoter variant detected | ceftriaxone | nwt R |
| NGO0002 | 23S rDNA | c.[2045A>G][3] | Nucleotide variant detected in multi-copy gene | azithromycin | nwt R |
| ECO0003 | blaTEM | c.[3] | Gene copy number variant detected | piperacillin+tazobactam | nwt R |
| MTC0001 | gyrA | p.[Ala94Gly][0.13] | Low frequency variant detected | ciprofloxacin | nwt R |

Syntax for mutations#

The syntax for the ‘mutation’ column follows HGVS, including:

  • Gene and protein start sites are position 1 (there is no position 0)

  • Ranges are specified using x_y; for insertions the coordinates are specified as inclusive_exclusive, otherwise ranges are inclusive_inclusive

  • Unknown ranges are specified with parentheses, (x_y). E.g. p.(1_100)insGlyAsp means an insertion of 2 amino acids (Gly and Asp) anywhere between codons 1 and 100 inclusive (as opposed to a replacement of amino acids 1 through 100 with GlyAsp, which would be expressed as p.1_100delinsGlyAsp).

  • Coordinates are specified relative to the reference sequence of a protein (p) or coding sequence (c)

  • Coordinates upstream of coding sequence are specified relative to the start site, with a hyphen, e.g. c.-35 indicates 35 bp upstream

  • Mutations in protein and DNA are specified differently, e.g.

    1. p.Ser83Tyr: change to protein sequence from Ser to Tyr at codon 83

    2. c.25C>T: change to nucleotide coding region from C to T at nucleotide position 25

  • Stop codons are specified (in both DNA and protein variants) as Ter

  • Following IUPAC, X signifies any amino acid, N signifies any DNA base

  • ^ (caret) is used as “or”, e.g. p.(Gly719Ala^Ser)

  • The letters inv indicate the inverse (i.e. reverse complement) of a sequence

  • Repeat sequences are specified as sequence[N] where N is the number of copies of the repeat
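
As an illustration of the simplest forms listed above, the sketch below parses single substitutions such as p.Ser83Tyr and c.25C>T, including upstream coordinates like c.-11C>T. It is a hedged example only: the names `Substitution` and `parse_substitution` are hypothetical, and ranges, insertions, uncertain positions and the ^ operator are deliberately out of scope.

```python
import re
from typing import NamedTuple, Optional


class Substitution(NamedTuple):
    level: str     # "p" for protein, "c" for coding sequence
    position: int  # negative values lie upstream of the start site
    ref: str       # reference residue or base
    alt: str       # variant residue or base


# Three-letter amino acid codes (including Ter); positions start at 1.
PROTEIN_SUB = re.compile(r"p\.([A-Z][a-z]{2})([1-9][0-9]*)([A-Z][a-z]{2})")
# Nucleotide positions may be negative (upstream) but never 0.
NUCLEOTIDE_SUB = re.compile(r"c\.(-?[1-9][0-9]*)([ACGTN])>([ACGTN])")


def parse_substitution(mutation: str) -> Optional[Substitution]:
    """Parse a single substitution, or return None for any other syntax."""
    match = PROTEIN_SUB.fullmatch(mutation)
    if match:
        return Substitution("p", int(match.group(2)), match.group(1), match.group(3))
    match = NUCLEOTIDE_SUB.fullmatch(mutation)
    if match:
        return Substitution("c", int(match.group(1)), match.group(2), match.group(3))
    return None
```

Because the position patterns cannot match 0, a string such as `c.0A>G` is rejected, in line with the rule that there is no position 0. The protein pattern checks only the shape of a three-letter code, not that it names a real amino acid.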

AMRrules-specific syntax#

  • AMRrules requires amino acids be specified as three-letter codes (whereas HGVS allows single-letter or three-letter codes)

    • Accordingly, the STOP codon should be specified as ‘Ter’ rather than ‘*’

  • In HGVS you must specify the reference sequence explicitly using a sequence accession, followed by : and then the mutation, e.g. NF000285.3:p.Gly238Ser. In AMRrules the gene is specified in separate column/s (‘gene’, ‘refseq accession’, ‘ARO accession’) and should not be repeated in the mutation column. So the above rule should be coded as:

    • gene = blaSHV

    • node = blaSHV

    • refseq accession = NF000285.3

    • ARO accession = ARO:3000015

    • mutation = p.Gly238Ser

  • In AMRrules, insertion sequences (IS) should be labelled with their IS name as per ISfinder, as many do not have their own sequence accessions in refseq. E.g. insertion of ISAba125 should be specified as ins[ISAba125], and insertion in reverse orientation to the gene to which the rule applies should be specified as ins[ISAba125:inv].

  • In AMRrules, rules intended to apply when a gene is present in a minimum of N copies can be specified using the [N] syntax to indicate the minimum repeat/copy number of the whole coding sequence, as c.[N].

    1. Note this syntax does not convey any information about the location of the copies, i.e. c.[2] simply indicates that there are at least 2 copies of the gene detected in the genome, whether they are tandem repeats or in different replicons such as one in the chromosome and one in a plasmid.

  • In HGVS, the presence of multiple alleles (i.e. heterozygous) is specified as a semicolon-separated list of allelic variants, e.g. [allele1];[allele2].

  • In AMRrules, rules that apply to variation in a multi-copy gene can be specified in this way, with each allele explicitly stated.

    1. Alternatively if the rule applies when a minimum of N copies of the gene carry the mutation (e.g. mutation in ≥3 copies of 23S rRNA resulting in resistance to azithromycin), this can be abbreviated using the [N] syntax to indicate the minimum repeat/copy number, as c.[allele][N] or p.[allele][N], e.g. c.[2045A>G][3].

  • In AMRrules, rules can also apply to ‘low frequency variants’, i.e. when a minimum fraction of reads, X, supports the presence of the allelic variant in a sequenced population. The minimum fraction is specified by extending the syntax for copy number, as [X]. E.g. p.[Ala94Gly][0.13] (example from the Mycobacterium tuberculosis gyrA gene).

    1. To put it another way, in AMRrules the repeat syntax [X] is interpreted as a minimum copy number if X is an integer, and as a minimum read fraction if X is a double/float between 0 and 1.
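
The integer-versus-float interpretation of a trailing [X] can be sketched as follows. This is an illustrative reading of the rule above, not the AMRrules implementation: the function name `parse_threshold` and the labels `"min_copies"` and `"min_read_fraction"` are hypothetical, and treating a fraction of exactly 1.0 as valid is an assumption.

```python
import re
from typing import Optional, Tuple, Union

# A final bracketed number, e.g. the "[3]" in c.[2045A>G][3].
TRAILING_THRESHOLD = re.compile(r"\[([0-9]+(?:\.[0-9]+)?)\]$")


def parse_threshold(mutation: str) -> Optional[Tuple[str, Union[int, float]]]:
    """Interpret a trailing [X] as a copy-number or read-fraction threshold."""
    match = TRAILING_THRESHOLD.search(mutation)
    if match is None:
        return None  # no numeric threshold, e.g. p.Ser83Tyr or ins[ISAba125:inv]
    text = match.group(1)
    if "." not in text:
        copies = int(text)
        # An integer is read as a minimum copy number.
        return ("min_copies", copies) if copies >= 1 else None
    fraction = float(text)
    # A float is read as a minimum read fraction (assumed range: 0 < X <= 1).
    return ("min_read_fraction", fraction) if 0 < fraction <= 1 else None
```

With this sketch, `c.[3]` and `c.[2045A>G][3]` both yield a minimum of 3 copies, `p.[Ala94Gly][0.13]` yields a minimum read fraction of 0.13, and `c.(-35_1)ins[ISAba125:inv]` yields no threshold because the bracket holds an IS name rather than a number.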

Explanation of ‘mutation’ syntax relevant to known AMR variants#

  • p.Ser83Tyr: change to protein sequence from Ser to Tyr at codon 83

  • c.25C>T: change to nucleotide coding region from C to T at nucleotide position 25

  • p.114_115insGlyAsp: change to protein sequence, with an insertion of amino acids Gly and Asp between codons 114 and 115

  • p.(1_100): truncation (of any kind) anywhere in the first 100 amino acids of the protein sequence

  • c.-11C>T: change to nucleotide sequence from C to T, 11 bases upstream of the start site for the gene.

  • c.-14_-13insGT: insertion of nucleotides GT between positions -14 and -13, upstream of the start site of the gene

  • c.(-35_1)ins[ISAba125:inv]: insertion of ISAba125, in reverse orientation (:inv), anywhere between 35 bases upstream of the start site, and the start of the gene coding sequence

  • c.[2045A>G][3]: substitution of A to G at position 2045 of the gene, which must occur in at least 3 copies of the gene

  • c.[3]: gene needs to be present with a minimum of 3 copies

  • p.[Ala94Gly][0.13]: protein variant is present in at least 13% of reads

Combinatorial rules#

Combinatorial rules are defined using logical expressions in the ‘gene’ column, where the objects of the expression are rule identifiers (ruleID) that can be used as shorthand labels for the variants defined by gene:mutation (variant type) specified in the corresponding rules. The variation type should be specified as ‘Combination’.

  • Each rule must have a unique ruleID, assigned by the curating subgroup and prefixed with a 3-letter code that identifies the subgroup.

  • E.g. in the table below, KPN0008 can be used in a logical expression in the ‘gene’ column to denote gyrA:p.Ser83Tyr, and KPN0013 can be used to denote qnr (Gene presence detected).

  • So, the combination of these two variants can be specified as KPN0008 & KPN0013, which expands to gyrA:p.Ser83Tyr & qnr (Gene presence detected).
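
To make the logical expressions concrete, here is a minimal sketch of evaluating an expression such as (KPN0008 | KPN0009) & KPN0013 against the set of ruleIDs matched in a genome. It is not the AMRrules parser: the names `tokenize` and `evaluate` are hypothetical, and the choice that & binds more tightly than | when parentheses are absent is an assumption, since the specification text here does not state a precedence.

```python
import re
from typing import List, Optional, Set

# A ruleID (3 letters + 4 digits), an operator, or a parenthesis.
TOKEN = re.compile(r"\s*([A-Z]{3}[0-9]{4}|[&|()])")


def tokenize(expression: str) -> List[str]:
    tokens: List[str] = []
    expression = expression.rstrip()
    pos = 0
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError(f"unexpected text: {expression[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expression: str, matched: Set[str]) -> bool:
    """Return True if the expression holds for the matched ruleIDs."""
    tokens = tokenize(expression)
    index = 0

    def peek() -> Optional[str]:
        return tokens[index] if index < len(tokens) else None

    def take() -> Optional[str]:
        nonlocal index
        token = peek()
        index += 1
        return token

    def parse_or() -> bool:
        value = parse_and()
        while peek() == "|":
            take()
            rhs = parse_and()  # parse first, so tokens are always consumed
            value = value or rhs
        return value

    def parse_and() -> bool:
        value = parse_atom()
        while peek() == "&":
            take()
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom() -> bool:
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        if token is None or token in ("&", "|", ")"):
            raise ValueError(f"unexpected token: {token!r}")
        return token in matched  # a ruleID is true if that rule matched

    result = parse_or()
    if peek() is not None:
        raise ValueError(f"unexpected token: {peek()!r}")
    return result
```

For a genome matching KPN0009 and KPN0013, `evaluate("(KPN0008 | KPN0009) & KPN0013", {"KPN0009", "KPN0013"})` holds, whereas a genome matching only KPN0008 and KPN0009 does not satisfy that expression.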

Rules must be specified explicitly if the effect of the combination is NOT the same as the ‘most resistant’ (in terms of exceeding breakpoints, R > I > S; or deviation from wildtype, nonwildtype > wildtype) predicted category of the component rules. E.g. in the table below:

  • The individual rules KPN0008 and KPN0009 each have expected category ‘nonwildtype I’ on their own, but in combination we expect ‘nonwildtype R’, so we need to specify the rule for the combination KPN0008 & KPN0009.

  • Genomes meeting rule KPN0002 (i.e. carrying core gene oqxA, => wildtype S) in addition to rule KPN0008 (i.e. with an acquired gyrA mutation, => nonwildtype I) have expected category nonwildtype I. This is the same as the most resistant component rule (KPN0008), not greater, so we do not need to specify the combination explicitly.

Note this means the combination must also be specified explicitly if the combined effect is LESS resistant than the ‘most resistant’ component. For example, in TB, a deletion in one gene renders the resistance mutation in another gene irrelevant, so the combination must be specified.

| ID | gene | mutation | variation type | drug | category |
| --- | --- | --- | --- | --- | --- |
| KPN0002 | oqxA | - | Gene presence detected | ciprofloxacin | wt S |
| KPN0008 | gyrA | p.Ser83Tyr | Protein variant detected | ciprofloxacin | nwt I |
| KPN0009 | parC | p.Ser80Ile | Protein variant detected | ciprofloxacin | nwt I |
| KPN0013 | qnr | - | Gene presence detected | ciprofloxacin | nwt I |
| KPN0051 | KPN0008 & KPN0009 | - | Combination | ciprofloxacin | nwt R |
| KPN0052 | (KPN0008 \| KPN0009) & KPN0013 | - | Combination | ciprofloxacin | nwt R |
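
The ‘most resistant’ default described above can be sketched as a small ranking function over the category labels used in the table (e.g. ‘wt S’, ‘nwt I’). This is an illustrative assumption about how the two orderings combine, not the AMRrules implementation: the function name `most_resistant` is hypothetical, and taking the maximum of the wildtype/nonwildtype part and the S/I/R part independently is one plausible reading.

```python
from typing import Iterable, List

PHENOTYPE_ORDER = {"wt": 0, "nwt": 1}      # wildtype < nonwildtype
CLINICAL_ORDER = {"S": 0, "I": 1, "R": 2}  # S < I < R


def most_resistant(categories: Iterable[str]) -> str:
    """Combine labels such as 'wt S' and 'nwt I' into the most resistant one."""
    phenotypes: List[str] = []
    clinicals: List[str] = []
    for category in categories:
        phenotype, clinical = category.split()
        phenotypes.append(phenotype)
        clinicals.append(clinical)
    if not phenotypes:
        raise ValueError("at least one category is required")
    # Assumption: rank the two parts of the label independently.
    top_phenotype = max(phenotypes, key=PHENOTYPE_ORDER.__getitem__)
    top_clinical = max(clinicals, key=CLINICAL_ORDER.__getitem__)
    return f"{top_phenotype} {top_clinical}"
```

Applied to KPN0002 and KPN0008, `most_resistant(["wt S", "nwt I"])` gives ‘nwt I’, so no explicit combination is needed. Applied to KPN0008 and KPN0009, the default is also ‘nwt I’, which differs from the expected ‘nwt R’; that difference is why KPN0051 has to be written out as an explicit combinatorial rule.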