AMRrules Specification#
Rule Specification#
This section details how interpretive rules should be encoded in the AMRrules format. The current version of the AMRrules Specification is v1.0, for use with the AMRrules software package v1.0. The syntax for specifying different types of variants to which a rule should be applied is given in the next section.
On this page you will find the full list of fields (indicating which external databases or ontologies apply to each field, along with a description and guidance on defining/interpreting each field), as well as bespoke AMRrules-specific controlled vocabulary for some fields.
AMRrules template (Google sheet)#
The v1.0 rule specification is also available in a Google sheet that includes the AMRrules template, with allowed values encoded in drop-down menus, to facilitate rule curation.
Full list of fields#
The full list of fields is below, with guidelines on how each field should be specified and interpreted.
Required fields |
status |
description |
reference standard |
reference link |
guidance |
rationale |
|---|---|---|---|---|---|---|
ruleID |
required |
unique identifier for this rule {values listed in ‘organism subgroup codes’} |
AMRrules |
‘organism subgroup codes’ tab |
Combination of 3-letter code (to indicate the organism subgroup who curated the rule, see tab ‘organism subgroup codes’) followed by 4-digit number (assigned by the subgroup). |
Each rule needs a unique identifier, so that combinatorial rules can be defined as combinations of component parts. These need to be unique across the entire AMRrules set, but assigned and managed within the subgroups who are defining the individual and combinatorial rules. |
txid |
required |
taxonomy ID of the species that this rule applies to |
NCBI Taxonomy |
There should be one row per species/marker combination, for clarity of interpretation and parsing the rules files, and for clarity of recording evidence for each rule and its relevance to a given species. The primary taxonomy identifier for AMRrules is the NCBI Taxonomy; this field should contain a valid taxid for a species or genus. Note that these identifiers are stable, even when the species or genus name changes. |
||
organism |
required |
organism that this rule applies to, normally a species {scientific name} |
NCBI Taxonomy |
Indicate the name of the organism the rule applies to. Include the prefix ‘s_’, ‘g_’ etc to indicate the taxonomic level (species, genus). E.g. ‘s_Klebsiella pneumoniae’ indicates species Klebsiella pneumoniae. ‘g_Klebsiella’ indicates genus Klebsiella. This should usually be the value of the ‘current name’ field associated with the taxid in the NCBI Taxonomy, however if there are issues with the current name, e.g. if it does not match the organism nomenclature used by EUCAST to define a breakpoint, you may use a different organism name. |
||
gene |
required |
name of the gene that this rule applies to {node ID, or gene symbol if node ID not available} OR a logical expression describing a combination of other ruleIDs {logical} |
refgene, NCBI Gene Hierarchy |
https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt |
If the gene is in the NCBI hierarchy, specify the node ID. If it is not in the NCBI hierarchy, indicate the gene or allele name in NCBI refgene. If it is not in NCBI refgene, use the gene symbol (e.g. ‘mexB’): if the gene is present in CARD, use the gene symbol present there; otherwise try to identify the most suitable gene symbol, and be sure to include refseq and ARO accessions for clarity. For combinatorial rules, this should be a logical expression based on other single-marker rules, which when evaluated as TRUE means this rule should be applied. E.g. “ECO001 & ECO002” means this rule should be applied when both rule ECO001 and rule ECO002 apply (i.e. when the markers defined by these rules are both detected). “(ECO001 | ECO003) & ECO002” means this rule should be applied when either one or both of rules ECO001 or ECO003 apply and ECO002 also applies. Syntax should use ‘&’ for logical AND and ‘|’ for logical OR. If the rule is intended to convey an unexplained mechanism of expected resistance, gene should be set to ‘unknown’, with context ‘core’, phenotype ‘wildtype’, clinical category ‘R’, and breakpoint standard ‘EUCAST Expected Resistant Phenotypes vX (year)’ (all gene identifier fields should be ‘-’, and the curation note should explain the reasoning). If the rule is intended to convey an expected resistance due to lack of the drug target, the same applies but the gene should be set to ‘none’. |
|
nodeID |
uniquely identify the gene using AT LEAST ONE NCBI accession: nodeID (preferred) or refseq protein or GenBank protein or HMM (for protein-coding genes); or nucleotide accession with coordinates (for nucleotide variants e.g. 23S or promoter regions) |
name of the gene that this rule applies to {node ID in NCBI gene hierarchy} |
NCBI Gene Hierarchy |
https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt |
Can be a leaf node or internal node in the NCBI Gene Hierarchy. Where a rule applies to multiple leaf nodes and/or all descendants of an internal node, it is recommended to specify one row per node, and provide evidence for each one (unless the number of leaf nodes is large and all have the same categorization and evidence). |
|
protein accession |
refseq protein accession for the gene this rule applies to |
refseq or GenBank protein sequence accession |
https://www.ncbi.nlm.nih.gov/refseq/ https://www.ncbi.nlm.nih.gov/genbank/ |
Indicate the refseq (preferred) or Genbank protein accession for the most appropriate protein sequence. Wherever possible this should match that used in the NCBI Pathogens refgene database. |
||
HMM accession |
HMM accession for the gene this rule applies to (suitable for internal nodes in the NCBI Gene Hierarchy) |
HMM accession |
https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/hmm/ |
Indicate the HMM accession for the most appropriate protein sequence; this is mainly relevant for internal nodes in the NCBI Gene Hierarchy. |
||
nucleotide accession |
nucleotide sequence accession and coordinates defining the gene this rule applies to (suitable for e.g. rRNA genes or promoter variants) |
nucleotide sequence accession |
https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/refseq/ |
For variants defined by nucleotide sequences rather than proteins (e.g. 23S, or promoter mutations), indicate the nucleotide sequence accession and coordinates of the relevant gene within that sequence, in the format: accession:start-stop (for genes encoded on forward strand) and accession:stop-start (for genes encoded on reverse strand). The refgene AMR database gives the relevant accessions and coordinates for AMR variants included in AMRFinderPlus. |
||
ARO accession |
optional |
Antibiotic Resistance Ontology (ARO) identifier for the gene this rule applies to |
ARO gene ID |
Optional. Note AROs are not associated with specific sequences, so are insufficient to define a rule. |
Useful for harmonization with CARD (for drug dictionary and other things) and for annotation of genotypes generated using other DBs/tools based on CARD (which can be mapped to ARO using argNorm tool) |
|
mutation |
required (set to ‘-’ if non-specific) |
specific mutation in this gene to which the rule applies |
HGVS (with some AMRrules modifications) |
https://hgvs-nomenclature.org/stable/ interpretAMR/AMRrulesCuration |
Indicate the mutation relative to the gene in ‘gene’. Typically this will be a protein mutation (in the format ‘p.Ser83Tyr’) or a nucleotide mutation in a coding sequence (in the format ‘c.25T’). For more complex examples see interpretAMR/AMRrulesCuration |
|
variation type |
required |
explanation of the type of variation this rule applies to {values listed in ‘variation type’ tab} |
AMRrules |
‘variation type’ tab |
Indicate the type of variation this rule applies to. Allowed values are in the ‘variation type’ tab. Most common examples are ‘Gene presence detected’, ‘Protein variant detected’, ‘Nucleotide variant detected’ or ‘Combination’. |
Based on the ‘variant type’ column in hAMRonization, helps to clarify the nature of the variation to which the rule applies. |
gene context |
required |
indicates the genomic context for this gene in this species {core, acquired, unknown} |
AMRrules |
Indicate the genomic context of this gene within this species, i.e. whether the gene is ‘core’ or ‘acquired’. Working definition of ‘core’ is: present (>90% identity, >90% length) in the chromosome of >95% of genomes of this species and >95% of those that have wildtype AST profiles. Note that a resistance-associated mutation in a core gene (e.g. Ser83Phe in chromosomal GyrA) should be coded as ‘core’. A mutation in an acquired gene should be coded as ‘acquired’. |
||
drug |
optional (need drug OR drug class) |
name of drug for which the rule applies {ARO term} |
ARO term |
Indicate the name of the drug that the rule applies to. Where rules apply to multiple drugs, they should be specified in separate rows (i.e. as separate rules), with individual references for each gene-drug combination. Alternatively, if the rule applies to all drugs in a defined drug class, leave this blank and indicate the ‘drug class’ field instead. Allowed values are all CARD ARO entries of type ‘antibiotic’ (which includes disinfectant agents) or ‘adjuvant’ (which includes inhibitors). |
||
drug class |
optional (need drug OR drug class) |
name of drug class for which the rule applies (ONLY if the rule is consistent across the entire drug class) {ARO term} |
ARO term |
Indicate the name of the drug class that the rule applies to. This field should be completed ONLY IF there is evidence that the gene has activity against all drugs in the class. Note that CARD defines five classes of cephalosporins (first-generation cephalosporin, second-generation cephalosporin, third-generation cephalosporin, fourth-generation cephalosporin, other cephalosporin), in addition to penam. |
Useful as there are likely to be a lot of determinants that apply across a whole drug class. |
|
phenotype |
required |
indicates whether members of this species with this gene are expected to fall in the wildtype or non-wildtype part of the reference MIC distribution; this is equivalent to identifying whether the MIC is expected to fall below or above the ECOFF, if one is defined {wildtype, nonwildtype} |
EUCAST distribution |
mic.eucast.org |
Indicates whether isolates of this species, with this gene, are considered to have a wildtype or nonwildtype susceptibility phenotype, equivalent to being below vs above the MIC ECOFF if one is defined. If the gene is a core gene, the expected phenotype should generally be ‘wildtype’, unless the rule refers to a specific variant of the core gene for which there is evidence of a nonwildtype phenotype. |
|
clinical category |
required |
expected clinical category for members of this species with this gene {S, I, R, NS} |
EUCAST |
https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes |
Indicates the categorization associated with this gene, for members of this species {S, I, R, NS} using the breakpoint standard indicated. If the drug this rule applies to appears on the EUCAST Expected Resistances list for this organism, and the gene is a core gene, the expected phenotype should be ‘wildtype’ and the category should be ‘R’. If the gene is identified as a core gene but the drug does not appear on the EUCAST Expected Resistances list for this organism, and there are no EUCAST Expert Rules recommending reporting as R, there should be strong evidence from literature and/or matched genome/phenotype data to support the assignment of ‘R’. Note that ‘NS’ is only an allowed value for CLSI, not EUCAST, and has a specific meaning that is only relevant when there is a breakpoint for S but not for I or R. |
|
breakpoint |
required |
indicate the breakpoint that was used to define the expected phenotype category (note this is ‘not applicable’ if rule is specified for a drug class rather than a single drug) |
EUCAST |
https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes |
Give the breakpoint used to define the indicated category for the specified drug (please enter ‘not applicable’ if the rule applies to a drug class). E.g. for categorization as ‘R’ based on MIC, breakpoint should be given in the form ‘MIC >X [units]’ or ‘disk zone <X mm’; for categorization as ‘S’, use ‘MIC <=X [units]’ or ‘disk zone > X mm’; for categorization as ‘I’ use ‘MIC range, >X and <= Y [units]’. For bug/drug combinations with wildtype ‘I’, the S breakpoint may be arbitrarily set to 0.001 (MIC) or 50 (disk); in this case it is inappropriate to define the breakpoint for ‘I’ as a range, so use e.g. ‘MIC <=X [units]’ rather than ‘MIC range, >0.001 and <=X [units]’. If the rule is defined on the basis of an ECOFF, indicate the threshold used in the same manner as for a breakpoint. If it is an Expected (intrinsic) resistance, the breakpoint is irrelevant (and usually undefined) so enter ‘not applicable’. If the rule applies to a drug class, enter ‘not applicable’, but consider whether it would be more informative to set specific rules for individual drugs. If there is no breakpoint or ECOFF, enter ‘not available’. |
As genotype interpretations are defined relative to clinical categorizations, and there are multiple sources for these and they are updated continuously, we need to record which standard was used to define each rule. This also facilitates accommodating multiple breakpoints for same bug-drug, using different standards or clinical indications (e.g. EUCAST sometimes has different breakpoints for IV vs oral, or for treatment of specific syndromes). This approach also facilitates using ECOFF in the absence of a breakpoint; facilitates specifying rules defined against other standards such as CLSI or veterinary standards |
breakpoint standard |
required |
indicate the AST phenotyping standard used to interpret this rule |
EUCAST |
https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes |
In the format ‘[Name] [version] ([year])’, e.g. ‘EUCAST v15.0 (2025)’ or ‘ECOFF (May 2025)’ (as ECOFFs at mic.eucast.org are not versioned, indicate month and year). If it is an Expected (intrinsic) resistance, there will not typically be a breakpoint, in this case indicate the version of the expert rules e.g. ‘EUCAST Expected Resistant Phenotypes v1.2 (2023)’ or ‘EUCAST Salmonella Expert Rules v3.2 (2019)’. If the rule is defined based on an informal breakpoint defined in a paper, indicate the PubMed identifier for the relevant paper in this field as: ‘PMID xxx’ |
|
breakpoint condition |
optional |
indicate the specific conditions for this breakpoint, if relevant (e.g. meningitis, uncomplicated UTI, iv, oral) |
EUCAST |
If different breakpoints are defined for different conditions, indicate the conditions relevant to the breakpoint used to define this rule. For example, different breakpoints may be given for different infection types (meningitis, uncomplicated UTI) or therapy types (iv, oral). If all breakpoints are the same, or all result in the same interpretation for this gene, it is preferable to specify a single rule without conditions. If multiple breakpoints are defined, and the interpretation is different using the different breakpoints, it is preferable to define separate interpretive rules for each condition. If the stated purpose of a condition-specific breakpoint is to screen for likely resistance mechanisms (e.g. ciprofloxacin for meningitis), or to enforce reporting of all isolates as ‘I’ for a specific condition, then a condition-specific rule is not needed as this is better managed in downstream reporting logic. Wherever possible, use the controlled vocabulary in the sheet (see dropdown menu and ‘breakpoint condition values’ tab), which includes all such terms used in the EUCAST or CLSI 2025 breakpoints table. |
||
PMID |
required |
PubMed identifier/s for literature supporting the rule (comma-separated list) |
PubMed |
Provide PubMed identifier for the ‘best’ peer-reviewed research article/s providing specific evidence that this gene is associated with this phenotype category for this drug in this species (separate multiple entries with ‘, ‘). Literature demonstrating evidence in other species, or related drugs, should not be included. |
||
evidence code |
required |
indicate the nature of the evidence that supports the rule {ECO code; select from controlled list, multiple selections allowed in comma-separated list} |
ECO |
Indicate the nature of the evidence supporting the rule. More than one can be listed; please include all forms of evidence available to support the rule (separate multiple entries with ‘, ‘). In principle any codes in the Evidence and Conclusion Ontology can be used, but in most cases it will be most appropriate to choose from the subset listed in the ‘evidence codes’ tab of this spreadsheet (also provided as a dropdown selection in the main data entry tab of this spreadsheet). The source for each type of evidence should be given in the ‘PMID’ field. |
If you want to use an ECO code not yet included in the dropdown list, please let ESGEM-AMR chairs know so that we can add it to the specification as others may find this helpful also. If you feel something is missing from ECO, please also let us know so that we can discuss, and potentially work together to request the addition of new terms to the ontology. |
|
evidence grade |
required |
expert curators’ overall assessment of the level of support provided by all evidence considered {high, moderate, low, very low} |
AMRrules |
Indicate the expert curators’ overall assessment of the level of support provided by all evidence considered. |
There will often be a need to specify a rule for which the evidence is not yet conclusive. It is important to flag these and give some indication of what is lacking. Allowed terms and their definitions are given in the ‘evidence grades’ tab. Note that if no experimental evidence is available, the rule should NOT be graded as ‘high’, even if there is good evidence of statistical association between genotype and phenotype in natural populations. (Future updates will include additional fields to record quantitative details of genotype/phenotype associations.) |
|
evidence limitations |
optional |
expert curators’ assessment of the key limitations of the available evidence {values listed in ‘evidence grades’ tab} |
AMRrules |
‘evidence grades’ tab |
This should be completed for all rules with evidence grades other than ‘high’. Use the values listed in the ‘evidence grades’ tab (separate multiple entries with ‘, ‘). |
|
rule curation note |
optional |
short explanatory note describing the mechanism and/or reasoning for the rule |
free text |
Highly recommended to complete for all core genes, or combinatorial rules, to explain why this results in susceptibility or resistance. |
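The combinatorial syntax described for the ‘gene’ field (ruleIDs joined with ‘&’, ‘|’ and parentheses) can be illustrated with a short sketch. The Python below is not part of the AMRrules software: the function names `tokenize` and `evaluate` are hypothetical, and it assumes that ‘&’ binds more tightly than ‘|’ when no parentheses are given. The specification does not state a precedence, so parenthesising mixed expressions is the safer choice when curating rules.

```python
import re

# A token is a ruleID (3 capital letters followed by digits), an operator, or a parenthesis.
TOKEN = re.compile(r"\s*([A-Z]{3}\d+|[&|()])")


def tokenize(expr):
    """Split a combinatorial 'gene' expression into ruleIDs, operators and parentheses."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        match = TOKEN.match(expr, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos}: {expr[pos:]!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expr, detected):
    """Return True if the expression holds, given the set of ruleIDs that apply."""
    tokens = tokenize(expr)
    pos = 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            rhs = parse_and()  # always parse the right side so every token is consumed
            value = value or rhs
        return value

    def parse_and():
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            rhs = parse_atom()
            value = value and rhs
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing ')'")
            pos += 1
            return value
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token in detected  # a ruleID is true when its marker was detected

    result = parse_or()
    if pos != len(tokens):
        raise ValueError(f"unexpected token {tokens[pos]!r}")
    return result
```

With this sketch, `evaluate("(ECO001 | ECO003) & ECO002", {"ECO002", "ECO003"})` returns `True`, while `evaluate("ECO001 & ECO002", {"ECO001"})` returns `False`.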
Controlled vocabularies#
Variation type#
Specifies the type of variation to which the rule applies. Based on the ‘variant type’ column in the hAMRonization AMR detection specification scheme, with additional terms from the NCIT ontology.
| Values allowed in variation type column | The specified AMRrule applies if… | Notes or source |
|---|---|---|
| Gene presence detected | …the gene specified in the ‘gene’ column is detected as being present. | hAMRonization |
| Protein variant detected | …the protein variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | hAMRonization |
| Nucleotide variant detected | …the nucleotide variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | hAMRonization |
| Promoter variant detected | …the promoter variant specified in the ‘mutation’ column is detected in the specified ‘gene’. | NCIT:C190205 |
| Inactivating mutation detected | …the gene specified in the ‘gene’ column is inactivated by any type of mechanism (e.g. frameshift, internal stop, deletion, truncation), in the amino acid range specified in the ‘mutation’ column (or anywhere in the gene, if the ‘mutation’ column is blank i.e. ‘-’). | NCIT:C178119 |
| Gene truncation detected | …the gene specified in the ‘gene’ column is truncated, within the amino acid range specified in the ‘mutation’ column. | |
| Gene copy number variant detected | …the gene specified in the ‘gene’ column is detected in at least the minimum number of copies specified in the ‘mutation’ column. | NCIT:C189957 |
| Nucleotide variant detected in multi-copy gene | …the gene specified in the ‘gene’ column is a gene that is normally present in multiple copies (e.g. rRNA genes), and the nucleotide variant specified in the ‘mutation’ column is detected in at least the minimum number of alleles specified in the ‘mutation’ column. | |
| Low frequency variant detected | …the read data support a mixed population, in which at least the minimum fraction of reads specified in the ‘mutation’ column supports the presence of the nucleotide variant specified in the ‘mutation’ column, in the gene specified in the ‘gene’ column (currently intended for TB only). | |
| Combination | …the logical expression in the ‘gene’ column, which expresses a combination of component rules identified by their ‘ruleID’, evaluates as true. | |
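As a rough illustration of how the variation type vocabulary could be checked during curation, the sketch below tests a row's ‘variation type’ against the allowed values. It is not taken from the AMRrules software: the function name `check_variation_type` is hypothetical, and flagging a ‘-’ mutation for protein, nucleotide and promoter variants is an assumption drawn from the wording of those definitions, not a stated requirement.

```python
# Allowed values, copied from the 'variation type' controlled vocabulary.
VARIATION_TYPES = {
    "Gene presence detected",
    "Protein variant detected",
    "Nucleotide variant detected",
    "Promoter variant detected",
    "Inactivating mutation detected",
    "Gene truncation detected",
    "Gene copy number variant detected",
    "Nucleotide variant detected in multi-copy gene",
    "Low frequency variant detected",
    "Combination",
}

# Types whose definition refers to a specific variant in the 'mutation' column.
# Treating '-' as a problem for these is an assumption of this sketch.
NEEDS_MUTATION = {
    "Protein variant detected",
    "Nucleotide variant detected",
    "Promoter variant detected",
}


def check_variation_type(variation_type, mutation):
    """Return a list of problems found for one row (empty if none)."""
    problems = []
    if variation_type not in VARIATION_TYPES:
        problems.append(f"unknown variation type: {variation_type!r}")
    elif variation_type in NEEDS_MUTATION and mutation.strip() in ("", "-"):
        problems.append(f"{variation_type!r} needs a specific value in 'mutation'")
    return problems
```

A row with ‘Inactivating mutation detected’ and mutation ‘-’ passes this check, since that definition explicitly allows the mutation column to be ‘-’.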
Evidence codes#
Specified using the Evidence and Conclusion Ontology (ECO), this field indicates the nature of the evidence supporting the rule. More than one can be listed, and the field should include all forms of evidence available to support the rule (multiple entries separated with ‘, ‘).
Any ECO codes can be used, but curators are encouraged to choose from the subset listed here, which covers the types of evidence typically available to support resistance mechanisms in bacteria. Note the literature source for each type of evidence noted here should be indicated in the PMID field.
ECO:0001091 |
knockout phenotypic evidence |
ECO:0001091 knockout phenotypic evidence |
E.g. evidence that knocking out the proposed AMR gene in a phenotypically resistant strain results in loss of resistance |
ECO:0000012 |
functional complementation evidence |
ECO:0000012 functional complementation evidence |
E.g. evidence that, when a gene knockout results in change from R to S, the phenotype is reversed (resistance is restored) when the gene is reintroduced |
ECO:0001113 |
point mutation phenotypic evidence |
ECO:0001113 point mutation phenotypic evidence |
E.g. for a mutation, evidence that this specific mutation is associated with a change in susceptibility phenotype |
ECO:0000024 |
protein-binding evidence |
ECO:0000024 protein-binding evidence |
E.g. evidence that the gene product binds to this drug |
ECO:0001034 |
crystallography evidence |
ECO:0001034 crystallography evidence |
E.g. structural evidence from crystallography that the mutated position in this gene product interacts with the drug |
ECO:0000005 |
enzymatic activity assay evidence |
ECO:0000005 enzymatic activity assay evidence |
E.g. evidence that the gene product has enzymatic activity against the drug |
ECO:0000042 |
gain-of-function mutant phenotypic evidence |
ECO:0000042 gain-of-function mutant phenotypic evidence |
E.g. for a mutation, evidence that introducing this specific mutation into a wildtype background is associated with a change in susceptibility phenotype |
ECO:0007000 |
high throughput mutant phenotypic evidence |
ECO:0007000 high throughput mutant phenotypic evidence |
E.g. evidence from a transposon mutant library that mutation or loss of a gene in a phenotypically resistant strain results in loss of resistance |
ECO:0001103 |
natural variation mutant evidence |
ECO:0001103 natural variation mutant evidence |
E.g. for an acquired gene or mutation, evidence that natural variation in presence vs absence is associated with susceptibility to the drug (genotype-phenotype association in a natural population) |
ECO:0005027 |
genetic transformation evidence |
ECO:0005027 genetic transformation evidence |
E.g. evidence that transfer of the gene into a susceptible recipient strain results in resistance |
ECO:0000020 |
protein inhibition evidence |
ECO:0000020 protein inhibition evidence |
E.g. evidence that a mutation inhibits protein function, reducing interaction with the drug or the effect of the drug, and so confers resistance |
ECO:0006404 |
experimentally evolved mutant phenotypic evidence |
ECO:0006404 experimentally evolved mutant phenotypic evidence |
E.g. evidence that the mutation arises in response to drug exposure during experimental evolution, resulting in resistant mutants |
ECO:0000054 |
double mutant phenotype evidence |
ECO:0000054 double mutant phenotype evidence |
E.g. evidence resulting from an experiment typically constructed to determine if two different genes have an observable genetic interaction (functional connection) as the result of a mutation occurring in the alleles of the two genes of interest |
ECO:0000154 |
heterologous protein expression evidence |
ECO:0000154 heterologous protein expression evidence |
E.g. a type of protein expression evidence where a gene from one cell is inserted into a cell that does not typically contain that gene and heterologous protein expression is assessed |
ECO:0000006 |
experimental evidence |
ECO:0000006 experimental evidence |
Experimental evidence not otherwise classified |
ECO:0001583 |
small interfering RNA knockdown evidence |
ECO:0001583 small interfering RNA knockdown evidence |
|
Evidence grade#
This field indicates the expert curators’ overall assessment of the level of support provided by all evidence considered. It is modelled on the GRADE (Grading of Recommendations, Assessment, Development, and Evaluation) approach to assessing the certainty of evidence to guide decision making in healthcare.
AMRrules aims to provide rules to interpret all markers that have been detected in a given species, but in many cases the evidence can be quite limited. The evidence grade field gives users an overall guide to the strength of evidence, and the evidence limitations field highlights what kind of evidence is lacking.
Note that if no experimental evidence is available, the rule should NOT be graded as ‘high’, even if there is strong evidence of statistical association between genotype and phenotype in natural populations. (Future updates to the rule specification will include additional fields to record quantitative details of genotype/phenotype associations.)
There are four possible ‘grades’ in AMRrules; these are listed below with guidance on what they mean in the context of AMRrules (modelled on the GRADE framework).
| Evidence grade | What it means | Use this when |
|---|---|---|
| high | The curators are confident in the categorisation, and believe that the likelihood that the effect will be substantially different from this is low. | Experimental evidence provides strong support for the interpretation of this gene/variant in this species for this drug. If there is statistical geno/pheno evidence available, it supports this interpretation. |
| moderate | The curators believe that the categorisation most likely reflects the true effect, and the likelihood that the effect will be substantially different is moderate. | There is good evidence to support the interpretation of this gene/variant in this species for this drug, but there is some uncertainty (e.g. lack of direct evidence in this organism although evidence from related organisms is convincing; or there is good statistical geno/pheno evidence but no experimental evidence of mechanism). |
| low | The curators believe that the categorisation might not reflect the true effect, and the likelihood that the effect will be substantially different is high. | There is evidence supporting a link between this gene/variant and this drug, but the interpretation in this species is unclear (e.g. lack of evidence in this organism or related organisms; statistical geno/pheno evidence is lacking, or does not support a clear effect; or there are trustworthy but conflicting reports). |
| very low | The curators have no confidence that the categorisation reflects the true effect, and the likelihood that the effect will be substantially different is high. | There is no trustworthy evidence as to the effect in this organism, or there is conflicting evidence. The categorical interpretation is based on assumptions made from unrelated organisms and may be wrong. |
Evidence limitations#
This field highlights what kind of evidence is lacking to support interpretation of this marker in this organism. All rules with an evidence grade other than ‘high’ should have at least one limitation recorded.
| Evidence limitations |
|---|
| lacks evidence for this species |
| lacks evidence for this genus |
| lacks evidence for this allele |
| lacks evidence of the degree to which MIC is affected |
| low clinical relevance |
| unknown clinical relevance |
| statistical geno/pheno evidence but no experimental evidence |
| conflicting evidence |
| lacks formal breakpoints |
| lacks evidence for this drug |
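The relationship between the evidence grade and evidence limitations fields (any grade other than ‘high’ should carry at least one limitation) lends itself to a simple consistency check. The following is a minimal sketch only: the function name `check_evidence` is hypothetical, and treating an empty string or ‘-’ as "no limitation recorded" is an assumption about how blank cells are encoded.

```python
EVIDENCE_GRADES = {"high", "moderate", "low", "very low"}

EVIDENCE_LIMITATIONS = {
    "lacks evidence for this species",
    "lacks evidence for this genus",
    "lacks evidence for this allele",
    "lacks evidence of the degree to which MIC is affected",
    "low clinical relevance",
    "unknown clinical relevance",
    "statistical geno/pheno evidence but no experimental evidence",
    "conflicting evidence",
    "lacks formal breakpoints",
    "lacks evidence for this drug",
}


def check_evidence(grade, limitations):
    """Return a list of problems for one rule's grade and limitations cells."""
    problems = []
    if grade not in EVIDENCE_GRADES:
        problems.append(f"unknown evidence grade: {grade!r}")

    # Multiple limitations are comma-separated; blank cells and '-' count as empty
    # (the '-' convention is an assumption of this sketch).
    entries = [e.strip() for e in limitations.split(",")]
    entries = [e for e in entries if e not in ("", "-")]
    for entry in entries:
        if entry not in EVIDENCE_LIMITATIONS:
            problems.append(f"unknown evidence limitation: {entry!r}")

    if grade in EVIDENCE_GRADES and grade != "high" and not entries:
        problems.append(f"grade {grade!r} needs at least one evidence limitation")
    return problems
```

Splitting on commas is safe here because none of the allowed limitation terms contains a comma.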
Breakpoint condition#
EUCAST, CLSI and others sometimes assign different breakpoints for different clinical conditions, infection sites, or drug delivery routes (e.g. intravenous vs oral). In such cases, this field is used to indicate which specific breakpoint the rule was defined against. This will often be blank, indicating that the rule is not specific to any particular type of infection or delivery route.
The list of allowed terms is taken from the EUCAST and CLSI 2025 Breakpoints, sourced from the digitized versions in the AMR R package using this command:
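```r
library(AMR); library(dplyr)  # clinical_breakpoints comes from AMR; %>% and the verbs from dplyr
clinical_breakpoints %>% filter(guideline=="CLSI 2025" | guideline=="EUCAST 2025") %>% group_by(site) %>% count()
```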
Endocarditis |
Endocarditis with combination treatment |
Extraintestinal |
Intravenous |
Intravenous, Oral |
Investigational agent |
Liposomal, Inhaled |
Mammary gland |
Mastitis |
Meningitis |
Meningitis, Endocarditis |
Metritis |
Non-endocarditis |
Non-meningitis |
Non-meningitis, Non-endocarditis |
Non-pneumonia |
Oral |
Oral, Infections originating from the urinary tract |
Oral, Other indications |
Oral, Uncomplicated urinary tract infection |
Parenteral |
Pneumonia |
Prophylaxis |
Respiratory |
Respiratory, genital |
Respiratory, soft tissue |
Screen |
Skin |
Skin, respiratory |
Skin, soft tissue |
Skin, soft tissue, respiratory |
Skin, soft tissue, respiratory, uncomplicated urinary tract infection |
Skin, soft tissue, respiratory, uncomplicated urinary tract infection, genital |
Skin, soft tissue, uncomplicated urinary tract infection |
Skin, uncomplicated urinary tract infection |
Uncomplicated urinary tract infection |
Uncomplicated urinary tract infection, Investigational agent |
Wounds, abscesses |
Wounds, abscesses, uncomplicated urinary tract infection |
Organism code#
Each rule is assigned a ruleID, which starts with a 3-letter code to indicate the organism subgroup who curated the rule. The list of available organism subgroup codes is below.
| Organism | Prefix for ‘ruleID’ |
|---|---|
| Achromobacter xylosoxidans | AXY |
| Acinetobacter | ACI |
| Aeromonas | AER |
| Anaerobes | ANA |
| Bordetella | BOR |
| Brucella | BRU |
| Burkholderia cepacia complex | BCC |
| Burkholderia pseudomallei | BPM |
| Campylobacter jejuni | CAJ |
| Campylobacter fetus | CAF |
| Campylobacter coli | CAC |
| Chryseobacterium indologenes | CIN |
| Corynebacterium diphtheriae | CDP |
| | ECO |
| Edwardsiella | EDW |
| Enterobacter cloacae complex | ECC |
| Enterococcus | ENT |
| Haemophilus influenzae | HIN |
| Helicobacter | HEL |
| Klebsiella pneumoniae | KPN |
| Legionella | LEG |
| Listeria | LIS |
| Mycobacterium non-Tb | MYC |
| Mycobacterium tuberculosis | MTB |
| Mycoplasma pneumoniae | MPN |
| Neisseria commensals | NEI |
| Neisseria gonorrhoeae | NGO |
| Neisseria meningitidis | NMN |
| Pasteurella | PAS |
| Proteus mirabilis | PRM |
| Pseudomonas aeruginosa | PSA |
| Salmonella | SAL |
| Serratia | SER |
| Shewanella | SHW |
| Staphylococcus aureus | STA |
| Staphylococcus epidermidis | STE |
| Staphylococcus saprophyticus | STS |
| Stenotrophomonas maltophilia | STM |
| Streptococcus | STR |
| Treponema | TRE |
| Vibrio | VIB |
| Yersinia | YER |
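As a rough illustration of the ruleID format described above (a 3-letter subgroup code followed by a 4-digit number), the sketch below checks a ruleID and splits it into its two parts. The function name `parse_rule_id` and the constant `KNOWN_PREFIXES` are hypothetical and not part of the AMRrules software; only a few prefixes from the table are included for brevity.

```python
import re

# Illustrative subset of the prefixes listed in the table above (assumption:
# a real check would load the full list of organism subgroup codes).
KNOWN_PREFIXES = {"ACI", "ECO", "KPN", "NGO", "SAL", "STA"}

# Three upper-case letters followed by exactly four digits.
RULE_ID_PATTERN = re.compile(r"([A-Z]{3})(\d{4})")


def parse_rule_id(rule_id):
    """Split a ruleID into (prefix, number); raise ValueError if malformed."""
    match = RULE_ID_PATTERN.fullmatch(rule_id)
    if match is None:
        raise ValueError(f"ruleID {rule_id!r} is not 3 letters followed by 4 digits")
    prefix, number = match.group(1), int(match.group(2))
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(f"unknown organism subgroup code {prefix!r}")
    return prefix, number
```

For example, `parse_rule_id("KPN0008")` yields the prefix `KPN` and the number 8, while a value such as `KPN8` is rejected.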
Variant Specification#
The AMRrules specification needs to be able to encode interpretive rules for all types of genetic variants relevant to AMR in bacteria.
In 2024, the ESGEM-AMR working group collated and reviewed examples of known variants across diverse bacteria, and identified the following types of AMR variants:
Gene presence detected
Amino acid substitution or insertion
Nucleotide substitution or insertion
Gene truncated (loss of function)
Mutation in promoter region (substitution, deletion or insertion, including IS)
Gene copy number changes
Mutations in multi-copy genes (e.g. 23S rRNA)
Low frequency variants (i.e. heterozygosity)
It was concluded that all such variants could be adequately addressed using a combination of three fields:
gene
mutation (based on HGVS syntax, with some modifications)
variation type (based on hAMRonization field Genetic Variation Type, with some additions)
Specific examples of each AMR variant are shown below, with proposed mutation syntax and variation types for each (note that other fields required for rule definition, like organism, refseq accession, context, PMID are not included here for simplicity, as they are not essential to illustrate how to define a specific kind of variation):
| ID | gene | mutation | variation type | drug | category |
|---|---|---|---|---|---|
| KPN0001 | blaSHV | | Gene presence detected | ampicillin | wt R |
| KPN0002 | gyrA | p.Ser83Tyr | Protein variant detected | ciprofloxacin | nwt I |
| KPN0003 | parC | p.Ser80Ile | Protein variant detected | ciprofloxacin | nwt I |
| KPN0004 | ompK36 | c.25C>T | Nucleotide variant detected | meropenem | nwt S |
| KPN0005 | ompK36 | p.114_115insGlyAsp | Protein variant detected | meropenem | nwt I |
| KPN0006 | mgrB | p.(1_100) | Gene truncation detected | colistin | nwt R |
| ECO0001 | ampC | c.-11C>T | Promoter variant detected | ceftriaxone | nwt R |
| ECO0002 | ampC | c.-14_-13insGT | Promoter variant detected | ceftriaxone | nwt R |
| ACI0001 | blaOXA-58 | c.(-35_1)ins[ISAba125:inv] | Promoter variant detected | ceftriaxone | nwt R |
| NGO0002 | 23S rDNA | c.[2045A>G][3] | Nucleotide variant detected in multi-copy gene | azithromycin | nwt R |
| ECO0003 | blaTEM | c.[3] | Gene copy number variant detected | piperacillin+tazobactam | nwt R |
| MTC0001 | gyrA | p.[Ala94Gly][0.13] | Low frequency variant detected | ciprofloxacin | nwt R |
Syntax for mutations#
Syntax for ‘mutation’ column follows HGVS, including:
Gene and protein start sites are position 1 (there is no position 0)
Ranges are specified using `x_y`; for insertions the coordinates are specified as inclusive_exclusive, otherwise ranges are inclusive_inclusive
Unknown ranges are specified with parentheses, `(x_y)`. E.g. `p.(1_100)insGlyAsp` means an insertion of 2 amino acids (Gly and Asp) anywhere between codons 1 and 100 inclusive (as opposed to a replacement of amino acids 1 through 100 with GlyAsp, which would be expressed as `p.1_100delinsGlyAsp`).
Coordinates are specified relative to the reference sequence of a protein (p) or coding sequence (c)
Coordinates upstream of coding sequence are specified relative to the start site, with a hyphen, e.g. `c.-35` indicates 35 bp upstream
Mutations in protein and DNA are specified differently, e.g.
`p.Ser83Tyr`: change to protein sequence from Ser to Tyr at codon 83
`c.25C>T`: change to nucleotide coding region from C to T at nucleotide position 25
Stop codons are specified (in both DNA and protein variants) as `Ter`
Following IUPAC, `X` signifies any amino acid, `N` signifies any DNA base
`^` (caret) is used as “or”, e.g. `p.(Gly719Ala^Ser)`
The letters `inv` indicate the inverse (i.e. reverse complement) of a sequence
Repeat sequences are specified as `sequence[N]` where `N` is the number of copies of the repeat
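The substitution forms listed above can be pulled apart with two regular expressions. The following is a minimal sketch only: it handles simple protein and nucleotide substitutions (including upstream positions and `Ter`), not ranges, insertions, or the caret syntax. The name `parse_substitution` and the returned dictionary layout are assumptions made for illustration, not the AMRrules implementation.

```python
import re

# Protein substitution: three-letter reference residue, codon number,
# three-letter alternative residue (this also covers 'Ter' for stop codons).
PROTEIN_SUB = re.compile(r"p\.([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})")

# Nucleotide substitution: position (a leading hyphen marks a position
# upstream of the start site), reference base, '>', alternative base.
NUCLEOTIDE_SUB = re.compile(r"c\.(-?\d+)([ACGTN])>([ACGTN])")


def parse_substitution(mutation):
    """Parse a simple p. or c. substitution into a dict; raise ValueError otherwise."""
    for level, pattern in (("protein", PROTEIN_SUB), ("nucleotide", NUCLEOTIDE_SUB)):
        match = pattern.fullmatch(mutation)
        if match is None:
            continue
        if level == "protein":
            ref, position, alt = match.group(1), int(match.group(2)), match.group(3)
        else:
            position, ref, alt = int(match.group(1)), match.group(2), match.group(3)
        if position == 0:
            # Start sites are position 1; there is no position 0.
            raise ValueError(f"{mutation!r} uses position 0, which does not exist")
        return {"level": level, "position": position, "ref": ref, "alt": alt}
    raise ValueError(f"{mutation!r} is not a simple substitution")
```

With this sketch, `p.Ser83Tyr` is read as a protein-level change at codon 83, and `c.-11C>T` as a nucleotide change 11 bases upstream of the start site.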
AMRrules-specific syntax#
AMRrules requires amino acids be specified as three-letter codes (whereas HGVS allows single-letter or three-letter codes)
Accordingly, the STOP codon should be specified as ‘Ter’ rather than ‘*’
In HGVS you must specify the reference sequence explicitly using a sequence accession, followed by `:` and then the mutation, e.g. `NF000285.3:p.Gly238Ser`. In AMRrules the gene is specified in separate column/s (‘gene’, ‘refseq accession’, ‘ARO accession’) and should not be repeated in the mutation column. So the above rule should be coded as:
gene = `blaSHV`
node = `blaSHV`
refseq accession = `NF000285.3`
ARO accession = `ARO:3000015`
mutation = `p.Gly238Ser`
In AMRrules, insertion sequences (IS) should be labelled with their IS name as per ISfinder, as many do not have their own sequence accessions in refseq. E.g. insertion of ISAba125 should be specified as `ins[ISAba125]`, and insertion in reverse orientation to the gene to which the rule applies should be specified as `ins[ISAba125:inv]`.
In AMRrules, rules intended to apply when a gene is present in a minimum of N copies can be specified using the `[N]` syntax to indicate the minimum repeat/copy number of the whole coding sequence, as `c.[N]`.
Note this syntax does not convey any information about the location of the copies, i.e. `c.[2]` simply indicates that there are at least 2 copies of the gene detected in the genome, whether they are tandem repeats or in different replicons such as one in the chromosome and one in a plasmid.
In HGVS, the presence of multiple alleles (i.e. heterozygous) is specified as a semicolon-separated list of allelic variants, e.g. `[allele1];[allele2]`.
In AMRrules, rules that apply to variation in a multi-copy gene can be specified in this way, with each allele explicitly stated.
Alternatively, if the rule applies when a minimum of N copies of the gene carry the mutation (e.g. mutation in ≥3 copies of 23S rRNA resulting in resistance to azithromycin), this can be abbreviated using the `[N]` syntax to indicate the minimum repeat/copy number, as `c.[allele][N]` or `p.[allele][N]`, e.g. `c.[2045A>G][3]`.
In AMRrules, for rules that apply to ‘low frequency variants’, i.e. when a minimum fraction of reads, X, supports presence of the allelic variant in a sequenced population, the minimum fraction can be specified by extension of the syntax for copy number, as `[X]`. E.g. `p.[Ala94Gly][0.13]` (example from the Mycobacterium tuberculosis gyrA gene).
To put it another way, in AMRrules the repeat syntax `[X]` is interpreted as a minimum copy number if `X` is an integer, and as a minimum read fraction if `X` is a double/float between 0 and 1.
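The integer-versus-fraction interpretation of the bracketed threshold can be sketched as follows. This is an illustrative reading of the rule above, not the AMRrules parser: the function name `parse_threshold`, the returned keys, and the decision to treat the bounds 0 and 1 as excluded are all assumptions.

```python
import re

# Optional bracketed allele, then a bracketed threshold,
# e.g. c.[2045A>G][3], c.[3] or p.[Ala94Gly][0.13].
THRESHOLD = re.compile(r"([cp])\.(?:\[([^\]]+)\])?\[(\d+(?:\.\d+)?)\]")


def parse_threshold(mutation):
    """Return the allele (if any) and the minimum copy number or read fraction."""
    match = THRESHOLD.fullmatch(mutation)
    if match is None:
        raise ValueError(f"{mutation!r} does not use the [X] threshold syntax")
    level, allele, raw = match.groups()
    if "." in raw:
        # A decimal value is read as a minimum fraction of supporting reads.
        value = float(raw)
        if not 0 < value < 1:  # assumption: the bounds themselves are excluded
            raise ValueError(f"read fraction {raw} is not between 0 and 1")
        kind = "min_read_fraction"
    else:
        # An integer value is read as a minimum copy number.
        value = int(raw)
        if value < 1:
            raise ValueError("copy number must be at least 1")
        kind = "min_copy_number"
    return {"level": level, "allele": allele, "kind": kind, "value": value}
```

Here `c.[3]` has no allele and a minimum copy number of 3, whereas `p.[Ala94Gly][0.13]` carries the allele `Ala94Gly` with a minimum read fraction of 0.13.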
Explanation of ‘mutation’ syntax relevant to known AMR variants#
`p.Ser83Tyr`: change to protein sequence from Ser to Tyr at codon 83
`c.25C>T`: change to nucleotide coding region from C to T at nucleotide position 25
`p.114_115insGlyAsp`: change to protein sequence, with an insertion of amino acids Gly and Asp between codons 114 and 115
`p.(1_100)`: truncation (of any kind) anywhere in the first 100 amino acids of the protein sequence
`c.-11C>T`: change to nucleotide sequence from C to T, 11 bases upstream of the start site for the gene
`c.-14_-13insGT`: insertion of nucleotides GT between positions -14 and -13, upstream of the start site of the gene
`c.(-35_1)ins[ISAba125:inv]`: insertion of ISAba125, in reverse orientation (`:inv`), anywhere between 35 bases upstream of the start site and the start of the gene coding sequence
`c.[2045A>G][3]`: substitution of A to G at position 2045 of the gene. This mutation must occur in a minimum of 3 copies
`c.[3]`: gene needs to be present with a minimum of 3 copies
`p.[Ala94Gly][0.13]`: protein variant is present in at least 13% of reads
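The insertion examples above share one shape: a coordinate range (in parentheses when the exact position is unknown), the keyword `ins`, and either a literal sequence or a bracketed IS name. The sketch below parses that shape under those assumptions; `parse_insertion` and its output keys are hypothetical names chosen for illustration.

```python
import re

# Level, optional '(' for an unknown position, start_end range, optional ')',
# the keyword 'ins', then whatever was inserted.
INSERTION = re.compile(r"([cp])\.(\(?)(-?\d+)_(-?\d+)(\)?)ins(.+)")

# Bracketed insertion sequence name, optionally flagged as inverted.
IS_ELEMENT = re.compile(r"\[([A-Za-z0-9]+)(:inv)?\]")


def parse_insertion(mutation):
    """Parse an insertion such as p.114_115insGlyAsp or c.(-35_1)ins[ISAba125:inv]."""
    match = INSERTION.fullmatch(mutation)
    # The opening and closing parentheses must be both present or both absent.
    if match is None or bool(match.group(2)) != bool(match.group(5)):
        raise ValueError(f"{mutation!r} is not a recognised insertion")
    payload = match.group(6)
    is_match = IS_ELEMENT.fullmatch(payload)
    return {
        "level": match.group(1),
        "start": int(match.group(3)),
        "end": int(match.group(4)),
        # Parentheses mean 'anywhere within this range'.
        "position_known": not match.group(2),
        "inserted": is_match.group(1) if is_match else payload,
        "is_element": is_match is not None,
        "inverted": bool(is_match and is_match.group(2)),
    }
```

Applied to `c.(-35_1)ins[ISAba125:inv]`, this reports an inverted ISAba125 insertion at an unknown position between -35 and 1.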
Combinatorial rules#
Combinatorial rules are defined using logical expressions in the ‘gene’ column, where the objects of the expression are rule identifiers (ruleID) that can be used as shorthand labels for the variants defined by gene:mutation (variant type) specified in the corresponding rules. The variation type should be specified as ‘Combination’.
Each rule must have a unique `ruleID`, assigned by the curating subgroup and prefixed with a 3-letter code that identifies the subgroup.
E.g. in the table below, `KPN0008` can be used in a logical expression in the ‘gene’ column to denote `gyrA:p.Ser83Tyr`, and `KPN0013` can be used to denote `qnr (Gene presence detected)`.
So, the combination of these two variants can be specified as `KPN0008 & KPN0013`, which expands to `gyrA:p.Ser83Tyr & qnr (Gene presence detected)`.
Rules must be specified explicitly if the effect of the combination is NOT the same as the ‘most resistant’ (in terms of exceeding breakpoints, R > I > S; or deviation from wildtype, nonwildtype > wildtype) predicted category of the component rules. E.g. in the table below:
The individual rules `KPN0008` and `KPN0009` each have expected category ‘nonwildtype I’ on their own, but in combination we expect ‘nonwildtype R’, so we need to specify the rule for the combination `KPN0008 & KPN0009`.
The expected category for genomes meeting rule `KPN0002` (i.e. carrying core gene oqxA, => wildtype S) in addition to rule `KPN0008` (i.e. with an acquired gyrA mutation, => nonwildtype I) is nonwildtype I. This is the same as, not greater than, the category of one of the component rules (`KPN0008`), so we do not need to specify the combination explicitly.
Note this means the combination must also be specified explicitly if the combined effect is LESS resistant than the ‘most resistant’ component. E.g. in an example from TB, a deletion in one gene renders the resistance mutation in another gene irrelevant, so the combination must be specified.
| ID | gene | mutation | variation type | drug | category |
|---|---|---|---|---|---|
| KPN0002 | oqxA | | Gene presence detected | ciprofloxacin | wt S |
| KPN0008 | gyrA | p.Ser83Tyr | Protein variant detected | ciprofloxacin | nwt I |
| KPN0009 | parC | p.Ser80Ile | Protein variant detected | ciprofloxacin | nwt I |
| KPN0013 | qnr | | Gene presence detected | ciprofloxacin | nwt I |
| KPN0051 | KPN0008 & KPN0009 | | Combination | ciprofloxacin | nwt R |
| KPN0052 | (KPN0008 \| KPN0009) & KPN0013 | | Combination | ciprofloxacin | nwt R |
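To make the combinatorial logic concrete, the sketch below evaluates a ‘gene’ column expression against the set of ruleIDs matched in a genome, and picks the ‘most resistant’ category when no explicit combination rule applies. It is an illustration only, not the AMRrules implementation. Two choices are assumptions not stated in the specification: `&` is given higher precedence than `|`, and categories are ranked first by S/I/R and then by wt/nwt. The function names are hypothetical.

```python
import re

TOKEN = re.compile(r"\s*([A-Z]{3}\d{4}|[&|()])")


def tokenize(expression):
    """Split an expression into ruleIDs, operators and parentheses."""
    tokens, pos = [], 0
    expression = expression.rstrip()
    while pos < len(expression):
        match = TOKEN.match(expression, pos)
        if match is None:
            raise ValueError(f"unexpected text at position {pos} in {expression!r}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def evaluate(expression, matched):
    """Return True if the expression holds for the set of matched ruleIDs."""
    tokens, pos = tokenize(expression), 0

    def parse_or():
        nonlocal pos
        value = parse_and()
        while pos < len(tokens) and tokens[pos] == "|":
            pos += 1
            right = parse_and()
            value = value or right
        return value

    def parse_and():  # assumption: '&' binds more tightly than '|'
        nonlocal pos
        value = parse_atom()
        while pos < len(tokens) and tokens[pos] == "&":
            pos += 1
            right = parse_atom()
            value = value and right
        return value

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of expression")
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_or()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return value
        if token in ("&", "|", ")"):
            raise ValueError(f"unexpected token {token!r}")
        return token in matched  # a ruleID is true when that rule was matched

    result = parse_or()
    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return result


CLINICAL_RANK = {"S": 0, "I": 1, "R": 2}
PHENOTYPE_RANK = {"wt": 0, "nwt": 1}


def most_resistant(categories):
    """Pick the default category for co-occurring rules, e.g. 'wt S' + 'nwt I'."""
    # Assumption: rank by S/I/R first, then break ties by wt/nwt.
    return max(categories, key=lambda c: (CLINICAL_RANK[c.split()[1]], PHENOTYPE_RANK[c.split()[0]]))
```

For a genome matching `KPN0008` and `KPN0013`, the expression for `KPN0052` evaluates to true and the one for `KPN0051` to false. For `KPN0002` together with `KPN0008`, where no combination rule is defined, `most_resistant(["wt S", "nwt I"])` gives `nwt I`, in line with the text above.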