Required fields	status	description	reference standard	reference link	guidance	rationale
ruleID	required	unique identifier for this rule {values listed in 'organism subgroup codes'}	AMRrules	'organism subgroup codes' tab	Combination of a 3-letter code (indicating the organism subgroup that curated the rule, see tab 'organism subgroup codes') followed by a 4-digit number (assigned by the subgroup).	Each rule needs a unique identifier, so that combinatorial rules can be defined as combinations of component parts. These need to be unique across the entire AMRrules set, but assigned and managed within the subgroups that are defining the individual and combinatorial rules.
txid	required	taxonomy ID of the species that this rule applies to	NCBI Taxonomy	https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/	There should be one row per species/marker combination, for clarity of interpretation and parsing the rules files, and for clarity of recording evidence for each rule and its relevance to a given species. The primary taxonomy identifier for AMRrules is the NCBI Taxonomy; this field should contain a valid taxid for a species or genus. Note that these identifiers are stable, even when the species or genus name changes.	
organism	required	organism that this rule applies to, normally a species {scientific name}	NCBI Taxonomy	https://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html/	Indicate the name of the organism the rule applies to. Include the prefix 's_', 'g_' etc. to indicate the taxonomic level (species, genus). E.g. 's_Klebsiella pneumoniae' indicates species Klebsiella pneumoniae; 'g_Klebsiella' indicates genus Klebsiella. This should usually be the value of the 'current name' field associated with the taxid in the NCBI Taxonomy. However, if there are issues with the current name, e.g. if it does not match the organism nomenclature used by EUCAST to define a breakpoint, you may use a different organism name.	
gene	required	name of the gene that this rule applies to {node ID, or gene symbol if node ID not available} OR a logical expression describing a combination of other ruleIDs {logical}	refgene, NCBI Gene Hierarchy	https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt 	If the gene is in the NCBI hierarchy, specify the node ID. If it is not in the NCBI hierarchy, indicate the gene or allele name in NCBI refgene. If it is not in NCBI refgene, use the gene symbol (e.g. 'mexB'): if the gene is present in CARD, use the gene symbol given there; otherwise identify the most suitable gene symbol, and be sure to include refseq and ARO accessions for clarity. For combinatorial rules, this should be a logical expression based on other single-marker rules, which when evaluated as TRUE means this rule should be applied. E.g. "ECO001 & ECO002" means this rule should be applied when both rule ECO001 and rule ECO002 apply (i.e. when the markers defined by these rules are both detected). "(ECO001 | ECO003) & ECO002" means this rule should be applied when either one or both of rules ECO001 or ECO003 apply and ECO002 also applies. Syntax should use '&' for logical AND and '|' for logical OR. If the rule is intended to convey an unexplained mechanism of expected resistance, gene should be set to 'unknown', with context 'core', phenotype 'wildtype', clinical category 'R', and breakpoint standard 'EUCAST Expected Resistant Phenotypes vX (year)'; all gene identifier fields should be '-', and the curation note should explain the reasoning. If the rule is intended to convey an expected resistance due to lack of the drug target, the same applies but the gene should be set to 'none'.	
nodeID	uniquely identify the gene using AT LEAST ONE NCBI accession: nodeID (preferred) or refseq protein or GenBank protein or HMM (for protein-coding genes); or nucleotide accession with coordinates (for nucleotide variants e.g. 23S or promoter regions)	name of the gene that this rule applies to {node ID in NCBI gene hierarchy}	NCBI Gene Hierarchy	https://www.ncbi.nlm.nih.gov/pathogens/genehierarchy, https://ftp.ncbi.nlm.nih.gov/pathogen/Antimicrobial_resistance/AMRFinderPlus/database/latest/ReferenceGeneHierarchy.txt	Can be a leaf node or internal node in the NCBI Gene Hierarchy. Where a rule applies to multiple leaf nodes and/or all descendants of an internal node, it is recommended to specify one row per node, and provide evidence for each one (unless the number of leaf nodes is large and all have the same categorization and evidence). 	
protein accession		refseq protein accession for the gene this rule applies to	refseq or GenBank protein sequence accession	https://www.ncbi.nlm.nih.gov/refseq/ https://www.ncbi.nlm.nih.gov/genbank/	Indicate the refseq (preferred) or Genbank protein accession for the most appropriate protein sequence. Wherever possible this should match that used in the NCBI Pathogens refgene database.	
HMM accession		HMM accession for the gene this rule applies to (suitable for internal nodes in the NCBI Gene Hierarchy)	HMM accession	https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/pathogens/hmm/	Indicate the HMM accession for the most appropriate protein sequence; this is mainly relevant for internal nodes in the NCBI Gene Hierarchy.	
nucleotide accession		nucleotide sequence accession and coordinates defining the gene this rule applies to (suitable for e.g. rRNA genes or promoter variants)	nucleotide sequence accession	https://www.ncbi.nlm.nih.gov/pathogens/refgene/ https://www.ncbi.nlm.nih.gov/refseq/	For variants defined by nucleotide sequences not proteins (e.g. 23S, or promoter mutations), indicate the nucleotide sequence accession and coordinates of the relevant gene within that sequence, in the format: accession:start-stop (for genes encoded on forward strand) and accession:stop-start (for genes encoded on reverse strand). The refgene AMR database gives the relevant accessions and coordinates for AMR variants included in AMRfinderplus.	
ARO accession	optional	Antibiotic Resistance Ontology (ARO) identifier for the gene this rule applies to	ARO gene ID	https://card.mcmaster.ca	Optional. Note AROs are not associated with specific sequences, so are insufficient to define a rule.	Useful for harmonization with CARD (for drug dictionary and other things) and for annotation of genotypes generated using other DBs/tools based on CARD (which can be mapped to ARO using the argNorm tool).
mutation	required (set to '-' if non-specific)	specific mutation in this gene to which the rule applies	HGVS (with some AMRrules modifications)	https://hgvs-nomenclature.org/stable/ https://github.com/interpretAMR/AMRrulesCuration/blob/main/syntax.md	Indicate the mutation relative to the gene in 'gene'. Typically this will be a protein mutation (in the format 'p.Ser83Tyr') or a nucleotide mutation in a coding sequence (in the format 'c.25T'). For more complex examples see https://github.com/interpretAMR/AMRrulesCuration/blob/main/syntax.md 	
variation type	required	explanation of the type of variation this rule applies to {values listed in 'variation type' tab}	AMRrules	'variation type' tab	Indicate the type of variation this rule applies to. Allowed values are in the 'variation type' tab. Most common examples are 'Gene presence detected', 'Protein variant detected', 'Nucleotide variant detected' or 'Combination'.	Based on the 'variant type' column in hAMRonization, helps to clarify the nature of the variation to which the rule applies.
gene context	required	indicates the genomic context for this gene in this species {core, acquired, unknown}	AMRrules		Indicate the genomic context of this gene within this species, i.e. whether the gene is ‘core’ or ‘acquired’. Working definition of ‘core’ is: present (>90% identity, >90% length) in the chromosome of >95% of genomes of this species, and of >95% of those that have wildtype AST profiles. Note that a resistance-associated mutation in a core gene (e.g. Ser83Phe in chromosomal GyrA) should be coded as 'core'. A mutation in an acquired gene should be coded as 'acquired'.	
drug	optional (need drug OR drug class)	name of drug for which the rule applies {ARO term}	ARO term	https://card.mcmaster.ca	Indicate the name of the drug that the rule applies to. Where rules apply to multiple drugs, they should be specified in separate rows (i.e. as separate rules), with individual references for each gene-drug combination. Alternatively, if the rule applies to all drugs in a defined drug class, leave this blank and indicate the 'drug class' field instead. Allowed values are all CARD ARO entries of type 'antibiotic' (which includes disinfectant agents) or 'adjuvant' (which includes inhibitors).	
drug class	optional (need drug OR drug class)	name of drug class for which the rule applies (ONLY if the rule is consistent across the entire drug class) {ARO term}	ARO term	https://card.mcmaster.ca	Indicate the name of the drug class that the rule applies to. This field should be completed ONLY IF there is evidence that the gene has activity against all drugs in the class. Note that CARD defines five classes of cephalosporins: first-generation cephalosporin, second-generation cephalosporin, third-generation cephalosporin, fourth-generation cephalosporin, other cephalosporin and penam.	Useful as there are likely to be a lot of determinants that apply across a whole drug class.
phenotype	required	indicates whether members of this species with this gene are expected to fall in the wildtype or non-wildtype part of the reference MIC distribution; this is equivalent to identifying whether the MIC is expected to fall below or above the ECOFF, if one is defined {wildtype, nonwildtype}	EUCAST distribution	mic.eucast.org	Indicates whether isolates of this species, with this gene, are considered to have a wildtype or nonwildtype susceptibility phenotype, equivalent to being below vs above the MIC ECOFF if one is defined. If the gene is a core gene, the expected phenotype should generally be 'wildtype', unless the rule refers to a specific variant of the core gene for which there is evidence of a nonwildtype phenotype.	
clinical category	required	expected clinical category for members of this species with this gene {S, I, R, NS}	EUCAST	https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes	Indicates the categorization associated with this gene, for members of this species {S, I, R, NS} using the breakpoint standard indicated. If the drug this rule applies to appears on the EUCAST Expected Resistances list for this organism, and the gene is a core gene, the expected phenotype should be 'wildtype' and the category should be ‘R’. If the gene is identified as a core gene but the drug does not appear on the EUCAST Expected Resistances list for this organism, and there are no EUCAST Expert Rules recommending reporting as R, there should be strong evidence from literature and/or matched genome/phenotype data to support the assignment of ‘R’. Note that 'NS' is only an allowed value for CLSI, not EUCAST, and has a specific meaning that is only relevant when there is a breakpoint for S but not for I or R.	
breakpoint	required	indicate the breakpoint that was used to define the expected phenotype category (note this is 'not applicable' if rule is specified for a drug class rather than a single drug)	EUCAST	https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes	Give the breakpoint used to define the indicated category for the specified drug (please enter 'not applicable' if rule applies to a drug class). E.g. for categorization as 'R' based on MIC, breakpoint should be given in the form 'MIC >X [units]' or 'disk zone <X mm'; for categorization as 'S', use 'MIC <=X [units]' or 'disk zone >X mm'; for categorization as 'I' use 'MIC range, >X and <=Y [units]'. For bug/drug combinations with wildtype 'I', the S breakpoint may be arbitrarily set to 0.001 (MIC) or 50 (disk); in this case it is inappropriate to define the breakpoint for 'I' as a range, so use e.g. 'MIC <=X [units]' rather than 'MIC range, >0.001 and <=X [units]'. If the rule is defined on the basis of an ECOFF, indicate the threshold used in the same manner as for a breakpoint. If it is an Expected (intrinsic) resistance, the breakpoint is irrelevant (and usually undefined) so enter 'not applicable'. If the rule applies to a drug class, enter 'not applicable', but consider whether it would be more informative to set specific rules for individual drugs. If there is no breakpoint or ECOFF, enter 'not available'.	As genotype interpretations are defined relative to clinical categorizations, and there are multiple sources for these and they are updated continuously, we need to record which standard was used to define each rule. This also facilitates accommodating multiple breakpoints for the same bug-drug combination, using different standards or clinical indications (e.g. EUCAST sometimes has different breakpoints for IV vs oral, or for treatment of specific syndromes). This approach also facilitates using an ECOFF in the absence of a breakpoint, and specifying rules defined against other standards such as CLSI or veterinary standards.
breakpoint standard	required	indicate the AST phenotyping standard used to interpret this rule	EUCAST	https://www.eucast.org/clinical_breakpoints https://www.eucast.org/expert_rules_and_expected_phenotypes/expected_phenotypes	In the format '[Name] [version] ([year])', e.g. 'EUCAST v15.0 (2025)' or 'ECOFF (May 2025)' (as ECOFFs at mic.eucast.org are not versioned, indicate month and year). If it is an Expected (intrinsic) resistance, there will not typically be a breakpoint; in this case indicate the version of the expert rules, e.g. 'EUCAST Expected Resistant Phenotypes v1.2 (2023)' or 'EUCAST Salmonella Expert Rules v3.2 (2019)'. If the rule is defined based on an informal breakpoint defined in a paper, indicate the PubMed identifier for the relevant paper in this field as: 'PMID xxx'.	
breakpoint condition	optional	indicate the specific conditions for this breakpoint, if relevant (e.g. meningitis, uncomplicated UTI, iv, oral)	EUCAST	https://www.eucast.org/clinical_breakpoints	If different breakpoints are defined for different conditions, indicate the conditions relevant to the breakpoint used to define this rule. For example different breakpoints may be given for different infection types (meningitis, uncomplicated UTI) or therapy types (iv, oral). If all breakpoints are the same, or all result in the same interpretation for this gene, it is preferable to specify a single rule without conditions. If multiple breakpoints are defined, and the interpretation is different using the different breakpoints, it is preferable to define separate interpretive rules for each condition. If the stated purpose of a condition-specific breakpoint is to screen for likely resistance mechanisms (e.g. ciprofloxacin for meningitis), or to enforce reporting of all isolates as 'I' for a specific condition, then a condition-specific rule is not needed as this is better managed in downstream reporting logic. Wherever possible, use the controlled vocabulary in sheet (see dropdown menu and 'breakpoint condition values' tab), which includes all such terms used in the EUCAST or CLSI 2025 breakpoints table.	
PMID	required	PubMed identifier/s for literature supporting the rule (comma-separated list)	PubMed	https://pubmed.ncbi.nlm.nih.gov/	Provide PubMed identifier for the ‘best’ peer-reviewed research article/s providing specific evidence that this gene is associated with this phenotype category for this drug in this species (separate multiple entries with ', '). Literature demonstrating evidence in other species, or related drugs, should not be included.	
evidence code	required	indicate the nature of the evidence that supports the rule {ECO code; select from controlled list, multiple selections allowed in comma-separated list}	ECO	https://www.evidenceontology.org/	Indicate the nature of the evidence supporting the rule. More than one can be listed; please include all forms of evidence available to support the rule (separate multiple entries with ', '). In principle any codes in the Evidence and Conclusion Ontology can be used, but in most cases it will be most appropriate to choose from the subset listed in the 'evidence codes' tab of this spreadsheet (also provided as a dropdown selection in the main data entry tab of this spreadsheet). The source for each type of evidence should be given in the 'PMID' field.	If you want to use an ECO code not yet included in the dropdown list, please let the ESGEM-AMR chairs know so that we can add it to the specification, as others may also find it helpful. If you feel something is missing from ECO, please also let us know so that we can discuss, and potentially work together to request the addition of new terms to the ontology.
evidence grade	required	expert curators’ overall assessment of the level of support provided by all evidence considered {high, moderate, low, very low}	AMRrules		Indicate the expert curators’ overall assessment of the level of support provided by all evidence considered.	There will often be a need to specify a rule for which the evidence is not yet conclusive. It is important to flag these and give some indication of what is lacking. Allowed terms and their definitions are given in the 'evidence grades' tab. Note that if no experimental evidence is available, the rule should NOT be graded as 'high', even if there is good evidence of statistical association between genotype and phenotype in natural populations. (Future updates will include additional fields to record quantitative details of genotype/phenotype associations.)
evidence limitations	optional	expert curators' assessment of the key limitations of the available evidence {values listed in 'evidence grades' tab}	AMRrules	'evidence grades' tab	This should be completed for all rules with evidence grades other than 'high'. Use the values listed in the 'evidence grades' tab (separate multiple entries with ', ').	
rule curation note	optional	short explanatory note describing the mechanism and/or reasoning for the rule	free text		Highly recommended to complete for all core genes, or combinatorial rules, to explain why this results in susceptibility or resistance.	
