Learning path
Generated: 2026-05-10
Total rows: 6899
Rows per node: 1894-3001

Simulated hospital nodes

Summary calculated by reading the local CSV files generated by the first script.

Node                 Rows  Pathogens        Resistance rate  Genomics available
nodo_barcelona_mar   3001  CRAB, CRE, CRPA  64.2%            56.6%
nodo_madrid_norte    1894  CRAB, CRE, CRPA  69.2%            45.2%
nodo_valencia_turia  2004  CRAB, CRE, CRPA  74.6%            36.5%
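
As a rough illustration of how such a per-node summary could be derived from one CSV, here is a minimal sketch. The column names (`pathogen`, `interpretation`, `genomics_available`) and the `R`/`1` encodings are assumptions for the example; the real headers written by the generation script may differ.

```python
import csv
import io

# Hypothetical sample; the real CSV headers and encodings may differ.
SAMPLE_CSV = """pathogen,interpretation,genomics_available
CRE,R,1
CRAB,S,0
CRPA,R,1
CRE,R,0
"""


def summarize_node(rows):
    """Return row count, distinct pathogens, and the two percentages."""
    rows = list(rows)
    total = len(rows)
    resistant = sum(1 for r in rows if r["interpretation"] == "R")
    with_genomics = sum(1 for r in rows if r["genomics_available"] == "1")
    return {
        "rows": total,
        "pathogens": sorted({r["pathogen"] for r in rows}),
        "resistance_rate": round(100 * resistant / total, 1) if total else 0.0,
        "genomics": round(100 * with_genomics / total, 1) if total else 0.0,
    }


summary = summarize_node(csv.DictReader(io.StringIO(SAMPLE_CSV)))
print(summary)
```

For a real node file, the same function could be fed `csv.DictReader(open(path, newline="", encoding="utf-8"))` instead of the in-memory sample.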

Module README

01 Synthetic AMR Data

Generates and validates fictitious microbiology data for three simulated hospital nodes.

Each node has a different volume and epidemiological profile, so the demo is not artificially symmetric. Generation uses a fixed seed for reproducibility, and the number of rows deliberately differs from node to node.
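
One way to combine a fixed seed with unequal node volumes is to give each node its own random stream derived from the seed and the node name. The sketch below assumes this; the seed value, the function name `generate_node`, and the row layout are hypothetical. Only the row counts come from the summary table.

```python
import random

# Row counts taken from the summary table above.
NODE_ROWS = {
    "nodo_barcelona_mar": 3001,
    "nodo_madrid_norte": 1894,
    "nodo_valencia_turia": 2004,
}
SEED = 42  # hypothetical; the real generator's seed is not documented here


def generate_node(node, n_rows, seed=SEED):
    """Generate n_rows placeholder rows from a per-node seeded stream."""
    # String seeds are hashed deterministically, so the stream is stable across runs.
    rng = random.Random(f"{seed}:{node}")
    return [
        {"node": node, "pathogen": rng.choice(["CRAB", "CRE", "CRPA"])}
        for _ in range(n_rows)
    ]


first = {node: generate_node(node, n) for node, n in NODE_ROWS.items()}
second = {node: generate_node(node, n) for node, n in NODE_ROWS.items()}

print(first == second)                        # same seed, same data
print(sum(len(r) for r in first.values()))    # total rows across nodes
```

Deriving the stream per node means that changing one node's volume does not alter the data generated for the others.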

The data includes:

  • Priority AMR pathogen.
  • Antibiotic.
  • Synthetic MIC.
  • Resistant/susceptible interpretation.
  • Fictitious resistance mechanism.
  • Minimal aggregated clinical context.
  • Indicator of genomics availability.
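
The fields above could be modelled as a small record type with a few sanity checks. This is only a sketch: the field names, the `R`/`S` encoding, the `icu` flag standing in for clinical context, and the placeholder mechanism label `MECH_A` are all assumptions, and the checks are not necessarily those performed by `validar_calidad.py`.

```python
from dataclasses import dataclass

PATHOGENS = {"CRAB", "CRE", "CRPA"}
INTERPRETATIONS = {"R", "S"}  # resistant / susceptible


@dataclass(frozen=True)
class AmrRecord:
    pathogen: str
    antibiotic: str
    mic_mg_l: float           # synthetic MIC, assumed to be in mg/L
    interpretation: str
    mechanism: str            # fictitious label, not a real mechanism
    icu: bool                 # hypothetical stand-in for clinical context
    genomics_available: bool


def validate(record):
    """Return a list of problems found in one record (empty if none)."""
    errors = []
    if record.pathogen not in PATHOGENS:
        errors.append(f"unknown pathogen: {record.pathogen}")
    if record.mic_mg_l <= 0:
        errors.append("MIC must be positive")
    if record.interpretation not in INTERPRETATIONS:
        errors.append(f"unknown interpretation: {record.interpretation}")
    return errors


good = AmrRecord("CRE", "meropenem", 16.0, "R", "MECH_A", True, False)
bad = AmrRecord("MRSA", "meropenem", -1.0, "I", "MECH_A", False, False)

print(validate(good))
print(validate(bad))
```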

There are no real patients, real identifiers, or EHDS data.

Run

From the Desarrollo directory:

python .\01_datos_sinteticos_amr\generar_dataset_sintetico.py
python .\01_datos_sinteticos_amr\validar_calidad.py

Outputs

  • salida/nodos/*.csv: one CSV per simulated node.
  • salida/manifest.json: generation summary.
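
A generation summary of this kind might look like the sketch below. The key names are hypothetical, since the actual layout of `manifest.json` is not shown here; the date and row counts are the ones reported in the summary. The file is written to a temporary directory so the example does not touch `salida/`.

```python
import json
import tempfile
from pathlib import Path

NODE_ROWS = {
    "nodo_barcelona_mar": 3001,
    "nodo_madrid_norte": 1894,
    "nodo_valencia_turia": 2004,
}


def build_manifest(node_rows, generated):
    """Assemble a summary dict; key names are assumptions for illustration."""
    counts = list(node_rows.values())
    return {
        "generated": generated,
        "total_rows": sum(counts),
        "rows_per_node": {"min": min(counts), "max": max(counts)},
        "nodes": node_rows,
    }


manifest = build_manifest(NODE_ROWS, "2026-05-10")

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "manifest.json"
    path.write_text(json.dumps(manifest, indent=2), encoding="utf-8")
    loaded = json.loads(path.read_text(encoding="utf-8"))

print(loaded["total_rows"])
```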

Simulated profiles

  • Madrid Norte: higher proportion of CRE and greater carbapenem pressure.
  • Barcelona Mar: higher volume and more genomics availability.
  • Valencia Turia: more CRPA, higher simulated ICU rate, and greater resistance shift.

These profiles are pedagogical; they do not represent real hospitals.
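
Profiles like these can be expressed as sampling weights per node. The sketch below is illustrative only: the weights, the seed, and the function name `sample_pathogens` are invented for the example and are not the generator's actual parameters. It covers only the pathogen mix, not ICU rate, genomics availability, or resistance shift.

```python
import random
from collections import Counter

PATHOGENS = ["CRAB", "CRE", "CRPA"]

# Illustrative weights in PATHOGENS order; not the real generator settings.
PROFILES = {
    "nodo_madrid_norte": [0.25, 0.50, 0.25],    # more CRE
    "nodo_barcelona_mar": [0.34, 0.33, 0.33],   # roughly balanced
    "nodo_valencia_turia": [0.25, 0.25, 0.50],  # more CRPA
}


def sample_pathogens(node, n, seed=42):
    """Draw n pathogens for a node using its profile weights."""
    rng = random.Random(f"{seed}:{node}")
    return Counter(rng.choices(PATHOGENS, weights=PROFILES[node], k=n))


madrid = sample_pathogens("nodo_madrid_norte", 1894)
valencia = sample_pathogens("nodo_valencia_turia", 2004)

print(madrid.most_common(1)[0][0])
print(valencia.most_common(1)[0][0])
```

Keeping the profile in a plain dictionary makes it easy to add a node or adjust a weight without touching the sampling code.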