
Changelog

Latest

  • Removing repeat_end option from default config !751
  • Add dark mode to mkdocs documentation !748
  • Improved documentation + umami paper draft !747
  • Integrate UPP in umami/preprocessing.py !740
  • Update packages to match UPP + UPP update (v0.0.5) !745
  • Update UPP to v0.0.4 !744
  • Change truth label names !739

v0.21 (21.09.2023)

  • Update UPP to v0.0.2 !737
  • Update atlas-ftag-tools and UPP !736
  • Update atlas-ftag-tools to v0.1.5 !735
  • Update Puma version to v0.2.7 !734
  • Added an option to train the umami model directly from TDD files with structured arrays; fixed an issue where the writer saved weights as one of the jet variables !730 (https://gitlab.cern.ch/atlas-flavor-tagging-tools/algorithms/umami/-/merge_requests/730)
  • Update documentation FAQ !733
  • Added validation definitions to boosted preprocessing config file in order to create a validation sample !728
  • Update PUMA and atlas-ftag-tools !732
  • Support PUMA variables in general preprocessing config for preprocessing plots !731
  • Removed pixel and SCT holes from GNN variables !729
  • Changed boosted tagging flavours to lowercase: (Hbb, Hcc, QCD) changed to (hbb, hcc, qcd) !727
  • Setting coverage precision to two decimals !726

v0.20 (18.04.2023)

  • Added example configs for Xbb/Xcc boosted preprocessing, fixed a misnomer (dijets -> QCD) in the global config, added flag "legend_sample_category" !717
  • Fixing a bug (ignoring the last non-full batch of jets) with resampling_generator by merging it with sampling_generator method !721
  • Removing unused preprocessing UnderSamplingProp + Fixing coverage issue !724
  • Adding plot_type to Preprocessing configs !722
  • Adding atlas-ftag-tools package !723
  • Fix for flavour_label variable !719
  • Allow linter to fail in forks !718
  • Merge Preprocessing Rewrite into Master !716
  • Added and refactored some unit tests for resampling classes, refactored Undersampling and UndersamplingNoReplace classes !705
  • Adding git hash to output h5 files !710
  • Enable multidimensional count resampling !687
  • Enable correct usage of dataclasses in preprocessing config !696
  • Adding duplicated jets plot to after resampling plots !714
  • Improve resampled file flavour label handling !713
  • Make track weight variable customizable !712
  • Add support for different labels for each track-like group !688

v0.19 (20.03.2023)

  • Minor change to configuration.py !706
  • Adding more docs for fraction contour plots !707
  • Save resampling plots in dedicated directory for hybrid validation file !703
  • Ensure correct probability masking if evaluating with non-present class !702
  • Adding Check against zeros/negative values in log variables !700
  • Fixing precision problem in discriminant calculation !701
  • Improve scale dict copy logic !698
  • update LWTNN-conversion documentation !699
  • Fixing common variable bug in PDF sampling !685
  • Add VR track jet configs and fix bug in pdf sampling for a third sample category !657
  • Enabling additional jet labels integration test !695

v0.18 (27.01.2023)

  • Fixing LWTNN vardict conversion !693
  • Adding figsize argument to train configuration !692
  • Adding check if tracks are used for training with tf records but no track labels !691
  • Adding check if custom initial jets is set for PDF Sampling !690
  • Allow for cuts when plotting input variables !666
  • Added support for storing additional jet labels, for use in regression studies !671
  • Added unit tests for ConditionalDeepSet layer !684
  • Fix bug with Hbb/Hcc/top/dijets categories in global config !683
  • Update Tensorflow Version to 2.11.0 !681
  • Rewriting train config Configuration !675

v0.17 (05.12.2022)

  • Rewrite the selection in global config and replace functions reading cuts, correct reading of cuts from test file !649
  • Update Puma Version to 0.1.9 !680
  • Change default tracks name and update a logger printout !679
  • Adding boosted flavour categories to umami/configs/global_config.yaml for boosted Xbb/Xcc tagging. !678
  • Adding checkup against n_jets <= 0 for all methods beside pdf !677
  • Fixing PDF file naming issue !676
  • Simplifying mapping function one-hot labels -> labels in the writing step of the preprocessing, remove one-hot labels in resampling step, add them in writing step !664
  • Fixing check if argument --file_range is passed when using sample_merger, adding a warning if not used !674
  • Fixing invalid-name pylint errors !669
  • Update scale dict and track label saving !665

v0.16 (11.11.2022)

  • Update README.md and docs with tutorial link !667
  • Add preprocessing step to merge mc21 single and dileptonic ttbar samples !651
  • Adding full precision calculation of the scale/shift dicts !663
  • Changing default split in train/val/test !662

v0.15 (31.10.2022)

  • Added writing validation files !659
  • Adding integration test for DL1* with tfrecords !660
  • Adding Apache 2.0 license !656
  • Adding support for combining track/jet inputs in input var plotting !658
  • Fixing issue in the try except blocks of the preprocessing plots !655
  • Setting default value for concat_jet_tracks !654
  • Adding support for non-top level and special named jet- and track collections !653
  • Various improvements to train file writing (group-based structure) !648
  • Removing file dependency for generator unit tests !652

v0.14 (14.10.2022)

  • Moving Feature Importance in the evaluation section of the training !647
  • Fixing issue with SHAPley plot naming !645
  • Adding proper hybrid validation sample creation !646
  • Fixing SHAPley calculation and add it to DL1r integration test !643
  • Adding yaml to requirements.txt !644
  • Adding classes_to_evaluate to rejection per fraction calculation !642
  • Adding function to write model predictions to h5 files !637
  • Fixing ufunc issue in scale/shift application !641
  • Adding FAQ and small docs update !636

v0.13 (15.09.2022)

  • Fix error calculation in ROC plots !639
  • Remove global dropout parameter from DIPS config. Dropout in DIPS is now defined for each layer with dropout_rate and dropout_rate_phi !638
  • Re-adding #!/usr/bin/env python to executable scripts !635
  • Removing global dropout parameter from DL1* models. Dropout has to be specified per layer now via the list dropout_rate !633
  • Bot-comment about changed placeholders will now be posted as unresolved thread !634
  • Adding function to flatten arbitrary nested lists !632
  • Adding randomise option to input_h5 block in preprocessing config !631
  • Input variable plots: Adding support for custom x-labels !626
  • Input variable plots: Adding support for dataset-specific class labels !623
  • Adding possibility to evaluate classes the freshly trained tagger is not trained on !625
  • Remove preprocessing config from loading functions !622
  • Training metrics plots: now using puma.Line2DPlot objects here, which modifies the default colours !629
  • Fixing plotting issue in fraction contour + plot_scores !624
  • Switch to puma v0.1.8 !630
  • Apply naming scheme for WP and nEpochs !621
  • Adding correct naming scheme for train config sections !617

v0.12 (23.08.2022)

  • Small resampling fix !620
  • Fixing multiple issues with the fraction contour plots !619
  • Adding automatic creation of samples dict for the preprocessing config !610
  • Rewriting of preprocessing config reader !606
  • Adding truth label to results file + Fix flavour retrieval in plotting !618
  • Merging load validation data functions !615
  • Update training documentation !613
  • Cleanup of preprocessing config !609
  • Update puma version to v0.1.7 !614

v0.11 (10.08.2022)

  • Removing var_dict from train config !611
  • Switching track variable precision to float32 !608
  • Merge apply_scaling and write step in preprocessing !605
  • Adding string join support for yaml !607
  • Adding configuration base class and doc improvements for pdf sampling !604
  • Merging evaluate_model script functions + adapt pt_vs plots to be var_vs plots !599
  • Unify scaling/shifting application for preprocessing/validation !597
  • Adding script to process test samples in an easy way !595
  • Adding x_axis_granularity argument + Fixing evaluation_file plotting issue !596
  • Restructure and update preprocessing documentation !598
  • Bot posts message in MR in case files used as placeholders were changed !594
  • Pointing truth label docs directly to FTAG docs !593
  • Compare class id, class operators and variables of each class definition, instead of only the class id, to avoid duplicate class definitions !575
  • Removing #!/usr/bin/env python from scripts !591
  • Adding metadata information to training file !592
  • Adding some missing unit tests !587
  • Plots per default with non-transparent background !590
  • Fixing pylint for unit tests !588
  • Adding support for hits !583
  • Fixing track masking for the input variable plots !585
  • Reducing artifact size for the preprocessing integration tests !586
  • Removing casefold in tagger name retrieval !584
  • Fixing all pylint logging-fstring-interpolation issues !582
  • Adding consistent n_jets naming !570

v0.10 (06.07.2022)

  • Adding track truth label to the Preprocessing. !559
  • Fixing CI syntax of cobertura !577
  • Fixing image issue in pylint !574
  • Fixing memory leak in Callback functions + New TF version 2.9.1 !573
  • Add option sampling_fraction in preprocessing config to use a different number of jets for each class. Defined as fraction of events compared to target class, add option to define operator in global config !561
  • Switch to latest puma version (v0.1.3) !572
  • Splitting CADS and DIPS Attention !569
  • Fixing docker image builds !571
  • Fixing uncertainty calculation for the ROC curves !566

v0.9 (21.06.2022)

  • Fixing Callback error when LRR is not used !567
  • Fixing stacking issue for the jet variables in the PDFSampling !565
  • Fixing problem with 4 classes integration test !564
  • Rework saliency plots to use puma !556
  • Fixing generation of class ids for only one class !563
  • Removing hardcoded tmp directories in the integration tests !562
  • Fixing x range in metrics plots + correct tagger name in results files !560
  • Fixing issue with the PDFSampling shuffling + Fixing small issue with the loaders !558
  • Fixing ylabel issue in ROC plots !555
  • Adding verbose option to executable scripts !557
  • Moving Plotting Files in one folder !554
  • Adding classes to global config (light-flavour jets split by quark flavour/gluons, leptonic b-hadron decays) to define extended tagger output !553
  • Fixing issues with trained_taggers and taggers_from_file in plotting_epoch_performance.py !549
  • Adding plotting API to Contour plots + Updating plotting_umami docs !537
  • Adding unit test for prepare_model and minor bug fixes !546
  • Adding unit tests for tf generators !542
  • Fix epoch bug in continue_training !543
  • Updating tensorflow to version 2.9.0 and pytorch to 1.11.0-cuda11.3-cudnn8-runtime !547
  • Removing plotting API code and switch to puma !540 !548
  • Remove IPxD from default configs !544

v0.8 (16.05.2022)

  • Fix integration test artifacts !538
  • Moving the line-block replacement script to a separate repo !539
  • Apply Plotting API to preprocessing plots !534
  • Adding fix for batch size in validation/evaluation !535
  • Adding Plotting API to PlottingFunctions in the eval tools !532
  • Fix for the "exclude" functionality !528
  • Adding metrics to Callback functions + Fixing model summary issue !526
  • Improved compression settings during scaling and writing !527
  • Add documentation and integration tests for importance sampling without replacement method !502
  • (Plotting API) Update training plots to plotting API !515
  • Fix validation values json in continue_training !516
  • Fixing a bunch of invalid-name pylint errors !522
  • Adding error message if file in placeholder does not exist !519
  • Update the LWTNN scripts !512
  • Adding pydash to requirements !517
  • (Plotting API) Change default value of atlas_second_tag !514
  • Small refinements in input var plots !505
  • Adding ylabel_ratio_1 and ylabel_ratio_2 to plot_base !504
  • Adding prepare_docs stage to CI !503
  • Extend flexibility in input var plotting functions !501
  • Adding continue_training option !500
  • change default fc for evaluation of Dips and Cads in training configs !499
  • Use plotting python API in input var plots (track variables) !498
  • Remove redundant loading loop !496
  • Use plotting python API in input var plots (track variables) !488
  • Fixing nFiles for tfrecords training !495
  • (Plotting API) Adding support for removing "ATLAS" branding on plots !494
  • (Plotting API) Adding option to specify number of bins (instead of bin edges) in histogram plots !491
  • (Plotting API) Adding support for ATLAS tag offset + Small fix for ratio uncertainty in histogram plots !490
  • Adding support for multiple signal classes !414

v0.7 (18.03.2022)

  • Adding Script for input variables correlation plots to examples folder !474
  • Adding integration tests for plotting examples scripts + added plots to documentation !480
  • Adding slim umami image (mainly for plotting) !473 !482
  • Update python packaging, fixing CI gitlab labels and moving classification_tools into helper_tools !481
  • Added histogram plots to the new plotting python API !449
  • Implemented placeholder for code snippets in markdown files !476
  • Fixing branch unit test (problem with changing style of matplotlib globally) !478
  • Streamline h5 ntuples and samples overview with that of ftag-docs !479
  • Adding dummy data generation of multi-class classification output !475
  • Move to matplotlib.figure API and atlasify for plotting python API !464
  • Adding --prepare option to train.py and fix an issue with the model_file not copied into the metadata folder !472
  • Fixing issue #157 with the ylabel of the input variable plots !466
  • Adding custom labels for the taggers_from_files option in the validation metrics plots !469
  • Fixing doubled integration test and removing old namings !455
  • Adding new instructions for VS Code usage !467
  • Fixing fixed_eff_bin for pT dependence plots and adding new feature to set the y limit of the ratio plots for the ROC plots !465
  • Adding a check for replaceLineInFile if leading spaces stay same, if not a warning is raised !451
  • Allowing that no cuts are provided for samples in the preprocessing step !451
  • Updating jet training variable from SV1_significance3d to SV1_correctSignificance3d for r22 !451
  • Restructuring gitlab CI file structure and adding MR/issue templates !463
  • Removing spectator variables from variable configs and fixing exclude option in training !461
  • Adding atlasify to requirements !458
  • Supporting binariser for 2 class labels to still have one-hot encoding !409
  • Variable plots for preprocessing stages added !440
  • Update TFRecord reader/writer + Adding support for CADS and Umami Cond Att !444
  • Restructuring documentation !448
  • New Python API for plotting of variable vs efficiency/rejection !434
  • New combine flavour method for PDF sampling (with shuffling) !442
  • Add TFRecords support for CADS !436
  • Added Umami attention !298
  • renamed nominator to numerator !447
  • Fix of calculation of scaling factor !441

v0.6 (16.02.2022)

  • CI improvements
  • latest samples added to documentation
  • packages were upgraded
  • new Python API added for plotting of ROC curves
  • Added normalisation option to input plotting
  • logging level for all tests is set to debug by default
  • Added optional results filename extension
  • Added docs for pdf method and parallelise pdf method
  • Possibility to modify names of track variables in config files
  • Added new sphinx documentation
  • Black was added in CI
  • fraction contour plots were added
  • bb-jets category colour was changed
  • Config files are now copied during preprocessing
  • several doc string updates
  • docs update for taggers (merged them)
  • save divide added
  • flexible validation sample definition in config added
  • fixed all doc strings and now enforce darglint in CI

v0.5 (26.01.2022)

  • Adding Multiple Tracks datasets in preprocessing stage in !285

v0.4 (25.01.2022)

  • Updating Tensorflow version from 2.6.0 to 2.7.0
  • Upgrading Python from version 3.6.9 to 3.8.10
  • Adding new base and baseplus images
  • Introducing linting to the CI pipelines
  • Changing to Pylint as main linting package
  • Adding doc-string checks (not enforced)
  • Adding support for GNN preprocessing
  • Restructuring of the training config files
  • Explanation of how to set up Visual Studio Code to develop Umami
  • Automatic documentation via sphinx-docs is added
  • Reordering of the preprocessing config file structure (NO BACKWARD COMPATIBILITY)
  • Adding CI pipeline updates
  • Restructuring of functions (where they are saved)
  • Adding multiple updates for the taggers (mostly minor adds, no big change in performance is expected)

v0.3 (01.12.2021)

  • new preprocessing chain included
  • adding PDF sampling, weighting