diff --git a/_episodes/02-creating-a-factory.md b/_episodes/02-creating-a-factory.md index 58da735..8bbd664 100644 --- a/_episodes/02-creating-a-factory.md +++ b/_episodes/02-creating-a-factory.md @@ -49,7 +49,7 @@ There are a number of different kinds of factories available in JANA which we ha 4. It requires a deeper understanding of JANA internals to use correctly. The user is allowed to perform actions inside the factory callbacks that don't necessarily make sense. We remedied this issue by developing `JOmniFactory`, which *declares* what it needs upfront, and JANA *provides* it only when it makes sense. `JOmniFactory` supports all of the functionality developed for points (1), (2), and (3), and presents a simpler interface. -In summary, always use `JOmniFactory` if you are writing something new. All existing factories in EICrecon are in the process of being migrated right now: https://github.com/eic/EICrecon/issues/1176. +In summary, always use `JOmniFactory` if you are writing something new. The migration of all EICrecon factories to `JOmniFactory` (tracked in https://github.com/eic/EICrecon/issues/1176) is essentially complete, so you should not encounter the older base classes in current code. ## The JOmniFactory interface @@ -102,16 +102,16 @@ public: // The logger, parameters, and services have all been fetched before this is called } - void ChangeRun(int64_t run_number) { + void ChangeRun(int32_t run_number) { // This is called whenever the run number is changed. // Use this callback to retrieve state that is keyed off of run number. } - void Process(int64_t run_number, uint64_t event_number) { + void Process(int32_t run_number, uint64_t event_number) { // This is called on every event. // Use this callback to call your Algorithm using all inputs and outputs // The inputs will have already been fetched for you at this point. - // m_algo->execute(...); + // m_algo->process({...}, {...}); logger()->debug( "Event {}: Calling Process()", event_number ); } @@ -120,21 +120,21 @@ public: ## The JOmniFactory inputs and outputs -The user specifies the JOmniFactory's inputs by declaring `PodioInput` or `VariationalPodioInput` objects as data members. These are templated on the basic PODIO type (Not the collection type or mutable type or object type or data type), and require the user to pass `this` as a constructor argument. These objects immediately register themselves with the factory, so that the factory always knows exactly what data it needs to fetch. To access the data once it has been fetched, the user can call the object's `operator()`, which returns a constant pointer to a PODIO collection of the correct type. For instance, suppose the user declares the data member: +The user specifies the JOmniFactory's inputs by declaring `PodioInput` or `VariadicPodioInput` objects as data members. These are templated on the basic PODIO type (Not the collection type or mutable type or object type or data type), and require the user to pass `this` as a constructor argument. These objects immediately register themselves with the factory, so that the factory always knows exactly what data it needs to fetch. To access the data once it has been fetched, the user can call the object's `operator()`, which returns a constant pointer to a PODIO collection of the correct type. For instance, suppose the user declares the data member: ```c++ -PodioInput m_particles_in {this}; +PodioInput m_particles_in {this}; ``` In this case, the user would access the input data like this: ```c++ -const MCParticlesCollection* particles_in = m_particles_in(); +const edm4hep::MCParticleCollection* particles_in = m_particles_in(); ``` -Of course, for brevity, the user could simply write this instead: +Of course, for brevity, the user could simply pass `m_particles_in()` straight into the algorithm, and write the algorithm output through `m_particles_out().get()` — which is the pattern most factories use today: ```c++ -m_particles_out() = smearing_algo->execute( m_particles_in() ); +m_algo->process({m_particles_in()}, {m_particles_out().get()}); ``` As you have just seen, PodioOutputs are very analogous to PodioInputs. diff --git a/_episodes/03-calling-a-factory.md b/_episodes/03-calling-a-factory.md index 7263bfb..315deba 100644 --- a/_episodes/03-calling-a-factory.md +++ b/_episodes/03-calling-a-factory.md @@ -19,7 +19,7 @@ Instead of handing over the OmniFactory to JANA directly, we create a `JOmniFact Here is how you set up a factory generator: ```c++ - app->Add(new JOmniFactoryGeneratorT( + app->Add(new JOmniFactoryGeneratorT( "GeneratedParticles", {"MCParticles"}, {"GeneratedParticles"}, @@ -31,7 +31,7 @@ In this example, "GeneratedParticles" is the factory instance's unique tag, `{"M - If you are only creating one instance of this factory, feel free to use the "primary" output collection name as the factory prefix. (This has to be unique because PODIO collection names have to be unique.) -- Collection names are positional, so they need to be in the same order as the `PodioInput` and `VariationalPodioInput` declarations in the factory. +- Collection names are positional, so they need to be in the same order as the `PodioInput` and `VariadicPodioInput` declarations in the factory. - Variadic inputs are a little bit interesting: You can have any number of variadic inputs mixed in among the non-variadic inputs, as long as there are the same number of collection names for each variadic input. If this confuses you, just restrict yourself to one variadic input and put it as the very last input, like most programming languages do. @@ -54,15 +54,17 @@ eicrecon -Ppodio:output_collections=MyNewCollectionName1,MyNewCollectionName2 in #### To permanently include your factory's outputs in the output file: -Add your collection name to the `output_collections` list in src/services/io/podio/JEventProcessorPODIO.cc:44 +Add your collection name to the `output_collections` list in `src/services/io/podio/JEventProcessorPODIO.cc`: ```c++ - std::vector output_collections={ - "EventHeader", - "MCParticles", - "CentralTrackingRecHits", - "CentralTrackSeedingResults", - "CentralTrackerMeasurements", - //... + std::vector output_collections = { + // Header and other metadata + "EventHeader", + + // Truth record + "MCParticles", + "MCBeamElectrons", + "MCBeamProtons", + // ... ``` ### To temporarily use your factory's outputs as inputs to another factory @@ -73,9 +75,9 @@ eicrecon -Ptargetfactory:InputTags=MyNewCollectionName1,MyNewCollection2 in.root ### To permanently use your factory's outputs as inputs to another factory -Change the collection name in the `OmniFactoryGeneratorT` or `JChainMultifactoryGeneratorT`: +Change the collection name in the `JOmniFactoryGeneratorT`: ```c++ - app->Add(new JOmniFactoryGeneratorT( + app->Add(new JOmniFactoryGeneratorT( "GeneratedParticles", {"MCParticlesSmeared"}, // <== Used to be "MCParticles" {"GeneratedParticles"}, @@ -87,7 +89,7 @@ Change the collection name in the `OmniFactoryGeneratorT` or `JChainMultifactory > - Create a JOmniFactoryGenerator for your ElectronReconstruction factory > - Give your factory's output collection a fun name > - Call your factory from the command line and verify that you see its logger output. -> - Add it to the `JEventSourcePODIO::output_collections`, so that it gets called automatically. +> - Add it to the `JEventProcessorPODIO::output_collections`, so that it gets called automatically. > - Experiment with multiple factory generators so you can have multiple instances of the same factory {: .challenge} diff --git a/_episodes/04-parameterizing-a-factory.md b/_episodes/04-parameterizing-a-factory.md index f6edb57..762501e 100644 --- a/_episodes/04-parameterizing-a-factory.md +++ b/_episodes/04-parameterizing-a-factory.md @@ -18,17 +18,17 @@ Parameters are also handled using registered members. JOmniFactory provides a `P ParameterRef m_energyWeight {this, "energyWeight", config().energyWeight}; ``` -Parameters are fetched immediately before `Init()` is called, so you may access them from any of the callbacks like so: +Parameters are fetched immediately before `Configure()` is called, so you may access them from any of the callbacks like so: ```c++ - void Process(int64_t run_number, uint64_t event_number) { + void Process(int32_t run_number, uint64_t event_number) { logger()->debug( "Event {}: samplingFraction = {}", event_number, m_samplingFraction() ); } ``` Because we are using ParameterRefs, we can also access the field the ref points to directly: ```c++ - void Process(int64_t run_number, uint64_t event_number) { + void Process(int32_t run_number, uint64_t event_number) { logger()->debug( "Event {}: samplingFraction = {}", event_number, config().sampFrac ); } ``` @@ -91,4 +91,4 @@ Oftentimes we want to retrieve a resource from a Service and refresh it whenever > - Give your factory a Config struct > - Give your Config struct some parameters > - Experiment with overriding parameter values in the generator and on the command line. -{: .challenge} \ No newline at end of file +{: .challenge} diff --git a/_episodes/05-adding-an-algorithm.md b/_episodes/05-adding-an-algorithm.md index d754076..7eb4e22 100644 --- a/_episodes/05-adding-an-algorithm.md +++ b/_episodes/05-adding-an-algorithm.md @@ -11,10 +11,10 @@ objectives: ## The difference between a factory and an algorithm -*Algorithms* are classes that perform one kind of calculation we need and they do so in a generic, framework-independent way. The core of an Algorithm is a method called `execute` which inputs some PODIO collections and outputs some other PODIO collections. Algorithms don't know or care where the inputs come from and where they go. Algorithms also don't know much about where their parameters come from; rather, they are passed a `Config` structure which contains the parameters' values. The nice thing about algorithms is that they are simple to design and test, and easy to reuse for different detectors, frameworks, or even entire experiments. +*Algorithms* are classes that perform one kind of calculation we need and they do so in a generic, framework-independent way. The core of an Algorithm is a method called `process` which takes a tuple of input PODIO collections and a tuple of output PODIO collections. Algorithms don't know or care where the inputs come from and where they go. Algorithms also don't know much about where their parameters come from; rather, they are passed a `Config` structure which contains the parameters' values. The nice thing about algorithms is that they are simple to design and test, and easy to reuse for different detectors, frameworks, or even entire experiments. -Most of what makes an Algorithm an Algorithm is convention. (These are largely inspired by the KISS principle in software engineering!) There is an ongoing effort to create a "framework-less framework" for formally expressing Algorithms using templates, which lives at https://github.com/eic/algorithms. Eventually, we may encourage users to have all Algorithms inherit from the `Algorithm, Output<...>>` templated interface. For now, however, just follow the Algorithm conventions that we will go over next. +In EICrecon, all new algorithms inherit from the templated `algorithms::Algorithm, Output<...>>` interface (provided by the [eic/algorithms](https://github.com/eic/algorithms) "framework-less framework"), and use the `WithPodConfig` mixin to attach a configuration struct. The `algorithms::Algorithm` base provides logging facilities (`info()`, `debug()`, `trace()`, ...) and a structured way of declaring inputs and outputs by their PODIO collection types. ## Where to put the algorithm code @@ -29,32 +29,68 @@ Here is a template for an algorithm header file: #pragma once -// #include relevant header files here +#include +// #include relevant edm4eic / edm4hep collection headers here + +#include "MyAlgorithmNameConfig.h" +#include "algorithms/interfaces/WithPodConfig.h" namespace eicrecon { - class MyAlgorithmName { + using MyAlgorithmNameAlgorithm = + algorithms::Algorithm, + algorithms::Output>; + + class MyAlgorithmName : public MyAlgorithmNameAlgorithm, + public WithPodConfig { public: - - // init function contains any required initialization - void init(); - // execute function contains main algorithm processes - // (e.g. manipulate existing objects to create new objects) - std::unique_ptr execute(); - - // Any additional public members go here + MyAlgorithmName(std::string_view name) + : MyAlgorithmNameAlgorithm{name, + {"inputCollectionName"}, + {"outputCollectionName"}, + "Short description of what this algorithm does."} {} + + // init() is called once before processing starts. Most algorithms do not need it. + void init() final {}; - private: - std::shared_ptr m_log; - // any additional private members (e.g. services and calibrations) go here + // process() does the actual work for each event. The Input/Output tuples + // contain pointers to the PODIO collections. + void process(const Input&, const Output&) const final; }; } // namespace eicrecon ~~~ +A few things worth noting: + +- The class is *templated* on the list of input and output collection types. The `Input` and `Output` aliases inside the class expand into `std::tuple` of pointers (`gsl::not_null` for inputs, `T*` for outputs). +- `process()` is `const` — algorithms must not mutate their own state during event processing. Run-by-run state should be set up in `init()` instead. +- Logging is inherited from `algorithms::AlgorithmBase`, so inside `process()` you simply call `info(...)`, `debug(...)`, or `trace(...)` directly — no logger pointer needs to be passed in. +- The configuration struct is held by `WithPodConfig` and accessible as the protected member `m_cfg`. + +The corresponding implementation file unpacks the input and output tuples with structured bindings: + +~~~ c++ + +#include "MyAlgorithmName.h" + +namespace eicrecon { + + void MyAlgorithmName::process(const Input& input, const Output& output) const { + + const auto [in_particles] = input; + auto [out_particles] = output; + + // ... fill out_particles using in_particles and m_cfg ... + } + +} // namespace eicrecon + +~~~ + ## How to call an algorithm from a factory The code to call an algorithm from a factory generally follows a specific pattern: @@ -64,37 +100,34 @@ The code to call an algorithm from a factory generally follows a specific patter // This is called when the factory is instantiated. // Use this callback to make sure the algorithm is configured. // The logger, parameters, and services have all been fetched before this is called - m_algo = std::make_unique(); - // Pass config object to algorithm - m_algo->applyConfig(config()); + // Construct the algorithm with the factory's prefix as its name — + // this is what hooks the algorithm's logger up to the same prefix as the factory. + m_algo = std::make_unique(GetPrefix()); - // If we needed geometry, we'd obtain it like so - // m_algo->init(m_geoSvc().detector(), m_geoSvc().converter(), logger()); + // Forward the JANA log level down to the algorithm. + m_algo->level(static_cast(logger()->level())); + + // Pass the config object to the algorithm. + m_algo->applyConfig(config()); - m_algo->init(logger()); + // Call init() once. Note that init() takes no arguments — services + // (e.g. geometry) are accessed by the algorithms framework via the + // algorithms::ServiceSvc / algorithms::GeoSvc, not by passing pointers in. + m_algo->init(); } - void Process(int64_t run_number, uint64_t event_number) { - // This is called on every event. - // Use this callback to call your Algorithm using all inputs and outputs - // The inputs will have already been fetched for you at this point. - auto output = m_algo->execute( - m_in_mc_particles(), - m_in_rc_particles(), - m_in_rc_particles_assoc(), - m_in_clu_assoc() - ); - - m_out_reco_particles() = std::move(output); - // JANA will take care of publishing the outputs for you. + void Process(int32_t /* run_number */, uint64_t /* event_number */) { + // This is called on every event. The inputs will have already been fetched. + // Call process() with brace-enclosed lists of input pointers and output pointers. + m_algo->process({m_in_particles()}, {m_out_particles().get()}); } ``` > Exercise: > - Create your own ElectronReconstruction algorithm using the code skeleton above. -> - Print some log messages from your algorithm's `execute()` method. +> - Print some log messages from your algorithm's `process()` method using `info(...)` / `debug(...)`. > - Have your ElectronReconstruction factory call the algorithm. > - Run this end-to-end. -{: .challenge} \ No newline at end of file +{: .challenge} diff --git a/_episodes/06-working-with-podio.md b/_episodes/06-working-with-podio.md index dd45b6c..6d02fa6 100644 --- a/_episodes/06-working-with-podio.md +++ b/_episodes/06-working-with-podio.md @@ -87,8 +87,8 @@ subset_clusters->push_back(cluster); ``` -Note that when you write a factory, its inputs will be `const ExampleHitCollection*`, which are *immmutable*. -Its output will be `std::unique_ptr`, which is still mutable but will transfer its ownership to JANA2. JANA2 will add the collection to a podio `Frame`. From that point on, the collection is immutable and owned by the `Frame`. +Note that when you write a factory, its inputs will be `const ExampleHitCollection*`, which are *immutable*. +Its output is held by a `PodioOutput` member; calling `m_output()` returns a `std::unique_ptr&` that the factory can mutate, and `m_output().get()` gives a raw mutable pointer that you can hand to your algorithm's `process()` method. After `Process()` returns, JOmniFactory transfers ownership of that collection to JANA2, which adds it to a podio `Frame`. From that point on, the collection is immutable and owned by the `Frame`. JANA2 will create and destroy `Frame`s internally. diff --git a/_episodes/07-putting-everything-together.md b/_episodes/07-putting-everything-together.md index 8791602..c305017 100644 --- a/_episodes/07-putting-everything-together.md +++ b/_episodes/07-putting-everything-together.md @@ -9,10 +9,9 @@ objectives: ## The final ReconstructedElectron factory -Here is the final ReconstructedElectron factory. Our algorithm requires reconstructed tracks, calorimeter clusters, and associations between the two. As previously stated, the current implementation of the simple electron ID uses the truth association between tracks and clusters (association using matching between clusters and track projections will be implemented later). Thus, we need two sets of associations: association between the truth particle and the reconstructed charged particle, and association between the truth particle and the calorimeter cluster. Obviously, we will not create these objects from scratch. Rather, we will get them from the factories (and underlying algorithms) implemented to create these objects. +Here is the final `ReconstructedElectrons_factory`. The current implementation in `main` is much simpler than the variadic-input version that was originally drafted: by the time we reach the reco stage, charged tracks have already been combined with calorimeter clusters into `edm4eic::ReconstructedParticle` objects (each carrying a `getClusters()` relation), so we only need to read that one collection and apply the E/p cut on it. - -`src/global/reco/ReconstructedElectrons_factory.h` in branch `nbrei_variadic_omnifactories`: +`src/factories/reco/ReconstructedElectrons_factory.h`: ~~~ c++ @@ -22,94 +21,76 @@ Here is the final ReconstructedElectron factory. Our algorithm requires reconstr #include "algorithms/reco/ElectronReconstruction.h" - namespace eicrecon { -class ReconstructedElectrons_factory : public JOmniFactory { -private: - - // Underlying algorithm - std::unique_ptr m_algo; - - // Declare inputs - PodioInput m_in_mc_particles {this, "MCParticles"}; - PodioInput m_in_rc_particles {this, "ReconstructedChargedParticles"}; - PodioInput m_in_rc_particles_assoc {this, "ReconstructedChargedParticleAssociations"}; +class ReconstructedElectrons_factory + : public JOmniFactory { +public: + using AlgoT = eicrecon::ElectronReconstruction; - VariadicPodioInput m_in_clu_assoc {this}; +private: + // Underlying algorithm + std::unique_ptr m_algo; - // Declare outputs - PodioOutput m_out_reco_particles {this}; + // Declare inputs + PodioInput m_in_rc_particles{this, "ReconstructedParticles"}; - // Declare parameters - ParameterRef m_min_energy_over_momentum {this, "minEnergyOverMomentum", config().min_energy_over_momentum}; - ParameterRef m_max_energy_over_momentum {this, "maxEnergyOverMomentum", config().max_energy_over_momentum}; + // Declare outputs + PodioOutput m_out_reco_particles{this}; - // Declare services here, e.g. - // Service m_geoSvc {this}; + // Declare parameters + ParameterRef m_min_energy_over_momentum{this, "minEnergyOverMomentum", + config().min_energy_over_momentum}; + ParameterRef m_max_energy_over_momentum{this, "maxEnergyOverMomentum", + config().max_energy_over_momentum}; public: - void Configure() { - // This is called when the factory is instantiated. - // Use this callback to make sure the algorithm is configured. - // The logger, parameters, and services have all been fetched before this is called - m_algo = std::make_unique(); - - // Pass config object to algorithm - m_algo->applyConfig(config()); + void Configure() { + // This is called when the factory is instantiated. + // Use this callback to make sure the algorithm is configured. + // The logger, parameters, and services have all been fetched before this is called + m_algo = std::make_unique(GetPrefix()); + m_algo->level(static_cast(logger()->level())); - // If we needed geometry, we'd obtain it like so - // m_algo->init(m_geoSvc().detector(), m_geoSvc().converter(), logger()); + // Pass config object to algorithm + m_algo->applyConfig(config()); - m_algo->init(logger()); - } + m_algo->init(); + } - void ChangeRun(int32_t run_number) { - // This is called whenever the run number is changed. - // Use this callback to retrieve state that is keyed off of run number. - // This state should usually be managed by a Service. - // Note: You usually don't need this, because you can declare a Resource instead. - } + void Process(int32_t /* run_number */, uint64_t /* event_number */) { + // This is called on every event. + // Use this callback to call your Algorithm using all inputs and outputs. + // The inputs will have already been fetched for you at this point. + m_algo->process({m_in_rc_particles()}, {m_out_reco_particles().get()}); - void Process(int32_t run_number, uint64_t event_number) { - // This is called on every event. - // Use this callback to call your Algorithm using all inputs and outputs - // The inputs will have already been fetched for you at this point. - auto output = m_algo->execute( - m_in_mc_particles(), - m_in_rc_particles(), - m_in_rc_particles_assoc(), - m_in_clu_assoc() - ); - - logger()->debug( "Event {}: Found {} reconstructed electron candidates", event_number, output->size() ); - - m_out_reco_particles() = std::move(output); - // JANA will take care of publishing the outputs for you. - } + logger()->debug("Found {} reconstructed electron candidates", m_out_reco_particles()->size()); + } }; } // namespace eicrecon ~~~ -Next, we register this with the `reco` plugin in src/global/reco/reco.cc: + +Note that `ChangeRun()` is omitted entirely — the JOmniFactory base class provides a default no-op implementation, so a factory that doesn't need to react to run-number changes does not have to override it. + +Next, we register this with the `reco` plugin in `src/global/reco/reco.cc`: ```c++ - app->Add(new JOmniFactoryGeneratorT( - "ReconstructedElectrons", - {"MCParticles", "ReconstructedChargedParticles", "ReconstructedChargedParticleAssociations", - "EcalBarrelScFiClusterAssociations", - "EcalEndcapNClusterAssociations", - "EcalEndcapPClusterAssociations", - "EcalEndcapPInsertClusterAssociations", - "EcalLumiSpecClusterAssociations", - }, - {"ReconstructedElectrons"}, - {}, // Override config values here - app - )); + app->Add(new JOmniFactoryGeneratorT( + "ReconstructedElectrons", {"ReconstructedParticles"}, {"ReconstructedElectrons"}, {}, app)); ``` +You can also create additional instances of the same factory with different parameter values. For example, the same `reco.cc` registers a second instance, `ReconstructedElectronsForDIS`, with a wider E/p window: +```c++ + app->Add(new JOmniFactoryGeneratorT( + "ReconstructedElectronsForDIS", {"ReconstructedParticles"}, {"ReconstructedElectronsForDIS"}, + { + .min_energy_over_momentum = 0.7, // GeV + .max_energy_over_momentum = 1.3 // GeV + }, + app)); +``` -And finally, we add its output collection name to the output include list in src/services/io/podio/JEventProcessorPODIO: +And finally, we add its output collection name to the output include list in `src/services/io/podio/JEventProcessorPODIO.cc`: ```c++ "ReconstructedElectrons", @@ -117,6 +98,7 @@ And finally, we add its output collection name to the output include list in src ``` +## The ElectronReconstruction algorithm `src/algorithms/reco/ElectronReconstruction.h`: @@ -124,58 +106,58 @@ And finally, we add its output collection name to the output include list in src #pragma once -#include -#include +#include #include -#include -#include -#include -#include +#include +#include #include "ElectronReconstructionConfig.h" #include "algorithms/interfaces/WithPodConfig.h" namespace eicrecon { - class ElectronReconstruction : public WithPodConfig{ - public: - - // Initialization will set the pointer of the logger - void init(std::shared_ptr logger); - - // Algorithm will apply E/p cut to reconstructed tracks based on truth track-cluster associations - std::unique_ptr execute( - const edm4hep::MCParticleCollection *mcparts, - const edm4eic::ReconstructedParticleCollection *rcparts, - const edm4eic::MCRecoParticleAssociationCollection *rcassoc, - const std::vector &in_clu_assoc - ); - - // Could overload execute here to, for instance, use track projections - // for track-cluster association (instead of truth information) - - private: - std::shared_ptr m_log; // pointer to logger - double m_electron{0.000510998928}; // electron mass - }; +using ElectronReconstructionAlgorithm = + algorithms::Algorithm, + algorithms::Output>; + +class ElectronReconstruction : public ElectronReconstructionAlgorithm, + public WithPodConfig { + +public: + ElectronReconstruction(std::string_view name) + : ElectronReconstructionAlgorithm{name, + {"inputParticles"}, + {"outputParticles"}, + "selected electrons from reconstructed particles"} {} + + void init() final {}; + void process(const Input&, const Output&) const final; +}; + } // namespace eicrecon ~~~ +A few things to note about the modern algorithm interface: +- The algorithm inherits from `algorithms::Algorithm, Output<...>>`, so its inputs and outputs are declared *as types* in the template parameter list. The compiler then checks at the call site that the right collection types are passed in. +- `WithPodConfig` provides the protected member `m_cfg`, which is what the algorithm uses to access its configuration values. +- Logging methods (`info`, `debug`, `trace`, `warning`, `error`) are inherited — there is no `m_log` member to manage by hand. +- `process()` is `const`. Any state that has to be set up once per job goes in `init()` (which here is empty); state that depends on the run number can be retrieved on the fly from `algorithms::GeoSvc` and friends. -Note that the algorithm's parameters are enclosed in a Config object which lives at `src/algorithms/reco/ElectronReconstructionConfig.h` -and can be accessed via the protected member variable `m_cfg`. +The algorithm's parameters live in a Config struct at `src/algorithms/reco/ElectronReconstructionConfig.h`: ```c++ -struct ElectronReconstructionConfig { +namespace eicrecon { - double min_energy_over_momentum = 0.9; - double max_energy_over_momentum = 1.2; +struct ElectronReconstructionConfig { + double min_energy_over_momentum = 0.9; + double max_energy_over_momentum = 1.2; }; -``` +} // namespace eicrecon +``` The algorithm itself lives at `src/algorithms/reco/ElectronReconstruction.cc`: @@ -184,74 +166,61 @@ The algorithm itself lives at `src/algorithms/reco/ElectronReconstruction.cc`: #include "ElectronReconstruction.h" #include +#include #include #include +#include +#include + +#include "algorithms/reco/ElectronReconstructionConfig.h" namespace eicrecon { - void ElectronReconstruction::init(std::shared_ptr logger) { - m_log = logger; - } +void ElectronReconstruction::process(const Input& input, const Output& output) const { + + const auto [rcparts] = input; + auto [out_electrons] = output; + + // out_electrons is a *subset* collection of the input ReconstructedParticles — + // PODIO objects are owned by exactly one collection, and a subset collection + // lets us refer to existing objects without copying or transferring ownership. + out_electrons->setSubsetCollection(); - std::unique_ptr ElectronReconstruction::execute( - const edm4hep::MCParticleCollection *mcparts, - const edm4eic::ReconstructedParticleCollection *rcparts, - const edm4eic::MCRecoParticleAssociationCollection *rcassoc, - const std::vector &in_clu_assoc - ) { - - // Step 1. Loop through MCParticle - cluster associations - // Step 2. Get Reco particle for the Mc Particle matched to cluster - // Step 3. Apply E/p cut using Reco cluster Energy and Reco Particle momentum - - // Some obvious improvements: - // - E/p cut from real study optimized for electron finding and hadron rejection - // - use of any HCAL info? - // - check for duplicates? - - // output container - auto out_electrons = std::make_unique(); - - for ( const auto *col : in_clu_assoc ){ // loop on cluster association collections - for ( auto clu_assoc : (*col) ){ // loop on MCRecoClusterParticleAssociation in this particular collection - auto sim = clu_assoc.getSim(); // McParticle - auto clu = clu_assoc.getRec(); // RecoCluster - - m_log->trace( "SimId={}, CluId={}", clu_assoc.getSimID(), clu_assoc.getRecID() ); - m_log->trace( "MCParticle: Energy={} GeV, p={} GeV, E/p = {} for PDG: {}", clu.getEnergy(), edm4hep::utils::magnitude(sim.getMomentum()), clu.getEnergy() / edm4hep::utils::magnitude(sim.getMomentum()), sim.getPDG() ); - - - // Find the Reconstructed particle associated to the MC Particle that is matched with this reco cluster - // i.e. take (MC Particle <-> RC Cluster) + ( MC Particle <-> RC Particle ) = ( RC Particle <-> RC Cluster ) - auto reco_part_assoc = rcassoc->begin(); - for (; reco_part_assoc != rcassoc->end(); ++reco_part_assoc) { - if (reco_part_assoc->getSimID() == (unsigned) clu_assoc.getSimID()) { - break; - } - } - - // if we found a reco particle then test for electron compatibility - if ( reco_part_assoc != rcassoc->end() ){ - auto reco_part = reco_part_assoc->getRec(); - double EoverP = clu.getEnergy() / edm4hep::utils::magnitude(reco_part.getMomentum()); - m_log->trace( "ReconstructedParticle: Energy={} GeV, p={} GeV, E/p = {} for PDG (from truth): {}", clu.getEnergy(), edm4hep::utils::magnitude(reco_part.getMomentum()), EoverP, sim.getPDG() ); - - // Apply the E/p cut here to select electons - if ( EoverP >= m_cfg.min_energy_over_momentum && EoverP <= m_cfg.max_energy_over_momentum ) { - out_electrons->push_back(reco_part.clone()); - } - - } else { - m_log->debug( "Could not find reconstructed particle for SimId={}", clu_assoc.getSimID() ); - } - - } // loop on MC particle to cluster associations in collection - } // loop on collections - - m_log->debug( "Found {} electron candidates", out_electrons->size() ); - return out_electrons; + for (const auto particle : *rcparts) { + + // Skip particles without an associated cluster (no calorimeter info) + if (particle.getClusters().empty()) { + continue; + } + // Skip neutral particles (no track momentum to compare against) + if (particle.getCharge() == 0) { + continue; } + double E = particle.getClusters()[0].getEnergy(); + double p = edm4hep::utils::magnitude(particle.getMomentum()); + double EOverP = E / p; + + trace("ReconstructedElectron: Energy={} GeV, p={} GeV, E/p = {} for PDG (from truth): {}", + E, p, EOverP, particle.getPDG()); + + // Apply the E/p cut here to select electrons + if (EOverP >= m_cfg.min_energy_over_momentum && + EOverP <= m_cfg.max_energy_over_momentum) { + out_electrons->push_back(particle); + } + } + debug("Found {} electron candidates", out_electrons->size()); } +} // namespace eicrecon + ~~~ + +Compared to the truth-association-based draft from earlier in the tutorial, this version is much shorter because: + +- The track ↔ cluster matching has already been done upstream and is exposed via `ReconstructedParticle::getClusters()`. +- We don't need any `MCRecoParticleAssociation` or `MCRecoClusterParticleAssociation` collections — the only input is the merged `ReconstructedParticles`. +- We don't allocate new particles; we just keep references to the originals via a *subset* collection (`setSubsetCollection()`). + +The exercise from earlier episodes — wiring a custom variant that takes truth associations as additional inputs — is still a useful one. Compare your version with the version on `main` to see the trade-offs between richness of inputs and code simplicity.