Update local-execution to newest task-spec #1625

elliottslaughter · 2026-01-17T01:18:08Z

Runs end-to-end tests, including loss function. Do not include cost estimator, that will be a separate PR.

Changes to local-execution:

ComputationGraphInstance now captures all the necessary state to run a computation
- InitializedComputationGraphInstance has been removed
- Creating the ComputationGraphInstance runs all passes, allocates and initializes the graph
local_task_registry.h is now static: no dynamic map is retained
- This means task IDs are no longer used or required
Irrelevant sections of the old runtime have been removed

Changes to task-spec:

DynamicValueAttrs now stores a DynamicTensorAccessor (variant of GenericTensorAccessorR and GenericTensorAccessorW)
Exposed the mapping computed from labelled_open_kwarg_dataflow_graph_from_dynamic_open_dataflow_graph
Fixes for make_dynamic_open_dataflow_graph_from_cg
Fixes for perform_update_insertion
Fixes for linear op and optimizer wrappers

Follow-up PRs:

Update local cost estimator for latest task-spec #1626

This change is

… equivalent dataflow, expand.

Refactor computation graph instance as DTG. Cut out more stuff that doesn't compile yet.

…taflow_graph_from_cg.

lockshaw

Looks good! Left a few comments, but most are minor style issues--only a few comments of real importance. I also pushed a couple changes directly before I remembered that this is still one of your earlier PRs so I should really leave comments so you're aware of things. Let me know when it's ready for another round of review--the next round should be quick for me to review and will very likely get merged without additional changes needed.

@lockshaw reviewed 80 files and all commit messages, and made 33 comments.
Reviewable status: all files reviewed, 30 unresolved discussions (waiting on @elliottslaughter).

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 25 at r1 (raw file):

struct ComputationGraphInstance {
public:

Minor: Not strictly necessary, but explicitly deleting the default constructor can be nice for readability

Suggestion:

public:
  ComputationGraphInstance() = delete;

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 30 at r1 (raw file):

                           OptimizerAttrs const &,
                           std::optional<LossAttrs> const &,
                           std::optional<GenericTensorAccessorW>);

Minor: Generally we make all constructors explicit unless there's a clear reason not to

Suggestion:

  explicit ComputationGraphInstance(std::vector<DynamicNodeInvocation> const &,
                           Allocator &,
                           OptimizerAttrs const &,
                           std::optional<LossAttrs> const &,
                           std::optional<GenericTensorAccessorW>);

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_init_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

For clarity

Suggestion:

  if (!task_impl_fn.has_value()) {

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Suggestion:

  if (!task_impl_fn.has_value()) {

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Suggestion:

  if (!task_impl_fn.has_value()) {

lib/local-execution/include/local-execution/task_execution.h line 17 at r1 (raw file):

    DeviceType kernel_device_type,
    PCGOperatorAttrs op_attrs,
    std::optional<LossAttrs> const &loss_attrs,

Minor: Any reason for deleting all of the argument names?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 62 at r1 (raw file):

                                         dg, value.tensor_guid, value.role))
                           .second.accessor)
      .get<GenericTensorAccessorW>();

Minor: In general I'd recommend splitting this up into a couple statements to improve readability, but for a small static function like this it's not a big deal

Code quote:

  return assert_unwrap(assert_unwrap(find_output_tensor(
                                         dg, value.tensor_guid, value.role))
                           .second.accessor)
      .get<GenericTensorAccessorW>();

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

      input_tensors;
  std::optional<DynamicValueAttrs> logit_grad_value;
  if (loss_attrs) {

For readability

Suggestion:

  if (loss_attrs.has_value()) {

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

  std::optional<DynamicValueAttrs> logit_grad_value;
  if (loss_attrs) {
    auto [dg2, label_v, logit_grad_v] = perform_loss_insertion(

For readablity

Suggestion:

    auto [loss_inserted_dg, label_v, logit_grad_v] = perform_loss_insertion(

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

        device_id_t device_idx) {
  std::vector<DynamicNodeInvocation> const &execution_order =
      instance.get_execution_order();

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Suggestion:

  std::vector<DynamicNodeInvocation> execution_order =
      instance.get_execution_order();

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

        FFIterationConfig iteration_config,
        device_id_t device_idx) {
  std::vector<DynamicNodeInvocation> const &execution_order =

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

        FFIterationConfig iteration_config,
        device_id_t device_idx) {
  std::vector<DynamicNodeInvocation> const &execution_order =

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

    FFIterationConfig iteration_config,
    device_id_t device_idx) {
  std::vector<DynamicNodeInvocation> const &execution_order =

Here as well

lib/local-execution/src/local-execution/task_execution.cc line 17 at r1 (raw file):

namespace FlexFlow {

TaskTensorParameter make_task_tensor_parameter_dynamic(

Minor Makes the name a bit clearer

Suggestion:

make_task_tensor_parameter_from_dynamic_slot(

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

TaskTensorParameter make_task_tensor_parameter_dynamic(
    DynamicTensorSlot slot,

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

Suggestion:

    DynamicTensorSlot const &slot,

lib/local-execution/src/local-execution/task_execution.cc line 53 at r1 (raw file):

    device_id_t device_idx) {
  std::unordered_map<TaskTensorParameter, DynamicTensorAccessor>
      tensor_slots_backing;

I think this can be converted to a binary_merge_disjoint_maps of the result of a composition of a map_keys and map_values (it might actually be a good idea to make a map_keys_and_values that is simply a composition of the two, as iirc composing map_keys and map_values is a reasonable common operation in the codebase.

lib/local-execution/src/local-execution/task_execution.cc line 54 at r1 (raw file):

  std::unordered_map<TaskTensorParameter, DynamicTensorAccessor>
      tensor_slots_backing;
  for (auto const &[slot, input] : invocation.inputs) {

Minor If the type isn't clear from the immediate context, I'd recommend leaning toward explicit types for slot and input over the auto expansion

Code quote:

  for (auto const &[slot, input] : invocation.inputs) {

lib/local-execution/src/local-execution/task_execution.cc line 89 at r1 (raw file):

    std::optional<LossAttrs> const &loss_attrs,
    std::optional<PerDeviceOpState> const &per_device_op_state,
    FFIterationConfig iteration_config,

Minor pass by const ref unless you have a reason not toa

Code quote:

    FFIterationConfig iteration_config,

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

      break;
    default:
      PANIC("Unhandled DynamicTaskType", fmt::to_string(task_type));

I don't think the fmt::to_string is necessary

Suggestion:

      PANIC("Unhandled DynamicTaskType", task_type);

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

    SUBCASE("get_tensor") {
      SUBCASE("get_tensor(TensorSlotName, Permissions::RO, "
              "TrainingTensorType::FORWARD)") {

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

Code quote:

      SUBCASE("get_tensor(TensorSlotName, Permissions::RO, "
              "TrainingTensorType::FORWARD)") {

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

std::optional<std::pair<DynamicNodeInvocation, DynamicValueAttrs>>
    find_output_tensor(DynamicOpenDataflowGraph const &dg,

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

lib/task-spec/include/task-spec/dynamic_graph/dynamic_tensor_accessor.dtg.toml line 16 at r1 (raw file):

[[values]]
name = "accessor_r"
type = "::FlexFlow::GenericTensorAccessorR"

FYI You can put key = "some_string" and then dtgen will autogenerate has_some_string and require_some_string (and a couple other) functions that you can use as a sometimes more readable version of has<GenericTensorAccessorR> and get<GenericTensorAccessorR>

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 31 at r1 (raw file):

      /*accessor=*/std::nullopt,
      /*role=*/mk_dynamic_tensor_role_loss(),
  };

FYI generally we use this syntax, as it's a bit more approachable for new people. That said, not a big deal

Suggestion:

  DynamicValueAttrs label_value = DynamicValueAttrs{
      /*tensor_guid=*/mk_dynamic_tensor_guid_loss(),
      /*parallel_tensor_shape=*/logit_value.parallel_tensor_shape,
      /*shard_coord=*/logit_value.shard_coord,
      /*accessor=*/std::nullopt,
      /*role=*/mk_dynamic_tensor_role_loss(),
  };

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

      },
      /*node_attrs=*/
      DynamicNodeAttrs{

Shouldn't loss_attrs go somewhere in here?

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 66 at r1 (raw file):

  DynamicOpenDataflowGraph result = dg;
  result.invocations.insert(loss_invocation);
  return std::tuple{result, label_value, logit_grad_value};

Especially when you're passing back tuples with multiple values of the same type, it's usually preferred to make a dtgen struct so that the fields have names, which makes code that uses this function safer to write and easier to read

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 6 at r1 (raw file):

namespace FlexFlow {

dynamic_layer_guid_t mk_dynamic_layer_guid(layer_guid_t l) {

Minor: Would make the name a bit clearer

Suggestion:

 mk_dynamic_layer_guid_for_cg_layer

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

  return dynamic_layer_guid_t{l};
}
dynamic_layer_guid_t mk_dynamic_layer_guid_parallel(parallel_layer_guid_t l) {

Minor: Would make the name a bit clearer

Suggestion:

mk_dynamic_layer_guid_for_parallel_layer

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 12 at r1 (raw file):

  return dynamic_layer_guid_t{l};
}
dynamic_layer_guid_t mk_dynamic_layer_guid_loss() {

Minor: Would make the name a bit clearer

Suggestion:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_loss() {

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

                  DynamicValueAttrs{
                      /*tensor_guid=*/dynamic_tensor_guid_t{tensor},
                      /*parallel_tensor_shape=*/lift_to_parallel(attrs.shape),

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

lib/task-spec/src/task-spec/dynamic_graph/dynamic_tensor_guid_t.cc line 6 at r1 (raw file):

namespace FlexFlow {

dynamic_tensor_guid_t mk_dynamic_tensor_guid(tensor_guid_t t) {

Makes the name a bit less ambiguous

Suggestion:

mk_dynamic_tensor_guid_for_tensor_guid

lib/task-spec/src/task-spec/dynamic_graph/dynamic_tensor_guid_t.cc line 10 at r1 (raw file):

}
dynamic_tensor_guid_t
    mk_dynamic_tensor_guid_parallel(parallel_tensor_guid_t t) {

Makes the name a bit less ambiguous

Suggestion:

mk_dynamic_tensor_guid_for_parallel_tensor_guid

lib/task-spec/src/task-spec/dynamic_graph/dynamic_tensor_guid_t.cc line 13 at r1 (raw file):

  return dynamic_tensor_guid_t{t};
}
dynamic_tensor_guid_t mk_dynamic_tensor_guid_loss() {

Makes the name a bit less ambiguous

Suggestion:

mk_dynamic_tensor_guid_for_loss

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

Still in progress, left some notes on things I have questions about.

@elliottslaughter made 18 comments and resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 21 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 18 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor Generally we pass composite arguments (or at least arguments that are logically composite--some types may interally have multiple values, but are intended to be treated as opaque handles) as const &.

In general, for consistency default to const & unless you have a specific reason for not doing so

I guess I was treating this like Copy in Rust (it's a composite type but all the fields are flat POD), but happy to do whatever you want.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

I definitely copied this from elsewhere in the codebase. What's the rule for when this is necessary vs not?

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update subcase names to match (or at least not contradict) the updated function signature of get_tensor

I'm actually not sure how you want me to do this, unless you're literally asking me to write:

SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")

Is that it or do you have a shorter version you want to see here?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 9 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

Do you want this to be consistent with your other name? Because it seems you're asking for:

dynamic_layer_guid_t mk_dynamic_layer_guid_for_cg_layer(layer_guid_t);
dynamic_layer_guid_t mk_dynamic_layer_guid_for_parallel_layer(parallel_layer_guid_t);

Personally I would either make it layer/parallel_layer or else cg_layer/pcg_layer, but that's just me.

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't really understand this function--if you already have a tensor_guid, what is the output tensor you're trying to find? And why does it return a pair of values?

See if the new version in the updated diff makes more sense.

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Shouldn't loss_attrs go somewhere in here?

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Do we ever need to be able to distinguish between a TensorShape and a ParallelTensorShape for this field? If not this solution is fine, just wanted to confirm since by lifting everything to a ParallelTensorShape we do lose this information

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

lib/local-execution/src/local-execution/device_state_initialization.cc line 84 at r2 (raw file):

                               device_idx);
      });
  ASSERT(all_nodes_are_initialized(dg));

We can't assert here because not all nodes end up initialized at the end. I made the same mistake when I was initially writing, but I had to remove it. Perhaps you could write a more principled version that only checks certain kinds of nodes for this property, but it felt like more trouble than it's worth.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

@elliottslaughter resolved 10 discussions.
Reviewable status: 64 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lockshaw

@lockshaw reviewed 4 files and all commit messages, and made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @elliottslaughter).

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 30 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Generally we make all constructors explicit unless there's a clear reason not to

Testing publishing a response

elliottslaughter

@elliottslaughter made 10 comments.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lib/local-execution/src/local-execution/local_task_registry.cc line 158 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For clarity

Done.

lib/local-execution/src/local-execution/task_execution.cc line 126 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

I don't think the fmt::to_string is necessary

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 84 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readability

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 85 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

For readablity

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 163 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Don't assign reference variables unless you know the cost savings are significant--it just creates too many opportunities for someone to mix up the lifetimes and create really annoying memory bugs

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 185 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Really don't assign the result of nontrivial expressions to refs as it's then really easy to mess up the lifetimes--I'm actually not sure if this code is actually correct or not

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 211 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Update here as well (mostly leaving the additional comment to help me remember to check this next round--normally I don't leave comments on every occurence)

Done.

lib/local-execution/src/local-execution/computation_graph_instance/computation_graph_instance.cc line 236 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Here as well

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 175 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_fwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

lib/local-execution/src/local-execution/local_task_registry.cc line 189 at r1 (raw file):

  std::optional<TaskImplFunction> task_impl_fn =
      get_bwd_task_impl_for_op_attrs(op_attrs);
  if (!task_impl_fn) {

Done.

elliottslaughter

@elliottslaughter made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

lib/local-execution/include/local-execution/computation_graph_instance/computation_graph_instance.h line 30 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Testing publishing a response

Test reply

elliottslaughter

@elliottslaughter made 1 comment.
Reviewable status: 68 of 81 files reviewed, 20 unresolved discussions (waiting on @lockshaw).

elliottslaughter

@elliottslaughter made 6 comments and resolved 1 discussion.
Reviewable status: 68 of 81 files reviewed, 19 unresolved discussions (waiting on @lockshaw).

lib/local-execution/include/local-execution/task_execution.h line 17 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Any reason for deleting all of the argument names?

I went back and forth on this one because the rest of the code seems to be inconsistent. What do you prefer?

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 6 at r1 (raw file):

Previously, lockshaw (Colin Unger) wrote…

Minor: Would make the name a bit clearer

I'm not sure this is the most consistent, particularly considering the parallel case of tensors.

I feel like the two consistent options are:

layer / parallel_layer / loss, or
cg_layer / pcg_layer / loss

We used option 1 for tensors so that feels the most consistent to me, but let me know if you disagree.

lockshaw

@lockshaw reviewed 13 files, made 7 comments, and resolved 11 discussions.
Reviewable status: all files reviewed, 9 unresolved discussions (waiting on @elliottslaughter).

lib/local-execution/include/local-execution/task_execution.h line 17 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I went back and forth on this one because the rest of the code seems to be inconsistent. What do you prefer?

I have moved over to including the argument names because it enables us to use clang-tidy to check the argument name comments (i.e., the /*my_argument=*/... syntax), and because it enables adding doxygen docstrings (I don't think you can document arguments without names in doxygen, at least not easily)

lib/local-execution/test/src/local-execution/local_task_argument_accessor.cc line 73 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I'm actually not sure how you want me to do this, unless you're literally asking me to write:
SUBCASE("get_tensor(make_task_tensor_parameter_fwd(TensorSlotName::LHS_INPUT), Permissions::RO)")
Is that it or do you have a shorter version you want to see here?

You can also just use a prose name, like get_tensor for read-only forward tensor. I just don't want the subcase names to explicitly contradict the signature

lib/task-spec/src/task-spec/dynamic_graph/dynamic_layer_guid_t.cc line 6 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I'm not sure this is the most consistent, particularly considering the parallel case of tensors.

I feel like the two consistent options are:

layer / parallel_layer / loss, or

cg_layer / pcg_layer / loss

We used option 1 for tensors so that feels the most consistent to me, but let me know if you disagree.

Yeah, either is fine, I just don't like layer by itself because it's ambiguous (unfortunately we don't have a good name for a non-parallel layer that's shorter than "non_parallel_layer", hence "cg_layer"). Either parallel_layer or pcg_layer is fine to me

lib/task-spec/src/task-spec/dynamic_graph/dynamic_open_dataflow_graph.cc line 88 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

See if the new version in the updated diff makes more sense.

Yup, makes much more sense. Thanks!

lib/task-spec/src/task-spec/dynamic_graph/loss_insertion.cc line 49 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

I was on the fence about this. We could put loss attrs into the dynamic graph. On the other hand, optimizer attrs aren't stored in the graph, and arguably shouldn't be because they need to change on every iteration.

If you want them in the graph I'll add them.

Makes sense. I'd say either (1) include loss attrs in the graph, or (2) remove the loss_attrs argument. Either is fine with me, but having a loss_attrs argument that just isn't used feels weird.

lib/task-spec/src/task-spec/dynamic_graph/make_dynamic_open_dataflow_graph_from_cg.cc line 42 at r1 (raw file):

Previously, elliottslaughter (Elliott Slaughter) wrote…

Honestly, I don't know. As far as I'm aware everything downstream uses the ParallelTensorShape, but if you want this represented I can put it in.

Nope, if it's working for now that's fine, just wanted to raise the issue to make sure it got considered/kept in mind as we go forward. We can always change it if we discover a case where it is necessary.

lib/local-execution/src/local-execution/device_state_initialization.cc line 85 at r5 (raw file):

      });
  // FIXME: this assert fails because not all kinds of nodes require
  // initialization ASSERT(all_nodes_are_initialized(dg));

Got it, in that case it seems like having an all_nodes_are_initialized function doesn't make a whole lot of sense as it doesn't tell us anything, so maybe the answer is just to remove that function. Another option would be to keep it, but to have it incorporate what types of nodes don't need to be initialized. Either is fine, just trying to understand (and if convenient, add a check in the code) the postcondition of this function

elliottslaughter marked this pull request as ready for review January 29, 2026 01:00

elliottslaughter added 29 commits January 30, 2026 13:37

Rip out a bunch of stuff, stub stuff out.

c1a8ab4

Basic flow to create computation graph instance: computation graph to…

9cbf779

… equivalent dataflow, expand.

Initialized computation graph instance owns input tensors.

b3b49ba

Refactor computation graph instance as DTG. Cut out more stuff that doesn't compile yet.

Factor out dynamic_layer_guid_t and some work on make_dynamic_open_da…

4f9bb24

…taflow_graph_from_cg.

Create initialized computation graph instance.

cfe3d55

Assert on state of computation graph.

402140a

Switch to variant to track pre-allocated tensors.

f35ed75

Code to initialize ops, need task argument accessor.

91a7c7c

Fix up LocalTaskArgumentAccessor.

3e8eca6

Fix some headers.

7e979e2

Local executor builds, does not link.

c2b6e40

Rewrite to use existing tensor allocator pass.

ef6f5ec

Fix multiple definition.

3cc4669

Fix up some tests or comment them out.

82991f6

Rip out task registry class, stub the static implementation.

00dd0a1

Implement task IDs.

ad28993

Rename computation graph instance.

b0b6002

Clean up includes.

3924b42

Fix accessor test.

892cfdb

Fixes for various test cases.

bd8c3dc

Remove obsolete sections of task argument accessor test.

d2aede0

First pass attempt to generate topological order for dataflow graph.

049c06d

Finish implementation of graph execution.

fcbd7fa

Refactor and lift topo order to initialization.

78647b0

Clean up and more implementation.

4a7fd2e

Call task implementation.

91e4921

Get everything to build and link.

50a3288

Implement rest of passes.

421f813

Update step.

75f1b61

elliottslaughter added 7 commits February 5, 2026 20:48

Clean up headers.

ec07cf2

Split out device state initialization to make it publically accessible.

67cd13c

And actually use the device initialization in the public interface.

f6fdd68

There's no real reason for local execution to take a dynamic tensor.

fac9328

Update header.

ca771b2

Put back the assert_unwrap.

025fbdc

Reorder parameters.

9ce21c5

lockshaw self-requested a review February 10, 2026 20:29

Merge branch 'master' into local-execution

216f1f9

lockshaw requested changes Feb 11, 2026

View reviewed changes

lockshaw and others added 2 commits February 11, 2026 01:26

Minor fixes from PR review

e09d15d

Updates for PR feedback.

4d322a3

elliottslaughter commented Feb 11, 2026

View reviewed changes

Format.

de645e9

elliottslaughter commented Feb 11, 2026

View reviewed changes

Make tests pass.

60e62d3

elliottslaughter commented Feb 11, 2026

View reviewed changes

lockshaw reviewed Feb 11, 2026

View reviewed changes

elliottslaughter commented Feb 11, 2026

View reviewed changes

lockshaw requested changes Feb 11, 2026

View reviewed changes

More responses to feedback.

678b453

Update local-execution to newest task-spec #1625

Are you sure you want to change the base?

Update local-execution to newest task-spec #1625

Conversation

elliottslaughter commented Jan 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

elliottslaughter left a comment

Choose a reason for hiding this comment

Uh oh!

lockshaw left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

elliottslaughter commented Jan 17, 2026 •

edited

Loading