Kdae ctrl #19

Open
rerpha wants to merge 61 commits into main from kdae_ctrl

Conversation

@rerpha
Collaborator

@rerpha rerpha commented Apr 21, 2026

Closes ISISComputingGroup/DataStreaming#24
This adds an architectural skeleton for kafka_dae_control that provides basic functionality, i.e. the ability to begin and end a run and to see whether the hardware is running.

Beyond this I have added several tickets to flesh the IOC out.

Notably, I am going to leave the OPI to ISISComputingGroup/DataStreaming#25, as currently the only thing you can actually do is begin/end a run.

@rerpha rerpha force-pushed the kdae_ctrl branch 2 times, most recently from c95baf4 to b59f300 April 22, 2026 15:26
@rerpha rerpha marked this pull request as ready for review May 6, 2026 10:07
Member

@Tom-Willemsen Tom-Willemsen left a comment

Thanks, this looks like a solid foundation.

Sorry I've put a lot of comments in the code; most of them are small nit-picks.

I managed to do a very small functional review but something has now bricked the board...?

Architectural

As discussed in person, I think it would also be worth prototyping an action-queue based design, where "updates" (from e.g. UDP, Kafka, camonitor) go onto a queue and trigger the side-effects (changes to the Data object, or adding more "updates" onto the queue). We should try it together, but it might give us a threading model that is easier to reason about, which might sidestep some of the more detailed threading/concurrency comments I've put elsewhere.
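
A minimal sketch of what such an action-queue design could look like. Everything here is an assumption for illustration: the `Data` and `Update` classes, the update names, and the `apply_update` function are hypothetical and not taken from kafka_dae_control. The point is only that producers on any thread enqueue updates, and a single worker thread is the only thing that touches `Data`.

```python
import queue
import threading
from dataclasses import dataclass, field


@dataclass
class Data:
    """Hypothetical stand-in for the shared Data object."""

    run_state: str = "SETUP"
    log: list = field(default_factory=list)


@dataclass(frozen=True)
class Update:
    """One queued update; source is e.g. "udp", "kafka" or "camonitor"."""

    source: str
    name: str


def apply_update(data, update, updates):
    """Apply the side effects of one update. Only called from the worker thread."""
    data.log.append(f"{update.source}:{update.name}")
    if update.name == "begin":
        data.run_state = "RUNNING"
        # A side effect can enqueue follow-up updates rather than acting directly.
        updates.put(Update("internal", "publish_run_start"))
    elif update.name == "end":
        data.run_state = "SETUP"


def worker(data, updates):
    """Drain the queue on one thread, so Data itself needs no locking."""
    while True:
        update = updates.get()
        try:
            if update is None:  # sentinel: stop the worker
                return
            apply_update(data, update, updates)
        finally:
            updates.task_done()


data = Data()
updates = queue.Queue()
thread = threading.Thread(target=worker, args=(data, updates))
thread.start()

# Producers (UDP listener, Kafka consumer, camonitor callbacks) would only call put().
updates.put(Update("camonitor", "begin"))
updates.put(Update("camonitor", "end"))
updates.join()  # wait for queued updates, including follow-ups, to be applied
updates.put(None)
thread.join()
```

With this shape, the ordering of side effects is simply the queue order, and the only object shared between threads is the queue.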

Troubleshooting / docs

At one point I think the FPGA went bad on me, and I got a crash like:

INFO:kafka_dae_control.static_pvs:begin
ERROR:kafka_dae_control.run_state:Failed to start run:
Traceback (most recent call last):
  File "/home/ds/kafka_dae_control/src/kafka_dae_control/run_state.py", line 83, in on_run_state_change
    write_verify(
  File "/home/ds/kafka_dae_control/src/kafka_dae_control/comms.py", line 54, in write_verify
    raise OSError(
OSError: (192.168.1.250) Could not write 1 32 bit words to address 0 (set data=19, readback=0)
INFO:kafka_dae_control.static_pvs:end
ERROR:kafka_dae_control.run_state:Failed to end run:
Traceback (most recent call last):
  File "/home/ds/kafka_dae_control/src/kafka_dae_control/run_state.py", line 114, in on_run_state_change
    write_and_inv_then_verify(
  File "/home/ds/kafka_dae_control/src/kafka_dae_control/comms.py", line 89, in write_and_inv_then_verify
    write_verify(sock, host, address, new_value, count, verify)
  File "/home/ds/kafka_dae_control/src/kafka_dae_control/comms.py", line 54, in write_verify
    raise OSError(
OSError: (192.168.1.250) Could not write 1 32 bit words to address 0 (set data=50, readback=51)

Can we start some troubleshooting docs about this?

In general - I think we need some "high-level" docs about what the assumptions/preconditions etc of this program are.

Functional

If I end the same run multiple times, I get multiple RunStops in Kafka - is that what we want? Should it check the run state and ignore the request if already in SETUP?
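
A rough sketch of the kind of guard this question implies, assuming the answer is "ignore repeated ends". The `RunController` class, the two-state `RunState` enum, and the `run_stops_sent` counter (a stand-in for publishing a RunStop to Kafka) are all hypothetical and not the actual code in run_state.py.

```python
import logging
from enum import Enum

logger = logging.getLogger(__name__)


class RunState(Enum):
    SETUP = "SETUP"
    RUNNING = "RUNNING"


class RunController:
    """Hypothetical controller that ignores begin/end requests that would be no-ops."""

    def __init__(self):
        self.state = RunState.SETUP
        self.run_stops_sent = 0  # stand-in for RunStop messages published to Kafka

    def begin(self):
        if self.state is RunState.RUNNING:
            logger.warning("Ignoring begin: already RUNNING")
            return False
        self.state = RunState.RUNNING
        return True

    def end(self):
        if self.state is RunState.SETUP:
            # Already ended: do not publish a second RunStop.
            logger.warning("Ignoring end: already in SETUP")
            return False
        self.state = RunState.SETUP
        self.run_stops_sent += 1
        return True


controller = RunController()
controller.begin()
first_end = controller.end()
second_end = controller.end()
```

Here the second `end()` returns `False` and leaves the RunStop count at one. Whether a repeated end should be silently ignored, logged, or reported back to the caller as an error is still the open question.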

Comment thread doc/_static/css/custom.css Outdated
Comment thread doc/local_development.md
Comment thread src/kafka_dae_control/blocks.py Outdated
Comment thread src/kafka_dae_control/comms.py Outdated
Comment thread src/kafka_dae_control/comms.py Outdated
@@ -0,0 +1,166 @@
"""Utilities for communicating to a UDP device such as a streaming control board."""
Member

Please can you add some documentation about how you guarantee reads/writes in this module don't interleave with each other when called from multiple threads, or whether that's the responsibility of the caller (I think it should probably be the responsibility of this module).

Conceptually I think you likely need a (shared) lock at the top level around each function for this to be threadsafe.

But whichever approach we choose - whether it's a lock or a queue or something else, we need to document exactly what the behaviour is if these functions are called from multiple threads concurrently.

Returns: None

"""
with state_file.open("w", encoding="utf-8") as file:
Member

This seems like it probably needs a lock around it to prevent multiple threads from calling it concurrently.
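
A small sketch of that idea, under stated assumptions: the function name `save_state`, the JSON format, and the `run_number` key are hypothetical, since the real function body is not shown in this thread. Only the `state_file.open("w", encoding="utf-8")` call is taken from the diff.

```python
import json
import tempfile
import threading
from pathlib import Path

# Guards every write to the state file so writers cannot overlap.
_STATE_FILE_LOCK = threading.Lock()


def save_state(state_file, state):
    """Write the state to disk; only one thread may be inside at a time."""
    with _STATE_FILE_LOCK:
        with state_file.open("w", encoding="utf-8") as file:
            json.dump(state, file)


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    threads = [
        threading.Thread(target=save_state, args=(state_file, {"run_number": n}))
        for n in range(8)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # Whichever writer went last wins, but the file is always one complete document.
    final_state = json.loads(state_file.read_text(encoding="utf-8"))
```

The lock only protects against threads in the same process; it does not make the write atomic with respect to a crash mid-write or to another process reading the file.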

Comment thread src/kafka_dae_control/config.py Outdated
Comment thread src/kafka_dae_control/config.py Outdated
Comment thread src/kafka_dae_control/cli.py Outdated
Comment thread src/kafka_dae_control/data.py Outdated
@rerpha rerpha requested a review from Tom-Willemsen May 12, 2026 10:34

Development

Successfully merging this pull request may close these issues.

prototype a process which talks to the FPGAs and VXI streaming control board (kafka_dae_control) TIMEBOX [1 Week]

2 participants