Conversation
76ccf41 to
0371b3c
Compare
6d30b34 to
654d07b
Compare
|
Updated to rework OCI precomposition. The way I did it before resulted in two code paths which were essentially the same but couldn't be exactly the same. Now there is only one code path and it is exactly the same as itself. I am getting increasingly fearful that I am getting lost in a labyrinth of my own changes and although it all seems to work for me I would really value somebody else looking at it and kicking the tyres because what if I am not testing it right. I was sure that this last rework had broken something, and yet seemingly not, which makes me worry whether I am testing what I think I'm testing. |
867c0be to
c08e5d1
Compare
|
One thing that is troubling me with all this is isolation behaviour. Like if you want a middleware to call an auth provider, then it needs network permissions. So:
We can probably work around 1 in the increasingly terrifying normalisation step but 2 is a gazillion times more complicated. The only way around that today is to use a service chaining approach where it passes through the host rather than being directly composed. (Which is do-able, but a non-trivial rewrite, and we'd need to check that this would work with downstream hosts who don't support service chaining.) |
| if self.precompose_only { | ||
| let Some(precompose_component_id) = self.precompose_component_id.as_ref() else { | ||
| anyhow::bail!("got --precompose-only but no --precompose-component-id"); | ||
| }; |
There was a problem hiding this comment.
Was this added to aid in the research? I'm not sure what's happening here.
There was a problem hiding this comment.
This is used by the OCI subsystem during precomposition. spin registry push needs to precompose Wasm binaries for cases like SpinKube where composition at load time isn't available. But the OCI subsystem doesn't know how to deal with HTTP middleware because that's an HTTP concern, not an OCI concern. So when the OCI subsystem sees complications in a trigger, such as HTTP middleware, it issues issues a command to the relevant trigger saying "hey, you're so smart, you precompose this for me."
|
|
||
| /// An object which composes extras onto the primary component. | ||
| /// TODO: the combination of functions and objects and traits is a bit funny and we may/should be able to streamline it. | ||
| fn complicator() -> impl spin_factors_executor::Complicator { |
There was a problem hiding this comment.
I know complicator is a placeholder so I'll suggest we use something generic here like fn compose_extras() -> impl spin_factors_executor::ComposeExtras.
There was a problem hiding this comment.
Yeah I think I originally had "extras" but if anything I hated that even more than "complication." I am not sure what a good name would be!
| middleware_blobs: impl Iterator<Item = &'a ComplicationData>, | ||
| ) -> anyhow::Result<Vec<u8>> { | ||
| const MW_NEXT_INBOUND: &str = "wasi:http/handler@0.3.0-rc-2026-01-06"; | ||
| const MW_NEXT_OUTBOUND: &str = "wasi:http/handler@0.3.0-rc-2026-01-06"; |
There was a problem hiding this comment.
Can we unify these into a single const MW_HANDLER_INTERFACE: &str = "wasi:http/handler@0.3.0-rc-2026-01-06";?
There was a problem hiding this comment.
We probably can. Originally they were different, and I guess I wanted to retain clarity about which context we were in at any given point in case we ever needed to prise them apart again. But yeah that does seem unlikely.
crates/loader/src/local.rs
Outdated
| /// but with different middleware. In this case, we will synthesise a component | ||
| /// for each such trigger, with the same main configuration but with its own | ||
| /// "extra" components. | ||
| fn reassign_extra_components_from_triggers(mut locked: LockedApp) -> LockedApp { |
There was a problem hiding this comment.
I think we can avoid cloning the locked app if we break this up into 3 parts:
- Collect a lightweight
TriggerInfostruct containing thetrigger_id,component_idand whether "extras" are present. - Use those to determine splitting, then clone only the individual components that need it.
- Move extras from triggers to components.
Something like (untested):
/// We want all component/composition graph information to be in the component,
/// because the component ID is how Spin looks this stuff up. So if a trigger
/// contains a `components` table, e.g. specifying middleware, we want to move
/// that to the component.
///
/// But it's possible to have two triggers pointing to the same primary component,
/// but with different middleware. In this case, we will synthesise a component
/// for each such trigger, with the same main configuration but with its own
/// "extra" components.
fn reassign_extra_components_from_triggers(mut locked: LockedApp) -> LockedApp {
use std::collections::{HashMap, HashSet};
fn trigger_component_id(trigger: &LockedTrigger) -> Option<String> {
trigger
.trigger_config
.get("component")
.and_then(|v| v.as_str())
.map(|s| s.to_string())
}
fn extra_components(trigger: &LockedTrigger) -> Option<&ValuesMap> {
trigger
.trigger_config
.get("components")
.and_then(|v| v.as_object())
}
fn has_extra_components(trigger: &LockedTrigger) -> bool {
extra_components(trigger).is_some_and(|xcs| !xcs.is_empty())
}
// Collect lightweight trigger metadata to determine which components
// need splitting. This avoids cloning the entire LockedApp.
struct TriggerInfo {
trigger_id: String,
component_id: String,
has_extras: bool,
}
let trigger_infos: Vec<TriggerInfo> = locked
.triggers
.iter()
.filter_map(|t| {
trigger_component_id(t).map(|cid| TriggerInfo {
trigger_id: t.id.clone(),
component_id: cid,
has_extras: has_extra_components(t),
})
})
.collect();
// Group triggers by component ID, find components that need splitting
// (multiple triggers reference the same component AND at least one has extras).
let mut cid_triggers: HashMap<&str, Vec<&TriggerInfo>> = HashMap::new();
for info in &trigger_infos {
cid_triggers
.entry(&info.component_id)
.or_default()
.push(info);
}
let mut seen = HashSet::new();
let mut disambiguator = 0;
// For components referenced by multiple triggers where at least
// one has extras, create synthetic component clones so each trigger gets
// its own composition graph.
let needs_splitting: Vec<_> = cid_triggers
.into_iter()
.filter(|(_, triggers)| {
triggers.len() > 1 && triggers.iter().any(|t| t.has_extras)
})
.collect();
for (cid, triggers) in &needs_splitting {
for info in triggers {
if !info.has_extras {
// Unenriched triggers can continue pointing to the original component.
continue;
}
let mut synthetic_id = format!("{cid}-for-{}", info.trigger_id);
if seen.contains(&synthetic_id) {
disambiguator += 1;
synthetic_id = format!("{synthetic_id}-d{disambiguator}");
}
seen.insert(synthetic_id.clone());
let component = locked
.components
.iter()
.find(|c| c.id == **cid)
.expect("trigger references non-existent component")
.clone();
let mut synthetic = component;
synthetic.id = synthetic_id.clone();
locked.components.push(synthetic);
// Update the trigger to point to the new synthetic component
let trigger = locked
.triggers
.iter_mut()
.find(|t| t.id == info.trigger_id)
.expect("trigger disappeared during splitting");
trigger
.trigger_config
.as_object_mut()
.expect("trigger config should be an object")
.insert("component".into(), synthetic_id.into());
}
}
// Move extras from triggers onto their respective components.
// Now each enriched trigger has its own component, so composition graphs
// are uniquely identified by component ID.
for trigger in &mut locked.triggers {
if let Some(extras) = extra_components(trigger) {
if let Some(component_id) = trigger_component_id(trigger) {
if let Some(component) = locked.components.iter_mut().find(|c| c.id == component_id)
{
component
.metadata
.insert("trigger-extras".into(), extras.clone().into());
component.metadata.insert(
"resolve-extras-using".into(),
trigger.trigger_type.clone().into(),
);
trigger
.trigger_config
.as_object_mut()
.expect("trigger config should be an object")
.remove("components");
}
}
}
}
locked
}There was a problem hiding this comment.
I had another look at this, and ended up tackling it a different way - I've moved this huge pile of code to its own module, broke out the helper functions, defaffed the synthetic ID stuff, and broke out needs_splitting (among other bits) as an additional helper. This limits the clone antics to a few IDs and things in the needs_splitting function. (It does a bit clone more stuff than your version but honestly with all the nonsense we get up to during load I am not going to shed any tears over this.)
I reckon this makes the main flow way clearer, although if the borrow checker were not intent on THWARTING ME AT EVERY TURN I could have made it even less ghastly...
Capability isolation remains a thorn for composition. We either need to accept that middlewares inherit capability from the primary component (or alternatively deny-all by default via some adapter like we do for dependencies with an opt-in) or what you suggest with service-chaining (which I prefer less for the reason you postulate). |
9fe7582 to
d5516d8
Compare
Signed-off-by: itowlson <ivan.towlson@fermyon.com>
d5516d8 to
c719d64
Compare
Fixes #3291.
This currently includes #3383; once that lands I will rebase. For now just ignore theDone!witdirectory.Example usage: https://github.com/itowlson/spin-middleware-terrifying-nonsense
Setting it up in spin.toml:
Implementing a middleware component in Rust (today: we'd presumably provide a wrapper for the onbound
handleimport):Needs lots of renaming, tests, etc. but anyway getting it into the system. (and yes, yes, I'll squash the commits)