Conversation
Why not use something like Cargo publish protocol?
What about license? Should require it to be an SPDX expression. (EDIT: Agree! RESOLVED) |
Alaternatively, why not be even simpler: pass the PURL in some header (e.g. |
6d94e3f to
d1f6b99
Compare
Signed-off-by: Erik Sundell <erik.sundell@sensmetry.com> Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
|
@andrius-puksta-sensmetry I've now updated this PR after rework and more self-review, and I consider it ready to be reviewed again.
We could go for cargo's publish protocol, but there wasn't a clear why / why not for me here so I picked use of multipart form as it felt a bit more straightforward then using crates binary protocol. I think it is good if we can ensure that we can include misc metadata though, so just having a single application/zip POST form with headers seems too rigid. I for example added Do you want me to switch to Cargos binary protocol to deliver the metadata and file (but with different metadata content)? It also does the trick. |
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
Signed-off-by: Erik Sundell <erik.sundell+2025@sensmetry.com>
| #[error( | ||
| "publisher field `{0}` is invalid for modern project IDs: must be 3-50 characters, use only ASCII letters and numbers, may include single spaces or hyphens between words, and must start and end with a letter or number" | ||
| )] | ||
| InvalidPublisher(Box<str>), |
There was a problem hiding this comment.
| InvalidPublisher(Box<str>), | |
| InvalidPublisher(String), |
Is there a reason not to just use String here?
There was a problem hiding this comment.
To reduce the error size. Box<str> is 16 bytes, String is 24. Since errors are returned from many places and tend to be somewhat large, they dictate the size of the return types.
There was a problem hiding this comment.
(to be fair, here specifically this has no effect on error size)
There was a problem hiding this comment.
Does it matter to us if the enum type is a few bytes bigger? It seems like a micro optimization to me.
There was a problem hiding this comment.
It is a micro optimization. Doesn't hurt though.
| }; | ||
|
|
||
| /// Defensive upper bound on kpar file size (100 MiB) to catch unexpected uploads by mistake. | ||
| const MAX_KPAR_PUBLISH_SIZE: u64 = 100 * 1024 * 1024; |
There was a problem hiding this comment.
Currently Sysand doesn't support jsons in KPARs (I think), but they will be supported. And the sysml model jsons are really big, at least with the current zip compression, and I wouldn't be surprised if even a simple project will be more than 100 MiB.
This should be fine for now though.
|
|
||
| let info = info.ok_or(PublishError::MissingInfo)?; | ||
| // Validate that metadata exists; contents are not used during upload. | ||
| let _meta = meta.ok_or(PublishError::MissingMeta)?; |
There was a problem hiding this comment.
| let _meta = meta.ok_or(PublishError::MissingMeta)?; | |
| _ = meta.ok_or(PublishError::MissingMeta)?; |
| ); | ||
| Err(PublishError::ServerError { | ||
| status, | ||
| body: summarize_error_body(body_bytes), |
There was a problem hiding this comment.
Do we want to return these error details to the user in all cases?
The answer might be obvious, but I'm not familiar enough with with this to know for myself. Often the "500 internal server error" message is deliberately lacking in details to not accidentally expose sensitive information.
| /// Must start and end with an alphanumeric character. | ||
| /// | ||
| /// Publish-only; if additional surfaces need this, extract to a shared module. | ||
| fn is_valid_field(s: &str, allow_dot: bool) -> bool { |
There was a problem hiding this comment.
It would be nice to return a more precise error in case it is invalid. It would be better UX for people to immediately know what exactly sysand doesn't like about the name/publisher
| (None, Some(detail)) => detail.to_string(), | ||
| (None, None) => String::new(), | ||
| }; | ||
| if !message.is_empty() { |
There was a problem hiding this comment.
What happens if message is empty? The body was a valid json, so I feel like something should be returned from here either way, and not fall through to the non-json body handling
| fn summarize_error_text(text: &str) -> String { | ||
| let trimmed = text.trim(); | ||
| if trimmed.is_empty() { | ||
| return "empty response body".to_string(); |
There was a problem hiding this comment.
Is "empty response body" an error we want to show to the user? Wouldn't it be better to just return a generic description based on the error code (e.g. "Not Found" for 404)?
| /// Defensive upper bound on kpar file size (100 MiB) to catch unexpected uploads by mistake. | ||
| const MAX_KPAR_PUBLISH_SIZE: u64 = 100 * 1024 * 1024; | ||
| /// Maximum number of characters to include when summarizing an error response body. | ||
| const MAX_ERROR_BODY_CHARS: usize = 1024; |
There was a problem hiding this comment.
Are there any known cases when the errors would be really long?
Your choice, I don't have a strong opinion either way. Either way I think it would be better to have separate fields for |
| log::info!( | ||
| "{header}{:>12}{header:#} `{name}` {version} to {index}", | ||
| "Publishing", | ||
| ); |
There was a problem hiding this comment.
Maybe mention publisher as well?
|
|
||
| let build_request = move |c: &reqwest_middleware::ClientWithMiddleware| { | ||
| let file_part = reqwest::multipart::Part::stream(file_bytes.clone()) | ||
| .file_name(file_name.clone()) |
There was a problem hiding this comment.
Why not use Part::file(<path>)? Also, file_name serves no purpose; name will already identify the part.
There was a problem hiding this comment.
I see, stream is to clone cheaply.
|
|
||
| let form = reqwest::multipart::Form::new() | ||
| .part("metadata", metadata_part) | ||
| .part("file", file_part); |
There was a problem hiding this comment.
| .part("file", file_part); | |
| .part("kpar", file_part); |
More explicit.
| }); | ||
| } | ||
|
|
||
| let mut upload_url = index.clone(); |
There was a problem hiding this comment.
| let mut upload_url = index.clone(); | |
| let mut upload_url = index.to_owned(); |
| { | ||
| let mut segments = upload_url | ||
| .path_segments_mut() | ||
| .expect("http(s) URLs are hierarchical and must support mutable path segments"); |
There was a problem hiding this comment.
No need for error messages, simple unwrap + comment is fine, this is guaranteed to not fail for http(s)
| .path_segments() | ||
| .expect("http(s) URLs are hierarchical and must support path segments") | ||
| .collect(); | ||
| if path_segments.ends_with(&UPLOAD_ENDPOINT_SEGMENTS) { |
There was a problem hiding this comment.
This is too verbose and inefficient. Use .path() and unify UPLOAD_ENDPOINT_SEGMENTS into a single &str.
| { | ||
| let mut segments = upload_url | ||
| .path_segments_mut() | ||
| .expect("http(s) URLs are hierarchical and must support mutable path segments"); |
There was a problem hiding this comment.
| .expect("http(s) URLs are hierarchical and must support mutable path segments"); | |
| .unwrap(); |
Since this is the same op as above, no need to explain.
| #[error("HTTP request failed: {0}")] | ||
| Http(#[from] reqwest_middleware::Error), |
There was a problem hiding this comment.
| #[error("HTTP request failed: {0}")] | |
| Http(#[from] reqwest_middleware::Error), | |
| #[error("HTTP request failed: {0:#?}")] | |
| Http(#[from] reqwest_middleware::Error), |
Due to the way reqwest implements Error, crucial details get lost for reqwest_middleware::Error unless debug print is used.
| #[error("failed to read server response body: {0}")] | ||
| ResponseBody(#[source] reqwest::Error), |
There was a problem hiding this comment.
Same here
| #[error("failed to read server response body: {0}")] | |
| ResponseBody(#[source] reqwest::Error), | |
| #[error("failed to read server response body: {0:#?}")] | |
| ResponseBody(#[source] reqwest::Error), |
| NotFound(String), | ||
|
|
||
| #[error( | ||
| "kpar file is unexpectedly large ({size} bytes, limit is {limit} bytes); verify you are publishing the correct file" |
There was a problem hiding this comment.
| "kpar file is unexpectedly large ({size} bytes, limit is {limit} bytes); verify you are publishing the correct file" | |
| "KPAR file is unexpectedly large ({size} bytes, limit is {limit} bytes); verify you are publishing the correct file" |
KPAR is an abbreviation
| // url is declared a String, but would ideally be declared an Url. However, | ||
| // that would come with challenges as Url provides no Default impl, which | ||
| // makes it impossible to use #[derive(Default)] on this struct. | ||
| pub url: String, |
There was a problem hiding this comment.
Semantically doesn't make sense to implement Default for this. It only seems to be useful in a few tests. For convenience can expose function from_url that populates only url field.
Summary
This PR introduces a new
sysand publishcommand to upload.kparartifacts to a sysand package index.What’s Included
sysand publish --index <URL> [PATH][PATH]explicitly, or defaults tooutput/<name>-<version>.kparfor the current project<index>/api/v1/upload:metadataJSON with:purl(pkg:sysand/<publisher>/<name>@<version>)sha256_digestof the.kparfilewith .kpar data (application/zip)versionmust be valid SemVer 2.0licensemust be valid SPDX expressionpublisher/namemust follow strict modern ID rulesnameare preserved)http/https/api/v1/upload401/403/404/409+ generic server errors with summarized body)Docs
docs/src/commands/publish.mdsysand publishentry to docs summary navigationTests
Dependency/Config Updates
multipartfeatures forreqwest/reqwest-middlewarewhere needed for upload support