From 51840c735ea68a9a2987538129feb5cc88abf013 Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Fri, 27 Mar 2026 16:44:03 -0600 Subject: [PATCH 1/3] Resource API updates for v5 --- .../harper-applications-in-depth.mdx | 9 +- reference/resources/overview.md | 114 ++++ reference/resources/query-optimization.md | 219 ++++++++ reference/resources/resource-api.md | 526 ++++++++++++++++++ .../version-v4/resources/overview.md | 4 +- .../version-v4/rest/querying.md | 7 + 6 files changed, 871 insertions(+), 8 deletions(-) create mode 100644 reference/resources/overview.md create mode 100644 reference/resources/query-optimization.md create mode 100644 reference/resources/resource-api.md diff --git a/learn/developers/harper-applications-in-depth.mdx b/learn/developers/harper-applications-in-depth.mdx index bdd6037a..24346f14 100644 --- a/learn/developers/harper-applications-in-depth.mdx +++ b/learn/developers/harper-applications-in-depth.mdx @@ -373,8 +373,7 @@ function calculateHumanAge(dogAge) { } export class DogWithHumanAge extends tables.Dog { - static loadAsInstance = false; - async async get(target) { + static async get(target) { const dogRecord = await super.get(target); return { @@ -433,8 +432,7 @@ Notably, did you see how we were able to use the `001` id with the new resource ```javascript export class DogWithHumanAge extends tables.Dog { - static loadAsInstance = false; - async get(target) { + static async get(target) { // ... } } @@ -444,8 +442,7 @@ The `DogWithHumanAge` class extends from `tables.Dog`. 
The `tables` reference is ```javascript export class DogWithHumanAge extends tables.Dog { - static loadAsInstance = false; - async get(target) { + static async get(target) { const dogRecord = await super.get(target); return { diff --git a/reference/resources/overview.md b/reference/resources/overview.md new file mode 100644 index 00000000..d6c5178b --- /dev/null +++ b/reference/resources/overview.md @@ -0,0 +1,114 @@ +--- +title: Resources Overview +--- + + + + +# Resources + +Harper's Resource API is the foundation for building custom data access logic and connecting data sources. Resources are JavaScript classes that define how data is accessed, modified, subscribed to, and served over HTTP, MQTT, and WebSocket protocols. + +## What Is a Resource? + +A **Resource** is a class that provides a unified interface for a set of records or entities. Harper's built-in tables extend the base `Resource` class, and you can extend either `Resource` or a table class to implement custom behavior for any data source — internal or external. + +Added in: v4.2.0 + +The Resource API is designed to mirror REST/HTTP semantics: methods map directly to HTTP verbs (`get`, `put`, `patch`, `post`, `delete`), making it straightforward to build API endpoints alongside custom data logic. + +## Relationship to Other Features + +- **Database tables** extend `Resource` automatically. You can use tables through the Resource API without writing any custom code. +- The **REST plugin** maps incoming HTTP requests to Resource methods. See [REST Overview](TODO:reference_versioned_docs/version-v4/rest/overview.md 'REST plugin reference'). +- The **MQTT plugin** routes publish/subscribe messages to `publish` and `subscribe` Resource methods. See [MQTT Overview](TODO:reference_versioned_docs/version-v4/mqtt/overview.md 'MQTT plugin reference'). +- **Global APIs** (`tables`, `databases`, `transaction`) provide access to resources from JavaScript code. 
+- The **`jsResource` plugin** (configured in `config.yaml`) registers a JavaScript file's exported Resource classes as endpoints. + +## Extending a Table + +The most common use case is extending an existing table to add custom logic. + +Starting with a table definition in a `schema.graphql`: + +```graphql +# Omit the `@export` directive +type MyTable @table { + id: Long @primaryKey + # ... +} +``` + +> For more info on the schema API see [`Database / Schema`]() + +Then, in a `resources.js` extend from the `tables.MyTable` global: + +```javascript +export class MyTable extends tables.MyTable { + static async get(target) { + // get the record from the database + const record = await super.get(target); + // add a computed property before returning + return { ...record, computedField: 'value' }; + } + + static async post(target, data) { + // custom action on POST + this.create({ ...(await data), status: 'pending' }); + } +} +``` + +Finally, ensure everything is configured appropriately: + +```yaml +rest: true +graphqlSchema: + files: schema.graphql +jsResource: + files: resources.js +``` + +## Custom External Data Source + +You can also extend the base `Resource` class directly to implement custom endpoints, or even wrap an external API or service as a custom caching layer: + +```javascript +export class CustomEndpoint extends Resource { + static get(target) { + return { + data: doSomething(), + }; + } +} + +export class MyExternalData extends Resource { + static async get(target) { + const response = await fetch(`https://api.example.com/${target.id}`); + return response.json(); + } + + static async put(target, data) { + return fetch(`https://api.example.com/${target.id}`, { + method: 'PUT', + body: JSON.stringify(await data), + }); + } +} + +// Use as a cache source for a local table +tables.MyCache.sourcedFrom(MyExternalData); +``` + +Resources are the true customization point for Harper. This is where the business logic of a Harper application really lives. 
There is a lot more to this API than these examples show. Ensure you fully review the [Resource API](./resource-api.md) documentation, and consider exploring the Learn guides for more information. + +## Exporting Resources as Endpoints + +Resources become HTTP/MQTT endpoints when they are exported. As the examples demonstrate, if a Resource extends an existing table, make sure the schema and the JavaScript implementation do not have conflicting exports. Alternatively, you can register resources programmatically using `server.resources.set()`. See [HTTP API](../http/api.md) for server extension documentation. + +## Pages in This Section + +| Page | Description | +| --------------------------------------------- | --------------------------------------------------------------------------------------------------------------- | +| [Resource API](./resource-api.md) | Complete reference for instance methods, static methods, the Query object, RequestTarget, and response handling | +| [Query Optimization](./query-optimization.md) | How Harper executes queries and how to write performant conditions | diff --git a/reference/resources/query-optimization.md b/reference/resources/query-optimization.md new file mode 100644 index 00000000..c5f3648f --- /dev/null +++ b/reference/resources/query-optimization.md @@ -0,0 +1,219 @@ +--- +title: Query Optimization +--- + + + + +# Query Optimization + +Added in: v4.3.0 (query planning and execution improvements) + +Harper has powerful query functionality with excellent performance characteristics. Like any database, different queries can vary significantly in performance. Understanding how querying works helps you write queries that perform well as your dataset grows. + +## Query Execution + +At a fundamental level, querying involves defining conditions to find matching data and then executing those conditions against the database. Harper supports indexed fields, and these indexes are used to speed up query execution.
+ +When conditions are specified in a query, Harper attempts to utilize indexes to optimize the speed of query execution. When a field is not indexed, Harper checks each potential record to determine if it matches the condition — this is a full table scan and degrades as data grows (`O(n)`). + +When a query has multiple conditions, Harper attempts to optimize their execution order. For intersecting conditions (the default `and` operator), Harper applies the most selective and performant condition first. If one condition can use an index and is more selective than another, it is used first to narrow the candidate set before filtering on the remaining conditions. + +The `search` method supports an `explain` flag that returns the query execution order Harper determined, useful for debugging and optimization: + +```javascript +const result = await MyTable.search({ + conditions: [...], + explain: true, +}); +``` + +For union queries (`or` operator), each condition is executed separately and the results are merged. + +## Conditions, Operators, and Indexing + +When a query is executed, conditions are evaluated against the database. Indexed fields significantly improve query performance. 
+ +### Index Performance Characteristics + +| Operator | Uses index | Notes | +| -------------------------------------------------------------------- | ------------------ | ------------------------------------------------------------------------ | +| `equals` | Yes | Fast lookup in sorted index | +| `greater_than`, `greater_than_equal`, `less_than`, `less_than_equal` | Yes | Range scan in sorted index; narrower range = faster | +| `starts_with` | Yes | Prefix search in sorted index | +| `not_equal` | No | Full scan required (unless combined with selective indexed condition) | +| `contains` | No | Full scan required | +| `ends_with` | No | Full scan required | +| `!= null` | Yes (special case) | Can use indexes to find non-null records; only helpful for sparse fields | + +**Rule of thumb**: Use `equals`, range operators, and `starts_with` on indexed fields. Avoid `contains`, `ends_with`, and `not_equal` as the sole or first condition in large datasets. + +### Indexed vs. Non-Indexed Fields + +Indexed fields provide `O(log n)` lookup — fast even as the dataset grows. Non-indexed fields require `O(n)` full table scans. + +Trade-off: indexes speed up reads but add overhead to writes (insert/update/delete must update the index). This is usually worth it for frequently queried fields. + +### Primary Key vs. Secondary Index + +Querying on a **primary key** is faster than querying on a secondary (non-primary) index, because the primary key directly addresses the record without cross-referencing. + +Secondary indexes are still valuable for query conditions on other fields, but expect slightly more overhead than primary key lookups. + +### Cardinality + +More unique values (higher cardinality) = more efficient indexed lookups. For example, an index on a boolean field has very low cardinality (only two possible values) and is less efficient than an index on a `UUID` field. High-cardinality fields benefit most from indexing. 
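The `O(log n)` versus `O(n)` difference described above can be illustrated with a small, self-contained sketch. It does not use Harper or its storage engine; the in-memory `records` array, the `skuIndex` structure, and both lookup helpers are hypothetical stand-ins that only count how many comparisons each strategy needs.

```javascript
// Hypothetical in-memory data set; this is NOT how Harper stores records.
const records = Array.from({ length: 1000 }, (_, i) => ({
  id: i,
  sku: `SKU-${String(i).padStart(4, '0')}`,
}));

// Full scan: test each record in turn until one matches, O(n).
function scanLookup(rows, sku) {
  let comparisons = 0;
  for (const row of rows) {
    comparisons++;
    if (row.sku === sku) return { row, comparisons };
  }
  return { row: undefined, comparisons };
}

// Sorted "index": [key, record] pairs ordered by key.
const skuIndex = records
  .map((row) => [row.sku, row])
  .sort((a, b) => (a[0] < b[0] ? -1 : a[0] > b[0] ? 1 : 0));

// Binary search over the sorted index, O(log n).
function indexLookup(index, sku) {
  let comparisons = 0;
  let low = 0;
  let high = index.length - 1;
  while (low <= high) {
    const mid = (low + high) >> 1;
    comparisons++;
    const key = index[mid][0];
    if (key === sku) return { row: index[mid][1], comparisons };
    if (key < sku) low = mid + 1;
    else high = mid - 1;
  }
  return { row: undefined, comparisons };
}

const scanned = scanLookup(records, 'SKU-0999');
const indexed = indexLookup(skuIndex, 'SKU-0999');
console.log(`scan comparisons: ${scanned.comparisons}, index comparisons: ${indexed.comparisons}`);
```

With 1,000 records the scan touches every record in the worst case, while the sorted lookup needs at most about ten comparisons. The gap widens as the data set grows, which is the reason to index frequently queried fields.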
+ +## Relationships and Joins + +Harper supports relationship-based queries that join data across tables. See [Schema documentation](TODO:reference_versioned_docs/version-v4/database/schema.md 'Database schema section with relationship directives') for how to define relationships. + +Join queries involve more lookups and naturally carry more overhead. The same indexing principles apply: + +- Conditions on joined table fields should use indexed columns for best performance. +- If a relationship uses a foreign key, that foreign key should be indexed in both tables. +- Higher cardinality foreign keys make joins more efficient. + +Example of an indexed foreign key that enables efficient join queries: + +```graphql +type Product @table { + id: Long @primaryKey + brandId: Long @indexed # foreign key — index this + brand: Related @relation(from: "brandId") +} +type Brand @table { + id: Long @primaryKey + name: String @indexed # indexed — enables efficient brand.name queries + products: Product @relation(to: "brandId") +} +``` + +Added in: v4.3.0 + +## Sorting + +Sorting can significantly impact query performance. + +- **Aligned sort and index**: If the sort attribute is the same indexed field used in the primary condition, Harper can use the index to retrieve results already in order — very fast. +- **Unaligned sort**: If the sort is on a different field than the condition, or the sort field is not indexed, Harper must retrieve and sort all matching records. For large result sets this can be slow, and it also **defeats streaming** (see below). + +Best practice: sort on the same indexed field you are filtering on, or sort on a secondary indexed field with a narrow enough condition to produce a manageable result set. + +## Streaming + +Harper can stream query results — returning records as they are found rather than waiting for the entire query to complete. This improves time-to-first-byte for large queries and reduces peak memory usage. 
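As a rough illustration of why an unaligned sort defeats streaming, the sketch below contrasts a consumer that handles the first match immediately with one that must collect every match before sorting. It is plain JavaScript, not Harper code: `findMatches`, `streamFirst`, `sortedFirst`, and `demo` are hypothetical helpers, and the async generator only stands in for a query result.

```javascript
// Hypothetical stand-in for a query result: yields matching rows one at a time.
async function* findMatches(rows, predicate, log) {
  for (const row of rows) {
    if (predicate(row)) {
      log.push(`found ${row.id}`);
      yield row;
    }
  }
}

// Streaming: the first match can be consumed before any other row is examined.
async function streamFirst(rows, predicate, log) {
  for await (const row of findMatches(rows, predicate, log)) {
    log.push(`consumed ${row.id}`);
    return row; // leaving the loop also closes the generator
  }
}

// Unaligned sort: every match must be collected before the first result is known.
async function sortedFirst(rows, predicate, compare, log) {
  const all = [];
  for await (const row of findMatches(rows, predicate, log)) all.push(row);
  all.sort(compare);
  log.push(`consumed ${all[0]?.id}`);
  return all[0];
}

async function demo() {
  const rows = [
    { id: 1, price: 30 },
    { id: 2, price: 10 },
    { id: 3, price: 20 },
  ];
  const cheap = (row) => row.price < 100;
  const streamLog = [];
  const sortedLog = [];
  await streamFirst(rows, cheap, streamLog);
  await sortedFirst(rows, cheap, (a, b) => a.price - b.price, sortedLog);
  return { streamLog, sortedLog };
}

demo().then((logs) => console.log(logs));
```

In the streaming case only one row is found before it is consumed. In the sorted case all three rows are found, held in memory, and ordered before the first one can be returned, which is the cost paid in time-to-first-byte and peak memory.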
+ +**Streaming is defeated** when: + +- A sort order is specified that is not aligned with the condition's index +- The full result set must be materialized to perform sorting + +When streaming is possible, results are returned as an `AsyncIterable`: + +```javascript +for await (const record of MyTable.search({ conditions: [...] })) { + // process each record as it arrives +} +``` + +Failing to iterate the `AsyncIterable` to completion keeps a read transaction open, degrading performance. Always ensure you either fully iterate or explicitly release the query. + +### Draining or Releasing a Query + +An open query holds an active read transaction. While that transaction is open, the underlying data pages and internal state for the query cannot be freed — they remain pinned in memory until the transaction closes. In long-running processes or under high concurrency, accumulating unreleased transactions degrades throughput and increases memory pressure. + +The transaction closes automatically once the `AsyncIterable` is fully iterated. If you need to stop early, you must explicitly signal that iteration is complete so Harper can release the transaction. + +**Breaking out of a `for await...of` loop** is the most natural way. The JavaScript runtime automatically calls `.return()` on the iterator when a `break`, `return`, or `throw` exits the loop: + +```javascript +for await (const record of MyTable.search({ conditions: [...] })) { + if (meetsStopCriteria(record)) { + break; // iterator.return() is called automatically — transaction is released + } + process(record); +} +``` + +**Calling `.return()` manually** is useful when you hold an iterator reference directly: + +```javascript +const iterator = MyTable.search({ conditions: [...] 
})[Symbol.asyncIterator](); +try { + const { value } = await iterator.next(); + process(value); +} finally { + await iterator.return(); // explicitly closes the iterator and releases the transaction +} +``` + +Avoid storing an iterator and abandoning it (e.g. never using it in a for-loop or calling `.return()`), as the transaction will remain open until the iterator is garbage collected — which is non-deterministic. + +## Practical Guidance + +### Index fields you query on frequently + +```graphql +type Product @table { + id: Long @primaryKey + name: String @indexed # queried frequently + category: String @indexed # queried frequently + description: String # not indexed (rarely in conditions) +} +``` + +### Use `explain` to diagnose slow queries + +```javascript +const result = await Product.search({ + conditions: [ + { attribute: 'category', value: 'electronics' }, + { attribute: 'price', comparator: 'less_than', value: 100 }, + ], + explain: true, +}); +// result shows the actual execution order Harper selected +``` + +### Prefer selective conditions first + +When Harper cannot auto-reorder (e.g. 
with `enforceExecutionOrder`), put the most selective condition first: + +```javascript +// Better: indexed, selective condition first +Product.search({ + conditions: [ + { attribute: 'sku', value: 'ABC-001' }, // exact match on indexed unique field + { attribute: 'active', value: true }, // low cardinality filter + ], +}); +``` + +### Use `limit` and `offset` for pagination + +```javascript +Product.search({ + conditions: [...], + sort: { attribute: 'createdAt', descending: true }, + limit: 20, + offset: page * 20, +}); +``` + +### Avoid wide range queries on non-indexed fields + +```javascript +// Slow: non-indexed field with range condition +Product.search({ + conditions: [{ attribute: 'description', comparator: 'contains', value: 'sale' }], +}); + +// Better: use an indexed field condition to narrow first +Product.search({ + conditions: [ + { attribute: 'category', value: 'clothing' }, // indexed — narrows to subset + { attribute: 'description', comparator: 'contains', value: 'sale' }, // non-indexed, applied to smaller set + ], +}); +``` diff --git a/reference/resources/resource-api.md b/reference/resources/resource-api.md new file mode 100644 index 00000000..e8dcbee2 --- /dev/null +++ b/reference/resources/resource-api.md @@ -0,0 +1,526 @@ +--- +title: Resource API +--- + + + + + + + + + + +# Resource API + +Added in: v4.2.0 + +The Resource API provides a unified JavaScript interface for accessing, querying, modifying, and subscribing to data resources in Harper. Tables extend the base `Resource` class, and all resource interactions — whether from HTTP requests, MQTT messages, or application code — flow through this interface. + +A Resource class represents a collection of entities/records with methods for querying and accessing records and inserting/updating records. Instances of a Resource class represent a single record that can be modified through various methods or queries. 
A Resource instance holds the primary key/identifier and any pending updates to the record, so any instance methods can act on the record and have full access to this information during execution. + +Resource classes have static methods that directly map to RESTful methods or HTTP verbs (`get`, `put`, `patch`, `post`, `delete`), which can be called to interact with records, with general create, read, update, and delete capabilities. And these static methods can be overridden for defining custom API endpoint handling. + +## Resource Static Methods + +Static methods are defined on a Resource class and called when requests are routed to the resource. This is the preferred way to interact with tables and resources from application code. You can override these methods to define custom behavior for these methods and for HTTP requests. + +### `get(target: RequestTarget | Id, context?: Resource | Context): Promise | AsyncIterable` + +This can be called to retrieve a record by primary key. + +```javascript +// By primary key +const product = await Product.get(34); +``` + +The default `get` method returns a `RecordObject` — a frozen plain object with the record's properties plus `getUpdatedTime()` and `getExpiresAt()`. The record object is immutable because it represents the current state of the record in the database. + +`get` is also called for HTTP GET requests and is always called with a `RequestTarget` as the `target` parameter. When the request targets a single record (e.g. `/Table/some-id`), the default `get` returns a single record object. When the request targets a collection (e.g. `/Table/?name=value`), the `target.isCollection` property is `true` and the default behavior calls `search()`, returning an `AsyncIterable`. 
+ +```javascript +class MyResource extends Resource { + static get(target) { + const id = target.id; // primary key from URL path + const param = target.get('param1'); // query string param + const path = target.pathname; // path relative to resource + return super.get(target); // default: return the record + } +} +``` + +Again, the default `get` method (available through `super.get()` inside the `get` method) returns a frozen object. If you want a modified object from this record, you can copy the record and change properties. This is most conveniently done with the spread operator: + +```javascript + static async get(target) { + const record = await super.get(target); + const alteredRecord = { ...record, changedProperty: 'value' }; + return alteredRecord; + } +``` + +The return value of `get` method on a `Table` is a `RecordObject`, which has the full state of the record as properties. + +#### RecordObject + +The `get()` method returns a `RecordObject` — a frozen plain object with all record properties, plus: + +- `getUpdatedTime(): number` — Last updated time (milliseconds since epoch) +- `getExpiresAt(): number` — Expiration time, if set + +--- + +### `search(query: RequestTarget): AsyncIterable` + +`search` performs a query on the resource or table. This is called by `get()` on collection requests and can be overridden to define custom query behavior. The default implementation on tables queries by the `conditions`, `limit`, `offset`, `select`, and `sort` properties parsed from the URL. See [Query Object](#query-object) below for available query options. + +### `put(target: RequestTarget | Id, data: Promise, context?: Resource | Context): Promise | Response` + +This writes the full record to the table, creating or replacing the existing record. This does _not_ merge the `data` into the existing record, but replaces it with the new data. 
+ +```javascript +await Product.put(34, { name: 'New Product Name' }); +``` + +This is called for HTTP PUT requests, and can be overridden to implement a custom `PUT` handler. For example: + +```javascript +class MyResource extends Resource { + static async put(target, data) { + // data is a promise; resolve it before validating or transforming + const record = await data; + return super.put(target, { ...record, status: record.status ?? 'active' }); + } +} +``` + +### `put(data: object): Promise | Response` + +The `put` method can also be called directly with a plain object/record, and it will write the record to the table if it has a primary key defined in the object. + +```javascript +await Product.put({ id: 34, name: 'New Product Name' }); +``` + +### `patch(target: RequestTarget | Id, data: Promise, context?: Resource | Context): Promise | Response` + +This writes a partial record to the table, creating or updating the existing record. This merges the `data` into the existing record. + +```javascript +await Product.patch(34, { description: 'Updated description' }); +``` + +This is called for HTTP PATCH requests, and can be overridden to implement a custom `PATCH` handler. For example: + +```javascript +class MyResource extends Resource { + static async patch(target, data) { + // data is a promise; resolve it before validating or transforming + const record = await data; + return super.patch(target, { ...record, status: record.status ?? 'active' }); + } +} +``` + +Added in: v4.3.0 (CRDT support for individual property updates via PATCH) + +### `post(target: RequestTarget, data: Promise, context?: Resource | Context): Promise | Response` + +This is called for HTTP POST requests.
The default behavior creates a new record, but it can be overridden to implement custom actions: + +```javascript +class MyResource extends Resource { + static async post(target, promisedData) { + let data = await promisedData; + if (data.action === 'create') { + // create a new record + return this.create(target, data.content); + } else if (data.action === 'update') { + // update the referenced record + let resource = await this.update(target); + resource.set('status', data.status); + resource.save(); + } + } +} +``` + +Calling `post` directly is not recommended; prefer more explicit methods like `create()` or `update()`. + +### `delete(target: RequestTarget | Id): void | Response` + +This deletes a record from the table. + +```javascript +await Product.delete(34); +``` + +This is called for HTTP DELETE requests, and can be overridden to implement a custom `DELETE` handler. For example: + +```javascript +class MyResource extends Resource { + static async delete(target) { + // validate or transform before deleting + return super.delete(target); + } +} +``` + +### `publish(target: RequestTarget, message: object, context?: Resource | Context): void | Response` + +This is called to publish a message. Messages can be published through tables, using the same primary key structure as records. + +```javascript +await Product.publish(34, { event: 'product-purchased', purchasePrice: 100 }); +``` + +This is called for MQTT publish commands. The default behavior records the message and notifies subscribers without changing the record's stored data. This can be overridden to implement custom message handling. + +### `subscribe(subscriptionRequest?: SubscriptionRequest): Promise` + +Called for MQTT subscribe commands. Returns a `Subscription` — an `AsyncIterable` of messages/changes.
+ +#### `SubscriptionRequest` options + +All properties are optional: + +| Property | Description | +| -------------------- | ---------------------------------------------------------------------------------------------- | +| `includeDescendants` | Include all updates with an id prefixed by the subscribed id (e.g. `sub/*`) | +| `startTime` | Start from a past time (catch-up of historical messages). Cannot be used with `previousCount`. | +| `previousCount` | Return the last N updates/messages. Cannot be used with `startTime`. | +| `omitCurrent` | Do not send the current/retained record as the first update. | + +### `connect(target: RequestTarget, incomingMessages?: AsyncIterable): AsyncIterable` + +Called for WebSocket and Server-Sent Events connections. `incomingMessages` is provided for WebSocket connections (not SSE). Returns an `AsyncIterable` of messages to send to the client. + +### `invalidate(target: RequestTarget)` + +Marks the specified record as invalid in a caching table, so it will be reloaded from the source on next access. + +### `create(record: object, context?): Promise` + +Create a new record with an auto-generated primary key. Returns the created record. Do not include a primary key in the `record` argument. + +Added in: v4.2.0 + +### `setComputedAttribute(name: string, computeFunction: (record) => any)` + +Define the compute function for a `@computed` schema attribute. + +Added in: v4.4.0 + +```javascript +MyTable.setComputedAttribute('fullName', (record) => `${record.firstName} ${record.lastName}`); +``` + +### `getRecordCount({ exactCount?: boolean }): Promise<{ recordCount: number, estimatedRange?: [number, number] }>` + +Returns the number of records in the table. By default returns an approximate (fast) count. Pass `{ exactCount: true }` for a precise count. + +Added in: v4.2.0 + +### `sourcedFrom(Resource, options?)` + +Configure a table to use another resource as its data source (caching behavior). 
When a record is not found locally, it is fetched from the source and cached. Writes are delegated to the source. + +Options: + +- `expiration` — Default TTL in seconds +- `eviction` — Eviction time in seconds +- `scanInterval` — Period for scanning expired records + +### `primaryKey` + +The name of the primary key attribute for the table. + +### `operation(operationObject: object, authorize?: boolean): Promise` + +Executes a Harper operations API call using this table as the target. Set `authorize` to `true` to enforce current-user authorization. + +--- + +### `update(target: RequestTarget, updates?: object): Promise` + +This returns a promise to an instance of the Resource class that can be updated and saved. This has mutable property access to a record. Any property changes on the instance are written to the table when the transaction commits. This is the primary method for getting an instance of a Resource and accessing all of the Resource instance methods. + +## Resource Instance Methods + +A Resource instance is used to update and interact with a single record/resource. It provides functionality for updating properties, accessing property values, and managing record lifecycle. The Resource instance is normally retrieved from the static `update()` method. An instance from a table has updatable properties that can be used to access and update individual properties, as well as methods for more advanced updates and saving data. For example: + +```javascript +const product = await Product.update(32); +product.status = 'active'; // we can directly change properties on the updatable record +product.subtractFrom('quantity', 1); // We can use CRDT incrementation/decrementation to safely update the quantity +product.save(); +``` + +### `save()` + +This saves the current state of the resource to the database in the current transaction.
This method can be called after making changes to the resource to ensure that those changes have been saved to the current transaction and can be queried within the same transaction. Any pending changes are automatically saved when the transaction commits (if `save()` has not already saved them). + +### `addTo(property: string, value: number)` + +Adds `value` to `property` using CRDT incrementation — safe for concurrent updates across threads and nodes. + +Added in: v4.3.0 + +```javascript +static async post(target, data) { + const record = await this.update(target.id); + record.addTo('quantity', -1); // decrement safely across nodes +} +``` + +### `subtractFrom(property: string, value: number)` + +Subtracts `value` from `property` using CRDT incrementation. + +### `set(property: string, value: any): void` + +Sets a property to `value`. Equivalent to direct property assignment (`record.property = value`), but useful when the property name is dynamic and not declared in the schema. + +```javascript +const record = await Table.update(target.id); +record.set('status', 'active'); +record.save(); +``` + +### `put(record: object): void` + +This replaces the current record data in the instance with the provided `record` object. + +### `patch(record: object): void` + +This merges the provided `record` object into the current record data for the instance. + +### `publish(message: object): void` + +This publishes a message to the current instance's primary key. + +### `invalidate(): void` + +This invalidates the current instance's record in a caching table, forcing it to be reloaded from the source on next access. + +### `getId(): Id` + +Returns the primary key of the current instance. + +### `getProperty(property: string): any` + +Returns the current value of `property` from the record. Useful when the property name is dynamic or when you want an explicit read rather than direct property access. 
+ +```javascript +const record = await Table.update(target.id); +const current = record.getProperty('status'); +``` + +### `getUpdatedTime(): number` + +Returns the last updated time as milliseconds since epoch. + +### `getExpiresAt(): number` + +Returns the expiration time, if one is set. + +### `allowStaleWhileRevalidate(entry, id): boolean` + +For caching tables: return `true` to serve the stale entry while revalidation happens concurrently; `false` to wait for the fresh value. + +Entry properties: + +- `version` — Timestamp/version from the source +- `localTime` — When the resource was last refreshed locally +- `expiresAt` — When the entry became stale +- `value` — The stale record value + +The following methods are also implemented on Resource instances for backwards compatibility, but it is generally not necessary to use them directly: + +- `get` +- `search` +- `post` +- `create` +- `subscribe` + +## Query Object + +The `Query` object is accepted by `search()` and the static `get()` method. + +### `conditions` + +Array of condition objects to filter records. Each condition: + +| Property | Description | +| ------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------- | +| `attribute` | Property name, or an array for chained/joined properties (e.g.
`['brand', 'name']`) | +| `value` | The value to match | +| `comparator` | `equals` (default), `greater_than`, `greater_than_equal`, `less_than`, `less_than_equal`, `starts_with`, `contains`, `ends_with`, `between`, `not_equal` | +| `conditions` | Nested conditions array | +| `operator` | `and` (default) or `or` for the nested `conditions` | + +Example with nested conditions: + +```javascript +Product.search({ + conditions: [ + { attribute: 'price', comparator: 'less_than', value: 100 }, + { + operator: 'or', + conditions: [ + { attribute: 'rating', comparator: 'greater_than', value: 4 }, + { attribute: 'featured', value: true }, + ], + }, + ], +}); +``` + +**Chained attribute references** (for relationships/joins): Use an array to traverse relationship properties: + +```javascript +Product.search({ conditions: [{ attribute: ['brand', 'name'], value: 'Harper' }] }); +``` + +Added in: v4.3.0 + +### `operator` + +Top-level `and` (default) or `or` for the `conditions` array. + +### `limit` + +Maximum number of records to return. + +### `offset` + +Number of records to skip (for pagination). + +### `select` + +Properties to include in each returned record. Can be: + +- Array of property names: `['name', 'price']` +- Nested select for related records: `[{ name: 'brand', select: ['id', 'name'] }]` +- String to return a single property per record: `'id'` + +Special properties: + +- `$id` — Returns the primary key regardless of its name +- `$updatedtime` — Returns the last-updated timestamp + +### `sort` + +Sort order object: + +| Property | Description | +| ------------ | ---------------------------------------------------------- | +| `attribute` | Property name (or array for chained relationship property) | +| `descending` | Sort descending if `true` (default: `false`) | +| `next` | Secondary sort to resolve ties (same structure) | + +### `explain` + +If `true`, returns conditions reordered as Harper will execute them (for debugging and optimization). 
+ +### `enforceExecutionOrder` + +If `true`, forces conditions to execute in the order supplied, disabling Harper's automatic re-ordering optimization. + +--- + +## RequestTarget + +`RequestTarget` represents a URL path mapped to a resource. It is a subclass of `URLSearchParams`. + +Properties: + +- `pathname` — Path relative to the resource, without query string +- `search` — The query/search string portion of the URL +- `id` — Primary key derived from the path +- `isCollection` — `true` when the request targets a collection +- `checkPermission` — Set to indicate authorization should be performed; has `action`, `resource`, and `user` sub-properties + +Standard `URLSearchParams` methods are available: + +- `get(name)`, `getAll(name)`, `set(name, value)`, `append(name, value)`, `delete(name)`, `has(name)` +- Iterable: `for (const [name, value] of target) { ... }` + +When a URL uses Harper's extended query syntax, these are parsed onto the target: + +- `conditions`, `limit`, `offset`, `sort`, `select` + +--- + +## Response Object + +Resource methods can return: + +1. **Plain data** — serialized using content negotiation +2. **`Response`-like object** with `status`, `headers`, and `data` or `body`: + +```javascript +// Redirect +return { status: 302, headers: { Location: '/new-location' } }; + +// Custom header with data +return { status: 200, headers: { 'X-Custom-Header': 'value' }, data: { message: 'ok' } }; +``` + +`body` must be a string, `Buffer`, Node.js stream, or `ReadableStream`. `data` is an object that will be serialized. + +Added in: v4.4.0 + +### Throwing Errors + +Uncaught errors are caught by the protocol handler. For REST, they produce error responses. 
Set `error.statusCode` to control the HTTP status: + +```javascript +if (!authorized) { + const error = new Error('Forbidden'); + error.statusCode = 403; + throw error; +} +``` + +--- + +## Context and Transactions + +Whenever you call other resources from within a resource method, pass `this` as the context argument to share the transaction and ensure atomicity: + +```javascript +export class BlogPost extends tables.BlogPost { + static loadAsInstance = false; + + post(target, data) { + // both writes share the same transaction + tables.Comment.put(data, this); + const post = this.update(target.id); + post.commentCount = (post.commentCount ?? 0) + 1; + } +} +``` + +See [Global APIs — transaction](./global-apis.md#transaction) for explicitly starting transactions outside of request handlers. + +--- + +### `getContext(): Context` + +`getContext` is available as an export from the `harper` module, or as a global variable, and returns the current context, which includes: + +- `user` — User object with username, role, and authorization information +- `transaction` — The current transaction + +When triggered by HTTP, the context is the `Request` object with these additional properties: + +- `url` — Full local path including query string +- `method` — HTTP method +- `headers` — Request headers (access with `context.headers.get(name)`) +- `responseHeaders` — Response headers (set with `context.responseHeaders.set(name, value)`) +- `pathname` — Path without query string +- `host` — Host from the `Host` header +- `ip` — Client IP address +- `body` — Raw Node.js `Readable` stream (if a request body exists) +- `data` — Promise resolving to the deserialized request body +- `lastModified` — Controls the `ETag`/`Last-Modified` response header +- `requestContext` — (For source resources only) Context of the upstream resource making the data request diff --git a/reference_versioned_docs/version-v4/resources/overview.md b/reference_versioned_docs/version-v4/resources/overview.md index 
c8fa7e41..f1493f79 100644 --- a/reference_versioned_docs/version-v4/resources/overview.md +++ b/reference_versioned_docs/version-v4/resources/overview.md @@ -56,10 +56,10 @@ Then, in a `resources.js` extend from the `tables.MyTable` global: export class MyTable extends tables.MyTable { static loadAsInstance = false; // use V2 API - get(target) { + async get(target) { // add a computed property before returning - const record = await super.get(target) + const record = await super.get(target); return { ...record, computedField: 'value' }; } diff --git a/reference_versioned_docs/version-v4/rest/querying.md b/reference_versioned_docs/version-v4/rest/querying.md index 37ad0b4d..0cc649cd 100644 --- a/reference_versioned_docs/version-v4/rest/querying.md +++ b/reference_versioned_docs/version-v4/rest/querying.md @@ -253,6 +253,13 @@ Added in: v4.5.0 Resources can be configured with `directURLMapping: true` for more direct URL path handling. When enabled, the URL path is mapped more directly to the resource without the default query parameter parsing semantics. See [Database / Schema](../database/schema.md) for configuration details. +:::caution Common gotchas + +- **`/Table` vs `/Table/`** — `GET /Table` returns metadata about the table resource itself. `GET /Table/` (trailing slash) targets the collection and invokes `get()` as a collection request. These are distinct endpoints. +- **Case sensitivity** — The URL path must match the exact casing of the exported resource or table name. `/Table/` works; `/table/` returns a 404. 
+ +::: + ## See Also - [REST Overview](./overview.md) — HTTP methods, URL structure, and caching From d18dfe7b3c4c6a98a7c11bc5b83ea02d103ba821 Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Fri, 27 Mar 2026 23:23:36 -0600 Subject: [PATCH 2/3] Resource API updates for v5 --- reference/resources/resource-api.md | 27 +++++++++++++++------------ 1 file changed, 15 insertions(+), 12 deletions(-) diff --git a/reference/resources/resource-api.md b/reference/resources/resource-api.md index e8dcbee2..a7c60e2f 100644 --- a/reference/resources/resource-api.md +++ b/reference/resources/resource-api.md @@ -333,7 +333,7 @@ Entry properties: - `expiresAt` — When the entry became stale - `value` — The stale record value -The following instances are also implemented on Resource instances for backwards compatibility, but generally not necessary to directly use: +The following methods are also implemented on Resource instances for [backwards compatibility with 4.x](../../reference_versioned_docs/version-v4/resources/resource-api.md), but it is generally not necessary to use them directly: - `get` - `search` @@ -485,22 +485,25 @@ if (!authorized) { ## Context and Transactions -Whenever you call other resources from within a resource method, pass `this` as the context argument to share the transaction and ensure atomicity: +Harper's HTTP/REST request handler automatically starts a transaction for each request, and assigns the `Request` object as the current context. The current context is available via `getContext()`, which is an export from the `harper` module and is also available as a global variable. All database interactions that are called from the request will automatically use that transaction, for reading and writing data. Transactions and context are tracked using [asynchronous context tracking](https://nodejs.org/dist/latest/docs/api/async_context.html). + +However, you can explicitly create transactions to control the scope of atomicity and isolation. 
Transactions are created with the `transaction()` method, which establishes a transaction and context that are used for all subsequent database operations within the asynchronous context of the transaction. For example: ```javascript -export class BlogPost extends tables.BlogPost { - static loadAsInstance = false; - - post(target, data) { - // both writes share the same transaction - tables.Comment.put(data, this); - const post = this.update(target.id); - post.commentCount = (post.commentCount ?? 0) + 1; - } +import { transaction } from 'harper'; + +function receivedShipment(products) { + let myContext = {}; + transaction(myContext, async () => { + for (let received of products) { + let product = await Product.update(received.productId); + product.addTo('quantity', received.quantity); + } + }); // all the product updates will be atomically committed in this transaction } ``` -See [Global APIs — transaction](./global-apis.md#transaction) for explicitly starting transactions outside of request handlers. +See [Global APIs — transaction](./global-apis.md#transaction) for more information on starting transactions outside of request handlers. --- From 8bb3eb9d7b3ad1142f1cf097265d3dff6f016dae Mon Sep 17 00:00:00 2001 From: Kris Zyp Date: Sat, 28 Mar 2026 08:26:52 -0600 Subject: [PATCH 3/3] More resource API updates for v5 --- reference/database/overview.md | 123 ++++++++++++++++++++++++++++ reference/resources/resource-api.md | 12 ++- 2 files changed, 132 insertions(+), 3 deletions(-) create mode 100644 reference/database/overview.md diff --git a/reference/database/overview.md b/reference/database/overview.md new file mode 100644 index 00000000..c9d2eb43 --- /dev/null +++ b/reference/database/overview.md @@ -0,0 +1,123 @@ +--- +title: Overview +--- + + + + + +# Database + +Harper's database system is the foundation of its data storage and retrieval capabilities. 
Harper supports two storage engines, [RocksDB](https://github.com/facebook/rocksdb/wiki/RocksDB-Overview) and [LMDB](https://www.symas.com/lmdb) (Lightning Memory-Mapped Database), and is designed to provide high-performance, ACID-compliant storage with indexing and flexible schema support. + +## How Harper Stores Data + +Harper organizes data in a three-tier hierarchy: + +- **Databases** — containers that group related tables together in a single transactional file +- **Tables** — collections of records with a common data pattern +- **Records** — individual data objects with a primary key and any number of attributes + +All tables within a database share the same transaction context, meaning reads and writes across tables in the same database can be performed atomically. + +### The Schema System and Auto-REST + +The most common way to use Harper's database is through the **schema system**. By defining a [GraphQL schema](./schema.md), you can: + +- Declare tables and their attribute types +- Control which attributes are indexed +- Define relationships between tables +- Automatically expose data via REST, MQTT, and other interfaces + +You do not need to build custom application code to use the database. A schema definition alone is enough to create fully functional, queryable REST endpoints for your data. + +For more advanced use cases, you can extend table behavior using the [Resource API](TODO:reference_versioned_docs/version-v4/resources/resource-api.md 'Custom resource logic layered on top of tables'). 
+ +### Architecture Overview + +``` + ┌──────────┐ ┌──────────┐ + │ Clients │ │ Clients │ + └────┬─────┘ └────┬─────┘ + │ │ + ▼ ▼ + ┌────────────────────────────────────────┐ + │ │ + │ Socket routing/management │ + ├───────────────────────┬────────────────┤ + │ │ │ + │ Server Interfaces ─►│ Authentication │ + │ RESTful HTTP, MQTT │ Authorization │ + │ ◄─┤ │ + │ ▲ └────────────────┤ + │ │ │ │ + ├───┼──────────┼─────────────────────────┤ + │ │ │ ▲ │ + │ ▼ Resources ▲ │ ┌───────────┐ │ + │ │ └─┤ │ │ + ├─────────────────┴────┐ │ App │ │ + │ ├─►│ resources │ │ + │ Database tables │ └───────────┘ │ + │ │ ▲ │ + ├──────────────────────┘ │ │ + │ ▲ ▼ │ │ + │ ┌────────────────┐ │ │ + │ │ External │ │ │ + │ │ data sources ├────┘ │ + │ │ │ │ + │ └────────────────┘ │ + │ │ + └────────────────────────────────────────┘ +``` + +## Databases + +Harper databases hold a collection of tables in a single transactionally-consistent file. This means reads and writes can be performed atomically across all tables in the same database, and multi-table transactions are replicated as a single atomic unit. + +The default database is named `data`. Most applications will use this default. Additional databases can be created for namespace separation — this is particularly useful for components designed for reuse across multiple applications, where a unique database name avoids naming collisions. + +> **Note:** Transactions do not preserve atomicity across different databases, only across tables within the same database. + +## Tables + +Tables group records with a common data pattern. A table must have: + +- **Table name** — used to identify the table +- **Primary key** — the unique identifier for each record (also referred to as `hash_attribute` in the Operations API) + +Primary keys must be unique. 
If a primary key is not provided on insert, Harper auto-generates one: + +- A **UUID string** for primary keys typed as `String` or `ID` +- An **auto-incrementing integer** for primary keys typed as `Int`, `Long`, or `Any` + +Numeric primary keys are more efficient than UUIDs for large tables. + +## Dynamic vs. Defined Schemas + +Harper tables can operate in two modes: + +**Defined schemas** (recommended): Tables with schemas explicitly declared using [GraphQL schema syntax](./schema.md). This provides predictable structure, precise control over indexing, and data integrity. Schemas are declared in a component's `schema.graphql` file. + +**Dynamic schemas**: Tables created through the Operations API or Studio without a schema definition. Attributes are reflexively added as data is ingested. All top-level attributes are automatically indexed. Dynamic schema tables automatically maintain `__createdtime__` and `__updatedtime__` audit attributes on every record. + +It is best practice to define schemas for production tables. Dynamic schemas are convenient for experimentation and prototyping. 
+ +## Key Concepts + +For deeper coverage of each database feature, see the dedicated pages in this section: + +- **[Schema](./schema.md)** — Defining table structure, types, indexes, relationships, and computed properties using GraphQL schema syntax +- **[API](./api.md)** — The `tables`, `databases`, `transaction()`, and `createBlob()` globals for interacting with the database from code +- **[Data Loader](./data-loader.md)** — Loading seed or initial data into tables as part of component deployment +- **[Storage Algorithm](./storage-algorithm.md)** — How Harper stores data using LMDB with universal indexing and ACID compliance +- **[Jobs](./jobs.md)** — Asynchronous bulk data operations (CSV import/export, S3 import/export) +- **[System Tables](./system-tables.md)** — Harper internal tables for analytics, data loader state, and other system features +- **[Compaction](./compaction.md)** — Reducing database file size by eliminating fragmentation and free space +- **[Transaction Logging](./transaction.md)** — Recording and querying a history of data changes via audit log and transaction log + +## Related Documentation + +- [REST](../rest/overview.md) — HTTP interface built on top of the database resource system +- [Resources](TODO:reference_versioned_docs/version-v4/resources/overview.md) — Custom application logic extending database tables +- [Operations API](TODO:reference_versioned_docs/version-v4/operations-api/overview.md) — Direct database management operations (create/drop databases and tables, insert/update/delete records) +- [Configuration](TODO:reference_versioned_docs/version-v4/configuration/overview.md) — Storage configuration options (compression, blob paths, compaction) diff --git a/reference/resources/resource-api.md b/reference/resources/resource-api.md index a7c60e2f..e74de5db 100644 --- a/reference/resources/resource-api.md +++ b/reference/resources/resource-api.md @@ -245,11 +245,11 @@ This returns a promise to an instance of the Resource class 
that can be updated ## Resource Instance Methods -A Resource instance is used to update and interact with a single record/resource. It provides functionality for updating properties, accessing property values, and managing record lifecycle. The Resource instance is normally retrieved from the static `update()` method. An instance from a table has updatable properties that can used to access and update individual properties, as well methods for more advanced updates and saving data. For example: +A Resource instance is used to update and interact with a single record/resource. It provides functionality for updating properties, accessing property values, and managing record lifecycle. The Resource instance is normally retrieved from the static `update()` method. An instance from a table has updatable properties that can be used to access and update individual properties (for properties declared in the table's schema), as well as methods for more advanced updates and saving data. For example: ```javascript const product = await Product.update(32); -product.status = 'active'; // we can directly change properties on the updatable record +product.status = 'active'; // we can directly change properties on the updatable record, if they are declared in the schema product.subtractFrom('quantity', 1); // We can use CRDT incrementation/decrementation to safely update the quantity product.save(); ``` @@ -258,6 +258,8 @@ product.save(); This saves the current state of the resource to the database in the current transaction. This method can be called after making changes to the resource to ensure that those changes have been saved to the current transaction and can be queried within the same transaction. Any pending changes are automatically saved when the transaction commits (if `save()` has not already saved them). +This method only saves data when using the RocksDB storage engine, and is a no-op when using LMDB. 
+ ### `addTo(property: string, value: number)` Adds `value` to `property` using CRDT incrementation — safe for concurrent updates across threads and nodes. @@ -277,7 +279,7 @@ Subtracts `value` from `property` using CRDT incrementation. ### `set(property: string, value: any): void` -Sets a property to `value`. Equivalent to direct property assignment (`record.property = value`), but useful when the property name is dynamic and not declared in the schema. +Sets a property to `value`. Equivalent to direct property assignment (`record.property = value`), but can be used when the property name is dynamic and not declared in the schema. ```javascript const record = await Table.update(target.id); @@ -293,6 +295,10 @@ This replaces the current record data in the instance with the provided `record` This merges the provided `record` object into the current record data for the instance. +### `validate(record: object, partial?: boolean): void` + +This validates the provided `record` object against the schema, throwing an error if validation fails. If `partial` is true, only validates the provided properties, otherwise validates all required properties. This can be overridden to implement custom validation logic. This is called at the beginning of a transaction commit, prior to writing data to the transaction and fully committing it. + ### `publish(message: object): void` This publishes a message to the current instance's primary key.
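
The `validate()` override mentioned above could be sketched roughly as follows. This is an illustrative sketch only: `ProductTableStandIn` is a hypothetical stand-in for a real table class such as `tables.Product` so that the example runs on its own, and the `name` and `quantity` properties are assumed. Whether an error thrown from `validate()` surfaces with the given `statusCode` is likewise an assumption based on the "Throwing Errors" section.

```javascript
// Hypothetical stand-in for a Harper table class (e.g. `tables.Product`).
// A real table's validate() checks the record against the declared schema.
class ProductTableStandIn {
	validate(record, partial) {
		// a full (non-partial) validation requires `name` to be present
		if (!partial && record.name === undefined) {
			throw new Error('name is required');
		}
	}
}

class Product extends ProductTableStandIn {
	validate(record, partial) {
		super.validate(record, partial); // keep the base validation
		// custom rule: checked only when `quantity` is provided, so partial updates still pass
		if (record.quantity !== undefined && record.quantity < 0) {
			const error = new Error('quantity must not be negative');
			error.statusCode = 400; // assumption: reported as an HTTP 400 for REST requests
			throw error;
		}
	}
}

const product = new Product();
product.validate({ quantity: 3 }, true); // partial update, passes without `name`

let message;
try {
	product.validate({ name: 'Widget', quantity: -1 });
} catch (error) {
	message = error.message;
}
console.log(message); // prints "quantity must not be negative"
```

Calling `super.validate()` first keeps the base checks in place, so the custom rule only adds constraints rather than replacing them.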