Merged
12 changes: 6 additions & 6 deletions README.md
@@ -12,7 +12,7 @@ You can deploy your packaged model to your own infrastructure, or to [Replicate]

 - ✅ **Define the inputs and outputs for your model with standard Python.** Then, Cog generates an OpenAPI schema and validates the inputs and outputs.

-- 🎁 **Automatic HTTP prediction server**: Your model's types are used to dynamically generate a RESTful HTTP API using a high-performance Rust/Axum server.
+- 🎁 **Automatic HTTP inference server**: Your model's types are used to dynamically generate a RESTful HTTP API using a high-performance Rust/Axum server.

 - 🚀 **Ready for production.** Deploy your model anywhere that Docker images run. Your own infrastructure, or [Replicate](https://replicate.com).

@@ -31,35 +31,35 @@ build:
run: "run.py:Runner"
```

-Define how predictions are run on your model with `run.py`:
+Define how your model runs with `run.py`:

```python
 from cog import BaseRunner, Input, Path
 import torch

 class Runner(BaseRunner):
     def setup(self):
-        """Load the model into memory to make running multiple predictions efficient"""
+        """Load the model into memory to make running multiple inferences efficient"""
         self.model = torch.load("./weights.pth")

     # The arguments and types the model takes as input
     def run(self,
         image: Path = Input(description="Grayscale input image")
     ) -> Path:
-        """Run a single prediction on the model"""
+        """Run the model"""
         processed_image = preprocess(image)
         output = self.model(processed_image)
         return postprocess(output)
```

In the above we accept a path to the image as an input, and return a path to our transformed image after running it through our model.

-Now, you can run predictions on this model:
+Now, you can run the model:

```console
 $ cog run -i image=@input.jpg
 --> Building Docker image...
---> Running Prediction...
+--> Running...
 --> Output written to output.jpg
```

10 changes: 5 additions & 5 deletions docs/cli.md
@@ -215,13 +215,13 @@ cog push [IMAGE] [flags]

## `cog run`

-Run a prediction.
+Run the model.

-If 'image' is passed, it will run the prediction on that Docker image.
+If 'image' is passed, it will run the model on that Docker image.
It must be an image that has been built by Cog.

Otherwise, it will build the model in the current directory and run
-the prediction on that.
+it.

```
cog run [image] [flags]
@@ -230,7 +230,7 @@ cog run [image] [flags]
**Examples**

```
-# Run a prediction with named inputs
+# Run the model with named inputs
cog run -i prompt="a photo of a cat"

# Pass a file as input
@@ -268,7 +268,7 @@ cog run [image] [flags]

## `cog serve`

-Run a prediction HTTP server.
+Run an HTTP server.

Builds the model and starts an HTTP server that exposes the model's inputs
and outputs as a REST API. Compatible with the Cog HTTP protocol.
12 changes: 6 additions & 6 deletions docs/deploy.md
@@ -1,11 +1,11 @@
# Deploy models with Cog

Cog containers are Docker containers that serve an HTTP server
-for running predictions on your model.
+for running your model.
You can deploy them anywhere that Docker containers run.

-The server inside Cog containers is **coglet**, a Rust-based prediction server
-that handles HTTP requests, worker process management, and prediction execution.
+The server inside Cog containers is **coglet**, a Rust-based inference server
+that handles HTTP requests, worker process management, and run execution.

This guide assumes you have a model packaged with Cog.
If you don't, [follow our getting started guide](getting-started-own-model.md),
@@ -19,7 +19,7 @@ First, build your model:
cog build -t my-model
```

-You can serve predictions locally with `cog serve`:
+You can serve your model locally with `cog serve`:

```console
cog serve
@@ -54,7 +54,7 @@ To stop the server, run:
docker kill my-model
```

-To run a prediction on the model,
+To run the model,
call the `/predictions` endpoint,
passing input in the format expected by your model:
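
For reference, a minimal Python sketch of such a call, using only the standard library. It assumes the request body wraps the model's inputs in an `input` object; the address `http://localhost:5000`, the `prompt` input name, and the helper `build_prediction_request` are illustrative assumptions and are not part of this diff. The request is only built here, not sent.

```python
import json
import urllib.request


def build_prediction_request(base_url: str, model_input: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the `/predictions` endpoint."""
    # Assumption: the model's inputs are wrapped in an "input" object.
    body = json.dumps({"input": model_input}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/predictions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical address and input name, chosen for illustration only.
request = build_prediction_request(
    "http://localhost:5000", {"prompt": "a photo of a cat"}
)

# To send it against a running container, uncomment:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the payload easy to inspect before a container is running.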

@@ -79,7 +79,7 @@ The response includes a `status` field with values like `STARTING`, `READY`, `BU

## Concurrency

-By default, the server processes one prediction at a time. To enable concurrent predictions, set the `concurrency.max` option in `cog.yaml`:
+By default, the server processes one run at a time. To enable concurrent runs, set the `concurrency.max` option in `cog.yaml`:

```yaml
concurrency:
2 changes: 1 addition & 1 deletion docs/environment.md
@@ -44,7 +44,7 @@ The `dist` option searches for wheels in:

### `COGLET_WHEEL`

-Controls which coglet wheel is installed in the Docker image. Coglet is the Rust-based prediction server.
+Controls which coglet wheel is installed in the Docker image. Coglet is the Rust-based inference server.

**Supported values:** Same as `COG_SDK_WHEEL`

16 changes: 8 additions & 8 deletions docs/getting-started-own-model.md
@@ -27,7 +27,7 @@ sudo chmod +x /usr/local/bin/cog
To configure your project for use with Cog, you'll need to add two files:

 - [`cog.yaml`](yaml.md) defines system requirements, Python package dependencies, etc
-- [`run.py`](python.md) describes the prediction interface for your model
+- [`run.py`](python.md) describes the run interface for your model

Use the `cog init` command to generate these files in your project:

@@ -74,31 +74,31 @@ This is handy for ensuring a consistent environment for development or training.

With `cog.yaml`, you can also install system packages and other things. [Take a look at the full reference to see what else you can do.](yaml.md)

-## Define how to run predictions
+## Define how to run your model

-The next step is to update `run.py` to define the interface for running predictions on your model. The `run.py` generated by `cog init` looks something like this:
+The next step is to update `run.py` to define the interface for running your model. The `run.py` generated by `cog init` looks something like this:

```python
 from cog import BaseRunner, Path, Input
 import torch

 class Runner(BaseRunner):
     def setup(self):
-        """Load the model into memory to make running multiple predictions efficient"""
+        """Load the model into memory to make running multiple inferences efficient"""
         self.net = torch.load("weights.pth")

     def run(self,
             image: Path = Input(description="Image to enlarge"),
             scale: float = Input(description="Factor to scale image by", default=1.5)
     ) -> Path:
-        """Run a single prediction on the model"""
+        """Run the model"""
         # ... pre-processing ...
         output = self.net(input)
         # ... post-processing ...
         return output
```

-Edit your `run.py` file and fill in the functions with your own model's setup and prediction code. You might need to import parts of your model from another file.
+Edit your `run.py` file and fill in the functions with your own model's setup and run code. You might need to import parts of your model from another file.

You also need to define the inputs to your model as arguments to the `run()` function, as demonstrated above. For each argument, you need to annotate with a type. The supported types are:

@@ -121,7 +121,7 @@ You can provide more information about the input with the `Input()` function, as
- `choices`: For `str` or `int` types, a list of possible values for this input.
- `deprecated`: Mark this input as deprecated with a message explaining what to use instead.

-There are some more advanced options you can pass, too. For more details, [take a look at the prediction interface documentation](python.md).
+There are some more advanced options you can pass, too. For more details, [take a look at the run interface documentation](python.md).

Next, add the line `run: "run.py:Runner"` to your `cog.yaml`, so it looks something like this:

@@ -132,7 +132,7 @@ build:
run: "run.py:Runner"
```

-That's it! To test this works, try running a prediction on the model:
+That's it! To test this works, try running the model:

```
$ cog run -i image=@input.jpg
12 changes: 6 additions & 6 deletions docs/getting-started.md
@@ -85,11 +85,11 @@ Type "help", "copyright", "credits" or "license" for more information.

Inside this Docker environment you can do anything – run a Jupyter notebook, your training script, your evaluation script, and so on.

-## Run predictions on a model
+## Run a model

-Let's pretend we've trained a model. With Cog, we can define how to run predictions on it in a standard way, so other people can easily run predictions on it without having to hunt around for a prediction script.
+Let's pretend we've trained a model. With Cog, we can define how to run it in a standard way, so other people can easily run it without having to hunt around for a run script.

-We need to write some code to describe how predictions are run on the model.
+We need to write some code to describe how the model runs.

Save this to `run.py`:

@@ -107,13 +107,13 @@ WEIGHTS = models.ResNet50_Weights.IMAGENET1K_V1

 class Runner(BaseRunner):
     def setup(self):
-        """Load the model into memory to make running multiple predictions efficient"""
+        """Load the model into memory to make running multiple inferences efficient"""
         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
         self.model = models.resnet50(weights=WEIGHTS).to(self.device)
         self.model.eval()

     def run(self, image: Path = Input(description="Image to classify")) -> dict:
-        """Run a single prediction on the model"""
+        """Run the model"""
         img = Image.open(image).convert("RGB")
         preds = self.model(WEIGHTS.transforms()(img).unsqueeze(0).to(self.device))
         top3 = preds[0].softmax(0).topk(3)
@@ -174,7 +174,7 @@ Note: The first time you run `cog run`, the build process will be triggered to g

## Build an image

-We can bake your model's code, the trained weights, and the Docker environment into a Docker image. This image serves predictions with an HTTP server, and can be deployed to anywhere that Docker runs to serve real-time predictions.
+We can bake your model's code, the trained weights, and the Docker environment into a Docker image. This image serves an HTTP server, and can be deployed to anywhere that Docker runs to serve real-time inference.

```bash
cog build -t resnet