# Quick Start

### System Requirements

#### Operating System

* Linux (Ubuntu 20, 22, 24)
* macOS 12.0+

#### Hardware

* Sufficient memory to run vision and other models
* Reliable WiFi or other networking
* Sensors such as cameras, microphones, LIDAR units, IMUs
* Actuators and outputs such as speakers, visual displays, and movement platforms (legs, arms, hands)
* Hardware connected to the "central" computer via `Zenoh`, `CycloneDDS`, serial, USB, or custom APIs/libraries

#### Software

Ensure you have the following installed on your machine:

* `Go` >= 1.23.0 ([installation guide](https://go.dev/doc/install))
* `make` build tool
* `portaudio` for audio input and output
* `ffmpeg` for video processing
* An OpenMind API key, available from the [OpenMind portal](https://portal.openmind.com/)

**Go Installation**

```bash
# macOS
brew install go

# Linux - download from https://go.dev/dl/ or use your package manager
sudo apt-get update
sudo apt-get install golang-go
```

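For other platforms, download Go from <https://go.dev/dl/>. The `golang-go` package in some distribution releases is older than 1.23.0, so check the installed version with `go version` and install from the download page if needed.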

**PortAudio Library**

For audio functionality, install `portaudio`:

```bash
# macOS
brew install portaudio

# Linux
sudo apt-get update
sudo apt-get install portaudio19-dev
```

**ffmpeg**

For video functionality, install FFmpeg:

```bash
# macOS
brew install ffmpeg

# Linux
sudo apt-get update
sudo apt-get install ffmpeg
```

**Rust and Cargo**

Rust and Cargo are required for building SDKs such as cdp-sdk. To install them, run:

```bash
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
source $HOME/.cargo/env
```
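
Before building, it can help to confirm that the command-line tools listed above are on your `PATH` and that Go is recent enough. The following is a minimal sketch and not part of OM1: the function names `missing_tools`, `go_version`, and `version_ge` are made up for this example. It only checks executables, so it does not detect whether the `portaudio` library is installed.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check; not shipped with OM1.

# Print each command from the argument list that is not found on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Extract "1.23.0" from output such as "go version go1.23.0 linux/amd64".
go_version() {
  go version 2>/dev/null | awk '{ sub(/^go/, "", $3); print $3 }'
}

# Succeed when dotted version $1 is greater than or equal to version $2.
version_ge() {
  awk -v a="$1" -v b="$2" 'BEGIN {
    n = split(a, x, "."); m = split(b, y, ".")
    for (i = 1; i <= (n > m ? n : m); i++) {
      if (x[i] + 0 > y[i] + 0) exit 0
      if (x[i] + 0 < y[i] + 0) exit 1
    }
    exit 0
  }'
}

missing="$(missing_tools go make ffmpeg)"
if [ -n "$missing" ]; then
  echo "Missing tools: $(echo "$missing" | tr '\n' ' ')"
elif ! version_ge "$(go_version)" "1.23.0"; then
  echo "Go $(go_version) is older than 1.23.0"
else
  echo "All command-line prerequisites found"
fi
```

The version comparison is done numerically, field by field, because a plain string comparison would rank `1.9` above `1.23`.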

### CLI

OM1 provides a command-line interface (CLI). The main entry point is the `om1` binary, built from `cmd/main.go`. Start it through `make`, passing the agent configuration name in the `CONFIG` variable:

```bash
CONFIG=[config_name] make run
```

* `config_name`: name of the configuration file in the `/config` directory, without the `.json5` extension.

For development with debug logging:

```bash
CONFIG=[config_name] make dev
```
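
Because `CONFIG` is a file name without its extension, a typo only surfaces once the agent starts. As a rough illustration of the naming rule, the sketch below maps a name to its file and lists the names that exist. The helpers `config_path` and `list_configs` are hypothetical and assume the configs live in a `config` directory relative to the repository root.

```bash
# Hypothetical helpers; not part of the OM1 Makefile.

# Map a config name to its file path.
# Usage: config_path NAME [DIR]   e.g. "conversation" -> "config/conversation.json5"
config_path() {
  local name="${1%.json5}"   # tolerate a name passed with the extension
  local dir="${2:-config}"
  printf '%s/%s.json5\n' "$dir" "$name"
}

# List the config names available in a directory, one per line.
# Usage: list_configs [DIR]
list_configs() {
  local dir="${1:-config}" file
  for file in "$dir"/*.json5; do
    [ -e "$file" ] || continue   # the glob matched nothing
    basename "$file" .json5
  done
}
```

For example, `[ -f "$(config_path conversation)" ] && CONFIG=conversation make run` starts the agent only when the file is present, and `list_configs` shows which names are valid.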

### Installation and Setup

1. Clone the repository

Run the following commands to clone the repository and set up the environment:

```bash
git clone https://github.com/OpenMind/OM1.git
cd OM1
make deps
make build
```

**What these commands do:**

* `make deps` - Downloads and installs all Go module dependencies, fetches the zenoh-c library, and ensures your environment is ready for building.
* `make build` - Compiles the OM1 binary from source code.

Dependencies are managed via Go modules (`go.mod` and `go.sum`).

**Adding New Dependencies**

To add a new Go package:

```bash
go get <package>    # Add the dependency
make deps           # Tidy and verify modules
```

**Best Practices:**

* Keep dependencies minimal and prefer well-maintained packages
* Run `make check` before committing (runs fmt, vet, lint, and test)
* Use `make fmt` to format code and `make lint` to check for issues

2. Set the configuration variables

Locate the `config` folder and add your OpenMind API key to the agent's configuration file, for example `/config/conversation.json5`. If you do not already have a key, you can obtain one for free at <https://portal.openmind.com/>.

```json5
// /config/conversation.json5
...
"api_key": "om1_live_..."
...
```

Or, create a `.env` file in the project directory and add the following:

```bash
OM_API_KEY=om1_live_...
```

> **Note:** Using the placeholder key **openmind\_free** will generate errors.
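
A quick check of the `.env` file can catch a missing or placeholder key before the agent runs. This is a sketch under stated assumptions: the helpers `read_api_key` and `key_looks_usable` are hypothetical, the value is assumed to be unquoted, and nothing here reflects how OM1 itself loads the file.

```bash
# Hypothetical pre-flight check; not part of OM1.

# Print the value of OM_API_KEY from a .env-style file (last assignment wins).
# Fails when the file does not exist.
read_api_key() {
  local env_file="${1:-.env}"
  [ -f "$env_file" ] || return 1
  grep -E '^OM_API_KEY=' "$env_file" | tail -n 1 | cut -d= -f2-
}

# Succeed only when the key is non-empty and is not the placeholder value.
key_looks_usable() {
  [ -n "$1" ] && [ "$1" != "openmind_free" ]
}

key="$(read_api_key .env || true)"
if key_looks_usable "$key"; then
  echo "OM_API_KEY is set"
else
  echo "OM_API_KEY is missing or still set to the placeholder"
fi
```

The check never prints the key itself, so its output is safe to paste into a bug report.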

3. Run the Conversation Agent

Run the following command to start the Conversation Agent:

```bash
CONFIG=conversation make run
```

> **Note:** Agent configuration names are only required when switching between different agents.

The conversation agent is just an example agent configuration.

If you want to interact with the agent and see how it works, make sure ASR and TTS are configured in `conversation.json5`.

ASR configuration (under `agent_inputs`):

```json5
{
      "type": "GoogleASRInput"
}
```

TTS configuration (under `agent_actions`):

```json5
{
  name: "speak",
  llm_label: "speak",
  connector: "elevenlabs_tts",
  config: {
    voice_id: "i4CzbCVWoqvD0P1QJCUL",
    silence_rate: 20,
  },
}
```
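
To confirm that both entries shown above are present in a config file, a plain text search is often enough. The sketch below is deliberately crude: `config_mentions` and `check_speech_config` are hypothetical helpers, they do not parse JSON5 (so a commented-out entry still counts as a match), and they look only for the two identifiers used in this example. Configs that use other ASR or TTS providers will be reported as missing.

```bash
# Hypothetical helpers; a text search, not a JSON5 parser.

# Succeed when FILE contains the fixed string NEEDLE.
# Usage: config_mentions FILE NEEDLE
config_mentions() {
  grep -qF -- "$2" "$1"
}

# Report which of the two speech entries from this page are absent.
# Returns non-zero when at least one is missing.
check_speech_config() {
  local file="$1" status=0
  config_mentions "$file" 'GoogleASRInput' || {
    echo "no GoogleASRInput entry found"
    status=1
  }
  config_mentions "$file" 'elevenlabs_tts' || {
    echo "no elevenlabs_tts connector found"
    status=1
  }
  return "$status"
}
```

For example, `check_speech_config config/conversation.json5` prints nothing and returns success when both identifiers appear in the file.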

During the first build, the system will automatically download the zenoh-c library and resolve all Go dependencies. This process may take several minutes to complete.

#### Prometheus and Grafana Monitoring

If you have Docker installed, you can use the included Docker Compose configuration to spin up Grafana and Prometheus to monitor real-time AI pipeline metrics (such as LLM response times and ASR latencies).

Run the following command:

```bash
docker-compose up -d grafana prometheus
```

Then navigate to <http://localhost:3000> (default login: `admin`/`admin`). The **OM1 Latency Monitoring** dashboard is automatically provisioned and will display your latency metrics as you interact with the agent.
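
The containers can take a few seconds to start accepting requests. As a sketch, the helper below polls a URL until it responds. `wait_for_url` is a hypothetical function name and `curl` is assumed to be installed. The usage example relies on Grafana's `/api/health` endpoint and on the port 3000 mentioned above.

```bash
# Hypothetical helper; assumes curl is available.

# Poll a URL until it answers with a successful HTTP status, or give up.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors; -s keeps it quiet.
    if curl -fs -o /dev/null --max-time 5 "$url"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_for_url http://localhost:3000/api/health && echo "Grafana is up"` waits up to about a minute with the default settings before giving up.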

#### Understanding the Log Data

The log data provide insight into how the `conversation` agent makes sense of its environment and decides on its next actions.

* The agent detects a person using vision.
* It communicates with an external AI API for response generation.
* The LLM(s) decide on a set of actions (in the sample log: a movement, a spoken sentence, and an emotion).
* The simulated robot expresses emotions via a front-facing display.
* The agent logs latency and processing times to monitor system performance.

```bash
Object Detector INPUT
// START
You see a person in front of you. You also see a laptop.
// END

AVAILABLE ACTIONS:
command: move
    A movement to be performed by the agent.
    Effect: Allows the agent to move.
    Arguments: Allowed values: 'stand still', 'sit', 'dance', 'shake paw', 'walk', 'walk back', 'run', 'jump', 'wag tail'

command: speak
    Words to be spoken by the agent.
    Effect: Allows the agent to speak.
    Arguments: <class 'str'>

command: emotion
    A facial expression to be performed by the agent.
    Effect: Performs a given facial expression.
    Arguments: Allowed values: 'cry', 'smile', 'frown', 'think', 'joy'

What will you do? Command:

INFO:httpx:HTTP Request: POST https://api.openmind.com/api/core/openai/chat/completions "HTTP/1.1 200 OK"
INFO:root:OpenAI LLM output: commands=[Command(type='move', value='wag tail'), Command(type='speak', value="Hi there! I see you and I'm excited!"), Command(type='emotion', value='joy')]
```
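
When scanning long logs, it can be convenient to reduce each LLM output line to the commands it contains. The sketch below is based only on the sample line above. `extract_commands` is a hypothetical helper, and the `Command(type='...', value=...)` layout is assumed to stay as shown, with values wrapped in either single or double quotes.

```bash
# Hypothetical helper; the log layout is inferred from the sample above.

# Read log text on stdin and print one "type: value" line per Command(...) entry.
extract_commands() {
  grep -oE "Command\(type='[^']*', value=('[^']*'|\"[^\"]*\")\)" |
    sed -E "s/^Command\(type='([^']*)', value=['\"](.*)['\"]\)\$/\1: \2/"
}
```

Piping the last line of the sample log through `extract_commands` prints `move: wag tail`, `speak: Hi there! I see you and I'm excited!`, and `emotion: joy`, each on its own line.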

### More Examples

There are more pre-configured agents in the `/config` folder. To run one, pass its name in `CONFIG`. For example, to run the `greeting_conversation` agent:

```bash
CONFIG=greeting_conversation make run
```

If you configure a custom agent, replace `<agent_name>` with the name of your agent's configuration and run the following command:

```bash
CONFIG=<agent_name> make run
```

To get started with development, see the [developer cookbook introduction](/developer-cookbook/introduction.md).

