Refactor voice control core and robot behavior
@@ -1,178 +1,106 @@
# Dora IOBridge Node

A WebSocket server that bridges web clients with the Dora dataflow for real-time voice commands and scene updates.

Generic WebSocket bridge between web clients and Dora dataflow.

## Inputs/Outputs

| Input          | Type   | Description                      |
|----------------|--------|----------------------------------|
| `voice_out`    | JSON   | Response from voice control node |
| `scene_update` | JSON   | Scene objects from voice control |

| Output     | Type   | Description                      |
|------------|--------|----------------------------------|
| `voice_in` | string | Voice commands forwarded to Dora |

## Dora Outputs (WebSocket → Dora)

| Output      | Type   | Description               |
|-------------|--------|---------------------------|
| `text_out`  | string | Text from clients         |
| `audio_out` | bytes  | WAV audio from clients    |
| `data_out`  | JSON   | Generic data from clients |

## Dora Inputs (Dora → WebSocket)

| Input      | Type   | Description               |
|------------|--------|---------------------------|
| `text_in`  | string | Text to broadcast         |
| `audio_in` | bytes  | WAV audio to broadcast    |
| `data_in`  | JSON   | Generic data to broadcast |

## Environment Variables

```bash
VOICE_HOST=0.0.0.0  # Bind address
VOICE_PORT=8765     # Listen port
```

| Variable  | Default   | Description  |
|-----------|-----------|--------------|
| `WS_HOST` | `0.0.0.0` | Bind address |
| `WS_PORT` | `8765`    | Listen port  |

## WebSocket Endpoint

```text
ws://{WS_HOST}:{WS_PORT}
```
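
As a rough illustration of how the endpoint relates to the two variables, the sketch below reads `WS_HOST` and `WS_PORT` with the documented defaults. The helper names `bind_address` and `client_url` are hypothetical and not part of `dora_iobridge`. It also assumes the client runs on the same machine as the bridge: `0.0.0.0` is a bind address, so the sketch substitutes `localhost` when building a URL to connect to.

```python
import os


def bind_address(env=os.environ):
    """Return (host, port) for the server, using the documented defaults."""
    host = env.get("WS_HOST", "0.0.0.0")
    port = int(env.get("WS_PORT", "8765"))
    return host, port


def client_url(env=os.environ, connect_host="localhost"):
    """Build a ws:// URL a client could connect to.

    0.0.0.0 means "all interfaces" on the server side, so it is replaced
    by connect_host (assumption: the client runs on the same machine).
    """
    host, port = bind_address(env)
    if host == "0.0.0.0":
        host = connect_host
    return f"ws://{host}:{port}"


print(client_url({}))  # ws://localhost:8765 when neither variable is set
```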

Default: `ws://0.0.0.0:8765`

## Installation

```bash
cd dora_iobridge
pip install -e .
```

## WebSocket Protocol

### Client → Server

| Type    | Field     | Description        |
|---------|-----------|--------------------|
| `text`  | `content` | Text string        |
| `audio` | `content` | Base64-encoded WAV |
| `data`  | `payload` | Any JSON object    |
| `ping`  | -         | Health check       |

Examples:

```json
{"type": "text", "content": "hello world"}
{"type": "audio", "content": "UklGRi4AAABXQVZFZm10..."}
{"type": "data", "payload": {"key": "value"}}
{"type": "ping"}
```
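
A client could assemble these frames with a few small helpers. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the only thing taken from the table above is the `type`/`content`/`payload` layout.

```python
import base64
import json


def text_message(content: str) -> str:
    """Frame a text string as a `text` message."""
    return json.dumps({"type": "text", "content": content})


def audio_message(wav_bytes: bytes) -> str:
    """Frame raw WAV bytes as a Base64-encoded `audio` message."""
    encoded = base64.b64encode(wav_bytes).decode("ascii")
    return json.dumps({"type": "audio", "content": encoded})


def data_message(payload: dict) -> str:
    """Frame an arbitrary JSON-serializable object as a `data` message."""
    return json.dumps({"type": "data", "payload": payload})


def ping_message() -> str:
    """Frame a health-check `ping`."""
    return json.dumps({"type": "ping"})


print(text_message("hello world"))
```

Each helper returns a string that can be passed directly to a WebSocket client's send call.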

### Server → Client

| Type    | Field     | Description        |
|---------|-----------|--------------------|
| `text`  | `content` | Text string        |
| `audio` | `content` | Base64-encoded WAV |
| `data`  | `payload` | Any JSON object    |
| `pong`  | -         | Response to ping   |
| `error` | `message` | Error description  |

Examples:

```json
{"type": "text", "content": "response text"}
{"type": "audio", "content": "UklGRi4A...", "format": "wav"}
{"type": "data", "payload": {"objects": [...]}}
{"type": "pong"}
{"type": "error", "message": "Invalid JSON"}
```
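
On the receiving side, a client has to branch on `type` and read the field listed in the table. The sketch below shows one way to do that; `parse_server_message` is a hypothetical helper, and raising on unknown types is a design choice of this example rather than documented behaviour of the bridge.

```python
import base64
import json


def parse_server_message(raw: str):
    """Return (type, value) for one server frame.

    value is the text string, the decoded audio bytes, the JSON payload,
    the error message, or None for a pong.
    """
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "text":
        return kind, msg["content"]
    if kind == "audio":
        # Audio arrives Base64-encoded; decode it back to WAV bytes.
        return kind, base64.b64decode(msg["content"])
    if kind == "data":
        return kind, msg["payload"]
    if kind == "error":
        return kind, msg["message"]
    if kind == "pong":
        return kind, None
    raise ValueError(f"unknown message type: {kind!r}")
```

For example, `parse_server_message('{"type": "error", "message": "Invalid JSON"}')` returns `("error", "Invalid JSON")`.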

## Dataflow Example

```yaml
- id: iobridge
  build: uv pip install -e dora_iobridge
  path: dora_iobridge/dora_iobridge/main.py
  env:
    WS_HOST: "0.0.0.0"
    WS_PORT: "8765"
  inputs:
    text_in: voice/voice_out
    data_in: voice/scene_update
  outputs:
    - text_out
```
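
To make the wiring concrete: in this example anything arriving on `text_in` or `data_in` would be broadcast to WebSocket clients as a `text` or `data` message. The sketch below illustrates that mapping only. `to_broadcast` is a hypothetical function, it assumes the input value has already been converted to a plain Python value (real Dora events carry Arrow data), and the `"format": "wav"` field is copied from the server example rather than from the node's source.

```python
import base64
import json


def to_broadcast(input_id: str, value) -> str:
    """Map a Dora input id and its (already decoded) value to a WebSocket frame."""
    if input_id == "text_in":
        msg = {"type": "text", "content": str(value)}
    elif input_id == "audio_in":
        encoded = base64.b64encode(bytes(value)).decode("ascii")
        msg = {"type": "audio", "content": encoded, "format": "wav"}
    elif input_id == "data_in":
        # Accept either a JSON string or an already parsed object.
        payload = json.loads(value) if isinstance(value, str) else value
        msg = {"type": "data", "payload": payload}
    else:
        raise ValueError(f"unexpected input id: {input_id!r}")
    return json.dumps(msg)
```

With the dataflow above, a `voice/voice_out` event would reach clients as a `text` frame and a `voice/scene_update` event as a `data` frame.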

## Testing

### Test with WebSocket (wscat)

```bash
# Install wscat
npm install -g wscat

# Connect to the server
wscat -c ws://localhost:8765
```

### Test with websocat

```bash
# Install websocat
# Ubuntu: sudo apt install websocat
# macOS: brew install websocat
sudo apt install websocat

# Send a ping
echo '{"type": "ping"}' | websocat ws://localhost:8765
# Response: {"type": "pong"}

# Send a voice command
echo '{"type": "command", "text": "sube"}' | websocat ws://localhost:8765

# Request scene refresh
echo '{"type": "scene_refresh"}' | websocat ws://localhost:8765

# Connect interactively
websocat ws://localhost:8765

# Send text (type this JSON at the websocat prompt)
{"type": "text", "content": "hello"}
```

### Test with Python

```python
import asyncio
import websockets
import json

async def test_iobridge():
    uri = "ws://localhost:8765"
    async with websockets.connect(uri) as ws:
        # Test ping
        await ws.send(json.dumps({"type": "ping"}))
        response = await ws.recv()
        print(f"Ping response: {response}")

        # Send command
        await ws.send(json.dumps({
            "type": "command",
            "text": "agarra el cubo rojo"
        }))

        # Listen for responses
        async for message in ws:
            data = json.loads(message)
            print(f"Received: {data}")

asyncio.run(test_iobridge())
```

### Test with curl (HTTP upgrade not supported directly)

Since WebSocket requires an upgrade handshake, a plain HTTP request with curl will not work. Use this websocat-based shell script instead:

```bash
#!/bin/bash
# test_iobridge.sh

# Send a scripted sequence of messages with websocat
websocat ws://localhost:8765 <<EOF
{"type": "ping"}
{"type": "command", "text": "sube"}
{"type": "scene_refresh"}
EOF
```

## WebSocket Message Types

### Client -> Server

**Command (voice input)**
```json
{"type": "command", "text": "agarra el cubo rojo"}
```

**Ping (health check)**
```json
{"type": "ping"}
```
Response: `{"type": "pong"}`

**Scene Refresh**
```json
{"type": "scene_refresh"}
```

### Server -> Client (Broadcasts)

**Command Response**
```json
{
  "type": "response",
  "text": "Ok, voy a tomar",
  "status": "ok"
}
```

**Scene Update**
```json
{
  "type": "scene_updated",
  "objects": [
    {
      "object_type": "cubo",
      "color": "rojo",
      "size": "big",
      "position_mm": [150.0, 200.0, 280.0],
      "source": "detection"
    }
  ]
}
```
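
A client that wants to work with scene updates will usually turn the `objects` array into typed records. The following is a small sketch of that step under the assumption that every object carries exactly the five fields shown in the example; `SceneObject`, `parse_scene_update`, and `find_objects` are illustrative names, not part of the bridge.

```python
import json
from dataclasses import dataclass


@dataclass
class SceneObject:
    object_type: str
    color: str
    size: str
    position_mm: tuple  # (x, y, z) in millimetres
    source: str


def parse_scene_update(raw: str) -> list:
    """Parse a `scene_updated` frame into a list of SceneObject records."""
    msg = json.loads(raw)
    if msg.get("type") != "scene_updated":
        raise ValueError("not a scene_updated message")
    return [
        SceneObject(
            object_type=obj["object_type"],
            color=obj["color"],
            size=obj["size"],
            position_mm=tuple(obj["position_mm"]),
            source=obj["source"],
        )
        for obj in msg.get("objects", [])
    ]


def find_objects(objects, object_type=None, color=None):
    """Filter records by type and/or color; None means "match anything"."""
    return [
        obj for obj in objects
        if (object_type is None or obj.object_type == object_type)
        and (color is None or obj.color == color)
    ]
```

For instance, `find_objects(parse_scene_update(raw), object_type="cubo", color="rojo")` would select the red cube from the message above.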

## Dora Dataflow Configuration

```yaml
nodes:
  - id: iobridge
    build: pip install -e ./dora_iobridge
    path: dora_iobridge
    inputs:
      voice_out: voice_control/voice_out
      scene_update: voice_control/scene_update
    outputs:
      - voice_in
    env:
      VOICE_HOST: "0.0.0.0"
      VOICE_PORT: "8765"
```

```bash
dora up
dora start dataflow.yml
```

## Dependencies

- dora-rs >= 0.3.9
- pyarrow >= 12.0.0
- websockets >= 12.0