Refactor voice control core and robot behavior

# Dora Voice Control Node

A Dora node that processes Spanish voice commands and translates them into robot actions (movement, grasping, releasing objects). Supports multiple robot types via robot subfolders and includes a web debug interface.

## Features

- Spanish voice command parsing (rule-based or LLM)
- Robot adapter pattern for different gripper types
- Real-time web debug interface
- Command queue management
- Workspace bounds validation
- Object detection integration

## File Structure

```text
dora_voice_control/
├── __init__.py
├── main.py              # Thin orchestrator
│
├── core/                # Shared logic
│   ├── behavior.py      # RobotBehavior with actions
│   ├── config.py        # Configuration classes
│   ├── node.py          # Dora adapter + dispatcher + context
│   ├── robot.py         # RobotAdapter base
│   ├── robot_io.py      # Pose/status/image handlers + command queue
│   ├── scene.py         # Scene state + notifier + objects handler
│   ├── state.py         # Thread-safe shared state
│   └── voice.py         # Voice input + parsing + intents
│
├── robots/              # Robot-specific implementations
│   └── littlehand/      # Vacuum gripper robot
│       ├── adapter.py   # Vacuum adapter
│       ├── actions.py   # Action vocabulary
│       └── behavior.py  # Behavior binding
│
└── web/                 # Web interface
    ├── api.py           # FastAPI server
    ├── models.py        # Pydantic models
    └── templates.py     # HTML template
```

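The `core/state.py` and `core/robot_io.py` entries above suggest a lock-guarded state object plus a command queue. A minimal sketch of that pattern — class and method names here are illustrative, not the actual implementation:

```python
import threading
from collections import deque

class SharedState:
    """Illustrative thread-safe state holder in the spirit of core/state.py."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pose = None      # latest [x, y, z, roll, pitch, yaw]
        self._queue = deque()  # pending robot commands

    def update_pose(self, pose):
        with self._lock:
            self._pose = list(pose)

    def get_pose(self):
        with self._lock:
            return None if self._pose is None else list(self._pose)

    def enqueue(self, cmd):
        with self._lock:
            self._queue.append(cmd)

    def pop_next(self):
        with self._lock:
            return self._queue.popleft() if self._queue else None

    def clear_queue(self):
        """Drop all pending commands; returns how many were removed."""
        with self._lock:
            n = len(self._queue)
            self._queue.clear()
            return n
```

Every accessor takes the lock, since Dora input handlers and the web API thread both touch the same state.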
## Robot Adapters

Set `ROBOT_TYPE` to select the robot package:

| Type | Grab Command | Release Command |
|------|--------------|-----------------|
| `littlehand` (alias: `vacuum`) | `vacuum_on` | `vacuum_off` |

To add a new robot, create a new subfolder under `robots/` with its adapter and behavior, then register it in `robots/__init__.py`.

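As a sketch of how the adapter pattern above might look — only `RobotAdapter` is named in the source; the subclass and registry here are illustrative:

```python
from abc import ABC, abstractmethod

class RobotAdapter(ABC):
    """Illustrative base in the spirit of core/robot.py (actual interface may differ)."""

    @abstractmethod
    def grab_command(self) -> str: ...

    @abstractmethod
    def release_command(self) -> str: ...

class LittleHandAdapter(RobotAdapter):
    """Vacuum gripper: grab/release map to vacuum on/off (see table above)."""

    def grab_command(self) -> str:
        return "vacuum_on"

    def release_command(self) -> str:
        return "vacuum_off"

# ROBOT_TYPE value -> adapter class, including the "vacuum" alias
ADAPTERS = {"littlehand": LittleHandAdapter, "vacuum": LittleHandAdapter}
```

The behavior layer can then emit `adapter.grab_command()` without knowing which gripper is attached.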
## Inputs/Outputs

| Input | Type | Description |
|-------|------|-------------|
| `voice_in` | string | Text transcription of the voice command |
| `tcp_pose` | array | Current robot pose `[x, y, z, roll, pitch, yaw]` |
| `objects` | JSON | Detected objects from the vision system |
| `status` | JSON | Command execution status from the robot |
| `image_annotated` | array | Annotated camera image |

| Output | Type | Description |
|--------|------|-------------|
| `robot_cmd` | JSON | Robot command with action and payload |
| `voice_out` | JSON | Response confirmation to the user |
| `scene_update` | JSON | Updated scene with all visible objects |

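Internally, a Dora node receives each input as an event keyed by its input id. A framework-free sketch of the dispatch idea — the handler names and event shape are simplified assumptions, not the project's actual code:

```python
import json

def dispatch(event, handlers):
    """Route an input event to its handler by input id; None if unhandled."""
    handler = handlers.get(event["id"])
    if handler is None:
        return None
    return handler(event["value"])

def handle_voice_in(text):
    # A real handler would parse the command and enqueue robot actions.
    return {"kind": "voice", "text": text}

def handle_objects(raw):
    # Detected objects arrive as JSON from the vision system.
    return {"kind": "objects", "objects": json.loads(raw)}

handlers = {"voice_in": handle_voice_in, "objects": handle_objects}
```

Unknown input ids are ignored rather than raising, so new dataflow wirings degrade gracefully.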
## Supported Commands (Spanish)

| Command | Action | Example |
|-------------|----------------|--------------------------|
| `subir` | Move up | "sube" |
| `bajar` | Move down | "baja" |
| `tomar` | Grab object | "agarra el cubo rojo" |
| `soltar` | Release object | "suelta en la caja azul" |
| `ir` | Go to object | "ve al cilindro" |
| `reiniciar` | Reset | "reinicia" |

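A hypothetical, much-simplified version of the rule-based parsing behind this table — the vocabulary sets are illustrative and the real parser lives in `core/voice.py`; only the result format matches the documented parse output:

```python
import unicodedata

ACTIONS = {  # verb forms -> canonical action (from the table above)
    "sube": "subir", "baja": "bajar", "agarra": "tomar", "toma": "tomar",
    "suelta": "soltar", "ve": "ir", "reinicia": "reiniciar",
}
# Illustrative vocabularies; the real sets may differ.
COLORS = {"rojo", "azul", "verde", "amarillo"}
OBJECTS = {"cubo", "caja", "cilindro"}
SIZES = {"grande", "pequeno"}

def normalize(text: str) -> str:
    """Lowercase and strip accents so 'agarra' matches regardless of spelling."""
    text = unicodedata.normalize("NFD", text.lower())
    return "".join(c for c in text if unicodedata.category(c) != "Mn")

def rule_parse(text: str) -> dict:
    words = normalize(text).split()
    result = {"resultado": "error", "accion": None}
    for w in words:
        if w in ACTIONS:
            result.update(resultado="ok", accion=ACTIONS[w])
        elif w in OBJECTS:
            result["objeto"] = w
        elif w in COLORS:
            result["color"] = w
        elif w in SIZES:
            result["tamano"] = w
    return result
```

The keyword scan keeps parsing robust to filler words like "el" and "en la".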
## Environment Variables

```bash
# Robot Configuration
ROBOT_TYPE=littlehand        # "littlehand" (alias: "vacuum")

# Web API Server
API_ENABLED=true             # Enable/disable web interface
API_HOST=0.0.0.0             # Bind address
API_PORT=8080                # Listen port

# TCP Parameters
TCP_OFFSET_MM=63.0           # Z-offset to object surface
APPROACH_OFFSET_MM=50.0      # Safe approach distance above object
STEP_MM=20.0                 # Distance for up/down increments

# LLM Configuration (optional)
LLM_PROVIDER=rules           # "rules", "gemini", or "ollama"
GOOGLE_API_KEY=your_key      # Required if using gemini
GEMINI_MODEL=gemini-2.0-flash

# Workspace Safety (optional)
WORKSPACE_MIN_X=-300
WORKSPACE_MAX_X=300
WORKSPACE_MIN_Y=-300
WORKSPACE_MAX_Y=300
WORKSPACE_MIN_Z=0
WORKSPACE_MAX_Z=500

# Initial Position
INIT_ON_START=true
INIT_X=300.0
INIT_Y=0.0
INIT_Z=350.0

# Safety
DRY_RUN=false                # Skip sending robot commands
```

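A sketch of how these variables might be read with defaults, including the workspace bounds validation listed under Features — the helper names are illustrative, not the actual `core/config.py` API:

```python
import os

def env_float(name: str, default: float) -> float:
    return float(os.environ.get(name, default))

def env_bool(name: str, default: bool) -> bool:
    return os.environ.get(name, str(default)).strip().lower() in ("1", "true", "yes")

def load_workspace():
    """Read workspace limits; defaults mirror the values shown above."""
    return {
        "min_x": env_float("WORKSPACE_MIN_X", -300), "max_x": env_float("WORKSPACE_MAX_X", 300),
        "min_y": env_float("WORKSPACE_MIN_Y", -300), "max_y": env_float("WORKSPACE_MAX_Y", 300),
        "min_z": env_float("WORKSPACE_MIN_Z", 0),    "max_z": env_float("WORKSPACE_MAX_Z", 500),
    }

def in_workspace(x: float, y: float, z: float, ws: dict) -> bool:
    """Bounds validation: reject target poses outside the safe workspace."""
    return (ws["min_x"] <= x <= ws["max_x"]
            and ws["min_y"] <= y <= ws["max_y"]
            and ws["min_z"] <= z <= ws["max_z"])
```

Commands whose target fails `in_workspace` would be refused before ever reaching the robot.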
## Installation

```bash
cd dora_voice_control
pip install -e .

# With LLM support
pip install -e ".[llm]"
```

## Web Debug Interface

Access the debug interface at `http://localhost:8080` (default):

- Camera view with detections
- Real-time status monitoring (pose, objects, queue)
- Send manual voice commands
- Quick command buttons
- View parse results
- Command history
- Clear queue

## API Endpoints

```bash
# Get status
curl http://localhost:8080/api/status

# Get objects
curl http://localhost:8080/api/objects

# Get queue
curl http://localhost:8080/api/queue

# Send command (JSON payload elided in this excerpt)
curl -X POST http://localhost:8080/api/command \
  -H "Content-Type: application/json" \
  -d '…'

# Clear queue
curl -X POST http://localhost:8080/api/queue/clear
```

## Testing

### Web Interface

```bash
# Start the node (standalone for testing)
python -m dora_voice_control.main

# Open in browser
open http://localhost:8080
```

### Python Test

```python
from dora_voice_control.parser import rule_parse, normalize

# Test command parsing
text = "agarra el cubo rojo grande"
result = rule_parse(normalize(text))
print(result)
# {'resultado': 'ok', 'accion': 'tomar', 'objeto': 'cubo', 'color': 'rojo', 'tamano': 'grande'}
```

Note: after the refactor, parsing lives under `core/voice.py` rather than a top-level `parser.py`; adjust the import accordingly.

## Dataflow Example

```yaml
nodes:
  - id: voice
    build: uv pip install -e dora_voice_control
    path: dora_voice_control/dora_voice_control/main.py
    env:
      ROBOT_TYPE: "vacuum"
      API_ENABLED: "true"
    inputs:
      voice_in: iobridge/text_out
      tcp_pose: robot/tcp_pose
      objects: detector/objects
      status: robot/status
    outputs:
      - robot_cmd
      - voice_out
      - scene_update
```

## Message Examples

### Input: voice_in

```text
"sube"
"agarra el cubo rojo"
"suelta en la caja azul"
```

### Output: robot_cmd

```json
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "action": "move_to_pose",
  "payload": {
    "x": 150.0,
    "y": 200.0,
    "z": 280.0,
    "roll": 180.0,
    "pitch": 0.0,
    "yaw": 0.0
  }
}
```

### Output: voice_out

```json
{"text": "Ok, voy a subir", "status": "ok"}
{"text": "No entendi el comando", "status": "error"}
```

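Constructing a `robot_cmd` message like the example above is straightforward; a minimal sketch — the field set comes from the sample, while the helper itself is hypothetical:

```python
import json
import uuid

def make_robot_cmd(action: str, **payload) -> str:
    """Build a robot_cmd JSON message: unique id, action name, and payload."""
    msg = {"id": str(uuid.uuid4()), "action": action, "payload": payload}
    return json.dumps(msg)

cmd = make_robot_cmd("move_to_pose", x=150.0, y=200.0, z=280.0,
                     roll=180.0, pitch=0.0, yaw=0.0)
```

The per-command `id` lets the `status` input be matched back to the command it reports on.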
## Dependencies

- dora-rs >= 0.3.9
- numpy < 2.0.0
- pyarrow >= 12.0.0
- fastapi >= 0.109.0
- uvicorn >= 0.27.0
- pydantic >= 2.0.0
- google-genai (optional, for Gemini mode)

## Adding a New Robot

1) Create `dora_voice_control/dora_voice_control/robots/<robot_name>/` with:
   - `adapter.py` implementing a `RobotAdapter`
   - `actions.py` defining action aliases (can reuse defaults)
   - `behavior.py` binding the behavior class
2) Register it in `dora_voice_control/dora_voice_control/robots/__init__.py`

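The registration step could be done with a small registry in `robots/__init__.py`; a purely illustrative sketch — the decorator and names are assumptions, not the project's actual code:

```python
# Hypothetical registration mechanism for robots/__init__.py.
_REGISTRY = {}

def register(name, *aliases):
    """Class decorator: map ROBOT_TYPE values to a robot package's adapter."""
    def wrap(cls):
        for key in (name, *aliases):
            _REGISTRY[key] = cls
        return cls
    return wrap

@register("littlehand", "vacuum")
class LittleHandAdapter:
    # Illustrative gripper commands matching the Robot Adapters table.
    grab_command = "vacuum_on"
    release_command = "vacuum_off"

def get_adapter(robot_type: str):
    """Look up the adapter for ROBOT_TYPE, failing loudly on unknown types."""
    try:
        return _REGISTRY[robot_type]()
    except KeyError:
        raise ValueError(f"Unknown ROBOT_TYPE: {robot_type!r}; known: {sorted(_REGISTRY)}")
```

A new robot package would only need its own `@register("<robot_name>")` line to become selectable.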