Adjust ROI usage and add voice control docs

2026-02-02 12:49:40 -03:00
parent 695d309816
commit 048de058a3
5 changed files with 170 additions and 25 deletions
--- a/dataflow_voice_control_ulite6_zed.yml
+++ b/dataflow_voice_control_ulite6_zed.yml
@@ -66,8 +66,7 @@ nodes:
      CALIBRATION_FILE: "calibration_ulite6_zed.npz"
      DETECTOR_WEIGHTS: "trained_models/yolo8n.pt"
      CONFIG_FILE: "config.toml"
-      ROI_TOP_LEFT: "500,230"
+      USE_ROI: "false"
      ROI_BOTTOM_RIGHT: "775,510"
      SIZE_THRESHOLD: "4200"
      DETECT_EVERY_N: "3"
      MIN_DEPTH_MM: "10"
@@ -107,9 +106,9 @@ nodes:
      DRY_RUN: "false"
      # Initial position (used on startup and reset command)
      INIT_ON_START: "true"
-      INIT_X: "300.0"
+      INIT_X: "250.0"
      INIT_Y: "0.0"
-      INIT_Z: "350.0"
+      INIT_Z: "400.0"
      INIT_ROLL: "180.0"
      INIT_PITCH: "0.0"
      INIT_YAW: "0.0"
--- a/dora_voice_control/docs/add_robot.md
+++ b/dora_voice_control/docs/add_robot.md
@@ -0,0 +1,100 @@
 # Add a New Robot
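 This project uses an adapter + behavior pattern: the adapter turns generic commands (grab, release, move) into robot-specific steps, and the behavior maps voice actions to those commands.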
 ## 1) Create a robot adapter
 Implement the common command interface:
 - File: `dora_voice_control/dora_voice_control/robots/<robot_name>/adapter.py`
 - Base class: `RobotAdapter` (`dora_voice_control/dora_voice_control/core/robot.py`)
 Example (vacuum style):
 ```python
 from __future__ import annotations
 from typing import Any, Dict, List
 from ...core.robot import RobotAdapter
 from ...core.state import RobotStep
 NAME = "my_robot"
 ALIASES = {"my_robot", "my_alias"}
 class MyRobotAdapter(RobotAdapter):
    def grab(self) -> List[RobotStep]:
        return [RobotStep(action="vacuum_on", payload={})]
    def release(self) -> List[RobotStep]:
        return [RobotStep(action="vacuum_off", payload={})]
    def move(self, payload: Dict[str, Any]) -> List[RobotStep]:
        return [RobotStep(action="move_to_pose", payload=payload)]
    def reset_tool(self) -> List[RobotStep]:
        return [RobotStep(action="vacuum_off", payload={})]
 ```
 The `action` strings must match what your robot node understands.
 ## 2) Create robot actions
 - File: `dora_voice_control/dora_voice_control/robots/<robot_name>/actions.py`
 - Use `ActionInfo` to define names, aliases, and requirements.
 ```python
 from ...core.behavior import ActionInfo
 MY_ACTIONS = {
    "tomar": ActionInfo(name="tomar", aliases=["toma"], requires_object=False),
    "soltar": ActionInfo(name="soltar", aliases=["suelta"], requires_object=False),
 }
 ```
 ## 3) Create robot behavior
 - File: `dora_voice_control/dora_voice_control/robots/<robot_name>/behavior.py`
 - Subclass `RobotBehavior` and implement `action_handlers()`.
 ```python
 from typing import Callable
 from ...core.behavior import ActionContext, RobotBehavior
 from .actions import MY_ACTIONS
 class MyRobotBehavior(RobotBehavior):
    ACTIONS = MY_ACTIONS
    def action_tomar(self, ctx: ActionContext) -> bool:
        self._queue_steps(ctx, self.robot_adapter.grab())
        return True
    def action_soltar(self, ctx: ActionContext) -> bool:
        self._queue_steps(ctx, self.robot_adapter.release())
        return True
    def action_handlers(self) -> dict[str, Callable[[ActionContext], bool]]:
        return {
            "tomar": self.action_tomar,
            "soltar": self.action_soltar,
        }
 ```
 ## 4) Register the robot
 - File: `dora_voice_control/dora_voice_control/robots/__init__.py`
 Add a resolver entry that maps `ROBOT_TYPE` to your adapter/behavior.
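The project's actual resolver is not shown in this document, so the following is only a minimal sketch of what such a mapping could look like. `RobotEntry`, `ROBOT_REGISTRY`, `register`, and `resolve_robot` are hypothetical names, and the two classes are empty stand-ins for the adapter and behavior defined in the earlier steps.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


# Stand-ins so the sketch runs on its own; in the project these would be the
# classes from adapter.py and behavior.py.
class MyRobotAdapter:
    pass


class MyRobotBehavior:
    pass


@dataclass(frozen=True)
class RobotEntry:
    """Hypothetical pairing of an adapter class with a behavior class."""

    adapter_cls: type
    behavior_cls: type


# Hypothetical registry keyed by every accepted ROBOT_TYPE spelling.
ROBOT_REGISTRY: Dict[str, RobotEntry] = {}


def register(names: Iterable[str], entry: RobotEntry) -> None:
    """Store the entry under each name, normalized to lowercase."""
    for name in names:
        ROBOT_REGISTRY[name.strip().lower()] = entry


def resolve_robot(robot_type: str) -> RobotEntry:
    """Look up a ROBOT_TYPE value, failing loudly on unknown names."""
    key = robot_type.strip().lower()
    if key not in ROBOT_REGISTRY:
        known = ", ".join(sorted(ROBOT_REGISTRY))
        raise ValueError(f"Unknown ROBOT_TYPE {robot_type!r}; known: {known}")
    return ROBOT_REGISTRY[key]


# NAME and ALIASES from the adapter example map to the same entry.
register({"my_robot", "my_alias"}, RobotEntry(MyRobotAdapter, MyRobotBehavior))
```

Failing on an unknown name, rather than falling back to a default robot, keeps a typo in `ROBOT_TYPE` from silently driving the wrong hardware.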
 ## 5) Update dataflow
 Set `ROBOT_TYPE` in your dataflow:
 ```yaml
 env:
  ROBOT_TYPE: "my_robot"
 ```
 ## Safety notes
 - Keep bounds checks in behavior methods (`_queue_move` already checks workspace limits).
 - For real hardware, validate with a staged plan: simulation → dry-run → full run.
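As an illustration of the bounds check mentioned above, here is a minimal sketch of a workspace guard placed in front of a move. It does not reproduce `_queue_move`; `Workspace`, `queue_move_checked`, the payload keys, and the limit values used in the usage note are all hypothetical placeholders, not values from the project.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple

Step = Tuple[str, Dict[str, Any]]


@dataclass(frozen=True)
class Workspace:
    """Axis-aligned limits as (min, max) pairs, one per axis."""

    x: Tuple[float, float]
    y: Tuple[float, float]
    z: Tuple[float, float]

    def contains(self, x: float, y: float, z: float) -> bool:
        """Return True only if every coordinate is inside its limits."""
        limits = (self.x, self.y, self.z)
        return all(lo <= v <= hi for v, (lo, hi) in zip((x, y, z), limits))


def queue_move_checked(
    queue: List[Step], ws: Workspace, x: float, y: float, z: float
) -> bool:
    """Queue a move only when the target lies inside the workspace."""
    if not ws.contains(x, y, z):
        return False  # reject instead of clamping, so the caller can report it
    queue.append(("move_to_pose", {"x": x, "y": y, "z": z}))
    return True
```

For example, with placeholder limits `Workspace((100.0, 400.0), (-200.0, 200.0), (50.0, 450.0))`, a target of `(250.0, 0.0, 400.0)` is queued while `(1000.0, 0.0, 400.0)` is rejected and the queue stays unchanged.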
--- a/dora_voice_control/docs/voice_control_node.md
+++ b/dora_voice_control/docs/voice_control_node.md
@@ -0,0 +1,44 @@
 # Voice Control Node
 This node turns voice intents into robot commands and publishes them to the Dora graph.
 ## Dataflow wiring
 Typical wiring (from `dataflow_voice_control_ulite6_zed.yml`):
 - Inputs
  - `voice_in` (speech/commands)
  - `tcp_pose` (current robot TCP pose)
  - `objects` (detected objects)
  - `status` (robot status)
  - `image_annotated` (debug image)
 - Outputs
  - `robot_cmd` (robot command queue)
  - `scene_update` (scene debug)
  - `voice_out` (debug/feedback)
 ## Runtime flow
 1) Voice input is parsed into an intent (action + optional object/color/size).
 2) The scene is queried for a target object if the action requires it.
 3) `RobotBehavior.execute()` validates preconditions and dispatches to a handler.
 4) The handler queues `RobotStep` commands via the adapter.
 5) `CommandQueueService` sends commands on `robot_cmd`.
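The steps above can be sketched roughly as follows. This is a simplified stand-in, not the node's code: `Intent`, `SketchBehavior`, `drain`, the scene dictionary, and the payload keys are hypothetical, and only the `RobotStep(action=..., payload=...)` shape is taken from the adapter examples.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Any, Callable, Deque, Dict, Optional, Tuple

Position = Tuple[float, float, float]


@dataclass
class RobotStep:
    action: str
    payload: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Intent:
    """Hypothetical parsed voice intent: an action plus an optional target."""

    action: str
    target: Optional[str] = None


class SketchBehavior:
    """Stand-in for the validate-then-dispatch step of a behavior."""

    REQUIRES_OBJECT = {"ir"}  # actions that need a target from the scene

    def __init__(self) -> None:
        self.queue: Deque[RobotStep] = deque()
        self.handlers: Dict[str, Callable[[Optional[Position]], bool]] = {
            "tomar": self._tomar,
            "ir": self._ir,
        }

    def execute(self, intent: Intent, scene: Dict[str, Position]) -> bool:
        handler = self.handlers.get(intent.action)
        if handler is None:
            return False  # unknown action
        position: Optional[Position] = None
        if intent.action in self.REQUIRES_OBJECT:
            position = scene.get(intent.target or "")
            if position is None:
                return False  # precondition failed: no such object
        return handler(position)

    def _tomar(self, _position: Optional[Position]) -> bool:
        self.queue.append(RobotStep("vacuum_on"))
        return True

    def _ir(self, position: Optional[Position]) -> bool:
        if position is None:
            return False
        x, y, _z = position
        self.queue.append(RobotStep("move_to_pose", {"x": x, "y": y}))
        return True


def drain(queue: Deque[RobotStep], send: Callable[[RobotStep], None]) -> int:
    """Send queued steps in FIFO order and return how many were sent."""
    sent = 0
    while queue:
        send(queue.popleft())
        sent += 1
    return sent
```

Keeping validation in `execute()` and the queue separate from the handlers means a rejected intent leaves nothing half-queued for the robot.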
 ## Key files
 - `dora_voice_control/dora_voice_control/main.py`
  - Node wiring, intent processing, and dispatch loop.
 - `dora_voice_control/dora_voice_control/core/behavior.py`
  - Base behavior, action validation, handler dispatch.
 - `dora_voice_control/dora_voice_control/robots/`
  - Robot-specific adapters and behaviors.
 ## Environment variables (common)
 - `ROBOT_TYPE`: robot selector (e.g., `vacuum`, `littlehand`)
 - `COMMAND_OUTPUT`: output port name for robot commands (default `robot_cmd`)
 - `INIT_ON_START`: when `true`, queue an initial reset and a move to the home pose on startup (default `true`)
 - `INIT_X/Y/Z`, `INIT_ROLL/PITCH/YAW`: home pose
 See `dora_voice_control/README.md` for full configuration.
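To illustrate how the common variables listed above might be read, here is a minimal sketch. `load_init_config` and `parse_bool` are hypothetical helpers, not the node's actual loader. The defaults for `COMMAND_OUTPUT` and `INIT_ON_START` follow the list above; the pose variables are treated as required because their defaults are not documented here, and the set of accepted truthy strings is an assumption.

```python
import os
from typing import Any, Dict, Mapping, Optional

POSE_KEYS = ("INIT_X", "INIT_Y", "INIT_Z", "INIT_ROLL", "INIT_PITCH", "INIT_YAW")


def parse_bool(value: str) -> bool:
    """Interpret strings such as "true"/"false" (accepted set is an assumption)."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_init_config(env: Optional[Mapping[str, str]] = None) -> Dict[str, Any]:
    """Collect the output port name, the init flag, and the home pose."""
    env = os.environ if env is None else env
    config: Dict[str, Any] = {
        "command_output": env.get("COMMAND_OUTPUT", "robot_cmd"),
        "init_on_start": parse_bool(env.get("INIT_ON_START", "true")),
    }
    missing = [key for key in POSE_KEYS if key not in env]
    if config["init_on_start"] and missing:
        # A home move without a complete pose would be unsafe, so fail early.
        raise ValueError(f"Missing home pose variables: {', '.join(missing)}")
    # Pose order: x, y, z, roll, pitch, yaw.
    config["home_pose"] = None if missing else [float(env[k]) for k in POSE_KEYS]
    return config
```

Passing a plain dictionary instead of relying on `os.environ` makes the loader easy to exercise in a dry run, for example with the `INIT_*` values from the dataflow file.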
--- a/dora_voice_control/dora_voice_control/robots/littlehand/behavior.py
+++ b/dora_voice_control/dora_voice_control/robots/littlehand/behavior.py
@@ -34,10 +34,11 @@ class LittlehandBehavior(RobotBehavior):
        return self._queue_move(ctx, ctx.pose[0], ctx.pose[1], target_z)
    def action_ir(self, ctx: ActionContext) -> bool:
-        """Move to object position (approach + target)."""
+        """Move to object X/Y while keeping current Z."""
        if ctx.pose is None or ctx.target is None:
            return False
        pos = ctx.target.position_mm
-        self._queue_approach_sequence(ctx, pos)
+        return self._queue_move(ctx, pos[0], pos[1], ctx.pose[2])
        return True
    def action_tomar(self, ctx: ActionContext) -> bool:
        """Activate tool (low-level grab)."""
--- a/dora_yolo_object_detector/dora_yolo_object_detector/main.py
+++ b/dora_yolo_object_detector/dora_yolo_object_detector/main.py
@@ -324,12 +324,13 @@ def _draw_detections(
    """Draw bounding boxes and labels on frame."""
    annotated = frame.copy()
-    # Draw ROI rectangle (always visible)
+    if cfg.use_roi:
        # Draw ROI rectangle
        cv2.rectangle(
            annotated,
            cfg.roi_top_left,
            cfg.roi_bottom_right,
-        (0, 255, 0) if cfg.use_roi else (128, 128, 128),
+            (0, 255, 0),
            2,
        )
        # Label the ROI
@@ -339,7 +340,7 @@ def _draw_detections(
            (cfg.roi_top_left[0] + 5, cfg.roi_top_left[1] + 20),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.6,
-        (0, 255, 0) if cfg.use_roi else (128, 128, 128),
+            (0, 255, 0),
            2,
        )