Robotics recovery data

Human-produced failure and recovery data for soft-object manipulation.

Low-Likelihood Data is building a rights-cleared dataset and hidden benchmark around the messy manipulation episodes that separate demos from usable robotic systems.

01 / What we are building

Rights-cleared recovery episodes for hard manipulation cases.

Low-Likelihood Data is an early-stage robotics data company focused on human-produced failure and recovery data for deformable object manipulation.

We are starting with the cases that are easy for people but still difficult for robots: towels that collapse, bags that will not open, soft packages that slip, layers that are hard to separate, and objects hidden inside soft clutter.

The product is a rights-cleared dataset and hidden benchmark for these recovery episodes. It is designed for robotics foundation-model teams, VLA labs, humanoid companies, simulation teams, and physical-AI data vendors.

02 / Motivation

Most manipulation data overrepresents clean success.

Robots need to know what to do when the object does not behave cleanly. Soft and deformable objects create common failure cases:

  • The useful corner or edge is hidden
  • The wrong layer is selected
  • A bag, liner, or pouch collapses
  • Two objects are picked together
  • An object slips, folds, or twists during grasping
  • A target is covered by cloth or soft packaging
  • The opening, seam, label, or grasp point faces the wrong way

We chose these cases because recovery is often the difference between a demo and a usable manipulation system. A model needs not only examples of successful handling but also structured examples of failure detection, state correction, regrasping, and task decomposition.

03 / What the dataset contains

Each episode has a before state, ambiguity point, recovery action, and after state.

Initial task families:

  • Towel, T-shirt, and pillowcase corner finding
  • Layer separation after wrong-layer selection
  • Bad-fold recovery and refolding
  • Mixed soft-object retrieval from bins
  • Bag, liner, pouch, and polybag opening
  • Soft-package retrieval from clutter
  • Double-pick separation
  • Regrasping after slippage, bunching, twisting, or collapse

Initial capture stack:

  • Head-mounted egocentric video
  • Wrist or forearm video
  • Static side-view video
  • Before, failure/recovery, and after keyframes
  • Timestamps
  • Task and object metadata

Optional later additions include depth, hand-pose extraction, rough masks, object boxes, and buyer-specific capture variants.

04 / Annotation

A compact structure for failures, causes, recovery actions, and outcomes.

Each accepted episode can include:

  • Task family
  • Object family
  • Material state
  • Initial state
  • Subgoal
  • Failure event
  • Failure cause
  • Recovery action
  • Hand-object contact phase
  • Keyframes
  • Affordance tags
  • Outcome
  • Quality grade
  • Benchmark split
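
As a rough illustration, the fields above could be held in a single record type. The sketch below is a minimal Python version under stated assumptions: the class name, field names, types, and the choice to require only task and object family are hypothetical and not a published schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class EpisodeAnnotation:
    """Hypothetical container for the annotation fields listed above.

    Names and types are illustrative assumptions. Only task and object
    family are required here; the rest default to empty because an
    accepted episode "can include" them.
    """
    task_family: str
    object_family: str
    material_state: Optional[str] = None
    initial_state: Optional[str] = None
    subgoal: Optional[str] = None
    failure_event: Optional[str] = None
    failure_cause: list[str] = field(default_factory=list)
    recovery_action: list[str] = field(default_factory=list)
    contact_phase: Optional[str] = None
    keyframes: dict[str, float] = field(default_factory=dict)  # name -> timestamp
    affordance_tags: list[str] = field(default_factory=list)
    outcome: Optional[str] = None
    quality_grade: Optional[str] = None
    benchmark_split: Optional[str] = None

    def missing_fields(self) -> list[str]:
        """Return the names of fields that are still empty."""
        return [name for name, value in vars(self).items()
                if value in (None, [], {})]


# Example: a partially annotated episode.
ann = EpisodeAnnotation(
    task_family="corner_finding",
    object_family="towel",
    failure_event="wrong_layer_grasp",
)
print(ann.missing_fields())
```

Listing the empty fields is one simple way a reviewer could see how complete an episode's annotation is before it is graded.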

05 / Example schema

One episode, structured for model analysis and benchmark use.

Here is an example of what an episode might look like:

episode.json
{
  "episode_id": "LLD_000001",
  "episode_metadata": {
    "task_family": "corner_finding",
    "object_family": "towel",
    "environment_type": "soft_object_bin",
    "modalities": ["head_view", "wrist_view", "static_side_view"],
    "rights_status": "cleared"
  },
  "state": {
    "initial_state_tags": ["twisted", "partly_occluded", "mixed_bin"],
    "initial_state_caption": "Towel partly twisted in mixed soft-object bin"
  },
  "events": [
    {
      "event_type": "failure",
      "label": "wrong_layer_grasp",
      "start_time": 4.12,
      "end_time": 5.03,
      "cause_tags": ["layer_confusion", "soft_collapse"]
    },
    {
      "event_type": "recovery",
      "labels": ["stabilize_material", "expose_corner", "change_grasp_angle"],
      "start_time": 5.04,
      "end_time": 9.80
    }
  ],
  "affordances": ["corner", "edge", "fold", "hidden_by_other_cloth"],
  "outcome": "success",
  "quality": {
    "grade": "A",
    "review_status": "accepted"
  }
}
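
To show how a record like this might be consumed, here is a minimal Python sketch that loads a trimmed copy of the example, checks that events are in time order, and computes each event's duration. The helper names are hypothetical, and two things are assumptions rather than stated facts: that timestamps are in seconds, and that events in one episode do not overlap. Because the failure event stores a single "label" while the recovery event stores a "labels" list, the sketch normalizes both to a list.

```python
import json

# Trimmed copy of the example episode; only the fields used below are kept.
EPISODE_JSON = """
{
  "episode_id": "LLD_000001",
  "events": [
    {"event_type": "failure", "label": "wrong_layer_grasp",
     "start_time": 4.12, "end_time": 5.03,
     "cause_tags": ["layer_confusion", "soft_collapse"]},
    {"event_type": "recovery",
     "labels": ["stabilize_material", "expose_corner", "change_grasp_angle"],
     "start_time": 5.04, "end_time": 9.80}
  ],
  "outcome": "success"
}
"""


def event_labels(event: dict) -> list[str]:
    """Return labels as a list, whether stored under "label" or "labels"."""
    if "labels" in event:
        return list(event["labels"])
    return [event["label"]] if "label" in event else []


def summarize_events(episode: dict) -> list[dict]:
    """Check time ordering and compute each event's duration.

    Assumes events are listed chronologically and do not overlap.
    """
    summary = []
    previous_end = float("-inf")
    for event in episode["events"]:
        start, end = event["start_time"], event["end_time"]
        if not (previous_end <= start <= end):
            raise ValueError(f"events out of order in {episode['episode_id']}")
        summary.append({
            "type": event["event_type"],
            "labels": event_labels(event),
            "duration": round(end - start, 2),
        })
        previous_end = end
    return summary


episode = json.loads(EPISODE_JSON)
summary = summarize_events(episode)
for row in summary:
    print(row)
```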

06 / Intended use

Not a replacement for robot-native data. A focused complement.

Our dataset is not meant to replace robot-native data. It does not contain the robot's own actions, force signals, tactile data, proprioception, or deployment traces.

It is meant to complement those sources. Likely uses include:

  • Failure-mode evaluation
  • Hidden benchmark and regression testing
  • VLA post-training support
  • Representation learning
  • Task decomposition
  • Affordance and state-transition learning
  • Retrieval data for model analysis
  • Simulation scenario generation
  • Internal annotation schema design
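
As one illustration of the first use, failure-mode evaluation, a team might break a model's results down by failure label instead of reporting a single aggregate score. The Python sketch below assumes each result is a pair of failure label and pass/fail flag; the function name, the input shape, and the sample values are hypothetical and invented for illustration only.

```python
from collections import defaultdict


def accuracy_by_failure_mode(results: list[tuple[str, bool]]) -> dict[str, float]:
    """Return the fraction of passed cases for each failure label."""
    totals: dict[str, int] = defaultdict(int)
    passed: dict[str, int] = defaultdict(int)
    for label, ok in results:
        totals[label] += 1
        if ok:
            passed[label] += 1
    return {label: passed[label] / totals[label] for label in totals}


# Hypothetical per-episode results: (failure label, did the model pass?).
results = [
    ("wrong_layer_grasp", True),
    ("wrong_layer_grasp", False),
    ("double_pick", True),
]
rates = accuracy_by_failure_mode(results)
print(rates)
```

A per-label breakdown like this makes it easier to see which failure modes a model handles poorly, which an overall average can hide.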