An assembly-line robot arm might have a very simple policy: rotate ten degrees clockwise, pick up an item, drop it, rotate back, repeat. The policies being trained here were far more complex—a ...