One reason to assume that the current amount of information presented on a roadway is sufficient is that humans can learn to drive in a matter of hours with only two vision sensors mounted on their heads.
> humans can learn to drive in a matter of hours with only two vision sensors
You use way more than two vision sensors in your head. But first, let's talk about those vision sensors. They are mounted on an articulated scanning platform with numerous degrees of freedom. The dynamic range is astounding, and that dynamic range can be easily augmented with shaded lenses that can be mounted and removed as needed via an extremely reliable pick-and-place mechanism. The vision sensors also come equipped with dual-sensing technology for day and night modes.
But you also have extremely sensitive accelerometers in your ears. And force-sensing in your feet to measure applied acceleration/braking controls. And force-sensing in your hands to measure applied steering force and road roughness. You are also sitting on a force sensor that serves as a back-up accelerometer. You have audio sensors that provide feedback about the road surface, vehicle performance, and other agents sharing the road.
And relevant short- and long-term memory of every 3D object you've ever seen along a roadside, and what the expected behaviour of those objects should be. Does a road sign usually move by itself? No, so if it does, maybe someone is carrying it. Is a ball coming from behind a car in a school-zone likely to be chased by a child...
> Is a ball coming from behind a car in a school-zone likely to be chased by a child.
Yeah, that kind of agent prediction is a biggie. Agent prediction is something journalists don't talk about much, but it is a huge research topic for AV practitioners.
And since you make note of expected behavior of objects... my all-time favorite bug report out of Waymo (one that made the public news) is a classic classifier/predictor/planner corner case leading to deadlock. The classifier identified a cyclist and labeled it "stopped" if the cyclist had a foot on the ground, and "moving" if both feet were on the pedals. The predictor would plot a trajectory for a moving cyclist, and the planner would decide an action based on the predicted agent behavior. So... Waymo vehicle and cyclist on two corners of a 4-way stop. The cyclist is stopped, doing a perfectly balanced track stand with both feet clipped in. Feet on pedals, so the cyclist is labeled a moving agent... the planner yields right-of-way to the cyclist according to the rules of the road. Deadlock: nobody moves.
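A toy sketch of how that falls out (invented names and logic, nothing from Waymo's actual stack): three individually reasonable rules compose into a deadlock.

```python
from dataclasses import dataclass

@dataclass
class Cyclist:
    feet_on_pedals: bool  # a track stand: stationary, but both feet clipped in

def classify(cyclist):
    # Rule 1: foot on ground => "stopped"; feet on pedals => "moving".
    # A track stand satisfies the "moving" test while going nowhere.
    return "moving" if cyclist.feet_on_pedals else "stopped"

def predict(label):
    # Rule 2: only "moving" agents get a trajectory through the intersection.
    return "crosses_intersection" if label == "moving" else None

def plan(trajectory):
    # Rule 3: yield right-of-way to any agent predicted to cross.
    return "yield" if trajectory else "proceed"

# Cyclist doing a track stand at the 4-way stop:
action = plan(predict(classify(Cyclist(feet_on_pedals=True))))
print(action)  # "yield" -- but the cyclist is also waiting for the car: deadlock
```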
Isn't that how Waymo killed a pedestrian/cyclist? It couldn't decide if she was a cyclist or pedestrian, as she was walking and pushing her bicycle, so it never braked.
She was homeless, and her bike was loaded down with bags, so its shape didn't match a typical bike and the classifier got confused.
That was Uber, not Waymo. The vehicle was operating at Level 3, and the safety driver was watching a movie on her phone at the time.
Very different circumstances. I don’t think this was a classifier defect. Not that I was ever a fan of Uber’s development program, but in this case I can’t fault the platform.
Except we don't, not really. We're still leveraging 15 or so years of experience living in the real world, walking on sidewalks, crossing streets, observing how other humans drive cars, and being passengers in cars before we ever try driving ourselves. Not to mention, some humans definitely don't learn to drive in a matter of hours; they need weeks if not months.
The US DoT estimated $340 billion in traffic accident damage and just over 42,000 lives lost in 2021. I think we accept that largely because it is diffused across lots of people. Self-driving cars would centralize some of this liability, so they will have to be an order of magnitude safer before they are accepted as a replacement for our current system. And the existing "system" of human brains has millions of years of R&D behind it; I think anyone in the industry should be accepting any unfair advantage we can get in the name of safety.
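For scale, a back-of-envelope version of that bar, assuming roughly 3.2 trillion vehicle-miles traveled in the US in 2021 (the commonly cited figure):

```python
deaths_2021 = 42_000    # fatalities, as cited above
vmt_2021 = 3.2e12       # assumed: US vehicle-miles traveled in 2021

human_rate = deaths_2021 / vmt_2021 * 1e8    # deaths per 100M miles
print(f"human baseline:   ~{human_rate:.2f} deaths per 100M miles")
print(f"10x-safer target: ~{human_rate / 10:.2f} deaths per 100M miles")
```

Rates that low are also why demonstrating the improvement is so hard: at roughly one fatality per 100 million miles, a fleet needs billions of miles of driving before the statistics mean anything.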
After all, does anyone care if the plane they are in is flapping its wings?
Plus the knowledge and intuition that, for example:

Firetruck >_____________< Fire Hydrant

Something that looks flat but is actually a fire hose means you should stop instead of driving over it. This was actually an issue in Phoenix(?).
A homeless lady was killed by a self-driving car. She was walking a loaded bicycle, at 2am, on a road that doesn't usually have pedestrians. The self-driving car couldn't decide if she was a bike or a pedestrian, so it didn't brake, and the backup human driver had zoned out because the car had made the correct decision 99.99% of the time.
A self-driving car is like a jigsaw puzzle - there's one way to do it right, and infinite ways for it to be completely or subtly wrong.
"What do you see" and "What does it mean"
Humans and cameras have the same "What do you see" input, but the "What does it mean" category hasn't been conquered by cars.
If a human screws up while driving and they end up in a wheelchair, they have no one but themselves to blame, but if software puts someone in a wheelchair who otherwise wouldn't have been, that person isn't going to care about "statistically better". They shouldn't care, either. If carmakers want to take control away from a person, they had better do at least as well as the person would, 100% of the time.
It's not as if pedestrians are any safer. A self-driving car that hits a person because it gets confused where a human wouldn't have is just as bad. When lives are on the line, taking control from the human requires that the machine be at least as good as the human would be in every situation, or we're better off leaving humans in control with machines augmenting their abilities.
It's great that we have cars that can warn us when there's something in our blind spot, something behind us while we back up, or when we drift out of our lane. Until the software can perform as well as we can, any more than that is a liability.
> the machine be at least as good as the human would be in every situation, or we're better off leaving humans in control with machines augmenting their abilities.
That isn't quite right. There can be situations where the human is better, so long as in the majority of situations the machine is better. This is why I said it is a statistical question. I'm not demanding perfection; I'm asking for better than humans overall. Of course, if the machine knows it is in a situation where it isn't as good, it is fine to hand control back to the human.
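A minimal sketch of that statistical argument, with entirely made-up per-situation risk numbers:

```python
# Probability of a bad outcome per situation; the numbers are invented
# purely to illustrate the structure of the argument.
situations = {
    "highway_clear": {"freq": 0.70, "machine": 1e-7, "human": 5e-7},
    "urban_complex": {"freq": 0.25, "machine": 4e-7, "human": 3e-7},
    "snow_whiteout": {"freq": 0.05, "machine": 9e-6, "human": 2e-6},
}

human_only = sum(s["freq"] * s["human"] for s in situations.values())
machine_only = sum(s["freq"] * s["machine"] for s in situations.values())
# Machine hands control back whenever it knows the human is better:
hybrid = sum(s["freq"] * min(s["machine"], s["human"]) for s in situations.values())

print(f"human only:   {human_only:.2e}")
print(f"machine only: {machine_only:.2e}")  # worse than the human overall here
print(f"hybrid:       {hybrid:.2e}")        # best, despite losing 2 of 3 situations
```

With these numbers the machine alone is actually worse than the human overall, but the hybrid that hands off in its weak situations beats both.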
We use a lot more than our eyes to drive, and people aren't trusted to start driving until they're in their teens - so closer to 15 years of training. And we're still not all that good at it.