If it doesn't work against a real-life implementation, e.g. a Docker test container, then I don't consider it a robust test.