I think the readability cost of vectorized numpy is often ignored (and readability is one of the main strengths of Python, in my opinion). I could not resist rewriting the example from section 2.2 in Julia:
using Random, BenchmarkTools
seq = rand(0:2,10_000); sub = rand(0:2,4);
# translation of the "readable" but slow python code
function_1(seq, sub) = [i for i in 1:(length(seq)-length(sub)+1) if view(seq,i:i+length(sub)-1) == sub];
@btime function_1(seq, sub);
# 93.858 μs (5 allocations: 1.98 KiB)
which is more than twice as fast as the vectorized (and quite unreadable) Python:
%timeit function_2(seq, sub)
215 µs ± 1.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
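For context, since the code from section 2.2 isn't reproduced here, the two Python versions might look roughly like this (a sketch; the exact implementations behind function_1 and function_2 are my assumptions):

```python
import numpy as np

def function_1(seq, sub):
    # readable but slow: an explicit loop over every window
    n, m = len(seq), len(sub)
    return [i for i in range(n - m + 1) if seq[i:i + m] == sub]

def function_2(seq, sub):
    # vectorized: build an (n-m+1, m) view of windows, compare row-wise
    seq, sub = np.asarray(seq), np.asarray(sub)
    windows = np.lib.stride_tricks.sliding_window_view(seq, len(sub))
    return np.nonzero((windows == sub).all(axis=1))[0].tolist()

seq = [0, 1, 2, 0, 1, 2, 0]
sub = [0, 1, 2]
print(function_1(seq, sub))  # [0, 3]
print(function_2(seq, sub))  # [0, 3]
```

The vectorized version trades the obvious loop for stride tricks and axis bookkeeping, which is exactly the readability cost being discussed.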
Having a JIT (like Julia, Numba, ...) has real potential to improve code readability and reduce the need for vectorized numpy operations. In some cases it can even be faster, due to better memory utilization and to computing more directly what you need.
I think it's helpful to keep in mind that Python is a general-purpose language used in many domains besides data analysis (no matter which side of the walrus operator debate you are on).