If the shuffle is truly random, eventually you will find that the list is sorted and the bogosort will terminate. If your shuffle is not truly random, it may never terminate.
Given a uniform distribution over k discrete values (in this case permutations of the list), the probability that the next sample equals any particular value is 1/k. This is not the same as saying that if you take k samples from the distribution, you are guaranteed to receive exactly one of each value. For any finite number n of samples, no matter how large, the probability of having seen a given value tends towards 1 as n grows, but it only approaches it asymptotically. While the probability of never seeing some particular value becomes vanishingly small once n greatly exceeds the number of possible values, it never hits zero. So even with a genuinely random source of list permutations, we cannot place a finite upper bound on a time in which bogosort is guaranteed to terminate.
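To make that concrete, here's a quick sketch (plain Python; it assumes a uniform shuffle over the k = n! permutations of a list of n distinct elements) of the probability that n shuffles all miss the sorted order:

    import math

    def p_not_sorted_yet(list_len, shuffles):
        # Number of distinct permutations (assumes distinct elements).
        k = math.factorial(list_len)
        # Each independent uniform shuffle misses the sorted
        # permutation with probability 1 - 1/k.
        return (1 - 1 / k) ** shuffles

    # Decays towards 0 as the number of shuffles grows, but never reaches it:
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, p_not_sorted_yet(5, n))

The printed values shrink geometrically, yet every one of them is strictly positive, which is exactly why no finite bound works.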
It gets even worse with pseudorandom number generators — these display cyclical behavior over time (any good one will take quite a while to show it, but they all will), and you might have picked a seed for your PRNG that generates a cycle that never includes the sorted permutation.
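To illustrate with a deliberately terrible generator (a toy LCG with only 16 states, nothing you'd ever use in practice): a Fisher-Yates shuffle driven by it can only ever consume draws from the generator's short cycle, so some orderings are simply unreachable from a given seed:

    from itertools import permutations

    def lcg(seed, a=5, c=3, m=16):
        # A toy linear congruential generator with a full cycle of length 16.
        state = seed
        while True:
            state = (a * state + c) % m
            yield state

    def fisher_yates(items, rng):
        # Shuffle a copy of items, drawing indices from rng.
        out = list(items)
        for i in range(len(out) - 1, 0, -1):
            j = next(rng) % (i + 1)  # crude index reduction, fine for a demo
            out[i], out[j] = out[j], out[i]
        return out

    seen = set()
    rng = lcg(seed=1)
    for _ in range(1000):  # far more shuffles than the 16-state cycle
        seen.add(tuple(fisher_yates([3, 1, 2], rng)))

    print(len(seen), "of", len(set(permutations([3, 1, 2]))), "orderings reachable")
    print("sorted ordering reachable:", (1, 2, 3) in seen)

With this particular generator and seed, only 3 of the 6 orderings ever appear, and (1, 2, 3) is not among them, so a bogosort of [3, 1, 2] built on it would loop forever.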
So no, bogosort is not guaranteed to terminate. In practice, it has a reasonable chance of terminating for most inputs, and it will do so after an expected (not guaranteed!) number of shuffles in O(n!) for a list of length n.
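For reference, a minimal sketch of the algorithm under discussion (plain Python, using the standard library's random.shuffle):

    import random

    def is_sorted(items):
        # O(n) check for non-decreasing order.
        return all(items[i] <= items[i + 1] for i in range(len(items) - 1))

    def bogosort(items):
        # Shuffle in place until sorted: expected ~n! shuffles for
        # distinct elements, but no finite worst-case bound.
        while not is_sorted(items):
            random.shuffle(items)
        return items

    print(bogosort([3, 1, 2]))  # terminates quickly for a tiny list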
Actually, this is one of the first things you learn in a probability theory course, and the probability that bogosort completes in finite time is 1.
Now of course, there are infinitely many conceivable scenarios where bogosort never completes, but it doesn't matter. The theoretical foundation (Lebesgue measure theory) needed to make this mathematically rigorous is a bit involved. An example to illustrate:
Assume you're looking at a 2-dimensional space (over the real numbers) and you want to measure the area of subsets of it, e.g. the area of a square defined as:
{ (x,y) \in \R^2 | 0 <= x <= 1 and 0 <= y <= 1 }
Now the area of that is 1. Then carve out the y-axis like this:
{ (x,y) \in \R^2 | 0 < x <= 1 and 0 <= y <= 1 }
It's obviously a smaller set than the square above, but the set that's missing doesn't have any area (it's a line with no width). So it's sane to assume that the area of the new subset is still 1.
[btw: Lebesgue measure theory throws in a few more simple assumptions and then derives a very nice theory for computing areas and, by extension, integrals of functions.]
Now probabilities work the same way: instead of points in the plane you have different runs of bogosort, and the measure of the "area" of all events together is defined to be 1. But you can still carve out entire "lines" (i.e. sets containing even infinitely many different runs of bogosort) that have no influence on the value of your total "area", i.e. the probability.
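To tie that back to bogosort without the full machinery, the elementary version (assuming independent uniform shuffles over k permutations, at least one of which is sorted) is just a limit:

P(not done after n shuffles) = (1 - 1/k)^n

P(never terminates) = \lim_{n \to \infty} (1 - 1/k)^n = 0

So bogosort terminates with probability 1 ("almost surely"); the non-terminating runs form exactly such a measure-zero set.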
To conclude, your /in practice/ actually holds up in theory, too ;-)
Edit: Lebesgue measure is actually just the application of this to real spaces; the fundamental framework that also applies to probabilities is simply called measure theory. Mixed that up a bit.
If I understand that correctly, we're dealing with a case analogous to the coin-flipping scenario: an endless run of tails is conceivable, but a fair coin flipped forever comes up heads at some point with probability 1.
And since, as you admit, there are infinitely many scenarios in which a bogosort never completes, we cannot place a finite upper bound on its running time for all cases, which is what complexity theory's big-O notation is all about. As I stated, the average-case complexity for bogosort is in O(n!), but the worst-case complexity isn't bounded by any function.