I once spent ~2 hours explaining all of this (well maybe not all of it) to a friend of mine who was studying sound design at the time. He didn't believe me.
I even transcoded some 24/192 FLAC Pink Floyd I had lying around and made him do a double blind test to show him that he'd prefer the slightly louder song every time, even if the louder song was 192kbps vs the FLAC. He did. He still doesn't believe me.
He still thinks he can hear the difference between FLAC and MP3 to this day. He works as a sound engineer now.
I don't think any amount of reasoning will make some people change their minds. Some people buy $500 wooden knobs to make their volume pots sound better. (or was that a hoax? I can't tell anymore)
The 100MB difference is not just due to the audio TOC being smaller than the ISO9660 or UDF file system metadata. It's also because of differences in error correction. I don't have the spec on hand, but I recall from when I was investigating this that CD-ROMs use more bits for error correction than audio CDs. That's why you can fit more audio data than "filesystem data" on a CD-R. Reading (ripping, digitally) an audio CD will likely result in a different digital audio file every time, since the error correction is not that good for audio - but good enough.
I read into this when I was wondering why my CD-DA extracted .wavs came out with a different checksum every time. Vibration is one of the factors that would make the same audio CD, read with the same CD player, produce different digital signals some of the time or even every time.
CD-ROMs however, which store digital data, need better correction - you definitely don't want a bitflip in your .exe, while a minor amplitude diff (an uncorrected bitflip in the low-order bits of a 16-bit PCM sample) is no biggie.
So… I'm not saying that the people using CD mats are informed (or have tested whether the mat makes a difference, or would even know how to go about testing this, scientifically), but there's more to it than what I originally thought - which was "it's digital so it's never degraded". I wouldn't have known without checking the md5sum of my .wav, though.
Uh, no. Bit-perfect ripping is trivial and routine, and tools like the AccurateRip DB (which has checksums for around three million different titles you can use to verify the checksums on your own rips) and the CUEtools database (which has recovery records you can use to correct bit errors on your own rips) prove it. I routinely get bit-accurate single-pass high-speed rips--no "paranoid" settings or re-reads--of discs dating back thirty years or more, and so do hundreds of thousands of other people. If you get different checksums on successive rips of the same CD, either the disc is damaged or the drive you're using is failing.
Oh sure, your rips may be perfect at the bit level, but how do you know that they're free of sub-bit quantization that isn't detectable by electronic circuits but can be heard by the human ear?
This sub-bit jitter and interference can travel along with a digital file and sneak right past your ordinary bit-level error detection and correction, no matter how lossless you make it. That's because these errors aren't visible in the bits. They occur at a deeper and more subtle level, in between the bits.
Even if you prove mathematically that two files contain the exact same bits, you can't prove that the human ear won't hear any difference, can you?
Ah, well, the human ear is a much more finely tuned instrument than your decoders and players. Think of the feelings you get when you hear the ocean waves, the birds sing, a thunderclap!
Can you turn this into mere "bits"? Of course not!
That's why it is so important to protect against sub-bit quantization errors, and this can only be done with proper interconnects. Ordinary cables allow the bits to travel willy-nilly until they jam up against each other creating a brittle, edgy soundstage. Quality interconnects are tuned, aligned, and harmonically shielded to keep those precious bits - and the all-important spaces between them - in a smooth flow.
And then, we can hear all of the things that make us human.
Interesting. I'll have to check those projects out. I have the same problem as the GP -- I have a script that rips CDs, taking multiple reads until it gets two bit-for-bit identical copies. And just about every time at least one track is silently "corrupted."
(I put the scare quotes on because I haven't actually bothered to check if there is an audible difference. But it does confirm the GP's experience.)
CD-audio "Red Book" data does have error correction (Cross-Interleaved Reed-Solomon Code). Whether you get a bit-perfect audio rip depends on how much error correction and retrying you do.
I remember experimenting with writing a CD ripping program in the 90s, using Windows APIs, and I found, like you, that I got different data each time. But modern rippers such as EAC do this stuff much better and will for the most part give you bit-perfect rips.
That mat does nothing. And if you read that linked page, you will see that he claims it drastically improves audio quality (bass, etc.), which is pure nonsense.
> Vibration is one of the factors that would make the same audio CD, read with the same CD player, produce different digital signals some of the time or even every time.
Err, no.
That makes sense when you have an analog version of the audio picked up by an analog transducer (i.e. a vinyl record) but makes no sense with an isochronous stream of quantized samples.
I suppose a vibration could cause a small phase shift in when the sample physically appears under the laser, but since the D->A conversion is clocked by a PLL it is irrelevant.
If you have extreme warping or shaking (e.g. fling your discman onto the floor or stick your finger on the disk while it's spinning) then a sample might not appear at all, but that's something different than what you are talking about.
I suppose it's theoretically possible that some extreme warping or vibration could cause a bit flip, but that's what the ECC is for.
"[…] The change in height between pits and lands results in a difference in the way the light is reflected. By measuring the intensity change with a photodiode […]"
I'm no signals expert. Are you saying that there is no quantization in that intensity change measurement?
Regardless of quantization, maybe you're right on vibration not being a major source of errors (I know little about electronics and PLLs).
But then, what are the error sources that made the engineers put an extra 276 bytes of Reed-Solomon error correction per 2352-byte sector on a mode-1 data CD-ROM (vs none on an audio CD, which just has the frame ECC and nothing extra)? See https://en.wikipedia.org/wiki/CD-ROM#CD-ROM_format .
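For what it's worth, the ~100MB gap mentioned upthread falls straight out of that sector layout. A back-of-the-envelope sketch (assuming a 74-minute disc):

    sectors = 74 * 60 * 75        # 75 sectors per second
    audio_bytes = sectors * 2352  # audio CD: all 2352 bytes per sector are PCM samples
    mode1_bytes = sectors * 2048  # mode-1 data CD: only 2048 user bytes per sector
    # the other 304 bytes per sector go to sync, header, EDC and the extra
    # 276 bytes of Reed-Solomon ECC
    print(audio_bytes / 1e6, mode1_bytes / 1e6)  # ~783 MB vs ~682 MB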
There was a day when the clock going in to the D/A converter could be affected by the bitstream coming off the CD. Those days are long gone I'm sure. Everything is buffered in RAM, overclocked, and digitally processed before it hits the D/A.
"Oh, and virtually no PC on earth has that kind of I/O throughput; a Sun Enterprise server might, but a PC does not. Most don't come within a factor of five, assuming perfect realtime behavior."
It's generally the lack of synchronization and positioning information compared to data CDs that gets you. In particular, on many older drives you can't reliably start the rip at the same place each time, so even if all the corrections and fixups work perfectly and you get a bit-exact rip (which isn't hard) you still won't get the same file twice.
> I read into this when I was wondering why my CD-DA extracted .wavs came out with a different checksum every time.
You sure there isn't something in the wavs like a creation date field that would always cause the checksum to be different? That would make way more sense than "vibration"....
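One quick way to rule that out is to hash only the PCM payload and skip the RIFF header and metadata chunks. A sketch (Python; 'rip1.wav' and 'rip2.wav' are just placeholder names):

    import hashlib, wave

    def pcm_md5(path):
        # hash only the sample frames from the data chunk, not the header
        # or any LIST/INFO metadata
        with wave.open(path, 'rb') as w:
            return hashlib.md5(w.readframes(w.getnframes())).hexdigest()

    print(pcm_md5('rip1.wav') == pcm_md5('rip2.wav'))

If the PCM hashes still differ, it's the audio data itself (offsets, uncorrected errors), not the file metadata.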
FYI the difference is almost certainly due to the seeks not being sample-accurate, so your rips are bit identical for each sample, but you are starting in very slightly different places. Either that or you have a really broken CD-ROM drive (which is also possible).
>Self-styled audiophiles are, by and large, idiots with way too much money plagued by magical thinking.
I agree on the "plagued by magical thinking" part, but not all these people are idiots. Some of them are quite intelligent, in fact. I think they just want to be "in the know", and are able to suspend their normal skepticism in order to belong.
One of the smartest and most productive programmers I ever met was taken in by this nonsense. He replaced all the metal bolts in his power supply with teflon because the metal bolts disrupt the magnetic field around the transformer and you can hear that, maaaan!
He did have a nice sounding system, for which he spent about $10k more than one that would have sounded the same.
Not to be picky, but it bugs me when people talk about two different things and don't understand each other.
Intelligence is not a linear value that can be compared like "person1 intelligence > person2 intelligence". With both of these terms there is always an implied, unstated area of intelligence.
By "idiots" he meant "a small amount of (or incorrect) knowledge in the area of audio quality and human hearing", and by "intelligent" you mean "a large amount of knowledge and skill in the area of writing computer code".
If you're going to be "picky", please be picky about something you understand. A "small amount of/incorrect knowledge" is ignorance, not a lack of intelligence. When you label someone an idiot you're not talking about his lack of knowledge. You're talking about his intelligence.
Now, if by "idiots" he means "ignorant people" then he's using the word incorrectly. But there's no actual indication that's what he meant. At some point you just have to assume people mean what they say.
And despite what people want to believe, the last fifty years of psychometrics research indicates there really is such a thing as "basic intelligence" (which they call "g"), and people with more of it do better on a wide range of intellectual tasks. So you really can say "person1 intelligence > person2 intelligence".
What you are referring to as "basic intelligence" is actually a combination of neuroplasticity and general knowledge. Neuroplasticity is the speed at which the brain can learn new things, but even then you can't say "person1 neuroplasticity > person2 neuroplasticity". That is because the brain is composed of many parts that can have different plasticity. Also, neuroplasticity (i.e. "intelligence") is not static and can even change over time depending on which parts of the brain are most active.
James Randi has a speech (you can probably find it on Youtube) about how it is easier to fool smart people than average people. Smart people think they can't be fooled.
The most generous approach to audiophiles is to allow for a placebo effect. i.e. they get greater enjoyment from listening to what they believe is a perfect sound system, regardless of whether the gold-plated cables actually do anything.
> not all these people are idiots. Some of them are quite intelligent
You can be very intelligent in one domain and be a complete idiot in every other domain. The result is that, unless we're talking about that one domain, they're an idiot.
I would refer to that as knowledgeable and ignorant rather than intelligent and idiotic. It doesn't really make sense to say that Garry Kasparov is an idiot regarding the construction of log cabins. Rather, he is an intelligent person who is ignorant of the construction of log cabins.
But I suspect the metal ones are better. More magnetic shielding
So, IF your audio system uses a linear power supply (and it should) AND it is badly filtered (it should have good filtering), you can hear the 60Hz/50Hz hum from the power network (assuming it's not creeping into your system through other means as well - most likely it is)
Bose and Beats* are, by every /objective/ measure, shitty products.
Subjectively, you might like them, but the faithfulness of audio reproduction is not a subjective matter. You can play a tone and measure how well that tone is actually played back.
You can then also objectively compare things that produce that playback quality at various price points and figure out if they're priced competitively.
There is plenty of fanboyism in high end audio, but that's not why they say Bose and Beats are shitty. It's because Bose and Beats ARE shitty.
*The Solo 2 Beats actually measure very well. They're even competitively priced... with other overpriced fashion statement headphones. They're still overpriced vs. headphones that are just meant to play music well.
Are Beats shitty or just expensive? How little would I have to pay to get the same quality?
I am finding it hard to believe that they are actually shitty, while I find it very easy to believe that they are way overpriced.
I have never listened to Beats headphones but I imagine they have a lot of bass-boost (based on absolutely nothing), but that is not the same as shitty.
So, we have to define, in your opinion, what would make a pair of headphones shitty.
If you are going to reduce them to the basest level of what the purpose for a speaker or headphone is, to reproduce the input sound, then yes, they are shitty, because they are not good at that.
From a purely objective standpoint, you are going to have to judge them based on that. Why would you want the speakers or headphones to make a different sound than what the signal is?
If you want to move away from an objective measurement of what makes a headphone good or not to something that's purely subjective (i.e. 'I like how they sound'), it's impossible to answer that question.
The Solo2 are a pair of Beats headphones that actually measure really well - they're good at the base purpose of a transducer. But they're $250. You could buy a pair of Sony MDR-7506 that measure similarly (IIRC, a bit better, even) for $85.
>If you are going to reduce them to the basest level of what the purpose for a speaker or headphone is, to reproduce the input sound, then yes, they are shitty, because they are not good at that.
Though I don't personally like the cold, bass-heavy sound of Beats, I don't really get how you could know this, because most people have no idea what a piece of music should sound like. They know how they think it should sound, they know how they like it to sound, but very few know how it should sound. The only real exception to this is music with "real" instruments like pianos, whose sound is familiar to enough people that their reproduction can be reliably determined. Even then, however, unless you know the piece well, it's unlikely most of us are in a good position to make a judgement about the speaker's quality.
So what factors are you using to determine if the sound is reproduced correctly?
This is a pretty scientific matter - when I say "the purpose is to reproduce the input sound", we can tell exactly what is supposed to be reproduced, and we can tell exactly how capable the speaker is of reproducing it.
Some exceptions have to be made due to how having headphones on your head causes the sound to change, but again, these are pretty much known quantities - to get the equivalent of a flat response from a speaker, you will see change X in bass response, change Y in treble response, etc for headphones.
It's not a question of esoteric "The artist and recording engineer meant for this to be played on Kef blades powered by a Cary tube pre-amp feeding into a Mcintosh amp setup using a rail to rail ladder DAC", but a "We know how frequency response should look when measuring equipment and if it doesn't look like that then the sound you are getting out of it is different than the source material"
> This is a pretty scientific matter - when I say "the purpose is to reproduce the input sound", we can tell exactly what is supposed to be reproduced, and we can tell exactly how capable the speaker is of reproducing it.
You are assuming the song was mixed by someone wearing headphones that perfectly reproduce the input sound.
Suppose the person who mixed a song was using beats headphones or other headphones that audiophiles consider inferior but that they know the majority of people use to listen to music. Wouldn't that then mean Beats headphones actually provide the listener with the actual, intended experience?
So, headphone use in studios is not generally for creating the final mix. Monitor speakers are used nearly exclusively in professional studios as what you are mixing for. Headphones have multiple places in the production process where they are used, but they're not the final target.
There are a few reasons for this. The most pragmatic is that doing so will produce the track that sounds the best on the widest variety of setups - EQed or not. There's also not any single headphone out there that is used so predominantly that it would make sense to cater to it specifically. The closest might be Apple earbuds, but people using those probably aren't too concerned about sound quality anyway, so it doesn't make sense to mix with those in mind either.
From a theoretical standpoint, you're not necessarily wrong, but it's just not how things currently work, and there's not really any reason why it ever would work that way in a professional studio.
I make no claim as to what the people making music exclusively in their bedroom are doing, though.
Interesting. I guess the main concept I'm exploring is the idea that if you don't control for the sound quality the person mixing it (or more importantly, the person approving the mix) was hearing, then it's hard to make any claims about how the sound was "meant to be heard".
Given that whenever I stand near someone on a train wearing them I can hear a fair amount of their music (not anywhere near as bad as Apple earbuds though), I assume they can't be that great - that or the listener has very bad hearing.
I have a set of Sennheiser HD 202 which don't have anywhere near the same leakage and cost £35. I haven't tried Beats so can't say much about audio quality, but in my experience high leakage usually means that the audio is poor too. It also means you will listen to music louder to compensate, which leads to more distortion.
> I am finding it hard to believe that they are actually shitty
In that case the marketing team have done a good job :-)
>Given that whenever I stand near someone on a train wearing them I can hear a fair amount of their music (not anywhere near as bad as Apple earbuds though), I assume they can't be that great - that or the listener has very bad hearing.
This makes the mistaken assumption that isolation and good sound are related, which -- as open headphones and speakers can attest -- is not true. The goal of a speaker or headphone is to reproduce music faithfully. Unless you are familiar with the music's origin or it has real instruments whose sounds you can easily identify, it's impossible for most people to tell if the music is reproduced "faithfully". So there are a couple of general rules that most "audiophiles" will consider when dealing with volume:
1. Music played at louder volumes generally sounds better than that at lower volumes. You can hear more of what you are intended to hear.
2. Music often goes up and down in volume, so you want to hear the broadest range of volume.
3. The best listening devices both allow high volumes without clipping and low volumes with clarity.
The point is, just because you can hear it, doesn't mean they are bad headphones.
It also doesn't mean they are good headphones or that the people aren't inconsiderate. It simply means that "sound leakage" isn't really a decent criterion unless it's something that's important to you.
Leakage is sometimes intended, so it's not necessarily an indicator of quality. See the HD800s. You'll hear them in any open-plan office, for sure. There's no attempt to keep the music from leaking; their only priority is sound quality (which is, at this price, a matter of taste and preference).
Yeah, Dr. Dre is on record saying that he's not an audio engineer but he knows what makes hip-hop music sound good. So he never claimed they had "flat response" or anything.
In response to your and your parent’s blanket claims
> Bose and beats* are by, every /objective/ measure, shitty products.
> God forbid you ever consider buying a Bose or Beats product.
If you need faithful audio reproduction, start with the room. There are reasons for buying a portable Bluetooth-enabled speaker, and also reasons one may consider specifically Bose SoundLink. Sound quality, in this sense, is not among them.
As far as I've tried NC headphones, nothing comes even close to what Bose offers with the QC25; any other brand I've tried cancels out less noise than the Bose. Sound quality might not be the best, but the intended environment is the limiting factor anyway, and they do a great job of dealing with environmental noise.
The biggest problem "audiophiles" don't seem to get is that accurate reproduction is not the end goal of music. Enjoyment is. Audiophiles have convinced themselves to find enjoyment from accurate reproduction, and that's OK. But the majority of the world does not see it that way.
I've got excellent bookshelf speakers that were cheaper than the equivalent from Bose, but the reviews and tests showed them to be way better.
My (small!) speakers end up producing way too much bass for the room they're in, in fact, and I use Foobar 2000 with the "MathAudio Room EQ" plug-in to get a flatter speaker response from them. But their problem isn't that they can't produce bass notes.
Whoosh ... I think my comment went over some people's heads. A "one note bass" isn't a good thing, technically. I did not say that Bose is great at making speakers with excellent bass.
Those are low-budget audiophile products, ie they are still marketed mainly to boost the ego of purchasers.
There are three brands of headphones that pros use: Beyerdynamic (typically DT-100 or DT-770), Sennheiser (typically HD-5/650) and Sony (typically MDR-7506/9). Beyerdynamics have somewhat better isolation so they're more popular in music studios, Sonys are more comfortable when you have to wear them all day so they're more popular on film sets; I favor the 7506 and am on my 4th or 5th pair. Some people love Sennheisers but I personally don't care for the ergonomics.
They're not beautiful, lightweight, or fashionable, but they're a lot nicer to listen to - which is why one or other of them was almost certainly used at the recording stage. If it was good enough for the people who made the recording, it's good enough for you. Also, you'll save money compared to the 'quality' consumer brands.
There are a few more brands and models that pros use. The AKG K240 has been in use for decades, and I think they're quite charming. Audio Technica has made a lot of in-roads into mid-level pro studios in particular (i.e. not million dollar rooms, but still quite good studios that make good records regularly). I've also seen Shure SRH series headphones in professional contexts.
But, your statements about "pro" headphones are accurate. They aren't the nicest looking, but they are really good, and I always recommend a good pro set of headphones over the marketed crap from Beats, Bose, Monster, etc. $250 will buy a lot of headphone quality from one of the pro audio manufacturers.
I have to try the Beyer some time. Agreed on the Sennheiser ergonomics - and the price.
I can't stand the Sonys. They're specifically designed for tracking & editing - all that screech points up Bad Things Happening. But they fatigue my ears.
Laugh now, but I landed on the Koss KTXPRO1 (which are $20 to $40) and have basically stopped looking :) Most comfortable thing I've ever used and they're actually pretty flat, except for a little bass bump and a smidgen of upper mid. I think I'm on my tenth pair. They're a bit too lightweight - if you catch the cable on something they'll fly off your head.
And yeah - I bet the $500 vs $20 figures into my perception of things.
That's nothing laugh-worthy. Audio equipment faces massive diminishing returns. If you're looking for midrange sound you might as well pay as much attention to the people recommending sub-$50 headphones as to the ones talking of $150. For instance, the $30 Panasonic HTF600s is better-sounding than the $160 ATH-M50x frequently recommended as an entry-level audiophile headset. As for studio-quality headphones, Chinese Superlux/Takstar models are as good as the $200-300 range.
As a die hard fan of the M50x (and no qualifications whatsoever for judging headphones) I'll have to explore those Panasonics you mention. I haven't considered that company as a quality maker of audio gear since the portable CD player era. Did you do the test yourself, or are you going off of a review site? I typically rely on head-fi, but I'm always looking for a recommendation in this field.
From all the reviews I've seen, Bose noise-cancelling headphones are pretty much the best you can buy. Especially if you want earbuds (the QC20s). They're extremely expensive though. Do you (or anyone) have suggestions for alternatives?
For the price of Bose noise-cancelling headphones you can get headphones from the three brands the parent mentioned (throw in AKG for good measure) that sound better in 'lab conditions'. But if your discerning feature is 'noise-cancelling', i.e. headphones that sound excellent in noisy environments like trains, coffeeshops or open work environments I believe Bose is the king and will be as long as their patents are enforced.
I believe they're good as far as they go because Bose has the strongest patent portfolio in this area, but wearing any kind of noise canceling headphones immediately gives me the unpleasant sensation of having my eardrums sucked outwards. I'm not sure why; I think it's a side effect of the tiny latency inherent in the design. It's so unpleasant to me that I stopped paying attention to new products in that category so I'm a bad person to ask.
tomc1985: hope you see this - your account has been hellbanned for over two years (about 850 days, with one comment visible 270 days ago - not sure how that happened).
Your comments over those years don't seem bad at all - sometimes perhaps a little confrontational but not aggressively so. Perhaps HN could allow users above a certain karma threshold vote on [dead] posts, with those scores going towards a "repeal fund" - make decent comments over a certain period and get temporarily un-banned.
Bose QC15s and QC20s are the best active noise-cancelling headphones out there... but the problem is, they still have very mediocre sound, even though they do the noise-cancelling part well. They are also massively overpriced.
Sennheiser HD280 Pros have extremely good passive isolation, and will beat QC15s at a fraction of the price, in both isolation and audio quality.
So yes, Bose still loses when you look at the big picture. Bose is very good at marketing, they are not very good at making quality audio.
I use Klipsch X7i and they allow you to listen to audiobooks at a low volume setting in an underground train. Which, in Moscow, seems like perfection.
The sound is very clear, but not balanced. I'm neither an expert nor a musician, though.
There's a fair amount of pros that utilize Audeze and HiFiMan gear as well. Planar magnetics are popular.
Erik Larson is pretty vocal about his use of LCD-2s for mastering. Which... Honestly, I'm not generally a fan of his work, so it's not necessarily a ringing endorsement.
Also kind of surprised at your lack of mention of AKGs - they're another very popular brand for studio work.
Almost every studio used to have nothing but AKG K240 phones for monitors in the room. I haven't worked professionally directly in the field for years (I work in live sound occasionally now, and do interact with recording engineers occasionally), but I still see them discussed regularly enough online to assume they are still common. I love the look of them, and always have. To me, they are the definition of "studio headphone". (They aren't what I use in my home studio, as there are better phones, if you're willing to spend more money, but they are a good headphone for a good price.)
These Yamaha headphones are excellent: http://www.musiciansfriend.com/pro-audio/yamaha-rh3c-profess.... Durable, they collapse, and they sound great. I have another set of open phones with foam surrounds; the foam is not stable and collects gunk if used in a backpack. The Yamaha phones are very respectable replacements for the MDR unit and stay clean while taking up little space.
Slightly larger driver, slightly heavier, supposedly has a greatly-extended frequency range of 5Hz - 80kHz vs 10Hz - 20kHz in the 7506. Of course your typical D/A converter won't even render such low frequencies due to AC coupling, and if they were there you'd want to EQ them away pronto as they would eat all your headroom. While I continue to enjoy excellent high-frequency hearing even in my mid 40s (to my surprise), neither I nor anyone else needs a tweeter that goes to 80kHz.
The MDR-7509 and its successors the 7910 and 7920 have a lower impedance than the 7506 (24 vs 63 ohms) so if you plug them into the same sound source the higher-numbered models will be a bit louder - and as we all know, 'louder = better' for most people. This plus the larger driver is somewhat helpful for DJs, who work in very loud environments, but that's a fast track to hearing damage.
Why I like the 7506 so much: on film sets I give them to people to listen in and they say 'is it on? I don't hear anything.' Then I turn the volume down or make a small noise next to the boom microphone and jaws drop. Plugged into a quality microphone like a Schoeps, which has a very flat frequency response, it's like there's nothing there. I always have two pairs now because if one gets damaged I can't deal with other brands at all.
thanks for the info; I'm considering upgrading my office headphones (I've got some random $20 over-the-ear pair right now). $80 is certainly reasonable, and I prefer transparent speakers in general.
>God forbid you ever consider buying a Bose or Beats product.
This has little to do with the article. If you've got 100 bucks to spend on a pair of headphones, it's only fair to point out that with certain products you're not getting the best sound out of your money.
> God forbid you ever consider buying a Bose or Beats product.
Meh, they're okay, but there are better choices out there.
I've a pair of Sennheiser HD600 and it's one of those things that make you go "holy cow, all the hype is justified".
And no, I'm not one of those folks who think gold-plated cables make a difference. Right now I'm listening to MP3 Internet radio on a pair of cheap behind the neck street cans.
I have a pair of Sennheiser HD 280 Pros - they've lasted me about 7 years, an excellent set of headphones. I've used them to help critique production for lots of artists, and I know lots of artists who use them as a cheap pair of mixing headphones.
Work bought me a pair of HD 380 Pros, and I'm impressed by how much of an upgrade they are over the 280s - I can only imagine how good the other Sennheisers are.
The 280s are closed back. Great for isolation, for not letting ambient sound interfere with the music. It also changes the way the transducers work, a little bit.
The 600s are open back. Obviously there's no isolation, but the transducers work more freely. It's a bit easier to distinguish tiny sounds from a huge background.
I've both the 600 and the 280. Great phones both, in different ways.
A friend has had a set of these for ages. We found a difference between two source setups. A particular Sony DVD player sounded incredible - each note seemed perceptible in a 3D space [1]. A CD player he had, didn't. [2] We tried with Yamaha amp, without, different configs of widgets. That DVD player with nothing added was the best. He gave it to his sister and I haven't heard anything like that since.
I nearly went off on a tangent and bought an amp etc, but I'm happy with my much cheaper HD380's - great price/performance :) But those 600's are awesome.
[1] I've since learned it's called soundstage
[2] How would the source influence soundstage? Sounds irrational to me. Hey, one sounded better than the other and I don't know why.
And there are jobs created for engineers who design that equipment - simultaneously recycling more money in the economy and contributing to the statistics of STEM job prospects! Everybody wins!
Sure, you need to take externalities into account. One can imagine an audiophile who would rather spend $100K on a sound system that sustains 3 jobs at a niche sound-system-design business, instead of angel-investing that money in a growth company that would eventually create 300 jobs. But what if the audiophile's daughter is then inspired to build another growth company when she grows up, because she was intrigued by how much her dad would geek out about the electronics in her sound system? Nothing is clear, it's incredibly hard to quantify probabilities about any of these things. As long as the audiophile isn't neglecting responsibilities or breaking windows to obtain his sound system... that is, as long as negative externalities are not a foregone conclusion... we should let him enjoy his passion.
Certainly, there are valid moral arguments to allow, even encourage, this state of affairs. I was just pointing out that the economical argument that was offered is a broken window fallacy. Refuting an argument in favor of X is not an argument against X.
Wouldn't most of the digital cable costs be due to analog interference in the machines they are connecting and them acting like antennas and not that the signal is corrupted on the way?
Unless it's an insulator (like fiber optic cables), you are hooking both a 50 foot antenna and a digital transmission line to your box; if you want just the digital transmission line, you have to insulate the hell out of the antenna part of it.
On the other side of the spectrum, I went "near" an audiophile shop test booth and I was sucked in by the clarity and density of the sound in the air (this was a drum solo track). Some audiophileness is good.
I actually think you'll find many "audiophiles" enamored of the Sansa Clip Plus and Zip, which fall into that price range. Probably the most highly thought of MP3 players after the iPod classic 5.5.
Now that Android supports USB DACs, you can just get portable DAC+Amp combos that you can stick in your pocket with your phone. No need for a dedicated device anymore.
Far be it from me to defend them, but $350/m is about 3 times higher than Monster's most bullshitty bullshit cable. Even their "2000HD HyperSpeed HDMI cable" only had a $115/m MSRP in 5ft, falling to $32/m in 35ft: http://www.monsterproducts.com/Monster_Video_ISF_2000HD_Hype...
My great business idea was to produce a line of "organic" cables.
Our company would go to the remote places of the earth to hunt down copper dragons (as in DnD) and harvest their veins to make audio cables.
The "natural", "organic" copper has a warmth to the audio signals flowing through it that artificially produces cables just can't provide (they have harsher undertones).
Then we'd also have silver and gold cables, harvested from, you guessed it, silver and gold dragons.
I plan to disrupt the audiophile business (and crush you in passing) with my homeopathic cables.
As we melt the gold we mix in a few atoms of "rare earth" elements (rare == expensive == so good "they" don't want you to have access) which is then diluted by adding more melted gold until only the imprint of the rare earth atom remains.
The gold will then be hand drawn by virgins (in truth these will be strong, hairy, 50-yo virgins with dreadful hygiene and B.O., though strong enough to pull, but we need not add all that confusing stuff, we'll just say "virgins"). The wire will be lovingly laid into hand-made insulation made from organic pinniped leather.
I see a variety of future applications both in the home (connect your cable modem to your WiFi access point) and business (data centers). To quote Rony Abovitz, we'll soon be "the size of Apple".
dragoncopper, have the website default to some non-existent northern european language/font with a translation button (british flag). Burled wood with reds and yellows. will buy.
When someone spends money on something in a silly way, do you consider them to be an idiot? Would you make the same statement about someone who spends $200 on a bottle of scotch versus buying a $35 bottle?
Let the 'idiots' spend their money driving an industry that is combining the creation of electronics with functional art. I'm not sad to see a $40,000 DAC. I don't have to buy it, and it's cool to know someone built something of silly 'value'.
For example, look at this thing: https://www.naimaudio.com/statement It's absolutely silly, and the cost is outrageous. I'm happy that they built it though. It was actually built as part of the acquisition that Focal made of Naim. It seems that they allowed the engineers at Naim to go nuts once the company was acquired.
I like seeing silly things that people build. It doesn't make me sad that someone spends thousands on ridiculous items that from an engineering perspective don't make a difference.
You are mixing two things.
There is a difference between Intel charging you >$1K for a CPU that is 5% faster (in games) than an overclocked $300 one versus a scammer selling magical power cables (made by gypsy virgins in a Romanian monastery on top of the highest mountain, under a full moon) to some rich retards.
$40,000 DAC? Someone really went ahead and fabbed their own silicon (~$1mil for a low volume run)? Or did they maybe pick up a $100 (at most) part, put it in a shiny box and start looking for suckers?
Do you even realize what $40K gets you? We are talking military grade Agile^^key'hole/Rohde & Schwarz multi-gigahertz arbitrary waveform generators here, not some pitiful audio stuff.
HP was known for building generally good test equipment, including arbitrary waveform generators. OP is humorously referring to the fact that the test and measurement division was spun off first as Agilent, then Keysight (seriously?), and probably something else by tomorrow (marketing is furiously brainstorming new meaningless names; they only need to merge with Danaher and rename themselves Flukeronix for the circle of life to be complete).
It's one thing when you plop down a ton of cash for something where you embrace the "silliness" or whatever makes it special (e.g. buying exotic cars with monstrous engines to drive them in traffic). It's another, however, to believe something is objectively "better" because it cost more; e.g. spending money on Monster Cables thinking you can hear a difference.
In audio, you have people who love vinyl because they enjoy the distortion it makes, and that's perfectly fine; but you have others who somehow believe it sounds closer to the original, which is demonstrably ridiculous.
I completely get that as I personally prefer to collect Old Stock vacuum tubes for my listening purposes (http://imgur.com/a/INXVX). I just think folks should be left to their own devices to enjoy an avocation as they please.
Trying to defend something through completely subjective argument is silly, I've a hard time discussing 'objectively better' technology with audiophole folks, but if someone said 'I like this more' I really can't hope to debunk that through any sort of mathematical characterization of performance.
>I just think folks should be left to their own devices to enjoy an avocation as they please.
Only if they shut up and never tell other people that they should be listening at 24/192. However, the people being complained about here spout nonsense like that all of the time.
I think you'll find you have it backwards. I don't hear anyone here telling you that you should be listening at 24/192. Go look at all the comments and count them up. All I hear is people saying that you should be listening at 16/44, because it sounds exactly the same, or even sounds better, and if you think otherwise you're obviously an idiot, a stupid audiophool who spends $5000 on a power cable.
I sure know who I think should shut up. It's all those arm-chair-experts who don't even own any decent hifi gear. Why would they? It's all crap and my second hand ipod headphones beat it all hands down anyway. Right?
If you enjoy music (who doesn't) and are bending toward learning electronics I highly recommend projects like Twisted Pear kits and build yourself a DAC.
http://www.twistedpearaudio.com/landing.aspx
The other fun stuff is building your own Speaker kits, hook all this up with a Pi Music Box and you have yourself a kind of home made Sonos.
http://www.woutervanwijk.nl/pimusicbox/
I would make the argument that if some of the silliness were stated for what it was, gullible rich people would spend their money on something equally frivolous that did more to drive innovation.
That's rather a wide-ranging bit of character assassination. Some of us just enjoy listening to music on decent equipment (If you can buy it at your local big box store, it's probably not sufficiently "decent") properly setup (which doesn't mean expensively - just basic proper speaker placement and the like).
> Some of us just enjoy list[en]ing to music on decent equipment
The best definition of audiophile I've heard is somebody who listens to equipment, rather than music.
Of course your favourite Pink Floyd sounds better on a decent stereo rather than a clock-radio, but if somebody is forever chasing the proper "colour" for their speakers, or swapping amps for the perfect tone... they might be an audiophile.
>The best definition of audiophile I've heard is somebody who listens to equipment, rather than music.
Audiophiles are an easy target. There are a lot that do stupid shit like buy $5000 power cables, expensive risers to lift cables off the ground, etc. A lot are pretentious, even if they're not insane or dumb.
But that's a rather inflammatory, and in my opinion, unfair position to take. I would probably be considered an audiophile - I have put quite a bit of money into audio equipment. But I love music. I listen to it basically constantly. It's one of my primary sources of entertainment - and I don't just mean 'I have music on when I do other shit.'
Each week I spend probably 20 hours doing nothing but relaxing with a bit to drink and some music on. Not reading, not surfing the net, not doing anything but closing my eyes and enjoying the sound. I'm listening to the music.
At times, yes, when I have been testing out new equipment before deciding if I want to buy it I go through and I do blind ABX tests with level matching. In this case, yes, I am listening to equipment. But this is a very minor portion of my total listening time.
I know you're probably not being totally serious with the post, but I do think it's a bit unfair towards those of us that love music, but also have invested time and money into getting a setup that sounds better for increased enjoyment.
You've completely misunderstood the article. It's discussing differences in the PCM sampling rate, not the MP3 encoding bitrate. Or in other words, all these samples they compare are FLAC.
192 KHz has nothing to do with 192 kbps.
I'm one of those audio engineers who mistakenly thinks I can tell between an MP3 and a FLAC. Yet somehow I understand the difference between sample rate and encoding bitrate, and you do not.
All of this is beside the point. I'd much prefer a 24/44 sample to a 16/96 or 16/192. Bit depth has a much larger impact on the sound than sample rate.
Bit depth affects dynamic range, and that's it. The only thing a 24-bit sample can do better than a 16-bit sample is accurately reproduce the difference between very loud sounds and very quiet ones. That's all. For the vast, vast majority of music listeners, the difference will be insignificant, as they don't listen in an environment where a dynamic range of 144 dB can be used to anywhere near its full effect.
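For reference, the rule of thumb behind those numbers - the ideal dynamic range of N-bit PCM, ignoring dither and noise shaping - works out to about 6 dB per bit:

    import math

    def dynamic_range_db(bits):
        # ideal quantization dynamic range of N-bit PCM: 20*log10(2^N)
        return 20 * math.log10(2 ** bits)

    print(dynamic_range_db(16), dynamic_range_db(24))  # ~96 dB vs ~144 dB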
This is slightly incorrect. The combination of bit length and sampling rate determine both dynamic range and frequency fidelity. Although it's common to hear the two values used to represent these two separate physical measurements, it's just another case of explaining new tech (80s CDs) to old technical consumers (70s hi-fi types). You can measure a reduction in dynamic range by reducing either sampling rate or bit length.
You'll frequently find 1-bit A/Ds and D/A at > 5Mhz on high fidelity systems. That 1-bit signal is converted to/from a higher bitrate, lower sampling rate signal without loss of fidelity. If you're interested in looking at alternate bitrate encodings you should just look at the Super Audio CD format https://en.wikipedia.org/wiki/Super_Audio_CD
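To illustrate the 1-bit idea, here's a toy first-order delta-sigma modulator in numpy (a sketch of the principle only; real SACD/DSD converters use higher-order modulators and proper decimation filters):

    import numpy as np

    def delta_sigma_1bit(x):
        # first-order delta-sigma: turn samples in [-1, 1] into a +/-1 bitstream
        # whose low-frequency content tracks the input
        out = np.empty_like(x)
        acc, prev = 0.0, 0.0
        for i, s in enumerate(x):
            acc += s - prev
            out[i] = 1.0 if acc >= 0 else -1.0
            prev = out[i]
        return out

    fs = 64 * 44100                        # 64x oversampling, as on SACD
    t = np.arange(fs // 100) / fs          # 10 ms of signal
    x = 0.5 * np.sin(2 * np.pi * 1000 * t)
    bits = delta_sigma_1bit(x)
    decoded = np.convolve(bits, np.ones(64) / 64, mode='same')  # crude low-pass "decoder"
    print(np.corrcoef(decoded, x)[0, 1])   # high (close to 1): the bitstream carries the tone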
On your second point, I agree, we live in a noisy world and hearing 144dB of dynamic range would require serious isolation.
What most people should be able to hear with 90dB of dynamic range are the harmonics created by undersampling a high frequency signal. To quickly explain I'll use a 1-bit lower frequency scenario. Let's say we have a 2Khz sine wave and a 1-bit 5Khz sampling rate. The 2Khz signal is going to be represented by a different 2 samples every cycle. The result will be a signal that is no longer a sine wave and closer in frequency to 800Hz (wild approximation) than 2Khz. Low pass filters are used to keep those harmonics from being too pervasive but they still sneak into the signal near the high frequency range. Transpose this example to our current audio standard and you might realize that in order to accurately represent the high end we need a little more than 16bit 48Khz.
In your last example about the harmonics, I can see that being so in the most simple case, but surely that's a limitation of the playback system, not the storage medium. And isn't that addressed by the oversampling that is used almost universally in DACs now?
It's a limitation of the recording/storage medium and is addressed by oversampling and low pass filters in ADCs in acoustical recording. Most audio recording, processing and mixing is performed in 24 or 32 bit. Once the data is down sampled for distribution in 16bit 44.1Khz you run into the limitation again where you have fewer samples to represent higher frequencies. The only remedy is to attenuate those frequencies before downsampling. I'm unsure of the role oversampling plays in DACs so I can't speak to that.
My understanding is that oversampling is done in DACs so that digital filters can be applied without introducing the effects that you describe at the top end of the frequency range. Basically it interpolates to a higher sample rate so that there's more play in filter selection/application.
In terms of attenuating the high frequencies before downsampling - have I misinterpreted Nyquist? I thought that there was no loss in fidelity, right up to half the sampling rate.
I figured it had something to do with the application of digital filters. There are only two significant samples of a source at half the sampling rate, so it should be able to represent a frequency at exactly half the sampling rate but I can't see how it could accurately represent a frequency at just a few hz below 1/2 the sampling rate. Notice I'm using the word frequency. A sine wave with a frequency of 22,050hz encoded at 16 bit 44Khz is not going to look anything like a sine wave.
Right, 22,050hz looks exactly like a triangle wave on a computer screen. The thing is that a triangle wave is composed of a fundamental sine wave (22kHz), and a series of ascending odd harmonics above it. So after the filters nix everything above 22kHz it looks exactly like a sine wave on a scope.
So you're right that you lose information as you go higher in frequency, but there is also commensurately less need for information to recreate it precisely because the filters remove the detail anyway (and if not the filters, the human ear).
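You can check this numerically. A quick numpy/scipy sketch: a 21 kHz sine sampled at 44.1 kHz looks nothing like a sine if you just connect the dots, but band-limited (sinc) interpolation - which is what the reconstruction filter effectively does - gives it back. (The tone is chosen to fit a whole number of cycles in the window so the FFT-based resampler has no edge effects.)

    import numpy as np
    from scipy.signal import resample

    fs, f, up, n = 44100, 21000, 16, 4410             # 0.1 s of samples, 2100 whole cycles
    x = np.sin(2 * np.pi * f * np.arange(n) / fs)     # the "ragged" raw samples
    y = resample(x, n * up)                           # FFT-based band-limited interpolation
    ideal = np.sin(2 * np.pi * f * np.arange(n * up) / (fs * up))
    print(np.max(np.abs(y - ideal)))                  # round-off level: the sine is recovered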
Sure 22,050hz and 11,025hz get smoothed out into perfect sine waves and the human ear can't hear 22Khz anyway. But the Nyquist frequency isn't some magical threshold that you cross and suddenly everything is perfectly preserved. It's a folding frequency that determines where aliasing is going to occur, or rather where it's not going to occur. A 44Khz sampling rate is based roughly on western tuning (440Hz A) and makes no attempt to accurately capture sounds and frequencies that are not tuned to western music. As you move away from the folding frequency, there are frequencies in the human audible range that cannot be represented, so they're discarded or attenuated by anti-aliasing filters. As far as I'm concerned CD audio is outdated tech that most of the world just doesn't care enough to drop. It's ridiculous in a world of 5K retina screens that people can't see the value in higher resolution audio.
>It's ridiculous in a world of 5K retina screens that people can't see the value in higher resolution audio.
Not if they can't hear the value in higher resolution audio. For many people the only difference in HD audio over 16-bit 44.1 Khz is that the files are bigger. If someone can't hear the difference, it's no surprise that they don't care to move to a new format.
The screen analogy isn't perfect as most people can still readily tell the difference between an HD image and a significantly lower resolution one. (Though yeah, we're getting closer to pixel densities surpassing people's ability to resolve pixels as well, provided they're not putting their nose to the screen. It won't be too long now.)
There's a difference between not being able to hear and not knowing what to listen for. Listen to the highs on a well tuned hi-fi system and you can hear the difference between CD and SACD. Listen for the sound of a singer taking a breath or the slow ring of a cymbal. If you've never heard these things in person to begin with then you're at a disadvantage trying to hear how badly they are represented in recording technology from the 1980s.
Don't argue from a position of ignorance. Make friends with a recording engineer and have them play you a 32bit mix followed by a 16 bit mixdown.
Yes, and also remember that as you approach the Nyquist frequency, the ability to encode phase is lost. People always talk about sampling capturing different frequencies but they forget to think about phase.
Which gets back to your original point about both sample rate and bit depth contributing to the dynamic range. I'm very curious: can you give me some search keywords that would get me to the math behind that? Also, I've never heard the claim that the 44.1kHz rate biases toward 440Hz tuning, is there somewhere I can read more about that?
pulse code modulation, pulse density modulation...
44.1kHz/100 = 441Hz. But that's nonsense, in the same way that saying a signal at the Nyquist frequency can be accurately encoded is nonsense. @diroussle pointed out that you lose phase, but there's another consideration: sync. If your signal at Nyquist is not in sync with the sampling frequency, then it's going to be represented as a signal offset, an out-of-phase line.
I understand PCM, 1 bit DACs, et al. What I'd love to see are some equations relating the three quantities of bit rate, sampling frequency, and distortion. Turns out to be very hard to Google.
In any case, thanks for your patience, I'm glad to have cause to reconsider my position on this topic.
Yes. And 16 bit cannot by itself represent without distortion the full dynamic range of a lot of music. Most samples, most of the time, do not use a full 16 bits. This is why during CD mastering dithering is used.
Take it from me: when you master 24 bit stereo tracks and you don't dither, huge amounts of low level detail disappear. The detail in the quiets is there in 24 bit, and lost when it's truncated to 16 bits. Add the dithering, and you get increased noise, but the detail comes back.
One could suggest that with dithering 16 bits can represent it. But that's with a whole bunch of noise added to the signal. You can argue that noise is not audible, but it is _just_, and when mastering you can audition the different dither spectrums to find which dither least impacts the music.
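For anyone curious what that looks like mechanically, here's a toy numpy sketch of TPDF dither ahead of a 16-bit reduction (illustrative only; not how a mastering-grade ditherer or noise shaper is implemented):

    import numpy as np

    def to_16bit(x, dither=True):
        scale = 2 ** 15                    # 1 LSB = 1/32768 of full scale
        if dither:
            # TPDF dither: sum of two uniform randoms, roughly +/-1 LSB peak
            x = x + (np.random.uniform(-0.5, 0.5, x.size) +
                     np.random.uniform(-0.5, 0.5, x.size)) / scale
        return np.clip(np.round(x * scale), -scale, scale - 1).astype(np.int16)

    t = np.arange(44100) / 44100
    quiet = 10 ** (-100 / 20) * np.sin(2 * np.pi * 440 * t)  # -100 dBFS tone, ~0.3 LSB peak
    print(np.any(to_16bit(quiet, dither=False)))  # False: rounds to pure digital silence
    print(np.any(to_16bit(quiet, dither=True)))   # True: the tone survives, smeared into noise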
I certainly won't argue that 16-bit is just as good as 24-bit from an objective standpoint, as 24-bit is obviously superior, full stop. I'm just saying that for most listeners (everyone except those who listen at high levels in dedicated, treated listening rooms in very quiet environments) the difference will be inaudible almost all the time. Extremely low level detail doesn't really matter if it's lost in the >20 dB of natural noise in your room.
At that point the issue may become moot as other problems like standing waves, harmonic distortion, inaccurate speaker frequency response and so on creep in and affect music playback to a subjectively larger degree than '16-bit versus 24-bit does', IMO.
All that said, 24-bit is definitely the way to go since we might as well do it right even if only x percent of listeners will notice.
As an aside, thank you for being one of the conscientious 'good guys' in the studio. I collect music and wish I had a nickel for every sloppy recording I've heard.
Yes. I completely agree. From my perspective, even if one person in a hundred can hear a difference, then I'm going to pay attention. I don't want to boil everything down to meet the average. I think it's fine to release 16/44. I think well produced music sounds excellent in that format. It's one of the reasons the CD has done so well. It's just hifi enough to capture everything. And it's amazing to think this technology had its début in 1982!
But for so long, for those of us who want a higher quality (hearing it exactly as they would have heard in the studio during the production) there was nothing we could do. Willing to pay more for it, doesn't matter. You just can't get it. It's still that way.
What gripes me is the attitude of many, including this xiph article, that hi-res versions "make no sense", that "there is no point" and thus everyone should just be happy with what they've got, and that anyone protesting is an "audiophool" or believes in magic fairies or something. We all get lumped in with those people buying $3000 IEC power cables. For many people it's all black and white, there is no room for grey. You either think that 128kbps mp3s sound identical to the analogue master tape, or you are a fool spending $20,000 on magical stickers to increase the light speed in your CD player.
All I want is to be able to buy the mix and hear it as the engineer heard it in the studio. That would be nice. I know it's not for everyone, but it doesn't make me crazy.
As food for thought, have a read of what Rupert Neve said about Geoff Emerick's hearing ability (being able to discern a 3dB rise at 54kHz) here: http://poonshead.com/Reading/Articles.aspx
"The danger here is that the more qualified you are, the more you 'know' that something can't be true, so you don't believe it. Or you 'know' a design can't be done, so you don't try it."
What's the argument in favor of using extremely high sampling rates, though? Using 48 kHz instead of 44.1 seems reasonable (as in the Philips digital compact cassette that never really caught on), giving a little bit of headroom for wider frequency response, moving the filters a little higher or whatever, but I've seen D/A converters that use 384 kHz, and I just can't fathom what the point is... It smacks of the "if some is good, more must be better" mentality.
There's definitely nothing crazy about wanting to hear a recording with as much fidelity to the master as possible. Yeah, I do remember people saying that 128 kbps MP3 was "CD quality" in the early days of the format, and that was a laughable claim indeed. One would have to be pretty tin-eared to think 128 kbps was hi-fi, although I'd say there were valid use cases for it, at least back when portable music players had storage in the megabyte range instead of the gigabytes we have today.
So many of those audiophile tweaks are just outright scams, and a fool and his money are soon parted. I guess education is the only way to combat that.
As for Emerick's ability to hear anything at 54 kHz, much less discern a 3 dB difference there, well, I am really, really skeptical. I'm obviously not in a position to say it's impossible, but it strikes me as an outright superhuman ability that should have been tested scientifically.
I'm not sure there is a compelling reason to distribute final music pieces in 192kHz.
I can only speak from my own experiences, and I record and mix in 24/96, but for reasons that don't really relate to music distribution. When doing further processing, some plugins sound better with their algorithms taking 96k instead of 44k. Every plugin has been written with compromises. And I find I can push hi-res audio further in the digital domain before unpleasant artefacts arise.
It's very much like image processing. If you take a picture with a cheap, basic 1-megapixel camera and then play with the curves and sharpness, at a certain point smooth, graduated colour becomes "posterised". If you take the shot with a DSLR (with 12 bits per primary colour) then you can push the image a lot further before the posterisation occurs.
I have found the same occurs for audio. I can manipulate the sound with fewer artefacts when it's hires. The plugins sound more transparent and smoother. I tend not to go above 96kHz because this effect is achieved at 96, and 192 (to my ears) sounds no better and I'd just have bigger files and more CPU load from the plugins processing the extra data.
The bandwidth of 96kHz is just short of 50kHz, so if as an added benefit I satisfy the one-in-a-million Geoff Emericks, then all the better.
But then once the final mix is rendered and no more processing needs to be done, ie for distribution, then this hires advantage seems moot. Maybe there is still some advantage for people or devices that may post process the sound digitally in some way, like a digital equaliser in your playback device, or something like that. But then again, that device could always upsample before processing.
I tend to use 88.2kHz if the final destination is intended to be CD, and 96kHz otherwise (so there is less aliasing when sample rate converting down to 44.1kHz).
The reason I harp on about the bit depth is because in my experience that is where we are falling short. If I take my hires sources and convert to 44 or 48 with a high quality SRC I hear no difference at all. But when I change to 16 bit the difference is enormous. There is always a degradation. And it's never a good thing. It seems silly to just be throwing away that bit depth because of a 1982 format that people aren't even listening on anymore.
Also on the topic of SRCs, this site has some interesting comparisons. For the record I do my SRC conversion with iZotope RX 64-bit SRC. http://src.infinitewave.ca/
So in conclusion, I want 24 bit tracks. If they're given to me as 44, 96, 192... whatever. As long as they're 24 bit. Enough with the 16 bit! :D
I think you're misreading what the OP said. He wasn't making the mistake of associating 192 kbps with 192 kHz. It was just a coincidence that he used 192 kbps MP3 transcodes. He could just as easily have used 224 kbps MP3s versus 24/192 FLACs in the test he described. The point was that his friend couldn't tell the difference between MP3s and FLACs.
One reviewer [2] notes: "The effect of the first few Shakti products was not as apparent as when the effect became compounded. Each built on the others' ability to eliminate EMI in the component on or under which it was placed. Music became more relaxed, with greater clarity. Space and ambience increased. The soundfield became considerably more open and defined. At a certain point, the effect became quite startling as another Stone or On-Line was added. Shazaam!"
I used ferrite beads to remove EMI in a pair of self-powered speakers. I was hearing the local college station broadcast at a very low level when nothing was playing. I added them to the speaker wires, along with a basic EMI power strip, and the interference was gone.
It is baffling to me that people even talk about 24/192. There are such vast differences in audio quality related to speakers, loudness, amplifier, mastering, and EQ, before you even get to the source format.
For some reason people seem to latch on to the format thing, before being able to make judgements about the more important factors.
It doesn't end there. For most people, I strongly suspect moving around your furniture would affect your sound reproduction more than changing your equipment.
I have no source either, but I'd bet a very small sum of somebody else's dollars that "most people" in the US, at least, listen to most of their music in the car.
"I do think in the domestic environment, the people that have sufficient equipment don’t pay enough attention to room acoustics. The pro audio guy will prioritize room acoustics and do the necessary treatments to make the room sound right. The hi-fi world attaches less importance to room acoustics, and prioritizes equipment; they are looking more at brand names and reputation."
If it's any consolation, I'm a pro sound engineer and I entirely agree with you. I do like 24 bits (although I'd be satisfied with 20 bits) but I can't be bothered recording at anything higher than 96kHz and even that is mainly wiggle room in case I want to do extreme pitch-shifting or suchlike. Most of the time I use 24 bits/48kHz.
Sometimes I think I'd like to buy some expensive measurement microphones and record at 192kHz for Science, eg to find out if there are tunes in cricket stridulations or whatnot. But then I get over it.
Preface: I consider myself an audiophile, but keep reading before you judge. I completely, wholeheartedly agree with the original article and the science behind it. There's no question that our hearing just isn't good enough to discern the minute differences between sound of sufficient quality (which is well defined in the Xiph article).
However, all I'll say is, it's very different to hear or feel a difference, than to prove it 100% without any doubt in the exacting conditions of an ABX test. You behave differently and aim your listening at different things in the special case of critical testing, than when normally listening.
You have to know and respect this to make good arguments against hardcore audiophiles. Only once you give credence to the possibility can you bring on the real science: that the true bandwidth of the ear and the Nyquist theorem really do mean that any signal within our range of hearing can be encoded perfectly with double the sampling frequency and some 65 thousand steps, assuming ideal decoding, of course, which means, yes, you should respect the idea of DAC design.
The world is full of idiots who are easily parted with their money. But don't throw the baby out with the bathwater. Pursue good quality audio equipment, to a point, because damn, it is enjoyable.
>However, all I'll say is, it's very different to hear or feel a difference, than to prove it 100% without any doubt in the exacting conditions of an ABX test. You behave differently and aim your listening at different things in the special case of critical testing, than when normally listening.
so what you are saying is you know what sounds better if you read the label BEFORE listening to it? :)
There is almost always a difference, no one is claiming otherwise. Pointing out which one is closer to the original (not "better", because "better" might mean louder/overdriven bass) is the real test, and EVERY SINGLE audiophool to date fails at this point - the Randi foundation at one point ran a $1M pot for someone to spot a difference between 'audiophile grade' power/speaker cables and a coat hanger.
>He still thinks he can hear the difference between FLAC and MP3 to this day. He works as a sound engineer now.
It's really not fair to compare FLAC vs MP3 to "hi-rez" FLAC vs regular FLAC
There are legitimately some instruments that do not compress well. The harpsichord is a particular example that you should be able to hear the difference on with any sort of decent equipment.
But hi-rez vs regular FLAC is something that I don't think can really be detected by humans. I've gone through and done the Philips Golden Ears challenge to completion, have very high end equipment, and can blind ABX FLAC vs MP3 on a lot of songs I am familiar with, but have never once been able to successfully ABX between a 24/192 FLAC and a regular one.
By '"hi-rez" FLAC vs regular FLAC"' the parent post means something like 24/192 vs 16/44.1, not the amount of compression. It has a higher resolution than the other example.
MP3 has specific algorithmic weaknesses above 16kHz[0] which even the highest legal bitrate can't cover up… sometimes. It's actually easiest to hear it with cymbals, or the old LAME sample "fatboy".
You can just not use MP3 though. It's 2014! Use AAC!
The early encoder releases were tuned for voice chat, not music, so the best rate control mode is CBR and it hasn't seriously been tested otherwise. It is pretty amazing considering it's only so good by accident!
Also, everyone is satisfied with AAC already, so there's no good reason to throw out your music collection or your HW accelerated decoding platform.
> There's legitimately some instruments that do not compress well.
Are you sure? I thought that was just a problem particular to early encoders for the Vorbis codec, which were alleviated by altering the tuning parameters of the encoder.
I'm basing it purely on modern (within the past couple of years) encoded 320kbps/V0 MP3s.
I have not done any personal blind ABX tests on AAC or modern ogg vorbis, so I can't really speak to them.
I'm going to keep 'archival' quality stuff in FLAC anyway, just so I'm covered for any advances in compression tech or whatever, and I stream to my mobile stuff, so size concerns aren't a huge deal for me. My ABX testing has just been for the sake of the mp3 vs FLAC argument.
So, AAC and Vorbis might have very well solved the problem of compressing some of these instruments.
If it is no better, then the person who thinks it is better benefits from it.
If it is better, then the person who thinks it is better benefits from it.
If it is no better, then the person who thinks it is no better doesn't benefit from it.
If it is better, then the person who thinks it is no better doesn't benefit from it.
If the objective is subjective benefit, then placebo is a benefit; assuming your bank account is large enough and you don't care to give your money to someone who really needs it.
Edit: An answer to this is the Carl Sagan quote at the end of the article:
"For me, it is far better to grasp the Universe as it really is than to persist in delusion, however satisfying and reassuring."
Of course, it isn't really possible to not 'persist in delusion'. One can try, but he won't know if by trying he is perpetuating a grander delusion.
The DECT phone standard had to be marketed as "6.0" for the US market because non-technical people were trained to believe that 5.8GHz was better than 2.4GHz and a phone that ran at 1.9GHz would never sell in the US. This is the same thought process driving the audiophools' desire for bigger numbers regardless of reason. It doesn't help that the fundamentals of sampling theory aren't particularly intuitive.
The article is correct, but it's not true that nobody can tell the difference between an MP3 and a FLAC.
I've personally done blind A/B testing in my (then) studio to discover the point at which I can't distinguish between MP3 and uncompressed audio. These days the encoders are really good, so it gets real hard at around 256kbps. I'm confident I could reliably pick out 192kbps though.
While I think that there are people who actually can tell the difference between lossless and lossy audio (with a decent bitrate), I'm not counting me among them.
Yet, I only buy lossless music since I plan to keep my music library around for ages and this allows me to change to a different format in the future if needed. This is an aspect of lossless audio, which is often overlooked.
The article specifically mentions that lossless formats offer advantages over lossy formats. The argument is about 24/192 lossless files versus 16/48 lossless files, and I feel the author makes that case soundly.
In my personal testing a few years ago, I couldn't tell the difference between source and 192kbps MP3s either.
I still rip CDs to FLAC but only to transcode them to lossy formats for later listening. I do this in case I decide to switch lossy formats in the future (note: due to differences in psychoacoustical models, you should never transcode from one lossy format to another).
> He still thinks he can hear the difference between FLAC and MP3 to this day
I don't know if I would tell the difference in your test, but where I have noticed it the cause might be bad MP3 encoding - MP3 encoding quality varies widely... The difference between good and bad encoding may be far greater than between good MP3 and FLAC.
Maybe it depends on the song as well? Classical music is supposed to have a much greater frequency spectrum - meaning, the effects of MP3 encoding become apparent when played on high fidelity equipment.
It has greater dynamic range (think volume; this is the domain of bit depth, often loosely called bit rate) than most compressed studio-produced music, but the frequency range or spectrum (think pitch, which is limited on the high end to half the sampling rate via the Nyquist-Shannon theorem) is no different, unless it has also been rolled off in mixing.
MP3 sucks for bass and sub-bass physical response. It's incredibly easy for a non-audiophile (like my girlfriend back in the day) to tell the woolly, muffled bass of a lossy-encoded track in my car vs the actual CD on the same system. Perceptual coding is not perfect.
I don't know about comparing digital to digital, but I know I can tell the difference between a 320kbps MP3 and a good quality LP record. It's in the treble. I guess any kind of digital just ruins that.
MP3 in particular has issues above 16kHz. This is solved in modern formats such as AAC and Ogg Vorbis. It has nothing to do with "any kind of digital".
I keep FLAC copies just so that I have lossless versions that I can convert into the lossy format du jour, but for actually listening, I mass-convert to lossy. I've listened to FLAC and properly encoded MP3 files (@192 and 160) on $50,000 audio equipment, and I can't tell the difference.
I'm curious, did you listen in a properly treated room? As in with bass traps, panel traps, diffusers, a cloud, first reflection points covered, etc? Because I can hear a difference on less than $10,000 of playback equipment, but in a fully treated listening room. And I know from the days when I used an untreated room, there would be no way I could tell in an untreated room, even with a million dollars of equipment.
It's the most overlooked part of the listening chain, and is in fact the most important part. In fact it always shocks me how many "audiophiles" will pump tens of thousands into audio equipment for their reflective, untreated, boxy listening space. A $1000 pair of speakers in a room with $5000 in room treatment will totally blow away a $20,000 pair of speakers in a room with $0 in room treatment. Every time.
It was an audiophile who'd spent a small fortune decking out his "listening room". I imagine that he knew what he was doing.
I humored him by teasing out which one was which without him noticing, and then saying the lossless one sounded better. Didn't want to hurt his feelings. And really, the sound system as a whole sounded awesome. I just couldn't tell the difference between the formats.
To put it simply and honestly, you are not paying enough attention. There is no human-discernible difference between FLAC and V0/320kbps. But implying that 192kbps and 160kbps are sufficiently perfect is somewhat ludicrous. You can still hear noticeable artifacts on cymbals at 192kbps. They will sound like an absolute mess of warbling, and all of the audio will be imbued with a slight tinge of white noise.
There is a reason the mp3 scene moved away from 192kbps, and it doesn't have anything to do with bandwidth availability. It's because 192kbps sounds terrible.
>There is no human-discernible difference between FLAC and V0/320kbps
I can't on almost any modern music (which is the majority of what I listen to), but when I was going through the Philips Golden Ears course, I did a fair amount of blind ABX on harpsichords, cymbals, and a few other instruments at V0/320kbps and didn't have much trouble identifying them.
Granted, at that point I had been going through something specifically intended to help train you for discerning differences in audio, but they were distinct enough I don't think I would have had any trouble beforehand, either.
On some stuff I couldn't immediately tell that some sounded better - just different. Though on some of the samples the FLAC was easily better to my ears.
(My criteria for a 'successful' ABX was accuracy of at least 8 out of 10 using the foobar ABX comparator plugin)
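(For anyone curious what that criterion buys you statistically, here's a quick Python sketch, assuming independent trials and a 50% guess rate, of the chance of scoring 8 out of 10 or better by luck alone.)

    from math import comb

    def abx_p_value(correct, trials, p_guess=0.5):
        # Chance of scoring at least `correct` out of `trials` by guessing alone.
        return sum(comb(trials, k) * p_guess**k * (1 - p_guess)**(trials - k)
                   for k in range(correct, trials + 1))

    print(abx_p_value(8, 10))   # ~0.055: suggestive, but just short of the usual 5% cutoff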
> He still thinks he can hear the difference between FLAC and MP3
If we're to take Tidal and Spotify (at highest quality) as representative of those two (please correct me if I'm wrong, no expert) then the difference is night and day. Perhaps Spotify could use a higher quality mp3 encoding?
I agree with the silliness of 192kHz, but not 24-bits. Here is why:
In typical PCM recordings, like CDs, mid-range frequencies (e.g. 1kHz to 4kHz) are recorded with lower amplitudes because our ears are more sensitive to them.
Sampling theory is correct and 16 bits can reproduce any waveform with ~100dB of range; however, in a complex waveform consisting of low, mid and high frequencies, the mid- and high-range frequencies quite simply get shortchanged.
Imagine a recording of a bass sinusoid and a mid-range sinusoid of equal volume. It might use e.g. 10 bits to store the bass and only 6 to store the high frequencies (2^10·sin(200ωt) + 2^6·sin(4000ωt)). That means the resolution of the high frequencies is less than that of the lower frequencies. When the volume of those frequencies changes dynamically, the high frequencies' amplitudes are more quantized. That is quite simply why 16 bits are not enough.
This is similar to the problem with storing waveforms unprecompensated on vinyl. The precompensation makes up for the non-uniformity of the medium. It could be done with 16-bit digital as well. Or alternatively, larger sample sizes like 24 can be used.
I haven't A/B tested this. The A/B test in the article compares CD with SACD. SACD isn't PCM, so its artifacts are going to be totally different from 24-bit PCM.
If you're playing a 16 bit PCM at a reasonable listening level of 85dB SPL, then your 6 bit sinusoid is at 25dB SPL, which is quieter than a whisper at 6 feet away in a library. The quantization noise floor of a 6 bit recording is a further ~30dB quieter.
So, the noise of that signal is -5dB SPL. 0dB SPL was set to be the lowest possible perceivable level of a single sound in an anechoic chamber. And that's not even considering other sounds in the recording, or ambient noise levels in a typical living room, etc.
In your example, moving to 24 bit would be a long way from having any effect (other than a 50% increase in file size). And if you use, say, an 8 bit signal as an example, then things are even less noisy. Note that the noise is the only consideration here: any fidelity loss is represented in that figure.
The audio engineers of yore who (among other things) decided that 16 bits was more than enough for final mixdown were much more competent than they get credit for (many were downright amazing at what they did, in fact). They thought of stuff like this.
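(For reference, a rough Python sketch of the arithmetic above, assuming 0 dBFS playback calibrated to 85 dB SPL and the usual ~6 dB-per-bit rule of thumb. The exact figures shift a little depending on which rule you use, but the conclusion doesn't.)

    FULL_SCALE_SPL = 85.0    # assume 0 dBFS playback calibrated to 85 dB SPL
    DB_PER_BIT = 6.02        # rule of thumb: ~6 dB of dynamic range per bit

    tone_6bit = FULL_SCALE_SPL - (16 - 6) * DB_PER_BIT        # the quiet sinusoid
    noise_floor = FULL_SCALE_SPL - (16 * DB_PER_BIT + 1.76)   # ideal 16-bit noise floor

    print(f"6-bit-amplitude tone: {tone_6bit:6.1f} dB SPL")   # ~25, quieter than a whisper
    print(f"quantization noise:   {noise_floor:6.1f} dB SPL") # ~-13, below audibility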
The correct way to attack this isn't by attacking the theory. It's to gather a lot of people and ask them to press a button indicating whether the audio they hear is 16-bit or 24-bit.
If the results are no better than chance, then 24-bit doesn't matter, regardless of how sound the underlying argument is.
EDIT: The experiment would also be extremely difficult to design. For example, you'd need to run this test with music, not simple sounds. So the question is, which music? I think whatever is most popular at the time would be a good candidate, because if people are listening to music they hate, they won't care about the fine details of the audio. But that introduces an element of uncertainty and noise into the results which is hard to control for.
Or, it could be the exact opposite: Maybe you can only detect whether a sound is 24-bit when it's a simple tone, and not music.
Age is also a factor. My hearing is worse than a decade ago.
The headphones used by the test are another factor. If you feed 24-bit input to headphones, there's no guarantee that the speakers are performing with 24-bit resolution. In fact, this may be the source of most of the confusion in the debate. I'm not sure how you'd even check whether speakers are physically moving back and forth "at 24-bit resolution" rather than a 16-bit resolution.
For example, you'd need to run this test with music, not simple sounds. So the question is, which music? I think whatever is most popular at the time would be a good candidate
A quick summary would be that most "popular" music has been mastered with the following goal: the song should be recognizable and listenable on an FM radio with only a limited-bandwidth midrange speaker. One of the many things they do to achieve this is eliminating almost all dynamic range through a process called "compression" (dynamic compression, not digital compression).
They also limit the spectral range to not have "unheard" sounds cause distortion when played through limited bandwidth amplifiers and speakers.
This means that the kinds of musical pieces which could benefit from the increased dynamic range of 24-bit would be thoroughly excluded from the test.
And then you'd probably get the "expected" result, but only because you now test whether music mastered specifically not to have dynamic range benefits from having increased dynamic range. For which the answer is given.
Note: I'm not claiming 24-bit end-user audio has merits, of which I have little opinion. I'm just pointing out the flaw in the proposed experiment.
If you feed 24-bit input to headphones, there's no guarantee that the speakers are performing with 24-bit resolution.
Not sure if you're just imprecise in your language here or if you're genuinely confusing things. Speaker elements, as found in both speakers and headphones, are analogue. They operate according to the laws of physics, and respond to changes in magnetic fields, for which there is practically no lower limit.
They have no digital resolution. A quick example: Take your 16-bit music, halve the volume and voila! You are now operating at "17-bit resolution". Halve it again. 18-bit resolution. Etc.
There's probably some minimum levels of accuracy, yes, but it just doesn't make sense to measure it in bits.
If you're aware of this and were just trying to adjust the language to the problem at hand, I'm sorry for being patronizing, but I just wanted to make sure we keep things factual here.
There's also headroom for the signal processing in the equipment. Equalization or volume control done poorly can lower your dynamic range, for example when turning the volume down on windows then turning it up on an external amp.
The experiment would also be extremely difficult to design.
I disagree. I think all the factors you are concerned about can be eliminated with a large enough sample size, like in the thousands (or maybe 10s of thousands).
You allow each person to select the genre of music they like, and you play a few clips from a few songs of each bitrate. Then they guess which is 24-bit and which is 16-bit.
I'm not paying to set it up. But it could all be done online without too much grief. It would be good to track the other statistics (age, headphone brand, etc.) as well, and see if something falls out of that.
Pretend your headphones only moved with 8-bit resolution. There is no possible way the experiment could derive a useful conclusion, but you might trick yourself into thinking it did. Especially if your sample size was 10,000 people.
More realistically, the participant might choose music for which no 24-bit recording exists.
It's very important to control for every variable. It's actually not possible to gather info about what headphones the listener is wearing. Even if it was, it wouldn't be possible to know whether they're doing the experiment in a quiet room, or whether there's a traffic jam just outside their apartment window, or whether their dog is barking during the test. Stuff like that.
Crowdsourcing this is an incredibly cool idea, but it'd just be so easy to believe you've performed a reliable test even though some variable undermined it.
I forgot another variable: Whether the music was recorded at 16-bit resolution. Most musicians use 24-bit, but it's easy to imagine that some of their samples might've been quantized to 16-bit without them realizing it.
It's very important to control for every variable.
It's not, actually. Say you have 10,000 listeners and you randomly assign each one to 16-bit vs 24-bit listening. You have enough listeners that any differences between the groups are due to chance and will very nearly even out. Now, if you find people are unable to distinguish between 16-bit and 24-bit you might want to try the test again with more control over the environment, but if you find a substantial difference in a large blind randomized test, that's a real finding.
More realistically, the participant might choose music for which no 24-bit recording exists.
Well, obviously we'd need to have a limited set of music selections for which we have 24-bit recordings.
As you suggest, I expect the biggest impact on playback fidelity is going to be other factors like the noise in the system (likely a PC) and such.
But the flip side of it is that's also a good real world test. If the only time you can tell a difference is to be in an acoustically dead room with top end equipment, then the higher sample rate really isn't worth it.
But the flip side of it is that's also a good real world test. If the only time you can tell a difference is to be in an acoustically dead room with top end equipment, then the higher sample rate really isn't worth it
Hey, that's a great point! Hadn't considered that.
Proving "most people can't tell the difference between 24-bit and 16-bit in real-world settings" is less compelling than proving "no one can ever tell the difference," but it's still very relevant.
If the results are no better than chance, then it remains possible that a small subset of the test group actually can appreciate an improvement. Content providers may like to cater for that small subset.
disclaimer: I am not in that hypothetical subset.
There is another reason 24-bit music is desirable: it's good for remix culture (silly IP laws notwithstanding).
If I pay for music, I want to truly own it, including the possibility of someday making a mashup, a music video, a hip-hop beat, et cetera. A 24-bit source gives casual creatives the same quality material as the original masters, for a relatively paltry 1.5x increase in file size.
There's also the fact that you can FEEL some sounds that you can't hear.
Chris Randall of Sister Machine Gun (at least used to?) use a low-frequency generator at live shows to produce a sound that the audience could feel but not hear in order to make the music more intense. I suspect that you'd gain some of that effect with a larger bit size.
The increase in sample rate to 192kHz only allows frequencies above 22kHz to be represented (i.e. no effect whatsoever on the low frequencies that you mention). Pushing the bits to 24 only lowers the noise levels (which at 16 are already demonstrably imperceptible).
What you mention, though, points directly to what /will/ improve the quality of sound reproduction: speakers. It gets harder and harder to move that much air with precision as you get lower and lower in frequency. It's a definite technical limitation, but it's to do with very high-power amps and giant speakers, not the recording format.
We have (to the degree that humans can prove that they can perceive), perfect reproduction from digital recordings, perfect amplifiers for reasonable prices (at lower-than-concert-power-levels at least), but we haven't yet developed good enough speakers to cover the whole perceptible range of frequencies to anywhere near the same degree.
Audiophiles love to try and improve the whole chain, but really the only place it matters is at the very end.
According to this interesting-looking article here http://www.theregister.co.uk/2014/07/02/feature_the_future_l... the problem isn't frequency response but things like time delay, and it's not due simply to the very end but also to systems leading up to and around it like inaccurate crossovers and speaker cabinets that introduce time delay.
Well, first of all, there is no problem with encoding low frequencies here (we do it all the time! like the fact that the notes are not all played at the same time...).
What you feel is parts of your body resonating (because the low frequency sound is exciting modes of your body). This is unlikely to happen at high frequencies, partially because it would necessarily be much smaller parts of your body (see [0] for a diagram of typical body resonance frequencies) which we probably don't feel and because attenuation of sound greatly increases at higher frequencies (for example, see [0] for air), making it likely impractical to excite any such modes. My guess is that you might cause some tissue damage if you had significant ultrasonic excitation in your body (see [2] for something that may or may not be true...).
You hypothesized about tissue damage from ultrasonic excitation, so this tangential post may be of interest. Tissue damage from ultrasound is an intentionally-caused phenomenon being used (and experimented with) in some non-invasive medical procedures, where focused energy can locally ablate a tumor for instance. HIFU, high-intensity focused ultrasound, is the technique: http://en.wikipedia.org/wiki/High-intensity_focused_ultrasou...
I don't understand what you mean, so I can't say you're wrong, but higher frequencies are not more quantized than low.
In fact, you should be very careful about how you think about quantization in digital sound reproduction, because it can easily lead you astray. Think of bit depth as a measure of dynamic range. Do look at the digital music primer for geeks posted elsewhere in this thread, it makes for awesome reading.
He was referring to the specific example he gave in which the higher-frequency sinusoid had an amplitude of 2^6 and the lower-frequency sinusoid had an amplitude of 2^10.
I don't think anyone is arguing that you do. The high-frequency sinusoid he described has a peak-to-peak amplitude of 2 × UINT6_MAX (although it appears the intention was a sinusoid with a p-p amplitude of UINT6_MAX), which on its own (in terms of the integer numbers given in his example) can be represented in a 6-bit system. This isn't really relevant though, because that signal would be 0dBFS in a 6-bit system. The higher 10 bits are far from unused by the sinusoid in a 16-bit system, as they allow its amplitude relative to the other sinusoid to remain unchanged. The little stairstep digitally-sampled sinusoid picture might look a little rougher for the "6-bit" sinusoid than the "10-bit" sinusoid, but that's A) kind of a pathological case and B) not at all representative of what gets sent to the amplifier after the DAC. (Think about the spectrum of all those little "stairsteps" and what happens to them once they pass through a shunt capacitor...)
I guess it doesn't matter to human ears with a well-mastered 16 bits, but the video linked in the OP explains that typically dithering noise is shaped (toward frequencies we're less attuned to). The models used to lossy-compress also typically put more noise in some of the higher freqs.
Different frequencies aren't stored in different bits. It's all mixed together. If you mix -10dB of 100 Hz and -20db of 1000Hz, the composite will be the same at 8 bits, just much, much noisier. There is, curiously enough, nothing at all like "vertical resolution" in digital audio - resolution in the amplitude dimension.
24 bits is great for recording, but the end distribution medium only needs to be 16 bits, after you normalize the thing you tracked at 24 bits.
You can get libsndfile and FFTW and do the tests yourself.
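(If you'd rather not set up libsndfile and FFTW, a rough numpy equivalent of that experiment looks something like this; the tones are hypothetical, just the levels from the comment above.)

    import numpy as np

    fs = 44100
    t = np.arange(fs) / fs                                   # exactly one second
    # -10 dBFS of 100 Hz plus -20 dBFS of 1 kHz, as in the comment above
    x = 10**(-10/20) * np.sin(2*np.pi*100*t) + 10**(-20/20) * np.sin(2*np.pi*1000*t)

    def quantize(sig, bits):
        steps = 2**(bits - 1)                                # signed integer grid
        return np.round(sig * steps) / steps

    for bits in (16, 8):
        q = quantize(x, bits)
        spec = np.abs(np.fft.rfft(q)) / len(q)
        freqs = np.fft.rfftfreq(len(q), 1/fs)
        tones = sorted(freqs[np.argsort(spec)[-2:]])          # the two strongest bins
        err_db = 20 * np.log10(np.sqrt(np.mean((q - x)**2)))  # RMS error re full scale
        print(f"{bits:2d} bits: tones at {tones} Hz, error ~{err_db:.0f} dBFS")

Both tones survive at either bit depth; only the error floor moves, which is the point being made.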
The resolution of both sinusoids in the example you suggest is exactly 16 bits.
Edit:
While it would be possible to represent each of those individual sinusoids in 6 and 10 bits individually, the signal you describe has the high frequency signal "riding on" the low frequency signal. You need 16 bits to represent the amplitude of that signal at e.g. t = pi/4.
Except that high frequencies attenuate fairly quickly in air.
Sound attenuation in air is proportional to f^2. So, a 10-fold increase in frequency causes a 100-fold increase in attenuation, and a 100kHz signal has 10,000 times the attenuation of a 1kHz signal over the same range.
In addition, attenuation due to water vapor is particularly bad above about 40kHz.
Much of that study cannot be replicated, and people have tried:
http://en.wikipedia.org/wiki/Hypersonic_effect
In addition, the effect went away when using headphones.
However, I can certainly believe that if you can pump enough energy to rattle things at ultrasonic frequencies you are going to get a result. Especially since ultrasonic frequencies rattle in water particularly effectively.
As an example, if I pump enough energy into an ultraviolet or infrared signal at your eye I will eventually get a detection result in your brain. However, pain and a burned retina are not what we think about when we consider a brain response.
I find the intermodulation argument a convincing one - it's hard to not have it affect the test, and if they haven't taken specific steps to avoid it then it would be easy to join a large amount of tests that have fallen prey to it.
I note, however, that Wikipedia doesn't mention any specifically brain-scan-based studies that counter it. If you know of any more I'd be very interested in hearing about it.
What gets me about the intermodulation effect is that some people want to have it both ways. They are so stuck in the 'there is no difference' camp they fail to see the self contradiction in the argument.
On one hand, higher frequencies above 20kHz can't be heard at all, so there's no point having them! You can't hear them!
Then on the other hand, higher frequencies above 20kHz affect the audible region of the sound (intermodulation distortion), so you can hear them, so make sure they aren't there!
What if the presence of the higher frequencies in a spectrum that shares a harmonic relation to the audible region causes intermodulation distortion that is pleasing and musical to the ear. What if the complete absence of this high frequency information, or alternatively a non-harmonic higher frequency signal (say some kind of switching or power supply noise) causes the audible region to be perceived in a less pleasant manner?
Intermodulation in this case is distortion that wasn't in the source material. It's a product of the failings of the playback system to perfectly recreate high frequencies without distorting the lower ones, and will vary depending on which system it is played on.
Certainly some people find certain kinds of distortion pleasant, but the people arguing for 192kHz claim increased fidelity, not pleasing distortion - when it is just the opposite for any stereo that introduces these artefacts.
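(A toy numpy model of the intermodulation being described, with made-up numbers: two ultrasonic tones pushed through a slightly nonlinear playback stage produce a difference tone squarely in the audible band.)

    import numpy as np

    fs = 192000
    t = np.arange(fs) / fs                                   # one second at 192 kHz
    # two ultrasonic tones that are individually inaudible
    x = 0.5*np.sin(2*np.pi*24000*t) + 0.5*np.sin(2*np.pi*25000*t)

    # toy playback chain with a little second-order nonlinearity
    y = x + 0.05 * x**2

    spec = 20*np.log10(np.abs(np.fft.rfft(y)) / len(y) + 1e-15)
    print("difference tone at 1 kHz:", round(spec[1000], 1), "dBFS")   # lands in the audible band
    print("original tone at 24 kHz: ", round(spec[24000], 1), "dBFS")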
There's a special irony to the fact that this high fidelity audio format is being promoted by Neil Young. Young's a rock musician. He's been around loud noises (e.g. rock concerts) most of his life. He's also 69 years old. Our ability to hear high frequencies decreases dramatically with age and exposure [1]. If anyone were able to discriminate 24/192 from 16/44.1, it sure as heck wouldn't be an elderly rock musician.
To be fair, I have worked with older sound engineers, and they can hear a lot of audio artifacts that I miss, just because they've been paying closer attention for a lot longer than I have, much in the same way my wife (who plays violin 6+ hours most days) can hear tuning and pitch problems better than I can.
High frequency limiting is not the only artifact that results from data compression.
and that is after 15 years in a symphony orchestra having my ears blasted by the brass and percussion section (with a demonstrated hearing impairment from my time in the orchestra).
The industry wants to be able to sell you something "better" and 24/192 is clearly bigger and therefore better than 16/48.
This is the same reason I'm convinced we're going to get 8k phone displays someday.
If the recording industry wants to sell me a "platinum" version of recordings, what I'd really like to have is different mastering of an album: at least one for noisy environments like the car, and one for higher-quality environments like my home theater. If you're familiar with "The Loudness Wars", this is a reaction to that. NiN tried to do this with their "audiophile" mix of Hesitation Marks (although a lot of people think they did not succeed, http://www.metal-fi.com/terrible-lie/ )
On the other hand, I don't need to buy any new equipment to support that, so the equipment guys aren't going to be happy. I don't know if there's any silver bullet for them--if there is a hypothetical advancement that would cause me to upgrade my system, I can't envision it.
4k would exceed that significantly on a 6" display. Thus why parent compares an 8k phone to the 24/192 discussion - the benefits are nothing more than being able to advertise a larger number.
I kind of had that effect at a trade show a while ago where they had huge demo displays like 12 foot /6000 pixels across and I couldn't see all the detail at a usual viewing distance but could make it out in small areas by wandering up close. It was kinda cool. I'm not sure it would have that much value in a domestic setting but in something like a museum / gallery could be good.
so, carrying that to its logical conclusion, does that mean that at some point we will create displays w/ such a high resolution that they will, in some sense, be creating a "universe" which is indistinguishable from that which they are displaying? or, i guess that would mean the display essentially is what it is displaying. is that even possible in our universe, or would that be akin to creating energy / matter from nothingness?
clearly the display would blink "let there be light"* at startup.
No one can see X-rays (or infrared, or ultraviolet, or microwaves). It doesn't matter how much a person believes he can. Retinas simply don't have the sensory hardware.
The author seems to have stumbled into a poor example, as a recent study shows that humans can indeed see infrared light using an unexpected process. Should we read anything into the audio case from this? Probably not, but it's a sign that even those who are sure they are right because they have science on their side should retain some degree of openmindedness.
Human infrared vision is triggered by two-photon chromophore isomerization

This study resolves a long-standing question about the ability of humans to perceive near infrared radiation (IR) and identifies a mechanism driving human IR vision. A few previous reports and our expanded psychophysical studies here reveal that humans can detect IR at wavelengths longer than 1,000 nm and perceive it as visible light, a finding that has not received a satisfactory physical explanation. We show that IR light activates photoreceptors through a nonlinear optical process. IR light also caused photoisomerization of purified pigments and a model chromophore compound. These observations are consistent with our quantum mechanical model for the energetics of two-photon activation of rhodopsin. Thus, humans can perceive IR light via two-photon isomerization of visual pigment chromophores.
Openmindedness isn't the same thing as being open to accepting a mistaken belief. If no one can tell the difference between 24-bit and 16-bit, then 24-bit simply doesn't matter. The only way to know is with a controlled experiment that checks whether the people can detect no better than chance whether music is 24-bit.
Openmindedness isn't the same thing as being open to accepting a mistaken belief.
Perhaps not, but I think it's very close. I'm not advocating noncritical acceptance of anything, but making the point that any jump from "X is a physical law" to "Y is impossible" depends critically on the assumption that the only way to achieve Y is through X, and that there is no possibility of a back door that entirely sidesteps X. This may not be openmindedness exactly, but failing to account for this possibility strikes me as an example of closedmindedness.
The only way to know is with a controlled experiment that checks whether the people can detect no better than chance whether music is 24-bit.
Which people, which music, under what circumstances? And whose results do you trust? I feel safe guessing that many companies selling high-end audiophile quackery claim to have done tests showing that their equipment makes a positive difference to sound quality. Some of them are simply lying, some are misinterpreting their data, others have a real effect too small to be reliably measured, and a tiny remainder might have a genuine breakthrough because they are approaching an unsolvable problem in a way that sidesteps the previous barriers. The question is how much openmindedness is right to account for this probability without being overwhelmed by the garbage.
If no one can tell the difference between 24-bit and 16-bit, then 24-bit simply doesn't matter.
I'd probably make the bet that there exist certain 24-bit sound files that certain listeners can discern from the same sound file downsampled to 16-bit. While the difference may well be too small to actually be considered "better", I don't think there are any physical laws that prevent this. I think it would be fun to see such a test. This Youtube video on "overtone singing" might offer insight on the sorts of effects that might be enhanced by the greater bit depth: https://www.youtube.com/watch?v=UHTF1-IhuC0
> I'd probably make the bet that there exist certain 24-bit sound files that certain listeners can discern from the same sound file that has been downsampled to 16-bit.
I would take you up on that bet. This has been tried before, and no difference was found, even when dithering wasn't used! The noise floor on 16 bit audio is around -96dB. There are very few HiFi systems that can manage that. Even in the highly unlikely event that there are listeners that can distinguish it, it's likely any difference will end up eliminated by noise in the analog components.
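(The -96dB figure comes from the standard rule of thumb for an ideal quantizer; a quick sketch, noting that you get ~96 dB if you drop the 1.76 dB full-scale-sine term and ~98 dB if you keep it.)

    def ideal_snr_db(bits):
        # Textbook SNR of an ideal quantizer driven by a full-scale sine:
        # 6.02 dB per bit plus 1.76 dB.
        return 6.02 * bits + 1.76

    for n in (16, 20, 24):
        print(f"{n} bits: ~{ideal_snr_db(n):.0f} dB of dynamic range")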
I probably should have written 16/44.1 vs 24/192, since I was thinking mostly of the waveform 'beat' interactions as shown in the linked video. Do you feel those are also indistinguishable? I can't afford an actual bet on it, but I'm interested enough to explore a bit and see what I can find.
1) I don't believe ultrasonic beat frequencies are by themselves audible (though I could be wrong about this).
2) It is possible to exploit nonlinearities in air to make audible sounds from ultrasound, but IIRC levels above 100 dB SPL are required. I think Disneyworld has an attraction that uses this.
3) It is highly likely that an arbitrary sound sample will sound different on playback if you add e.g. an 80kHz tone, as harmonic distortion of amplifiers and speakers tends to increase with frequency. This is generally considered a bad thing though, as it is a difference that would not be heard by a live listener.
To understand how this works, consider that the speed of sound in air varies with air pressure. Furthermore sound is a pressure wave. Hand-wavingly this means that a sufficiently intense sound wave will alter its own propagation speed, which will in turn cause all sorts of interesting effects.
An analogous effect for light is used in many green lasers: some piezoelectric crystals will vary their index-of-refraction when an electric field is applied; a sufficiently powerful laser will generate a strong enough electric field. This can be used to frequency double infrared lasers into visible light.
The frequency of a 'beat' is completely different to what we normally refer to as the frequency of a sound. The beat's frequency relates to how quickly the amplitude of the sound wave varies. The frequency of a tone relates to how quickly the pressure waves oscillate back and forth.
Besides, if a combination of sounds outside the audible spectrum DID combine to produce audible sounds (maths says they don't, but maybe non-ideal properties of air etc mean they might?), the resultant audible sounds would be picked up by the recording equipment anyway! So you'd never need to record the inaudible source sounds, just the resultant audible bit.
Unless your listener can hear frequencies above 22.05kHz, it's theoretically impossible because the sampling theorem says 44.1kHz sampling can perfectly reproduce all frequencies below 22.05kHz.
Any differences within normal human audible frequency ranges must be caused by imperfect DAC. Agreed?
(No, I don't have a perfect DAC, but if the audible artifacts are because of a DAC that produces less perfect analog waveforms below 22kHz when fed a 44.1kHz source rather than a 192kHz source, isn't that squarely the DAC's fault? It should also be made abundantly clear that this is a hypothetical. Is this actually a problem? Has anyone simulated an analog waveform from a 44.1kHz sample, compared it to oscilloscope readings from a decent quality DAC, and noticed theoretically audible differences?)
I think the main argument was that 192kHz sampling does not matter, whereas 24-bit can provide some benefit (albeit it might be in your ability to implement filters).
>Edits and corrections made after this date are marked inline, except for spelling errors spotted on Dec 30, 2012 and March 15, 2014, and an extra 'is' removed on April 1, 2013
That study is 8 months and 16 days younger than the last edit to the article.
The thread at https://news.ycombinator.com/item?id=8210878 (assuming StavrosK was not lying like one of the comments suggested), shows an interesting discrepancy in the vision of different people.
Monty actually mentions this in the footnotes of the article:
> The original version of this article stated that IR LEDs operate from 300-325THz (about 920-980nm), wavelengths that are invisible. Quite a few readers wrote to say that they could in fact just barely see the LEDs in some (or all) of their remotes. Several were kind enough to let me know which remotes these were, and I was able to test several on a spectrometer. Lo and behold, these remotes were using higher-frequency LEDs operating from 350-380THz (800-850nm), just overlapping the extreme edge of the visible range.
From what I can make of it, it doesn't sound like that's really the same thing as seeing the infrared part of the spectrum, at least not the way people would normally mean it. It sounds like an effect that makes photons from infrared light simulate green light.
More rarely, there's also tetrachromacy, which is thought to improve colour differentiation within the visible spectrum, and aphakia, which causes an increased sensitivity to the near ultraviolet.
Anyone here who has not already seen Xiph's Digital Show and Tell ( http://xiph.org/video/vid2.shtml ) should do themselves a favor and sit down for a watch. It makes sense of a lot of mysteries and misconceptions around digital audio.
So I watched, and I'm confused.
How exactly does a DAC recreate a band-limited signal without having infinite lookahead?
Honest question. Any pointers?
"The Whittaker–Shannon interpolation formula or sinc interpolation is a method to construct a continuous-time bandlimited function from a sequence of real numbers."
http://en.wikipedia.org/wiki/Reconstruction_filter has a less mathematical explanation, but, as Monty mentions in the video, be wary of the sentence "in practice, the output of a DAC is more typically a zero-order hold series of stairsteps". This really only applies to some (mostly low-end) DACs. There are many DACs that will use anti-aliasing / interpolation and probably many other methods to recreate the signal.
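(If it helps to see the idea concretely, here's a toy Python sketch of Whittaker-Shannon interpolation using a truncated sinc sum. Real DACs use practical filters as described above; this only shows that the samples pin down the waveform between them, with the error shrinking as more samples surround each point.)

    import numpy as np

    fs = 8.0                                   # toy sample rate (Hz); Nyquist is 4 Hz
    n = np.arange(200)                         # 200 samples = 25 s of signal
    f = 1.3                                    # tone frequency, comfortably below fs/2
    x = np.sin(2 * np.pi * f * n / fs)

    def sinc_reconstruct(samples, fs, t):
        # Whittaker-Shannon: x(t) = sum over n of x[n] * sinc(fs*t - n)
        k = np.arange(len(samples))
        return (samples * np.sinc(fs * t[:, None] - k)).sum(axis=1)

    t = np.linspace(8.0, 17.0, 500)            # evaluate away from the truncated ends
    err = np.max(np.abs(sinc_reconstruct(x, fs, t) - np.sin(2 * np.pi * f * t)))
    print(err)  # small, and it shrinks further as more samples surround each point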
Thanks, exactly what I was searching for. Quoting from "Reconstruction filter" page:
Practical filters have non-flat frequency or phase response in the pass band and incomplete suppression of the signal elsewhere. The ideal sinc waveform [..] would require infinite delay.
So it's all good in theory, and it depends in practice.
Any idea what is the highest frequency "recommended" in typical audio that will not cause trouble for consumer grade DACs?
That is, if I'm mastering, say, for 44.1kHz, what is the recommended antialiasing filter?
You can't simultaneously say that you're dithering to represent low amplitudes while also saying you're keeping enough samples to capture all audible frequencies. Dithering doesn't create resolution out of nowhere, it sacrifices temporal resolution for amplitude resolution. It's also bad for compression (hence why modern video encodes are done at 10-bit even for output to 8-bit devices), and worse if you want to use a source as the basis for further work (i.e. a remix). If you want to store your signal in a simple, convenient way, and not have to carefully tweak the levels for each individual recording, 16 bits isn't quite enough. And as the article admits, extra resolution certainly can't hurt; worst case is the extra bits are thrown away.
Also 44.1kHz is a pain to do in realtime (there's not enough headroom to really filter out the higher frequencies without damaging the 20kHz response), meaning you need a separate mastering step which is inconvenient and frankly unnecessary. 48kHz is a much more sensible standard to work with.
192kHz may be dumb, but 48kHz/24-bit is perfectly sensible. It gives you fewer ways to make mistakes than CD-quality (44.1kHz/16-bit), and at some point the extra space is worth that, particularly since it may well compress better than a dithered 16-bit signal.
> Dithering doesn't create resolution out of nowhere, it sacrifices temporal resolution for amplitude resolution.
Could you elaborate on that? As I understand it, dithering trades distortion for uncorrelated noise, under the assumption that the distortion is more objectionable than the noise. Where did you come to the conclusion that temporal resolution is affected? The temporal resolution is instead related to the frequency resolution.
I agree that 48kHz/24-bit is the most sensible for production, especially since (as you said) you don't have to worry about the levels too much. But when you master a track, you pay very close attention to the levels anyway, so those 8 extra bits don't do you any good. I think most people can't hear past 12 or 14 bits anyway, unless the audio has a particularly wide dynamic range.
The article talks about representing signals below the notional noise floor using dithering, which requires either temporal dithering or something morally equivalent to it - if your ear is detecting an average of dither + waveform, then it has to have several samples to average from.
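(A minimal numpy sketch of that trade, using a hypothetical 1 kHz tone at 0.4 LSB and TPDF dither: without dither the tone rounds away entirely; with dither it survives, riding on an uncorrelated noise floor.)

    import numpy as np

    fs, f = 48000, 1000
    t = np.arange(fs) / fs
    step = 1 / 2**15                                  # one 16-bit quantization step
    x = 0.4 * step * np.sin(2 * np.pi * f * t)        # a tone smaller than half a step

    rng = np.random.default_rng(0)
    tpdf = rng.uniform(-step/2, step/2, fs) + rng.uniform(-step/2, step/2, fs)

    plain    = np.round(x / step) * step              # rounding alone: the tone vanishes
    dithered = np.round((x + tpdf) / step) * step     # dither first, then round

    for name, y in (("undithered", plain), ("dithered", dithered)):
        level = np.abs(np.fft.rfft(y))[f] / len(y)    # 1 s of signal: bin index == Hz
        print(f"{name:10s} 1 kHz bin: {20*np.log10(level + 1e-15):7.1f} dBFS")
    # without dither the sub-LSB tone rounds to silence; with dither it survives,
    # traded against a raised but uncorrelated noise floor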
What's so funny is seeing yet another occurrence of basically "because Nyquist" that fails to address that Nyquist only holds exactly over infinite time. Over a window of any finite length, perfect reproduction is NOT guaranteed.
If you just change your definition of basis functions from infinite sinusoids to truncated sinusoids and your definition of bandwidth to the maximum frequency of a nonzero truncated sinusoid coefficient, you can recover an analog of the Nyquist rate just with a different idea of bandwidth. When you're talking about the typical 10^6 cycles over a song, they're the same for all intents and purposes.
EDIT: Your truncated sinusoids have to be harmonics of the total signal length for this to work trivially.
I don't see how either of these papers solidly refute "because Nyquist" in the context of the article and sample rates applied to audio signals. They're both worth considering, though.
The first[1], the more theoretical of the two, is focused on digital communications where a Nyquist rate derived from the signal rate isn't enough to recover the signal function. The numerical examples given highlight recovering a sine wave from two, three, and six samples. The millions of samples in a single song might as well be infinite compared to these.
The second paper is much more practical and gives a number of real-world examples where naively applying Nyquist causes problems. The first, cutoff filters at half Nyquist to avoid aliasing, plagued many early CD players and other digital audio equipment. The second example is great as well, showing how understanding the signal being measured is important. The third example doesn't really apply to the subject at hand but is a good example of using aliasing for good. The fourth and fifth address the effect filters have on the signal. The sixth example refers to time response in control systems and minimizing phase delay to avoid oscillations.
The author is largely right. It all comes down to the master. And the lossless format. The numbers here don't matter, what matters is will the masters be better for 24/96..24/192..32/384? If yes, prefer the 24/96. Don't prefer it for numbers' sake.
Personally I can't ABX anything above 192kbps, lame vs flac. Very occasionally a pre-echo reveals itself, but hardly ever. V2 is fine, 16/44.1 is fine, V0 & 320 are indistinguishable from flac, but buying anything but lossless is a crap deal.
The 'volume knob' trick, once you notice it, makes it basically impossible to objectively compare 2 headphones, or two amps. I can't match within a 1/4 dB with my fingers.
Humans almost universally consider louder audio to sound better, and 0.2dB is enough to establish this preference.
The argument is fine for listening to a finished product but those absolutes in the post have some hidden assumptions that don't hold in all real-world scenarios.
That probably sounds like BS. Hear me out.
A common practice for DJs is to use special decks to DJ from iPods, or other digital sources (eg. regular turntables with timecode vinyls operating a PC.) DJing involves playback speed adjustment when there is a BPM difference between the two tracks during a transition. 192kHz is overkill for basic DJing and you'd probably be fine in the vast majority of cases with 48kHz, but if the DJ is a turntablist (scratch etc.), you want all you can get when going from zero to target speed—which is happening constantly. It sounds awful when consumer-grade¹ audio is used. As for the ultrasonics, filtering is the answer in this scenario. It may be quite a good thing for these DJs (and their fans) that Apple is doing this. (I'm not under the impression that this is why they're doing it.)
A lot of the music I listen to uses samples that are played at something like half speed, or tuned down at any rate. I tend to tune samples down by about a fifth myself. Point being: a lot of detail that would otherwise be present with higher samplerate source material goes missing. It doesn't help that tuning down like this dumps a good portion of the low end.
There are also neat ways of exploiting nonlinearities from ultrasonic sources, which I use, but that's harder to describe.
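(To put a number on the half-speed point above, a trivial sketch of where the source's Nyquist limit lands after slowing down.)

    # Slowing a sample to half speed halves every frequency in it. Anything the
    # source captured between 20 kHz and 40 kHz drops into the audible 10-20 kHz
    # band, so a 96 kHz source keeps detail there that a 44.1 kHz source never had.
    def audible_ceiling_after_slowdown(source_rate_hz, speed=0.5):
        return (source_rate_hz / 2) * speed           # source Nyquist, shifted down

    for rate in (44100, 96000):
        print(rate, "Hz source ->", audible_ceiling_after_slowdown(rate), "Hz ceiling at half speed")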
I had a similar reaction. In addition to the article's assumption that the consumers and producers of audio are distinct groups with no overlap, there is the assumption that humans will be the only ones listening (rather than computers, e.g. Shazam).
Those two factors are probably not enough to change the way ALL music is distributed, but they deserve to be acknowledged and dismissed with proper evidence as well.
Such "studio" formats don't make sense for end-user listening, but they make sense as inputs to further processing, mixing, etc. Having a pile of headroom in frequency and amplitude means that after further processing you can subsequently output a sensible 16-bit 48kHz file without loss. If you start with a 16-bit 48kHz file and then do a pile of processing, you won't necessarily preserve the same degree of quality.
Most of the places selling these files don't include disclaimers about how the additional quality is only useful in studio conditions -- they use it as a differentiator in the marketplace for end listeners. The article does mention the need for higher quality in production environments:
> Also, there are (and always will be) reasons to use more than 16 bits in recording and production.
> None of that is relevant to playback; here 24 bit audio is as useless as 192kHz sampling. The good news is that at least 24 bit depth doesn't harm fidelity. It just doesn't help, and also wastes space.
The summary here is to just rip everything losslessly and then to go ahead and use 44.1/16 since it's actually better in some ways and not worse in others.
+1. I store originals (rips etc.) in FLAC and then encode to Opus (around 140 kbps) for actual listening. 140 is an approximation of the transparency level; it can be lower according to these tests:
http://listening-test.coresv.net/results.htm
Yes, I have it installed on a Sansa Fuze player. Waiting to get Jolla 2 handset when it will come out, and then will probably play it there. I'm sure though that VLC would handle it just fine on any Android device as well.
I think the most effective way to improve sound quality is to get a good DAC as close as possible to the output. Headphone amps with integrated DACs do wonders for little money. When possible, go for XLR on the last mile to the speakers (good neutral studio speakers) to cancel out distortion from external electromagnetic pulses. To me, differential (balanced) signalling is still the cleverest, yet simplest, analog-information-preserving method I've ever heard of.
http://en.wikipedia.org/wiki/Differential_signaling
Also: of course 24-bit increases the resolution of the signal's amplitude; that's exactly what it does by definition, and the fact that it sets the noise floor is only a consequence of that. To stress the visual analogy: it's like looking at a 16-bit image with banding and then at a 24-bit image. Of course the effect is not very pronounced, but it's there.
Especially if you have a high-dynamic-range recording of a soundscape and you would like to zoom into a certain volume range, like that of the human voice, it's good to have that extra resolution outside of a studio environment too. Think of it like developing a picture from RAW into JPEG, to stress this analogy again: it doesn't make a difference for compressed pop music, but for uncompressed recordings of live performances with analog instruments, or of natural soundscapes, it does. You can then choose the volume range you're most at home with for listening, and do the compression (or not) yourself.
I like to think of 24-bit audio as having access to the RAW files of images: it doesn't matter in most cases, and most likely you should leave the mixing to professional artists for the intended effect, but it also enables you to experience the sound in many more different ways.
> When possible go for XLR on the last mile to the speakers (good neutral studio speakers) to cancel out distortion from external electromagnetic pulses.
A couple minor corrections: XLR is just a connector, the equipment has to be balanced and there's plenty of equipment on the market with XLR jacks but unbalanced signals. Second, electromagnetic interference does not cause distortion, but it does cause noise.
Balanced signals are far from a magic bullet. It is basically a tool for solving a couple specific problems: interference and ground loops (NOT distortion). In a typical home setup, interference and ground loops will not be a problem, because we're just talking about plugging a CD into a stereo that's a few inches away, and they're both plugged into the same power strip. Balanced cables are more helpful if you have audio equipment plugged into different mains circuits, or drawing lots of current, or transmitting signals over long distances (more than 10 meters). For your home stereo, regular RCA cables are sufficient.
> this doesn't make a difference for compressed pop music but for uncompressed recordings of live performances with analog instruments or natural soundscapes it does.
I'm sorry, but this is just ridiculous. The noise floor of 16-bit PCM is -96 dB. Your living room, if it is very quiet, has an ambient noise level of 30 dB SPL. Or you could suppose that you spent money soundproofing your rooms, and you live out in the country, and it's as quiet as a professional recording studio, at 20 dB SPL. Now, turn up the CD player until the noise floor is audible. What happens when you play music? It will be at 110 dB SPL, which is the same sound level as sitting next to a chainsaw.
Now, the benefit of 24-bit audio is that you can play music louder than a chainsaw (> 110 dB SPL) and still have parts of the music that are quieter than a whisper (< 20 dB SPL). Even without compression, it is rare for actual live performances to have that kind of dynamic range. Pianos, for example, are simply not physically capable of it, with typical microphone technique.
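To put rough numbers on that arithmetic (Python; the 96 dB figure is the usual one for 16-bit PCM, and the exact SPL obviously depends on how loud you turn the system up):

    room_noise_spl = 20          # dB SPL, about as quiet as a professional studio
    cd_dynamic_range = 96        # dB, 16-bit quantization noise floor below full scale

    # Turn the volume up until the 16-bit noise floor is just audible at the
    # room's own noise level; full-scale peaks then land at roughly:
    print(room_noise_spl + cd_dynamic_range)   # 116 dB SPL, i.e. chainsaw territory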
In a professional audio context XLR (as in RS-297-A) is pretty much synonymous with balanced audio; of course this requires a balanced output and signal chain. Just as a stereo phono connector can be used for a balanced mono signal.
You've surely heard a GSM pulse on your speakers before; that's not just noise. Especially when living in an apartment, where you can't control what kind of cold fusion reactor your neighbor from hell is running, this is a significant improvement over RCA and extensive shielding, even in a home setting.
I'm not talking about using the full dynamic range in a linear fashion, more about doing the compression at home rather than in the studio. Like being able to listen to a whisper, a piano, and a chainsaw from the same recording, each recorded at its original amplitude and mapped at playback to a pleasant listening range at full fidelity. Yes, that's not the normal casual use case, but that's what a higher bit depth enables.
In a professional context, XLR is not the same as balanced. I have several pieces of equipment with balanced connections, but only a minority of the balanced connections are XLR. Most are TRS.
I think you are using a different definition of "noise". "Noise" is unwanted sound.
The GSM noise you hear is caused by demodulation of GSM radio, in the 800-900 MHz range. At these frequencies, even a very small wire works well as an antenna. For example, an 8 cm wire makes a quarter-wave antenna. In my experience, there is often a trace of at least 8 cm within an amplifier or monitor which picks up the signal. In these cases, balanced connections do not help. The nonlinearities in semiconductor components then demodulate the GSM signal into the audio band. The worst offenders at picking up GSM interference are cheap amplifiers and radios. Good equipment is well shielded and doesn't suffer from radio interference, and in my experience, using unbalanced connections over short distances (1-2 meters) won't change that.
Balanced connections fix ground loops and let you run cables over long distances. That is all. They don't save you from GSM demodulation.
Exactly my point in the first post in this thread: that's why getting the DAC close to the output is, in my opinion, a good idea, to minimize those kinds of opportunities for interference on otherwise reasonably priced equipment (a DAC/headphone amp plus decent headphones for under USD 400, or a DAC with balanced outputs and active monitors for under USD 1000). Balanced cabling and signalling is only a bonus, but recommended for active studio monitors.
I have worked a lot with audio programming, and while most of the article is right about what is known, it could be wrong about what we don't know. Technical myopia (you see too much of what you know, but you don't see the big picture).
We use HDR in pictures even though the eye cannot differentiate between HDR colors, because it adapts to the general luminosity of the image; but caution: the general luminosity level affects lots of biological cycles, like the circadian rhythm.
In normal pictures we discard this info, but this info is enough for a person to differentiate a picture from a real image in the real world.
Also, artificial sensors work differently from our eyes and ears.
This difference makes HDR necessary: we can see a deep shadow along with an illuminated area at the same time because the cones in our retina adapt locally, but if you take a single picture you can only choose to expose for the bright areas, making shadows too dark, or for the shadows, making the bright areas appear too white.
Artificial sensors linearize over a fixed range. Natural sensors are really continuous and exponential, even touch.
The same happens with our ears. So you are doing a spectral analysis using an arithmetic (linearly spaced) frequency decomposition called the FFT?
Well, sorry to burst your bubble, but the cochlea's frequency analysis runs circles around anything we have. It does a geometric analysis, and also does it locally. Using just a single tone as an example is a fallacy. Most real sounds are not a single tone but changes in lots of frequencies at the same time.
The law of diminishing returns applies to sound and video; we have a good-enough experience for what we want to do, but by no means is it perfect.
Ask John Carmack, who is trying to create an immersive experience. Sound is one of the big problems. Yes, you can understand the sound, but you know that it is not the real sound you hear in the real world.
Not disagreeing with you, but one of the main reasons for using HDR in film is comparable to using 192/24 (or even 192/32) in audio. It provides greater dynamic range for post-processing.
The final "rendered" product will usually still be at normal dynamic range for film, and 44.1/16 (or 48/16) for audio.
Not sure about this one. I'm pretty sure at one point I could hear the difference between 44.1 and 48 khz sampling. I agree 192 is overkill, but 44.1 is just above the Nyquist limit. At that resolution, the top breathy harmonics of a piccolo are only getting 2-3 samples per cycle, which seems to leave room for some possible aliasing if you are not 100% sure about your filters. So why not just go crazy and throw 4X samples at the problem, eliminating any question of proper anti-aliasing?
As to # of bits, the issue there is the wide dynamic range of music, and the fact that our ears can adjust to this wide range. Probably you would get the same effect as 24 or 32 bits with the right dynamic range adjustments, but then we'll have to argue about which algorithm is "right". A surfeit of bits just makes the question go away.
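For what it's worth, the 2-3 samples per cycle worry is easy to test numerically. A sketch (Python, assuming numpy and scipy are available): take a 19 kHz tone sampled at 44.1 kHz and ask for its band-limited reconstruction on a 4x finer grid; it matches the analytic sine to within floating-point error, far below one 16-bit step.

    import numpy as np
    from scipy.signal import resample

    fs, f, n = 44100, 19000, 44100                 # one second of a 19 kHz tone: ~2.3 samples/cycle
    x = np.sin(2 * np.pi * f * np.arange(n) / fs)

    up = resample(x, 4 * n)                        # band-limited reconstruction on a 4x finer grid
    t4 = np.arange(4 * n) / (4 * fs)
    err = np.max(np.abs(up - np.sin(2 * np.pi * f * t4)))
    print(err)                                     # effectively floating-point precision, versus a 16-bit step of ~3e-5

Aliasing only becomes a concern if content above 22.05 kHz gets past the anti-aliasing filter in the first place, which is a filter-quality question rather than a samples-per-cycle question.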
So I appreciate everyone's point of view, and I applaud those of you who share the author's point of view for using an empirical approach, but unfortunately I disagree. For those of us who have worked in DSP, either using it or implementing new things with it, there's a highly mathematical reason to record the source of your audio at a higher sample rate than what the author suggests is a generous maximum.
It has to do with waveforms and how continuous they are. So, for starters, true: if you have a perfectly continuous waveform at 22K, then your sample rate must be at least 44K. In fact, with a sample rate of 44k you can perfectly discretize a continuous waveform, like a sine wave.
Do you see the problem with this? Sounds are not always continuous! If you look at the waveform of a violin, distorted guitar, cymbal, etc... they're very jagged. To effectively approximate these analog waveforms as a finite set of sums you need a much higher sample rate. It makes a HUGE difference, trust me.
So basically, technically speaking, 44K works just fine if you only listen to music made by organ pipes and penny whistles, but most sounds are very complicated, and to be properly captured they actually need a higher sample rate. It's simple and mathematical. Also, this whole "44.1K is all you need and if you don't agree with me then you're dumb and don't understand math" ra ra ra has been going on all over the Internet for ages, and while I appreciate the motivations that people may have, it gets a little annoying. Basically, instead of immediately jumping to the conclusion that people's ears are wrong, maybe the more patient and mindful approach is to ask oneself, "why does my mathematical knowledge of a subject fall short of explaining what many people seem to experience?".
Note: everything I said was regarding the source of capturing a sound. There's an entire science behind compression and all that sauce.
Also, Stanford's DSP lectures (available online) explain this much more in depth, albeit abstractly.
As a former recording engineer, I completely agree that there's value in capturing audio at better than 16/44.1. 24-bit means I don't have to care so much about "filling up the bit bucket" when I set input levels because I have a lower noise floor. And if I'm doing any DSP, obviously I like having more information rather than less.
But I'm not at all convinced that distributing recordings at better than 16/44.1 has any real benefit. I've done some blind tests before and was never able to reliably beat 50/50 on figuring out which tracks were "higher quality" - and while I certainly don't have the best ears on the planet, I feel pretty confident that my hearing is more developed than the average person. Not to mention the fact that probably 90%+ of consumers are listening on systems with poor speakers and worse DACs.
I often hear audio people explain why music should be distributed at higher bit and sampling rates, but I have yet to see anyone who can reliably tell the difference - especially on a consumer-grade system.
Edit: Downvotes, really? Was there something objectionable in there?
I'm sorry, but if you have a DSP background you should at least be able to give a better explanation than 'trust me'. I have a really tiny DSP background (a 3 week lecture course), but specific questions that would help convince me of your point of view would be:
- Sampling at a frequency 2f lets us reproduce the specific component of every frequency <= f (proved by Shannon, Nyquist etc). This can be easily proven mathematically, and were it not the case in practice, it would have been largely discredited (it's fundamental to many fields). These 'jagged' waveforms must then have their frequency components <= f perfectly represented; the only components that are ignored are those that are > f (I've idealised here, but I'm assuming we've filtered correctly). If f is greater than the maximum frequency we can hear, why does it matter at all? (There's a small numerical sketch of this below.)
- When the entire field of digital signal processing (and anyone who uses it, such as a sound engineer) relies on such math from top to bottom to make the recording process work, why should we ignore it as soon as we start experimenting and simply consider what our ears tell us? Why should you ask the question 'why does my mathematical knowledge of a subject fall short of explaining what many people seem to experience?' in the first place? It seems to imply that your ears are more accurate than a century of study, which at least seems arrogant.
- A number of the points you've made (regarding people hearing different things) were explained in some way in the article. What are your views on that?
I'm largely convinced that you're wrong (you've given no specific justification for any of your points), but I would be interested to hear what you have to say.
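On the first bullet, a quick numerical illustration (Python, assuming numpy; the frequencies are picked arbitrarily): a 'jagged' band-limited waveform is just a sum of sinusoids, and sampling at 44.1 kHz keeps every component that lies below 22.05 kHz.

    import numpy as np

    fs = 44100
    t = np.arange(fs) / fs
    # A jagged, sawtooth-like wave: harmonics of 2 kHz up to 18 kHz, all below Nyquist.
    x = sum(np.sin(2 * np.pi * 2000 * k * t) / k for k in range(1, 10))

    spec = np.abs(np.fft.rfft(x)) / (fs / 2)
    freqs = np.fft.rfftfreq(fs, 1 / fs)
    print(freqs[spec > 0.01])     # 2000, 4000, ..., 18000 Hz: every component survives intact

The only thing the sampled version cannot carry is content above 22.05 kHz, which is exactly the content the article argues nobody can hear.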
"everything I said was regarding the source of capturing a sound"
Well, the rant is about reproduction of the sound. For capturing sound he basically agrees with you, 24/192 has its place in studio, recording and music production.
The article's entire point is that statements like this cannot be trusted. Read under "Listening tests", and it's clear that there is no benefit to the higher sampling rate. I highly encourage you to do your own double-blind ABX - if you can select 10 tracks where people can reliably tell the 44K from the 192K, I will reconsider. But until then, everything we know so far tells us that 44kHz is more than enough.
I remember when CDs first came out and Neil Young was very critical of their sound, and I believe he was entirely justified in his criticism. When they first came out, CDs were the record companies' poor stepchildren, and they were treated very poorly. I believe the engineers mastering CDs were given tapes that were several generations away from the original master tapes, and I suspect they may have even been already equalized for vinyl. No wonder audiophiles preferred the sound of vinyl over the sound of early CDs.
Nowadays CDs are made from digital recordings to digital masters to digital discs (DDD). I love Neil Young's music, but his grasp of the details of digital audio recording and reproduction is not particularly strong.
I suspect something else was happening too: Recording techniques were adapted to the problem of making something sound good after recording, mastering, cutting, and playback. This probably meant getting away with some shortcuts that would be masked by subsequent processing. The perfection of the digital medium unmasked those things.
I wonder how many people see 24/192 and think "24-bit, 192kbps" sound instead of "24-bit, 192khz sound", getting the units wrong.
MP3s are (often/traditionally) 128kbps, so 192kbps would be better, and 24 is more than 16, so it too must be better.
For the people who do get the units, you still have the 'more is better' problem. We've finally gotten past the point of everyone trying to make ultra-compact 30MP cameras because consumers have realized that their current camera is 'good enough' and that more MP doesn't always make the picture sharper.
Could this just be the same thing in the sound world?
TL;DR, but skimming through, an interesting problem comes to mind: the OP "orthogonalizes" the questions of the sample accuracy's (24 vs 16 bit) and sample rate's (48 vs 192 kHz) impact on quality, answering one independently of the other. But even with my limited background in mathematics it's quite obvious that that approach is not entirely correct: the Nyquist theorem only really applies when you have infinite sample accuracy. It would be interesting to see an analysis of how the two interact; i.e. how the discretization error impacts the highest representable frequency.
The highest representable frequency really is the Nyquist rate: you only need 1 bit of sample depth to generate a digital signal whose corresponding band-limited continuous signal is a sine wave that oscillates at that rate.
Sample depth roughly tells you how far away a sample can be before the magnitude of the ideal sinc function used to reconstruct the continuous signal falls below your quantization threshold. That distance in turn gives you an idea of the frequency resolution you have... You could in theory run an FFT over that many samples and detect that the corresponding change was not just due to quantization noise.
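A tiny sketch of the first point (Python, assuming numpy): a two-level, i.e. 1-bit, sample stream that just alternates sign has all of its spectral energy at the Nyquist frequency.

    import numpy as np

    fs, n = 48000, 1024
    x = np.where(np.arange(n) % 2 == 0, 1.0, -1.0)    # 1-bit signal: +FS, -FS, +FS, ...
    spec = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(n, 1 / fs)
    print(freqs[np.argmax(spec)])                     # 24000.0 Hz, i.e. fs/2

Bit depth doesn't limit how high a frequency you can represent; it limits how far the quantization noise sits below the signal.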
Fascinating article. Is there any reason to believe that upsampling an audio file should produce a better analog signal than playing it at its native sample rate?
I find that playing high quality mp3's through an upsampling DSP filter and a 24/192 DAC seems to produce a better listening experience. (As the article points out, this could be due to confirmation bias, or the filter making the music a tiny bit louder.) Intuitively it makes sense to me that the DAC sending signals twice as frequently to the headphones would produce a smoother analog signal, but is that actually true?
> Intuitively it makes sense to me that the DAC sending signals twice as frequently to the headphones would produce a smoother analog signal, but is that actually true?
No. Watch the Digital Media Primer and Digital Show & Tell, Monty (the article's author) explains how digital sound encoding works and why the analog signal will be reproduced beyond your ability to notice imperfections either way: http://xiph.org/video/
And that's for native 24/192, in your case since it's been upsampled the 24/192 signal can't have more information than the original 16/44.1, since that's all the information that went into it.
The engineering reason to upsample is to simplify the analog anti-aliasing filter on the other side of the conversion by giving it a wider transition band to work with. It also means one analog filter can handle a wide range of samplerates.
masklinn is right though—from a consumer perspective, there's no real reason to care how the conversion's done. Hopefully your DAC was designed by a team that knows what they're doing and will handle whatever rate you feed it.
Of course, if you have one of those obnoxious devices that can't actually clock below 48kHz or a mixing daemon configured to run at 48kHz, you're stuck oversampling anyway, and depending on your environment that might be done anywhere between amazingly well and linear interpolation. Many years ago I was stuck playing that game on Linux—in that case, there's an audible benefit from using a better designed upsampler in your player.
But if your equipment's not broken, it's a waste of CPU cycles.
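For the curious, the oversampling step described above looks something like this in software (Python, assuming numpy and scipy; the 1 kHz tone and the 4x factor are arbitrary):

    import numpy as np
    from scipy.signal import resample_poly

    fs = 44100
    x = np.sin(2 * np.pi * 1000 * np.arange(fs) / fs)   # stand-in for 16/44.1 program material

    # Digital 4x oversampling before the DAC. The first spectral image of the audio
    # band now sits up near 4*44.1 - 22.05 ~ 154 kHz, so the analog reconstruction
    # filter can roll off gently between ~20 kHz and ~150 kHz instead of having to
    # brick-wall between 20 kHz and 22.05 kHz.
    y = resample_poly(x, up=4, down=1)
    print(len(x), len(y))                                # 44100 -> 176400 samples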
That's circular reasoning - surely the definition of a better DAC is one that makes a difference?
Anyway, the author also makes this point in the video referenced by other commenters. The computer hardware he demonstrates with is mid-range at best and yet the DAC and ADC are far more than adequate.
"Can you see the Apple Remote's LED flash when you press a button [4]? No? [Some other remotes] may be just barely visible in complete blackness with dark-adjusted eyes [5]. All would be blindingly, painfully bright if they were well inside the visible spectrum."
Ok, what happens if you point a remote at an infrared camera?
A little research on the FLIR website shows cameras with a range of 7500-13500nm, which would be 22-40THz (?); the remotes are 300-380THz. Sigh.
This is just so anthropocentric; of course the truly selfless audiophile wants his music to be experienced in its full natural range by bats and dogs just as well.
Monty's example files made me realize that my computer's sound chip is probably kind of lo-fi.
No, I couldn't hear anything in my headphones when I played the 30Khz/33Khz tones. In comparison, I heard something after it stopped. Somehow it was quieter during the playback.
Does it make any sense that my computer is better at making no audible sound when it's asked to play a loud inaudible sound, than when it's being asked to play nothing?
I am sorry, but the author's examples and parallels don't make much sense to me.
First of all, what does sample rate have, even remotely, to do with hearing range?
If we are comparing this to visual information, perhaps it would be best to draw parallels with video. Namely, sample rate would be equal to frames per second and bit depth to colour depth. In both areas we are still seeing improvements from TV manufacturers (100Hz TVs) and gamers are still racing to get better fps so it looks more realistic.
The question then would be: why is 25 fps the minimum the average human can tolerate? What are the limits here, and could it look more realistic to certain people with more frames per second?
After all, this race for fidelity is just an attempt to get a more accurate representation of analog sound, which opens a different problem - absolutely all of today's music is electronic, produced from samples, and samples above 44kHz are simply not available. You may find some vinyl rips of more classical music (which does explain the phenomenon of so many rock stars starting to listen to classical music - after getting rich and dumping money into expensive sound systems, they discover elegance beyond distortion).
Saying all this fidelity does not matter is kind of equivalent to throwing away all the vinyl (although transistors play a role here too).
> If we are comparing this to visual information, perhaps it would be best to draw parallels with video.
But it isn't. We perceive pitch more like we perceive color than how we perceive motion. "Motion" in audio is well below the frequency of pitch. The same goes for video, really, but in video the framerate has nothing to do with the color, whereas in (cough, most) digital audio formats the sampling rate directly determines the sound frequencies that can be represented. If color in video were sampled the way pitch is in audio, we would have framerates in the hundreds-of-terahertz range.
"First of all, what does sample rate has even remotely to do with hearing range?"
The sample rate (among a few other things, like the quality of the low-pass filter) determines the highest frequency you can perfectly capture from a signal. Since most human beings can't hear frequencies above 20 kHz, you simply need to choose a sample rate that can represent a 20 kHz signal perfectly, and that's something like 44 or 48 kHz.
Ok, that does make sense, and it indeed bounds the overall distortion. Moreover, what his video shows is an example with a single tone, while in reality a single instrument produces a multitude of tones and a typical track employs multiple instruments.
Is it possible this was just a move to get advertising space in audio files, like with devices that use ultrasonic frequencies to trigger events in devices? (For example, http://lisnr.com/)
I know that people cannot perceive sound above 20kHz, but has this been verified with brain imaging? Just to be 100% certain that you are not subconsciously perceiving something above 20kHz.
The entire argument is based on the premise that the ear can't detect content above the limit of 20kHz. While there is no doubt that few if any ears can hear independent tones above 20kHz, what about detecting the shape of tones below that point? I don't think anybody's proven that the overtones of notes below 20kHz are unimportant, or that the ear doesn't use edge detection, for example, to determine phase differences for location determination. These would require faithful reproduction beyond 20kHz.
Also to clarify a technical point, sampling at 192khz doesn't extend the frequency response to 192khz, it only extends it to 96khz (Nyquist in action).
If you're listening to music and it still doesn't sound like the musician is sitting right next to you in your living room, then there's still room for improvement.
24-bit absolutely makes sense in the age of streaming devices pushing sound to digital inputs on powered speakers. Simply for convenience reasons, I want to adjust the playback volume at my phone. I have no analog volume control (or rather, it's a configuration knob you set up once).
If you play back a sample at 1/8 the volume, you still have 21bits of range from a 24bit sample. Playing a 16bit sample at 1/8 volume is basically a 13bit range, which is bad.
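The arithmetic behind that, as a sketch (Python):

    import math

    def effective_bits(bit_depth, attenuation):
        # Each factor of 2 of digital attenuation discards one bit of resolution.
        return bit_depth - math.log2(attenuation)

    print(effective_bits(24, 8))   # 21.0
    print(effective_bits(16, 8))   # 13.0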
You might want a high bit depth audio stream if you were seriously boosting the volume, but not if you were lowering it. At most you might want a 24bit DAC.
And if you can't pass an ABX test, then you don't really want it. Can you?
Why would a higher range not be good for lowering the amplitude? If you divide the samples in a 16-bit stream by e.g. 16, that means losing several bits of range. Surely if a 12-bit dynamic range were perceived as being just as good as 16 bits, then we'd have that to begin with?
To be clear, when I say "lower the volume" I mean lowering the amplitude of the digital samples, passing them through a high-quality DAC and amplifying. I.e. sliding the volume slider down on an iPhone using AirPlay to a Toslink->amp.
A 24-bit digital format doesn't mean the DAC actually has 24 bits of SNR. Hell, most DACs in computers, phones and the like struggle with producing 16 bits. Not to mention that even if you had that headroom, approaching the threshold of hearing means you wouldn't be able to tell a difference at lower volumes anyway.
Assuming a signal is played at "normal" listening volume after first being attenuated by 1/8 digitally (samples shifted 3 bits down) and then passed through a high-quality DAC and amplified, are you saying the difference between 16- and 24-bit streams would be inaudible or nearly so? That sounds surprising to me. It seems to imply that the difference compared to playing the 16-bit stream unscaled (no digital attenuation, and 8x gain on the analog side) would also not be audible?
Well, depending on the gain of the monitors, it will not matter. With low enough gain you will have enough headroom; with high gain you run into hissing from the source. At a high gain the hiss from the DAC's noise floor will be dominant, making the loss of bits a matter of minor concern. In any case, with powered monitors you need an analog volume control at the end of the chain; that's the correct practice if you care about quality.
Yes there is an analog gain but since volume is controlled by digital attenuation, the analog gain is set to the maximum listening level, which is rarely used.
The ideal solution would be a digital format where desired attenuation is passed along to the amp, but I don't think eg Toslink does that.
So yes, if you want single-device (phone) control over volume, then you attenuate and amplify a noisier signal only for the convenience of not having to use dual volume controls. It's less than optimal, but like I said, I don't think there are any good digital audio standards that carry attenuation?
So while you do amplify noise, surely you don't want to also lose range? Even if the noise dominates, the loss of dynamic range can easily be avoided!
Sell people what they want, not what you think they need. It's all about perception, not whether there is any basis behind the perception. Even the placebo effect needs the equivalent of little pills as props. The higher the price, the more they'll get out of the effect. If some nut can hear God's voice from a lump of rock costing $1M, who are we to take away his "talent"?
While the 30/33 kHz intermodulation test is interesting, how would the 30/33 kHz tones get there in the first place? Aren't there low-pass filters and shielding on most studio equipment to attenuate EMI? Not to mention, the intermodulation products are 60 dB below the 30/33 kHz tones; I'm sure you wouldn't be able to hear them...
I think the point is that some people assert that they want the ultrasonics, as if they could tell the difference, so you wouldn't filter it. Then the presence of those ultrasonic tones creates the modulation in a range that you can hear. Likewise, the 60dB difference doesn't matter because you can't hear the signal at all, so you're going to notice the noise.
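You can simulate the effect with a toy nonlinearity (Python, assuming numpy; the 30/33 kHz tones and the small quadratic term are purely illustrative, not a model of any particular amplifier):

    import numpy as np

    fs = 192000
    t = np.arange(fs) / fs
    x = 0.5 * np.sin(2 * np.pi * 30000 * t) + 0.5 * np.sin(2 * np.pi * 33000 * t)

    y = x + 0.1 * x**2                          # a mildly nonlinear amp/tweeter
    spec = np.abs(np.fft.rfft(y))
    freqs = np.fft.rfftfreq(len(y), 1 / fs)
    band = (freqs > 1000) & (freqs < 20000)     # look only at the audible range
    print(freqs[band][np.argmax(spec[band])])   # 3000.0 Hz: the 33k - 30k difference tone

The two source tones are inaudible, but the 3 kHz product lands right in the middle of the hearing range, which is the article's point.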
There are a couple of things that I would like to point out about digital sampling.
Early on in digital audio, anti-aliasing filters were awful. The ADCs weren't very good either, but the AA filters were bad. Using a higher sampling frequency was one method to help this problem by providing more padding in the frequency domain so that a more gradual AA filter could be used.
Later, this was completely OBE (overtaken by events) with the advent of oversampling. Most ADCs in use now are oversampling delta-sigma converters operating at very high sampling rates that perform decimation on the output to provide a 16- or 24-bit waveform at the normal 44.1kHz or 48kHz sampling rate. Delta-sigma converters and high sampling frequencies are actually the basis for Sony's Direct Stream Digital (AKA Super Audio CD).
Today, you can be reasonably assured that an ADC will provide a very clean, low-noise output in the vast majority of cases. For music playback, this never mattered anyway. Whether the transport of digital audio is at 44.1 kHz or 1 MHz, the quality to the human ear will be the same as long as it was sampled correctly and accurately.
That said, sampling at 44.1 kHz, 48 kHz, or even oversampling, may not capture all that sound has to offer our ears. One thing that can happen with music, for instance, is beating. Beating occurs when two notes are sounded at slightly different frequencies (say 20,000 Hz and 20,100 Hz); there will be an audible beat at their difference. Where this can come into play is in close-mic recordings. If I record guitar A with an overtone at 30,000 Hz and guitar B with an overtone at 30,100 Hz, these two would have an audible 100 Hz beat. However, if we filtered and sampled at 48,000 Hz, or over-sampled then filtered, we would lose both overtones and the ability to hear that beating.
How important is that beating? Good question - but with live music it's not a problem and with close mic recordings it is. For recording, there are reasons to use higher sampling rates for at least the mixdown process. I still think it's silly to have 192 kHz audio for listening to recordings at home.
Bit depth on the other hand is something I think we could use more of - especially with classical recordings. Audio quantized to 16 bits only provides about 96 dB of SNR. I would much prefer having 20 or 24 bit audio to fully encompass the actual range of human hearing that is closer to 120 dB.
The other thing that cannot go without mentioning: MP3, AAC, and other lossy audio formats are pretty good, but they DO NOT compare to lossless audio. Having Google, Apple, and Amazon all step up to 44.1 kHz 16-bit audio sourced from equal or greater source material would be a huge improvement over MP3 and AAC.
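For reference, the usual rule of thumb for quantization SNR (full-scale sine into an ideal N-bit quantizer), as a sketch (Python):

    def quantization_snr_db(bits):
        # Standard rule of thumb: ~6 dB per bit, plus a constant.
        return 6.02 * bits + 1.76

    print(quantization_snr_db(16))   # ~98 dB
    print(quantization_snr_db(20))   # ~122 dB
    print(quantization_snr_db(24))   # ~146 dB

So 20 bits is already comfortably past the ~120 dB span of hearing mentioned above.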
So if this is true and CDs already contain "the best possible" audio that the human ear can hear, why does vinyl sound better? Are we saying records have a different pressing/recording, or something else?
Here's why you might perceive a record to sound better than a cd:
1. Vinyl introduces harmonic distortion into the audio signal. From a scientific standpoint, this means vinyl has worse fidelity than digital audio. But some people like the distortion, as it makes music sound "warmer."
2. In the mid 90s, audio engineers began mastering music louder. In order to maximize the perceived loudness, they use an effect called dynamic compression, which reduces the dynamic range of the recording (the difference between louds and softs). This trend was not present during the heyday of vinyl. As a result, many CD versions of old records sound worse than the original vinyl. But this has nothing to do with audio formats--just poor mastering.
Aren't there benefits other than listening quality to having the music available in this format? Imagine 200 years from now, finding an old, lost stash of music in the attic. What would you prefer, a bunch of CDs or some drives with 24/192 FLACs on them?
They're silly for listening maybe, but for editing (or so I gathered) they're quite valuable because the extra bits give editors a lot more room to play with the sounds and frequencies and such.
I think this underestimates how many people are DJing. You don't have to be a professional to want to warp a couple tracks together, and this is a case where more samples help. Many of the 'only professionals need this' arguments fall apart; DJing is a popular hobby.
Sampling theory says you can reconstruct a signal by sampling it at least twice per cycle of its highest frequency component. So 44.1kHz is an adequate sampling rate for a 22.05kHz signal.
Unfortunately to make Digital-to-Analog conversion work according to theory, you must first construct an ideal analog filter which filters out everything above 22.05Khz while leaving everything less than 22.05Khz unmolested. That's not possible in reality. If 20KHz is the goal, you have a measly 2.05KHz to make the filter ramp from kill-nothing to kill-everything. I'd imagine real-world CD players with cheap filters probably kill everything above 15Khz.
In reality you want a lot of headroom between half the sampling frequency and the actual max frequency you want to pass unmolested. Even 48KHz only grants you a 4Khz band in which to let the filter roll off.
Second, significant playback timing jitter can render the LSBs useless. At 44.1kHz with 16-bit sampling, the max difference between a pair of adjacent samples of a 22.05kHz max-amplitude input is very roughly 2^15. What does that mean? If your jitter is more than about 0.69ns [1], you have just lost an LSB.
Sure, 24/192 is serious and unnecessary overkill. The advantage is that it has lots of headroom. The disadvantage is that it takes more space. If you were designing a new format today in our era of large hard drives, why wouldn't you waste a bit of space?
This isn't gold-plated monstrous-cable properly-broken-in HDMI snake oil. The current format isn't perfect; an upgrade is a reasonable idea.
[1] The 44.1kHz sample period is about 22.7µs; one part in 2^15 of that is about 0.69ns.
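Back-of-the-envelope, as a sketch (Python):

    fs = 44100
    period = 1 / fs                        # ~22.7 microseconds between samples
    lsb_time = period / 2**15              # time for a ~2^15-count swing to move by one LSB
    print(period * 1e6, lsb_time * 1e9)    # ~22.7 (us), ~0.69 (ns)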
Author is an expert, and correct. In the linked video he mentions that the difficulty of creating a sharp analog low-pass (reconstruction) filter is in practice completely mitigated by oversampling, which is described in the Wikipedia article on DACs.
Suppose you have a 96kHz DAC coming from your computer. Surely you see that a computer can solve for the Nyquist reconstruction of some lower-sample-rate recording (e.g. CD audio) at that sample rate, and then (still digitally, at 96kHz) remove ALL the content between, say, 21kHz and 48kHz (the new Nyquist), at which point a final analog filter will have an easy time leaving 0-21kHz unmolested while killing all higher frequencies, right? It's practically implied by what you wrote. Per what OP says, that's more or less the effect of any DAC you'll use today.
The typical way to get around the requirement of a steep reconstruction filter is oversampling just before the DAC, which is something that was done even by the earliest CD players. Today, in most cases, audio DACs are sigma-delta, which is essentially oversampling taken to the extreme; this makes the analog reconstruction filter mostly irrelevant for output audio quality, so it is there mainly for EMC reasons.
It's kind of funny that typical DIY audiophile-grade DAC constructions use some R-2R DAC without oversampling and thus need significantly steeper reconstruction filters.
The overall point is that you don't need to store, and more importantly transfer, the additional data, because it does not contain any useful information.
As for the jitter, I somehow don't believe it is as significant a problem as it is often presented to be, but in any case its perceived effects should be mostly independent of sample rates and sample sizes.
Both of the points you bring up are unrelated to the format in which files are stored. Sure, they would be requirements for the format that gets sent to a primitive DAC. But the answer to your problem is basically contained in your post: before the signal is sent to your theoretical DAC component, the signal's rate could be digitally upsampled by a factor of 2 (to avoid artifacts), and dither could be applied to eliminate any jitter problems (with the unnoticeable side-effect of raising the noise floor to -66dB, in the case of 16 bit depth).
Anyways, as dfox mentioned, most sound cards nowadays use sigma-delta DACs, which do not need upsampling as they do not involve a filter in the way you described.
You are overestimating the jitter problem by a large amount. The first S/PDIF receiver IC I find via google (CS8416) claims to have an output jitter of typ. 200ps on the clock output, running from its internal PLL synced to an externally supplied S/PDIF input.
I would imagine in most cases the quality of the rip won't matter if the production on the album, or song, isn't very good. A poorly produced and mastered album won't sound good whether it's 128kbps or a full lossless rip.
I think similar to a lot of things there are some people out there who can notice the difference, but the percentage is likely very small. Most probably are telling themselves it sounds better, but as the article points out they're likely guessing when it comes to doing A/B tests.
24/192 playback isn't entirely silly. It's probably true that you can't hear a difference between a perfectly reconstructed 16/44.1k audio and perfectly reconstructed 24/192k audio. But the quality of your DAC certainly does matter, as well as the analog hardware that's after it. If a device has a 24/192k DAC, it means the manufacturer didn't just use the cheapest DAC they could find, and it's more likely to be high-quality (plus, it shows they care about audio quality, so they may have gone for a higher-quality analog hardware too).
In a perfect world, cell phone and music player manufacturers would advertise "this phone has a Wolfson DAC and uses OPA2134 opamps running at 9V in the output stage". But the mass consumer market doesn't care about those things, so for now all we have to go off is "this one supports 24/192 so it's probably got better hardware" (or the alternative, "this one says it has Beats Audio so it's probably got hyped bass and is terribly inaccurate").
"The ultrasonics are a liability during playback.... Neither audio transducers nor power amplifiers are free of distortion, and distortion tends to increase rapidly at the lowest and highest frequencies. ...any nonlinearity will shift some of the ultrasonic content down into the audible range as an uncontrolled spray of intermodulation distortion products covering the entire audible spectrum. Nonlinearity in a power amplifier will produce the same effect."
I would not expect a willingness to exploit consumers' magical thinking to be a good signal for quality engineering.
That's what audio mastering is for. A good mastering engineer tests their masters on iPhone headphones as well as studio monitors.
> I would not expect a willingness to exploit consumers' magical thinking to be a good signal for quality engineering.
Companies do this all the time, not just in audio, and they have for years. I don't know why most consumers would need a phone with a camera more than 8 MP, when most users will only ever display it on a 1080p (~2 MP) screen. I don't know why anyone needs a screen with more than 300 PPI. I don't know why anyone needs a TV with higher than 120 Hz refresh. But guess what, if Nokia puts a 41-megapixel sensor on their phone, I'm willing to bet they've also got a darn good lens. If Google wants to put an almost 500 PPI screen on their phone, I'm guessing they've chosen a screen that also has pretty good contrast & color.
Sample rate/depth is one thing device manufacturers can do to easily send the message "we care about audio quality" to the general consumer market, just like how a 41 MP phone camera tells you they are serious about the quality of their camera.
Obviously you shouldn't judge a phone's camera or screen by the number of pixels alone. Unfortunately, it's much easier to directly compare screens and cameras than it is phone DACs. I wish there was a good benchmark system for audio hardware, but it's really hard to find accurate, unbiased, quantitative information.
> I don't know why most consumers would need a phone with a camera more than 8 MP, when most users will only ever display it on a 1080p (~2 MP) screen.
(1) Many computers, TVs, tablets, smartphones, and laptops now have screens with greater than 1080p resolution ("Quad HD" 2560x1440 is particularly common), so I don't think it's true that most will only ever display pictures on a 1080p screen; heck, many of them will be taking pictures on devices with a greater-than-1080p screen.
(2) Often pictures, after being taken, will be cropped; so the image that will be viewed on a screen (of whatever size) will be some subset of the full picture taken.
(1) That's beside the point. Even if a phone has a 4k display, they still don't need a camera of more than about 8MP to display it on screen.
(2) Yeah, I get this use case, and I understand why there are DSLRs that big. But how many cameraphone users are actually cropping their images so extremely that they need 41 megapixels? Phone makers don't put 41 MP sensors for the niche market of users who need cameras that good but don't have a DSLR; they do it because the majority of their customer base thinks "the more pixels the better".
> But how many cameraphone users are actually cropping their images so extremely that they need 41 megapixels? Phone makers don't put 41 MP sensors for the niche market of users who need cameras that good but don't have a DSLR; they do it because the majority of their customer base thinks "the more pixels the better".
Actually, the Nokia 41MP sensor is sold as enabling high-power digital zoom, which is the feature (with the associated benefit of taking clear pictures from much further away than with other phones) of the phone most heavily touted in the TV ads for the phones with the sensor. And digital zoom is exactly the same thing as cropping.
So, no, I don't think the actual marketing of the phone supports the idea that the 41MP sensor is targeted at people using MP as a quality metric disconnected from any concrete utility; it's targeted at selling a very specific benefit.
Not sure about that... when 100mbps hubs started rolling out, there were 10mbps ones that actually ran better... same goes for the 100/1gbit change. Eventually the higher capacity cheap versions got good/better enough. But just because the sample rates are higher, doesn't mean the parts/materials are better.
You're right, it doesn't - a high-quality 44.1k DAC will out-perform a low-quality 192k DAC any day. But device manufacturers (especially when it comes to phones) are generally not using high-quality DACs. As I said, I wish phone manufacturers advertised which DACs they were using.
A hub is different. A hub is a device you buy specifically for its networking performance. On the other hand, 99% of the market doesn't buy their phone for its audio quality.
Edit: I forgot to mention that I don't think the author does a good job of addressing how jitter can reduce the effective sample rate. This is a real issue, and why DSD was conceived. http://en.m.wikipedia.org/wiki/Direct_Stream_Digital
Regardless, for non pop music, a strong case can be made for 24 bits. The assumption that music doesn't need that kind of dynamic range is based entirely on a subset of music that happens to be what most people listen to: highly compressed popular (rock/hiphop/edm/etc) music.
The author falls for the "this makes sense in terms of physics and biology therefore must be true" way of thinking.
I'm clearly not qualified in biology and DSP to argue with the 24/192 argument (although I am one of those guys who will swear that it definitely sounds better, having tried it), but here's another thing to consider:
- Since the 2000s the general trend with music has been to compress audio as much as possible to send it through the pipe and store it everywhere, assuming availability was a greater concern than quality.
Therefore, the audio industry in general has been totally collapsing, in favor of the network provider business and now the cloud storage one.
- We audiophiles have thus been pushed during the last decade to listen to shite MP3s on our phones, to get poor audio resolution on our cable TV, and to often cry at those "boom bass" headphones and speakers advertised as being the best.
In this context, this new trend toward high-fidelity audio for everyone is simply a miracle. Maybe we will finally see lossless audio everywhere. Maybe we will get 4- or 5-channel audio file formats (I have fond memories of a choir song demo included with my SoundBlaster Audigy which put you in the middle of the choir). Maybe all of this just means the world is ready to jump to high-res audio hardware before 4K video becomes mainstream...
I wouldn't say being a data compression geek makes anyone a master of DSP or of psychoacoustics.
The basic issue with 16-bit is that low-level details like reverb tails and hall ambiences, or very quiet musical passages, get the equivalent of 14-bit (f), 12-bit (mf), down to 8-bit (ppp) sampling.
This sounds noticeably grainy and digital.
It's not about total dynamic range or sine waves, it's about the fact that human ears can do really neat source separation tricks. We can hear quiet elements in a mix without too much difficulty.
If those elements are sampled at less than 16 bits - which they will be, if the maximum resolution is only 16-bits - we can hear that too.
So 24-bits gives you effortlessly smooth sound for quiet passages and quiet details. 16-bits doesn't. (Dither helps a lot, but it only takes you so far.)
Why are there still people who pretend this isn't relevant? It's not a difficult point to understand, and it shouldn't be controversial.
Edit - the technical misunderstanding is a lack of appreciation of the different properties of the absolute theoretical noise floor of a converter, and the fact that quantisation noise isn't like analog noise. It's actually more of a hyper-objectionable and nasty sort-of-nonharmonic distortion.
So as the bit resolution goes down, the sound doesn't just get buried in noise it also gets more and more obviously distorted.
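A quick way to see the "quantisation noise isn't like analog noise" point for yourself (Python, assuming numpy; the tone level and the TPDF dither are just illustrative):

    import numpy as np

    fs = 48000
    t = np.arange(fs) / fs
    x = 1e-4 * np.sin(2 * np.pi * 1000 * t)       # a very quiet tone, roughly -80 dBFS

    q = 1 / 2**15                                 # 16-bit step size with full scale at +/-1
    plain = np.round(x / q) * q                   # undithered quantization to 16 bits
    dith = (np.random.rand(fs) - np.random.rand(fs)) * q   # +/-1 LSB TPDF dither
    dithered = np.round((x + dith) / q) * q

    def peakiness(err):
        s = np.abs(np.fft.rfft(err))
        return s.max() / s.mean()                 # tonal error -> large ratio, noise-like -> small

    print(peakiness(plain - x), peakiness(dithered - x))   # the first number is far larger

The undithered error is concentrated in discrete distortion products related to the signal; the dithered error is a flat, benign hiss, which is the trade being described.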
If this is so relevant and obvious, why not put up two files: one 192/24 and one 48/16 and allow people to run their own double-blind test as he notes in the article? If you could produce a repeatable test where some number of people can tell that one is better, that would be a powerful argument.
He's argued that people have done this test over and over, and nobody can ever tell the difference.
Firstly people haven't 'done this test over and over.'
There's been exactly one serious sort-of peer-reviewed paper in the AES journal, and that paper compared high-res commercially mastered audio sources of possibly questionable parentage with a 44.1/16 downconversion.
It also included SACD, which isn't a fixed bit depth linear PCM technology, and has been justifiably criticised for it.
I'm not aware of any tests that compare raw high-res unprocessed recordings with downsampled content.
Secondly, a fair comparison would be 48/16 and 48/24.
Personally I'm not very sold on high sample rates. I know there are technical reasons why it's easier to make antialiasing filters sound transparent at 96k than it is at 44.1k, and in practice it's not easy to pull apart practical design from theoretical limits. (Nyquist is only ever an ideal. No hardware is ever Nyquist-perfect.)
Basically psychoacoustics is hard. Ears are ridiculously sensitive, brains are occasionally delusional, and marketing people lurk everywhere.
It's extremely difficult to pull apart fact from fiction.
But that's no excuse for having a misleadingly superficial understanding of the theory - which the original article does.
The problem with this approach is that no one (except professionals) has a properly treated room. There is no way anyone, on any equipment, can make any critical decisions about audio in an untreated room. Ok, I exaggerate. But my point stands. Unless you've fully gone to town on room treatment, no one is going to be able to tell. The room will sound like ass even with a million-dollar speaker system.
What you need to do is get those files, and send people down to a professional mix studio. Then AB them in there. Get people to sit in the sweet spot. Then you'll have a decent result.
Thing is, people have done this very test in professional listening spaces. And the results are always fascinating (and some people can tell!). In fact it's one of the most fun aspects of a professional facility! Audio shootouts!
After thinking about this for a while, I think I might have identified your specific problem: you're playing the music back louder than it was recorded. It's possible to amplify quiet sounds until you can hear the quantization noise, but it's also possible to turn the volume down until you can't hear the noise. And at that point, there is plenty of dynamic range in 16 bits to take you all the way to the threshold of pain, so you're not losing out on the high end either. Of course it's possible that the recording was made very quiet, so you have to crank it up to listen to it. But that's a problem with the mastering, not an inherent limitation of the 16-bit format.
I'm not sure what you mean about quieter sounds having less resolution. The point of having a 16-bit representation of a sample is that you pick the single point on the 65,536-point (logarithmic) line that is closest to the sound pressure level at that time. It's not like the points "below" your sound add up or anything. A loud sound gets a high number, and a quiet sound gets a low number. Both cases have the same precision.
In fact, because the points are logarithmically spaced, the points in the quiet part of the spectrum are closer together and have better resolution than loud sounds.
He knows what he is talking about. But I think his tone is somewhat snobbish because he is a little tired of the nonsense that is spread about music theory.
I've had exactly the opposite feeling - I've read a piece of text written by a guy who's tired of explaining to people that more isn't better (and in some cases is worse).
I guess his exasperation with audiophiles must be leaking through. Since I feel the same way the tone actually resonates with me and only signals confidence in the correctness of the facts presented.
Not just his tone! Check out the self-styled experts in these threads who, despite never having been in a professional facility, nor having actually tried the experiments themselves in such a facility, are nonetheless totally sure about what their outcome would be, and that anyone who disagrees is wrong.
Ok, I have $1k+ of sound equipment, I listen to lossless, and my equipment has a 24/192 DAC in hardware (so even if I feed it 16/44.1, it will get up-converted). Now give me 24/192 music and don't tell me "you don't need it", because I need it.
But for my mobile phone and $29 earpods, a 96 kbps 16/44.1 MP3 from SoundCloud is good enough.
Well... There are a great many (infinite, really) possible input signals that will all generate the same 16/44 recording, but some of those inputs are more likely than others. You could simulate the likeliest and "record" it at 24/192.
It's also "impossible" to upscale an image, but the difference between bilinear and bicubic and sinc upscaling is readily apparent.
This of course assumes that the listener is capable of discerning 24/192 sound.
Sure, but there's no reason to expect you lose anything, either, barring implementation errors. Lots of the TV you watch looks fine, despite being a 720p signal displayed on a 1080p display. Same concept.
There is a good section in the article describing why not only do you not need it, but if you use it (as a terminal/listening format) you are probably just making things worse.
A system designed end to end to play 24/192 (not the DAC, it's the amps, speakers etc. you have to worry about) could do this reasonably, but you aren't putting that together for $1k. Or probably $10k. Mainly because there is no point, if the mix down is done right you can't physically hear the differences anyway.
It's a good format for mixing, and a hopeless one for listening.