KeithL
Administrator
Posts: 10,273
|
Post by KeithL on Dec 27, 2016 12:35:09 GMT -5
I'm going to agree with both sides here...... The value of Double Blind Tests is in separating actual audible differences from other factors. If you cannot hear a difference between two pieces of equipment in a double blind test, with music you're familiar with, under ANY test conditions you prefer, then they really don't sound different. As someone much wiser than I once said: "If the differences you think you hear mysteriously disappear when you close your eyes, then they were never there to begin with". Now, obviously that's not to say that certain TYPES of double blind test may not be seriously limited..... However, if you can't hear a difference under ANY conditions, then there really isn't any. (So, if you bring your favorite music, and use your favorite speakers, and listen as long as you like, and you still can't tell the difference, then it's because there isn't any.) HOWEVER, it's also true that we don't always select audio equipment based ONLY on how it sounds. We buy equipment all the time on how it looks, or how the controls feel, or what features it has. And, yes, some features actually do affect the quality of the listening experience in sometimes unexpected ways. Let's say I have two preamplifiers, one with a stepped level control, the other with a potentiometer, and both exactly and identically perfectly neutral. One person may find that the mis-tracking between channels on the potentiometer volume control at low levels drives him nuts, while another may not notice that, but finds that he can never get the stepped control set to exactly the volume level he likes. There we have a difference in something other than "sound quality" that nevertheless actually does influence the sound quality each person experiences. (And there's no way you can do any sort of double-blind test because it's going to be obvious which control is which.)
I personally once spent an extra $400 to get the more "premium version" of a preamp, knowing full well that it sounded identical to the less costly standard version, because I simply liked the look of the front panel and the feel of the Volume knob on the more expensive one. (It's no different than buying a car with fancy wheel rims, or comfortable seats.) However, to me, I REALLY want to know what I'm paying for. I want to KNOW whether I'm buying a real difference in sound, or simply a prettier face plate and nicer knobs. I very much don't like the idea that I may be paying for something THAT I'M NOT GETTING. (And, in fact, it bugs me even more to find out that I've been TRICKED into paying for something that's not there.) I'm not going to comment on the merits of DBT per se, but just say that I don't find it useful for my own audio evaluations. It's too much trouble and whether I like gear because it truly does sound "better" (as in to my own ears) because physically it is, or if it is purely psychological, then what difference does it make. I suppose someone may say if it is psychological then why waste my money but then on the other hand how many things do we buy for purely psychological reasons? If we didn't buy things for psychological factors then there wouldn't be competing brands of the same product. The main thing is to be able to enjoy whatever you spend your hard earned money on. If you have to keep second-guessing yourself then what's the good in that (unless that's what brings you joy, rather than whatever the product is intended to do). Monku, I have to mention one point here. I know of many folks who have spent for mainly psychological reasons thousands of IMO useless dollars on $1000 DAC's for example plus expensive speaker connectors, cables, and other very questionable gear, way more than I have spent on my entire system!
|
|
KeithL
Administrator
Posts: 10,273
|
Post by KeithL on Dec 27, 2016 12:51:05 GMT -5
I'm not quite sure why it HAS to be an either/or situation...... Would it really be so awful to say: "In a double blind test, with my favorite music, I couldn't hear any difference, so I guess I was imagining that. However, since I enjoy using that one, there's obviously something I like better about it, so I'm going to keep it." Maybe you just LIKE drinking wine out of a nice cut crystal wine glass instead of a jam jar from Walmart. I don't understand why most audiophiles are essentially unwilling to admit that they just like a certain piece of equipment for totally irrational reasons. To me it seems obvious that many are just trying to rationalize what is basically an aesthetic decision by imagining that there must be some hard fact somewhere to justify it. Could they just be desperate to convince their friends that they're not irrational. Maybe, if you can't hear a difference with your eyes shut, you just like the way it looks, or the color of the face plate, or that nice blue display..... Would that be so bad...? Then how about this: Live with each DAC for whatever period of time you need one at a time. A week, a month, whatever. You would know which one you were listening at any given time. Then have someone put one of the three back into your system and again listen to it for whatever time you need. But this time you would not know which one you were listening to. Would you be able to identify which one it was? 100% of the time? The chance of identifying it would be higher. But not necessarily a given. But what does that say if you don't. Does that mean all the times you were listening and heard the difference was your brain being deluded. Was it? 100% of the time you were using it? It's easy to say of course! Your brain was deluded by pretty looks / all the cash you spent. It could. But it doesn't mean it is so. Which one was correct? You closed your eyes and the answer is different. So your brain is to fault. Or is it...? 
What if you closed your eyes for a week or a month and not knew what DAC it was. But then you open your eyes and then start identifying different differences. Which one is the correct result? Would you go with your sighted result or your blind result. What if you go with the blind result (which happened to score say last place in the sighted)....and you find it still scores last place as time goes on with your eyes open? Would you still stick with the blind result while listening with your eyes open?
|
|
|
Post by yves on Dec 27, 2016 13:35:01 GMT -5
https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1116-1-199710-S!!PDF-E.pdf
"As an example, if the actual sequence of audio items is identical for all the subjects in a listening test, then one could not be sure whether the judgements made by the subjects were due to that sequence rather than to the different levels of impairments that were presented."
"Where non-homogeneity is expected this must be taken into account in the presentation of the test conditions."
"A major consideration is the inclusion of appropriate control conditions."
"It should be understood that the topics of experimental design, experimental execution, and statistical analysis are complex, and that only the most general guidelines can be given in a Recommendation such as this. It is recommended that professionals with expertise in experimental design and statistics should be consulted or brought in at the beginning of the planning for the listening test."
"It is important that data from listening tests assessing small impairments in audio systems should come exclusively from subjects who have expertise in detecting these small impairments. The higher the quality reached by the systems to be tested, the more important it is to have expert listeners."
Should I go on?
|
|
|
Post by garbulky on Dec 27, 2016 14:06:26 GMT -5
I'm not quite sure why it HAS to be an either/or situation...... Would it really be so awful to say: "In a double blind test, with my favorite music, I couldn't hear any difference, so I guess I was imagining that. However, since I enjoy using that one, there's obviously something I like better about it, so I'm going to keep it." Maybe you just LIKE drinking wine out of a nice cut crystal wine glass instead of a jam jar from Walmart. I don't understand why most audiophiles are essentially unwilling to admit that they just like a certain piece of equipment for totally irrational reasons. To me it seems obvious that many are just trying to rationalize what is basically an aesthetic decision by imagining that there must be some hard fact somewhere to justify it. Could they just be desperate to convince their friends that they're not irrational. Maybe, if you can't hear a difference with your eyes shut, you just like the way it looks, or the color of the face plate, or that nice blue display..... Would that be so bad...? The chance of identifying it would be higher. But not necessarily a given. But what does that say if you don't. Does that mean all the times you were listening and heard the difference was your brain being deluded. Was it? 100% of the time you were using it? It's easy to say of course! Your brain was deluded by pretty looks / all the cash you spent. It could. But it doesn't mean it is so. Which one was correct? You closed your eyes and the answer is different. So your brain is to fault. Or is it...? What if you closed your eyes for a week or a month and not knew what DAC it was. But then you open your eyes and then start identifying different differences. Which one is the correct result? Would you go with your sighted result or your blind result. What if you go with the blind result (which happened to score say last place in the sighted)....and you find it still scores last place as time goes on with your eyes open? 
Would you still stick with the blind result while listening with your eyes open? Well yes. I'd rather spend it on the sound quality not looks. If it looks great, cool. But that's not the selling point for me. This may surprise people, but I'm not really concerned about what my friends think about my enjoyment! Though I do think they'd think I was more nuts if I started wearing blindfolds or throwing towels on my equipment. We got to stop looking at ourselves like a moth, easily distracted by shiny things. Double blind tests to me should be reworded as "spend more time to hear less." People love to jump for the imagination aspect. But it's not really supported by a null result. They are mixing science with their own biases, assumptions, and an incorrect implementation and pretending that they are doing hard science here. They aren't. I don't pretend I'm doing science here. But I do think I'm listening to music subjectively with all its inherent biases and finding that it tells me a lot more than a DBT test does.
|
|
|
Post by garbulky on Dec 27, 2016 14:07:55 GMT -5
https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1116-1-199710-S!!PDF-E.pdf "As an example, if the actual sequence of audio items is identical for all the subjects in a listening test, then one could not be sure whether the judgements made by the subjects were due to that sequence rather than to the different levels of impairments that were presented." "Where non-homogeneity is expected this must be taken into account in the presentation of the test conditions." "A major consideration is the inclusion of appropriate control conditions." "It should be understood that the topics of experimental design, experimental execution, and statistical analysis are complex, and that only the most general guidelines can be given in a Recommendation such as this. It is recommended that professionals with expertise in experimental design and statistics should be consulted or brought in at the beginning of the planning for the listening test." "It is important that data from listening tests assessing small impairments in audio systems should come exclusively from subjects who have expertise in detecting these small impairments. The higher the quality reached by the systems to be tested, the more important it is to have expert listeners." Should I go on?
I think you should, yves. The DBT tests described previously in the thread don't really match what is being described, do they?
|
|
klinemj
Emo VIPs
Official Emofest Scribe
Posts: 15,098
|
Post by klinemj on Dec 27, 2016 14:50:08 GMT -5
I don't understand why most audiophiles are essentially unwilling to admit that they just like a certain piece of equipment for totally irrational reasons. I always am totally rational. For example, I insist my audio gear double as a toaster. Mark
|
|
|
Post by yves on Dec 27, 2016 15:08:00 GMT -5
https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1116-1-199710-S!!PDF-E.pdf "As an example, if the actual sequence of audio items is identical for all the subjects in a listening test, then one could not be sure whether the judgements made by the subjects were due to that sequence rather than to the different levels of impairments that were presented." "Where non-homogeneity is expected this must be taken into account in the presentation of the test conditions." "A major consideration is the inclusion of appropriate control conditions." "It should be understood that the topics of experimental design, experimental execution, and statistical analysis are complex, and that only the most general guidelines can be given in a Recommendation such as this. It is recommended that professionals with expertise in experimental design and statistics should be consulted or brought in at the beginning of the planning for the listening test." "It is important that data from listening tests assessing small impairments in audio systems should come exclusively from subjects who have expertise in detecting these small impairments. The higher the quality reached by the systems to be tested, the more important it is to have expert listeners." Should I go on?
I think you should, yves. The DBT tests described previously in the thread don't really match what is being described, do they?
That is why the ones described previously in the thread are very silly indeed, and it is also why I linked to that 30 year old article by Carl Sagan a few pages ago.
The first part I quoted from the ITU document already invalidates them because with a group of only a handful of test subjects there's just no chance that you can use statistical analysis to correctly identify—as well as compensate for—the bias towards "hearing no difference" inherent of the sequence of audio items, that I already tried to explain in the second paragraph in my response to @chuckienut (see: emotivalounge.proboards.com/post/864118/thread ).
|
|
KeithL
Administrator
Posts: 10,273
|
Post by KeithL on Dec 27, 2016 16:53:24 GMT -5
I disagree...... Double blind tests test for an identifiable difference, and so, if someone is biased to NOT hear a difference, then they may fail to notice a difference that's actually there. However, if someone does claim to hear a difference, we can easily use statistics to confirm their claim. Therefore, the solution is to provide a LOT of motivation for people to want to hear a difference. Here's the easiest way to do that..... Find a bunch of people who are at least willing to consider that there might be a difference (and as many people who are convinced there is a difference as you can find). Invite them all to do a test, with their choice of other components, and their choice of program material. And offer them each $100 if they can demonstrate that they can in fact hear a difference. (You get to run the test and tabulate the results.) That should provide plenty of motivation for them to really WANT to hear a difference........ And, if none of the people who claim that the difference is real, can prove that they hear it, even when there's money on the table, then I'll be convinced that they really can't hear it because it isn't there. And, yes, if just ONE GUY can hear that difference, but hear it reliably enough that it can't possibly be random chance, then he will have proven that it exists. And, with a standing offer of cash, you're probably going to have lots of candidates eager to prove that they CAN hear it.... all with a bias to believe that it's at least possible that they will. (And the ones with a really strong negative bias will probably stay home.) REMEMBER that, in order to prove that a difference exists, we don't need anything beyond a single person who can hear it. And, in the converse, if we open our test to a lot of people, all with a strong bias to succeed, and NONE of them do, then we can reasonably infer that it's very unlikely to be there. 
(While we won't have dis-proven it absolutely, we will have made "every reasonable attempt" to confirm it, and failed -which is as close as you can practically get to proving a negative.) I think you should yves The DBT tests described previously in the thread don't really match what is being described do they? That is why the ones described previously in the thread are very silly indeed, and it is also why I linked to that 30 year old article by Carl Sagan a few pages ago. The first part I quoted from the ITU document already invalidates them because with a group of only a handful of test subjects there's just no chance that you can use statistical analysis to correctly identify—as well as compensate for—the bias towards "hearing no difference" inherent of the sequence of audio items, that I already tried to explain in the second paragraph in my response to @chuckienut (see: emotivalounge.proboards.com/post/864118/thread ).
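For anyone wondering what "reliably enough that it can't possibly be random chance" looks like in numbers, here is a minimal sketch of the usual one-sided binomial calculation. It assumes a forced-choice test in which pure guessing is right half the time. The function name and the 12-out-of-16 session are made up for illustration and do not describe any test actually run in this thread.

```python
from math import comb


def guess_probability(correct: int, trials: int, chance: float = 0.5) -> float:
    """Chance of getting at least `correct` answers right in `trials`
    forced-choice trials by guessing alone (one-sided binomial tail)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(correct, trials + 1)
    )


if __name__ == "__main__":
    # Hypothetical session: 12 correct identifications out of 16 trials.
    print(f"{guess_probability(12, 16):.4f}")
```

The smaller the number that comes back, the harder it is to put the score down to luck. Where to draw the line (1 in 20, 1 in 100, and so on) is a choice the person running the test has to make before the session, not something the arithmetic decides.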
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 19:43:34 GMT -5
OK Chuckie - We'll agree to disagree. You contend that UNLESS a LMDB test identifies the differences, that they don't exist (except in the imagination of the listener). I contend that they do, and that it's not merely imagination. The Chuckienut test is but a variation of LMDB, so nothing different there, thanks. I really don't care enough to expend the effort. And with that, my participation in this thread is (finally) completed. I don't ever expect to convince you, and (it goes without saying) you haven't convinced me. Such is life. That said, I still enjoy your posts greatly, and will continue to look forward to your (wonderful & twisted) sense of humor! Happy New Year to you & yours! Cordially - Boomzilla
Boom, Happy New Year to you and your family also, and that goes for Garbulky too. "We'll agree to disagree." I never agreed, don't put words in my mouth! You insult me when you trash my Nut Test by saying it is simply a variation, because it is very different from the LM/DB ones you like to talk about. I made it very flexible: I only have control over which DAC is being used, and you do not know which one is active. There is no level matching (LM) and it is not double blind (DB), hello! It is very obvious why you wouldn't want to try it. You won't even give it any consideration. I looked in the dictionary for the words/phrases stubborn and non-open minded and I saw the photo that's in your avatar.
Boom said: And with that, my participation in this thread is (finally) completed.
Ok Boom, I do have some more to post here in your thread. Let's not see you sneak back into this thread, I'll be watching! Also, here is a photo of the last person who called my sense of humor twisted. (I love this photo as I get so much mileage from it!)
PS: Next time you and/or Garbulky do a review/shootout, how about posting some good photos of the room, your gear rack, speakers, hook ups, seats and the products being tested, etc. I usually have them in all of my reviews (most have been way back). Photos are easy to post and give us some extra confidence in, and an idea of, your test environment.
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 20:03:07 GMT -5
|
|
|
Post by garbulky on Dec 27, 2016 20:10:11 GMT -5
Put a little effort in to it, now! I'm here with my towel and blindfold. But I have changed my mind with that weak sauce you laid on.
|
|
|
Post by Jim on Dec 27, 2016 20:21:34 GMT -5
But but but.. "Why do you irritate yourself by continuing to read this thread?"
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 20:35:45 GMT -5
Just out of curiosity, and I haven't read this entire thread, has a DBT ever been done when minor imperfections were deliberately introduced in the DACs? So, 2 versions of DC-1, with one of them having a small (and variable) amount of conversion imperfection (not sure if this is easily possible). At what percentage of difference between the 2 do trained (and non-trained) listeners reliably identify difference (preference is immaterial)?
That is a good question IMO, and of course it is hypothetical, no problem. I'm not sure anyone here could give a definitive answer. I would think a decibel difference would be much easier to work with than a percentage for loudness variations. I have usually read that the average Joe can detect a loudness difference down to about 1dB in the 1kHz-4kHz range. Other sources, which I think are perhaps more correct, say that those with excellent hearing can detect differences down to about 0.5dB. These figures are for the mid range, where our hearing is most sensitive. When talking about frequencies, I have read that the average Jane can distinguish frequencies that are about 0.25% (or so) apart. Some folks have a far better sense of pitch than others. Many times our ears are even less accurate than that, which is why I always use a RS meter when I manually (which I always do) set the speaker gain/volume in the speaker setup menu. Many folks will set the speakers by ear and be 1-2dB off the exact level. The meter allows one, with care (I always use a quality photo tripod and other precautions), to get results precise to 0.5dB (or even close to 0.25dB with great care). This is much more precise than human ears. I usually don't trust the auto setup speaker processing in the Pre-Pro/AVR's. Yours is an interesting question and I'm not sure if my thoughts are any help. Maybe others can contribute.
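Since the question was asked in percent and the answer came back in decibels, a small conversion sketch may help tie the two together. It relies only on the standard definitions (a level difference in dB is 20 times the base-10 log of the amplitude ratio, and an octave is 1200 cents). The function names are made up for illustration, and the thresholds fed in are just the figures mentioned in the post above, not measured data.

```python
import math


def db_to_amplitude_ratio(db: float) -> float:
    """Voltage or sound-pressure ratio corresponding to a level difference in dB."""
    return 10 ** (db / 20)


def percent_to_cents(percent: float) -> float:
    """Musical interval, in cents, for a frequency difference given in percent."""
    return 1200 * math.log2(1 + percent / 100)


if __name__ == "__main__":
    for db in (1.0, 0.5, 0.25):
        change = (db_to_amplitude_ratio(db) - 1) * 100
        print(f"{db} dB is about a {change:.1f}% amplitude change")
    print(f"0.25% in frequency is about {percent_to_cents(0.25):.1f} cents")
```

So a 1 dB step is roughly a 12% change in amplitude and 0.5 dB roughly 6%, which is why "percent" and "dB" thresholds can look very different even when they describe the same thing.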
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 20:56:38 GMT -5
Put a little effort in to it, now! I'm here with my towel and blindfold. But I have changed my mind with that weak sauce you laid on. WOW! Garbulky. I am so proud of you. You have finally seen the light. Your photo proof is very classy!
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 21:45:48 GMT -5
Post by the Lounge Lizard formerly known as ChuckieNut:
The answer of course is they wouldn't be identified because you imagined you heard the differences knowing which DAC's were being used. Using my test will prove you can't really hear any such differences and your circle logic won't let you BS your way out of the results this time.
Plus, my Easy Nut Test is the official DAC sound test of the National Blind Foundation.
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 22:53:17 GMT -5
I'm going to agree with both sides here...... The value of Double Blind Tests is in separating actual audible differences from other factors. If you cannot hear a difference between two pieces of equipment in a double blind test, with music you're familiar with, under ANY test conditions you prefer, they they really don't sound different. As someone much wiser than I once said: "If the differences you think you hear mysteriously disappear when you close your eyes, then they were never there to begin with". Now, obviously that's not to say that certain TYPES of double blind test may not be seriously limited..... However, if you can't hear a difference under ANY conditions, then there really isn't any. (So, if you bring your favorite music, and use your favorite speakers, and listen as long as you like, and you still can't tell the difference, then it's because there isn't any.) HOWEVER, it's also true that we don't always select audio equipment based ONLY on how it sounds. We buy equipment all the time on how it looks, or how the controls feel, or what features it has. And, yes, some features actually do affect the quality of the listening experience in sometimes unexpected ways. Let's say I have two preamplifiers, one with a stepped level control, the other with a potentiometer, and both exactly and identically perfectly neutral. One person may find that the mis-tracking between channels on the potentiometer volume control at low levels drives him nuts, while another may not notice that, but finds that he can never get the stepped control set to exactly the volume level he likes. There we have a difference in something other than "sound quality" that nevertheless actually does influence the sound quality each person experiences. (And there's no way you can do any sort of double-blind test because it's going to be obvious which control is which.) 
I personally once spent an extra $400 to get the more "premium version" of a preamp, knowing full well that it sounded identical to the less costly standard version, because I simply liked the look of the front panel and the feel of the Volume knob on the more expensive one. (It's no different than buying a car with fancy wheel rims, or comfortable seats.) However, to me, I REALLY want to know what I'm paying for. I want to KNOW whether I'm buying a real difference in sound, or simply a prettier face plate and nicer knobs. I very much don't like the idea that I may be paying for something THAT I'M NOT GETTING. (And, in fact, it bugs me even more to find out that I've been TRICKED into paying for something that's not there.) I have to compliment Keith in this very non-confrontational/diplomatic post as I believe he is very kindly avoiding choosing sides. However, just reading his first paragraph clearly indicates he is generally in favor of many blind tests which is directly in conflict with the title of this thread: Why double-blind testing is completely worthless. In the second paragraph he changes topics when talking about choosing audio equipment on other factors as well as how they sound, such as looks, control feel and features. I have in one of these recently connected threads mentioned that I actually own the Emo XDA-2. I did lots of study and consulting with others before I ordered it. I decided I wanted it whether I would be able to hear any sound advantages over the PC's DAC or that of my ERC-2. I was happy to make it my "control center" for my PC sound system due to its ability to control both the Airmotiv 4's plus the Mirage 8" sub with good connections, blending and volume control with remote. It of course accommodated my good headphones. It would also accommodate an external CD player from the PC if desired. 
Much of the listening thru the speakers is near field when sitting at my computer desk, but many times I listen when away from the desk and up to 20 feet away in the adjacent kitchen or living room, rather than turning on my main system. The convenience of the remote control makes it a strong buying factor when I'm away from the computer desk. I did in fact do some fairly extended sound comparisons and generally found little if any significant sound differences, but seemed to hear better sound with the ASRC. I highly value my XDA-2 for other than sound reasons but perhaps a little sound improvement though quite subtle.
|
|
Deleted
Deleted Member
Posts: 0
|
Post by Deleted on Dec 27, 2016 23:12:35 GMT -5
I don't understand why most audiophiles are essentially unwilling to admit that they just like a certain piece of equipment for totally irrational reasons. I always am totally rational. For example, I insist my audio gear double as a toaster. Mark OK, I have to fess up here. There is one main reason the Nut Wife and I like our Emotiva gear. It's the damn sexy blue lights! I wish the speakers had a blue power light too. When I ordered and kept the Furman ELITE-15 DMi power box it was for the cool blue lights. Ended up that they were almost a perfect match. When Nori-Maki gets bored with a movie we are watching she just fixates on the blue lights in the gear stand which is off right of the right speaker and doesn't detract from the screen. Our Blue Heaven!
|
|
|
Post by yves on Dec 27, 2016 23:13:45 GMT -5
I disagree...... Double blind tests test for an identifiable difference, and so, if someone is biased to NOT hear a difference, then they may fail to notice a difference that's actually there. However, if someone does claim to hear a difference, we can easily use statistics to confirm their claim. Therefore, the solution is to provide a LOT of motivation for people to want to hear a difference. Here's the easiest way to do that..... Find a bunch of people who are at least willing to consider that there might be a difference (and as many people who are convinced there is a difference as you can find). Invite them all to do a test, with their choice of other components, and their choice of program material. And offer them each $100 if they can demonstrate that they can in fact hear a difference. (You get to run the test and tabulate the results.) That should provide plenty of motivation for them to really WANT to hear a difference........ And, if none of the people who claim that the difference is real, can prove that they hear it, even when there's money on the table, then I'll be convinced that they really can't hear it because it isn't there. And, yes, if just ONE GUY can hear that difference, but hear it reliably enough that it can't possibly be random chance, then he will have proven that it exists. And, with a standing offer of cash, you're probably going to have lots of candidates eager to prove that they CAN hear it.... all with a bias to believe that it's at least possible that they will. (And the ones with a really strong negative bias will probably stay home.) REMEMBER that, in order to prove that a difference exists, we don't need anything beyond a single person who can hear it. And, in the converse, if we open our test to a lot of people, all with a strong bias to succeed, and NONE of them do, then we can reasonably infer that it's very unlikely to be there. 
(While we won't have dis-proven it absolutely, we will have made "every reasonable attempt" to confirm it, and failed -which is as close as you can practically get to proving a negative.) That is why the ones described previously in the thread are very silly indeed, and it is also why I linked to that 30 year old article by Carl Sagan a few pages ago. The first part I quoted from the ITU document already invalidates them because with a group of only a handful of test subjects there's just no chance that you can use statistical analysis to correctly identify—as well as compensate for—the bias towards "hearing no difference" inherent of the sequence of audio items, that I already tried to explain in the second paragraph in my response to @chuckienut (see: emotivalounge.proboards.com/post/864118/thread ). That's not what I was referring to because my first quote from the ITU document is about the sequence of audio items that is heard by the test subject, not about motivating the test subject. In a sequence of audio items the order in which audio items are presented skews judgement of audio items, and the 2nd paragraph in my response to Chuckie is a logical explanation of that observation.
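The ITU point about sequence can be made concrete with a small sketch: instead of playing the items in the same order for everybody, each subject gets an independently shuffled order, so that an order effect is not baked into every result the same way. This is only an illustration of the idea, assuming a plain random shuffle per subject. The function name, item labels, and listener labels are hypothetical, and the Recommendation itself does not prescribe this particular code.

```python
import random


def presentation_orders(items, subjects, seed=None):
    """Give each subject an independently shuffled copy of the item list."""
    rng = random.Random(seed)
    orders = {}
    for subject in subjects:
        order = list(items)  # copy, so the caller's list is left untouched
        rng.shuffle(order)
        orders[subject] = order
    return orders


if __name__ == "__main__":
    # Hypothetical items and listener labels, for illustration only.
    plan = presentation_orders(["A", "B", "C", "D"], ["listener 1", "listener 2"], seed=1)
    for subject, order in plan.items():
        print(subject, order)
```

Independent shuffles do not guarantee that two subjects never get the same order, and with only a handful of subjects there are too few different orders for any analysis to separate an order effect from a real difference, which is the limitation I was pointing at.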
|
|
|
Post by copperpipe on Dec 28, 2016 9:41:59 GMT -5
Garbulky and Boomzilla; you guys seem to enjoy these little tests and get-togethers ... so here is a suggestion for next time: one of you buys a new DAC without telling the other what it is (model, brand, color, specs, price, preliminary thoughts, or anything else other than that it's a DAC), then you run blind tests on that DAC and the DC-1. It's very hard to step back from a DAC when you've already been emotionally invested in it for months; it's a lot easier when you have no "skin in the game".
|
|
KeithL
Administrator
Posts: 10,273
|
Post by KeithL on Dec 28, 2016 10:25:09 GMT -5
But in this context they are in fact related... If you're doing a comparison between multiple items, then anything that skews preferences matters, and that includes the order of presentation. HOWEVER, the situation is quite different when you're doing a study on whether something exists or not (or is audible or not). In this situation, there's no consideration of fairness, and bias is simply irrelevant, other than as a motivating factor. The result is an either/or answer. Either someone can hear a difference or NOBODY can hear a difference.

Since you are essentially attempting to justify a null assumption (or fail to justify it), the most accurate method is the one that gives subjects the absolute best possible bias to succeed. (This is true because, unlike in other forms of tests, a bias CANNOT skew positive results and make them less reliable, because we can statistically test a positive result; but a bias can skew a null result by raising doubt that a "best effort" was made to succeed. There is no form of bias that will allow someone to hear a difference that doesn't exist; at most, a bias could prevent them from noticing a difference that does exist, so the most accurate result will be obtained with the strongest possible bias in favor of a positive result. This will ensure that a null result is really due to a null condition.)

Think of it like trying to determine the absolute fastest a human can run. While a LACK of motivation might cause the best runners not to show up, or might cause the participants to fail to "give their all", there is no possible bias that will force or encourage anyone to run faster than they actually can. Therefore, the most accurate result will be obtained if you provide the most motivation possible. An Olympic gold medal is great incentive for some people, and a million dollar prize is better for others, but you'll probably get an even better result if you release some man-eating lions behind the runners.
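To make "we can statistically test a positive result" concrete, here is a minimal sketch in Python. It assumes a forced-choice (ABX-style) test in which a listener who hears nothing would be right half the time by pure guessing; the 16-trial session and the function name are made up for illustration.

```python
from math import comb

def p_value_if_guessing(correct, trials):
    """Probability of scoring at least `correct` out of `trials`
    when every answer is a coin flip (chance of a hit = 0.5)."""
    # Count all the ways to get `correct` or more hits, out of 2**trials outcomes.
    ways = sum(comb(trials, k) for k in range(correct, trials + 1))
    return ways / 2 ** trials

# Assumed example: one listener, 16 trials.
for correct in (9, 12, 14):
    print(f"{correct}/16 correct: p = {p_value_if_guessing(correct, 16):.4f}")
```

With 16 trials, 9 correct is entirely consistent with guessing (p is about 0.40), 12 correct is borderline (about 0.038), and 14 correct is hard to explain as luck (about 0.002). The smaller that number, the stronger the case that the listener really heard something.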
In our example, since you are trying to test a human limit (the ability to distinguish a certain difference), and it's simple enough to rule out any false positives (because we can statistically determine whether someone has actually detected a difference or not), we will get the most accurate result by testing the widest variety of subjects, and giving them the most motivation to succeed. Since we cannot practically test every human on Earth, and we cannot rely on every participant being optimally motivated, the closest we can practically come is to provide sufficient motivation and allow some self-selection.

If we offer a huge prize, we will ensure that people who believe they have an opportunity to win will show up, and will do their best to win the prize. While it's possible that we will miss a potential "winner" who lacks the willingness or the confidence to compete, we will attract most of the people who already have the bias of believing that they might win, avoid the people who have the opposite bias, and strongly motivate the ones who show up. And, with luck, the ones who think they might win will really be the ones with a reasonable likelihood of doing so. (This will give us the best practical chance for success. And starting with the best chance for success will also give us the best claim that, if we fail, it's because the answer is really null. Back to my original analogy: if you want to claim to have found "the fastest person on Earth", within practical limits, the best way is to have an open contest, where anyone is welcome to compete, and give them both a strong motivation to compete, and a strong motivation to win.)

I agree entirely with you that the sequence of presentation tends to skew judgment... however, in this case, we're not talking about judgment... it's a simple yes or no (the difference is or is not audible). If you want to extend the test to "which one sounds better" then you've specified a more complex test.
However, you'll get better overall accuracy, with less effort, if you separate out the question "is there a difference at all" and answer it first, with a simpler yes/no test. (After all, if there is no difference, then any effort trying to figure out which is better will be wasted effort, and any result you get will be "statistical noise".) Alternately, if there is no actual difference, a study about what factors affect the differences that people IMAGINE are there would be most interesting... especially to the marketing department.
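One practical wrinkle with an open contest: the more people who try, the more likely it is that somebody passes by luck alone, so the pass mark has to be set with the number of entrants in mind. The Python sketch below is a rough illustration under stated assumptions, not a full test protocol: every entrant does the same number of forced-choice trials, a pure guesser is right half the time, and entrants are independent. The function names and the 50-listener, 16-trial figures are hypothetical.

```python
from math import comb

def p_pass_by_guessing(pass_mark, trials):
    """Chance that one pure guesser gets at least `pass_mark` of `trials` right."""
    return sum(comb(trials, k) for k in range(pass_mark, trials + 1)) / 2 ** trials

def p_any_lucky_pass(listeners, pass_mark, trials):
    """Chance that at least one of `listeners` independent guessers passes."""
    return 1 - (1 - p_pass_by_guessing(pass_mark, trials)) ** listeners

def required_pass_mark(listeners, trials, alpha=0.05):
    """Smallest pass mark keeping the chance of ANY lucky pass at or below alpha.

    Returns None if even a perfect score is not strict enough.
    """
    for mark in range(trials + 1):
        if p_any_lucky_pass(listeners, mark, trials) <= alpha:
            return mark
    return None

# Assumed example: 50 entrants, 16 trials each.
for mark in (12, 14, 15):
    print(f"pass mark {mark}/16: chance somebody passes by luck = "
          f"{p_any_lucky_pass(50, mark, 16):.3f}")
print("pass mark needed for 50 entrants:", required_pass_mark(50, 16))
```

Under these assumptions, a 12/16 pass mark that looks fine for one listener would let a lucky guesser through about 86% of the time with 50 entrants; 14/16 brings that down to about 10%, and 15/16 to about 1%. A simpler alternative is to have any apparent winner repeat the test on a fresh set of trials.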
|
|