Upper academia has its unforgivable sins. Fudging data, plagiarism, and faking credentials are certainly on the list along with cheating on exams. Still my ego and my conscience went round and round in an endless debate. Why should I feel guilty and shameful? If so, I was in a golden career position. Was I a psychopath? But my ethical world-view was strong-form Darwinism: I wasn’t caught and nobody knew, therefore I was fit to survive. Of course I would have to keep my secret forever, even from my psychoanalyst. But weren’t the test questions idiosyncratic and arbitrary? Wouldn’t faculty members themselves have trouble answering the ones they hadn’t written? But why couldn’t I score higher on tests without cheating? What was wrong with my intellect? Still I never had to take another test again in my whole life now that I’d completed and defended my thesis. But cheating on exams was one of the few unforgivable sins in academia.
I began to try to prove to myself that I wasn’t a psychopath by doing research that would benefit mankind, unconsciously gravitating toward the study of crime and criminals. I also became acutely aware of moral and ethical flaws in other psychologists and psychology as a whole. I carefully avoided data fudging and plagiarism. Somehow, I hoped to redeem myself.
Like all who doubt their own competence, I became sharply attuned to the competence of my superiors. I studied my teachers as much as I studied their teaching. The Princeton faculty in psychology were at odds. Each major player formed his own “school” and each school had contempt for the others. The top psychometrician thought the top clinician fuzzy-minded and mendacious. The clinician diagnosed the psychometrician as anal-retentive and obsessive-compulsive. Everybody joked about the physiologist who spent day and night in the dank basement with his dying cats. Everybody thought the social psychologist was a political opportunist. He in turn implied his colleagues were short-sighted, unaware of the real world of human affairs. I thought they all were correct.
Whatever their failings, they were extremely professional. What’s more, they were actual founders of their respective “schools”. Only by an arbitrary stroke of a dean were they destined to live under the same roof.
Glen Wever, for example, clearly belonged in a biology department somewhere, but his degree was in psychology and his physiology and electronic skills were self-taught. Wever and another autodidactic psychologist, Charles Bray, were renowned for discovering an electrical signal given off by the body’s smallest organ responsible for turning sounds into nerve impulses. When Wever and Bray placed their tiny electrode in just the right place on the Organ of Corti, they detected a signal correlating perfectly with sound coming into the ear. When they amplified this signal, they could actually hear through the ear of a cat. At first, physiologists thought this “cochlear potential” to be an artefact of crude electronics or just a nerve impulse, but by repeating the experiment on different animals with different instrumentation, Wever proved the signal to be emanating from an organ – and earned the Wever-Bray research a place in physiology textbooks.
Silvan Tomkins and Glen Wever were so far apart in world-view that they ignored each other rather than squabble. If Wever was a physiologist, then Silvan Tomkins, a personality theorist, belonged in a philosophy department. Later in his career Tomkins authored a four-volume tome on emotion (affect) and the behaviour of the face, but in the fifties he constructed theories of personality with evidence from anywhere and everywhere: classical fiction, theatre, pop culture, psychoanalysis, experimental psychology, even horse-racing! A truly comprehensive personality theory was perhaps an impossible task: trying to generalise about something so utterly unique as individual personalities. Nevertheless Tomkins was giving it his best shot, reading, thinking, observing, interviewing, studying people of all kinds but – and this was his peer problem – doing no experimental (control group) research. After his death, Tomkins rapidly gained renown because of his work on affect imagery which, interpreted by disciples, became a cult classic among artists, writers, cartoon animators, method actors, forensic lie detector specialists, literary critics and even CIA terrorist spotters. Nevertheless, back in the fifties, Tomkins didn’t fit into the psychometric orientation of Princeton psychology. Meanwhile the philosophy department employed mostly historians of philosophy, which Tomkins was not. For this reason, acceptance of his work was delayed.
One person on campus doing what I thought a clinical psychologist really ought to do was not associated with the Psychology Department at all and was largely ignored by it. S. Roy Heath was employed on a grant to conduct a four-year developmental study of students, their goals, aspirations and achievements. Heath spent thirty hours a week interviewing a sample of thirty-six men over their four undergraduate years. He met with his subjects individually to monitor intellectual growth and development. He also met with them in small groups for evening discussions.
Routine as this study might seem today, it was a first of its kind and produced a phenomenal result. Responding to attention from Heath, the students in his sample became very keen on him – and worked harder and performed better. Selected to be average, they rapidly became outstanding scholars and leaders. As news spread, students not in the sample were requesting Dr. Heath be allowed to study them as well. Students in Heath’s sample talked so glowingly and inappropriately about the great Dr. Heath that the Dean of Students feared a cult was developing and took steps to make sure Heath’s term did not extend beyond the four years of his contract.
Notwithstanding his difficulties with the Dean and neglect by the Psychology Department, Heath completed his study and his results have been cited ever since to justify the profession of student counselling.
Heath’s popularity with students is usually attributed to his skill as a good individual listener and counsellor. However, that is not what Heath, who was fascinated by group phenomena of all kinds, believed. He told me repeatedly that he considered himself an adequate clinician but certainly nowhere near as skilled as others like Tomkins. Heath believed the results he was getting had little to do with counselling skill but much to do with role. The phenomenon was something called “The Hawthorne Effect” and Heath was as surprised as anyone to discover just how powerful an effect it was.
The Hawthorne experiment (1924–1932) reported worker output increasing even when conditions deteriorated merely because workers were participating in an experiment. When working conditions were improved, output went up as expected. However, when working conditions were made worse, output continued to go up. The attention workers received caused them to respond the way they thought the experimenters wanted. Like so many of the icons of psychology, the original Hawthorne experiment has now been severely criticised. Unfortunately, the research was crudely and perhaps dishonestly done. Still, the term “Hawthorne Effect” entered the psychology lexicon to depict improvements in performance merely because one is participating in an experiment.
Roy Heath explained his popularity: “They like me because they are helping me. I am not counselling or teaching them. They are teaching me about what it is like to be in their shoes.” It was a role reversal wherein students were treated as experts. Heath’s undergraduates were making a contribution to science, to their university and to educational practice. At the same time they were helping Heath become an expert on “undergraduate life today”. This accounted for their improved performance.
Heath believed that whenever we help someone, we get to like that person providing he receives and respects our help. Heath convinced his subjects that he valued what they told him. Then, he said, they identified with him. Since he worked hard to study and understand them, they were inspired to work hard as well. Improved performance became a group norm as his research subjects inspired each other to greater achievement.
What was the intention of Heath’s study? Was there an expectancy that his sample would do better? The answer is a qualified “no”.
From my conversations with Heath, I must conclude that he, personally, wanted to help his students if he could. He was just that type of person. Nevertheless, I am equally certain that the formal, stated purpose of the Princeton study was strictly to assess the goals, aspirations and achievements of Princeton undergraduates and did not include any hint of counselling-for-improvement. The university administration was anticipating nothing like the results of this study.
I am also sure that, when they agreed to participate, Heath’s sample was not led to expect any form of help or improvement.
However, once the study got under way, the original “contractuals” may have faded as students began to use Heath as a means to improve competence and leadership. Still, the study was perceived on campus as an experiment: strictly research, not treatment or training.
All of the above, in my opinion, would predispose a maximum Hawthorne effect.
Heath never wavered in his conviction that his students were helping him, not vice versa. Even if it was not that simple, the results are remarkable. Roy Heath was a trustworthy, friendly, caring man but having these qualities did not necessarily mean that people with whom he conversed would become significantly more successful in achieving life goals. Yet, over the course of four years Heath’s research sample averaged better than others in the class in almost every respect. And unlike the original Hawthorne experiment, Heath’s data, eighteen boxes of them, are available to qualified scholars and researchers. On 1/1/2035, the Heath collection will be open to the public at which time I think the Hawthorne Effect should be renamed “The Heath Effect.”
Personally I was impressed with Heath because I’d had a lonely adolescence and wanted to be an important person. Helping others by letting them help me seemed like a solution to my alienation. Roy Heath suggested I take a summer job at the Training School in Vineland, N.J., supervising groups of mentally handicapped people. Heath also introduced me to “Highfields”, a group experiment conducted at the former Charles Lindbergh family estate in Mercer County. After their baby was kidnapped, the Lindberghs donated Highfields to the State of New Jersey and in 1950 it became a rehabilitation program for delinquent boys.
Highfields was perhaps the first of the confrontational rehabilitation programs that blossomed in the sixties and seventies. A peer group was led by a trained adult leader to focus on one of its members at a time, high-pressuring him to cop, that is, confess his faults and promise to change. The theory was that individual behaviour was greatly influenced by group norms. A skilled adult using techniques of guided group interaction (GGI) in turn could alter and control group norms. Individual behaviour would then conform to the new group norms. The method certainly looked powerful, too powerful for some right-minded social workers: during the Korean War, GGI was accused of brainwashing. But eventually it was discovered that neither brainwashing nor GGI produced permanent change. So another social science icon got shot down when “Positive Peer Culture” (as GGI was called by then) proved ineffective in reducing recidivism. Still when Roy Heath introduced me to Highfields, I was fascinated watching the toughest gang kids repent of their sins.
The Princeton faculty saw nothing new in Heath’s work. The Hawthorne Effect was disparaged as an artefact, a “placebo” (bread pill) effect, not something to be studied but a problem to be overcome, a pain in the neck to experimenters and clinicians alike.
Biomedical researchers had long known that people to whom attention was paid sometimes got better just because they thought they were getting treatment or were participating in an experiment. Such “placebo” effects were dubbed “psychological”, meaning unwanted. To eliminate them, researchers ran control groups for whom all conditions were the same except for the main variable. Thus the “experimental group” was given the real pill and the “control group” was given a placebo. If the experiment was “double blind”, then neither the experimenters nor the subjects knew who got which pill until the test was all over and the results were analysed. Since the placebo effect would influence both groups equally, any difference between the groups would be due to the main variable (the real pill). This is how they could control for (factor out) the placebo effects so that only the effect they really wanted to study remained.
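The arithmetic behind “controlling for” a placebo effect can be put in a few lines. What follows is only a sketch: every number and name in it is invented for illustration and comes from no study mentioned here. It assumes that attention adds the same boost to both groups, so that the boost cancels when one group average is subtracted from the other.

```python
from statistics import mean

# All values below are hypothetical, chosen only to show the subtraction.
BASELINE = 50.0        # assumed score for someone who receives no attention
PLACEBO_EFFECT = 8.0   # assumed boost from simply being attended to
PILL_EFFECT = 5.0      # assumed boost from the real pill


def observed_score(individual_offset, gets_real_pill):
    """Score for one subject; every subject in either group gets the placebo boost."""
    score = BASELINE + individual_offset + PLACEBO_EFFECT
    if gets_real_pill:
        score += PILL_EFFECT
    return score


# The same spread of individual differences in both groups (an idealisation).
offsets = [-2.0, -1.0, 0.0, 1.0, 2.0]
experimental = [observed_score(o, True) for o in offsets]
control = [observed_score(o, False) for o in offsets]

# With a control group, the placebo boost appears on both sides and cancels.
estimated_pill_effect = mean(experimental) - mean(control)

# Without one, the pill gets credit for the attention as well.
improvement_without_control = mean(experimental) - BASELINE

print(estimated_pill_effect)        # 5.0
print(improvement_without_control)  # 13.0
```

In this idealised case the comparison between groups recovers the pill’s assumed effect, while the comparison against an unattended baseline overstates it by the full size of the placebo boost.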
There is no evidence that the original Hawthorne researchers ever ran a group that didn’t know it was being studied. Actually, they didn’t need such a group in order to prove their point. Evidence for the Hawthorne Effect came, not from a comparison between groups, but from observing one group’s performance over varying conditions. When conditions improved, performance went up; when conditions worsened, performance went up even further. It seemed that no matter what the experimenters did, performance improved over time. That was the “effect.”
Heath’s study did not require a separate control group either because the study was not trying to demonstrate an effect but to characterise a population (discover the goals, aspirations, achievements etc. of the class of 1954). It would be too expensive to study every member of the class, so a representative sample was chosen. The expectation was that results from the sample would then apply to the entire class of 1954. If, at the end of the study, the grade averages, sports participation, yearbook type data etc., were the same for the sample and the entire class, then one could assume that the interview results from Heath would also apply to the class as a whole. This expectation was not fulfilled: the sample did better than the rest of the class, ipso facto the Hawthorne Effect!
The study was then a failure from the administrative point of view because they didn’t know any more than before about the goals, aspirations and achievements of the class of 1954. All they knew were the goals, aspirations and achievements of Heath’s sample, and those could not be assumed to be typical because Heath’s sample was superior to the rest of the class on nearly all common variables such as grades, sports, campus politics etc.
So even though the Hawthorne Effect was not discovered by means of a control group experiment, once it was defined, it became “something that must be controlled for.”
Instead of trying to study placebo effects in depth, psychologists opted merely to imitate the other sciences and control for them. One reason was obviously that placebo effects smacked of faith healing, medical fraud, and the paranormal. Princeton experimental psychologists were mostly aggressively sceptical regarding such things.
So the Hawthorne Effect was honoured in the breach as a placebo effect and never adequately studied. And the educational administrators and guidance counsellors failed to see the importance of role reversal. Instead they saw students benefiting from having a trustworthy, helpful, caring counsellor to focus their attention on the learning progress. Failure to perceive the role reversal is an example of professionalization of experimental results.
What happened to Roy Heath’s results has happened to some of the most interesting research in clinical psychology. Premature professionalism also inhibits scientific progress in all the social sciences. Results that do not fit into the standard professional paradigm get ignored, marginalised and misunderstood. Results arising from the amateur sector are disregarded or sometimes attacked. For this reason, professional psychological counselling and treatment remain rudimentary, powerless to solve the major social problems that were their mandate.
What should have happened to the Heath effect as well as the Freud effect, the Hawthorne Effect and dozens, if not hundreds, of other experimental results?
First of all, to establish whether the effect is reliable, the original research should be repeated in its original (idealised) form as stated. Repeating an experiment with different experimenters is the first thing to do with controversial results.
When a profession gets possession of an experiment, only those who already believe in the results repeat the experiment and even they don’t repeat the original experiment (that had naive experimenters and subjects). Those who don’t believe simply avoid the profession. In this way, “schools” develop that don’t communicate with each other. Psychology is rife with schools.
How different history would have been had Freud’s original idea (listening to someone talk an hour a day, several days a week for many weeks) been repeated by experimenters who, like Freud, had not been analysed. Instead, an expensive medical treatment specialty rapidly evolved with its own professional standards precluding research by the “untrained”. Meanwhile, back in the psychology lab, the doubting scientists who should have investigated Freud’s claims had their own professions to support.
Not repeating experiments was only one of clinical psychology’s major sins. Another, as we shall see, was failing to vary the major variables. Consciously or unconsciously, clinical psychology was not testing itself properly. It was almost like cheating on its exams. I knew the signs: it wanted to be a profession so badly that it failed to do what was necessary to be a science.
And, given the opportunity, I thought I might correct the situation.