IQ testing and research have been around for over 100 years. Though IQ is often a controversial issue, the fact remains that, more than any other psychological trait studied, IQ scores carry a remarkable amount of predictive power with regard to lifetime outcomes. One of the most surprising things early researchers encountered was that performance on a wide variety of divergent tasks was positively correlated. In other words, if you did well on one type of task, it was very likely you would do well on any task you were given, including ones that were nothing like the original. This is the origin of the term g, or general intelligence. By estimating a person’s g from a few tasks, you can predict how they will perform on a variety of others and, remarkably, how well they will do in terms of lifetime achievement. This finding has withstood 100 years of robust research and, thanks to political correctness, a greater amount of heavy criticism than most other scientific findings. [1, 2]
Though there is only one g, there are also sub-g abilities that are positively correlated both with each other and with g (meaning their existence does not disprove a general intelligence factor). However, these sub-g abilities do not correlate perfectly with each other, which leaves some room for people with similar IQs to have individualized intellectual profiles. The sub-g abilities can be divided into verbal reasoning, numerical reasoning, and spatial/mechanical reasoning. Along these dimensions a large sex-differentiated pattern appears, which has been well documented since the beginning of IQ testing. Women tend to outperform men on tests of verbal reasoning, while men have an advantage in both numerical and spatial reasoning. [3, 5]
Since this is an article about how standardized tests currently undervalue men, I will focus on the male cognitive profile. A recent study that quantified male advantages found that older adolescent men outperformed women on average by 6 IQ points on items involving numerical reasoning and by 13 IQ points on items involving mechanical reasoning. On the conventional IQ scale (a standard deviation of 15 points), that is roughly 0.4 and 0.9 standard deviations respectively. [4] An advantage approaching a full standard deviation on spatial reasoning tasks is a LOT and goes a very long way toward explaining the dearth of women in STEM and the low numbers of female electricians and mechanics. High spatial reasoning ability has been shown to be essential to the pursuit of the inorganic sciences among the smartest people. [6, 7] It should also be noted that brain development continues into the twenties, so it is very possible that these numbers underestimate the extent of the gender gap in adulthood.
Considering the importance of spatial ability to scientific endeavors and success, it is curious that these types of tasks are conspicuously absent from the aptitude tests that are supposed to identify people qualified for STEM, such as the SAT and the GRE. [6, 7] Both tests have a verbal component, a numerical component, and a writing component. The writing component is really just a more subjective way to measure verbal aptitude. One study [7] comments on the current state of the GRE (the SAT shows the exact same pattern) as follows:
Based on approximately 2.5 million GRE test takers assessed in 2002–2005, 30% scored ≥700 (out of a top possible score of 800) on GRE-Q (ETS data: all examinees tested between 1 July 2002 and 30 June 2005, N GRE-V = 1,245,878, N GRE-Q = 1,245,182). The GRE-Verbal was not compromised by ceiling effects, with only 3% scoring ≥700. Indeed, the GRE-Q mean of 591, with a standard deviation of 148, reveals that the mean is 1.4 standard deviations from the GRE-Q ceiling; whereas the GRE-V mean of 467, with a standard deviation of 118, places this mean at 2.8 standard deviations from the GRE-V ceiling (twice the distance). This results in 10 times as many scores ≥700 for GRE-Q than GRE-V! Of the two most critical specific abilities for commitment to and excellence in STEM educational–occupational tracks, selection criteria for advanced education and training in the US are severely compromised by ceiling effects for one (mathematical reasoning) while the other (spatial ability) is totally neglected.
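The arithmetic in the quoted passage is easy to check. The sketch below uses only the figures given in the quote (means, standard deviations, the 800-point ceiling, and the 30% and 3% shares); the function and variable names are my own and purely illustrative.

```python
# Figures taken from the quoted passage (GRE test takers, 2002-2005).
CEILING = 800
GRE_Q_MEAN, GRE_Q_SD = 591, 148
GRE_V_MEAN, GRE_V_SD = 467, 118


def sds_below_ceiling(mean, sd, ceiling=CEILING):
    """Distance from the mean to the top score, in standard deviations."""
    return (ceiling - mean) / sd


q_gap = sds_below_ceiling(GRE_Q_MEAN, GRE_Q_SD)
v_gap = sds_below_ceiling(GRE_V_MEAN, GRE_V_SD)

# Shares of test takers scoring 700 or above, as quoted: 30% vs. 3%.
share_ratio = 0.30 / 0.03

print(f"GRE-Q mean sits {q_gap:.1f} SD below the ceiling")
print(f"GRE-V mean sits {v_gap:.1f} SD below the ceiling")
print(f"GRE-V distance is {v_gap / q_gap:.1f} times the GRE-Q distance")
print(f"Scores of 700 or above are {share_ratio:.0f} times as common on GRE-Q")
```

The numbers agree with the quote: the verbal mean sits about twice as far from the ceiling as the quantitative mean, and high scores are about ten times as common on the quantitative section.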
What this means is that a large range of numerical reasoning ability is clustered together at the high end of the GRE quantitative test, which makes it impossible to distinguish high-ability students from exceptionally high-ability students. By setting the top score of the test (the ceiling) low, you ensure that the very able and the exceptionally able get roughly the same score. Individuals who excel in spatial ability go unidentified because that mode of reasoning is ignored completely. Meanwhile, the verbal test is designed so that exceptional talent can readily distinguish itself, thanks to a much higher difficulty ceiling. Not only that, but verbal ability is double-weighted by a second exercise (the writing component), which also focuses exclusively on verbal reasoning.
(Un)coincidentally, this is exactly how you would design tests if you wanted to obfuscate gender differences in which men do better than women. On the GRE-Q, super-exceptional men get the same scores as merely able women because the low maximum leaves them no way to demonstrate their greater talent. By making numerical tests ineffectual at the upper ranges of ability and ignoring spatial reasoning entirely, these tests neglect two essential factors in creativity and intelligence that are relevant to any field, but especially to STEM. Abilities in which women have an advantage are remarkably over-emphasized, which makes men and women appear more intellectually equal than they really are.
Especially disconcerting is that this test design guarantees that a relatively large number of men, at both the mean and the high levels of ability, are having their talent squandered. They are not being admitted to the quality of schools they should be. In public school, they are not being given the kind of hands-on education that befits their talent in spatial and mechanical reasoning, even though it is the men with this particular ability who are most important to our technological development. [6, 7]
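The ceiling effect described above can be illustrated with a toy example. This is only a sketch: the "latent" scores are made-up numbers standing in for what test takers might have scored on a test with no maximum, and capping them at 800 is a simplifying assumption about how a ceiling behaves, not a description of how the GRE is actually scored.

```python
CEILING = 800

# Hypothetical latent scores (invented for illustration only).
latent_scores = [640, 720, 790, 820, 880, 950]

# A ceiling reports every latent score above the maximum as the maximum.
reported = [min(score, CEILING) for score in latent_scores]

for latent, shown in zip(latent_scores, reported):
    print(f"latent {latent} -> reported {shown}")

# The three highest latent scores differ by up to 130 points,
# yet they are indistinguishable once the ceiling is applied.
top_three = reported[-3:]
print("distinct reported values among the top three:", len(set(top_three)))
```

Everything above the cap collapses into a single reported value, which is why a test with a low ceiling cannot separate the very able from the exceptionally able.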
There is little doubt in my mind that these tests are purposefully designed this way for reasons of political correctness and cultural marxism (i.e. “The Cathedral”). I find it hard to believe that the College Board and ETS, the organizations behind the SAT and the GRE respectively, do not understand what effect this kind of test structure has on the resulting scores. They are professional testing organizations whose bread and butter is understanding how IQ tests work and designing effective ones. There is simply no way they could miss this glaring problem. However, I don’t think I can necessarily blame them for how they designed the tests. They are acting rationally to avoid the (false) accusations of sexism and bias that would surely result if the tests openly demonstrated the intellectual superiority males have over females in mathematics and mechanics. The problem is with our repressive, feminist-dominated culture, which can’t bring itself to admit that men have innate cognitive advantages over women, especially when the male advantage is so large that it takes an extremely gerrymandered map of cognitive talents to hide it. That equalizing men’s and women’s test scores requires two different tests of verbal intelligence (one of which is conveniently subjective), a poorly designed quantitative test, and ignoring an entire dimension of mental reasoning says a lot about just how large the gap between men and women is. If you don’t want to take my word that these tests are geared to emphasize the talents of women at the expense of the talents of men, maybe you will believe the American Psychological Association (emphasis mine):
“Most standard tests of intelligence have been constructed so that there are no overall score differences between females and males.” [3]
A high school student with experience of AP tests (also designed and administered by the College Board) reports that the same pattern described above holds for the AP Calculus and AP English tests. See the /r/darkenlightenment comment.
1. Johnson, W., Bouchard, T. J., Jr., Krueger, R. F., McGue, M., & Gottesman, I. I. Just one g: Consistent results from three test batteries. Department of Psychology, University of Minnesota, Twin Cities. Accepted 15 June 2003.
2. Jensen, A. (1998). The g factor: The science of mental ability. [Book-length PDF]
3. American Psychological Association. Intelligence: Knowns and unknowns. http://www.gifted.uconn.edu/siegle/research/correlation/intelligence.pdf
4. Lemos, G. C., Abad, F. J., Almeida, L. S., & Colom, R. Sex differences on g and non-g intellectual performance reveal potential sources of STEM discrepancies.
5. Johnson, W., & Bouchard, T. J., Jr. (2006). Sex differences in mental abilities: g masks the dimensions on which they lie. University of Minnesota.
6. Kell, H. J., & Lubinski, D. (2013). Spatial ability: A neglected talent in educational and occupational settings. Roeper Review, 35, 219-230.
7. Lubinski, D. Spatial ability and STEM: A sleeping giant for talent identification and development. Department of Psychology and Human Development, Vanderbilt University.