Data Analysis of Big Five Results


Which results are the most common?

See, if Open Psychometrics recorded the respondents’ results, it would be quite quick to answer this question. Unfortunately, they didn’t. So, we’re working with a bunch of numbers!

This is what the data looks like:

I’ve selected the first 6 columns of the data set… and it’s not too bad! It’s a little intimidating to think that there are over 1 million more rows, but we’ve gotta start somewhere >:)

We start by calculating the scores for the ‘Extraversion’ personality trait. We can do this by summing up the scores of the 10 EXT (standing for Extraversion) questions. However, it should be noted that answering a ‘4’ or ‘5’ on an EXT question doesn’t always mean that a user is more extraverted. For example, answering 5 for “I don’t talk a lot” points towards introversion, while answering 5 for “I am the life of the party” points towards extraversion.

So, we’ll need to make sure to add and subtract scores as necessary, depending on how the questions are keyed. I’ve arbitrarily decided to ‘positively key’ questions symbolising extraversion (meaning that I’d add points to their total score if they showed signs of extraversion), and ‘negatively key’ questions symbolising introversion (subtracting points from their total score). See the table below to better understand this keying:

Table 1: EXT Question Key
Questions Key
I am the life of the party. \(+\)
I don’t talk a lot. \(-\)
I feel comfortable around people. \(+\)
I keep in the background. \(-\)
I start conversations. \(+\)
I have little to say. \(-\)
I talk to a lot of different people at parties. \(+\)
I don’t like to draw attention to myself. \(-\)
I don’t mind being the center of attention. \(+\)
I am quiet around strangers. \(-\)

Hence, after adding/subtracting their points from each question, any positive total score (> 0) means that the user receives an ‘S’ for Sociable, while a negative score (< 0) would result in an ‘R’ for Reserved.1 If their score is 0, then they’d receive an ‘X’, since they’re perfectly in-between S and R and their result remains inconclusive.
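For anyone who'd like to see the keying in action, here's a minimal Python sketch of the scoring described above. It is not the code behind this page (which was built in RStudio); the column names EXT1 to EXT10 and the sample respondent are assumptions made up for illustration.

```python
# Keys follow Table 1: +1 for extraversion-leaning items, -1 for introversion-leaning ones.
# The column names EXT1..EXT10 are an assumption about how the raw data is labelled.
EXT_KEY = {
    "EXT1": +1, "EXT2": -1, "EXT3": +1, "EXT4": -1, "EXT5": +1,
    "EXT6": -1, "EXT7": +1, "EXT8": -1, "EXT9": +1, "EXT10": -1,
}


def ext_score(answers):
    """Add the 1-5 answers of positively keyed items and subtract the rest."""
    return sum(sign * answers[question] for question, sign in EXT_KEY.items())


def ext_letter(score):
    """S for a positive total, R for a negative total, X for exactly zero."""
    if score > 0:
        return "S"
    if score < 0:
        return "R"
    return "X"


# A made-up respondent, purely for illustration.
respondent = {"EXT1": 4, "EXT2": 2, "EXT3": 5, "EXT4": 2, "EXT5": 4,
              "EXT6": 1, "EXT7": 3, "EXT8": 3, "EXT9": 4, "EXT10": 4}
score = ext_score(respondent)
print(score, ext_letter(score))
```

With five positive and five negative items, someone who answers 3 (Neutral) on everything lands on exactly 0, which is how an ‘X’ can come about.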

Below is a sample table of results, where extletter (their final letter result) and extscore (the sum of their points) are calculated from the raw data.


We can also visualize our results in a graph:

The same processes will be applied to the other personality traits, with their graphs located below:

Now, we can combine everybody’s letters from each personality trait to form their final result. Then we can determine which results are the most common within this data set.
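As a rough sketch of this combining step (in Python, with a handful of invented respondents rather than the real data), the per-trait letters can be joined into one string and tallied. The same sketch also counts how many results are possible in total, assuming three letters per trait (two poles plus X).

```python
from collections import Counter
from itertools import product

# Hypothetical per-trait letters for a few respondents, in the order
# Extraversion, Neuroticism, Conscientiousness, Agreeableness, Openness.
letters = [
    ("S", "C", "O", "A", "I"),
    ("R", "L", "O", "A", "I"),
    ("S", "C", "O", "A", "I"),
    ("X", "L", "U", "E", "N"),
]

# Join each respondent's five letters into a single result such as "SCOAI".
results = ["".join(person) for person in letters]
counts = Counter(results)
total = len(results)

for result, n in counts.most_common(3):
    print(result, n, round(100 * n / total, 2))

# Each trait has three possible letters, so the number of distinct results is 3**5.
ALPHABET = ["SRX", "CLX", "OUX", "AEX", "INX"]
possible = len(list(product(*ALPHABET)))
print(possible)
```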

In fact, we can see the 25 most common results below:


WOOO! Out of the 243 possible answers, congratulations to the __OAI family for dominating the top 4 spots!

If you’d prefer to see a more numerical representation, I’ve provided the table below with the top 10 results and the average score of each personality trait:2

Table 2: The 10 Most Common Results
Results Amount of Responses Percentage of People
SCOAI 159575 15.716379
RLOAI 121251 11.941888
SLOAI 100841 9.931727
RCOAI 92548 9.114958
RLUAI 70642 6.957459
SLUAI 56870 5.601068
SCUAI 41138 4.051640
RCUAI 26379 2.598041
RLOEI 23547 2.319120
RCOEI 22333 2.199554
Table 3: Average Score for Each Personality Trait
Extraversion Neuroticism Conscientiousness Agreeableness Openness
avgscore 1.44 0.32 3.346667 9.586667 8.026667

Let’s try to analyze why these results happened.

Extraversion

Extraversion is the most balanced trait, as there seems to be a nice mix of S(ociable) and R(eserved) – representative of the larger population. I’d assume this is because questions in this trait were very straightforward (e.g. “I start conversations”), and users likely display/are conscious of these behaviors in their everyday life. Hence, users will have a more ‘objective’ viewpoint, leading to more accurate results.

There are also ~40k more Reserved respondents than Sociable ones. Maybe the Sociables aren’t as likely to spend 20 minutes on an online test, compared to the Reserved?

Neuroticism

Once again, a solid distribution of results between C(alm) and L(imbic), mirroring what I’d expect the general population to be like. Similar to the extraversion section, the questions tend to be straightforward and (I think) there’s less self-bias within the trait, as I believe that many people are conscious of their experience with their emotions, and whether or not they control them.

There are ~60000 more limbic people than calm ones. I’m not really surprised - people are more stressed3 and sadder4 than they’ve ever been. Furthermore, emotional regulation is a result of introspection and reflection. As our society slowly moves towards constant stimulation and overindulgence, it’s no wonder we struggle to achieve emotional stability.

Conscientiousness, Agreeableness, and Openness

With these three personality traits, we start to see a much larger gap between the possible letters. There are significantly more results that are organized, agreeable, and inquisitive, relative to unstructured, egocentric, and non-inquisitive.

I don’t think these results genuinely reflect the general population, even though I have no evidence stating so. However, if I had to speculate, I’d say that this data is a bit biased and leads to skewed results.

  1. We’re just a little bit tainted

No matter how hard we try, we can never be completely objective. Many of us struggle to see things beyond ourselves and to judge every action we take from a neutral standpoint.

In our daily lives, this is not too significant of an issue. A little bit of subjectivity never hurt anybody! However, when we’re trying to objectively analyze and record data about ourselves, our lack of objectivity can lead to delusional answers and incorrect results - especially since this is a self-administered quiz.

  2. We’re fragile creatures

In this society, it is almost ALWAYS better if you’re organized, agreeable, and inquisitive, rather than the opposite. In fact, it’s almost an insult if we AREN’T these things.5

For example, people tend to be more agreeable and people-pleasing, since it shows that you’re friendly and nice. Then, because you exhibit these behaviours, people will tend to ‘like’ you more (relative to being confrontational and assertive for your own needs). Similar comparisons can be made for conscientiousness, especially as our society moves towards maximizing productivity and ‘grind culture’. You NEED to be conscientious and self-disciplined, or else you won’t be successful. Or, in relation to inquisitiveness, people don’t want to be seen as rigid and unwilling to try new things. Society will call them scaredy-cats, boring, or tell them that they’re ‘bringing the mood down’.

It doesn’t help that we don’t like being ‘bad people’ – we constantly attempt to justify our actions to be good and perceive ourselves in a positive light. We are rarely the villains in our own stories; it’s always the other person doing something wrong, or doing something worse than we did, or they started the ordeal.

However, when we can’t be objective, we can confuse what we actually do with what we wish we did. This is going to lead us to choose answers corresponding to those ‘better’ traits of self-discipline, agreeableness, and inquisitiveness, instead of objectively saying we have the ‘worse’ traits of unstructuredness, egocentrism, and un-inquisitiveness.

  3. We’re just built for this (for Openness)

There’s also the classic case of sampling bias. The people who take this quiz are probably curious about psychology, want to know more about themselves, and are willing to try something new. These are typically the types of people who are inquisitive (matching with the I letter) instead of non-inquisitive (matching with the N letter).

Sheer Statistical Impossibility…?

When analyzing this data, it seems.. strange that so many __OAI types are represented. In fact, it feels weird that 15% (!!!!) of people were the EXACT SAME TYPE, despite ALL POSSIBLE COMBINATIONS!

So, let’s see how it compares to the theoretical data. Using the data6 from SimilarMinds, we can see how our data stacks up.

Table 4: Theoretical Data
Results Percentage of People
SCOAI 3.4
RLOAI 2.7
SLOAI 2.5
RCOAI 3.5
RLUAI N/A
SLUAI 3.4
SCUAI 4.1
RCUAI N/A
RLOEI N/A
RCOEI N/A
Table 4: Experimental Data
Results Percentage of People
SCOAI 15.716379
RLOAI 11.941888
SLOAI 9.931727
RCOAI 9.114958
RLUAI 6.957459
SLUAI 5.601068
SCUAI 4.051640
RCUAI 2.598041
RLOEI 2.319120
RCOEI 2.199554

Looking at these tables, there’s a large discrepancy between theoretical and experimental data, where only SCUAI matches the theoretical results. Clearly, certain results are far more represented in our data set than the average value of the general population.
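To put a number on that discrepancy, here's a small Python sketch that divides each experimental percentage by its theoretical counterpart, using only the values from the two tables above. Rows marked N/A in the theoretical table are skipped, and the `ratios` name is just an illustrative choice.

```python
# Percentages copied from the two tables above; None marks the N/A rows.
theoretical = {"SCOAI": 3.4, "RLOAI": 2.7, "SLOAI": 2.5, "RCOAI": 3.5,
               "RLUAI": None, "SLUAI": 3.4, "SCUAI": 4.1}
experimental = {"SCOAI": 15.716379, "RLOAI": 11.941888, "SLOAI": 9.931727,
                "RCOAI": 9.114958, "RLUAI": 6.957459, "SLUAI": 5.601068,
                "SCUAI": 4.051640}

# How many times larger the observed share is than the theoretical share.
ratios = {
    result: experimental[result] / expected
    for result, expected in theoretical.items()
    if expected is not None
}

for result, ratio in sorted(ratios.items(), key=lambda item: -item[1]):
    print(f"{result}: {ratio:.2f}x the theoretical share")
```

SCOAI comes out at roughly 4.6 times its theoretical share, while SCUAI sits at about 0.99, which is why it's the only one that "matches".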

I believe that most of it can be explained by the previous analysis: Wrong results might occur because of biases and societal norms (cementing the last two letters to be A and I and increasing the likelihood of an O vs. a U), in addition to the natural disposition of respondents.

Honestly, I’m not too sure why 15% of people are SCOAI. Maybe people are being influenced to answer what they WANT to be like, rather than what they actually are. My best guess is that SCOAI seems like one of the most ‘socially-successful’ results, since extraverted, stable, organized, agreeable, and inquisitive are all attributes that allow people to thrive in this society. They tend to have larger social circles, better interpersonal relationships and private lives, and the ability to study and work hard to achieve their goals.

So, my best guess is that many respondents are not actually SCOAIs, and they’re not answering these questions objectively.7

Also, just to provide a comprehensive review of the results, here’s a list of the most uncommon ones!

Table 5: Most Uncommon Results
Results Amount of Responses
XXUXN 3
SXXXN 5
XLXXN 5
XXUEX 5
SCXXX 6
XCXEX 6
XXOXN 7
SXOXX 9
XCUXX 9
XCXXN 9

No surprise, it’s a lot of results with X’s!

At first, I was surprised that ‘XXXXX’ didn’t appear, since I’d assume it (theoretically) is the most unlikely.8 However, I wouldn’t be surprised if only 5% of XXXXX respondents were genuinely XXXXX, while the other 95% of people who got the result just kept clicking 3 (Neutral) for every question, or had some kind of game to see if they could get the (theoretically) super rare XXXXX.

Do results vary between countries?

This data contains 224 unique ISO country codes.9 Let’s dig through this data - a fun bit of stalking!

We can see that the majority of data came from the US, with a whopping total of 546403 US respondents. Trailing (very far) behind, we also have Great Britain (GB), Canada (CA), and Australia (AU).

This is likely because this quiz is in English and, hence, generally caters towards countries with English as their primary language. In addition, Google’s search ranking algorithm is also affected by location and will rank websites by their proximity to the user.10 So, it’s possible that Open Psychometrics is located in America,11 and when Americans search up “Big Five Personality Test”, this would be the first quiz that shows up.12

But, we’re not really concerned about WHERE people are taking the quiz. Instead, we only care about how it affects the responses. Hence, we’re going to start by finding the most common results for America, Great Britain, Australia, and the Philippines – these locations have the greatest number of results while being from different continents – and see how they compare to one another.
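A per-country breakdown like this can be sketched in Python as follows. The rows here are invented (country code, result) pairs standing in for the real data set, and `top_results` is a hypothetical helper name, not something from the original analysis.

```python
from collections import Counter, defaultdict

# Hypothetical (country, result) pairs; the real data has one pair per respondent.
rows = [("US", "SCOAI"), ("US", "SCOAI"), ("US", "RLOAI"),
        ("PH", "RLOAI"), ("PH", "RLOAI"), ("PH", "SCOAI"), ("PH", "SLOAI")]

# One Counter of results per country.
by_country = defaultdict(Counter)
for country, result in rows:
    by_country[country][result] += 1


def top_results(country, n=10):
    """Return the n most common results in a country as (result, percentage) pairs."""
    counts = by_country[country]
    total = sum(counts.values())
    return [(result, 100 * k / total) for result, k in counts.most_common(n)]


print(top_results("US", n=2))
print(top_results("PH", n=2))
```

Percentages are taken within each country, so a small country and a large one can be compared side by side.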

Table 6: America
Results Percentage
SCOAI 16.855325
RLOAI 12.283425
SLOAI 10.141050
RCOAI 9.943943
RLUAI 6.578112
SLUAI 5.344041
SCUAI 4.044268
RCUAI 2.546472
RCOEI 2.150610
RLOEI 2.142924
Table 6: Great Britain
Results Percentage
SCOAI 13.129918
RLOAI 11.623821
SLOAI 10.149258
RLUAI 8.353355
SLUAI 7.272208
RCOAI 6.776683
SCUAI 4.606883
RLOEI 2.660820
RCUAI 2.575230
RLUEI 2.386029
Table 6: Australia
Results Percentage
SCOAI 16.546072
RLOAI 11.279233
SLOAI 9.696182
RCOAI 9.020588
RLUAI 6.534080
SLUAI 5.954427
SCUAI 4.309414
RCUAI 2.372577
RLOEI 2.072756
RCOEI 2.046772
Table 6: Philippines
Results Percentage
RLOAI 15.105557
SCOAI 11.699501
SLOAI 10.540636
RCOAI 8.726760
RLUAI 5.869905
SLUAI 3.305285
RLOEI 2.216960
RLOAN 2.191767
XLOAI 1.889454
RCUAI 1.627450

From the tables above, it’s pretty clear that countries have very similar trends. However, there are slight differences:

Generalizing the World

We can also plot the trait averages on a world map, and then analyze the results.

A note on the data: Locations with fewer than 10 responses have been omitted from the map data, as they often significantly skew the scale of the maps. This removed 58 locations from the map, with Africa losing a pretty big chunk of its land.
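Here's one way the averaging and the 10-response cutoff could look in Python. This is a sketch under assumptions: the country codes "AA" and "BB" and their scores are made up, and `country_averages` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

MIN_RESPONSES = 10  # locations with fewer responses are left off the map


def country_averages(rows, min_responses=MIN_RESPONSES):
    """Average a trait score per country, skipping sparsely sampled countries."""
    scores = defaultdict(list)
    for country, score in rows:
        scores[country].append(score)
    return {
        country: mean(values)
        for country, values in scores.items()
        if len(values) >= min_responses
    }


# Hypothetical data: 12 responses from "AA", only 2 from "BB".
rows = [("AA", s) for s in [2, -1, 0, 3, 1, 1, -2, 4, 0, 2, 1, 1]]
rows += [("BB", 20), ("BB", -20)]

averages = country_averages(rows)
print(averages)
```

"BB" is dropped because two responses say very little about a whole country, which is the same reasoning behind the cutoff on the maps.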

Higher scores are correlated with extraversion, lower scores are correlated with introversion.

Table 7: Most Extraverted
region ExtScores
Cuba 2.708333
Greenland 2.382353
Rwanda 2.093750
Ethiopia 1.342960
Afghanistan 1.018519
Norway 0.989608
Table 7: Least Extraverted
region ExtScores
St. Kitts & Nevis -6.333333
Sudan -5.466667
St. Lucia -5.190476
Åland Islands -4.733333
Guyana -4.543478
Bhutan -4.500000

There doesn’t seem to be an overarching trend in the world, but there are several generalizations.

It’s quite interesting that Americans are stereotyped to be more extraverted and loud, while Asians are more introverted and quiet. Doesn’t seem to apply to this data set!

Higher scores are correlated with being calm, lower scores are correlated with being limbic.

Table 8: Most Calm
region estScores
Suriname 4.697674
Eswatini 3.461539
Ethiopia 3.451264
Cape Verde 3.181818
Cuba 3.000000
Papua New Guinea 2.791667
Table 8: Least Calm
region estScores
Jersey -4.476191
Guernsey -4.044444
Syria -3.875000
Samoa -3.727273
Algeria -3.234310
Belize -3.125000

It seems like the southeastern part of the world is less limbic, specifically Africa and East Asia. I find it funny that China is relatively ‘calm’, despite their notoriety for bad work-life balances13 and the ‘lie down’ movement14 – factors that would propagate negative moods and emotional instability.

Yet, it’s also reasonable to say that Chinese people are accustomed to high stress after dealing with academic pressures.15 Hence, they might have learned to better regulate their emotions.

Higher scores are correlated with organization and conscientiousness, lower scores are correlated with carelessness.

Table 9: Most Conscientious
region csnScores
Ghana 6.922794
Papua New Guinea 6.708333
Grenada 6.291667
Rwanda 6.281250
Uganda 6.127517
Kenya 6.111744
Table 9: Least Conscientious
region csnScores
Bhutan -1.0000000
Bolivia 0.0575342
Libya 0.1764706
Angola 0.3571429
Åland Islands 0.6000000
Paraguay 0.6975089

For conscientiousness, Africa is lit up like a Christmas tree! North America is also pretty light. On the other hand, South America is quite dark.

I find it interesting that Asia, which is known for their disciplined schedules and focus, seems to be quite average. Furthermore, Sub-Saharan Africa seems to be incredibly conscientious - yet, there’s a direct link between low conscientiousness and poverty.16 Is this just another instance of self-bias?

Higher scores are correlated with agreeableness, lower scores are correlated with egocentrism.

Table 10: Most Agreeable
region agrScores
Papua New Guinea 11.04167
Rwanda 10.56250
Cameroon 10.45455
St. Lucia 10.42857
Cuba 10.20833
Tanzania 10.17442
Table 10: Least Agreeable
region agrScores
Madagascar 1.500000
Åland Islands 1.533333
Bhutan 3.428571
Poland 3.818162
Cape Verde 3.909091
Belarus 3.927711

I had to double check twice to make sure I didn’t accidentally duplicate the conscientiousness graph, as they look almost identical.

I am, once again, not surprised that everybody thinks they’re agreeable. When looking at the scores on the ‘least agreeable’ table, not a single country is willing to admit they’re… just a tiny bit egotistical! Alas, the bias goes deep.

Higher scores are correlated with inquisitiveness, lower scores are correlated with being traditional.

Table 11: Most Inquisitive
region opnScores
Madagascar 12.50000
Cuba 11.37500
Seychelles 11.36364
St. Lucia 11.04762
Armenia 10.91743
Germany 10.89598
Table 11: Least Inquisitive
region opnScores
Macao SAR China 3.506623
Bhutan 3.642857
Cambodia 3.971429
Malaysia 4.221204
Nepal 4.783489
Gambia 5.300000

The Americas, Africa, and Europe seem to be a lot more open to new experiences, as they’re significantly brighter than the other continents. I can only think that risk-taking is just encouraged inside these societies or people are given more personal freedom and allowed more individuality. However, in other locations, they may prioritize stability and traditional methods – not necessarily a bad thing.

Are All Questions Created Equal?

We can also analyze the questions themselves. This quiz actually recorded how many milliseconds each respondent spent on each question.17

Thus, I present to you: The amount of time spent on each question!

For your reference:

I really like using a box plot18 to symbolize these results. For those who are unaware, the white box symbolizes the interquartile range (the range of values spanning the 25th to 75th percentiles). The lower end of the box marks the 25th percentile (the fastest 25% of users fall below it), while the top end of the box marks the 75th percentile (the slowest 25% of users fall above it). The line in the center is the median value – in this case, the median time taken for each question.19 Box plots are great for visualizing the spread of data (which seems to range quite a bit) and managing outliers.
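If you'd like to see where the box edges come from, here's a minimal Python sketch using the standard library. The response times are invented, and note that different tools interpolate quartiles slightly differently, so the numbers a plotting library draws may not match these exactly.

```python
from statistics import median, quantiles

# Hypothetical response times in seconds for one question.
times = [2.1, 3.4, 3.9, 4.2, 4.8, 5.5, 6.0, 7.3, 9.8, 18.5]

# n=4 splits the data into quarters: 25th, 50th and 75th percentiles.
q1, q2, q3 = quantiles(times, n=4)
iqr = q3 - q1  # the height of the box

print("lower edge of box:", q1)
print("line in the center:", q2)
print("upper edge of box:", q3)
print("interquartile range:", iqr)
```

The single slow response (18.5 seconds) barely moves the box or the median, which is a big part of why box plots cope well with outliers.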

A note on the data: I’ve cut off values above 20 seconds20 in the graph. I’ve also taken a sample of 50000 response times for each question.21

However, I want to be more precise when analyzing response times, so I’ve specifically pulled the questions that take the longest and shortest times to complete.

Table 12: Questions That Took the Longest
Category Question Average Time in Seconds
AGR1_E I feel little concern for others. 17.04968
EXT2_E I don’t talk a lot. 14.90597
EST3_E I worry about things. 12.96387
CSN1_E I am always prepared. 10.38801
CSN4_E I make a mess of things. 9.14376
Table 12: Questions That Took the Shortest
Category Question Average Time in Seconds
EXT10_E I am quiet around strangers. 4.641439
EST9_E I get irritated easily. 4.621047
CSN9_E I follow a schedule. 4.529715
OPN8_E I use difficult words. 4.289161
OPN10_E I am full of ideas. 3.421091

I’ve taken the liberty of removing EXT1_E (the first question on the quiz), which had an average time of 87 seconds. I’d assume it’s skewed because people might have wanted to scroll around the webpage and get accustomed to it (instead of immediately focusing on the first question), or maybe started the quiz and then forgot about it.22
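The ranking behind these two tables can be sketched like so in Python. The times below are invented, the `*_E` column names simply mirror the tags shown in Table 12, and leaving out the first question is handled with a hypothetical `EXCLUDED` set.

```python
from statistics import mean

# Hypothetical response times (seconds) per question column.
times = {
    "EXT1_E": [5.0, 6.0, 250.0],   # first question: its timer starts at page load
    "AGR1_E": [12.0, 20.0, 19.0],
    "OPN10_E": [3.0, 4.0, 3.5],
    "CSN9_E": [4.0, 5.0, 4.5],
}

EXCLUDED = {"EXT1_E"}  # dropped because idle time on the page inflates its average

averages = {q: mean(v) for q, v in times.items() if q not in EXCLUDED}
ranked = sorted(averages, key=averages.get, reverse=True)

print("longest:", ranked[0], averages[ranked[0]])
print("shortest:", ranked[-1], averages[ranked[-1]])
```

Because a plain average is used, one respondent who wanders off for a few minutes can drag a question's time up a lot, which is exactly what seems to happen with the first question.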

“I feel little concern for others” seems to be the question that took the longest. It’s likely because it’s incredibly situational - ‘others’ is very vague. Does it count if you care about everybody, including strangers or people you hate? Or should you just limit it to your friends, people you may feel an obligation to care about, or want to care about? This ‘situational’ factor also plagues the other high-ranking questions, where SOMETIMES you agree with the statement and other times you don’t. There’s a lot of nuance that people need to consider, hence requiring a longer response time.

On the other hand, the questions that took the least amount of time were concise, straightforward, contained simple words, and were easy to judge yourself on. You either talk pretentiously to seem smart, or you don’t.23 You can also see that these questions are the ones that are asked later (i.e. every question in the table has an 8, 9, or 10 in its tag; an XXX9_E tag means that it’s part of the 9th out of 10 rounds of questions, so it’d be somewhere around the 41st to 45th question they’ve completed). Anticipation for results may have caused respondents to rush through the later questions. The reverse is also seen (i.e. the questions that took the longest show up a lot earlier in the quiz, such as AGR1_E, which would’ve been in the first round of questions).

Another interesting thing to note is the distribution of answers for each question. You may expect that each question has a distribution similar to a normal distribution. However, that’s actually not the case! There are various patterns you can spot:

There are four major types of distribution seen in the data:

1. Logarithmic (First Row, e.g. “I shirk my duties” and “I have a soft heart”)

The columns progressively increase or decrease. I think these graphs are the most “trustworthy”, as you’d only put 1 or 5 (the extremes) if you were incredibly confident in your answer.24 Not only that, but these questions aren’t really ‘shameful’ to admit, like “I get stressed out easily” is seen to be a relatively normal thing to say, which allows people to be honest and pick extremes.

Questions also tend to be less ‘situational’ and more ‘specific’, where you can very clearly visualize what you’d be doing in that situation, rather than responding “Oh… sometimes I am, sometimes I’m not!” (E.g. “I am quiet” likely wouldn’t follow this trend, but “I am quiet around strangers” does because the question narrows down the situation. You also don’t really change your behavior around different strangers - it’s pretty consistent.)

Some other examples include:

If you’re wondering which side they skew on…. just trust your gut on it :)

2. Skewed (Second Row, e.g. “I am relaxed most of the time” and “I feel little concern for others”)

This is the most common distribution type. I feel like these are best associated with questions where self-bias is most prevalent, as you want to pretend you’re something you’re not (to make yourself feel better). These questions are also very situational: sometimes I do this, sometimes I don’t. That’s why people tend to lean towards the middle.

Examples include:

3. “Normal” Distribution (Third Row, Left Plot)

I’m lying to you. These graphs don’t have normal distributions. I’m just calling it normal distribution because the answer 3 (the middle) is the most common – similar to a normal distribution graph. Quite frankly, I think this pattern means that people are either confused or they don’t really have a strong opinion on it, so they’re almost forced to choose 3. These questions also seem to be the most ‘observable/objective’ of the bunch.

Personally, I try to make a habit of not selecting 3 on these tests (I’m not sure if others are the same), but it’s still interesting to see. I feel like these things are not things to be ‘proud’ of or dislike about yourself, nor would you mention them unless prompted, which is likely why this pattern is typically associated with the extraversion questions.

Examples Include:

4. Relatively Uniform (Third Row, Right)

These are the ones that have a relatively more uniform distribution, as 1 and 5 are similar in height and have about 150k responses, while 2, 3, and 4 are also similar in height with about 250k responses. These questions are the ones you go “Yea… I’m definitely not SUPER ___, but it happens from time to time… I’m not sure how I would compare with other people though… so 2, 3, or 4 sound about right.”

Examples Include:

Learn More Theory

Explore the Big Five Test beyond this dataset. Continue to ‘Beyond the Dataset’ to learn about the test’s history, common critiques, and connections to society and other tests.


  1. Recall how introverted questions are negatively keyed and subtracted from your total.↩︎

  2. The average values will actually shift every time I update the website (on RStudio!), as the function ‘slice_sample’ grabs 100 randomly selected responses and the averages are calculated from that sample, resulting in a unique average every time!↩︎

  3. wbur, It’s not just you: A new survey shows the world is more stressed out than ever.↩︎

  4. VOA, Why People Worldwide Are Unhappier, More Stressed Than Ever.↩︎

  5. If someone told you, “Yea, you’re just a bit egocentric… no offense”, would you be happy about it?↩︎

  6. This was the only site that had theoretical values, but I don’t know where they got their percentages from. It should also be noted that this site does not use ‘X’ as a possible result. E.g. XLUEI (or any combination with an ‘X’) is not considered. So, this means that there is no data for specific combinations. Hence, the theoretical values should be taken with a grain of salt. It should also be noted that SimilarMinds separates the theoretical values by female and male. Since this data set does not have this distinction, I used the average of the male and female theoretical values to get my average number.↩︎

  7. If you have more ideas, I’d love to hear them! Reach out :)↩︎

  8. There are actually 3794 people who got XXXXX in this data set.↩︎

  9. ‘Country codes’ are kinda misleading. There are ~195 countries, and ISO has 249 different codes. This is because the ISO list also contains territories of countries, e.g. Cayman Islands (UK) and Christmas Island (Australia).↩︎

  10. Miller, Physical Proximity To Searcher: Is It A Google Ranking Factor?↩︎

  11. I couldn’t find any specific location data on the website.↩︎

  12. As a Canadian, it’s actually the third search result!↩︎

  13. E.g. the 996 schedule of working from 9 am to 9 pm, 6 days a week.↩︎

  14. Youth in China are adopting the philosophy of ‘lying down’ and giving up, due to the bad job market after graduation. They often feel let down by their society, as students have been taught that studying hard and getting into a good university will lead to a good life. Yet, when they graduate, they struggle to find jobs and keep themselves afloat.↩︎

  15. E.g. To get into university, students take the Gaokao, a two-day standardized test that determines their entire future.↩︎

  16. Caplan, For a New Liberty, Chapter 8, Aikins, Africa is losing the battle against extreme poverty.↩︎

  17. The timer starts when you select an answer to a previous question and runs until you answer the current question. The first question’s timer starts when you load into the page.↩︎

  18. I actually forgot this type of graph existed, until I was randomly scrolling through the ggplot2 library of different graph types!↩︎

  19. Wikipedia, Interquartile Range.↩︎

  20. As they are significant outliers and make the data difficult to visualize. These values are only omitted for the graphing aspect and are still kept for the analysis of the data, such as the analysis below.↩︎

  21. 50000 is enough to show the general trend, and utilizing more values only makes graphs take longer to load.↩︎

  22. Recall that the timer for question 1 starts as soon as the page loads (since there are no previous questions to start from).↩︎

  23. I’m referring to the difficult words question.↩︎

  24. I.e. If you ask people to choose a number between 1 and 5, they tend to pick 2, 3, or 4, rather than 1 or 5.↩︎

  25. This one is almost inverted, where 2 and 4 are the most common responses, followed by 1 and 5, then 3.↩︎