Some thoughts about the English Proficiency Index

Last week, Education First released its English Proficiency Index for 2015. The index was launched in 2011, and since then EF has released reports on the English proficiency of different countries around the world. The 2015 report is based on tests taken by around 910 thousand adults from 70 countries in 2014 (Education First, 2015). Besides ranking the countries around the world, the report also ranks the different states in Brazil.

EF’s index was widely publicized in the news, since it showed that Brazil ranks 41st among the 70 countries researched, having dropped from the 38th position it held in the two previous years. The fact that the Federal District came first this time, outscoring São Paulo, was also highlighted in the local news. The day the report was released, it was also widely shared on Facebook. I myself was one of the enthusiasts.

However, what surprised me the most about everything I read and heard in the news about EF’s English Proficiency Index is that there wasn’t a single comment about the methodology used to calculate the index and how it can affect the reliability of the results. I’m amazed by how people take everything they read at face value. I was interviewed by a TV network and mentioned that we have to take these types of findings with a grain of salt, scrutinizing the study’s methodology, especially aspects such as sampling, test administration, test content, etc. They chose not to include this part of my interview in their news report. What’s exciting is the result, no matter how accurate it may or may not be. It reminds me of the rankings of schools based on ENEM results…

Granted, the effort made by EF to analyze and compare proficiency in English all around the world is noteworthy, as is the fact that their exam is free and available online to anyone who wants to take it. The report also draws some interesting and useful correlations between the countries’ results and a number of other indices, such as countries’ human development index, income per capita, ease of doing business, number of researchers, and many others. As the report states, “There is no other data set of comparable size and scope, and despite its limitations, we, and many others, believe it to be a valuable reference point in the global conversation about English language education” (Education First, 2015, p. 62).

Hence, as EF itself admits, the English Proficiency Index has a number of limitations, and I believe they should be considered or at least mentioned in any serious discussion on the topic. Here are some that I would like to highlight:

–       Around 910 thousand young adults and adults took the test around the world. If you divide this by the number of countries, 70, you reach an average of 13,000 participants per country. However, the report says that the minimum number of test takers for a country to be included in the study was 400, though many countries had a much higher number. The report does not mention whether the number of participants included in the study was calculated based on a percentage of the population in each country, and it does not look like it was. Nor was there a sampling of participants based on age, income, education, region, etc. Thus, the sample does not truly represent the population in each of the countries included.

–       The EF report explains that the EF Standard English test (EFSET) was developed by experienced exam writers, was piloted with a diverse population, and went through psychometric analysis. It suggests that the EFSET is comparable to well-known exams such as the TOEFL, TOEIC, Cambridge, and IELTS exams, with the advantage of being free of charge. However, there is no information about whether it was validated against these exams or about its content and construct validity. Also, unlike most of the international exams mentioned above, it does not have a writing or a speaking section; it is limited to multiple-choice items assessing reading and listening. Thus, it does not encompass all the elements of communicative competence that a proficiency test should.

–       I took the EFSET Express, a 15-minute version of the test, and saw the tutorial for the 50-minute version. I found the items well-developed but daunting for a student below the B1 level. The results show whether you have basic proficiency (A1 or A2), medium proficiency (B1 or B2), or high proficiency (C1 or C2). The EFSET, though, will classify the student at his/her exact CEFR level. I find it difficult to understand how the types of items I answered based on a listening and a reading piece can differentiate between an A1 and an A2 student. This is something I will have to analyze if I choose to take the whole test. Anyhow, since the test encompasses only listening and reading, the result should indicate the CEFR level for listening and reading, and not a general level as it does.

–       Anyone can take the EFSET anywhere in the world, and they can seek help from others or use reference books and dictionaries. They might even use a translator. An inhibiting aspect is that the test is timed, so using other sources would slow the test-taker down and make it difficult to finish in the allotted time. EF claims that the low-stakes nature of the test does not encourage students to cheat on it. However, it is really risky to base a whole worldwide ranking on tests taken in such uncontrolled and varying circumstances, an aspect that definitely affects the reliability of the test.

–       On page 5, the report says, “(…) while there are millions of Cambridge English FCE, TOEFL, TOEIC, and IELTS test takers every year, they make up only a small fraction of the world’s nearly two billion English learners.” Isn’t there a contradiction here, considering that the EPI is based on the results of fewer than a million tests all around the world, also a very small fraction of the nearly two billion English learners?

–       The index is based on two tests taken online by volunteer participants. They are people interested in learning English or measuring their English proficiency. This eliminates a large proportion of any country’s population and skews the results significantly. Also, some countries have a stronger test-taking culture than others. In countries that are highly driven by standardized testing, people may be more inclined to take the EFSET to test their proficiency in English, while in other countries people may only take this type of test if they are asked to by their employer, for example.

–       The report also ranks the states in Brazil. Again, it does not mention how many people took this test in each state. Could it be, for example, that more people in the Federal District took the test and that’s why it appeared as the place in Brazil with the highest proficiency index?
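To make the sampling points above more concrete, here is a minimal sketch in Python. The first part only redoes the arithmetic with the figures quoted from the report. The second part is a toy simulation of my own, not EF's methodology: the score distribution, the population size, and the rule that stronger test takers are more likely to volunteer are all assumptions chosen purely for illustration.

```python
import random

# Figures quoted in the post: about 910,000 test takers, 70 countries,
# and a minimum of 400 test takers for a country to be included.
TOTAL_TEST_TAKERS = 910_000
COUNTRIES = 70
MIN_PER_COUNTRY = 400

# An average only; the report gives no per-country breakdown here.
average_per_country = TOTAL_TEST_TAKERS / COUNTRIES


def simulate_self_selection(population_size=100_000, seed=0):
    """Compare a population's mean score with the mean among volunteers.

    Hypothetical model (an assumption, not taken from the report):
    scores follow a normal distribution clipped to 0-100, and the
    probability of volunteering grows linearly with the score.
    """
    rng = random.Random(seed)
    scores = [
        min(100.0, max(0.0, rng.gauss(50, 15)))
        for _ in range(population_size)
    ]
    # Someone scoring 100 volunteers with probability 0.10,
    # someone scoring 50 with probability 0.05, and so on.
    volunteers = [s for s in scores if rng.random() < 0.10 * s / 100]
    population_mean = sum(scores) / len(scores)
    volunteer_mean = sum(volunteers) / len(volunteers)
    return population_mean, volunteer_mean, len(volunteers)


if __name__ == "__main__":
    print(f"Average test takers per country: {average_per_country:,.0f}")
    pop_mean, vol_mean, n = simulate_self_selection()
    print(f"Population mean score: {pop_mean:.1f}")
    print(f"Volunteer mean score:  {vol_mean:.1f} (from {n} volunteers)")
```

Under these assumptions, the volunteers' mean comes out higher than the mean of the whole population, even though the number of volunteers is well above the 400 minimum. A large sample, in other words, does not by itself correct for who chooses to take the test.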

I’m not saying here that it’s unlikely that Brazil falls behind most countries in English proficiency, and perhaps any piece of evidence that shows this, as questionable as it may be, should be displayed so as to urge our government to revise its policies regarding English-language instruction. It is also reasonable that DF should have a higher proficiency rate than other states due to the higher income of the population, leading to more access to high-quality English-language instruction.

I also believe that if EF administers its tests to so many people all around the world, it should indeed share the results. I just think that the results need some hedging and that we need to be more critical about what we read so as not to go around drawing conclusions and making strong and irrefutable claims that are not supported by the data on which they are based. We also need to bear in mind what and whose interests such endeavors serve and who benefits from this wide dissemination of the study so as not to fall prey to marketing strategies, as intelligent as they may be.

 

Reference

Education First (2015). EF English Proficiency Index. Retrieved from https://media.ef.com/__/~/media/centralefcom/epi/downloads/full-reports/v5/ef-epi-2015-english.pdf on November 08, 2015.

Isabela Villas Boas

Isabela Villas Boas holds a Master's Degree in Teaching English as a Second Language from Arizona State University and a Ph.D. in Education from Universidade de Brasília. She has been at Casa Thomas Jefferson for 33 years, where she is currently the Corporate Academic Manager. Her main academic interests are second language writing, teacher development, ELT methodology, and assessment. She also supervises MA dissertations for the University of Birmingham. She has recently published the book “Teaching EFL Writing - A Practical Approach for Skills-Integrated Contexts.”

7 Comments
  • SIMONE SARMENTO
    Posted at 10:45h, 28 September

    Congratulations on your piece. I’ve been meaning to write about it for ages but never really got down to it. I am right now sending it to my graduate students at UFRGS since we’ve been discussing language tests as kinds of hidden language policies!

    • Isabela Villas Boas
      Posted at 10:47h, 29 October

      Dear Simone, thank you for your comments and for referring my post to your students. I haven’t met you in person but know about your work and feel humbled by your appreciation of my analysis.

  • Leandro R. Tessler
    Posted at 10:57h, 28 September

    Great text. The main flaw, IMHO, is the sloppy treatment of the sample, as you mention. Any Statistics student knows the catastrophic result of a biased sample, in particular the infamous 1936 USA election prediction by the Literary Digest (I learned about it from my daughter who studies Statistics 🙂 https://www.math.upenn.edu/~deturck/m170/wk4/lecture/case1.html). Apparently EF ignores this (intentionally or not).

    • Isabela Villas Boas
      Posted at 10:46h, 29 October

      Thanks for commenting, Leandro. It’s amazing how people believe anything that is said to them and don’t analyze the methodology in the study!

  • David Beach
    Posted at 23:11h, 10 February

    Alternative facts. 🙂

  • Dave Hopkins
    Posted at 01:51h, 06 June

    Thank you, Isabela Villas Boas, for posting this. I have been concerned about the interpretation of the EFSET for some years, and the lack of any validity statistics to back this up. Please find below my post to my friend and colleague Steve Hanchey of the RELO in Bangkok. “This says in detail what I am suggesting. Since I work with the TOEIC people I will have to ask if they have data about the “8 million” people that take TOEIC every year! To be more specific, I would question the validity of the test since there are no statistics to back this up. All tests have a “validity curve range” where the scores are most meaningful. This is determined through “trialing” of the test items to see if they are consistent in “difficulty” and “discrimination” between test items. The bottom line is, a test is not equally valid over its whole range of test scores. TOEFL is only valid in a narrow range around 570 which is the target score for US universities. TOEIC has an amazing range from 200 to almost 800. From what Isabela Villas Boas says here, I would make a wild guess that the EPI is valid around the B1 level, but how wide that range is is unknown. My contention is that EF is a business. As a business they are unlikely to be doing anything for the “general good” of education. Pardon my skepticism. More likely, they compile and publish these results as part of their global marketing strategy. It’s a tough market out there and they are using the “test boogie” to scare up some students. I have sent you some slides which illuminate this issue very well. Thanks again for your post.”

    I am Brazilian at heart and I wish I were in Brazil again. All the best, and thank you.

    • Isabela Villas Boas
      Posted at 18:19h, 01 July

      Dear Dave,
      I’m sorry that I only saw your comment today. Since this is a collective blog, it is hard to sort the comments out. Thank you for your comment on my post. I like your response to the RELO, and I particularly like your idea of the “test boogie” to scare students. Yes, they are a business and want to use these tests to attract more students. What they can’t do is make such huge claims using these types of tests.
      Cheers!
