This article discusses how to forecast college enrollment as well as the number of credit hours demanded. The case examines a university in Central Asia that, despite a strong reputation, has been losing enrollment for nearly a decade. The effects of demographics on enrollment appear stronger than those of such traditional factors as tuition and income, but enrollment and hours may also respond to such characteristics of the university as prerequisite courses that are difficult to pass. The article compares three ways of forecasting enrollment and credits: a structural approach, which predicts the effects of such determinants as tuition and student traits; a univariate approach, which predicts enrollment from past enrollment; and data mining, which discerns patterns in large datasets through such newer models as artificial neural networks. Of the three approaches, the structural one may be the best at explaining enrollment changes, but it does not necessarily yield the most accurate forecasts, as measured by the percentage error in the model’s predictions of past enrollment.
By the end of this case, students should be able to
- Compare three methods of forecasting enrollment—structural analysis, univariate analysis, and data mining—on the basis of their strengths and flaws
- Choose a method for forecasting the enrollment of a school like the one described in the case study, perhaps on the basis of accuracy
- Explain to a client the limitations of forecasts; for example, if the dataset is small, the forecast may be imprecise
This case study should enable the student to choose among ways to forecast; indeed, forecasting itself is critical thinking. We consider the problem of predicting enrollment at a small university in Central Asia. The next section presents attributes of the college—such as tuition and grades—that may affect the demand to enroll. The following section discusses forecasting models in light of their merits and demerits. Concluding sections reflect on practical problems and lessons. The case study tries not to persuade the student to adopt a particular model but instead to choose one knowledgeably.
Project Overview and Content
We sought to predict enrollment at University X, a business-oriented school in Almaty, Kazakhstan, with 2,400 students. Parents and educators regard it as one of the best colleges in the five post-Soviet countries of Central Asia. As much of the faculty holds Western PhDs, instructional costs are high for the region. An accurate enrollment forecast can save money.
To design the forecast, one may need to know basic characteristics of University X. Let’s turn to these.
The academic year has four semesters—fall, spring, and two summer sessions. Enrollment is high in the fall, almost as high in the spring, and larger in the first summer session than in the second. The type of semester is one of the best predictors of the university’s enrollment.
Despite rising real income per capita in Kazakhstan, credit-hour equivalents at University X in Spring 2018 fell 11% below their February 2017 total. We predict that fall hours will decline 11% to 12% per year over the next two calendar years. By Fall 2019, autumn hours might have almost halved since 2011.
Trends are discouraging: In Spring 2018, hours trailed the prediction by 5%. Credit hours fall even more sharply than enrollment.
Before 2011, enrollment had grown steadily at the university, which had been founded in 1992, when Kazakhstan became independent after the Soviet Union collapsed.
Factors affecting enrollment and hours are either internal to the university or external.
It would be natural to attribute the fall in enrollment to rising tuition, which has doubled in terms of the tenge, Kazakhstan’s currency, since Academic Year 2011–2012. But adjusted for price changes, tuition rates have risen by only a fourth.
Student applicants for financial aid may be sensitive to tuition, as they are often poor; so Manoylenko and Taylor (2018b) studied their demand for credit hours in Fall 2011. One advantage of this approach is that the tuition price of an hour net of aid varies more widely among aid applicants than the gross tuition rate does among all students; the greater variance makes estimates of the tuition effect on hours more precise. The dataset included 118 undergraduates; it excluded applicants reporting zero income, as obvious underestimates. Control factors were the student’s score on a national entry exam and dummy variables for programs (e.g., the variable for the business program would equal 1 for a business student and 0 otherwise).
Income and price effects turned out to be small. Net tuition was highly significant, but a 1% decrease in this price increased the demand for credits by only about a third of 1%.
The income effect was even smaller: A 1% rise in mean household income reported by the applicants raised the demand for credits by just 0.08 of 1%. Also, income was statistically significant at only the 13% level; that is, given the sample estimates, the probability that income had no effect on the demand for credits among students in general was about 13%. With a probability this high, many researchers would be reluctant to dismiss the possibility of a zero income effect in the statistical population.
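In a log-log specification like the one just described, the coefficients are elasticities, so translating them into predicted changes in credit hours is simple arithmetic. A minimal Python sketch, using the approximate elasticities reported above (the helper function and the scenario are ours, for illustration only):

```python
# Illustrative arithmetic only: in a log-log demand model, coefficients
# are elasticities. Values below are the approximate estimates described
# in the case (net-tuition elasticity about -0.33, income about 0.08).
price_elasticity = -0.33
income_elasticity = 0.08

def pct_change_in_credits(pct_tuition_change, pct_income_change):
    """Predicted % change in credit-hour demand (hypothetical helper)."""
    return (price_elasticity * pct_tuition_change
            + income_elasticity * pct_income_change)

# A 10% cut in net tuition, income unchanged: roughly a 3.3% rise in hours.
print(pct_change_in_credits(-10, 0))
```

The same arithmetic shows why these effects are small in practice: even a large tuition cut moves demand by only a few percent.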
Aside from tuition, students may be generally dissatisfied with University X. That could reduce the number of hours that they sign up for. But surveys of recent alumni find that more than 80% are willing to recommend the university to potential students. Yes, the percentage of alumni who say they would recommend X has a positive relationship to registered hours, with statistical significance at the 10% level, according to Manoylenko and Taylor (2017). But there is not a sharp decline over time in the share of alumni who would recommend. That is true for undergraduates as well as for graduate students.
Perhaps prerequisite courses slow down registration in a subsequent course when they are hard to pass. This might occur when the prerequisite course does not prepare the student well.
But Manoylenko and Taylor (2018a) find that evidence for this possibility at University X is weak at best. Controlling for student ability and for the type of subsequent course, a high grade in the prerequisite course relates positively to a high grade in the subsequent course; similarly, a low grade in the prerequisite relates to a low grade in the subsequent course. These results have statistical significance. And they indicate that students who have mastered the prerequisite (thus earning a high grade) are better prepared for the subsequent course (earning a high grade there, too). The prerequisite course studied is for learning English—vital for X as its classes are taught in English, which few students speak as their native language.
In addition, grading in the English course is relaxed: Only 1% of the students fail or withdraw from the course, and a third of the students receive an A or A+.
The pool of potential students for University X depends on the number of Kazakhstanis who are of college age. In the mid-1990s and early 2000s, the number of births fell before beginning to rise substantially in 2002. So the pool of college-age Kazakhstanis may not expand much until 2020.
Income per capita may also affect the demand for college, although the direction of the effect is not clear. In Kazakhstan, since the Russian ruble crisis in the late 1990s, average income has usually risen. This is largely because of rising oil prices; exports of crude account for a fourth of Kazakhstan’s economy.
It might seem reasonable that rising income would boost demand for enrollment; after all, richer people seem to want more education. But on second thought, matters are not so simple. Increasing salaries raise the opportunity cost of going to college, as a youth could have worked rather than studied. Rising income also makes Western universities more affordable, which parents may prefer to Central Asian universities.
In a study of the university over the period 2011–2017, Samuratova and Taylor (2018) find that the effect of real national income on credit hours was positive and statistically significant in some specifications, controlling for tuition, type of semester, and for the program. But the effect was small: A 10% rise in income was associated with a 1.2% rise in credit hours registered. And the effect was not robust to the type of model estimated. In some specifications, the effect of income was negative or statistically insignificant. In general, the positive effects of income on college demand may roughly offset the negative effects.
Research Design and Forecasting Methods
The most important step in designing a model to forecast enrollment at University X is to pick the forecast method. Enrollment predictions may be based on such determinants as tuition and income, which I will call “structural” analysis; projections of past enrollment (“univariate” analysis); or patterns in big datasets (“data mining”). (A side note: instead of enrollment, the dependent variable—that is, the variable on the left-hand side of the equation, which we are trying to predict—might be the number of credit hours registered. Enrollment and credit hours move closely together, so it doesn’t matter much which of the two we use.) Assessing past studies using these methods may help us choose a method for the case at hand.
The success of a structural model depends on the explanatory variables chosen. Past structural studies may suggest explanatory variables to use.
Studies around the world often find that tuition and fees barely affect the demand for higher education. Hemelt and Marcotte (2008) estimate a tuition and fees elasticity of total headcount of just –0.1072 in the United States.
Student gender may also influence the demand for higher education. A study of application rates for an undergraduate entrance exam in Iran, by Ghavidel and Jahani (2015), found that the rate differed between the sexes and rose with economic growth.
We saw that enrollment at University X may be restricted by the size of the cohort. (The cohort is usually determined by the semester in which the students began study.) But cohort effects can be complex. Ahlburg, Crimmins, and Easterlin (1981) find that big cohorts cut returns to college, discouraging enrollment.
Other researchers target timing in the cohort. Wachter and Wascher (1984) note that those born early in a baby boom will enroll in college quickly to cash in on high rates of return, because these returns will fall once those born at the peak of the boom enroll, due to the large number of boomers. Similarly, those born late in the boom will delay enrolling, since rates of return will rise as the number of boomers declines.
The ultimate in structural approaches is to scour Big Data to find prospective students who are like successful graduates of the college, as Finley (2017) reports. Jay Goff, vice president for enrollment and retention management at St. Louis University, said,
… When we looked at the demographics of the previous class, we wanted to not only look at the students who chose to enroll at the institution, but those who ended up succeeding and were satisfied. We wanted to know if we could replicate those students.
Surveying marketing research, Delen (2011) concludes that “retaining existing customers is crucial because … acquiring a new customer costs roughly 10 times more than keeping the one that you already have” (p. 19). To identify students about to drop out, Delen mines the data, which we discuss below.
Logistic regression is a popular nonlinear way to forecast enrollment. A study by Sampath, Flagel, and Figueroa (2017), at George Mason University in the United States, applied the logistic probability distribution—which resembles an S—to model the chance that an admitted student would enroll. The parameters of the distribution depended on such factors as the student’s exam scores, grades, gender, and race. Similarly, Gevasimović, Bugarić, and Božić (2016) used logistic regression to predict what graduates of vocational schools would do next; their choices included enrolling in a higher college. Their sample was of 159 students from two vocational schools in Belgrade.
We can compare logistic regression with a linear probability model, in which the probability of enrollment is a function of student characteristics that is linear in the parameters. One advantage of logistic regression is that it avoids a violation of assumptions that underlie the interpretation of the linear model. The usual linear regression, estimated by ordinary least squares (OLS), assumes that the dependent variable is unbounded; its values may range from near negative infinity to near positive infinity. But in a linear probability model, the dependent variable is actually bounded between 0 and 1. Logistic regression, in contrast, respects these bounds.
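To see why the bounds matter, here is a minimal Python sketch (with made-up exam scores and enrollment outcomes) that fits a one-variable logistic regression by gradient ascent. However the inputs vary, the fitted probabilities stay strictly between 0 and 1:

```python
import math

# Minimal sketch (made-up data): a one-variable logistic regression fit
# by gradient ascent. Unlike a linear probability model's predictions,
# the fitted probabilities always lie strictly between 0 and 1.
scores = [40.0, 55.0, 60.0, 70.0, 80.0, 90.0, 95.0]  # hypothetical exam scores
enrolled = [0, 1, 0, 1, 0, 1, 1]                     # 1 = the admit enrolled

mean = sum(scores) / len(scores)
std = (sum((x - mean) ** 2 for x in scores) / len(scores)) ** 0.5
z = [(x - mean) / std for x in scores]               # standardize for stable steps

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

a = b = 0.0
for _ in range(5000):                  # gradient ascent on the log-likelihood
    ga = gb = 0.0
    for x, y in zip(z, enrolled):
        p = sigmoid(a + b * x)
        ga += y - p
        gb += (y - p) * x
    a += 0.05 * ga
    b += 0.05 * gb

probs = [sigmoid(a + b * x) for x in z]
print(all(0.0 < p < 1.0 for p in probs))   # True: always valid probabilities
```

A linear probability model fit to the same data could predict values below 0 or above 1 for extreme scores; the sigmoid never does.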
Although useful, logistic regression is not perfect. Delen (2011) concludes from prior studies that machine-learning methods (such as neural networks and decision trees) predict more accurately than such statistical methods as logistic regression and discriminant analysis. They are also freer of assumptions.
The idea in univariate modeling is to exploit the inertia in most economic time series to forecast on the basis of what we know about the series today. (Time series are variables that change over time, such as enrollment.) For example, as national consumption changes little over the years, we can forecast its value next year by extrapolating from its value this year.
This contrasts with the structural approach, which forecasts on the basis of the dependent variable’s likely causes. Such a model seems sensible. In microeconomic theory, the demand for most goods depends heavily on price and income. So why shouldn’t it apply to the demand to go to college?
Advocates of univariate modeling concede that the structural analysis is indeed appropriate, if we mainly want to explain enrollment. To be sure, most papers on enrollment test hypotheses rather than forecast. In predicting enrollment, however, the structural model runs into problems, according to its critics.
First, it is often rooted in a theory of demand that considers just a few variables. This simplicity helps us understand how to connect the dots; but forecasting depends less on understanding than on accuracy.
Second, accuracy depends on a bevy of variables that taken one by one may seem transient and minor but taken as a whole can reshape enrollment. The analyst would like to add right-hand variables to control for each factor; but economic time series are often short, so she may not have enough information in the dataset to estimate many coefficients.
In short, the analyst needs an econometric model that is both simple and accurate. So, contend supporters of the univariate approach, why not forecast enrollment at time t by extrapolating from enrollment at time t – 1? The enrollment lag summarizes the impact of the determinants that are too numerous to specify.
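The univariate idea can be sketched in a few lines of Python: regress enrollment on its own lag and extrapolate one period ahead. The headcounts below are invented for illustration:

```python
# Univariate sketch: enrollment regressed on its own lag, then
# extrapolated one period ahead. The headcounts below are invented.
enroll = [3200, 3100, 3050, 2900, 2850, 2700, 2650, 2500]

x = enroll[:-1]   # enrollment at t - 1
y = enroll[1:]    # enrollment at t
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / \
    sum((xi - xbar) ** 2 for xi in x)
a = ybar - b * xbar                  # OLS fit of y_t = a + b * y_{t-1}

next_forecast = a + b * enroll[-1]   # one-step-ahead extrapolation
print(round(next_forecast))
```

Because the lag coefficient summarizes the series' inertia, the forecast continues the decline without naming any of its causes.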
Some statisticians, such as Edstrom (2012), use past and current enrollment data to predict what prospective students will do. Similarly, Shapiro and Bray (2011) used past enrollment to predict the behavior of part-time adult students at Northwestern University. Lazăr and Lazăr (2015) specified a cubic function to forecast admission at a Romanian university.
When the forecaster has an observation for each student, an advantage of using past enrollment is that she can update her data with each additional semester, improving the forecast, note Shapiro and Bray (2011). In contrast, the structural analyst cannot update a dataset of fixed student characteristics, such as gender.
A weakness of a simple univariate model is that it does not control for recent events, or “shocks,” that affect current enrollment but not past enrollment. An example is the central bank’s unexpected decision in August 2015 to float the currency in Kazakhstan, the tenge, rather than keep managing the exchange rate. After the float began, the exchange rate fluctuated wildly. Uncertain of what would happen, families sharply reduced the number of credit hours bought in Spring 2016. Controlling for hours in Fall 2015, the univariate model overestimated the number of hours in the following spring by nearly 2,000, which was its most severe error. By Spring 2017, however, the model was back on track.
The analyst can control for shocks with lasting effects by treating them as determinants of current enrollment. For example, the exchange-rate shock just described shows up in the Spring 2016 error of the enrollment model, where the error is the difference between the actual enrollment for that semester and the predicted enrollment. The analyst can introduce such past errors as an explanatory variable in an equation for forecasting. The past-errors variable is called a “moving average.”
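A toy Python illustration of the moving-average idea (the series, the base forecasts, and the weight theta are all invented): the next forecast is nudged by a fraction of the last observed error, which helps when a shock's effects persist:

```python
# Toy illustration of a moving-average correction (all numbers invented):
# the next forecast is nudged by theta times the previous forecast error.
actual  = [3000, 2950, 2400, 2500]   # period 2 suffers a lasting shock
base_fc = [3010, 2940, 2900, 2860]   # forecasts from a lag-only model
theta = 0.4                          # assumed MA(1) weight, illustrative only

adjusted, prev_err = [], 0.0
for a, f in zip(actual, base_fc):
    adjusted.append(f + theta * prev_err)
    prev_err = a - f                 # this error feeds the next forecast
print(adjusted)                      # final forecast moves toward the shocked level
```

In a real ARMA model the weight is estimated jointly with the other coefficients rather than assumed, but the mechanism is the same: past errors inform the next prediction.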
Another weakness of the univariate approach may arise from a lack of data. Short time series are often hard to disentangle due to common time trends. The analyst might consider using several models.
Data Mining Models
To predict enrollment, statisticians increasingly begin by examining the data for patterns. An example is the artificial neural network, which initially assigns random weights to inputs and then updates the weights to reduce the error in output. For example, a model predicting enrollment may use such inputs as tuition, demographics, and the wage premium earned by graduates. The network first assigns random weights to these inputs and predicts enrollment as the output. It compares the prediction with actual enrollment and then updates the weights to minimize some measure of the error, such as the sum of squared errors over all observations. The updating process itself is a black box.
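The weight-updating loop just described can be sketched in Python. This toy example uses a single linear output unit rather than a full network, and all the data are invented; it only illustrates the cycle of predict, compare, and reweight:

```python
import random

# Minimal sketch of the weight-updating loop: random starting weights,
# predict, compare, and adjust to shrink squared error. A single linear
# unit stands in for a full network; all data are invented.
random.seed(0)

# Each row: (tuition index, college-age cohort index, wage-premium index)
inputs = [(1.00, 1.00, 1.00), (1.10, 0.95, 1.02), (1.20, 0.90, 1.05),
          (1.35, 0.88, 1.04), (1.50, 0.85, 1.06)]
targets = [1.00, 0.95, 0.88, 0.80, 0.74]   # normalized enrollment

w = [random.uniform(-0.5, 0.5) for _ in range(3)]   # random initial weights

def sse():
    return sum((sum(wi * xi for wi, xi in zip(w, x)) - t) ** 2
               for x, t in zip(inputs, targets))

start = sse()
lr = 0.05
for _ in range(2000):
    for x, t in zip(inputs, targets):
        err = sum(wi * xi for wi, xi in zip(w, x)) - t
        for i in range(3):
            w[i] -= lr * err * x[i]   # gradient step on squared error
print(sse() < start)                  # True: the updates reduced the error
```

A real network adds hidden layers and nonlinear activations, which is what makes the updating process opaque; the error-driven reweighting, however, is exactly this loop.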
Delen (2011) gives an example. To predict attrition—that is, the student’s failure to graduate—a neural network inputs data that are socio-demographic, financial, and educational. The network forecasts whether the student will return for a second fall semester, compares the prediction with what happened, and then reweights the inputs to reduce the error.
Siri (2015) used neural networks to predict dropouts at the University of Genoa. Prediction accuracy ranged from 76% to 84% across three groups of students. Studying freshmen attrition at a public university in the Midwestern United States, Delen (2011) found that neural networks performed better, with 81% accuracy, than did logistic regression or decision trees. (A decision tree splits the sample into subgroups of similar observations to improve predictions. It enables the analyst to identify the variable that explains the largest share of the variation in the dependent variable. For example, the high school grade-point average might explain movements in college enrollment.) But users may prefer the decision tree because it makes clearer just what affects the decision to drop out, “whereas artificial neural networks are mathematical models that do not provide such a transparent view of ‘how they do what they do,’” Delen writes (p. 32).
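To make the decision-tree idea concrete, here is a Python sketch of the first split a regression tree might make, using an invented sample in which high school GPA cleanly separates enrollees:

```python
# Sketch of the first split a regression tree might make: pick the GPA
# threshold that minimizes within-group squared error. Data invented.
gpa      = [2.1, 2.4, 2.6, 3.0, 3.2, 3.5, 3.8, 4.0]  # high school GPAs
enrolled = [0,   0,   0,   1,   1,   1,   1,   1]    # hypothetical outcomes

def split_cost(threshold):
    left = [y for x, y in zip(gpa, enrolled) if x < threshold]
    right = [y for x, y in zip(gpa, enrolled) if x >= threshold]
    cost = 0.0
    for group in (left, right):
        if group:
            mean = sum(group) / len(group)
            cost += sum((y - mean) ** 2 for y in group)
    return cost

best = min(gpa, key=split_cost)   # candidate thresholds at the data points
print(best)                       # 3.0: the split that best separates enrollees
```

A full tree repeats this search within each subgroup; the split variable chosen first is the one that explains the largest share of the variation, which is what makes trees easy to interpret.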
Bogard, Helbig, Huff, and James (2011) use decision trees, which they view as relatively accurate as well as robust to missing data. They also tested logistic regression, ensemble models, and neural networks. (An ensemble model is a combination of several approaches, such as a neural network, a decision tree, and a logit regression.) Their dataset covered 3 years of first-year students seeking degrees for the first time. They observed students at the beginning of the term, in the fifth week, and at the end of the term. The accuracy of their model improved at each stage.
Critics of data mining say the analyst should begin with a theory, not with the data, for two reasons. First, statistical relationships depend on her controls; she should specify the theory first so that she knows what to control. For example, many studies observe a positive relationship between consumption and income; often, to explain current consumption, these studies use current income (the amount of money that you make this year) as the only independent variable, with no explicit controls. But economic theory holds that consumption may also depend on your accumulation of past savings, known as “wealth.” If the analyst doesn’t control for wealth, then she may overstate the impact of income on consumption, because income correlates positively with wealth.
For instance, suppose that the analyst estimates that an additional dollar of income raises consumption by 70 cents. In reality, the dollar of income may directly raise consumption by only 50 cents; the other 20 cents is due to the indirect relationship of wealth with income and consequently with consumption. Had the analyst controlled for wealth, she would have estimated the correct impact of a dollar of income on consumption, which is 50 cents.
The second reason asserted for theorizing before estimating is psychological. The analyst’s interpretation of the data depends on her perspective, so she should write down her theory first, to determine what she expects the data to show. For example, suppose that the analyst believes that enrollment depends on economic factors. Then if she finds a negative relationship between enrollment and tuition, she may regard this result as verification of common sense. But if she instead finds no relationship between enrollment and tuition, she may dismiss this result as due to bad data, because it is not consistent with her (possibly unconscious) beliefs. To avoid this anomaly, the analyst should begin by stating her theory so that she can avoid ignoring results because she didn’t expect them. Unexpected results may occur not because they’re invalid but because the analyst needs to revise her theory and thus her expectations.
Evaluating Forecast Models
To gauge the accuracy of a forecast model, the analyst can look at how well it would have predicted past enrollment. For example, she can restrict her dataset to values from 2000 through 2015, estimate her model, use it to forecast enrollment in 2016, and compare the forecast with the actual enrollment that year. An appealing way to do the comparison is to calculate the absolute value of the error and divide it by the actual value of enrollment. For example, suppose that in 2016 the actual enrollment was 2,000 and the forecast was 2,200. Then the absolute error is 200, and the percentage error is (200 / 2,000) × 100% = 10%. The analyst can compute this measure for other years and take the average. If, for example, she calculates an error of 20% for 2017 as well as 10% for 2016, then her average for the 2 years is 15%. Econometricians call this the “mean absolute percentage error.”
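The calculation is easy to verify in code. A short Python sketch, with the 2016 numbers from the example and a 2017 pair invented so that its error works out to 20%:

```python
# Mean absolute percentage error, computed in code. The 2016 numbers come
# from the example in the text; the 2017 pair is invented to give a 20%
# error, matching the two-year average of 15% described above.
def mape(actuals, forecasts):
    """Mean absolute percentage error, in percent."""
    errs = [abs(f - a) / a * 100 for a, f in zip(actuals, forecasts)]
    return sum(errs) / len(errs)

# 2016: actual 2,000 vs forecast 2,200 (10%); 2017: 2,500 vs 3,000 (20%).
print(mape([2000, 2500], [2200, 3000]))   # 15.0
```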
Another way to test the thoroughness of the forecast model is to see whether its errors fit the pattern of random shocks, that is, shocks that cannot be predicted. All other shocks, such as a predictable change in the fertility rate, are already reflected in the model if it is a good one. In the usual pattern of random shocks, small shocks are much more likely to occur than large ones, be they positive or negative. This pattern is the “normal probability distribution,” known more vividly as the “bell-shaped curve” because of its shape: the height of the curve measures how likely a shock of a given size is. The middle of the curve corresponds to small shocks and the extremes to large ones, positive on the right and negative on the left. In a good model, the errors from the sample, called “residuals,” usually follow a normal distribution.
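One quick check of the bell-shaped pattern is to compute the residuals' skewness and excess kurtosis, both near zero for normal data. A Python sketch with simulated stand-in residuals (a real check would use the model's own residuals and a formal test):

```python
import random

# Sketch: for normally distributed residuals, skewness and excess
# kurtosis are near zero. The "residuals" here are simulated stand-ins.
random.seed(1)
residuals = [random.gauss(0, 50) for _ in range(500)]

n = len(residuals)
mean = sum(residuals) / n
m2 = sum((r - mean) ** 2 for r in residuals) / n
m3 = sum((r - mean) ** 3 for r in residuals) / n
m4 = sum((r - mean) ** 4 for r in residuals) / n
skew = m3 / m2 ** 1.5           # 0 for a symmetric distribution
excess_kurt = m4 / m2 ** 2 - 3  # 0 for normal tails
print(round(skew, 2), round(excess_kurt, 2))
```

Markedly nonzero skewness or kurtosis suggests the model has missed a predictable pattern, such as a recurring seasonal shock.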
At University X, the Office of Enrollment Records maintains data for each semester and program going back to Fall 2005 at least. But the university reformed its business programs in 2011, so we chose this as the initial year in our time series. This constrained the number of independent variables that we could use, as the small dataset limited the precision of our coefficient estimates.
Method in Action
We used OLS and related methods to forecast enrollment. Under certain assumptions, OLS chooses coefficients that minimize the model’s sum of squared errors in predicting past enrollment.
The key choice was that of a univariate model. At one time, we had forecasted enrollment with a simple structural model, in which the only determinant was the annual world price of a barrel of crude oil on the spot market. In principle, oil prices determine national income, which in turn determines enrollment. Users liked this model, because they could just plug in the assumed oil price and get a prediction of enrollment. Unfortunately, the percentage errors in predicting enrollment were often north of 20%. One problem was that the forecast of enrollment required a forecast of global oil prices, which are volatile. Also, it turned out that income had small effects on enrollment after 2011.
So we replaced the structural model with a univariate one. One advantage of the univariate model was that it did not require as much information as an elaborate structural model. We forecast enrollment on the basis of the type of semester, a time trend, past enrollment, and past errors in predicting enrollment. The mean absolute percentage error of this approach was about 8%, much lower than that of the structural models that we had used.
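A hedged sketch of this kind of specification in Python (the enrollment series and semester pattern are invented, and the moving-average term is omitted for simplicity): build a design matrix with an intercept, a semester dummy, a time trend, and lagged enrollment, then solve the OLS normal equations:

```python
# Hedged sketch: OLS with an intercept, a fall-semester dummy, a time
# trend, and lagged enrollment (data and semester pattern invented; the
# moving-average term in the text is omitted here for simplicity).
enroll = [3000.0, 2800.0, 2950.0, 2700.0, 2900.0, 2600.0, 2850.0, 2500.0]

rows, y = [], []
for t in range(1, len(enroll)):
    is_fall = 1.0 if t % 2 == 0 else 0.0              # invented semester pattern
    rows.append([1.0, is_fall, float(t), enroll[t - 1]])
    y.append(enroll[t])

k = 4  # number of coefficients
# Normal equations (X'X) beta = X'y, solved by Gauss-Jordan elimination.
xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
aug = [xtx[i] + [xty[i]] for i in range(k)]
for c in range(k):
    p = max(range(c, k), key=lambda r: abs(aug[r][c]))  # partial pivoting
    aug[c], aug[p] = aug[p], aug[c]
    for r in range(k):
        if r != c:
            f = aug[r][c] / aug[c][c]
            aug[r] = [u - f * v for u, v in zip(aug[r], aug[c])]
beta = [aug[i][k] / aug[i][i] for i in range(k)]

fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
print([round(f) for f in fitted])
```

In practice we would estimate such a model with a statistical package that also handles the moving-average errors; the sketch only shows which regressors enter.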
Practical Lessons Learned
While the forecasts focused on the total number of credit hours registered, a practical question is the number of credits anticipated for a particular program, as this determines the program’s faculty load. The dataset for a given program is often too small to provide a reliable forecast. One solution is to base the program forecast on the program’s share of all credits in a recent semester. Another is to estimate the program share of credits from a linear regression on a time trend.
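Both program-level approaches can be sketched in a few lines of Python (the shares and the total-hours forecast below are invented):

```python
# Sketch of the two program-level methods (shares and totals invented):
# (1) apply the most recent share; (2) extrapolate the share from a
# linear time trend fitted by OLS.
shares = [0.30, 0.29, 0.27, 0.26, 0.24]   # program's share of all credits
total_forecast = 20000                    # assumed forecast of total hours

# (1) recent-share method
recent = shares[-1] * total_forecast

# (2) linear-trend method, extrapolated one semester ahead
n = len(shares)
tbar = (n - 1) / 2
sbar = sum(shares) / n
b = sum((t - tbar) * (s - sbar) for t, s in enumerate(shares)) / \
    sum((t - tbar) ** 2 for t in range(n))
a = sbar - b * tbar
trend = (a + b * n) * total_forecast
print(round(recent), round(trend))
```

When a program's share is drifting, the trend method anticipates the drift; when shares are stable, the recent-share method is simpler and about as accurate.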
Another lesson concerns the timing of observations. Because of courses dropped and added, the number of credits registered fluctuates throughout the semester. A solution is to always use the number of hours registered at the end of the semester. But this prevents the use of data from the middle of the current semester—data that may indicate an unfolding shock.
The literature is shifting from studies that explain enrollment to studies that predict it. The evolution is natural, as the former studies generate determinants that can be added to the latter. But small young universities, like those of Central Asia, might not have generated enough data to yield reliable predictions, particularly because their curricula often change rapidly. In some cases, administrators might gain from simulations that probe the sensitivity of forecasts to such parameters as the elasticity of enrollment to tuition and income.
Exercises and Discussion Questions
- Are there any circumstances under which one might prefer a structural model to a univariate one for a forecast?
- The R2 statistic measures the share of variation in the dependent variable that the OLS model can account for. For example, in a time-series analysis, an R2 value of .8 indicates that the model can account for 80% of the movements in the dependent variable over time. Does a high value of R2 necessarily mean that the model will forecast the dependent variable accurately?
- Will gathering more data always improve a forecast?
- At University X, the pool of college-age Kazakhstanis might continue to shrink until 2020. Meanwhile, judging from the case study, what can X do to boost enrollment?