Affirmative Action, Education and Gender: Evidence from India∗

Guilhem Cassan†

March 25, 2015

Abstract: I use a unique natural experiment to measure the impact of caste based positive discrimination in India on differential educational attainment by gender. I take advantage of the harmonization of the Scheduled Castes lists within the Indian states that took place in 1976 to measure the increase in the educational attainment of the new beneficiaries. I show that this policy had heterogeneous effects across genders, with males benefiting from the SC status and females remaining essentially unaffected. The results are consistent with a model of identity based norms of behavior in which the effect of the policy operates through an increase in the returns to education more than through a decrease in its cost. I also show that this translated into a differential increase in literacy and cognitive skills, and propose a new method based on a meta analysis of the data to proxy the latter.

JEL Classification: I24; O15; H41
Keywords: identity; scheduled caste; quota; affirmative action; gender; India; education.

∗ I am grateful to Gani Aldashev, Denis Cogneau, Christelle Dumas, Hemanshu Kumar, Eliana La Ferrara, Sylvie Lambert, Andreas Madestam, Annamaria Milazzo and Rohini Somanathan as well as seminar participants at AMID Summer School, Bocconi University, Louvain La Neuve, Bristol, Graduate Institute Geneva, Paris I, CERDI, EUDN Conference and Paris School of Economics for useful comments. Comments by Marion Leturcq and Lore Vandewalle on an earlier draft vastly improved this paper. I am indebted to Julien Grenet for his help in the data collection, and to Ashwini Natraj, with whom I started this project. This paper is produced as part of the project “Actors, Markets, and Institutions in Developing Countries: A micro-empirical approach” (AMID), a Marie Curie Initial Training Network (ITN) funded by the European Commission. Funding from the CEPREMAP is gratefully acknowledged.
All remaining errors are mine. All remaining errors in this paper, obviously.
† University of Namur, CRED. Email: guilhem.cassan@unamur.be.

Introduction

Affirmative action policies target socially disadvantaged groups based on an identity criterion such as skin color, religion, gender or caste. Those identities are selected based on the historical discrimination that they have faced, and are often still facing. In most cases, those policies take identity as univariate. However, as underlined by the “constructivist” view on identities (see Chandra (2012) or Laitin and Posner (2001), among others), individuals have a portfolio of identities (gender and religion, for example) and might cumulate disadvantages. As a result, policies that fail to take this plurality of identities into account might fail to reduce the social disadvantages faced by the groups, or part of the groups, that they are targeting. To illustrate this case, this paper measures the impact of affirmative action policies on educational attainment in India, using a nationwide natural experiment. It shows that affirmative action policies in India affect males and females in a dramatically asymmetric manner.

India is probably the best case study to illustrate the importance of taking the multiplicity of identities into account when evaluating affirmative action policies. Indeed, while several countries have put in place affirmative action programs targeting disadvantaged minorities, the largest one has been established in India. It consists of a set of policies targeting certain caste or tribe groups1, in three main domains: politics, public employment, and education. With increasing competition for access to education, those policies are heavily contested in India.
However, to the best of my knowledge, there is only very limited quantitative evidence of the impact of the affirmative action programs on educational attainment (as underlined by Chalam (1990), Chitnis (1972) and Galanter (1984)). Only Khanna (2013) has evaluated the effect of affirmative action policies on the educational attainment of the OBC. For the SC population, the first two domains of the Indian affirmative action policies have been systematically evaluated at the national level: Pande (2003) studied the impact of reservations of seats in legislative assemblies for low castes, while Howard and Prakash (2008) and Prakash (2009) chose to focus on the effect of reservation of public employment. On education, however, most of the research has focused on case studies quantifying the equity dimension of affirmative action (Bertrand et al., 2010; Krishna and Frisancho Robles, n.d.; Bagde et al., 2011). Since those studies typically focus on a very narrow group of low castes that reached higher education, we have little knowledge of the extent to which the SC as a whole benefited from the affirmative action programs in terms of educational attainment.

1 The “Scheduled Castes”, or SC, the “Scheduled Tribes”, or ST, and the “Other Backward Classes”, or OBC.

This paper takes advantage of a nationwide natural experiment on the beneficiaries of caste-based positive discrimination in order to measure its impact on educational attainment. Indeed, the lists of the castes considered as SC were drawn up by each Indian state at Independence, with variations in the lists across and even within states. In 1956, the borders of the Indian states were redrawn, while the lists of SC remained unchanged. This created a situation where, within a State, the “Scheduled Caste” status of the members of the same caste could vary across areas.
For administrative reasons, this situation lasted until 1976, when the lists of castes considered as SC were finally harmonized within each state, allowing 2.4 million individuals (Government of India, ed, 1978) to have access to the benefits of the SC status. This unique historical event allows me to assess the fate of the individuals who gained access to the SC status in 1976. I use two identification strategies to evaluate the impact of the affirmative action policies on educational attainment. First, I compare those “new” SC to individuals who would obtain access to the OBC status in the 1990’s. Second, I compare those “new” SC to individuals who had had access to the SC status since Independence. Those two very different identification strategies yield similar results. Overall, access to the SC status leads to an increase of 0.3 years of education, unevenly distributed across genders, with males capturing essentially all the increase in educational attainment and females remaining unaffected.

As identity norms of behavior differ across genders, we can use this feature to distinguish the channels through which the policy affects the level of education of the SC. As women tend to be excluded from the labour force, they will not benefit from the increased returns to education provided by the public employment quotas. They will however benefit from the decrease in the cost of education provided by the SC status. As a consequence, a simple model of education choice with norms of behavior by gender allows us to interpret the difference in outcome by gender as evidence that the policy operates mainly through its increase in the returns to education.

In addition, to assess whether this increased schooling translated into increased human capital, I add to the standard self reported skill measures by using a novel meta data approach.
Building on the use of age heaping as a measure of numeracy by economic historians such as Mokyr (1983), I use the within household declared age structure to infer the cognitive abilities of the household respondent. Both this novel measure and the standard self declared literacy point in the same direction: access to affirmative action policies leads to an increase in human capital.

This paper is organized in the most traditional manner. In the first section, I will examine the context and the natural experiment exploited in this paper. A simple model is then presented to facilitate the interpretation of the results. I will then describe the data as well as the empirical strategy, which will open the way to the presentation of the results in the following section. In order to enhance the confidence in the results, the issue of selective migration is discussed and ruled out. Finally, I conclude.

1 Context

1.1 Affirmative action in India

While the first affirmative action policies for the Scheduled Castes were implemented under British rule, it was not until Independence that a systematic positive discrimination policy was implemented across India. Reservations concerned 3 main dimensions: legislative seats, education and public employment. Within education, affirmative action consists of various policies. The most famous one is quotas in higher education institutions, but secondary schooling was also made free, while each state also has various policies (specific scholarships - for girls in particular, schools and hostels, free mid day meals, etc). All in all, the various schemes targeting the education of the SC are the main expenditures in the budget for the Welfare Scheme for SC of the Indian Plan: they represented 33% of its amount in 1951-1956 and 55% in 1974-79. Positive discrimination might thus affect schooling through various channels.
By reducing the cost of education, it favors longer studies in the cost-benefit trade-off of the household, while quotas in higher education will help the pursuit of studies after secondary schooling. Moreover, the quotas in public employment are also an incentive to pursue longer studies, as they increase the probability of employment in the formal sector, and thus increase the returns to education. Hence, this paper does not evaluate the effect of affirmative action in the educational sector only, but the effect of the affirmative action policies as a whole on educational attainment.

The Constitution of India also identified another group of castes, the “Other Backward Classes”, castes that would not be as discriminated against as the SC, but still low in the caste hierarchy, as recipients of affirmative action policies. However, it was not until 1993 that public employment quotas were systematically implemented for those groups throughout India, while quotas in higher education were systematically implemented only in 2008. There are no seats reserved for those groups in the assemblies at the state and federal level.

1.2 The definition of the Scheduled Castes

This section, drawing on the work of Galanter (1984), provides a short history of the list of Scheduled Castes. One of the main problems with the making of such a classification is that the definition of “untouchability”, the criterion to be considered a SC, is not straightforward: as “untouchability” varies in its meaning across the sub continent, it is hard to create a definition that would apply to the whole country. Indeed, while untouchable castes are relatively well identified in the South and West of India, it is not the case in the other parts of the country. Hence, the Constitution of 1950 avoids defining a clear concept and only provides a procedure of designation2 that each State is to follow.
This allowed for the possibility of inconsistencies across States3 and even within States, as certain States decided to give the SC status to certain castes only in certain areas. Despite those inconsistencies, the lists have been revised only four times since Independence4. With an increase of 2.4 million SC over an original population of 80 million SC, the Scheduled Castes and Scheduled Tribes (Amendment) Act of 1976 was the most dramatic change in the list of SC in India.

1.3 The Scheduled Castes and Scheduled Tribes Orders (Amendment) Act of 1976

In 1956, India reorganized the borders of its States along linguistic lines5. But, as the borders of the States were redefined (see Figure 1), the State-wise SC lists remained unchanged. This led to large discrepancies between State borders and the lists of SC: the lists were no longer defined at the State level, but by area within each State, with areas corresponding to the pre-1956 borders. Hence, from 1956, the number of castes considered as SC in one part of a State and non SC in another part of the same State vastly increased.

2 “castes, races or tribes or parts of or groups within castes, races and tribes which shall for purposes of this Constitution be deemed to be Scheduled Castes in relation to that State.”
3 Bayly (1999) gives the example of the Khatik caste, considered as SC in Punjab, but classed as a “forward” caste in Uttar Pradesh, a neighboring State at the time of the establishment of the lists.
4 The first change, in 1951, was a matter of correcting small anomalies. The second change, in 1956, mainly affected Rajasthan and Uttar Pradesh, and also allowed all Sikh untouchable castes to claim SC status, while the fourth change, in 1990, allowed the Buddhists to have access to the SC status in all the States. The most important change took place in 1976 and is described in more detail in the paper.
5 And in 1960, the states of Maharashtra and Gujarat were created from the former state of Bombay.

[Figure 1 about here.]

The reason why the lists were not adjusted to the new borders is the slowness of the administration: “It has been mentioned in the last report that the President has issued the SC and ST Lists (Modification) Order, 1956, specifying the SC and ST in the reorganized States. As these lists had to be issued urgently for the re-organized States, it was not possible to prepare comprehensive and consolidated lists and therefore, the SC and ST had to be specified in these list territory-wise within each re-organized State” (Government of India, ed, 1958). Not only did the administration fail to change the lists on time, it failed to do so for a period of twenty years. The yearly reports of the Commissioner on SC and ST are particularly telling in this respect, as many of them refer to the fact that “[...]the question of preparation of comprehensive lists of SC and ST for the reorganized States [...] remained pending [...]” (Government of India, ed, 1960). It was only under the emergency rule of Indira Gandhi that the SC lists were harmonized within states with the SC and ST (Amendment) Act of 1976. Hence, for administrative reasons, individuals from the same caste could be considered as SC in one part of a State but not in another until 1976. This situation creates a natural experiment setting in which the members of a caste faced different access to affirmative action within the same state.

In 1976, the Area Restriction Removal Act removed almost all intra-State restrictions6. This removal of restrictions led to an increase of 2.4 million in the SC population7. This unique historical event creates a plausibly exogenous variation in the SC status across individuals, allowing me to assess the causal impact of the SC status on educational attainment.
6 According to Galanter (1984), their number dropped from 1,126 to 64.
7 The states affected by the reform were Andhra Pradesh, Bihar, Gujarat, Karnataka, Kerala, Madhya Pradesh, Maharashtra, Rajasthan, Uttar Pradesh, Tamil Nadu, West Bengal and Himachal Pradesh. The analysis excludes Himachal Pradesh, as when this Union Territory was granted the State status in 1966, large portions of the state of Punjab were also transferred to it: hence, between 1956 and 1966, the population living in the contemporary borders of Himachal Pradesh was exposed to different State policies, preventing us from identifying the sole effect of the access to the SC status. Results are unaffected by the inclusion of Himachal Pradesh.

1.4 Gender and schooling

Within the SC (and in the Indian population in general), women are generally at a disadvantage in terms of schooling. For example, in the data, women aged 18 to 48 at the time of survey have on average 2.6 years of schooling as opposed to 5.4 for their male counterparts. One of the main reasons put forward to explain this fact is the lower returns to schooling (Kingdon, 1998), or lower perceived returns to schooling (Dreze and Sen, 2002). Indeed, as is evident in Table 1, women’s participation in the labour force is much lower than men’s, with 63.2% of women out of the labor force as opposed to 14% of males. As a result, women’s returns to education are often perceived as almost nonexistent8. Dreze and Sen (2002) also underline a second important reason for the low enrolment of women: the low quality of schooling infrastructure. Indeed while, say, the absence of a nearby school could be thought to be gender neutral, in practice, girls are more affected, as parents tend to be more reluctant to send their daughters to far away schools than their sons.
Finally, another reason put forward by Dreze and Sen (2002) for the lower education of girls is that even if education can be an asset on the marriage market, it can only be so if the girl’s education remains lower than that of her potential husband.

[Table 1 about here.]

2 Theoretical Framework

A simplified version of the two children model of La Ferrara and Milazzo (2014) allows me to take those various dimensions into account and to derive predictions on the response of daughters’ and sons’ education as a function of the type of mechanism through which affirmative action affects education decisions. I consider a household with a parent and two children, one son and one daughter. The parent allocates her exogenous resources $Y^p$ between own consumption $C^p$ and investment in child $i$’s education $E^i$. Following the notation of La Ferrara and Milazzo (2014), I write a child’s preferences as $U^i = u(C^i)$ with $C^i = Y^i(E^i)$, $u'(\cdot) > 0$, $u''(\cdot) < 0$, $Y'(\cdot) > 0$, $Y''(\cdot) < 0$ and $i \in \{s, d\}$, where $s$ stands for son and $d$ for daughter. The parent’s utility is a function of her children’s consumption and her own consumption: $V^P = v(C^p; Y^s(E^s); Y^d(E^d))$ with partial derivatives $v_{C^p}(\cdot) > 0$, $v_{Y^s}(\cdot) > 0$, $v_{Y^d}(\cdot) > 0$, $v_{C^p,C^p}(\cdot) \leq 0$, $v_{Y^s,Y^s}(\cdot) \leq 0$ and $v_{Y^d,Y^d}(\cdot) \leq 0$. Moreover, I make a standard separability assumption between parental consumption and child’s income: $v_{C^p,Y^s}(\cdot) = v_{C^p,Y^d}(\cdot) = 0$.

The parent’s problem can be written as:

$$\max_{C^p, E^s, E^d} \; v(C^p; Y^s(E^s); Y^d(E^d)) \quad \text{s.t.} \quad Y^p = C^p + p_d E^d + p_s E^s$$

with $p_i$ the price of education of child $i$.

8 Dreze and Sen (2002) write for example: “[...] the gender division of labour [...] tends to reduce the perceived benefits of female education. [...] It is in the light of these social expectations about the adult life of women that female education appears to many parents to be of somewhat uncertain value, if not quite ’pointless’.”
Substituting the budget constraint into the parent’s utility and taking the ratio of the first order conditions, I get the equilibrium condition:

$$\frac{v_{Y^s}(\cdot)}{v_{Y^d}(\cdot)} = \frac{p_s}{p_d} \, \frac{\partial Y^d(E^d)/\partial E^d}{\partial Y^s(E^s)/\partial E^s} \qquad (1)$$

As in La Ferrara and Milazzo (2014), “this condition states that the ratio of resources invested in son and daughter’s education reflects differences in returns to education and relative prices of education across genders”. Hence, I get the following predictions:

1. A (a)symmetrical decrease in the cost of education of boys and girls will lead to a (a)symmetrical increase in their education.
2. A (a)symmetrical increase in the returns to education of boys and girls will lead to a (a)symmetrical increase in their education.

From the discussions in 1.1 and 1.4, we know that affirmative action policies can affect the cost of schooling, and that this effect is likely to be either gender neutral or biased towards girls. The quotas in public employment will increase the returns to education, but as girls are for the most part excluded from the labour market, only boys will be perceived as benefiting from those quotas, leading to a gender asymmetric increase in (perceived) returns to education. As a result, by comparing boys and girls, I will be able to test whether the decrease in costs or the increase in returns is more likely to drive the effect of affirmative action policies on educational attainment. Indeed, if I find that both boys and girls, or only girls, see their level of education affected, then the likely channel will be the cost of education. If, on the contrary, only boys benefit from the access to the SC status, then the likely channel will be the returns to education.
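As a sketch of the step summarized above, equilibrium condition (1) can be recovered from the two first order conditions with respect to $E^s$ and $E^d$. The derivation assumes an interior solution, which the text does not state explicitly; the notation is that of the model.

```latex
\begin{align*}
\max_{E^s,\,E^d}\quad
  & v\big(Y^p - p_s E^s - p_d E^d;\; Y^s(E^s);\; Y^d(E^d)\big) \\[4pt]
\text{FOC w.r.t. } E^s:\quad
  & -\,p_s\, v_{C^p}(\cdot) + v_{Y^s}(\cdot)\,\frac{\partial Y^s(E^s)}{\partial E^s} = 0 \\[4pt]
\text{FOC w.r.t. } E^d:\quad
  & -\,p_d\, v_{C^p}(\cdot) + v_{Y^d}(\cdot)\,\frac{\partial Y^d(E^d)}{\partial E^d} = 0 \\[4pt]
\text{Ratio of the two:}\quad
  & \frac{v_{Y^s}(\cdot)\,\partial Y^s(E^s)/\partial E^s}
         {v_{Y^d}(\cdot)\,\partial Y^d(E^d)/\partial E^d}
    = \frac{p_s}{p_d} \\[4pt]
\Longleftrightarrow\quad
  & \frac{v_{Y^s}(\cdot)}{v_{Y^d}(\cdot)}
    = \frac{p_s}{p_d}\,
      \frac{\partial Y^d(E^d)/\partial E^d}{\partial Y^s(E^s)/\partial E^s}
\end{align*}
```

Each first order condition equates the marginal cost of a year of schooling, expressed in forgone parental consumption, with its marginal benefit to the parent; dividing one by the other eliminates $v_{C^p}(\cdot)$.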
3 Data and Empirical Strategy

3.1 Data and descriptive statistics

The National Family and Health Survey of 1998-99 is, to my knowledge, the only dataset offering both the precise caste name (the “jati”) of respondents, their district of residence and a sufficient sample size to perform the type of analysis done in this paper9. Using the 1971 and 1981 Census lists of Scheduled Castes and the district of residence of households10, I am able to identify the households that were granted the SC status in 1976. Table 2 provides the summary statistics for the variables used throughout the paper. As expected, one can see that the educational attainment in the sample is lower for females than for males, and that overall, OBC have a higher education level than “Old SC”, who themselves have a higher level of education than “New SC”.

[Table 2 about here.]

3.2 Identification Strategy

As the SC are a different population from the general population, much poorer in particular, comparing the “new” SC to the general population does not allow me to separate the specific effect of the access to the SC status on educational attainment from that of other social policies, as various pro-poor policies were launched in that period. Hence, two credible counterfactuals arise. The first are the OBC, who will have access to other affirmative action policies in the 1990’s, but do not have access to such policies in 1976. The second are the SC already on the lists in 1976. I will call them the “Old SC”. The identification strategy of this paper will thus rely on simple difference in differences. The first difference compares cohorts that were too old in 1977 for access to the SC status to affect their educational attainment with cohorts that were young enough at that date11. Throughout the paper, I will consider as “young enough” the cohorts aged 10 years old or below in 1977, which were still of primary school age in the year of implementation of the law.
The second difference will be derived from the comparison of individuals who will be considered as OBC from the early 1990’s (first identification strategy) or are “Old SC” (second identification strategy) to those that were granted the access in 1977 only. My preferred identification strategy is the first, as OBC did not have systematic access to affirmative action policies before 1993, while “Old SC” have had access to those policies since Independence. As a result, the second identification strategy is somewhat unconventional, since the control group is “always treated”, while in a typical difference in differences, the control group is “never treated”.

9 For example, the 2004-5 round of the NFHS does not contain information on the district of residence of the respondents.
10 In order to do so, I simply match the 1999 districts of the NFHS data to their 1971 counterparts using the Indian Administrative Atlas of 2001, which follows district changes over time.
11 Year of implementation of the 1976 change.

The difference in the timing of access to the SC status suggests that if the policy had an impact, the educational attainment of the groups should diverge for the cohorts at school age between 1950 and 1976, when their treatment status differed, and converge for the cohorts at school age after 1976, when both groups are treated. As is evident in Figure 2, such an evolution indeed seems to have taken place.

[Figure 2 about here.]

I will thus run regressions of the type:

$$Edu_{idt} = constant + \beta \, NewSC_{id} + \delta \, NewSC_{id} \times post1967_{i} + \gamma \, post1967_{i} + \lambda X_{idt} + \epsilon_{idt} \qquad (2)$$

where $Edu_{idt}$ is a measure of educational attainment of individual $i$ born in year $t$ and residing in district $d$, $NewSC_{id}$ a dummy indicating whether individual $i$ residing in district $d$ is a member of a caste added to the SC list in 1976, $post1967_{i}$ a dummy taking value 1 if individual $i$ is born in year 1967 or after and $X_{idt}$ a set of control variables.
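To make the role of each coefficient in Equation 2 concrete, the following is a minimal sketch of the difference in differences regression, estimated by ordinary least squares on simulated data. Everything in it is an assumption made for illustration: the function names `did_estimate` and `simulate`, the simulated sample and the chosen parameter values are hypothetical, the control variables are omitted, and none of it reproduces the NFHS data or the paper's estimates.

```python
import numpy as np


def did_estimate(edu, new_sc, post1967):
    """OLS of Equation 2 without the controls X (illustrative only).

    Returns (constant, beta, delta, gamma), where delta is the
    coefficient on the interaction NewSC * post1967.
    """
    edu = np.asarray(edu, dtype=float)
    new_sc = np.asarray(new_sc, dtype=float)
    post = np.asarray(post1967, dtype=float)
    # Design matrix: constant, NewSC, NewSC * post1967, post1967
    X = np.column_stack([np.ones_like(edu), new_sc, new_sc * post, post])
    coef, *_ = np.linalg.lstsq(X, edu, rcond=None)
    return tuple(coef)


def simulate(n=20000, delta=0.3, seed=0):
    """Simulated sample with a known interaction effect (hypothetical values)."""
    rng = np.random.default_rng(seed)
    new_sc = rng.integers(0, 2, n)   # 1 if caste added to the SC list in 1976
    post = rng.integers(0, 2, n)     # 1 if born in 1967 or after
    noise = rng.normal(0.0, 1.0, n)
    edu = 4.0 - 1.0 * new_sc + delta * new_sc * post + 0.8 * post + noise
    return edu, new_sc, post


if __name__ == "__main__":
    edu, new_sc, post = simulate()
    const, beta, delta, gamma = did_estimate(edu, new_sc, post)
    print(f"beta (New SC gap, old cohorts): {beta:.2f}")
    print(f"gamma (cohort effect, control group): {gamma:.2f}")
    print(f"delta (difference in differences): {delta:.2f}")
```

In this sketch, `beta` captures the educational gap of the "New SC" among cohorts born before 1967, `gamma` the change across cohorts in the comparison group, and `delta` the additional change for the "New SC", which is the coefficient of interest.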
Obviously, the ideal specification to run would be a non parametric one, in which the “New SC” coefficient would be allowed to vary across cohorts, as in Duflo (2001). Unfortunately, with only 1202 “New SC” aged 18 or above at the time of the survey, such a specification is not feasible, as each cohort would only have around 40 individuals, and half of that for gender level analysis. I nonetheless present the coefficients of “New SC” interacted with 2-year cohort dummies in Figure 3.

[Figure 3 about here.]

It can clearly be seen that the level of education of male “New SC” starts to catch up after 1967, when the first cohorts are exposed to the SC status. For females, however, the cohorts born from 1967 onwards do not seem to be affected at all.

4 Results

4.1 Years of Education

Table 3 presents the results of the estimation of Equation 2. We can see that the access to the SC status led to an increase in the number of years of schooling of 0.3 to 0.4 years. The coefficient of interest on New SC*post 1967 remains stable throughout specifications: district trend in column 2, within household estimation in column 3 and within household and state-cohort FE in column 4. As the within household estimations are very demanding on the data, and allow an estimation only on households who have members born before and after 1967, my preferred specification is the second one, with Jati fixed effects and district trends, which controls for all relevant covariates while still making use of most of the variation in the data. The results are qualitatively similar across identification strategies, even if the comparison to “Old SC”, relying on far fewer observations, results in somewhat less significant coefficients.

[Table 3 about here.]

However, the picture changes dramatically once the coefficient is allowed to vary across genders, as can be seen in Table 4. Indeed, it becomes clear that only males seem to benefit from their inclusion in the lists of SC.
With an increase in their educational level of 0.7 to 1.3 years of schooling, they capture all the increase in schooling that had been measured in Table 3, while women remain essentially unaffected. It is to be noted that the positive coefficient on New SC * Female is due to the fact that women, especially in the older cohorts, very often have no education at all, which by construction reduces the New SC educational gap for women.

[Table 4 about here.]

4.2 Educational level

Focusing on the number of years of education might hide various effects of the access to the SC status on the different levels of education. Table 5 presents the results of the Jati fixed effects-district trend specification on educational levels: schooling, primary completion, some secondary schooling and secondary completion12. It can be seen that the educational attainment of males has increased at all the steps of education except secondary school completion, and this increase is particularly strong - and precisely estimated - for primary completion and some secondary schooling. Moreover, the absence of impact of the SC status on the level of education of women remains true throughout the spectrum of education levels.

[Table 5 about here.]

Hence, it is now clear that while the access to the SC status does indeed lead to an increase in the educational attainment of the SC population, this increase is in fact completely captured by males, females being completely excluded from it.

12 The very similar results for all the specifications can be found in Online Appendix 1.

4.3 Skill acquisition

However, increasing the number of years spent at school might not lead to an increase in human capital if the quality of education is low. Hence, I will now turn to the evaluation of the impact of the access to the SC status on two skills: literacy and cognitive skills. Since literacy is self reported, one might question the reliability of this measure.
In this sub section, I propose a novel way to measure skill acquisition. The household module of the NFHS, used in this paper, is addressed to one respondent, who reports information on the other household members. As a result, the specific way in which the questions are answered can be used to infer some information about the respondent. Any person familiar with household data on developing countries knows that the tendency to round age to the closest multiple of 5 is very high. The standard way to measure age heaping in the data is the Whipple index13, which measures the tendency to declare ages which are multiples of 5. Economic historians (Bachi (1951), Myers (1954), Mokyr (1983) or A’Hearn et al. (2009)) pioneered the use of this index at the national level as a measure of the long term evolution of numeracy. I propose here to exploit this measure at the household level to proxy for the cognitive skills14 of the respondent, the person who declares the age of all the other household members. This measure has the major advantage of being based on revealed cognitive skills instead of self reported ones. An important drawback is that it is a very noisy measure of those skills. Appendix 5 describes the age heaping in the data. It also shows that household level age heaping correlates well with the respondent’s level of education, an indication of the relevance of this index for cognitive skills. An additional concern with the use of this measure is that it might be correlated with variables other than cognitive skills, such as the professionalism of the enumerator, for example. However, in the setting of this paper, the difference in differences strategy used essentially washes out all such potential concerns.
To invalidate the use of such a measure on those grounds, one would for example need to claim that enumerators collected the ages of “New SC” differently from those of the rest of the population, and that this difference of treatment across castes in turn changed across cohorts. To compute this measure of cognitive skills, I calculate the Whipple index of the ages declared by the respondent of each household. The declared ages can be multiples of 5 for three reasons. First, one fifth of the household members should have an age which is a multiple of 5. Second, the respondent lacks cognitive skills and rounds the age to a multiple of 5. Third, the respondent does not lack cognitive skills, but does not have the information on the exact age of the household members, and tends to round them. The Whipple index of the household is thus a function of the size of the household and of the cognitive skills of the respondent, as well as the cognitive skills of other household members. As I am interested in the cognitive skills of the respondent, that is, in the second possibility, I want to control for the first and third possibilities. The first point in particular could be an issue for small households, as the Whipple index will tend to take more extreme values for small households. To attenuate this measurement bias, I will control for the number of individuals aged 23 to 62 in the household.

13 It is the share of declared ages between 23 and 62 that are a multiple of 5, multiplied by 500. Hence, it varies between 0 and 500, where 0 means that no one declares an age that is a multiple of 5 and 500 means that everyone does. If there is no tendency of age heaping, the Whipple index is 100 (20% of the population should have an age that is a multiple of 5 if there is no misdeclaration).
14 I believe that in this setting, I capture not only numeracy, but also other abilities such as memory, for example.
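The household level Whipple index defined in footnote 13 can be sketched as follows. This is an illustration under stated assumptions rather than the code used for the paper: the function names `whipple_index` and `has_cognitive_skills` are hypothetical, the input is simply the list of ages declared by one respondent, and the cut-off of 175 is the threshold that the paper adopts for classifying respondents.

```python
def whipple_index(declared_ages, low=23, high=62):
    """Whipple index of a list of declared ages.

    Share of ages in [low, high] that are multiples of 5, times 500.
    Returns None when no declared age falls in the range, since the
    index is then undefined.
    """
    in_range = [age for age in declared_ages if low <= age <= high]
    if not in_range:
        return None
    heaped = sum(1 for age in in_range if age % 5 == 0)
    return 500.0 * heaped / len(in_range)


def has_cognitive_skills(declared_ages, threshold=175.0):
    """True if the household's Whipple index is below the threshold.

    Returns None when the index is undefined for this household.
    """
    index = whipple_index(declared_ages)
    if index is None:
        return None
    return index < threshold


if __name__ == "__main__":
    # One age per year from 23 to 62: no heaping, 8 multiples of 5 out of 40
    print(whipple_index(range(23, 63)))
    # A small household where every adult age is rounded
    print(whipple_index([25, 30, 40]))
```

The second call illustrates why household size matters: with only a few members aged 23 to 62, the index moves in large steps and easily reaches extreme values, which motivates controlling for the number of household members in that age range.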
For the third point, I will control for the number of household members aged 23 to 62 who were born when the respondent was 15 or older (i.e., when she was old enough to remember the year of birth of the household member). In addition, as I want to assess the effect of access to the SC status on the respondent’s cognitive skills, I will control for the number of other household members aged 23 to 62 born after 1966, as their cognitive skills could also have been affected by their access to the SC status and could thus indirectly affect the ages declared by the respondent. Additionally, I control for the average years of education of the household members aged 23 to 62 (excluding the respondent). Finally, since the identity of the respondent is heavily selected, separate coefficients for males and females would not be interpretable, and I will only present average effects on respondents. I will consider as “having cognitive skills” the respondents whose household’s Whipple index is below 175 [15]. Table 6 presents the results of various specifications on self-reported literacy outcomes and revealed cognitive skills. It can be seen that there is indeed a male-specific increase in literacy for new SC born after 1967, and that cognitive skills also seem to have progressed.

[Table 6 about here.]

[15] A Whipple index of 175 means that 35% of the population is declared to have an age that is a multiple of 5, i.e., 15 percentage points above what the distribution of ages would be expected to show absent age heaping. This threshold is the one used by the United Nations (2000) to declare that the data is “very badly” biased towards multiples of 5. Regressions using other thresholds yield similar and more precisely estimated results and can be seen in Appendix A.

5 Selective migration

One concern with the estimation is selective migration. Indeed, I attribute the “New SC” status based on the district of residence of respondents.
It is possible that between 1956 and 1976, households that would be considered SC in only a part of a state migrated in order to benefit from the SC status. As a consequence, the results of the estimations might be biased by this migration. However, migration is relatively low in India (Munshi and Rosenzweig, 2009), and particularly so in the 1970s and before [16]. In addition, the households choosing to migrate would probably be the ones that would have benefited the most from access to the SC status had they not migrated before 1976. That means that, if anything, this selective migration is likely to bias the estimates downwards, as I would consider those households as “Old SC”. Finally, given the difference-in-differences specification and the differential effect by gender, only a very specific and unlikely form of migration could drive the results. Indeed, one would need the migrant household to have at least one of the two following characteristics:

1. its male members born before 1967 have a higher education than the “Old SC” of the same age;
2. its male members born after 1967 have a lower education than the “Old SC” of the same age.

These characteristics would, in addition, need to be reversed for the females of the household. As the NFHS does not offer migration data, it is not possible to verify directly whether migrating households indeed differ in any way from other households. However, since households migrating to obtain the SC benefits are probably more likely to migrate to cities, where they could benefit from better schooling infrastructure and better access to public employment quotas, one robustness check is to see whether the results hold without the urban population. The NFHS distinguishes 3 types of urban areas: “capital and large city” (capitals and cities of more than a million inhabitants), “small city” (population over 50 000) and “town” (other urban areas).
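One way to organize such a check is to build nested subsamples that drop progressively smaller urban categories, ending with a purely rural sample. The sketch below is only an illustration under stated assumptions: the record layout (a dictionary with an `area` key), the function name `nested_samples` and the example records are hypothetical, and the regression step itself is left out.

```python
# Urban categories as named in the text, ordered from largest to smallest.
URBAN_LEVELS = ["capital and large city", "small city", "town"]


def nested_samples(records):
    """Return the full sample plus subsamples that drop ever more urban levels."""
    samples = {"full sample": list(records)}
    dropped = set()
    for level in URBAN_LEVELS:
        dropped.add(level)
        label = "without " + ", ".join(URBAN_LEVELS[: len(dropped)])
        samples[label] = [r for r in records if r["area"] not in dropped]
    return samples


# Hypothetical records: only the area type matters for the sample restriction.
example = [
    {"area": "capital and large city"},
    {"area": "small city"},
    {"area": "town"},
    {"area": "rural"},
    {"area": "rural"},
]
sizes = {label: len(rows) for label, rows in nested_samples(example).items()}
```

The same specification would then be estimated on each subsample, and the stability of the coefficients compared across them.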
As migrants seeking to acquire the SC status are likely to move to larger cities, I successively remove those levels of urban areas from the sample and re-run my preferred specification to verify whether the results are affected. As the coefficient remains very stable across samples, it is safe to conclude that the results are not driven by the most migration-prone areas and, as a consequence, that migration is not driving the results.

[Table 7 about here.]

[16] Indeed, from the Migration Volumes of the 1981 Census, one can estimate that in 1981, only 3.91% of males and 6.27% of females had migrated from another district between 1961 and 1977.

6 Conclusion

This paper studies the impact of the positive discrimination policy in education conducted by the Indian Government since Independence. It shows that the impact of reservations on educational attainment is mixed. Indeed, while the males young enough to benefit from the SC status see their schooling level increase, along with their literacy and cognitive skills, no such effect can be detected among females. This is consistent with a model in which norms of behavior prevent females from entering the labour force and in which the effect of affirmative action policies on education mainly goes through perceived increased returns to education. Hence, while the affirmative action policies do seem to have been successful in increasing the average education level of the SC population, they have not been able to reach its most disadvantaged subpopulation: women. While the “creamy layer” debate has captured most of the attention in the affirmative action debate in India, this paper suggests that the gender dimension also deserves much greater attention. In particular, it suggests that populations that cumulate discriminations need specific attention, as they might be completely excluded from the benefits of policies for which they are eligible.
References

A’Hearn, Brian, Jörg Baten, and Dorothee Crayen, “Quantifying Quantitative Literacy: Age Heaping and the History of Human Capital,” Journal of Economic History, 2009, 69 (4).
Bachi, R., “The Tendency to Round Off Age Returns: Measurement and Correction,” Bulletin of the International Statistical Institute, 1951, 33.
Bagde, Surendrakumar, Dennis Epple, and Lowell J. Taylor, “Dismantling the Legacy of Caste: Affirmative Action in Indian Higher Education,” Working Paper, 2011.
Bayly, Susan, Caste, Society and Politics in India, Cambridge University Press, 1999.
Bertrand, Marianne, Rema Hanna, and Sendhil Mullainathan, “Affirmative Action in Education: Evidence from College Admissions in India,” Journal of Public Economics, 2010, 94 (1-2), 16–29.
Chalam, K.S., “Caste Reservations and Equality of Opportunity in Education,” Economic and Political Weekly, 1990, 25 (41).
Chandra, Kanchan, Constructivist Theories of Ethnic Politics, Oxford University Press, 2012.
Chitnis, Suma, “Education for Equality: Case of Scheduled Castes in Higher Education,” Economic and Political Weekly, 1972, 7 (31-33).
Dreze, Jean and Amartya Sen, India: Development and Participation, 2nd ed., Oxford University Press, 2002.
Duflo, Esther, “Schooling and Labor Market Consequences of School Construction in Indonesia: Evidence from an Unusual Policy Experiment,” American Economic Review, 2001, 91 (4), 795–813.
La Ferrara, Eliana and Annamaria Milazzo, “Customary Norms, Inheritance, and Human Capital: Evidence from a Reform of the Matrilineal System in Ghana,” Working Paper, 2014.
Galanter, Marc, Competing Equalities: Law and the Backward Classes in India, University of California Press, 1984.
Government of India, ed., Report of the Commissioner for Scheduled Castes and Scheduled Tribes, 1957-1958, Manager Govt. of India Press, 1958.
Government of India, ed., Report of the Commissioner for Scheduled Castes and Scheduled Tribes, 1958-1959, Manager Govt. of India Press, 1960.
Government of India, ed., Report of the Commissioner for Scheduled Castes and Scheduled Tribes, 1975-1977, Manager Govt. of India Press, 1978.
Howard, Larry and Nishith Prakash, “Does Employment Quota Explain Occupational Choice Among Disadvantaged Groups? A Natural Experiment from India,” Working Paper, 2008.
Khanna, Gaurav, “That’s Affirmative: Incentivizing Standards or Standardizing Incentives? Affirmative Action in India,” Working Paper, 2013.
Kingdon, Geeta, “Does the Labour Market Explain Lower Female Schooling in India?,” Journal of Development Studies, 1998, 35 (1).
Krishna, Kala and Veronica Frisancho Robles, “Affirmative Action in Higher Education in India: Targeting, Catch Up, and Mismatch,” Working Paper.
Laitin, David and Daniel Posner, “The Implications of Constructivism for Constructing Ethnic Fractionalization Indices,” Newsletter of the American Political Science Association Organized Section in Comparative Politics, Winter 2001, 12.
Mokyr, Joel, Why Ireland Starved, George Allen & Unwin, 1983.
Munshi, Kaivan and Mark Rosenzweig, “Why is Mobility in India so Low? Social Insurance, Inequality, and Growth,” Working Paper, 2009.
Myers, R., “Accuracy of Age Reporting in the 1950 United States Census,” Journal of the American Statistical Association, 1954.
Pande, Rohini, “Can Mandated Political Representation Provide Disadvantaged Minorities Policy Influence? Theory and Evidence from India,” American Economic Review, 2003, 93, 1132–1151.
Prakash, Nishith, “The Impact of Employment Quotas on the Economic Lives of Disadvantaged Minorities in India,” Working Paper, 2009.
United Nations, “Demographic Yearbook - Special Issue,” http://unstats.un.org/UNSD/Demographic/products/dyb/dybcens.htm 2000.

Appendices

A Using the Whipple index to measure the cognitive skills of the respondent

As using the Whipple index at the household level is highly non-standard, I illustrate in this section why it makes sense in the setting of this paper.

A.1 Age heaping and education of the respondent

One simple way to assess the relevance of this measure is to see how it correlates with standard measures of education. Figure 4 graphs the age distribution in the data by level of education of the respondent. The strong tendency to declare a multiple of 5 can easily be spotted, and it decreases with the level of education of the respondent. This is a clear indication that the cognitive skills of the respondent are reflected in the household-level Whipple index.

[Figure 4 about here.]

A.2 The Whipple Index

Figure 5 presents the distribution of the Whipple index in the data, by education level of the respondent. The first striking pattern is that the data is clearly concentrated on three points: 0, 250 and 500. Indeed, a large number of households have only 2 members aged 23 to 62. For those households, the Whipple index can only take one of those three values. The second main pattern is the leftward shift of the Whipple index as the educational level of the respondent increases, as is to be expected given the pattern of age heaping by education.

[Figure 5 about here.]

Table 8 presents the descriptive statistics of the Whipple index and of the cognitive skills variables derived from it in the data.

[Table 8 about here.]

A.3 Robustness check: alternative definitions of cognitive skills

Given the bunching of the data at values of the Whipple index of 250 and 500, I test in this subsection whether the main results of the paper are robust to alternative definitions of cognitive skills.
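The bunching follows mechanically from the definition of the index: with n members aged 23 to 62, the household index can only take the n + 1 values 500k/n for k = 0, ..., n. The following minimal sketch enumerates them with exact fractions and shows how different cut-offs classify a two-member household; the function name is hypothetical and the code is an illustration, not part of the paper's estimation.

```python
from fractions import Fraction


def possible_whipple_values(n_eligible: int) -> list:
    """All values the household index can take with n_eligible members aged 23 to 62."""
    return [Fraction(500 * k, n_eligible) for k in range(n_eligible + 1)]


two = possible_whipple_values(2)     # 0, 250, 500
three = possible_whipple_values(3)   # 0, 500/3, 1000/3, 500

# Classification of a two-member household under three cut-offs.
below_175 = [v < 175 for v in two]   # only an index of 0 qualifies
below_250 = [v < 250 for v in two]   # same households as with the 175 cut-off
below_500 = [v < 500 for v in two]   # an index of 0 or 250 qualifies
```

For two-member households, cut-offs at 175 and at 250 therefore select exactly the same households, so only a cut-off above 250 changes their classification.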
While in the paper I define the respondent as having cognitive skills if the Whipple index is below 175, in line with the recommendations of the UN, here I use below 250 and below 500 as alternative thresholds, as well as a dummy for whether the respondent’s age is not a multiple of 5. Table 9 presents the results using those alternative definitions of cognitive skills. It can be seen that the results remain qualitatively similar to those with a 175 threshold.

[Table 9 about here.]

Figure 1: Variation in States’ borders between 1951 and 1961. [Panels: Indian States in 1951; Indian States in 1961.]

Figure 2: Evolution of educational attainment by treatment status. Local mean smoothing, with 95% confidence interval.

Figure 3: Coefficient on the interaction of “New SC” and 2-year cohort dummies. [Series: Males; Females.]

Figure 4: Distribution of ages by level of education of the respondent. [Panels: All; No Schooling; Some Schooling; Primary schooling completed; Some Secondary Schooling; Secondary schooling completed.]

Figure 5: Distribution of Whipple indexes by level of education of the respondent. [Panels: All; No Schooling; Some Schooling; Primary schooling completed; Some Secondary Schooling; Secondary schooling completed.]

Table 1: Occupation by gender, in percent.

Occupation                                  Female    Male
Professional, technical and related work       1.3     2.7
Administrative, executive & managerial           0     0.6
Clerical and related workers                   0.6       3
Sales workers                                  0.8     5.3
Service workers                                2.5     4.4
Farmers, fishermen, hunters, loggers…         22.2    32.6
Production & related workers, transport        9.5    37.7
Not working                                   63.2      14

Source: NFHS2. Sample: SC population aged between 18 and 48 at the time of the survey.

Table 2: Descriptive Statistics.
                                        New SC              Old SC               OBC
                                     Mean   Std. Dev.    Mean   Std. Dev.    Mean   Std. Dev.
Education variables
Years of Schooling up to Secondary   3.21     4.01       4.06     4.18       4.87     4.19
Schooling                            0.49     0.50       0.59     0.49       0.67     0.47
Primary Completion                   0.40     0.49       0.48     0.50       0.58     0.49
Incomplete Secondary                 0.32     0.47       0.39     0.49       0.47     0.50
Secondary Completion                 0.14     0.35       0.19     0.39       0.25     0.43
Literacy                             0.47     0.50       0.55     0.50       0.64     0.48
Cognitive Skills                     0.32     0.47       0.39     0.49       0.46     0.50
Individual and household variables
Female                               0.48     0.50       0.50     0.50       0.50     0.50
Household size                       6.90     3.29       6.81     3.48       7.11     3.91
Urban                                0.24     0.42       0.29     0.45       0.30     0.46
Woman head of household              0.06     0.24       0.08     0.27       0.08     0.27
Hindu                                0.98     0.13       0.89     0.31       0.89     0.32
Muslim                               0.01     0.07       0.01     0.11       0.10     0.30
Christian                            0.01     0.10       0.03     0.17       0.01     0.12
Sikh                                 0.00     0.00       0.00     0.05       0.00     0.04
Buddhist/Neo Buddhist                0.00     0.00       0.06     0.24       0.00     0.02
Zoroastrian or Parsi                 0.00     0.00       0.00     0.00       0.00     0.01
No Religion                          0.00     0.04       0.00     0.01       0.00     0.02
Declares to be SC                    0.66     0.47       0.81     0.39       0.00     0.00
N (aged 14+ at time of survey)          1,435               15,735              56,517
N (aged 18+ at time of survey)          1,202               13,124              47,445
N (cognitive skills sample)               374                4,099              14,411

Table 3: Effect of access to the SC status on years of schooling. OLS regressions.

                    Control group: OBC                          Control group: "Old SC"
                    (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
New SC*post 1967    0.264      0.324*     0.343*     0.356*     0.076      0.163      0.284      0.260
                    [0.163]    [0.185]    [0.196]    [0.197]    [0.184]    [0.174]    [0.257]    [0.243]
New SC                                                          -0.561     -0.614*
                                                                [0.345]    [0.340]
District Trend      N          Y          Y          Y          N          Y          Y          Y
Within Household    N          N          Y          Y          N          N          Y          Y
State-Cohort FE     N          N          N          Y          N          N          N          Y
N                   48490      48490      46407      46407      14322      14322      13620      13620
Number of groups                          16380      16380                            5026       5026

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE, jati FE, birth cohort FE, state trends, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE (all partialled out). Population aged 18 and above at time of the survey and born after 1950.
Table 4: Effect of access to the SC status on years of schooling, by gender. OLS regressions.

                             Control group: OBC                          Control group: "Old SC"
                             (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
New SC*post 1967             1.039***   1.073***   1.176***   1.192***   0.738***   0.766***   0.861***   0.844***
                             [0.239]    [0.262]    [0.250]    [0.264]    [0.252]    [0.251]    [0.317]    [0.324]
New SC * Female * post 1967  -1.608***  -1.556***  -1.747***  -1.756***  -1.403***  -1.333***  -1.410***  -1.320***
                             [0.260]    [0.272]    [0.385]    [0.386]    [0.325]    [0.326]    [0.454]    [0.446]
New SC * Female              0.897***   0.834***   1.043***   1.041***   0.957***   0.908***   0.973***   0.880**
                             [0.310]    [0.316]    [0.333]    [0.327]    [0.316]    [0.316]    [0.353]    [0.354]
Female * post 1967           0.300***   0.291***   0.395***   0.390***   -0.082     -0.103     -0.000     0.031
                             [0.099]    [0.098]    [0.100]    [0.101]    [0.196]    [0.197]    [0.269]    [0.261]
Female                       -2.937***  -2.931***  -3.061***  -3.064***  -2.840***  -2.835***  -2.985***  -3.007***
                             [0.125]    [0.126]    [0.143]    [0.145]    [0.148]    [0.157]    [0.155]    [0.157]
New SC                                                                   -1.000**   -1.015***
                                                                         [0.393]    [0.392]
District Trend               N          Y          Y          Y          N          Y          Y          Y
Within Household             N          N          Y          Y          N          N          Y          Y
State-Cohort FE              N          N          N          Y          N          N          N          Y
N                            48490      48490      46407      46407      14322      14322      13620      13620
Number of groups                                   16380      16380                            5026       5026

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE, jati FE, birth cohort FE, state trends, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE (all partialled out). Population aged 18 and above at time of the survey and born after 1950.

Table 5: Educational attainment, by gender. OLS regressions.
                             Control group: OBC                          Control group: "Old SC"
                             Schooling  Primary    Some       Secondary  Schooling  Primary    Some       Secondary
                                                   Secondary                                   Secondary
New SC*post 1967             0.125***   0.139***   0.134***   0.001      0.062      0.104***   0.100***   -0.013
                             [0.029]    [0.029]    [0.035]    [0.038]    [0.041]    [0.039]    [0.034]    [0.038]
New SC * Female * post 1967  -0.157***  -0.204***  -0.152***  -0.029     -0.124***  -0.183***  -0.121***  -0.012
                             [0.042]    [0.039]    [0.035]    [0.044]    [0.048]    [0.045]    [0.044]    [0.042]
New SC * Female              0.027      0.100***   0.090***   0.083***   0.039      0.119***   0.100***   0.050*
                             [0.043]    [0.036]    [0.033]    [0.027]    [0.046]    [0.039]    [0.035]    [0.030]
Female * post 1967           0.089***   0.068***   0.005      0.018*     0.050**    0.022      -0.034     -0.021
                             [0.010]    [0.011]    [0.012]    [0.010]    [0.020]    [0.024]    [0.028]    [0.015]
Female                       -0.347***  -0.327***  -0.290***  -0.191***  -0.356***  -0.318***  -0.282***  -0.151***
                             [0.018]    [0.015]    [0.013]    [0.008]    [0.023]    [0.020]    [0.017]    [0.012]
New SC                                                                   -0.075     -0.131***  -0.113**   -0.043
                                                                         [0.046]    [0.048]    [0.044]    [0.049]
N                            58755      58755      49249      49249      17163      17163      14322      14322

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE, district trends, jati FE, birth cohort FE, state trends, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE (all partialled out). Population aged 14 (for schooling and primary) or 18 (some secondary and secondary) and above at time of the survey and born after 1950.

Table 6: Skill acquisition, by gender. OLS regressions.
                             Control group: OBC       Control group: "Old SC"
                             Literacy   Cognitive     Literacy   Cognitive
                                        Skills                   Skills
New SC*post 1967             0.136***   0.070*        0.091**    0.026
                             [0.028]    [0.041]       [0.039]    [0.053]
New SC * Female * post 1967  -0.169***                -0.144***
                             [0.039]                  [0.047]
New SC * Female              0.039                    0.058
                             [0.043]                  [0.047]
Female * post 1967           0.061***                 0.031
                             [0.011]                  [0.020]
Female                       -0.355***                -0.363***
                             [0.017]                  [0.021]
New SC                                                -0.073     0.081*
                                                      [0.048]    [0.045]
N                            49253      12429         14324      4083

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE and trends, jati FE, birth cohort FE, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE (all partialled out). Cognitive skills regressions additionally control for categorical variables for number of household members aged 23 to 62 and the average education of non-respondent household members.

Table 7: Number of years of education: non urban samples.

                             Control group: OBC                          Control group: "Old SC"
                             Full       Without    Without    Rural      Full       Without    Without    Rural
                             sample     capitals   cities     sample     sample     capitals   cities     sample
                                        and large  and above                        and large  and above
                                        cities                                      cities
New SC*post 1967             1.153***   1.151***   1.019***   1.080***   0.766***   0.716***   0.597**    0.606**
                             [0.248]    [0.263]    [0.266]    [0.283]    [0.251]    [0.260]    [0.262]    [0.287]
New SC * Female * post 1967  -1.553***  -1.581***  -1.464***  -1.630***  -1.333***  -1.200***  -1.105***  -1.153***
                             [0.276]    [0.269]    [0.278]    [0.299]    [0.326]    [0.315]    [0.312]    [0.331]
New SC * Female              0.821***   0.961***   0.962***   1.357***   0.908***   0.963***   0.948***   1.123***
                             [0.300]    [0.309]    [0.299]    [0.314]    [0.316]    [0.319]    [0.307]    [0.329]
Female * post 1967           0.289***   0.221**    0.130      -0.059     -0.103     -0.292*    -0.356**   -0.524***
                             [0.099]    [0.095]    [0.098]    [0.104]    [0.197]    [0.167]    [0.171]    [0.171]
Female                       -2.931***  -3.012***  -3.042***  -3.118***  -2.835***  -2.843***  -2.854***  -2.842***
                             [0.125]    [0.125]    [0.127]    [0.127]    [0.157]    [0.169]    [0.173]    [0.190]
New SC                                                                   -1.015***  -1.089***  -1.089***  -0.917**
                                                                         [0.392]    [0.389]    [0.418]    [0.402]
N                            49249      45257      41674      34606      14322      12788      11846      10207

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE, district trends, jati FE, birth cohort FE, state trends, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE (all partialled out). Population aged 14 (for schooling and primary) or 18 (some secondary and secondary) and above at time of the survey and born after 1950.

Table 8: Descriptive statistics on cognitive skills.

                             New SC               Old SC               OBC
                             Mean     Std. Dev.   Mean     Std. Dev.   Mean     Std. Dev.
Cognitive skills variables
Whipple index                265.74   195.76      236.12   195.19      206.40   192.07
Cognitive skills             0.32     0.47        0.39     0.49        0.46     0.50
Cognitive skills 250         0.32     0.47        0.40     0.49        0.47     0.50
Cognitive skills 500         0.67     0.47        0.73     0.44        0.79     0.41
Cognitive skills self        0.56     0.50        0.61     0.49        0.66     0.48
N                               374                  4,099                14,103

Table 9: Varying definitions of cognitive skill dummy. OLS regressions.

Respondent's age not rounded Control group: OBC Whipple<500 Whipple<250 New SC*post 1967 0.083* [0.046] 0.072** [0.034] 0.016 [0.025] New SC N 12429 12429 12429 Respondent's age not rounded Control group: "Old SC" Whipple<500 Whipple<250 0.119* [0.063] 0.027 [0.053] 0.057* [0.033] -0.145*** 0.082* [0.051] [0.045] -0.054* [0.029] 4083 4083 4083

Standard errors two-way clustered at the jati and district level in brackets. All regressions include district FE and trends, jati FE, birth cohort FE, gender FE, religion of household head FE, urban FE, household size, and woman head of household FE, categorical variables for number of household members aged 23 to 62 and the average education of non-respondent household members (all partialled out). Population aged 18 and above at time of the survey and born after 1950.