Reimagining Data Architecture

The preamble of the Constitution of India emphasizes securing economic, social and political justice for all citizens, and several developmental programmes have been launched on this basis over the last six decades. As per a report from CARE Ratings, ~18.8% of India’s GDP is allocated for the implementation of development programs at the central and state level. Given the magnitude of this expenditure, it is imperative to understand whether such programmes are implemented efficiently on the ground and how critical a role they have played in attaining these objectives. The outputs/outcomes envisaged through the programmes can be assessed objectively only if relevant data is available. As emphasized in the World Development Report (WDR) 2021, without such data, the ability to hold governments accountable and track progress withers; data is a critical piece in the delivery of good governance. WDR 2021 also emphasizes that data can be leveraged for improving service delivery, prioritizing scarce resources, regulating the economy and markets, fostering public safety and security, and resolving disputes or conflicts.

Programme monitoring has changed drastically over the years, from monitoring of inputs to monitoring of outputs/outcomes today. These changes in monitoring requirements have also necessitated robust data architecture systems. Data architecture during the 1960s emphasized inputs such as expenses and revenue from specific projects/programs. The explicit objective was often cost-efficiency in resource utilization. However, such a system did not measure the impact of the programs; as a result, many programs continued despite their irrelevance to changing times. The Indian government, along with many other governments, has recognized the need for monitoring outputs/outcomes and institutionalized an Output-Outcome Monitoring Framework (OOMF) in 2018-19, wherein each ministry sets output/outcome targets at the beginning of the year and monitors the achievement of those targets over time. Though such a change is welcome, the government also needs to ensure a sound data architecture to feed into the output-outcome monitoring framework; otherwise, the system may lose its utility over time.

The principles of a sound data architecture mechanism require attention to three key aspects of data, namely: frequency, granularity and independence. The frequency at which data is made available is key for taking corrective action. For example, measuring outputs to ensure that pensioners are receiving a monthly pension from the government may require monthly collection of data, whereas measuring a graduate’s capacity to land a job may require collection of data only after completion of the graduate program. Therefore, data architecture systems should be tailored to collect data frequently and as per a program's requirements. The emergence of new data sources such as nightlight data, social media data, digital transaction data, etc., also provides a new opportunity for policy researchers to collect data at a much higher frequency than is possible through traditional, survey-based data collection or MIS.

The second aspect is granularity. The level at which data is captured is critical, especially for large, complex development programs. For example, effective monitoring of programs such as implementation of the National Food Security Act (NFSA) may require collection of data at the beneficiary level, while a road development program may require collection of data at the district or city level. If data is not available at the right level of granularity, key outputs/outcomes go unmonitored at the level where they matter. Data sources such as Call Data Records (CDRs) may facilitate much more granular monitoring, as was evident during COVID-19, including tracking people’s movement across borders and monitoring compliance with lockdown rules.

The third aspect is independence. The independence of the agency collecting the data helps improve the reliability and credibility of the data collected. WDR 2021 reveals a positive correlation between the independence of the National Statistical Office (NSO) and the Statistical Performance Index (an index to track the performance of the national statistical system). Data that is validated through independent mechanisms can also restore trust in the data generation process. The new forms of data are not immune to quality issues either. First, as the age-old data science adage “Garbage In, Garbage Out” warns, new forms of data will lose their utility if appropriate and accurate data is not captured. Second, algorithms can, at times, magnify the discriminatory assumptions made during data generation.

Besides these three aspects of data, given the rapid explosion of data and forms of data, data analysis is emerging as the fourth pillar of a sound data architecture system in modern times. Data analysis includes checking for anomalies in data, aggregating or disaggregating data, and data visualization. An often unspoken aspect of a sound data architecture system is the need for a sound institution. Institutions need to have a formal mandate, be sufficiently resourced, and have the technical capacity to institutionalize a robust data architecture system.
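
To make the analysis step concrete, the following is a minimal Python sketch of two of the operations named above: aggregating beneficiary-level data to the district level and checking for anomalies. The records, the district names, the function names and the 50% tolerance rule are all hypothetical and chosen only for illustration; they do not describe any actual government system or dataset.

```python
from collections import defaultdict
from statistics import median

# Hypothetical beneficiary-level records:
# (district, beneficiary_id, monthly_payment_in_rupees). Illustrative values only.
records = [
    ("District A", "B001", 1000),
    ("District A", "B002", 1000),
    ("District A", "B003", 0),      # possible missed payment
    ("District B", "B004", 1000),
    ("District B", "B005", 1000),
    ("District B", "B006", 10000),  # possible data-entry error
]

def aggregate_by_district(rows):
    """Roll beneficiary-level payments up to district totals."""
    totals = defaultdict(int)
    for district, _beneficiary, amount in rows:
        totals[district] += amount
    return dict(totals)

def flag_anomalies(rows, tolerance=0.5):
    """Flag records whose payment differs from the median payment
    by more than `tolerance` (a fraction of the median).
    This simple rule is an assumption made for the sketch."""
    typical = median(amount for _d, _b, amount in rows)
    return [
        (district, beneficiary, amount)
        for district, beneficiary, amount in rows
        if abs(amount - typical) > tolerance * typical
    ]

district_totals = aggregate_by_district(records)
anomalies = flag_anomalies(records)
print(district_totals)
print(anomalies)
```

The sketch also shows why granularity matters: the district totals alone would not reveal which beneficiary record is suspect, while the beneficiary-level check does.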

The presence of the Directorate of Economics and Statistics in every state, as well as in the Ministry of Statistics and Program Implementation (MoSPI), provides a sound institutional architecture. The Development Monitoring and Evaluation Office (DMEO), NITI Aayog, set up at the central government level, also adds to this institutional richness and is pivotal in pushing government programmes towards better outcomes. Many government flagship programmes have set up MIS dashboards to monitor utilization. However, the emergence of multiple IT systems that are not integrated with each other, and the collection of data by many agencies in different formats, raise concerns about the comparability of these data sets. The infrequent nature of data, the non-availability of data at the right granularity, especially at the programme level, and the lack of independent validation of data collected by the Ministry raise concerns about the utility of such data. Therefore, there is an urgent need for governments today to strengthen their capacities across the board to institutionalize a sound data architecture system in every department and state. A beginning has been made by the Government of India and a few states, where an output/outcome framework has been formulated and made an integral part of the annual budgetary process.

In a nutshell, a renewed focus on collecting data at frequent intervals and at the right granularity by independent agencies such as the Directorate of Economics and Statistics will support correct monitoring of various programmes at both the central and state level. The explosion of new data forms and technologies gives every level of government an opportunity to capture data at lower cost than earlier methods allowed. Making such data available to research communities will also ensure that government programmes are evaluated and will build trust in data systems. The shift towards outcome monitoring by the states and the central government, along with an alignment to the sustainable development goals, is a welcome step and must be pushed with greater vigour across the board.

 

Deepak Kumar is Economic Investigator at DMEO, NITI Aayog, and Venugopal Mothkoor is M&E Specialist at DMEO, NITI Aayog. Views expressed are personal.
