Managing the Corruption Perception through Master Data Management



by Burhan Rasool    29 May 2021

Governments make policies around data. Erroneous or incomplete data is misleading and leads to arbitrary decisions. For example, a few years ago, the Government of Punjab (Pakistan) wanted to give a subsidy to the manufacturing sector to curb child labor. The official number of manufacturing industries was 23,392. To double-check this number, an independent third-party survey was carried out, with the location of every such industry geo-tagged. The survey revealed that, on the ground, there were in fact 66,318 manufacturing industries.

Imagine that the Government had decided to give a subsidy of one million rupees per industry. About 23.4 billion rupees would have been allocated in the Annual Development Budget, instead of the roughly 66.3 billion required. Nearly two-thirds of the manufacturing industries would not have received the promised subsidy and would have cried foul. This would have built up the wrong perception that there was “definitely some corruption” involved.
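The arithmetic behind this scenario can be checked directly. The short sketch below uses only the two industry counts quoted above and the hypothetical subsidy of one million rupees per industry; the function and variable names are illustrative.

```python
# Counts quoted in the article; the subsidy is the hypothetical amount
# used in the scenario, not an actual government figure.
OFFICIAL_COUNT = 23_392        # industries on the official record
SURVEYED_COUNT = 66_318        # industries found by the geo-tagged survey
SUBSIDY_PER_INDUSTRY = 1_000_000  # rupees


def budget_gap(official, surveyed, subsidy):
    """Return allocated budget, required budget, shortfall, and uncovered share."""
    allocated = official * subsidy
    required = surveyed * subsidy
    shortfall = required - allocated
    uncovered_share = (surveyed - official) / surveyed
    return allocated, required, shortfall, uncovered_share


allocated, required, shortfall, uncovered_share = budget_gap(
    OFFICIAL_COUNT, SURVEYED_COUNT, SUBSIDY_PER_INDUSTRY
)

print(f"Allocated: {allocated / 1e9:.1f} billion rupees")
print(f"Required:  {required / 1e9:.1f} billion rupees")
print(f"Shortfall: {shortfall / 1e9:.1f} billion rupees")
print(f"Industries left without subsidy: {uncovered_share:.1%}")
```

With the exact counts, the allocation comes to about 23.4 billion rupees against a requirement of about 66.3 billion, leaving close to 65 percent of industries uncovered.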

To address the corruption perception issue, it is critically important that data, rather than verbosity or confidence, wins at the policy table. For this, the Bureau of Statistics must be revamped along the following lines:

  1. Knowledge of the latest scientific methods adopted by the developed world to collect data. For this, the Bureau must have funds to send its statisticians abroad to attend international conferences on data collection and statistics.
  2. The right Human Resources, and
  3. The best set of technologies available in the market

Master Data Management is the main building block in this regard. It is defined as “Data shared across teams and IT applications that define key information of an organization, company or public sector departments such as assets, locations, reference codes, financial hierarchies, products, customers or suppliers.”

There are four main styles of Master Data Management (MDM):

  1. Centrally authored: In this style, data is authored in the MDM, and other systems subscribe to the MDM platform for master data (or the MDM pushes the data into downstream applications).
  2. Consolidation: Source systems feed data into the MDM platform for consolidation into golden records.
  3. Coexistence: A mashup of centrally authored and consolidation styles that allows for data creation in multiple systems (including the MDM platform).
  4. Registry: Rather than consolidating records, registry-style MDM joins and aligns unique identifiers across all the systems into intersection tables.
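To make the difference between the consolidation and registry styles concrete, here is a minimal sketch. The two departmental systems, their records, the name-and-district matching rule, and the "largest head count wins" survivorship rule are all assumptions made for illustration; a real MDM platform would use far more careful matching.

```python
from collections import defaultdict

# Hypothetical records for the same factories held in two departmental systems.
SOURCES = {
    "labour": [
        {"id": "LD-001", "name": "Acme Textiles ", "district": "Lahore", "workers": 120},
        {"id": "LD-002", "name": "Sialkot Sports Goods", "district": "Sialkot", "workers": None},
    ],
    "industries": [
        {"id": "IND-77", "name": "acme textiles", "district": "Lahore", "workers": 135},
        {"id": "IND-78", "name": "Sialkot Sports Goods", "district": "Sialkot", "workers": 60},
    ],
}


def match_key(record):
    """Naive matching rule (an assumption): same normalized name and district."""
    return (record["name"].strip().lower(), record["district"].strip().lower())


def group_by_entity(sources):
    """Group records from all systems that appear to describe the same entity."""
    groups = defaultdict(list)
    for system, records in sources.items():
        for record in records:
            groups[match_key(record)].append((system, record))
    return groups


def consolidate(groups):
    """Consolidation style: merge each group into a single golden record."""
    golden = []
    for (name, district), members in groups.items():
        counts = [r["workers"] for _, r in members if r["workers"] is not None]
        golden.append({
            "name": name.title(),
            "district": district.title(),
            # Survivorship rule (assumed): keep the largest reported head count.
            "workers": max(counts) if counts else None,
            "sources": sorted(system for system, _ in members),
        })
    return golden


def build_registry(groups):
    """Registry style: leave data in place and store only an ID cross-reference."""
    return [
        {"master_id": f"M-{i:04d}", **{system: r["id"] for system, r in members}}
        for i, members in enumerate(groups.values(), start=1)
    ]


groups = group_by_entity(SOURCES)
golden_records = consolidate(groups)
registry = build_registry(groups)
print(golden_records)
print(registry)
```

The consolidation output holds the merged attribute values themselves, while the registry output is just an intersection table of identifiers pointing back at the source systems.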

 

For the Government of Punjab (Pakistan), the consolidation style is expected to yield the best results, given the large number of applications already in use and the volume of data they generate. This MDM style has the following eight core implementation steps:

  1. Identification of Data Sources: In the first step, all the datasets available with an organization/department are identified.
  2. Data Cleansing, Conflict Resolution, and Profiling: Here, data is cleansed and profiled, and any conflicts in the data are resolved. Data is profiled into different types and categories for efficient management.
  3. Internal Master Data Repository: During profiling, data is classified into three main categories: shareable, data commons, and sensitive. All three categories are saved together in a repository called the Internal Master Data Repo (IMDR) for internal consumption.
  4. Data Anonymization: Security of the data is the utmost priority when defining any system related to data. It is important to ensure that data classified as shareable is anonymized before loading into External Master Data Repo (EMDR) so that no personal or sensitive information is identifiable.
  5. External Master Data Repository: After anonymization, the anonymized data and data commons (reference data) are saved in a repo for external consumption called the External Master Data Repo (EMDR). Golden records (or big numbers) aggregated from different datasets can also be stored separately in the EMDR for instant access by consumers.
  6. Data Service Bus: In addition to direct access to EMDR, data consumers can also subscribe to receive periodic updates through a service bus called Data Service Bus.
  7. Access Control Layer: This step covers the security protocols that govern how data is shared with and accessed by external parties, based on an access control mechanism.
  8. Metadata (or Data Catalog): Metadata is information about the available data. This layer includes a catalog of available datasets.
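Steps 3, 4, 5, and 8 above can be sketched in a few lines. Everything here is hypothetical: the dataset names, the field names, the choice of which fields count as personal, and the secret key. Replacing an identifier with a keyed hash is strictly pseudonymization, so treat this as an illustration of the data flow rather than a sufficient anonymization scheme.

```python
import hashlib
import hmac

# Hypothetical IMDR contents, already profiled into the three categories.
IMDR = [
    {"dataset": "factory_register", "category": "shareable", "rows": [
        {"cnic": "35202-1234567-1", "owner": "A. Khan", "district": "Lahore", "workers": 120},
        {"cnic": "35202-7654321-9", "owner": "B. Ali", "district": "Lahore", "workers": 80},
    ]},
    {"dataset": "district_codes", "category": "data commons", "rows": [
        {"code": "LHR", "district": "Lahore"},
    ]},
    {"dataset": "tax_audits", "category": "sensitive", "rows": [
        {"cnic": "35202-1234567-1", "finding": "under review"},
    ]},
]

PERSONAL_FIELDS = {"owner"}       # assumed: dropped entirely
IDENTIFIER_FIELDS = {"cnic"}      # assumed: replaced by a keyed-hash token
SECRET_KEY = b"replace-with-a-real-secret"  # placeholder key for the sketch


def anonymize_row(row):
    """Drop personal fields and replace identifiers with a keyed-hash token."""
    out = {}
    for field, value in row.items():
        if field in PERSONAL_FIELDS:
            continue
        if field in IDENTIFIER_FIELDS:
            digest = hmac.new(SECRET_KEY, str(value).encode(), hashlib.sha256).hexdigest()
            out[field + "_token"] = digest[:12]
        else:
            out[field] = value
    return out


def build_emdr(imdr):
    """Publish shareable data (anonymized) and data commons; keep sensitive data internal."""
    emdr = {}
    for entry in imdr:
        if entry["category"] == "sensitive":
            continue
        if entry["category"] == "shareable":
            emdr[entry["dataset"]] = [anonymize_row(r) for r in entry["rows"]]
        else:  # data commons (reference data) is published as-is
            emdr[entry["dataset"]] = [dict(r) for r in entry["rows"]]
    return emdr


def build_catalog(emdr):
    """Metadata layer: list each published dataset with its size and fields."""
    return [
        {"dataset": name, "row_count": len(rows), "fields": sorted(rows[0]) if rows else []}
        for name, rows in emdr.items()
    ]


emdr = build_emdr(IMDR)
catalog = build_catalog(emdr)
# A "big number" aggregated from the published data, stored for quick access.
total_workers = sum(row["workers"] for row in emdr["factory_register"])
print(catalog)
print(total_workers)
```

The sensitive dataset never leaves the internal repository, and the catalog describes only what has actually been published to the EMDR.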

If properly implemented, this will make the right data available to the right people at the right time and in the right place.

Now imagine making all useful datasets digitally accessible (after anonymizing personal and sensitive information) to businessmen, traders, merchants, and budding entrepreneurs. These datasets may include, but are not limited to, (1) population census; (2) sale/purchase of movable and immovable properties; (3) imports/exports; (4) toll taxes collected on highways and motorways; (5) domestic air and rail travel; (6) goods transported via rail, road, and air; (7) ships docked at seaports; (8) sale/purchase of commodities in wholesale markets, etc. This master data repository would enable them to gauge their potential consumer base and market size, and to plan accordingly for their respective product or service at any given point in time. Considering how impactful this could be, one cannot help but wonder: what is stopping us from opening these datasets and making them searchable via free-text mechanisms like Google’s search engine? Why can’t we have web services for real-time data sharing? And what is the harm in the government monetizing its data, just like Dubai Pulse, wherein the government gives a certain amount of generic data for free and then charges a small fee for each more detailed level of granularity? All of this can be done digitally over the internet, and payments for more specific datasets can be made online via credit or debit cards.
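As a rough idea of what free-text search over such a catalog could look like, here is a minimal sketch. The catalog entries paraphrase a few of the datasets listed above; the entry names and the simple word-overlap scoring are assumptions, and a real search service would use a proper search engine.

```python
import re

# Hypothetical catalog entries based on datasets mentioned in the article.
CATALOG = [
    {"name": "population_census", "description": "Population census counts by district"},
    {"name": "property_transactions", "description": "Sale and purchase of movable and immovable properties"},
    {"name": "highway_tolls", "description": "Toll taxes collected on highways and motorways"},
    {"name": "wholesale_commodities", "description": "Sale and purchase of commodities in wholesale markets"},
]


def tokenize(text):
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def search(query, catalog=CATALOG):
    """Return dataset names ranked by how many query words they share."""
    terms = tokenize(query)
    scored = []
    for entry in catalog:
        words = tokenize(entry["name"] + " " + entry["description"])
        score = len(terms & words)
        if score:
            scored.append((score, entry["name"]))
    # Highest score first; ties are broken alphabetically by dataset name.
    return [name for _, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


print(search("toll taxes on motorways"))
print(search("sale purchase"))
```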

 

The writer is a member of the Prime Minister of Pakistan’s Task Force on Austerity & Restructuring Government and General Manager, Punjab Information Technology Board, Government of Punjab (Pakistan).
