THE FUTURE ROLE OF BIG DATA AND MACHINE LEARNING IN HEALTH AND SAFETY INSPECTION EFFICIENCY

Introduction
Inspections are probably the most important policy instrument that governmental labour inspectorates use to ensure that companies take the necessary steps to comply with occupational health and safety regulations. However, the effect that inspections have depends on several different factors. One fundamental factor is the process of selecting inspection objects, i.e. companies or sites to inspect. In principle, there are at least three different selection approaches available. The first approach is to inspect all companies regardless of potential risk, company size, type of industry or any other criteria. The second approach involves selecting enterprises based on random sampling, where every company, regardless of any characteristic, has an equal probability of being selected. With regard to both preventive effect and economic cost, these two methods are usually seen as ineffective (Blanc, 2013). Thus, most labour inspectorates select objects on the basis of the third approach, namely the risk-based approach. In brief, the risk-based approach involves the selection of inspection objects on the basis of risk level.
Although the risk-based approach is an essential principle for most modern labour inspectorates, there are substantial challenges to applying it in practice. The main reason for this is that sufficiently fine-grained methods for risk analysis are lacking (Mischke et al., 2013). Without appropriate methods to make risk-based prioritisation possible, the risk-based approach runs the risk of becoming a governmental policy statement without tangible practical consequences. Hence, there is a need to develop methods that allow the targeting of high-risk companies (Weil, 2008).
Most labour inspectorates collect and store huge amounts of data related to their inspection objects and their inspection activities. Thus, inspectorates potentially possess large and rapidly growing volumes of data, nowadays referred to by the term ‘big data’. Big data, combined with machine learning technology, is being used at an increasing rate for different predictive purposes, by learning from hidden trends in the data. For example, the predictive value of big data and machine learning techniques is being tested in areas as diverse as cancer prognosis and patient outcomes, bankruptcy prediction, oil price prediction, tax fraud detection, crime prediction and stock market forecasting. The fundamental question being addressed in this paper, however, is whether or not the use of big data and machine learning technology to target high-risk inspection objects is a promising avenue for labour inspectorates.

Risk-based targeting
According to the best practice principles for regulatory policy, outlined by the Organisation for Economic Co-operation and Development (OECD, 2014), risk analysis and risk assessment should be the basis for targeting inspection objects for labour inspectorates. This means that companies should be selected for inspection on the basis of assessments of the probability and consequences of risk elements such as accidents, harmful exposure and illegal working conditions.
The foundation of risk-based targeting is the recognition that, because of limited inspection resources, it is not possible to control all risk areas and all risk objects. With regard to labour inspection authorities’ health and safety inspections, this means that some problem areas must be prioritised over others. Furthermore, some companies must be prioritised for inspection over others.
The principle of risk-based targeting is not a new one. Nearly 50 years ago, in the Robens Committee’s evaluation of the UK system for the supervision of safety and health at work, the risk-based approach (combined with self-regulation) was introduced as an ideal in the process of modernising regulatory inspection (Robens, 1972). To ensure the cost-effective use of inspection resources, the Robens Report recommended that the regulatory authority concentrate its resources selectively on the most serious problem areas and prioritise companies and problems that had been identified through the systematic analysis of all available data related to health and safety, e.g. statistics of accidents, technical information and inspectorates’ local knowledge.
The recommendations of the Robens Report have been extensively adopted by labour inspectorates internationally and in the EU Member States. The spread of the risk-based approach means that most modern labour inspectorates have embraced the idea of pulling back resources from low-risk objects and concentrating more resources for enforcement on objects with the highest risks. To make this possible, some type of data analysis is necessary. Analytical methods for identifying high-risk industries and risk-exposed groups of workers are well developed. Such risk-based analyses are typically based on national statistics related to, for example, occupational diseases, work-related accidents and occupational exposures. The analyses constitute the foundation of inspection campaigns, strategic plans and national, and even international, priority areas.
Far less common than the broad risk-based analyses are methods that make prioritisation across companies within an industry possible. Among labour inspectorates, a common approach to targeting concrete risk-exposed companies is to rely on the local knowledge of inspectors. Some labour inspectorates, such as those of Denmark and Sweden, have explored the usability of risk-ranking systems based on additive scales. By using additive scales, each company is given risk scores based on several company characteristics (e.g. size, type of industry and number of registered accidents) that are added to form a sum score, and those with the highest sum scores are prioritised for inspection. However, the problem with using such additive scales is that they display relatively low levels of predictive validity, i.e. the score is not particularly appropriate for separating high-risk enterprises from low-risk ones.
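The additive scale described above can be sketched in a few lines of code. This is only an illustration: the point values, the size bands, the cap on accident points and all names (`Company`, `additive_risk_score`, `rank_for_inspection`) are hypothetical and are not taken from the Danish or Swedish systems.

```python
from dataclasses import dataclass


@dataclass
class Company:
    name: str
    employees: int
    industry: str
    registered_accidents: int


# Hypothetical point tables, chosen only for illustration.
INDUSTRY_POINTS = {"construction": 3, "manufacturing": 2, "office": 0}
DEFAULT_INDUSTRY_POINTS = 1


def size_points(employees: int) -> int:
    """Larger companies receive more points (hypothetical bands)."""
    if employees >= 250:
        return 3
    if employees >= 50:
        return 2
    if employees >= 10:
        return 1
    return 0


def additive_risk_score(company: Company) -> int:
    """Add the points for each characteristic to form one sum score."""
    accident_points = min(company.registered_accidents, 3)  # capped at 3
    industry_points = INDUSTRY_POINTS.get(company.industry, DEFAULT_INDUSTRY_POINTS)
    return size_points(company.employees) + industry_points + accident_points


def rank_for_inspection(companies: list[Company]) -> list[Company]:
    """Companies with the highest sum score come first."""
    return sorted(companies, key=additive_risk_score, reverse=True)


companies = [
    Company("A", 300, "construction", 5),
    Company("B", 5, "office", 0),
    Company("C", 60, "manufacturing", 1),
]
ranked = rank_for_inspection(companies)
```

In a scale of this kind the point values are set by hand and stay fixed; nothing in the calculation adjusts them in the light of what inspections subsequently find.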

Big data and machine learning
The process of prioritising across companies is comparable to finding needles in a haystack. In this case, the haystack potentially consists of hundreds of thousands of possible inspection objects, but only a certain number of these objects are needles, i.e. have an intolerable level of risk. Finding needles in a haystack is to a large degree what big data and machine learning are all about.
The main objective of machine learning algorithms is to provide a statistical model that can be utilised to perform predictions, classifications, estimations or similar tasks. Within the field of, for example, cancer prediction, researchers have, for more than three decades, utilised machine learning algorithms to predict cancer susceptibility, cancer recurrence and cancer survival.
Thematically, cancer prediction is far removed from the risk-based targeting of inspection objects. However, both are examples of predictive challenges, or needles-in-a-haystack-type problems.
The two main types of machine learning algorithms are supervised learning and unsupervised learning. In supervised learning, the algorithm is given a dependent variable (e.g. risk level) that is to be predicted from a set of independent variables. Accurate predictions, of course, require high levels of correlation between the independent variables and the dependent variable. In unsupervised learning, there is no dependent variable to predict; instead, the objective of the algorithm is to cluster the data into groups (e.g. different risk groups) by similarity. In contrast to additive scales, such as those explored by the labour inspectorates of Denmark and Sweden, the algorithms used in machine learning progressively improve their predictions, primarily by trial and error. This means that the machine learns from past successes (correct predictions) and errors (wrong predictions) and attempts to capture this knowledge to make predictions more precise based on the feedback received.
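The difference between the two types can be made concrete with a minimal sketch. The supervised part fits a one-variable logistic regression by gradient descent, so that each pass over the data corrects the parameters according to the prediction errors; the unsupervised part is a simple two-group clustering of unlabelled values. The data, function names and settings (`fit_logistic`, `two_means`, the learning rate and the number of passes) are all invented for illustration and do not describe any inspectorate's system.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(xs, ys, learning_rate=0.1, epochs=2000):
    """Supervised: learn a weight and a bias from labelled examples.

    Each pass compares the predictions with the known labels and nudges
    the parameters in the direction that reduces the prediction error.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        errors = [sigmoid(w * x + b) - y for x, y in zip(xs, ys)]
        w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / n
        b -= learning_rate * sum(errors) / n
    return w, b


def two_means(values, iterations=10):
    """Unsupervised: split unlabelled values into two groups by similarity."""
    low, high = min(values), max(values)  # initial group centres
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assign each value to the nearest centre, then recompute the centres.
        labels = [0 if abs(v - low) <= abs(v - high) else 1 for v in values]
        group0 = [v for v, g in zip(values, labels) if g == 0]
        group1 = [v for v, g in zip(values, labels) if g == 1]
        if group0:
            low = sum(group0) / len(group0)
        if group1:
            high = sum(group1) / len(group1)
    return labels, (low, high)


# Toy data (invented): number of accident notifications per company.
notifications = [0, 1, 2, 3, 4, 5]
serious_deviation = [0, 0, 0, 1, 1, 1]  # labels, used only by the supervised model

w, b = fit_logistic(notifications, serious_deviation)
p_low = sigmoid(w * 0 + b)   # predicted probability for a company with 0 notifications
p_high = sigmoid(w * 5 + b)  # predicted probability for a company with 5 notifications

labels, centres = two_means([1, 2, 3, 20, 21, 22])
```

The supervised model needs the `serious_deviation` labels in order to learn, whereas `two_means` receives only the raw values and has to find the groups itself.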

Utilising big data and machine learning in selecting inspection objects
Supervised and unsupervised learning algorithms require a sufficient volume of data, with regard to both the number of observations and the number of variables, usually referred to as ‘features’. As already noted, most labour inspectorates collect and store huge amounts of data related to their inspection objects and their inspection activities. The available data typically relate to company-specific features such as the number of employees, company age, industrial grouping, the number of previous inspections, results of previous inspections and notifications of accidents. Furthermore, the amount of data increases day by day as results from new inspections are added. In principle, then, the challenge of targeting high-risk companies by utilising big data should, at least at first glance, be well suited to machine learning algorithms. Despite this, there have been few such attempts. There are, however, a few notable exceptions, all of which illustrate that big data and machine learning could be highly relevant for labour inspectorates to solve the challenge of targeting high-risk inspection objects.
The first example is a research study that explored the suitability of machine learning methodologies for the prediction of workplace accidents, or, more specifically, floor-level falls (Matías et al., 2008). Despite its relatively precise predictions, the drawback of this study is that the features included in the algorithms are not the type of data that labour inspectorates normally possess (e.g. use of personal protective equipment and housekeeping practices). Furthermore, floor-level falls represent only a tiny proportion of the workplace risks that labour inspectorates are concerned with.
The second example is also a research study (Hajakbari and Minaei-Bidgoli, 2014). This study developed a scoring system for predicting the risk of occupational accidents. Moreover, the study concluded that it is possible to predict the risk of different types of occupational accidents relatively precisely on the basis of some general company characteristics (a company’s main activity, gender distribution, number of employees, etc.). Furthermore, the study concluded that the algorithm could be utilised to identify workplaces that need periodic health and safety inspections. The data used in this study were retrieved from the database of a labour inspectorate. The drawback of the study, however, is again that workplace accidents represent only one of the many workplace risks that labour inspectorates deal with. Furthermore, a particular problem with relying on injury statistics is that such data are known to be highly vulnerable to underreporting.
The third example is a tool developed by the Norwegian Labour Inspection Authority (NLIA) to assist inspectors in selecting enterprises with regard to risk (Dahl et al., 2018). The tool, named the Risk Group Prediction Tool (RGPT), differentiates enterprises into four groups based on predicted risk: lowest risk, low-risk, high-risk and highest risk enterprises. The higher the risk group of a given company, the higher the probability that a future inspection in this company will identify serious deviations from health and safety regulation compliance. The group a company is assigned to is visible to inspectors via the NLIA’s internal web-based user interface. Hence, when targeting companies for inspection, inspectors are informed about companies’ risk groups and are thus able to make risk-informed selections.
The RGPT was built on the basis of predictive modelling by means of a machine learning algorithm using so-called binary logistic regression analysis. On the basis of the regression model, all companies in Norway (approximately 230,000) are assigned to one of the four risk groups. This is done in two steps. In the first step, the regression model predicts the probability that a future inspection will identify serious deviations from health and safety regulation compliance. In the second step, the model uses the probability value predicted to assign the company to a risk group.
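The two steps can be illustrated with a short sketch. It is not the RGPT itself: the feature names, the coefficients, the intercept and the probability cut-offs between the four groups are all hypothetical, since the source does not report them.

```python
import math

# Hypothetical model parameters, chosen only for illustration.
INTERCEPT = -2.0
COEFFICIENTS = {
    "previous_deviations": 0.8,
    "accident_notifications": 0.5,
    "log_employees": 0.1,
}

# Hypothetical cut-offs: a probability below the cut-off falls in that group.
GROUP_CUTOFFS = [(0.25, "lowest risk"), (0.50, "low risk"), (0.75, "high risk")]


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_probability(features: dict) -> float:
    """Step 1: probability that an inspection finds serious deviations."""
    z = INTERCEPT + sum(
        coef * features.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )
    return sigmoid(z)


def assign_risk_group(probability: float) -> str:
    """Step 2: map the predicted probability to one of four risk groups."""
    for cutoff, label in GROUP_CUTOFFS:
        if probability < cutoff:
            return label
    return "highest risk"


clean_record = {"previous_deviations": 0, "accident_notifications": 0}
poor_record = {"previous_deviations": 3, "accident_notifications": 2}

group_clean = assign_risk_group(predict_probability(clean_record))
group_poor = assign_risk_group(predict_probability(poor_record))
```

Keeping the two steps separate means that the cut-offs between groups can be changed without refitting the regression model.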

Initially, the tool was developed on the basis of registrations from approximately 35,000 health and safety inspections carried out by the NLIA. However, the predictions made by the tool gradually and automatically become more precise as the number of inspections increases. This means that the algorithm adjusts itself based on the feedback (correct or faulty predictions) it receives when new inspections are carried out and registered in the NLIA’s database.
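One way in which such self-adjustment could work is sketched below. The source does not say how the RGPT is updated (it may, for instance, simply be refitted on the enlarged data set), so the incremental update shown here is an assumption. All numbers and names (`predict`, `update`, the learning rate) are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, bias, features):
    """Predicted probability that an inspection finds serious deviations."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def update(weights, bias, features, outcome, learning_rate=0.05):
    """Adjust the model after one newly registered inspection.

    `outcome` is 1 if serious deviations were found and 0 otherwise. The
    larger the gap between prediction and outcome, the larger the correction.
    """
    error = outcome - predict(weights, bias, features)
    new_weights = [w + learning_rate * error * x for w, x in zip(weights, features)]
    new_bias = bias + learning_rate * error
    return new_weights, new_bias


# Hypothetical starting model and one hypothetical company.
weights, bias = [0.2, 0.4], -1.0
features = [2.0, 1.0]

before = predict(weights, bias, features)
# The inspection found serious deviations although the model gave < 50 %.
weights, bias = update(weights, bias, features, outcome=1)
after = predict(weights, bias, features)
```

After the update, the predicted probability for a company with these features is higher than before, which is the sense in which a faulty prediction feeds back into the model.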
The RGPT falls within the class of supervised learning algorithms, where health and safety inspections resulting in serious deviations (dependent variable) are to be predicted from a set of company characteristics (features). The features that the RGPT uses are general company characteristics such as company size, industrial group, number of previous inspections, results from previous inspections, company age, geographical localisation and notifications of accidents.
The predictive validity of the tool is checked every month, and the experience thus far (after approximately 18 months of testing) is that the algorithm manages to target companies with a high risk extremely precisely. This means that there are few false positives and few false negatives, i.e. few inspections within the lowest risk group result in the identification of serious deviations, whereas the vast majority of inspections within the highest risk group do result in the identification of serious deviations. The low- and high-risk groups fall between the two extremes.
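A check of this kind can be sketched as a comparison of the share of inspections that found serious deviations in each predicted risk group. The inspection records below are invented, and the function names are hypothetical; the source does not describe how the NLIA performs its monthly check.

```python
from collections import defaultdict

GROUPS = ["lowest risk", "low risk", "high risk", "highest risk"]


def deviation_rate_by_group(inspections):
    """Share of inspections with serious deviations, per predicted risk group.

    `inspections` is an iterable of (predicted_group, deviation_found) pairs.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [inspections, with deviations]
    for group, deviation_found in inspections:
        counts[group][0] += 1
        counts[group][1] += int(deviation_found)
    return {g: counts[g][1] / counts[g][0] for g in GROUPS if counts[g][0] > 0}


def rates_increase_with_risk(rates):
    """True if the deviation rate never falls from one group to the next."""
    ordered = [rates[g] for g in GROUPS if g in rates]
    return all(a <= b for a, b in zip(ordered, ordered[1:]))


# Invented records: 10 inspections per group.
inspections = (
    [("lowest risk", True)] * 1 + [("lowest risk", False)] * 9
    + [("low risk", True)] * 3 + [("low risk", False)] * 7
    + [("high risk", True)] * 6 + [("high risk", False)] * 4
    + [("highest risk", True)] * 9 + [("highest risk", False)] * 1
)
rates = deviation_rate_by_group(inspections)
```

A low rate in the lowest risk group corresponds to few false negatives, and a high rate in the highest risk group corresponds to few false positives.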
The findings from using the tool developed by the NLIA demonstrate that it is possible to target inspection objects by means of utilising big data and machine learning. Similar machine learning approaches have also been tested by at least two other European labour inspectorates with promising results: the Swedish Work Environment Authority (Ridemar, 2018) and the Dutch Inspectorate SZW (Jacobusse and Veenman, 2016). However, the Norwegian tool is not necessarily transferable to other labour inspectorates, as its use depends on how data are stored, data quality, data access and database structure. Furthermore, targeting companies on the basis of the tool involves the acceptance of the way risk is defined and operationalised in the algorithm.
As described, the tool is based on a definition of risk that implies that the higher the risk group of a given company, the higher the probability that an inspection in this company will identify serious deviations from health and safety regulation compliance. This means that the tool is primarily concerned with so-called management and control risks and not inherent risks. Whereas management and control risks arise from a company’s ability and willingness to manage risk (e.g. by means of complying with the relevant regulations), inherent risks are those that arise from the nature of a business’s activities (e.g. fall from heights, chemical exposure and musculoskeletal strain).
In practice, management and control risks and inherent risks are related. This, however, does not imply that the two types of risks are necessarily highly empirically correlated. Hence, relying blindly on tools that target companies based on one type of risk might result in another type being missed.
Within the Norwegian regulatory regime, this challenge is addressed by emphasising inherent risks when identifying priority areas, risk-exposed clusters of workers and high-risk industries, whereas management and control risks are emphasised when targeting companies specifically.

Challenges
The fact that management and control risks on the one side and inherent risks on the other are not necessarily empirically correlated leads us to another, probably even greater, challenge in applying big data and machine learning algorithms to risk-based targeting. The three examples of machine learning tools above are all examples of one-dimensional targeting, i.e. targeting based on one particular definition and operationalisation of risk. Risks in the world of work, however, are not of only one particular type. Hence, enforcing authorities are concerned with multiple types of risks, e.g. accidents, chemical exposure, biological exposure, psychosocial threats, musculoskeletal risk factors and social dumping. Within these types of risks, there are even more subtypes. Developing risk models that manage to capture this variety is very challenging, because the different types of risks do not necessarily correlate. Hence, capturing this variety is quite different from predicting the probability of one particular type of risk (Dahl et al., 2018).
A second, but related, challenge makes the task of risk-based targeting even more complex. This is the so-called political pitfall (Black, 2010). Even though machine learning algorithms are dynamic, in the sense that they can learn from and adapt to successes and errors, they cannot take different political points of view into account. First, the political context is fickle. Thus, the types of risks that are worthy of prioritisation today might not be worthy of prioritisation tomorrow. Second, the political context is multifaceted. Thus, different stakeholders, e.g. politicians, employers, employees, the media and the public, hold different views on which types of risks are worthy of prioritisation. This illustrates that risk in the world of work is not necessarily an objective entity, but a social construction.
A third challenge, worthy of consideration, is related to the fact that, even though labour inspectorates possess huge amounts of data related to their inspection objects, these data usually relate to the company level, and company-level data are not necessarily the most appropriate data to consider (see, for example, Gunningham and Sinclair, 2007). In a database, a unique company is typically identified by a unique identifier such as an organisation number. The ability of a machine learning algorithm to assign a given predicted risk value to a given company is dependent on unique identifiers. However, not all potential inspection objects are automatically identifiable by a unique identifier. For example, within the construction industry it is not necessarily a concrete company that is targeted for inspection, but a temporary construction site. There are at least two challenges related to such temporariness. First, construction sites and other temporary locations of work might not be identifiable by unique identifiers. Second, even if they were identifiable, the temporariness means that a machine learning algorithm might not be given the chance to learn from its predictive successes and errors before the construction site ceases to operate and the companies that were using the site have moved to new constellations at a new site.


Concluding remarks
The challenges described above illustrate that there are some significant difficulties related to targeting high-risk inspection objects by utilising big data and machine learning techniques. However, these challenges do not in any way erase the usefulness of such techniques within a risk-based approach. Rather, the challenges illustrate that risk-based targeting will probably not benefit from relying completely on machine learning algorithms. The Norwegian example above illustrates this. Rather than allowing the algorithm to pick and choose objects directly, the inspectors are allowed to make risk-informed decisions on the basis of the predictions that the algorithm makes. This involves a combination of artificial and human intelligence, where each complements the strengths of the other. When it comes to predictions of complex social events in general, combining the two types of intelligence is probably a necessity.