MOJ statistics: intentional obfuscation?
I have been looking further at the statistics released by the MOJ on 24 May. As well as the report and supporting tables with their dubious "success rates" (see "MOJ statistics: manipulation and omission"), the MOJ Chief Statistician released a text file of record-level data. This file contains the data for the 26,059 language service requests on which all the tables and statistics are based… Well, not quite.
The raw data has been doctored to make independent analysis difficult: 4,446 records - 17% of the total - have had the applicable "Language" suppressed and replaced by "Not disclosed". The notes provided with the record-level data state that this has been done "to protect the privacy of individuals" in any instance where there is only one interpreting request for a given combination of court, month and language.
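To make the suppression rule concrete, here is a minimal sketch of how I read the MOJ's notes: any request that is the only one for its combination of court, month and language has its language replaced by "Not disclosed". The field names (`court`, `month`, `language`) and the sample rows are my own assumptions for illustration; they are not taken from the MOJ file, and this is not necessarily how the MOJ implemented the rule.

```python
from collections import Counter

def suppress_singletons(records):
    """Mask the language of any record that is the only request for its
    (court, month, language) combination. Field names are hypothetical."""
    counts = Counter((r["court"], r["month"], r["language"]) for r in records)
    return [
        {**r, "language": "Not disclosed"}
        if counts[(r["court"], r["month"], r["language"])] == 1
        else dict(r)
        for r in records
    ]

# Invented sample rows, purely for illustration.
sample = [
    {"court": "Court A", "month": "M1", "language": "Tamil"},
    {"court": "Court A", "month": "M1", "language": "Polish"},
    {"court": "Court A", "month": "M1", "language": "Polish"},
    {"court": "Court B", "month": "M2", "language": "Tamil"},
]
masked = suppress_singletons(sample)
```

In this made-up sample both Tamil requests are masked while both Polish requests survive, which shows why less frequently requested languages lose proportionally more of their data under a rule like this.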
This "justification" doesn't bear examination. First and foremost, the data set contains no data relating to individuals. How does the fact that a specific court (or group of courts) had a language service request which was fulfilled, not fulfilled or cancelled in a particular month affect any individual's privacy? It should be borne in mind that this data is being provided one to three months after the event. In any case, the interpreter's identity in any specific case is not secret: the vast majority of court hearings are open to the public and the interpreter will commonly give his or her name to the judge in open court.
So what does this restriction achieve? It makes many additional analyses of the data either incomplete or impossible. In particular, it would assist in covering up high failure rates in certain languages. For example, say I wanted to calculate the "fulfilment rate" for Tamil, as opposed to the dubious "success rate" shown in Table 2. By summing data from Tables 8, 9 and 14, I can calculate that there were 621 requests for Tamil in total, but the record-level data contains only 541 records for Tamil. In other words, 80 requests, or 13% of the Tamil data, are hidden in "Not disclosed", so full analysis is impossible.
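The arithmetic behind these percentages is simple enough to check. The sketch below uses only the figures quoted in this post; the function name `hidden_share` is my own.

```python
def hidden_share(total_requests, disclosed_records):
    """Return (hidden count, hidden percentage rounded to a whole number)."""
    hidden = total_requests - disclosed_records
    return hidden, round(100 * hidden / total_requests)

# Tamil: 621 requests per Tables 8, 9 and 14; 541 records carry the language.
tamil_hidden, tamil_pct = hidden_share(621, 541)

# Whole file: 26,059 records, of which 4,446 are "Not disclosed".
overall_pct = round(100 * 4446 / 26059)

print(tamil_hidden, tamil_pct, overall_pct)
```

The same calculation works for any language whose total can be recovered from the published tables.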
The situation for Lithuanian, Vietnamese and Latvian is even worse because these languages are not included in Table 14, so there is no way of knowing how many requests there were in total for each of them, and therefore no way of knowing what percentage is hidden in "Not disclosed". Even for very frequently used languages such as Polish, the "Not disclosed" category has a significant impact. I wanted to look at the figures for Polish for my region, the North West, but the 310 Polish records concealed in "Not disclosed", which can't be allocated to a region, outnumber the 239 Polish requests definitely associated with the North West!
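To show how wide the uncertainty is for Polish in the North West, here is a small sketch of the bounds. It assumes, as a worst case, that every one of the 310 unallocated Polish records could belong to the North West; the function name `regional_bounds` is my own.

```python
def regional_bounds(known_in_region, unallocated):
    """Lowest and highest possible request count for a region when some
    records for the language cannot be allocated to any region."""
    return known_in_region, known_in_region + unallocated

low, high = regional_bounds(239, 310)
print(f"Polish requests in the North West: between {low} and {high}")
```

With a possible range running from 239 to 549, any regional rate calculated for Polish would be little better than a guess.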
I fail to see any justification whatsoever for restricting the record-level data in this fashion. The only persons protected by this obfuscation are Gavin Wheeldon and Crispin Blunt and those who are desperate to fly in the face of the evidence and maintain that the Framework Agreement is a success.