#7 - Mihaly Fazekas (University of Cambridge)

Corruption in public procurement



Interview with Mihaly Fazekas, research associate at the Department of Sociology of the University of Cambridge. Mihaly is pioneering the use of "big data" in social sciences research settings and is part of two major research projects: ANTICORRP and DIGIWHIST. You can find his personal webpage here.




Let’s start with corruption in public procurement, that is probably your biggest area of interest. How can we measure corruption in public procurement?

Measuring corruption in public procurement is a difficult and challenging topic. A lot of scholars and policy people have tried to measure it, and measuring it directly is still pretty much impossible. I don’t think we will get any closer to measuring it directly; however, there are two avenues to measure it indirectly. The first one, which has received much wider traction in policy and academia, is looking at people’s perceptions or reported experience, which is good in many ways, as long as people have experience with that type of corruption. But in public procurement, you can ask companies, but they’re unlikely to really reveal whether they have been part of a corrupt scheme or not.

The only alternative remaining, and this is what I have done with a couple of colleagues around Europe, is to develop proxies. Proxies are indicators of corruption risk, so basically this means that we can track a range of characteristics of companies, the individuals behind companies and contracting entities, and also the characteristics of the procurement process, the tendering process itself, which are known for being associated with corruption. Now, of course, a lot of the characteristics which we are measuring could also be associated with non-corrupt problems such as low state capacity or simply, you know, problems with conducting the usually really complex procurement process according to legislation. So the real challenge is finding those indicators, those proxies, which indicate corruption rather than any other problem. So what we have done, for example, is come up with a set of red flags highlighting the outcomes of the procurement process which are associated with public procurement corruption, and link these outcomes with the characteristics of the process itself, the input side, so that input and output signal corruption together. Now to give you concrete examples, starting from a conceptual understanding: we understand corruption in public procurement as a restriction on open competition with the purpose of awarding the contract to a pre-selected bidder, typically the same bidder or a connected bidder, over and over again. This translates into corruption risk outcomes in the procurement process such as lack of competition, meaning a single bid submitted or a single bid considered on an otherwise competitive market, say school meals.
And the recurrent institutionalised nature of corruption can be captured by the market share of a single company, so if the same company is winning over and over again from the same procuring entity while having no competitors on an otherwise competitive market, we think this is jointly signalling high corruption risk.

Now that is the output side. The input side can be characterised with a lot of red flags known from the literature for over a decade now, for example a really short advertisement period or some other reduced lead time, so basically the number of days between publishing a call for tenders and the submission deadline for the tenders. Now if this period is really short, then the likelihood of having only one company bidding drastically increases, and often, in many countries, not everywhere, but in many countries, this also means that the company that wins is the company that has the highest market share anyway. So these input and output relationships together give us some indication of corruption risk.
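To make the input and output red flags described above concrete, here is a minimal Python sketch for a single tender. It is an illustration only, not the method used by the interviewee's team: the function name, the field names and the 10-day threshold are all assumptions, since the interview does not state a specific cut-off.

```python
from datetime import date

# Hypothetical cut-off: the interview does not say how many days count as "really short".
SHORT_ADVERT_DAYS = 10


def tender_flags(published, deadline, bids_received):
    """Return one input-side and one output-side red flag for a single tender."""
    # Input side: days between publishing the call for tenders and the submission deadline.
    advert_days = (deadline - published).days
    return {
        "short_advert_period": advert_days < SHORT_ADVERT_DAYS,
        # Output side: only one bid was submitted.
        "single_bid": bids_received == 1,
    }


# A call published on 2 March with a deadline on 6 March that attracted one bid.
example = tender_flags(date(2015, 3, 2), date(2015, 3, 6), bids_received=1)
print(example)
```

A single flagged tender proves nothing by itself; the flags only become informative once they are combined and aggregated, which is the point made in the next answer.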

How can you reach that conclusion?

You cannot reach that conclusion with certainty, but you can draw a conclusion with a certain amount of probability. That’s why it’s a risk measure we are applying. To continue, basically you build up these relationships and individual indicators for every single tendering procedure, every single contract awarded, and then you put them together in a single score so that we can capture their co-occurrence, so when they occurred at the same time, in the same tender. Now tenders vary for a lot of different reasons, you know, maybe it’s just the Christmas period and no one bothers putting in a bid, so it’s the same bidder winning the contract. However, if we aggregate the information to characterise bidding companies or contracting authorities, so organisations in general, then we find a curious distribution of these red flags: some organisations have a bidding activity with a lot more red flags than others. So when we see that an organisation as a whole, over a longer period, say a year or two, has a lot of red flags in its contracting activity, then our trust that this indicator is actually indicating corruption increases.
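The two steps in this answer, a single score per tender and then aggregation per organisation, can be sketched as follows. This is a sketch under stated assumptions: the interview does not say how the flags are weighted, so equal weights are assumed here, and the record layout (`authority`, `flags`) is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def tender_score(flags):
    """Share of red flags raised on one tender (0.0 to 1.0). Equal weights are an assumption."""
    return sum(flags.values()) / len(flags)


def organisation_scores(tenders):
    """Average the tender scores of each contracting authority over all its tenders."""
    by_org = defaultdict(list)
    for tender in tenders:
        by_org[tender["authority"]].append(tender_score(tender["flags"]))
    return {org: mean(scores) for org, scores in by_org.items()}


# Made-up records for two hypothetical contracting authorities.
tenders = [
    {"authority": "A", "flags": {"single_bid": True, "short_advert_period": True}},
    {"authority": "A", "flags": {"single_bid": True, "short_advert_period": False}},
    {"authority": "B", "flags": {"single_bid": False, "short_advert_period": False}},
    {"authority": "B", "flags": {"single_bid": True, "short_advert_period": False}},
]
scores = organisation_scores(tenders)
print(scores)
```

Averaging over many tenders is what smooths out one-off explanations such as the Christmas period mentioned above; in practice the averaging window would be a year or two of contracting activity.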

So at no point do you get an actual clear-cut conclusion, unless someone says, well, I’ve paid a bribe or I’ve received a bribe?

Yes, so we are not interested in bribes, not at all, and the reason is that corruption in public procurement in its most dangerous form is barely down to bribes. What you find is a complex web of consultancy firms, subcontracts, offshore accounts, maybe some cash as well to grease the wheels of the administration, but the big fish is playing through seemingly legal channels. So that’s visible in Brazil in the Petrobras case, which is just ballooning and ballooning, but also in hundreds and probably thousands of other cases all around Europe.

But the Petrobras case, if I may add, it does involve bribes, it does involve money ending up in the pockets of people that were making decisions.

Yes, but the big corrupt rent goes to party foundations and construction companies, bogus consultancy contracts, so that’s where the big money disappears.

That’s our focus. I’m not saying bribery cannot happen and bribery is not important, I’m saying that’s not our focus. Our focus is really this particular understanding of public procurement corruption, so restricted access to public resources with the goal of benefitting the particular company or a particular set of companies.  

How important is transparency in trying to stave off corruption in public procurement?

Transparency in public procurement is crucial for controlling corruption, and, you know, this is just the blanket answer we always give: well, transparency is great for controlling corruption. We think it’s great, but you need actors who can act on that transparency. The good thing about public procurement, transparency and corruption is that if you have transparency on a market, and it’s an effective transparency so the actors can actually make use of the information, then typically there are companies and individuals who have the resources to act on it. So as soon as you open up a market by publishing a call for tenders in a central registry, then suddenly a lot more companies are bidding. There are examples of this, namely from Indonesia, Bangladesh or India. There, electronic procurement systems have been introduced recently and you see that there are companies who entered the market. Now it’s not a panacea for corruption, but it does decrease the likelihood of corruption occurring when more companies bid, especially when companies bid who are not local, who are not, by default, linked to local politicians and bureaucrats.

But you’re talking about transparency ex ante, so before the procurement procedure starts, or during the procurement procedure?

During the procurement procedure, a call for tenders.

Yes. My point was on one hand that, but also on the other hand, ex-post transparency, so transparency for the contracts that have been awarded.

Yes, that’s also really crucial and that can add a lot to the fight against corruption if that information is used by audit bodies or civil society. That’s one of the goals of DIGIWHIST, for example: we try to make the whole bidding process, including the contract award and contract implementation, more transparent, and to give simple tools to citizens and audit bodies to track risks and quickly identify those individuals, organisations and contracts which are at risk. This is crucial because, like it or not, as we stand now, after many directives on public procurement and tons of national public procurement legislation, we often don’t know the basics. So if you ask the Minister of Economy in France or Germany or in the UK, they won’t be able to give you an answer to this really simple question: who is the biggest supplier to the government? For example, in the UK, of the high-value contracts regulated by the EU directive, 43% of the awarded contracts have no contract value, 43%. So if you have such formal transparency but effectively no transparency, then what are we talking about? You have to get the basics right.

I’m on your side on this, but usually the conversation I have with pro-competition academics and also lawyers is that, especially for the contract information after the contract is awarded, if it’s made public and available, yes, it can be a useful anti-corruption measure, but it also facilitates the life of entities…


…cartels and collusive practices?

Yes, that’s true. There’s a downside of it.

So is there any way that we could try to minimise the downside?

Yes. If you use the information more efficiently than cartels use that information. Once it’s public, there’s no way to control who’s using that information, and it is documented: it is a known fact that it’s easier to maintain cartels, but it’s also easier to spot them. Now as it stands, there are maybe two or three competition authorities, excluding the UK’s CMA, who actually make use of micro-level procurement data systematically to track corruption risk. The South Korean competition authority is one of those exceptions, and those of us going around the world are saying what a great practice this is. Now if we push for transparency and we don’t use it for evaluating risks of collusion or corruption, then in fact it can be negative overall, it’s true.

Thinking of transparency of awarded contracts, a few years ago, I don’t know if you were aware of this, in Portugal we made it mandatory not only to use electronic procurement, but also for all contracts that were not subject to a contract procedure, so a transparent contract procedure, to be registered in a central repository. The compliance with it, as far as I know, is close to 80 to 90% of the contracts that needed to go there, so it’s a very high compliance rate. But what I found fascinating is that some, let’s say, entrepreneurial “white hats” cross-referenced information from that database, the Public Contracts Database, with the information contained in the Companies Register, which is also public by definition, and found out that in some situations companies that were yet to be incorporated were being awarded contracts, and I found that fascinating. All they did was take the data from the two data sets, combine it and see what it produced. So do we have any examples of using this kind of data that may yield an unexpected result?

Cross-checking and linking public procurement data and any special interest companies to the Company Registry is part of what DIGIWHIST is doing in 35 countries, I mean 34 countries plus the European Commission’s own procurement activities. So yes, I mean this [?? 13.36] plan to do this and we are working on this as part of DIGIWHIST, particularly for the reason that often really simple things can emerge. Now I’m not saying that everything is as simple as that, for the very reason that as soon as the information is being used for tracking such risks, the actors themselves change their practice. They become more sophisticated. But one of the really simple things we have seen is the seemingly politically motivated date of incorporation of companies. So we didn’t look at whether they were already incorporated when they got a procurement contract, we didn’t see any red flag on that, but we have seen another red flag: the winning chances of companies registered in the very first year of the new government, or just before it came into power, were significantly higher than those of companies registered just a little bit before that. Now, of course, you would expect companies with more experience to have higher winning chances, at least in the first few years of their existence. In addition, we also found that for companies registered when the same government was in power the first time, say 13 years ago, winning chances increased, but only once the same government came back into power. So practically no economic theory can explain why those particular years made companies so successful. We could only come up with political explanations.
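The simple cross-check raised in the question, contracts awarded to companies that were not yet incorporated, amounts to joining two public data sets on a company identifier and comparing two dates. A minimal Python sketch, assuming hypothetical field names and made-up records (neither the Portuguese database nor the DIGIWHIST data is structured this way as far as the interview tells us):

```python
from datetime import date


def awarded_before_incorporation(contracts, register):
    """Return contracts whose award date precedes the supplier's incorporation date.

    `register` maps a company identifier to its incorporation date. Contracts whose
    supplier is missing from the register are skipped here, not flagged.
    """
    suspicious = []
    for contract in contracts:
        incorporated = register.get(contract["company_id"])
        if incorporated is not None and contract["award_date"] < incorporated:
            suspicious.append(contract)
    return suspicious


# Made-up company register and contract records.
register = {"PT001": date(2012, 5, 1), "PT002": date(2014, 9, 15)}
contracts = [
    {"contract_id": "c1", "company_id": "PT001", "award_date": date(2013, 1, 10)},
    {"contract_id": "c2", "company_id": "PT002", "award_date": date(2014, 6, 30)},
    {"contract_id": "c3", "company_id": "PT999", "award_date": date(2014, 6, 30)},
]
flagged = awarded_before_incorporation(contracts, register)
print([c["contract_id"] for c in flagged])
```

The hard part in practice is the join itself: when the two sources share no clean identifier, suppliers have to be matched on names and addresses, which is a separate and much messier problem than the date comparison shown here.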

Speaking about the DIGIWHIST project, what are you trying to achieve with it and what’s going to be the outcomes?

First, we are trying to set up an infrastructure, both for research and for government accountability more broadly. There is a lot of transparency legislation out there, a lot of data, but it’s either not linked or it’s in a horribly bad format. What you see all around Europe are individual procurement tenders in, like, HTML pages, but there is no way you can query the data in a really simple way, like the biggest winners or the average number of competitors for a particular company. So the first thing we’re doing is scraping all this data, collecting all this data, cleaning it up, standardising it and republishing public procurement data in 34 countries plus the European Commission. Then it’s linked to company data on financials and registry information, but also on ownership and on the managers, boards and directors. This data is also linked to the list of political officeholders, elected and appointed, and finally we link the public side of the procurement data to treasury information on contracting authorities, so how much money they receive from the central government, how much deficit they are making and so on and so forth. So we really try to come up with a complex structured database which can be used for research as well as policing. Now this will generate a lot of data, so on top of that we have to come up with simple summary indicators which lay people, citizens and policymakers can use in their daily activities.

We will generate a set of transparency indicators including data quality, a set of indicators on corruption risks and also indicators on the quality of public administration or state capacity. Now this information will be fantastic and all that, but our ambition is really to push for impact and policy change if possible. So what we will do is put this information, the data and indicators, in really compelling packaging, say, for example, a mobile app in which you can browse this information and directly access the risk scores, and if anyone has any intention to blow the whistle, then whistleblowing reports can be attached to tenders. So the idea is that when OLAF, or whatever national body, receives a whistleblower report, then it has, on the one hand, big data: all the contracts, all the entities, the ownership ties, everything there. It has the risk scores generated by researchers and validated in a rigorous way, plus the usual insider information whistleblowers share. So auditors and investigators will not only see what insiders want to share, but also see whether it’s important, whether there is any chance of going for these cases and whether there are, you know, large enough amounts involved to start an investigation. So this is really the tool which we hope will, on the one hand, revolutionise information in this domain, and on the other would hopefully allow losers to realise that they are losing out to public procurement corruption, mobilise them and help them form alliances to act, for example companies who realise that they are losing out by not having access to certain markets.

That is fascinating and very ambitious. My concern with that is how you’re going to get access to the end of line data, especially the contracts, because other than Portugal, and I think Estonia and to a certain extent, parts of the UK right now, I’m not aware that countries inside the EU are collating that data in a streamlined, or at least in a consistent fashion, especially contracts with altered thresholds.

Yes, so we are not directly collecting contracts data. By the way, Slovakia is also…

Yeah that’s true, Slovakia is also in, you’re right.

Yes, so we are not collecting contracts data directly. We are collecting announcement data: calls for tenders, contract award announcements, contract modification announcements and, in some countries, contract completion announcements. So these are the official published documents which are in central repositories, like Contracts Finder in the UK or the EU’s Tenders Electronic Daily.

But that’s precisely my point is that those sources of data are incomplete by nature.

But they’re incomplete in a particular way, right, because it’s regulated what is in there and what’s not there.

No, the point is, for example, let’s say the obligation of posting contract data: it has been in EU directives for many, many years, and if you look at the number of contract notices that are published on OJEU, so on Tenders Electronic Daily, and then you cross-reference it with information about when those procedures are supposed to reach an end, for only around 40% of the procedures is the conclusion actually registered on Tenders Electronic Daily. So it means that either the procedure never reached an end, which is possible, but more often than not it means that the contracting authority simply did not upload that information. So although you’re collating data that already exists in various different buckets, the underlying problem remains. I know, for example, that in Germany there’s a huge problem in trying to collate this data due to the way that devolution in Germany works: the responsibility for collecting this data does not rest at federal level but at Länder level.

Mmm. So what was this 40%? I didn’t understand your calculation…

So my comment was that a set number of contract notices are published on Tenders Electronic Daily and that only 40% of that original number actually officially reach an end with a contract award notice.

 No, that’s not true…

It is true. 

…it’s a much higher number. The last time I looked at this, I think it was around 70 to 80%. I did have an email exchange with the [?? 21.28] who is creating some of this data and the percentage was definitely higher than 40%, and this is a factual question we can clarify later by email. But even at 70%, the missing share is really high, I agree, and that is due to three things: one is our lack of capacity to link the contract award announcements, and if you count, there’s an equal number of contract award announcements which should have a call for tender but don’t…

Correct, that’s true as well.

...so the numbers match up. So our best hope is that they are there, just not linked, so they can be linked with some kind of probabilistic matching, which DIGIWHIST will do. The second point is what you mentioned already, that they start the procedure but it never ends for whatever reason, and the third is that it’s not there even though it should be there. So that’s true, this I think is a real problem, but without actually collating the data and cross-referencing it to, for example, public budgets at the contracting authority level, we don’t know the extent of the problem. So, for example, we have done some of this kind of cross-checking in Hungary, comparing public procurement as estimated from agency budgets, so spending on investments and material costs, with public procurement as estimated from announcement data, taking into account the threshold effect, and it actually varies from year to year. So I think the future, and one of the goals of DIGIWHIST, is to expose these problems, because currently no one is looking at this, no one is saying hey, hey, it’s like millions of euros are missing, and, you know, shouting around that we should fix this. I think that some people are working on this, but they could use a lot more publicity and a lot more direct exposure of these problems.

Personally I think that the data situation will change once e-procurement becomes mandatory, so that data gets collated centrally and automatically.

But that’s only for the above EU threshold contracts.

Correct, however, for example, in Portugal, you have to use e-procurement for all contracts that are subject to a contract notice. So, for example, if you want to use an open procedure below the thresholds you can do it, but you have to do it as an e-procurement exercise. So it varies from country to country, but that is one of the points that I think is going to change it, or it’s going to change the data collection later on.


The other one that may change is in terms of consequences, because it’s very clear for you, as a procurement officer, if you do not put a contract notice out that you should have, it’s very clear what are the consequences for you. So your contract may be annulled, you may be dragged over the coals, you may have problems with your line manager, so on and so forth. If you don’t put the contract award information online, there’s no consequence. Nothing happens to you.

Except for in Slovakia where the contract doesn’t enter into force until it’s published.

And also in Portugal and you see that once you change the incentives, you see that the compliance rate then goes to what I would expect to be the compliance rate also with the obligation to put out the contract notices in the first place.

Mmm, mmm.

Okay, very well. One final question, so you are an early career researcher and as far as I know, you finished your PhD last year, about a year ago, am I correct?

Yes, last February.

And you’re already a scientific coordinator for a very large project, about £3 million or €3 million worth, what is your experience with that? So what kind of advice could you give to an early career researcher that wants to work in that kind of field? 

My honest advice is wait a bit longer and plan it better.


I mean in general, the problem of research funding, I mean good researchers are not selected on management skills, they’re selected in their career based on ability to write compelling research papers and these are competitive processes. So big grants will lead to disaster unless you know the people really well because managing an organisation in multiple countries, which is often a precondition for your funding, for example, and managing a project which is typically on top of people’s everyday work, is really, really difficult. So I’m lucky because I have worked previously in other projects with most of the people who are part of DIGIWHIST, and I see the enormous advantage of knowing these people, trusting them, knowing their strengths and weaknesses, as well as knowing my own strengths and weaknesses. So unless you trust these people and you know that you can work together with them, even in difficult situations, then just wait. I mean ambition is a great thing, but you save a lot of your nerves and your time.

So effectively you’re saying be careful with whom you get in bed with in terms of projects? 

[Laughter] yes exactly, be really careful, because you are in bed with them for years.

Yeah I agree with you and that’s as much as I’ll say on the record. Thank you very much, Mihály, for the interview.

Can I add one more thing if…

Sure, of course.

…I may? So, you know, you sounded a bit critical about the corruption measurement approach.

No, it’s my job to push you back.

Yeah, yeah okay. So just one more addition: why we think it’s a valid indicator of corruption, a valid proxy of corruption, is that there is this internal logic in the build-up of the indicator, but there are also a lot of external validity tests we have done, and those are the ones which convince people who are really critical. For example, if you aggregate our red flags to the country level, and then you see, you know, whether Sweden looks better than Romania, so roughly correlate the macro indices with Transparency International’s Corruption Perceptions Index or the World Bank’s Corruption Index, you get a really good fit, between 0.5 and 0.7. That’s the linear correlation coefficient. So basically, countries which are perceived to be corrupt tend to show a lot more of these red flags, for example, but also companies registered in tax havens are much more prone to the red flags as our corruption risk methodology defines them. So there is a lot of micro and macro evidence on external validity, and this is, I think, one additional point for people who are, you know, thinking about using these indicators or not.
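The external validity check described here boils down to a linear (Pearson) correlation between country-level red-flag averages and a perception-based score. A minimal Python sketch follows; the country values are made up purely to show the mechanics and are not taken from the interviewee's data or from any published index.

```python
from math import sqrt


def pearson(xs, ys):
    """Linear (Pearson) correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)


# Made-up values for five hypothetical countries:
# average red-flag score per country, and a perceived-corruption score
# scaled so that higher means "perceived as more corrupt".
red_flag_score = [0.12, 0.18, 0.25, 0.31, 0.40]
perceived_corruption = [20, 35, 30, 55, 50]

print(round(pearson(red_flag_score, perceived_corruption), 2))
```

Note that the sign depends on how the perception index is scaled: an index where higher values mean "cleaner" would give a negative coefficient for the same relationship, so the magnitude is what matters for the comparison.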

Okay, very well. Thank you very much for the clarification.

[Laughter] and thank you for pushing me back.

[Laughter] you can find me at my blog telles.eu or on Twitter where I use two handles, @Detic for general discussion and @publicprocure for public procurement related topics. As for Mihály, you can find him on Twitter as well with the handle @mihaly_fazekas. As ever I am very grateful for the support of the British Academy Rising Star Engagement Awards.