Trust: Why and How?


Let's start with Why. The computer journals I read are filling their pages these days with variations on "Trusting ML." ML is short for "Machine Learning," a less audacious term for the same technology that its enthusiasts call "Artificial Intelligence." Only the developers and their ignorant acolytes actually believe the technology is intelligent like a human, but everybody can see that the results look pretty amazing -- until they don't. Everybody brainwashed in the American public school system believes the neo-Darwinist hypothesis that the accumulation of random events can -- and did -- result in all manner of system complexity, never mind that every specific test of the hypothesis disproves it (see my "Biological Evolution: Did It Happen?"), so they unthinkingly believe the failings of the current crop of "AI" will be cured by the accumulation of more random events.

Which brings us to the Trust question. What is "trust"? A dictionary is no help; it just defines the word in terms of its synonyms.

So here's my thinking. Trust is an axiom, something you accept as true without adequate proof, like the parallel postulate: assume it's true and you get Euclidean geometry, which works on flat surfaces; assume it's false and you get a different geometry, which works on curved surfaces like a sphere or a saddle shape.

There's more. Trust comes in the same package with Moral Absolutes. The nature of trust is that you can reasonably expect a certain outcome from a particular set of inputs. A Moral Absolute is a fact or obligation binding on all people everywhere and at all times without exception. A person is Right if they conform to the moral absolute, and Wrong if they fail. Good people do Right things, so you can trust them, and Bad people do Wrong things, so you cannot trust them to do Good. Sometimes you can trust people to do Good things because the government (or their employer, or the Mafia) will punish them if they don't, but that involves a moral calculus in which they might weigh the perceived benefits of doing Bad Things against the external punishment if they get caught. It stops being a moral absolute, so there can be no trust.

Science and technology happened in one place and time only, and nowhere else ever, and that is the result of Christian Moral Absolutes pervading the culture, because God created the universe, and God gave it Laws to operate on -- same as He gave us moral Laws to operate on -- and we can trust God to not make mistakes (or as Einstein famously put it, "God does not play dice"). If there are no rules, or if the rules are the inscrutable whims of multiple capricious deities, then it's foolish to try and figure out what the rules are.

That's the contradiction inherent in Darwinism: there are obviously rules that organisms work by, but let's pretend that it all came about by chance. And if that's a Law of Nature -- what does "Law of Nature" mean in a universe where everything came about by chance? -- then we can harness that Law of Nature to produce "Artificial Intelligence." The whole ML house of cards is built on that supposition. And it sort of works, because there really are rules (all computer code is codified rules) and the computer does what it is told to do, which in the case of ML is a very small program ("30 lines of C") adjusting millions or billions of numerical weighting factors based on what words or image pixels appear together in the input data, no need to understand what the words mean nor what objects the color shapes represent in the image. Just what is allowed to appear together in the input data. Really.
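
Just to show how small and how mindless the core of it is, here is a toy sketch in C -- my illustration only, nothing like anybody's production system -- of that weight-adjusting loop: one artificial "neuron" nudging three numbers until its output matches the training tags for the AND function. The real thing does the same nudging with billions of weights and billions of examples, and no more understanding:

    /* Toy sketch, for illustration only: a short program nudging
       numerical weights until its outputs match the training tags.
       Here one artificial neuron learns AND by gradient descent. */
    #include <stdio.h>
    #include <math.h>

    int main(void) {
        /* training data: two inputs and the tag the output should match */
        double x1[4] = {0,0,1,1}, x2[4] = {0,1,0,1}, tag[4] = {0,0,0,1};
        double w1 = 0.1, w2 = -0.2, bias = 0.0;  /* arbitrary starting weights */
        double rate = 0.5;                       /* how hard to nudge each step */

        for (int epoch = 0; epoch < 5000; epoch++) {
            for (int i = 0; i < 4; i++) {
                double sum = w1*x1[i] + w2*x2[i] + bias;
                double out = 1.0 / (1.0 + exp(-sum));   /* squash to 0..1 */
                double err = tag[i] - out;              /* how far off we are */
                /* nudge each weight in the direction that reduces the error */
                w1   += rate * err * out * (1.0 - out) * x1[i];
                w2   += rate * err * out * (1.0 - out) * x2[i];
                bias += rate * err * out * (1.0 - out);
            }
        }
        for (int i = 0; i < 4; i++) {
            double out = 1.0 / (1.0 + exp(-(w1*x1[i] + w2*x2[i] + bias)));
            printf("%g AND %g -> %.3f\n", x1[i], x2[i], out);
        }
        return 0;
    }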

By disallowing word combinations that did not occur in the training data, the computer is able to spit out word combinations (or color shapes) that did occur in the training data, which if you looked carefully through all the billions (with a "B") of training sentences or images, you'd probably find exactly the one it produced, or very nearly so. And if the training data has racist or sexist remarks -- recall, this is unsupervised data -- then the "generator" can and will produce racist or sexist output. And who would know? Nobody can read (and remember) a billion lines of training data text in their whole lifetime and still have time to eat and sleep and compare what they memorized to the ChatGPT output.
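
Again for illustration only (my own toy, not anybody's real generator), here is what "only what appeared together in the training data" looks like in a few dozen lines of C: count which word followed which in a tiny hand-made corpus, then generate text by only ever emitting a word that actually followed the previous word somewhere in that corpus. The big language models do the same thing with fractional weights instead of raw counts, over billions of sentences:

    /* Toy bigram generator: counts[a][b] = how many times word b
       followed word a in the corpus; generation only ever emits a
       word that actually followed the previous word in training. */
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>

    #define MAXW 32

    static const char *corpus[] = {
        "the","dog","chased","the","cat",
        "the","cat","chased","the","frisbee",
        "the","dog","caught","the","frisbee", NULL };

    static const char *words[MAXW];
    static int nwords = 0;
    static int counts[MAXW][MAXW];

    static int index_of(const char *w) {
        for (int i = 0; i < nwords; i++)
            if (strcmp(words[i], w) == 0) return i;
        words[nwords] = w;
        return nwords++;
    }

    int main(void) {
        srand(42);
        for (int i = 0; corpus[i+1] != NULL; i++)
            counts[index_of(corpus[i])][index_of(corpus[i+1])]++;

        int cur = index_of("the");       /* start with a word seen in training */
        printf("%s", words[cur]);
        for (int step = 0; step < 8; step++) {
            int total = 0;
            for (int j = 0; j < nwords; j++) total += counts[cur][j];
            if (total == 0) break;       /* nothing ever followed this word */
            int pick = rand() % total, next = 0;
            for (int j = 0; j < nwords; j++) {
                pick -= counts[cur][j];
                if (pick < 0) { next = j; break; }
            }
            printf(" %s", words[next]);
            cur = next;
        }
        printf("\n");
        return 0;
    }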

Where does Trust fit in all this? Trust is the expectation that people will do Good Things and not Bad Things, but it only works in a culture informed by the Golden Rule. Otherwise you can only trust people while you have eyes on them and a whip or gun in your hand. These articles about trust all came out of a Christian or post-Christian culture. Machines are not people, but they are programmed by people. You put your money into a bank, and they keep a record of it on their computer, and you can trust that you can get the same amount of money back out, because a person (the programmer) wrote the program, and if they programmed it to steal your money, they would go to jail, and nobody wants to go to jail. But no person or persons supervised the training data to make sure the ML program learned Good Things and not Bad Things. In fact, nobody even knows what the machine actually learned, except that this particular input produced that particular output. And maybe a picture of three cupcakes will be recognized by the computer as "a dog and a cat playing frisbee," or a selfie picture of a couple of dark-skinned people will be classified as "two gorillas." Yes, those things actually happened (Google removed the "gorilla" category from their program because nobody knew why it happened nor had any other way to prevent it).

With a track record like this, why should anybody trust an ML program at all? It is inherently untrustworthy by its very nature.

But the developers have put millions of dollars into developing (training) these programs, and ultimately they need to convince people to trust (and pay for!) the results in order to show a profit.

Nobody is talking about changing the way ML happens so that we understand the process and can trust the programmers. Instead they are layering on more ML barnacles that only look like explanations and transparency, but in reality are only more opaque ML systems doing nobody knows what.

I think we need some programmers sent to jail before we can get trustworthy ML programs -- or more likely, before the developers themselves are willing to admit that their process is inherently inscrutable and untrustworthy.

That's my opinion. Now let's look at what the industry people are saying...

This piece -- "The Flow of Trust: A Visualization Framework to Externalize, Explore, and Explain Trust in ML Applications" -- first appeared in ComputerGraphics last year, reprinted in the current issue of ComputingEdge. The authors are mostly (five out of seven) graphics people, not "AI" or "ML", but they are (second-hand) True Believers, that is, they (ahem) trust the AI people to be telling the truth about their technology. In the first paragraph:

ML applications construct models of the phenomena from which data are acquired and aim to generate predictions related to these phenomena in the presence of new, unseen, data.
Their own emphasis on the word "models" signifies that they believe the malarkey that the neural nets (NNs) are actually intelligent and have internal representations of the mental abstractions that go with a correct understanding of the data they were trained on.

The second paragraph introduces their approach to the trust problem:

...an increasingly important research direction targets explainable AI (XAI) (i.e., the creation of methods and tools that shed light on the functioning of such models to their various users). However, while such techniques help users to understand how a model is structured and works, they currently do not directly cover building trust in the model...
They go on to describe their efforts to build tools for the users of ML applications to express and rationalize their distrust. If and to the extent that trust is axiomatic (essentially without proof or explanation), I suspect their efforts will be found to be futile. Discussing the state of their research, they admit:
Currently, trust is not expressed explicitly, it implicitly forms in the mind of the user.
And again:
In reality, MD [the model developers] were not used to doing visual explorations. Therefore their trust was communicated implicitly without being supported by evidence.
They have a tough row to hoe. If and to the extent that trust is necessarily axiomatic (unsupported by evidence), the trust levels to be communicated had nothing to do with the visual explorations they did not do. See how they describe a particular case they tried to work out:
The built model M (a neural network) was not inherently explainable; therefore MD [the model developers] created a surrogate model M' to explain the behavior of M. ... Although the visualization developers (VD) invented some tricks to present M' in a simplified and aggregated form, it was not enough for a good understanding of the model behavior.
Duh. There is no model inside the neural net; there is nothing there but a random collection of weighting factors that happened by chance to luck onto a set of values that satisfied the training data tags. Maybe (perhaps most of the time, but nobody knows) they turn out to be measuring an accurate representation of what the developers were hoping for, and maybe they happen to accidentally discover unintended similarities in the training data, so that three cupcakes get reported as a dog and cat playing frisbee. Nobody knows, not even the people who are paid very high salaries to create these "models." That's why it's not inherently explainable, and that's why they necessarily must invent an entirely different surrogate model with no ontological relationship to the NN actually doing the work, thereby to "explain the behavior" that is inherently not explainable.
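
To make it concrete, here is a toy sketch in C of what a "surrogate model" amounts to -- my illustration, not their system: treat the real model as a black box you cannot see inside, probe it at a few inputs, and fit a simple human-readable stand-in (here a straight line, by least squares) to its outputs. The line "explains" nothing about what goes on inside the box; it only mimics the outputs over the probed range:

    /* Toy surrogate model: probe an opaque model at sample inputs and
       fit a least-squares line to its outputs. The black_box function
       is a stand-in for the opaque model M, not any real neural net. */
    #include <stdio.h>
    #include <math.h>

    /* the opaque model: pretend we cannot see inside this function */
    static double black_box(double x) {
        return 1.0 / (1.0 + exp(-3.0 * (x - 0.5)));
    }

    int main(void) {
        double sx = 0, sy = 0, sxx = 0, sxy = 0;
        int n = 0;

        /* probe the black box over the range we care about */
        for (double x = 0.0; x <= 1.0001; x += 0.05) {
            double y = black_box(x);
            sx += x; sy += y; sxx += x*x; sxy += x*y;
            n++;
        }
        /* least-squares line y = a*x + b: the "explainable" surrogate */
        double a = (n*sxy - sx*sy) / (n*sxx - sx*sx);
        double b = (sy - a*sx) / n;
        printf("surrogate: y = %.3f*x + %.3f\n", a, b);

        /* how badly the surrogate tracks the model it claims to explain */
        double worst = 0;
        for (double x = 0.0; x <= 1.0001; x += 0.05) {
            double gap = fabs(black_box(x) - (a*x + b));
            if (gap > worst) worst = gap;
        }
        printf("largest mismatch over the probed range: %.3f\n", worst);
        return 0;
    }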

I said most of that six years ago (see "AI (NNs) as Religion") before I ever heard or saw the term XAI, and then again (see "Neural Net Comeuppance" two years ago) when I saw what they were attempting to do.

The next article in the same issue of ComputingEdge is written by a professor muckety-muck in machine learning and AI at some university in Boston, titled "The Secrets of Data Science Deployments" and bemoaning that

over 85% of machine learning models that are built and demonstrated to solve a predictive problem are never deployed in practice.
His hammer is the processing of vast amounts of data using neural nets, which he calls "data science," and after the whole industry has been doing this for a while, most of what he sees is a bunch of bent-over nails that his hammer mangled. He offers five reasons, one of which is
4) Gaining trust and understanding of the data science solution
which is the topic of today's essay. I'll get to that in a minute, but his other four reasons are all variations of the same business ignorance that had been around for decades before there ever was such a thing as data science. One company I did some security code for some 40 years ago was the industry leader, but the founder wanted out, so he sold it to a conglomerate, which brought in their own CEO who understood neither the industry nor the technology. A few years later they sold it off to a competitor for a third of what they had paid, and the brand name disappeared off the market -- you know, classic business acumen: Buy high, sell low. This isn't a problem unique to data science or AI; there have always been idiots with more dollars than sense willing to risk it trying to do what clever people before them did successfully by knowing what they were doing, and as the richest country in the whole world and in all of history, we have enough spare cash to waste some of it that way. Maybe I should have said "we had enough cash..." Waste enough of it, and pretty soon you don't have enough to waste. I read the last chapter of The Book, and the USA is not in it. We're already on the way down, and "data science" is helping us get there by promoting data "models" that exist only in the minds of their creators, and certainly not in the code they are selling.

He has more to say about the trust problem than about any two of the other four stated reasons.

One of the most effective ways to gain trust is to have good explanations for the system's recommendations. The explanations have to make sense to the manager and have to present acceptable evidence.
He does not offer any help on how to get good explanations with acceptable evidence from a technology that is inherently inscrutable. No problem, just
A great way to build up the trust in the system, ... [is] to involve the stakeholders in the construction of the model. Running the model on the side and asking for feedback and guidance from the stakeholders ensures they better understand how the predictive model is working and makes them feel that they are playing a critical role in fine-tuning and optimizing the model.
Notice that the stakeholders are not invited to participate in building the internal structure of this "model," but only in fine-tuning the finished result. If there is no "model" at all, just random weighting factors adjusted by pseudo-random events accumulated over a vast number of iterations over the data -- which needs to be carefully curated (as he says elsewhere) -- then a fine-tuned broken model is still broken, and inherently untrustworthy. Good engineering managers will see that, and there is no trust. Bad managers "trust" the snake-oil vendors, they "Buy high and sell low," and the company goes belly-up.

Tom Pittman
Rev. 2024 May 30