Someone walks in already carrying the urgency. "The Wi-Fi is down," they say, and almost in the same breath comes the culprit: someone changed something, the previous admin left it broken, somebody is responsible and it certainly wasn't me. The verdict arrives before the evidence. The sentence is signed before anyone has looked at a single screen.
Around here, we do not operate on "it seems broken." We operate on technical certainty, and technical certainty has an inconvenient requirement: you have to go and find it. What follows is not a DNS troubleshooting guide—I am not interested in commands here—but an anatomy of why that first statement, the one naming a culprit, is almost always wrong, and where the ghost actually comes from.
1. "The Wi-Fi Is Down" Is Not a Diagnosis
It helps to separate two things that urgency tends to collapse into one. An observation describes a state of the system: a service responds, another does not, at a particular layer, at a particular time. A verdict does something else entirely—it declares that everything is broken and assigns blame at the same time.
"The Wi-Fi is down" belongs to the second category. It is vague: it does not tell us what stopped working, where, or since when. It is absolute: everything or nothing, with no nuance in between. And, most importantly, it requires a defendant. None of these properties are accidental; they are exactly what a judgment needs in order to survive without evidence. As long as the failure remains vague and total, anyone can be guilty. Imprecision is not a flaw in the story—it is the fuel that keeps it running.
Diagnosis begins by refusing to sign that verdict. Not out of kindness, but because the verdict contains no information. "The Wi-Fi is down" gives me nothing I can measure. "I have connectivity but cannot resolve names" tells me almost everything.
2. The Check Nobody Performed
Because that was exactly what was happening, and it took about ten seconds to see it. The network was working. Packets were flowing normally. The only thing broken was name resolution: a specific, identifiable layer, and one that happened to live upstream of, and entirely outside, the machine of the person complaining. The distinction is textbook material. If a numeric address responds and a hostname does not, then the problem is not the network, not the cable, not the Wi-Fi, and not the person who came before you. It is name resolution.
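For readers who want that ten-second check in concrete form, here is a minimal Python sketch of the distinction, not a troubleshooting tool. The function names, the probe address, and the hostname are all assumptions made for illustration: `192.0.2.1` is a documentation-only placeholder that must be replaced with a numeric address you know should answer, and the wording of each conclusion is mine rather than any standard.

```python
import socket


def ip_reachable(ip: str, port: int = 443, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to a numeric address succeeds.

    A numeric address needs no name lookup, so this probes connectivity only.
    """
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def name_resolves(hostname: str) -> bool:
    """Return True if the system resolver can turn the hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def classify(reachable: bool, resolves: bool) -> str:
    """Turn two observations into a named layer instead of a verdict."""
    if reachable and not resolves:
        return "name resolution"
    if reachable and resolves:
        return "no fault at these two layers"
    if not reachable and not resolves:
        return "connectivity, or something below it"
    # The name resolved but the probe address did not answer: the probe
    # target may simply be filtered, so this pair proves nothing.
    return "inconclusive: check the probe address"


if __name__ == "__main__":
    # Placeholders (assumptions): swap in an address and a name from your own network.
    probe_ip = "192.0.2.1"
    probe_name = "example.com"
    print(classify(ip_reachable(probe_ip), name_resolves(probe_name)))
```

The probes and the conclusion are kept separate on purpose: the two booleans are the observation, and `classify` is the only place where a layer gets named.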
That detail is what makes the whole episode almost funny, if it were not so common. The accuser was standing on top of the evidence that disproved their own verdict. They had connectivity. The Wi-Fi—the very thing they had declared dead—was working perfectly. Ten seconds of looking would have shown that the failure was not where the finger was pointing. They never looked.
And that is the point that interests me. The hysteria was not in the failure; the failure was small and perfectly still. The hysteria lived in the gap between the complaint and the check, in that space that opens whenever someone chooses to pronounce a sentence instead of making an observation.
3. It's Always DNS (And It's Never Witchcraft)
There is one honest caveat to make: the component that failed was not chosen at random. It was DNS. Of all pieces of infrastructure, few carry more mythology. There is an entire sysadmin folklore built around the idea that, in the end, it is always DNS—that half-joking resignation of ruling it out repeatedly only to discover that, yes, it was DNS after all. DNS is the perfect ghost of infrastructure: invisible, intermittent, usually the last thing people check and frequently the thing responsible.
But folklore can be read in two different ways, and the difference between them is the difference between a mature team and an enchanted one. The lazy reading is mystical: DNS breaks for no reason, it just does that, it's black magic. The disciplined reading is exactly the opposite. Precisely because DNS is often the culprit, the trained suspicion points toward a named layer rather than a named person. The ghost has an address; the scapegoat is fiction. If it is "always DNS," that is not a curse—it is a clue. And a clue is the opposite of witchcraft.
4. Witch Hunts Need Humans; Methods Name Layers
This is where the whole mechanism becomes visible. A witch hunt requires two things: the failure must remain vague, and there must be someone available to blame. An empirical method attacks both conditions at once. The moment you name the layer—not "the Wi-Fi is down" but "the upstream resolver has stopped responding"—the witch and the scapegoat disappear together. There is no longer a broken everything; there is a specific component. And because components have no faces and no shifts to blame, there is nobody left to accuse. Technical precision becomes, almost accidentally, an act of justice—not because it defends innocent people, but because it eliminates the need for defendants altogether.
That is why a verdict without a check is not fundamentally a technical mistake. It is a moral act—assigning blame—disguised as a technical judgment. And the only way to neutralize it is not to argue with the story, because stories can always answer back; the only effective response is evidence. There is a more formal explanation for why this works—why diagnosis is ultimately an epistemological practice rather than a mechanical skill—and I have written about that elsewhere. Here I prefer to stay in the trenches, because that is where these things actually happen.
Conclusion: Ten Seconds of Looking
When I ran the checks and stated the facts—that it was DNS, that the failure was upstream, outside our control, and unrelated to anyone on the team—what followed was silence. For a while I interpreted that silence as discomfort. Today I see it differently: it was simply a hypothesis being falsified in real time, with nothing left to say. Stories need air; evidence takes the oxygen out of the room.
There is one final caveat, because no serious method promises heroic endings. Checking does not make you right—it makes you auditable, and those are not the same thing. Sometimes the cause genuinely lies outside your control—an upstream resolver, as in this case—and there is nothing you can fix immediately. Even then, the method gives you something valuable: not a solution, but a way to name a layer instead of pointing at a person. Ten seconds of looking before delivering a verdict will not always fix the network. It will always dissolve the tribunal.