Skeptophilia: gibberish

Wednesday, May 7, 2025

Nonsense from the sky

I was recently chatting with a friend about how little it takes to get woo-woos all stirred up -- and how impossible it is to get them to simmer down afterward -- and that got me thinking about A Book from the Sky.

If you've never heard about this strange publication, you're not alone; it never got a great deal of attention outside of China (except for one other subset of humanity, q.v.). It's the creation of award-winning Chinese artist Xu Bing, who has made a name for himself pushing convention and working paradox and surreality into his creations.

A Book from the Sky (天書; Tiānshū) looks, to someone like myself who knows no Chinese, like nothing more than page after page of artistically-laid-out Chinese calligraphy:

Cover page of A Book from the Sky

The first clue you might have that something is amiss is that the characters for the book title -- 天書 -- don't appear on the title page. In fact, they appear nowhere in the book.

In another fact, none of the characters in the book are actual Chinese characters. Chinese scholars have gone through the whole thing painstakingly and found only two that are close to real Chinese characters, and one of those is only attested in a supposed ninth-century document that might itself be a forgery. (Whether the inclusion of that character was deliberate, or is merely an accidental resemblance, isn't certain, but I suspect the latter.)

Now, let's be clear about one thing right from the get-go. Xu himself states up front that A Book from the Sky is nonsense. Here's his description, from his own website:

Produced over the course of four years, this four-volume treatise features thousands of meaningless characters resembling Chinese. Each character was meticulously designed by the artist in a Song-style font that was standardized by artisans in the Ming dynasty. In this immersive installation, the artist hand-carved over four thousand moveable type printing blocks. The painstaking production process and the format of the work, arrayed like ancient Chinese classics, were such that the audience could not believe that these exquisite texts were completely illegible. The work simultaneously entices and denies the viewer’s desire to read the work...

[T]he false characters “seem to upset intellectuals,” provoking doubt in established systems of knowledge. Many early viewers would spend considerable time scrutinizing the texts, fixedly searching for genuine characters amidst the illegible ones.

The aftermath of the release of A Book from the Sky reminds me of an incident from my freshman lit class in college. The professor, a well-meaning but very old-school gentleman named Dr. Fields, had us read Robert Frost's famous "Stopping by Woods on a Snowy Evening." Afterward, he read us a quote from an interview with Frost in which the poet was asked about symbolism in the poem. Frost responded, basically, "There isn't any. It's about a man stopping by woods on a snowy evening. That's all." But then Dr. Fields, wearing his most patronizing smile, said, "Of course, we know that a poet of Frost's caliber would not have a poem with no symbolic literary elements, so we will proceed to analyze the symbolism therein."

So the woo-woos have decided that "an artist of Xu's caliber would not have a 604-page book with no meaning at all," and have been trying since its release all the way back in 1991 to figure out what it "actually means."

Here are a few of the weirder claims I've seen:

1. it's written in the script that was used in Atlantis and/or Lemuria, which is why we can't decipher it, because there aren't many Atlanteans or Lemurians around these days.
2. the document was communicated to Xu in a series of dreams generated by telepathic aliens who are trying to pass along to humanity their superior wisdom.
3. it's eeeeeeevil, and if we did translate it, it would release demons, and boy then we'd be sorry.
4. it's somehow connected to other examples of asemic writing (writing that looks like it should be meaningful but isn't), like the Voynich Manuscript and Codex Seraphinianus, and maybe one of them holds the key to deciphering the others.

Okay, respectively:

1. neither Atlantis nor Lemuria existed. I keep hoping this particular nonsense will go away, but somehow it never does.
2. if this is superior wisdom from telepathic ultra-powerful aliens, you'd think they'd communicate in a language humans actually could read. Like, oh, I dunno, maybe Chinese, which Xu, being Chinese and all, just happens to be fluent in.
3. at this point, I'm thinking releasing demons wouldn't be any worse than what we're currently dealing with, so as far as that goes, let 'er rip. Bring on the demons.
4. of course it's connected to other asemic writing, because... hang on to your hats, here... by definition none of it has meaning. If it was decipherable, it wouldn't be asemic writing. It would just be plain old writing.

For cryin' in the sink, y'all need to put more effort into your crazy claims. Because these ones suck.

Me, I think A Book from the Sky is exactly what its creator claims it is -- a beautiful but meaningless art piece intended to poke fun at the art establishment and people who need to find meaning in everything. As the famous line about Freudian symbolism goes, "Sometimes a banana is just a banana."

But that's never going to satisfy the woo-woos, because they (1) can't resist a mystery, and (2) never admit they were wrong about anything. So I'm sure they'll keep plugging away at it, trying to figure out what Xu's work "actually means."

Oh, well. As long as it amuses them. And if it keeps them busy, they'll have less time to send spit-flecked emails to me about what a sheeple I am, so that's all good.

****************************************

Wednesday, February 26, 2014

Academic gibberish

About three years ago, I wrote a post on the problem with scientific jargon. The gist of my argument was that while specialist vocabulary is critical in the sciences, its purpose should be to enhance clarity of speech and writing, and if it does not accomplish that, it is pointless. Much of woo-wooism, in fact, comes about because of mushy definitions of words like "energy" and "field" and "frequency"; the best scientific communication uses language precisely, leaving little room for ambiguity and misunderstanding.

That doesn't mean that learning scientific language isn't difficult, of course. I've made the point more than once that the woo-woo misuse of terminology springs from basic intellectual laziness. The problem is, though, that because the language itself requires hard work to learn, the use of scientific vocabulary and academic syntax can cross the line from being precise and clear into deliberate obscurantism, a Freemason-like Guarding of the Secret Rituals. There is a significant incentive, it seems, to use scientific jargon as obfuscation, to prevent the uninitiated from understanding what is going on.

[image courtesy of the Wikimedia Commons]

The scientific world just got a demonstration of that unfortunate tendency with the announcement yesterday that 120 academic papers have been withdrawn by publishers, after computer scientist Cyril Labbé of Joseph Fourier University (Grenoble, France) demonstrated that they hadn't, in fact, been written by the people listed on the author line...

... they were, in fact, computer-generated gibberish.

Labbé developed software that was specifically written to detect papers produced by SciGen, a random academic paper generator produced by some waggish types at MIT. The creators of SciGen set out to prove that meaningless jargon strings would still make it into publication -- and succeeded beyond their wildest dreams. “I wasn’t aware of the scale of the problem, but I knew it definitely happens. We do get occasional emails from good citizens letting us know where SciGen papers show up,” says Jeremy Stribling, who co-wrote SciGen when he was at MIT.

The result has left a lot of folks in the academic world red-faced. Monika Stickel, director of corporate communications at IEEE, a major publisher of academic papers, said that the publisher "took immediate action to remove the papers" and has "refined our processes to prevent papers not meeting our standards from being published in the future."

More troubling, of course, is how they got past the publishers in the first place, because I think this goes deeper than substandard (worthless, actually) papers slipping by careless readers. Myself, I have to wonder if anyone can actually read some of the technical papers that are currently out there, and understand them well enough to determine if they make sense or not. Now, up front I have to say that despite my scientific background, I am a generalist through and through (some would say "dilettante," to which I say: guilty as charged, your honor). I can usually read papers on population genetics and cladistics with a decent level of understanding; but even papers in the seemingly-related field of molecular genetics zoom past me so fast they barely ruffle my hair.

Are we approaching an era when scientists are becoming so specialized, and so sunk in jargon, that their likelihood of reaching anyone who is not a specialist in exactly the same field is nearly zero?

It would be sad if this were so, but I fear that it is. Take a look, for example, at the following little quiz I've put together for your enjoyment. Below are eight quotes, of which some are from legitimate academic journals, and some were generated using SciGen. See if you can determine which are which.

1. On the other hand, DNS might not be the panacea that cyberinformaticians expected. Though conventional wisdom states that this quandary is mostly surmounted by the construction of the Turing machine that would allow for further study into the location-identity split, we believe that a different solution is necessary.
2. Based on ISD empirical literature, is suggested that structures like ISDM might be invoked in the ISD context by stakeholders in learning or knowledge acquisition, conflict, negotiation, communication, influence, control, coordination, and persuasion. Although the structuration perspective does not insist on the content or properties of ISDM like the previous strand of research, it provides the view of ISDM as a means of change.
3. McKeown uses intersecting multiple hierarchies in the domain knowledge base to represent the different perspectives a user might have. This partitioning of the knowledge base allows the system to distinguish between different types of information that support a particular fact. When selecting what to say the system can choose information that supports the point the system is trying to make, and that agrees with the perspective of the user.
4. For starters, we use pervasive epistemologies to verify that consistent hashing and RAID can interfere to realize this objective. On a similar note, we argue that though linked lists and XML are often incompatible, the acclaimed relational algorithm for the visualization of the Internet by Kristen Nygaard et al. follows a Zipf-like distribution.
5. Interaction machines are models of computation that extend TMs with interaction to capture the behavior of concurrent systems, promising to bridge the fields of computation theory and concurrency theory.
6. Unlike previous published work that covered each area individually (antenna-array design, signal processing, and communications algorithms and network throughput) for smart antennas, this paper presents a comprehensive effort on smart antennas that examines and integrates antenna-array design, the development of signal processing algorithms (for angle of arrival estimation and adaptive beamforming), strategies for combating fading, and the impact on the network throughput.
7. The roadmap of the paper is as follows. We motivate the need for the location-identity split. Continuing with this rationale, we place our work in context with the existing work in this area. Third, to address this obstacle, we confirm that despite the fact that architecture can be made interposable, stable, and autonomous, symmetric encryption and access points are continuously incompatible.
8. Lastly, we discuss experiments (1) and (4) enumerated above. Error bars have been elided, since most of our data points fell outside of 36 standard deviations from observed means. On a similar note, note that active networks have more jagged seek time curves than do autogenerated neural networks.

Ready for the answers?

#1: SciGen.
#2: Daniela Mihailescu and Marius Mihailescu, "Exploring the Nature of Information Systems Development Methodology: A Synthesized View Based on a Literature Review," Journal of Service Science and Management, June 2010.
#3: Robert Kass and Tom Finin, "Modeling the User in Natural Language Systems," Computational Linguistics, September 1988.
#4: SciGen.
#5: Dina Goldin and Peter Wegner, "The Interactive Nature of Computing: Refuting the Strong Church-Turing Thesis," Kluwer Academic Publishers, May 2007.
#6: Salvatore Bellofiore et al., "Smart Antenna System Analysis, Integration, and Performance on Mobile Ad-Hoc Networks (MANETs)," IEEE Transactions on Antennas and Propagation, May 2002.
#7: SciGen.
#8: SciGen.

How'd you do? If you're like most of us, I suspect that telling them apart was guesswork at best.

Now, to reiterate: I'm not saying that scientific terminology per se is detrimental to understanding. As I say to my students, having a uniform, standard, and precise vocabulary is critical. Put a different way, we all have to speak the same language. But this doesn't excuse murky writing and convoluted syntax, which often seem to me to be there as much to keep non-scientists from figuring out what the hell the author is trying to say as to provide rigor.

And the Labbé study illustrates pretty clearly that it is not just a stumbling block for relative laypeople like myself. That 120 computer-generated SciGen papers slipped past the eyes of the scientists themselves points to a more pervasive, and troubling, problem.

Maybe it's time to revisit the topic of academic writing, from the standpoint of seeing that it accomplishes what it originally was intended to accomplish: informing, teaching, enhancing knowledge and understanding. Not, as it seems to have become these days, simply being a means of creating a coded message that is so well encrypted that sometimes not even the members of the Inner Circle can elucidate its meaning.