CEOs, Students, and Algorithms

Hummingbirds and flowers co-adapted over millions of years. As the shapes of the flowers they take nectar from varied, hummingbird beaks grew to different lengths, some straight, some curved.

Photo: Sonia Nadales

However, some bees learned that they could access the nectar within tubular flowers by chewing a hole at the base and robbing the nectar from there. When that happens, the flower loses its nectar without getting pollinated.

We see this with humans and computers too.

Dictionaries

You would expect that humans and machine algorithms would co-adapt over time, as flowers and hummingbirds did (with some bees thrown in as well).

Back in the mid-1990s, websites were ranked and cataloged more simply than today, in part by how many times a keyword appeared on the page. One tactic to rank more highly in online search was to include white text (on a white background) repeating the keywords for which a website wanted to rank. That produced worse results for the human reader.
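A minimal sketch shows why the white-text trick worked. This is an illustrative toy, not any real search engine's algorithm: a ranker that scores pages by raw keyword counts, which hidden repeated text inflates.

```python
# Toy ranker (illustrative only, not a real search engine's method):
# score a page by how many times the keyword appears in its text.
def keyword_score(page_text: str, keyword: str) -> int:
    """Count case-insensitive occurrences of a keyword in page text."""
    return page_text.lower().count(keyword.lower())

honest_page = "We sell garden tools. Our garden tools are durable."
# Hidden white-on-white text padding the same keyword:
stuffed_page = honest_page + " garden tools" * 50

# Under this naive metric, the stuffed page outranks the honest one,
# even though it is worse for the human reader.
scores = {
    "honest": keyword_score(honest_page, "garden tools"),
    "stuffed": keyword_score(stuffed_page, "garden tools"),
}
```

Once rankers began discounting invisible or repeated text, the tactic stopped paying.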

That white text trick stopped working long ago, replaced by other more sophisticated tactics.

Tactics to improve apparent performance span industries, often gaining attention where the games are iterative and there is money to be made. So over the last decade, companies reporting quarterly performance noticed the growth in algorithmic robo-traders and changed the way they report results.

From a recent article in the Financial Times:

“…quant hedge funds can systematically and instantaneously scrape central bank speeches, social media chatter and thousands of corporate earnings calls each quarter for clues.

“As a result, Mr Ellis’s [Man’s CEO] quant colleagues have coached him to avoid certain words and phrases that algorithms can be particularly sensitive to, and might trigger a quiver in Man’s stock price. He is much more careful about using the word “but”, for example.

“There’s always been a game of cat and mouse, in CEOs trying to be clever in their choice of words,” Mr Ellis says. “But the machines can pick up a verbal tick that a human might not even realise is a thing.”

Supporting CEO word choices are the Loughran-McDonald Sentiment Word Lists, which categorize words as positive, negative, uncertain, or litigious. Some of the categorizations made sense to me, such as positive judgements for the words “accomplish,” “plentiful,” and “revolutionize.” Others, without context, made less sense to me, such as counting “anomaly,” “bridge,” and “escalate” as negative.
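The mechanics of dictionary-based sentiment scoring are simple enough to sketch. The tiny word sets below are illustrative samples in the spirit of the Loughran-McDonald lists, not the published lists themselves, and the tone formula is one common convention, not the only one.

```python
# Illustrative dictionary-based tone scoring. The word sets are tiny
# samples for demonstration, NOT the actual Loughran-McDonald lists.
POSITIVE = {"accomplish", "plentiful", "revolutionize"}
NEGATIVE = {"anomaly", "bridge", "escalate", "but"}

def tone(transcript: str) -> float:
    """Net tone: (positive count - negative count) / total words."""
    words = [w.strip(".,").lower() for w in transcript.split()]
    if not words:
        return 0.0
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return (pos - neg) / len(words)

# A CEO coached to avoid "but" scores higher on the same substance:
cautious = tone("results were plentiful but margins escalate")  # negative
coached = tone("results were plentiful and margins held")       # positive
```

This is why a word like “but” can move a score: the machine counts it, context or not.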

In a new paper, “How to Talk When a Machine Is Listening: Corporate Disclosure in the Age of AI,” the authors looked at the prevalence of AI-driven corporate assessments, the categorized word lists, and how companies have changed their reporting style.

“Companies who wish to accomplish the desired outcome of communication and engagement with stakeholders need to adjust how they talk about their finances, brands, and make forecasts in the age of AI. In other words, they should heed to the unique logic and techniques underlying the rapidly evolving language- and sentiment-analysis facilitated by large-scale machine-learning computation, for example, automated computational processes that identify positive, negative and neutral opinions in a whole corpus of a firm disclosure that is beyond processing ability of human brains.”

The authors identified the scale of machine downloads at individual companies (their proxy for high or low AI assessment) by tracking IP addresses that download corporate disclosure documents, along with how easily machine readers can parse the data in those documents. Some styles of writing, notably Warren Buffett’s annual letters, may be good writing for humans but are less so for AIs.
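The download-based proxy can be sketched simply. The threshold here is an assumption for illustration; the paper's actual construction is more involved.

```python
# Sketch of an IP-based machine-readership proxy (threshold is an
# invented illustrative number, not the paper's actual method).
from collections import Counter

def machine_ips(download_log: list[str], threshold: int = 50) -> set[str]:
    """Flag IPs whose filing-download counts exceed a threshold."""
    counts = Counter(download_log)
    return {ip for ip, n in counts.items() if n > threshold}

# One bulk scraper, one human browsing a few filings:
log = ["10.0.0.1"] * 500 + ["192.168.1.7"] * 3
robots = machine_ips(log)  # only the bulk downloader is flagged
```

Firms with heavily machine-read filings, the paper argues, then have the incentive to write for the machines.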

The paper’s authors also note that as long as people can figure out the AI’s rules of assessment, there should be a feedback effect. This is Goodhart’s Law in action. When a measure of success becomes a target, it ceases to be a good measure, since people start to manipulate it.

The paper shows that firms modified their language accordingly after the Loughran-McDonald dictionary was published. So we’re in a situation of longer flowers, longer hummingbird beaks, and gnawing bees.

Even if reporting executives alter their word choice to appear more positive on an earnings call (assuming positivity is desired), an AI could in turn adjust how it weights specific words, and even account for the idiosyncrasies of individual CEOs. It’s a similar game to the one played by a human analyst, but with different inputs and a different assessment.

The algorithmic approach to reporting language still makes sense. Algorithmic traders just need to be a little more accurate and faster than the competition over many trades. The CEOs just need stock performance a little better than their peers’.
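A quick simulation shows why a small edge over many trades is enough. The 51% win rate and equal win/loss sizes are invented illustrative assumptions, not market data.

```python
# Illustrative simulation: a slight statistical edge compounds over
# many trades. The 51% win rate and unit-sized wins/losses are
# invented assumptions for demonstration.
import random

def pnl(win_prob: float, n_trades: int, seed: int = 0) -> int:
    """Net profit after n_trades of +1/-1 outcomes at win_prob."""
    rng = random.Random(seed)
    return sum(1 if rng.random() < win_prob else -1
               for _ in range(n_trades))

# Over 100,000 trades, a 51% win rate has an expected profit of
# 0.02 * 100,000 = 2,000, far larger than the ~316 standard deviation,
# so it is almost surely positive; a 50% rate hovers near zero.
edge = pnl(0.51, 100_000)
```

The same logic applies to the CEO: a consistent small tilt in how the machines score the call is worth pursuing even if any single call barely moves.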

Browsers

With university classes now taught remotely by default, how should schools handle the new potential for cheating?

Test taking at home is different from test taking at school. Each student’s home experience is different: some have their own room, a quiet house, and good internet access; some lack all of that.

Unlike in class, home-based students with bad intentions could look up information or pass it to each other in numerous ways.

Not knowing what else to do, some schools adopted strict testing environments, such as those managed by remote-proctoring software companies like Respondus, Proctorio, and Honorlock, whose products include browser lockdowns and algorithmic student monitoring.

I don’t expect students to have the best impression of any company that produces products like these. But do a search and you’ll find a lot about the student experience with them: chiefly, that the products invade privacy, are not secure, are inaccurate, and often just don’t work.

Things went far enough with the Respondus software that there is a petition to block its implementation at one university, among many other less formal complaints.

Respondus controls for cheating in different ways. There are the lockdown features, such as preventing other browsers from opening. There are also the algorithmic features, such as monitoring eye and head movement, whether the same person remains in the video, and items in the surrounding environment.

But it’s an open question how well it works, or how to handle even small error rates on exams. You can find many online student complaints about Respondus, including:

  • “After I use it on my laptop, I always end up with corrupt files in my registry and it messes with my administrative rights to the point where I can’t shutdown/restart from the start menu.” – from a student at UT Arlington
  • “I wanted to check back on my previous questions…. went to click back on the last two questions that I didn’t answer…then the whole thing froze. The screen went completely blank. Thinking it was a connection issue, I emailed my instructor asking for an extension on the exam because it was frozen. The instructor responds after the exam was over (rolls eyes super helpful) that they couldn’t give me an extension since time was up and exams should be submitted.” – from a student at UNC Charlotte
  • As well as many ways to circumvent the Lockdown Browser – from a student at the University of Guelph

Respondus Monitor is an add-on to the Lockdown Browser. According to the company’s product materials, Respondus assesses student behavior during an exam and provides instructors with statistical information to decide, after the fact, whether there was cheating.

“The Most Powerful AI for Proctoring
“At the heart of Respondus Monitor is a powerful artificial intelligence engine, Monitor AI™, that performs a second-by-second analysis of the exam session. The first layer of Monitor AI includes advanced algorithms for facial detection, motion, and lighting to analyze the student and examination environment. The next layer uses data from the computing device (keyboard activity, mouse movements, hardware changes, etc.) to identify patterns and anomalies associated with cheating. Finally, the student’s interaction with the exam instrument itself is woven into the analysis, including question-by-question comparisons with other students who took the same exam.

“In all, Monitor AI analyzes dozens of factors, such as whether multiple faces appear within the video frame, or if the person who started the exam switches to a different person along the way. The data then flows into the “Review Priority” system to help instructors quickly evaluate the proctoring results.”

That last point seems to be key to this business. Respondus’ algorithmic recommendations must pass a human judge. That is, the final responsibility lies not with Respondus, but with the professors or staff reviewing the results. But at least in large classes, how many just take Respondus’ recommendation and move on?
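The layered scoring the product materials describe can be sketched abstractly. This is purely illustrative, not Respondus’s actual system; the signal names and weights are invented.

```python
# Purely illustrative sketch of combining per-signal flags into one
# "review priority" number for a human to triage. Signal names and
# weights are invented; this is NOT Respondus's actual algorithm.
def review_priority(flags: dict[str, float],
                    weights: dict[str, float]) -> float:
    """Weighted sum of per-signal scores, clipped to [0, 1]."""
    score = sum(weights.get(k, 0.0) * v for k, v in flags.items())
    return min(1.0, max(0.0, score))

flags = {"face_missing": 0.2, "multiple_faces": 0.0, "typing_anomaly": 0.5}
weights = {"face_missing": 0.6, "multiple_faces": 1.0, "typing_anomaly": 0.4}
priority = review_priority(flags, weights)
# The system hands the instructor a number, not a verdict.
```

The design choice matters: a score invites human judgment, but a busy instructor with hundreds of students may treat a high score as the verdict itself.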

Further, remote monitoring is only an issue because professors’ exams haven’t evolved as fast as students’ environments and the proctoring companies’ monitoring software. Exams could eliminate questions with searchable answers, or vary the question order or content for each student.
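Per-student randomization is cheap to implement. A minimal sketch, assuming a hypothetical helper seeded with the student ID so each student gets a different but reproducible order:

```python
# Hypothetical helper (not part of any proctoring product): seed a
# shuffle with the student ID so each student sees their own
# reproducible question order.
import random

def exam_order(questions: list[str], student_id: str) -> list[str]:
    rng = random.Random(student_id)  # deterministic per student
    shuffled = questions[:]          # leave the master list untouched
    rng.shuffle(shuffled)
    return shuffled

questions = ["Q1", "Q2", "Q3", "Q4", "Q5"]
alice_exam = exam_order(questions, "alice")
# Re-running for the same student reproduces the same order, so the
# exam can be regenerated for grading or an appeal.
```

With different orders per student, simply passing answers by question number stops working, reducing the need for invasive monitoring in the first place.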

Students also cannot manage their own test-taking environment as well as CEOs can manage their word choice. Students are assessed for cheating in real time, individual peculiarities may not be considered, and it’s difficult for students to counter an accusation of cheating.

Grades are a university’s currency, so a desire to identify cheating makes sense. But universities seem to have chosen a terrible way to check for cheaters. Unlike the financial industry, which can live with probabilistic accuracy, universities need to show that they are eliminating cheating outright. With imperfect detection, that means some honest students will be flagged as cheats when they aren’t.

Consider

  • Financial traders prefer the statistical edge. University administrators and professors prefer the correct answer.
  • Behavior changes depending on the system you are in. Is it an iterative game? Can you play probabilistically many times or do you have just one attempt?