We Audited Our Top 100 Remote Employees With a 'Productivity' AI. It Fired the Wrong 7 People. - Goh Ling Yong
The termination emails went out at 9:02 AM.
An AI we’d deployed to monitor ‘productivity’ had flagged seven of our top remote employees for “chronically low engagement.” It was supposed to be a triumph of data-driven management.
By 10:30 AM, I was on the phone with our biggest client, begging them not to pull a seven-figure contract after their lead architect—our best architect—was fired by a script.
Let that sink in. An algorithm designed to optimize our workforce had just decapitated it.
The $50,000 Mistake I Almost Let Ruin Our Company
I’m the CTO at a mid-sized SaaS company. Like everyone else, we went fully remote in 2020 and never fully went back. We have over 300 employees, with our core engineering, product, and design teams—about 100 people—scattered across the globe.
And for two years, the C-suite has been asking the same question: “How do we know they’re working?”
It’s a question born of fear, not malice. The old metrics of management—butts in seats, walking the floor, overhearing conversations—were gone. In their place was a void of data. So, when a buzzy AI startup promised us a solution, we listened.
They sold us a dream: an intelligent dashboard that would give us an “objective, unbiased view” of productivity. It integrated with Slack, Jira, Google Calendar, and GitHub. It tracked:
- Communication Frequency: How often you’re active on Slack.
- Ticket Velocity: How quickly you close Jira tickets.
- Code Commits: Daily and weekly commit volume.
- Calendar Density: Percentage of your day spent in meetings.
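The vendor never showed us the formula behind the dashboard, so here's a minimal sketch of how a composite score built from those four signals might look. The signal names, caps, and equal weights are my own illustrative assumptions, not their code — the point is how completely such a score rewards visible activity and nothing else.

```python
# Hypothetical sketch of a composite "productivity" score built from the four
# tracked signals. Weights, caps, and field names are illustrative assumptions;
# the vendor never published its actual formula.

from dataclasses import dataclass

@dataclass
class WeeklySignals:
    slack_messages: int       # Communication Frequency
    jira_tickets_closed: int  # Ticket Velocity
    github_commits: int       # Code Commits
    meeting_hours: float      # Calendar Density

def productivity_score(s: WeeklySignals) -> float:
    """Naive weighted sum, scaled to 0-100. Only visible activity counts."""
    raw = (
        0.25 * min(s.slack_messages, 200) / 200 +
        0.25 * min(s.jira_tickets_closed, 20) / 20 +
        0.25 * min(s.github_commits, 25) / 25 +
        0.25 * min(s.meeting_hours, 30) / 30
    )
    return round(raw * 100, 1)

# A week of deep, offline architecture work scores close to zero:
print(productivity_score(WeeklySignals(12, 0, 1, 2.5)))  # ~4.6
```

Notice there is no input for "thought about the hardest problem in the company for a week." That gap is the whole story.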
The platform cost us $50,000 a year. The pitch was seductive—it promised to spot burnout, identify hidden high-performers, and flag “disengaged” employees before they became a problem. I was skeptical, but the pressure to have some kind of metric was immense. I signed off on a 90-day pilot with our top 100 remote staff.
It was one of the worst decisions of my career.
The Day the Algorithm Went Rogue
The AI assigned each employee a “Productivity Score” from 1 to 100. We set a policy that anyone scoring below 20 for three consecutive weeks would be flagged for review. The system automatically sent that list to HR, who, following a protocol I foolishly approved, would initiate termination proceedings.
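In rough code, the rule looked something like the sketch below. The real logic lived inside the vendor's platform; this reconstruction of the policy (below 20 for three consecutive weeks, then straight to HR) is mine, but it captures the part that made the failure automatic.

```python
# Reconstruction of the review policy, not the vendor's actual code.
# Below the threshold for three consecutive weeks -> flagged -> forwarded to HR.

from typing import Dict, List

FLAG_THRESHOLD = 20
CONSECUTIVE_WEEKS = 3

def flag_for_review(weekly_scores: Dict[str, List[float]]) -> List[str]:
    """Return employees whose last three weekly scores were all below threshold."""
    flagged = []
    for employee, scores in weekly_scores.items():
        recent = scores[-CONSECUTIVE_WEEKS:]
        if len(recent) == CONSECUTIVE_WEEKS and all(s < FLAG_THRESHOLD for s in recent):
            flagged.append(employee)
    return flagged

# The fatal part of the protocol: this list went straight to HR,
# with no human review in between.
scores = {"Maria": [14, 12, 11], "David": [9, 10, 8], "Li": [18, 17, 16]}
print(flag_for_review(scores))  # ['Maria', 'David', 'Li'] -> termination queue
```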
We thought it would catch the coasters. The people gaming the system.
Instead, it targeted our assassins.
Here are three of the seven people the AI fired:
“Maria,” Principal Architect: Her productivity score was 11. Why? She had minimal Slack activity. She closed maybe one Jira ticket a month. Her GitHub commits were infrequent. But Maria was the brain behind our entire V2 platform. For the last month, she’d been offline, deep in thought, architecting a solution that would save us an estimated $1.2 million in technical debt. She wasn’t slacking; she was thinking. The single document she produced that month was more valuable than the entire team’s Jira output combined.
“David,” Senior Security Engineer: His score was 8. David lived in the command line. He rarely used our GUI-based tools. His job wasn’t closing tickets; it was preventing them from ever being written. He spent his days in quiet, proactive threat hunting, reading obscure documentation, and running penetration tests. His low “activity” was a sign he was doing his job brilliantly. The absence of noise was the signal.
“Li,” Lead Product Designer: Her score was 16. Li’s calendar was mostly empty. She famously rejected most meeting invites. Why? Because she spent her time talking to customers. She was conducting user interviews, watching session recordings, and sketching in her notebook. This "untrackable" work led to the user experience breakthrough that cut our onboarding churn by 40%.
The AI saw their lack of digital noise and concluded they were ghosts. It couldn't have been more wrong. They weren't ghosts; they were the silent pillars holding the entire structure up.
And we had just fired them. In the most impersonal, soul-crushing way imaginable.
The Autopsy: 4 Shocking Truths About AI Productivity Monitoring
After a frantic 48 hours of apologies, reinstatements, and one very expensive gift basket sent to our top client, we unplugged the AI. Then, we did a deep, painful post-mortem.
What we found should be a terrifying wake-up call for any leader banking on AI to manage their people.
1. It Confuses Activity with Achievement
This is the most dangerous flaw. The AI was a glorified busyness-detector. It rewarded people who were constantly online, bouncing between meetings, and closing trivial tickets. It created a perverse incentive to perform productivity rather than actually be productive.
It’s the digital equivalent of rewarding the person who types the fastest, not the person who writes the best story.
Your most valuable employees are often the ones who have the least “activity.” They’ve carved out space for focus. They’re not reacting to notifications all day; they’re creating value. The algorithm saw this focus as a bug, not a feature.
2. It Has a ‘Deep Work’ Blindspot
Cal Newport was right. The ability to perform “deep work”—to focus without distraction on a cognitively demanding task—is becoming increasingly rare and valuable.
Our AI was actively punishing this superpower.
Maria, our architect, was engaged in the deepest work possible. David was on a solo mission that required immense concentration. Li was synthesizing complex user feedback. All of this value creation was happening offline, in their brains.
The AI had no way to measure this. It can’t track thought. It can’t quantify insight. So, it simply ignored it. Productivity tools that can’t measure the most productive state of work are worse than useless—they’re destructive.
3. It Punishes the Mentors and Helpers
We discovered another pattern. The flagged employees were often the ones who spent their time helping others. The senior engineer who jumps on a call to unblock a junior dev. The designer who gives feedback on another team’s project.
This work is the glue that holds a remote team together. It doesn’t generate Jira tickets or GitHub commits, but it’s absolutely critical. The AI saw it as dead time. It interpreted an hour spent mentoring as an hour of “non-productive activity.”
You have people on your team whose greatest value is making everyone around them better. Are your measurement tools rewarding them or marking them as redundant?
4. The False God of “Objectivity”
We bought into the promise of an unbiased system. What a joke.
The AI wasn’t objective; it was just a mindless enforcer of a narrow, flawed definition of productivity created by its programmers. The bias wasn’t based on race or gender, but on work style. It was heavily biased toward shallow, visible, high-volume work.
It created a new kind of discrimination: a bias against thoughtfulness. A bias against experience. A bias against anyone whose job couldn’t be reduced to a series of clicks and keystrokes.
The Metric That Actually Matters (And That No AI Can Measure)
So, what’s the lesson here? That all metrics are bad? No.
The lesson is that we were measuring the wrong thing. We were obsessed with the volume of work instead of its leverage.
After this debacle, we started talking about a new metric, one that’s qualitative and requires human judgment. We call it Value Density.
Value Density isn’t about how many things you do. It’s about the impact of the few things you choose to do.
- Maria’s Value Density: One 30-page design doc that sets the company’s technical direction for the next 18 months. Immense density.
- A Junior Engineer’s Value Density: Closing 50 small bug-fix tickets in a month. Lower density per task, but still valuable in aggregate.
- David’s Value Density: Preventing a security breach that never happens. The value is almost infinite, but the measured “activity” is zero.
You can’t put a number on this with an algorithm. It requires a manager who understands the context of the work. It requires conversations. It requires trust.
How We're Fixing It: The Human-in-the-Loop Framework
We didn’t fire the AI vendor (though I was tempted). We changed how we use the tool. It's no longer a judge, jury, and executioner. It’s a simple thermometer. It provides one data point among many, and its findings are never taken at face value.
Here’s our new, non-negotiable framework for performance in a remote world:
AI as a Question-Generator, Not an Answer-Provider. The dashboard can flag low activity. But now, that flag triggers a conversation, not a termination. A manager’s job is to ask, “Hey, I see your GitHub activity is low. Are you stuck? Or are you deep in thought on the new architecture?” The data starts the conversation; it doesn’t end it.
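To make that concrete, here's a minimal sketch of the "question-generator" pattern, under my own hypothetical helper names rather than our actual tooling. The only output a low score can produce is a prompt for a manager; there is no path to HR.

```python
# Sketch of the human-in-the-loop pattern: a low score can only ever
# generate a check-in prompt for a manager, never an automated HR action.
# Names and structure here are illustrative, not our production tooling.

from dataclasses import dataclass

@dataclass
class CheckInPrompt:
    employee: str
    manager: str
    question: str

def handle_low_score(employee: str, manager: str, score: float) -> CheckInPrompt:
    """Turn a low dashboard score into a conversation starter, nothing more."""
    question = (
        f"The dashboard shows low visible activity for {employee} (score {score}). "
        "Are they stuck, or heads-down on deep work? Ask them this week."
    )
    return CheckInPrompt(employee=employee, manager=manager, question=question)

prompt = handle_low_score("Maria", "Goh", 11)
print(prompt.question)
# Deliberately no call to HR systems, no termination queue, no automated
# decision -- the output is a question, and a human answers it.
```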
Focus on Outcomes, Not Output. We’ve shifted entirely to an outcome-based model (OKRs). We don’t care how many hours you log or how many tickets you close. We care if you moved the needle on the key result you own. Did you reduce churn? Did you ship the feature? Did you solve the customer’s problem? The how is up to you.
Explicitly Protect and Reward Deep Work. We now celebrate “focus time.” We’ve instituted “No-Meeting Fridays” and actively encourage people to block off their calendars and go dark. Managers are trained to see a quiet employee not as a slacker, but as a potential genius at work.
This whole experience was humbling. It was a stark reminder that leadership can’t be automated. People are not nodes in a network to be optimized. They are complex, creative individuals whose value is often hidden in the quiet spaces the algorithms can’t reach.
We wanted a tool to help us see our people better. Instead, it made us blind. We almost let an algorithm fire our best employees because we were too busy measuring the noise to appreciate the signal.
Key Takeaways:
- Activity is not Achievement. Don't fall into the trap of rewarding busyness. Your most impactful people might look the "least busy."
- AI can't measure deep work. The most valuable cognitive tasks are invisible to current productivity monitors. Protecting this work is your competitive advantage.
- Data is not a decision-maker. Use data to ask better questions, not to provide automated answers. Human judgment is irreplaceable.
- Measure impact, not input. Shift from tracking hours and clicks to focusing on the actual outcomes that drive your business forward.
My question for you is this: How is your company measuring productivity, and are you sure it’s not accidentally punishing your most valuable people?
Let me know in the comments.