Case Study: The Messy and Arduous Reality of Workforce Upskilling for the AI Future
In December of 2023, I was absolutely convinced that we were going to have to change. My first signals did not come from our team, however, but from client work. We were deep in the guts of compliance-related work, and the inevitable copy, paste, stare, and compare cycle of analyzing housing policy information to make sense of it and turn it into software requirements. I knew there had to be a better way.
Starting Was Expensive
We decided then that we would take a small team of experienced practitioners and dedicate them to our idea about using generative artificial intelligence to automate the software development process. We didn’t know a thing about genAI then; in fact we knew less than nothing, but we really knew how to make mortgage technology. I’ve spent a 25-year career doing it, and each member of our small team was highly skilled and experienced in their profession. This was an elite team (and also very expensive).
Basically, we had no experience, a lofty ambition, and a half-baked plan. Our first endeavor was the basic learning – figuring out what we didn’t know, which was pretty much everything. I took a few classes, and we took to YouTube and ChatGPT. It’s crazy to think now, but “back then” there really weren’t a lot of baked patterns or ways of working we could leverage. Retrieval augmented generation (RAG) was out there, but it wasn’t accessible, and we didn’t really understand it. We were really proud of our first Custom GPT that Kenny Akridge figured out how to do. I remember being absolutely blown away when he explained that it was all expressed in natural language. That’s also when I knew we were really in trouble, and we were going to have to change.
Existential Dread (aka Sometimes You Eat the Bear and Sometimes the Bear Eats You)
The mushroom cloud of the inevitable AI future. Created on Midjourney
When I first captured a glimmer of what genAI could do, I knew it would change everything. That glimmer – the ability to bring ideas to life with only a natural language expression – kicked off a deep reimagining of what product development could be like. I knew then, and I still know now, that it was an “eat or get eaten” situation. We were truly either going to pivot or die.
Every single role involved in the typical (or what I now call “classic”) product development process will be altered; it’s simply a matter of time. Not to be dramatic, but it’s a little bit like the eventuality of the atomic bomb – once a thing can be done, it will be done. Once it was possible to split the atom, it was just a matter of time before a weapon that relied on that ability would be created. Now that it is possible to instantly and intelligently create all the artifacts of software development, eventually we simply won’t need people to do that. Further, there will come a time when those artifacts are no longer even necessary. The most rapidly growing development language these days is English.
In three years (maybe two, maybe five, maybe even ten, but eventually), unless we changed our business, we would be obsolete. Simple as that.
But that’s kind of inspiring, isn’t it? To be able to see the future and actually have time to act? To have this incredible chance to invent the way work will be done? The reality is that adoption of this tech will have a very long tail. We will see it happen instantly in Silicon Valley, and genAI-native companies will start to eat everyone’s lunch. In mortgage it will take longer; there is simply so much heritage technology and complexity that we won’t be able to redo it all in even ten years’ time. And then there’s federal.
So we have some time, and this “straddling” problem is one I help my clients think through. How do we capture the opportunities of the now, while preparing for the inevitable future when everything is different? I ask everyone who will listen – is your business going to be viable in three years? What do you need to do now to change your business, or create a new business that will thrive when the future arrives?
“I Feel Like a God”
Being supercharged by AI tooling. Created on Midjourney
The day I discovered “vibe coding” in Replit, I spent about 14 hours in front of my computer, creating. I built and deployed app after app. I tried things and threw them away. I fought with the developer agent, imploring it to do what I wanted. I struggled through authentication, on my own. I learned that I needed to have my database first, because adding it later created a whole host of problems. I discovered the contours of what this tech could do.
I called Tanya Brennan and introduced her to this tool. I said, in a hushed voice as if it was this incredible secret – “I feel like a god. Everything I have ever wanted to do as a product manager over the course of my entire career, I can now do.”
Yeah, so there’s that. This is a world where the limit to what we can do is simply the bounds of our creativity. In the AI future (which is happening now by the way), anything we can imagine we can make.
Oh, But the Mess
The messy reality exploding a technology professional trying to use all the half-baked tooling. Created on Midjourney
And then reality sank in. This is always my way. I get all wrapped up in the possibilities and then get brought back down to earth by the reality of scaling and productionizing. It’s ridiculously cheap to have ideas, and expensive to bring them to life in a way that scales, integrates, is secure, and adds real value. Hubris. Sigh.
The contours of this expansion of human creativity are jagged and real. Once you get in there, it explodes all over you like napalm. Want to move it to a scalable secure environment? Need to pass a SOC 2 audit? Want to put the button exactly where you want it? Want to adhere to secure development practices? The list truly goes on and on. It’s not ready, but one day it will be. One day soon.
It’s so interesting to live at the edge of this boundary. Sometimes it’s aptly called the “jagged frontier”. We still have to get things done. We have to add the value. We have semi-baked technology to help us, but we still live in a heavily regulated world. All the laws we had to comply with “in the past” are still in force. And adoption is based on the speed with which other people and organizations can adopt it.
So we adjust our approaches to work within the constraints of our reality.
What Does this Have to Do with Upskilling?
The hopeful workforce of the AI future trying to share the word at our first AI exchange.
So now the practical steps. Here is what we have done to upskill our company. It took time, and it was messy (it is messy still, we are not done), but I hope these insights will help you to upskill your workforce. And you can always ask us for help, it’s what we do.
Saw the change coming. Jensen Huang is the master of this. We had to see the change coming. We had to be willing to accept that a new reality is coming and we have to embrace it. We had to open our eyes.
Evaluated our business. We had to take an honest look at our business and understand what it would look like in the AI future. We had to be willing to accept that change is inevitable and to create valuable offerings. We had to try a bunch of different things and be ok knowing that they wouldn’t all work out. We had to be very fast, and willing to pivot. At the same time, we had to balance the need for speed with what that does to people. And we had to actually have a business to pivot.
Looked at each role. We had to really understand what we did, and what our people do. We had to ensure we had an experienced workforce that can operate with classic approaches. This is still very relevant, valuable, and necessary. We had to straddle the now and the future, and recognize that roles will change – but they won’t change all at once.
Immersed ourselves in the tech. We really got into the tech, the nitty gritty, messy tech. We had to really push the jagged frontier and understand what is possible now and under what circumstances. We build production software with this tech every day. We are builders. We are not just watchers and consumers, we put the tech into practice.
Made a commitment to our workforce. We had to understand what our message was for our people. What was our commitment to our people? From the beginning we have said that not even one of our people would lose their job because of AI. And that is still true. Not everyone will be able to say that. Only you know what you will say, what you can say.
Designed the future. We decided what the jobs would be. This was important. We didn’t wait for change to happen, we decided what would change and how. This is not a perfect science, and it continues to evolve. But we decided to eat the bear rather than let it eat us.
Struggled together. As a leadership team, we have experimented on ourselves first. We do the jobs in the new way. We model the behaviors. We embrace the change. We learned about ourselves, and the human reactions that come with such a radical change. We helped each other along. We struggled. We continue to struggle. The struggle is part of the process.
Created a real bootcamp. And the rubber really meets the road in a practical education process. This was perhaps the messiest and hardest part. The process of putting an actual upskilling program together, creating the curriculum, and actually doing the upskilling. This is incredibly time-consuming. I have personally spent hundreds and hundreds of hours on this. We are only just now at a place where this can be operationalized, and it will take us another few months to be able to operationalize at scale.
Stay current and continuously adapt. And everything is still changing. We have had to just accept that we can’t know everything. We have to live in the grey area where things are changing, and we don’t know how. The biggest way we combat this is by participating in the AI circuit. Conferences. Events. Courses. Conversations. Podcasts. We eat just about everything on the buffet, but there are still dishes we haven’t had in the kitchen that we don’t even know about.
Operationalized upskilling. This is where we are now. We have a program, we know what we need to do, we have done it with about 25% of our workforce, and now we have the rest to do. We know what we need to do for new hires, and that is already helping. Our recruiting process has changed. Our screening process has changed. We can speak with confidence about what the jobs are and how they are different. We know the skills we need, and we can teach people the skills when they don’t have them.
Whatever You Do, Just Keep Swimming
Dory continuing to swim even when it's difficult.
No matter what, keep going. Take one step today, and one step tomorrow. Create a team that focuses on workforce upskilling. This will feel like an expensive investment, but I promise, it’s necessary. It is a ton of work to upskill a workforce. And really, it’s never done. But if you don’t start, you definitely won’t finish. Be like Dory.
By Tela Mathias, Chief Nerd and Mad Scientist at PhoenixTeam and CEO of Phoenix Burst
What is uniquely human? AI impacts on the workforce.
By Tela Gallagher Mathias
We held our first ever public-private sector AI exchange yesterday, and we opened with a question – what is it that makes humans uniquely human? Answers ranged from “procreation” to “cooking”, to “empathy”. This question is relevant because, as a refresher, AI is a field of computer science focused on enabling computers to perform tasks that typically require human intelligence. Human intelligence is our ability to sense, understand, and create. So, the question of “what is humanness” matters more acutely now than ever, as these innately human qualities are what will be most valuable in the workforce in the future.
The Good News
In doing research about the historical impacts of disruptive technology on the workforce, I was very encouraged and posted about that earlier this week. The bottom line is that with every major disruption since the 1750s, including both industrial revolutions, significant numbers and types of new jobs were created. For example, in the first industrial revolution water-powered spinning machines were invented, igniting the shift to factory textile production. This created new labor classes concentrated in mills and sparked the modern wage labor system.
In a recent MIT study that evaluated 80 years of Census data, researchers found that literally 60% of the jobs we have today did not exist in 1940. Not only that, but many of these jobs wouldn’t have even made sense at that time (I’m looking at you “content creator”). For context, tattooer became a job recognized by the US Census in 1950, software engineer in 1970, conference planner in 1990, and solar photovoltaic electrician in 2018.
This is really encouraging to me but does rest somewhat on hope as a strategy. Every single time technology has disrupted our lives, jobs have been created, and we as a people have survived. Therefore, it is likely we will survive this one. I also know that overall, if gross domestic product (GDP) is a measure of prosperity, we are a wildly more prosperous country than we were. From $3B in 1790 to $23T today.
Green bars represent positive GDP growth (expansion years), red bars represent negative GDP growth.
Adjusted for inflation, U.S. output is roughly 6,400 times its 1790 level and has kept a long-run trend of about three percent real growth per year despite wars, panics, and recessions. In addition, per-capita prosperity multiplied about 30 times. Real GDP per person rose from about $2,000 (in today’s dollars) in 1820 to more than $70,000 today, illustrating how sustained innovation converts into living-standard gains when paired with education and market dynamism.
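For anyone who wants to sanity-check growth figures like these, here is a minimal sketch of the compound-growth arithmetic. It takes the multiples quoted above and backs out the constant yearly rate that would produce them. The 2024 end year is my assumption (the text only says "today"), and the function name is just for illustration. With these inputs the implied rate comes out a little under four percent per year for total output and a little under two percent per person, so treat the round numbers as approximations.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    return multiple ** (1 / years) - 1


# Multiples and start years are the ones quoted in the text.
# END_YEAR is an assumption; the text only says "today".
END_YEAR = 2024

total_output_rate = implied_annual_growth(6_400, END_YEAR - 1790)
per_person_rate = implied_annual_growth(70_000 / 2_000, END_YEAR - 1820)

print(f"Total real output: {total_output_rate:.1%} per year")
print(f"Real GDP per person: {per_person_rate:.1%} per year")
```

Small differences in the yearly rate compound into very large differences over two centuries, which is why the end year and the exact multiple matter so much in a check like this.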
What I don’t know, however, is what negative consequences this introduced to the people that were in the jobs that became obsolete. And I don’t know what happened to the young people who were attempting to enter the job markets in the time just after the disruptions.
The Bad News
We will see significant impact across virtually all major economic sectors. No matter the sugar coating by the major technology companies and the Silicon Valley attitude, jobs will be eliminated. We will continue to see job loss due to robotics. We will continue to see job loss due to generative AI in creative fields. We will continue to see job loss due to the continuous accelerated pace of automation.
Education will be very slow to adapt curriculums and learning approaches, with public education trailing far behind independent schools. The elementary school kids who are matriculating now are especially at risk of falling behind. They had their learning radically disrupted first by COVID, and they are now the last kids to be born before ChatGPT. With so much of the responsibility having to happen in the home, preparing this generation will require an even more significant investment by parents.
I think Jensen Huang puts it the best. At the Milken Institute Global Conference, Huang explained that the disruption from AI is “not simply about outright job loss through automation but about a growing divide between those who harness AI as a tool and those who do not,” highlighting the risk of inequality between the so-called “AI-skilled elite” and everyone else. Out of the global population of about eight billion, only about 30 million people are proficient in programming and advanced AI technologies. That is less than 0.4% of the global population. This small tech-fluent cohort wields disproportionate power with AI, while many others could be left behind.
So, I think the message for the current workforce is to adapt or fall behind. Many of the big tech companies at least claim that they are significantly investing in reskilling and training current workforces, given the increased reliance on machines and decreased reliance on the human workforce.
Workforce Trends for 2025
Gartner had an eclectic, if somewhat silly and self-promoting at times, perspective on future work trends in 2025.
The expertise gap intensifies as retirements surge and technology disrupts.
There is no doubt about this one. Add to it the mass exit of federal employees and we are exacerbating this phenomenon.
Organizations redesign to prepare for technological innovation.
We have definitely done this at our company, and I have advised companies on what an AI-first, or at least AI-ready, organization looks like. The job names are changing. New jobs are being created; at my company we have “value engineers” and “evaluation specialists”. And, of course, my title has changed – I call myself the chief nerd and mad scientist as that seems to be what fits what I do the best.
Nudgetech experiments bridge the widening communication gap.
This is a silly and Gartner self-promoting one for me. “Nudge theory” suggests that subtle, indirect suggestions or environmental changes can influence people's choices and behaviors without restricting their freedom of choice. For example, placing healthy foods in the pantry at kid eye level. From an employer perspective, this is the idea that we consider when a text or an email is better and when we should call. Certainly, Tanya Brennan is the master at this. I find the idea that we will have technology tell us when and how to communicate offensive, and I’m not into it.
Employees embrace bots over bosses in the pursuit of fairness.
I had to look this one up. This trend reflects a growing workplace dynamic where workers increasingly trust AI systems more than human managers, especially in areas related to fairness, transparency, and objectivity. I think the alliteration is a bit cheesy, but I can see where this is coming from. My brother, Gil Gallagher, who is the middle school director at The Field School, told me about a trend towards (as well as resistance to) algorithm-based grading. I wonder if this could be fairer than humans’ subjective evaluation. We see this in my industry as well, although there is a lot of skepticism towards probabilistic, algorithm-based decisions.
Organizations must define fraud v. fair play when it comes to AI.
This one highlights the growing need for companies to establish clear, principled boundaries between acceptable AI use and deceptive or unethical behavior. Obviously, this will play out in education. What is considered cheating now will just be the way things are done in a few years, maybe sooner. I saw this myself in thinking about corporate testing. I recently had to take a test to validate my cyber awareness – is it cheating to use ChatGPT to confirm my answers to some of the questions? Is it cheating if one employee uses genAI to help them at work, but another doesn’t?
Organizations shift focus to inclusion and belonging with unexpected benefits.
This one is interesting. If we look at traditional diversity, equity, and inclusion (DEI) efforts, and their focus on the numbers, we saw many organizations dissatisfied with the results, and, of course, the major anti-DEI backlash that is now playing out. I myself have wondered about the effectiveness of our DEI program, and the effectiveness of efforts I have provided significant financial support for in the recovery community. What if instead we focused more on how we made people feel, and less on how many of them there were? Harder to measure, certainly. Better? I think so.
AI-first organizations will destroy productivity in their search for it.
I have seen this, and, in fact, embraced it. We put a serious premium on experimenting and failing at our company. We have a team of value engineers, and we ask them to try out all the new stuff, and struggle for a while, even if this means failing a few times. We do, however, encourage asking for help. Yes, the struggle is part of the process but so is learning from those who have gone before. I am seeing this show up in the industry as using a chainsaw to cut a tissue. Not everything needs an agent. In fact, there are many, many mortgage use cases that really shouldn’t use an agent at all. Do you need high precision? Do you need complete transparency? Do you need it to work 100% of the time on 100% of the cases? Yeah, maybe not an agent right now.
Loneliness becomes a business risk, not just a well-being challenge.
This one reframes employee isolation as more than a personal concern. It is a strategic and operational liability that can erode team performance, innovation, and retention if left unaddressed. I found a study recently that indicated about 26% of employees in 2024 (I think) reported that they were happy with collaboration at work, which was DOWN from 31% in 2021 (I can’t find it now or I would link to it, I promise it exists, and this isn’t a ChatGPT hallucination). That’s fascinating. That’s worse than during COVID. I do occasionally get lonely at work, and I have a great partnership with Tom Westerlind and Tanya Brennan, and of course my teams. But when I do get lonely and there is no one around, and no one to call, it can be very isolating. Those are the times when I think about maybe going to work somewhere else, and I own the company. (Unabashed Jensen Huang superfan, just make me an offer).
Employee activism drives adoption and norms for responsible AI.
I suppose this one is about an employee uprising about use of genAI at work. For those companies that have still banned it (yes, they are out there), they will see serious dissatisfaction from growing numbers of employees. Those employees are effectively being forced by their organizations to fall behind. In this job market, no one can afford to fall behind, and this failure to get with the genAI times is going to be a real talent drag.
Preparing for the Future
So where does this leave us? I think it leads us back to what makes us uniquely human. That which makes us uniquely human is the differentiator in the AI future, which is really just the AI now. Many of the big tech companies talk about hiring not for technical skill, but for their more human talents. Microsoft, for example, has said very recently that “Microsoft still plans to hire more software engineers than it has today, but it cares more about what makes them human and less about their technical abilities.”
And what does that mean? What are those qualities? They are, and I see this validated time and time again, the ability to lead in uncertainty, creativity, judgement in difficult situations, and the ability to connect the dots. Steve Jobs once said in a now-famous commencement speech that “you can only connect the dots looking backwards”, and I have definitely found this to be true.
We make the best decisions we can, with the information we have, relying heavily on intuition and experience. We hope they are the right ones, we hope we made them at the right time, and we pray that we made them with the right people. Then, in the fullness of time, more is revealed, and we see the dots that we connected. I don’t (yet?) see that in ChatGPT. We made a brutally hard business decision in 2024 that affected a lot of people. I made it together with my partners, and it really seemed like the right one. It fundamentally changed our company, initially for the worse and in the long term for the better. I wasn’t sure it was right at the time, and only after a lot of pain, a lot of time, and a lot of new data has it been revealed unambiguously to be the right one. Those are really scary, gut-wrenching decisions, and I don’t think I would leave them to ChatGPT.
At PhoenixTeam, we live by a simple mantra: “Pivot or Die.” This bold approach drives how we tackle complex, overwhelming projects. For the mortgage industry, a directive such as HUD’s latest 251-page Mortgagee Letter (due by October) presents a massive implementation challenge – traditionally requiring months of analysis, coordination, and development. PhoenixTeam’s generative AI-powered data products turn this compliance hurdle into an opportunity, delivering results within hours instead of months.
The Challenge: Overwhelming Compliance Workloads
Regulatory changes demand rapid action. A 251-page update isn’t just lengthy – it’s dense with new rules, impacts on processes, documentation updates, and testing requirements. Compliance and business executives know that deciphering such a tome under tight deadlines strains resources and budgets. Teams must manually identify each change, assess its impact, write requirements and user stories, update test cases, and ensure nothing falls through the cracks – a monumental effort prone to error. With conventional methods, by the time you’ve parsed the entire document and assigned tasks, weeks or months have flown by and compliance risks have multiplied.
The Solution: GenAI-Powered Compliance Automation
Phoenix Burst – PhoenixTeam’s AI-driven compliance fulfillment platform – does the heavy lifting for you. It ingests complex regulatory, compliance, and policy documents (like that 251-page Mortgagee Letter) and automatically generates clear, actionable outputs in a fraction of the time. What once took an army of organizationally distributed practitioners months to accomplish, Phoenix Burst delivers within hours. The platform uses generative AI to identify each compliance change and requirement hidden in the text, producing plain-language change statements and procedure impact analyses for each one. It then transforms these into ready-to-build business requirements, user stories, acceptance criteria, and test cases – essentially an entire implementation plan at the click of a button.
Phoenix Burst doesn’t stop at documentation. It assigns each change to the appropriate team and output format for action, providing the results in a structured, audit-ready spreadsheet or other standard formats. Instead of juggling email threads and manual trackers, you get an organized, end-to-end view of the compliance update that’s ready to execute and traceable for regulators. A human is always in the loop for quality control – our platform serves up the work to your experts (or ours) for curation, helping ensure AI-generated artifacts meet your unique standards. Need extra assurance? We offer an optional review by seasoned compliance attorneys as an add-on.
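To make the idea of a structured, audit-ready output more concrete, here is a minimal sketch of what one traceable record per compliance change could look like when flattened into a spreadsheet-style file. This is an illustration only and not Phoenix Burst's actual schema or code: the class name, field names, column list, and sample values are all hypothetical, and the step where generative AI extracts the changes is left out entirely.

```python
import csv
import io
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ComplianceChange:
    """One change pulled from a regulatory document (hypothetical schema)."""
    change_id: str
    source_page: int                 # where in the source document it came from
    change_statement: str            # plain-language summary of the change
    impact_analysis: str             # procedures the change touches
    assigned_team: str
    user_stories: list = field(default_factory=list)
    test_cases: list = field(default_factory=list)
    reviewed_by: Optional[str] = None  # human-in-the-loop sign-off, if any


COLUMNS = ["change_id", "source_page", "change_statement", "impact_analysis",
           "assigned_team", "user_stories", "test_cases", "review_status"]


def to_audit_csv(changes):
    """Write one row per change so every artifact traces back to a page."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(COLUMNS)
    for change in changes:
        status = (f"reviewed by {change.reviewed_by}"
                  if change.reviewed_by else "PENDING REVIEW")
        writer.writerow([
            change.change_id,
            change.source_page,
            change.change_statement,
            change.impact_analysis,
            change.assigned_team,
            " | ".join(change.user_stories),
            " | ".join(change.test_cases),
            status,
        ])
    return buffer.getvalue()


# Made-up example row; not taken from any real Mortgagee Letter.
example = ComplianceChange(
    change_id="CHG-001",
    source_page=12,
    change_statement="Example change statement.",
    impact_analysis="Example procedure impact.",
    assigned_team="Servicing",
    user_stories=["As a servicer, I can ..."],
    test_cases=["Given ..., when ..., then ..."],
)
print(to_audit_csv([example]))
```

Keeping the source page, the assigned team, and the review status on the same row is what makes a record like this easy to trace: a reviewer can go from any user story or test case back to the page that triggered it, and can see at a glance which items still need a human sign-off.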
The Phoenix Burst Impact is Immediate
By making compliance seamless, PhoenixTeam’s genAI solution clears the way for real innovation. Your teams reclaim time to focus on strategic work instead of drudging through paperwork. In fact, one major mortgage servicer proved the power of Phoenix Burst in a pilot project – transforming a complex regulation into plain-language requirements, user stories, and test cases within 24 hours of ingestion. What used to be a labor-intensive, error-prone slog is now a swift, accurate process. Compliance becomes a catalyst for improvement rather than a bottleneck.
Key Benefits at a Glance
Months of work in hours: Automatically decomposes complex compliance documents (e.g. hundreds of pages of new regulations) into actionable components within hours – not months.
Built-in project alignment: Assigns each change to the appropriate team and provides structured, audit-ready outputs (e.g. Excel or dashboard reports) for easy tracking and oversight.
Ready-to-build outputs: Generates complete impact assessments, requirements, user stories, acceptance criteria, and test cases for your implementation teams, saving countless hours of manual drafting.
Expert oversight on demand: Includes a human-in-the-loop for validation, with an optional add-on for curated review by PhoenixTeam’s curation specialists and attorneys to insert additional human control steps.
Beyond HUD – universal application: Not just for HUD Mortgagee Letters. PhoenixTeam’s genAI platform adapts to any federal or state regulatory change – from Fannie Mae policy changes to state banking laws – making it a one-stop solution for compliance across the board.
The Competitive Edge
By cutting implementation timelines from months to hours, you stay ahead of regulatory deadlines and free your talent to drive innovation, not tedious manual effort. Phoenix Burst’s award-winning technology is transforming compliance from a costly barrier into a strategic advantage.
Ready to Transform Compliance?
It’s time to leave tedious compliance workflows in the past. Experience the PhoenixTeam difference: visit https://www.phoenixoutcomes.com/phoenix-burst to learn more about our generative AI-powered data products, or contact us to schedule a demo. Accelerate your compliance implementation, create audit-ready results, and give your teams the gift of time – all while confidently meeting regulatory demands. Let PhoenixTeam help you turn your compliance processes into an engine of progress.
The Medley of Misfits – Reflections from Day 2 at the AI Engineer World’s Fair
By Tela Mathias
I love being at events like this, I feel like I have met “my people”. We are this weird, eclectic, smart, funny, and super enthusiastic bunch of nerds. Just really nerdy. And I love it. We are just a medley of misfits. So many great things at Day 2 – but the two major highlights were at the beginning and the end. Simon Willison is a hilariously competent and compelling speaker, and definitely part of our medley of misfits. And closing the day with Greg Brockman was an absolute inspiration. The theme of yesterday was “the power of optimism”, but maybe that’s just because I’m an optimistic person.
Spark to System: Building the Open Agentic Web with Asha Sharma
Wish I knew his name, but the demo guy at Microsoft was ON POINT. He showed what you can do with GitHub Copilot and I have to say – wow. The intersection of spaces, Jira, agents, and agent task assignment really told a good story. Imagine that an agent is just another team member, logged in and working like you or me. Imagine that you could assign a task to an agent (readme file generation was the example they used), and then the agent does the work and updates the work item. Now imagine that you need a team member to do a machine learning model; yeah, you can assign that to a coworker agent too.
This definitely made me want to make sure that we are making maximum use of the full set of Microsoft capabilities at the team level. It was not, however, enough to make me move from AWS Bedrock to Azure AI Foundry. Maybe I’ll regret this decision at some point but I’m sticking for now. You’re welcome, Amazon Web Services.
State of Startups and AI 2025 with Sarah Guo
Talk about overcoming adversity, Sarah Guo is a presentation boss. None of the technology was working, AV was a hot mess, and honestly, they barely figured it out. She used the time with mastery, and I was riveted by her take.
Sarah is the founder of Conviction, an AI venture capital company. She was speaking on the state of startups in 2025 and providing practical advice. One of the things they are very interested in, and encourage founders to think about (you know how VCs love their analogies), is “Cursor for [_X_]”. In our case it would be “Cursor for Compliance”, although that’s not where we are yet, but we will be.
One of the reasons Cursor has been so successful is because it was built by engineers for engineers. And engineers know engineers. She put the cherry on top of what we have been hearing for the past six months: content is king. Knowing your customer, knowing your domain space, really building what you know for a market you know – that continues to be the moat.
Domain is king. Needs no further explanation.
Show up informed. Have a product that has an opinion. Have a product that reflects what we know, what our customers know.
Requiring a prompt is a bug, not a feature. Loved this one, and it validated what we have done. The idea that a user has to prompt the system to do what they need is a bug – the system should just do what you need. And present thoughtful outputs at the appropriate times to the right people, in an excellent UX. I mean it’s easy really.
The moat is execution. Just out execute everybody. Move fast. Continue to move fast. Get to market. (I’m here for it, sister!)
Copilots are still underrated and viable solutions. This was kind of a relief, honestly. I straddle federal, commercial mortgage, and Silicon Valley. I see so many different stages on the adoption curve, and different stages of technology delivery maturity. It is really hard to go from the AI future to the mortgage now. I struggle with what we can/should actually do with all this light speed tech, and this was a helpful sentiment.
BE IRON MAN. Think of your solution as a supercharged companion. Some things Tony Stark has to do, some things the suit does autonomously. Over time the suit does more and more and Tony does less, but also more different. Be Iron Man.
I loved the idea that building the Iron Man suit is the path of least frustration. Start with what you know, you can always make it better. Sarah Guo is a BOSS. Loved her.
2025 in LLMs so Far with Simon Willison
I had, sadly, never heard of Simon Willison. He was fall-out-of-your-chair funny. I love his personal eval, “produce an SVG of a pelican riding a bike”. This reminded me of Ethan Mollick and his “otter taking a plane ride” eval. Simon was there to talk about the past year in LLMs, but there was too much to cover, so he skinnied his scope down to the past six months.
The reason he uses the pelican riding the bike is because (a) he’s tired of the other benchmarks and has lost trust and (b) it’s a great test because it requires technical prowess in producing the SVG, the pelican has very difficult anatomical structures that are incompatible with riding a bicycle, and the bicycle seems simple but is actually a challenge for humans to illustrate due to its interesting geometry.
Some of the key points made clear in the past six months:
Local is good now.
Prices of good models have absolutely plummeted, which is a good thing for us. We will continue to see a crushing pace of releases of new models and model upgrades. The basic message here is that there was so much improvement that you really do have to pay attention.
Humorous discussion of the infamous OpenAI sycophantism bug. Evidently the source prompts that were used to fix it leaked, so you can see the actual before and after. Fascinating. That one was hilarious.
Somber noting of the Grok White Genocide horror show. Enough said there. I just can’t with Elon.
Evidently Claude 4 will “rat you out to the feds” for certain prompts and content generation. I really had no idea, but I guess it makes sense. I’m not sure how I feel about this.
Impact of AI on Consulting
This one was near and dear to my consulting roots. I had never heard of the company, but what they were talking about really resonated with me. They talked about the staffing models – traditional pyramid v. inverted pyramid (relying on junior staff to do most of the work v. relying on senior staff to do most of the work). Their hypothesis on the future for professional services is the inverted pyramid in the center, with traditional pyramids of agents at each side. This makes a lot of sense to me. Not sure I would have illustrated it this way, but intuitively, it’s the right move.
I was surprised that they did not discuss voice agents more specifically. I think the opportunity there is massive. Imagine if you could interview an entire company in, like, two hours. Yeah, voice agents. I’m here for it.
Windsurf Everywhere, Doing Everything, All at Once
I’m so glad that we pivoted away from automating software development because man, Windsurf has pretty much crushed that. This one was personal. When we started this AI journey in December of 2023, I was hell-bent on “push button, get software”. Many of our early research meetings were about this idea of automating software development. As we learned more, and really listened to our industry feedback, we realized we needed to be a lot more specific and much closer to the market – hence mortgage compliance change management, of which software development is a key part.
And that was a really good pivot. Windsurf is going to absolutely crush this space. Their vision is ridiculously bold – to be everywhere, doing everything, all at once. And I believe them when they talk about how they intend to do it. I think they will crush everyone. They certainly would have crushed my original product concept. Phew – dodged a bullet there.
Reflections from Greg Brockman, President OpenAI
Greg really gave Jensen a run for his money on being my idol and personal hero (#jensenisstillthegoat). I’m slightly embarrassed to admit I did not know him before this closing keynote. Well, I certainly won’t forget him. He was absolutely inspiring. This will be the subject of a separate article, since I’m too short on time to do it justice here.