My Journey with Claude Code and Running Llama 70b on My Mac Pro
I taught a three-day bootcamp this week, and the last activity I had planned was to create a simple agent with Python. I had only successfully done this myself that morning, between the hours of 3:00 and about 5:30 AM. It was incredibly basic, but at least it was actually agentic (it had tools and used a large language model to decide which tool to use for the job). I had gone down the ChatGPT assistants application programming interface (API) route originally, only to discover you have to bind the context you upload to the tool you want to use. Hardly agentic. I also discovered that the assistants API had been deprecated, but it doesn’t say that in the user interface, only when you work with the API. So that was an annoying bust. I lost interest in doing it with the OpenAI developer stack.
Three-day AI bootcamp for PhoenixTeam and friends.
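For the curious, the core of that agent was roughly this shape: you hand the model a list of tools and let it decide when to call one. This is an illustrative sketch, not the class code; the mortgage payment tool, model name, and question are placeholders, and it assumes the openai Python package with an OPENAI_API_KEY in the environment.

```python
# Minimal tool-calling agent sketch (illustrative, not the actual class code).
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def monthly_payment(principal: float, annual_rate: float, years: int) -> str:
    """Toy mortgage tool: fixed-rate monthly payment."""
    r = annual_rate / 12 / 100
    n = years * 12
    return f"${principal * r / (1 - (1 + r) ** -n):,.2f} per month"

tools = [{
    "type": "function",
    "function": {
        "name": "monthly_payment",
        "description": "Compute the monthly payment on a fixed-rate mortgage",
        "parameters": {
            "type": "object",
            "properties": {
                "principal": {"type": "number"},
                "annual_rate": {"type": "number"},
                "years": {"type": "integer"},
            },
            "required": ["principal", "annual_rate", "years"],
        },
    },
}]

messages = [{"role": "user", "content": "What is the payment on $400,000 at 6.5% for 30 years?"}]
first = client.chat.completions.create(model="gpt-4o-mini", messages=messages, tools=tools)
msg = first.choices[0].message

# The model decides whether a tool is needed; we run it and hand back the result.
if msg.tool_calls:
    call = msg.tool_calls[0]
    result = monthly_payment(**json.loads(call.function.arguments))
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="gpt-4o-mini", messages=messages)
    print(final.choices[0].message.content)
else:
    print(msg.content)
```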
So that was my morning before class. At that point, I switched to and clarified to Claude my specific objective:
“I am not a developer, but I do have a high-powered Mac Pro. I want to create a super simple mortgage agent that connects to the OpenAI API”.
I hadn’t yet given up on OpenAI on the backend (that came later) since it was, at the time, what I knew best. Then I went on a journey, using just the Claude UI to get there. Two and a half hours later, I had met my objective. I even managed to deploy it securely using a git repository, a .env file for handling secrets, and Streamlit. Go me!
I did have a modest heart attack when I made a rookie mistake and put the API key directly in the code and then put that code into Claude. Claude went with it for a while and ultimately terminated my session. At first, I didn’t know why, then I realized what I had done and freaked out a little bit. But no worries, I then learned how to revoke API keys. Honestly, I went on a little revoking mission at that point because “Only the Paranoid Survive” (Andrew Grove superfan).
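The fix for that rookie mistake is exactly the .env setup mentioned above: keep the key out of the code and out of the repository entirely. Here is a minimal sketch of what that looks like with Streamlit (assuming the python-dotenv, streamlit, and openai packages; the file and model names are placeholders):

```python
# app.py - a minimal sketch of the secrets setup, not the actual deployed app.
# The key lives in a .env file (listed in .gitignore), never in the code:
#   OPENAI_API_KEY=sk-...
import os

import streamlit as st
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()  # pulls OPENAI_API_KEY from .env into the environment
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

st.title("Simple mortgage agent")
question = st.text_input("Ask a mortgage question")
if question:
    reply = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    st.write(reply.choices[0].message.content)
```

Run it with `streamlit run app.py`, and the key never touches the repository or the chat window.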
I felt confident that if anyone got this far in the class I was teaching, they would be able to create a similar app if they were smart and had hustle. A few got this far but elected not to try. We spent a lot of the day building applications in Replit, and those three students elected to build OpenAI integrations instead of building the Python agent. Totally get it, no judgment here. We built so many applications and added so many users that Replit reached out to me to see if I wanted to have a conversation about what we were doing.
The ridiculous machine I bought to run llama 70b.
But now I’m on a mission to get a much more tactile understanding of agents. I originally bought this ridiculously powerful machine so I could run at least llama 70b, which slowed to a crawl on my previous and now dead laptop. I decided to figure it out. I also decided I had hit the limit of what I could do in Replit and on my own and wanted to try Claude Code.
Claude Code - easier said than done.
It took me a while to get Claude Code installed, even though I already had a bunch of the prerequisites from my earlier agent journey. It was not completely trivial for a non-developer.
Then I had to figure out how to use it. What I find is that, unlike Replit, I have to use Claude and Claude Code together; Claude Code alone isn’t enough. However, the granularity of control I have is far superior to Replit. No question. Once I get the hang of this, I will absolutely be able to build far better apps faster and with less frustration than in Replit. It is not for everyone though. It takes a particular type to really want to get in there and work with Claude Code. Many hours of super step-by-step fumbles and repeats. But I’m hooked.
My first few apps were… meh. And they were slow. Which didn’t make sense. I found myself with a need for speed. I went on a journey to first optimize my llama install with Ollama. Ultimately, I was able to achieve pretty good performance, at least according to Claude.
According to Claude, I am running as fast as I can. I'm not convinced.
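If you want to check that claim yourself, Ollama's local REST API reports token counts that you can turn into a rough tokens-per-second number. A quick sketch (the model tag and prompt are assumptions; adjust for whatever you have pulled locally):

```python
# Rough throughput check against a local Ollama server (illustrative only).
import time

import requests

payload = {
    "model": "llama3.1:70b",  # whichever 70b tag is pulled locally
    "prompt": "Summarize the fixed-rate mortgage payment formula in one sentence.",
    "stream": False,
}

start = time.time()
resp = requests.post("http://localhost:11434/api/generate", json=payload, timeout=600)
resp.raise_for_status()
data = resp.json()
elapsed = time.time() - start

tokens = data.get("eval_count", 0)  # tokens generated, as reported by Ollama
print(data["response"])
print(f"{tokens} tokens in {elapsed:.1f}s -> {tokens / elapsed:.1f} tok/s")
```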
Now I wanted something really useful, and a steppingstone to the agent I have in mind. My first application was a successful failure. I wanted a retrieval augmented generation (RAG) based solution that would let me search my active documents folder on OneDrive and engage with it conversationally. Yeah, that idea was not good. Too slow, not enough space for a bigger model, and too many files to work with. If only Microsoft Copilot were even semi-useful. I appreciated Claude following up with me to let me know that unfortunately I could not run a 256b parameter model on my machine. I am maxed out at 70b. First world problems indeed. I decided to create just a simple ragbot, but one that was blazing fast.
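A ragbot at its simplest is only a few moving parts: embed some chunks, find the one closest to the question, and stuff it into the prompt. Below is a bare-bones sketch against a local Ollama server; the model tags, chunks, and single-chunk retrieval are placeholders rather than the real thing.

```python
# Bare-bones local RAG sketch (illustrative): embed chunks, retrieve the best
# match by cosine similarity, and ground the model's answer in it.
import numpy as np
import requests

OLLAMA = "http://localhost:11434"

def embed(text: str) -> np.ndarray:
    r = requests.post(f"{OLLAMA}/api/embeddings",
                      json={"model": "nomic-embed-text", "prompt": text})
    r.raise_for_status()
    return np.array(r.json()["embedding"])

# Start small: a handful of short chunks, not an entire seller/servicer guide.
chunks = [
    "Borrowers must document two years of employment history.",
    "The guaranteed loan program serves rural borrowers.",
    "Escrow accounts cover property taxes and insurance.",
]
index = np.stack([embed(c) for c in chunks])

def ask(question: str) -> str:
    q = embed(question)
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = chunks[int(scores.argmax())]  # best-matching chunk
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    r = requests.post(f"{OLLAMA}/api/generate",
                      json={"model": "llama3.1:70b", "prompt": prompt, "stream": False})
    return r.json()["response"]

print(ask("Who does the guaranteed loan program serve?"))
```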
So many steps for optimization, so little time.
That’s where I was for a few hours. I have at least gotten my front end to deploy, and I can upload documents. Now I’m debugging. I seem to have a corrupted database, and my first test was entirely too big (I see you, entire Fannie Mae seller/servicer guide). That was a mistake; I should have started much smaller. Such a rookie. Good thing this is just a hobby and I have a very accomplished team that knows what they are doing most of the time.
Still can't get it to run but I'm closer.
And that’s part 1 of my frustrating-but-still-fun journey with my Mac Pro and llama 70b. I feel semi-but-not-too-uncomfortable with Claude Code, I feel an OK level of comfort with the terminal, I feel not-at-all comfortable with Docker, I have learned a ton, and the next time I run this class, it is DEFINITELY on the agenda.
Part 2 - Success!
This post blew up a little bit, so I thought I would update on my progress. I have achieved success! I was able to get my ragbot up and running. It's blazing fast, and I even added a capability to engage with the web pages of the 50 state attorneys general to look for updates, store them, and engage with them. Go me! I then moved on to install and deploy Stable Diffusion for image gen, which proved a bit difficult, but I had success there too. Thanks for all the engagement.
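For the update-checking piece, the simplest version is just fetch, hash, and compare. A sketch of that idea (the URL, file name, and single-state list are placeholders; the real list covers all 50 attorneys general):

```python
# Sketch of a page-change check for the AG monitoring feature (illustrative).
import hashlib
import json
import pathlib

import requests

STATE_FILE = pathlib.Path("ag_hashes.json")
PAGES = {
    "VA": "https://www.oag.state.va.us/",  # placeholder; the real list has all 50
}

stored = json.loads(STATE_FILE.read_text()) if STATE_FILE.exists() else {}

for state, url in PAGES.items():
    html = requests.get(url, timeout=30).text
    digest = hashlib.sha256(html.encode()).hexdigest()
    if stored.get(state) != digest:
        print(f"{state}: page changed, store the new content and re-index it")
        stored[state] = digest

STATE_FILE.write_text(json.dumps(stored, indent=2))
```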
By Tela G. Mathias, Chief Nerd and Mad Scientist at PhoenixTeam, CEO at Phoenix Burst
Tela Mathias recognized as a 2025 HousingWire Vanguard
Congratulations to Tela Mathias, PhoenixTeam's Chief Nerd and Mad Scientist & CEO of Phoenix Burst, on being recognized as a HousingWire Vanguard for the second year in a row! The Vanguards honor housing leaders whose vision and expertise are moving markets forward, and Tela is doing exactly that.
Tela is leading the charge in transforming the mortgage technology landscape through innovation, education, and purpose-driven leadership. As the visionary behind Phoenix Burst, a first-of-its-kind generative AI solution, she has transformed how the industry approaches regulatory compliance, streamlining workflows that once spanned weeks into a single day.
Tela is shaping the workforce of the future by pioneering initiatives in GenAI education and dialogue. She launched the AI Mortgage Professional Courses in partnership with MBA Education, equipping mortgage professionals to apply AI practically within their businesses. As a GenAI thought leader, she speaks at industry conferences to share expert insight into the technology’s capabilities and potential, and she leads PhoenixTeam’s public and private sector AI Exchange event, which fosters inclusive conversations about AI’s workforce implications and ensures transformation happens with people, not just to them. Through these efforts, Tela continues to set the standard for leadership and innovation in the housing ecosystem.
Being named a HousingWire Vanguard is a testament to her passion for driving innovation that is accessible, transformative, and deeply human.
PhoenixTeam has engaged with about 150 different lenders, servicers, mortgage vendors, government sponsored enterprises (GSEs), and all three federal housing agencies about genAI in mortgage. Across more than 300 individual interactions, we have a broad perspective on how the industry is using genAI today.
Routine adoption of assisted code generation. Various tools are in use, with extensive use of GitHub Copilot. With genAI-native applications, as much as 90% of the code created is generated by genAI. In mortgage broadly, we see a modest 20% efficiency gain with this use, primarily due to legacy code bases and the significant specialization required to work with them.
Routine adoption of internal knowledge augmentation supportbots, using retrieval augmented generation (RAG). This is a technical approach where companies take confidential (but generally not consumer) data and enrich large language model (LLM) requests with this context, improving the quality and relevancy of results. This largely eliminates the need to “hunt and peck” or perform traditional search functions on internal documentation. Instead, users can engage conversationally with these materials.
Routine adoption of Microsoft Copilot, with lackluster results. Enterprises are leaning into this strategy and are generally disappointed by adoption and results. Even Gartner has been really tough on this one. It costs about $350 a year per person for Copilot in return for about $350 of value. This is classified as a “return on employee” (ROE) rather than a return on investment (ROI), as this is typically a cost-neutral use case and not at all differentiating.
Routine failure of genAI pilots as organizations attempt to move from proof of value (POV) to scaling across the enterprise. Many genAI pilots and implementations are failing, primarily due to a lack of focus on error analysis and the expense associated with achieving and evidencing accurate results. “Evals” – short for genAI evaluations; see the sketch after this list – will soon be the new service level agreement (SLA) standard for genAI solutions, and pricing will ultimately move to outcome-based rather than seat, enterprise, or widget-based.
Increasing adoption of call summarization in the call center. This will be routine within the next six to 12 months. GenAI-based solutions will replace prevailing approaches like those offered by typical contact center technology companies. These companies tend to move too slowly, and their “add-ons” are clunky and expensive. They will become obsolete as genAI-native solutions penetrate the market.
Increasing adoption of genAI-based sentiment analysis in the call center. This will ultimately replace traditional technologies, which are notoriously unreliable, lag behind real-time availability, and are very expensive.
Emerging use of voice agents in the call center. This is getting a lot of traction, with some voice agent companies in mortgage servicing gaining more than 100 customers in the past year. These are very good technologies, with voices that are indistinguishable from humans and capable of a range of conversational and dynamic strategies.
Nascent use of genAI-based solutions for optical character recognition (OCR). These technologies use vision models instead of traditional machine learning to classify, understand, and extract data from documents. They are virtually always more accurate but are currently too expensive to compete with offshore error resolution of traditional machine learning results. Reliably getting data off documents at low cost, and deriving meaning from large image and portable document format (PDF) documents, continues to be a very challenging problem without consistently “easy” or reliable solutions.
Nascent use of genAI agents in underwriting support, automating tasks in loan origination and servicing systems. This is a very high-risk area, with many different vendors attempting to automate the underwriting process generatively. We are not aware of any company or vendor that has succeeded with a real, enterprise-scalable solution in this area. Many are trying. The end state of this, for now, will be mortgage loan originator support, not customer-facing unassisted decisions. The same goes for underwriting loan modifications.
Use of truly agentic genAI solutions is extremely limited. The deployments are fragile and prone to failure. The most mature places for agentic AI are in the call center and in development assistance. This is what we see in Silicon Valley as well. There is significant investment in agentic AI. For now, however, the places where it can be completely unassisted are limited. This will be an enormous focus in the next one to two years.
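At its simplest, an eval is a scored test set run against a genAI solution on a regular basis. The sketch below is a toy illustration of the idea only; the gold set, exact-substring scorer, and ask() callable are placeholders, and production evals use task-specific graders and much larger question sets.

```python
# Toy eval harness: run a gold question set through a model and score answers.
from typing import Callable

gold_set = [
    {"question": "What does LTV stand for?", "answer": "loan-to-value"},
    {"question": "What does DTI stand for?", "answer": "debt-to-income"},
]

def naive_score(model_answer: str, gold_answer: str) -> bool:
    # Exact-substring check; real graders are usually rubric- or model-based.
    return gold_answer.lower() in model_answer.lower()

def run_eval(ask: Callable[[str], str]) -> float:
    correct = sum(naive_score(ask(item["question"]), item["answer"]) for item in gold_set)
    return correct / len(gold_set)

# Plug in any callable that maps a question to an answer string.
accuracy = run_eval(lambda q: "LTV is the loan-to-value ratio." if "LTV" in q else "Not sure.")
print(f"accuracy: {accuracy:.0%}")
```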
Zooming out from the technical adoption, we see that many organizations have “use case lists”, an important first step. Unfortunately, the truly transformational uses of genAI are not found on these lists. They require deep mortgage subject matter expertise, a tempered approach to innovation, and significant investment by the organization. The industry continues to be very confused about the difference between AI and genAI, which is a major barrier to actually adopting these technologies in a way that is valuable. We continue to see significant implementations of “fear-of-missing-out” (FOMO) point solutions that fail to produce a real return.
By Tela Mathias, Chief Nerd and Mad Scientist, PhoenixTeam and CEO of Phoenix Burst
ARLINGTON, VA, UNITED STATES, August 18, 2025 -- PhoenixTeam, a mortgage and financial services technology services company, was awarded a $49 million contract by the U.S. Department of Agriculture (USDA) Rural Development (RD) to modernize the Guaranteed Underwriting System (GUS 2.0). The modernization expands access to affordable housing for farmers and rural families and drives broader industry adoption of USDA’s guaranteed loan program.
With this modernization, lenders will soon be able to automatically upload the loan application—a capability long available through Fannie Mae and Freddie Mac. By aligning USDA’s process with these industry standards, PhoenixTeam’s modernization makes it easier for lenders to offer USDA guaranteed loans and expands financing options for rural borrowers. This transformation directly supports USDA’s mission to strengthen rural prosperity and increase access to homeownership in underserved communities.
“This award is about more than technology—it’s about access to homeownership,” said Tanya Brennan, CEO of PhoenixTeam. “By modernizing USDA’s Guaranteed Underwriting System, we make it easier for lenders to deliver this product, which is often the only path to homeownership for rural families.”
PhoenixTeam will partner with USDA Rural Development to deliver a modern, user-focused solution that reduces friction for lenders and expands access to homeownership for rural borrowers.
About PhoenixTeam
PhoenixTeam is a woman-owned technology services firm headquartered in Arlington, Virginia, specializing in AI-powered mortgage operations and technology services for the mortgage and financial services industries and federal housing agencies. Our mission is to enable affordable and accessible homeownership for all Americans through innovative, customer-centric technology. With a strong focus on generative AI, we tackle complex industry challenges, equipping businesses with cutting-edge tools that enhance innovation, efficiency, and compliance. By bridging the gap between technology and business teams, we strive to bring joy and purpose back to software development, making a meaningful impact in the lives of our clients and homeowners everywhere. For more information, please visit www.phoenixoutcomes.com.