# AI coding tool landscape in July 2025 with Tim + David
# Summary
In this conversation, Tim Abell and David Sheardown explore the challenges and innovations in productivity tools and AI coding assistants and the overwhelming landscape of AI tools available for software development.
The dialogue delves into the nuances of using AI in coding, the potential of multi-agent systems, and the importance of context in achieving optimal results.
They also touch on the future of AI in automation and the implications of emerging technologies.
# Takeaways
- AI is reshaping the workplace, requiring adaptation from professionals.
- Understanding engineering problems requires a structured approach.
- AI coding tools are rapidly evolving and can enhance productivity.
- Providing clear context improves AI coding results.
- Multi-agent systems can coordinate tasks effectively.
- The landscape of AI tools is overwhelming but offers opportunities.
- Understanding the limitations of AI tools is crucial for effective use.
- Innovations in AI are making automation more accessible.
- It's important to balance AI use with traditional coding skills.
- The future of AI in software development is promising but requires careful navigation.
# Full details
In this episode of Software Should Be Free, Tim Abell and David Sheardown delve into the rapidly evolving landscape of AI-powered coding assistants. They share hands-on experiences with various AI coding tools and models, discuss best practices (like providing clear project context vs. “vibe coding”), and outline a mental model to categorize these tools. Below are key highlights with timestamps, followed by a comprehensive list of resources mentioned.
## Episode Highlights
- 00:05 – Introduction: Tim expresses feeling overwhelmed by the proliferation of AI coding tools. As a tech lead and coder, he’s been trying to keep up with the hype versus reality. The discussion is set to compare notes on different tools they’ve each tried and to map out the current AI coding assistant landscape.
- 01:50 – Tools Tried and Initial Impressions: David shares his journey starting with Microsoft-centric tools. His go-to has been GitHub Copilot (integrated in VS Code/Visual Studio), which now leverages various models (including OpenAI and Anthropic). He has also experimented with several alternatives: Claude Code (Anthropic’s CLI agentic coder), OpenAI’s Codex CLI (an official terminal-based coding agent by OpenAI), Google’s Gemini CLI (an open-source command-line AI agent giving access to Google’s Gemini model), and Manus (a recently introduced autonomous AI coding agent). These tools all aim to boost developer productivity, but results have been mixed – for example, Tim tried the Windsurf editor (an AI-powered IDE) using an Anthropic Claude model (“Claude 3.5 Sonnet”) and found it useful but “nowhere near 10×” productivity improvement as some LinkedIn influencers claimed. The community’s take on these tools is highly polarized, with skeptics calling it hype and enthusiasts claiming dramatic gains.
- 04:39 – Importance of Context (Prompt Engineering vs “Vibe Coding”): A major theme is providing clear requirements and context to the AI. David found that all these coding platforms (whether GUI IDE like Windsurf or Cursor, or CLI tools like Claude Code and Codex) allow you to supply custom instructions and project docs (often via Markdown) – essentially like giving the AI a spec. When he attempted building new apps, he had much more success by writing a detailed PRD (Product Requirements Document) and feeding it to the AI assistant. For instance, he gave the same spec (tech stack, features, and constraints) to Claude Code, OpenAI’s Codex CLI, and Gemini CLI, and each generated a reasonable project scaffold in minutes. All stuck to the specified frameworks and even obeyed instructions like “don’t add extra packages unless approved.” This underscores that if you prompt these tools with structured context (analogous to good old-fashioned requirements documents), they perform markedly better. David mentions that Amazon’s new AI IDE, Kiro (introduced recently as a spec-driven development tool) embraces this “context-first” approach – aiming to eliminate one-shot “vibe coding” chaos by having the AI plan from a spec before writing code. He notes that using top-tier models (Anthropic’s Claude “Opus 4” was referenced as an example, available only in an expensive plan) can further improve adherence to instructions, but even smaller models do decently if guided well.
- 07:03 – Community Reactions: The conversation touches on the culture around these tools. There’s acknowledgment of toxicity in some online discussions – e.g. seasoned engineers scoffing at newcomers using AI (“non-engineers” doing vibe coding). Tim and David distance themselves from gatekeeping attitudes; their stance is that anyone interested in the tech should be encouraged, while just being mindful of pitfalls (like code quality, security, or privacy issues when using AI). They see value in exploring all levels of AI assistance, provided one remains pragmatic about what works and stays cautious about sensitive data.
- 29:57 – Models + 4 Levels of AI Coding Tool: Tim introduces a mental model to frame the AI coding assistant ecosystem (around 29:57). The idea is to separate the foundational models from the tools built on top, and to classify those tools into four levels of increasing capability:
- Underlying Models: First, there are the core large language models themselves – e.g. OpenAI’s GPT-4, Anthropic’s Claude family (from the fast “Sonnet” variants to the heavier “Opus” variants), Google’s Gemini model, as well as open-source local models. These are the engines that power everything else, but interacting with raw models isn’t the whole story.
- Level 1 – Basic Chat Interface: Tools where you interact via a simple chat UI (text in/out) with no direct integration into your coding environment. ChatGPT in the browser, or voice assistants that can produce code snippets on request, fall here. They can write code based on prompts, but you have to copy-paste results – the AI isn’t tied into your files or IDE.
- Level 2 – Agentic IDE/CLI Assistants: Tools that deeply integrate with your development environment, able to edit files and execute commands. This includes AI-augmented IDEs and editors like Windsurf Editor (a standalone AI-native IDE) and Cursor (AI-assisted code editor), as well as command-line agents that can manipulate your project (like the CLI versions of Claude Code, OpenAI Codex, or Gemini CLI). At this level, the AI can read your project files, make changes, create new files, run build/test commands, etc., acting almost like a pair programmer who can use the keyboard and terminal. (For example, Windsurf’s “Cascade” agent mode and Cursor’s agent mode allow multi-file edits and running shell commands automatically.)
- Level 3 – Enhanced Context and Memory: Tools or techniques focused on feeding the model more project knowledge and context (sometimes dubbed “context engineering”). The idea is to improve the AI’s understanding of your codebase by supplying documentation, requirements, or summaries in a structured way. For instance, some setups use special files (like a Claude.md or project brief) or memory windows to inject relevant information into the prompt. Tim mentions he experimented with Windsurf’s Memories feature (which lets you pin important notes for the AI), and techniques like giving the AI an architecture overview so it knows, for example, “all parsing should happen in Parser class, not in the Executor.” In theory, this level yields better coherence and adherence to architecture by giving the model a persistent knowledge base. In practice, Tim admits his results with memory/context features have been hit-or-miss so far – though some users report great success when this is done right (e.g. Anthropic’s Claude is known for handling large context windows). Better context management is seen as an area where more refinement is needed, but tools are emerging to help (even the base models are evolving to handle 100K+ tokens).
- Level 4 – Multi-Agent Orchestration: The cutting edge is having multiple AI agents with specialized roles collaborating on tasks – essentially an “AI team.” This might involve one agent acting as a code writer, another as a tester, another as a project manager, etc., coordinating via a framework. Tim notes that this space is just beginning to be explored, and it’s hard to know how much of it is hype vs. real productivity gain. Nevertheless, they mention a few examples: AutoGen Studio (a Microsoft open-source tool for prototyping multi-agent workflows), a terminal tool called Claude Squad (which can run multiple Claude or other agent instances in parallel sessions), and the LangChain framework (commonly used to chain together LLMs and tools, often cited for agent coordination). These solutions aim to let agents divide-and-conquer coding tasks or feedback on each other’s work. Tim hasn’t personally “gone to level 4” in his workflow yet – it’s a bit uncharted – but it’s an area to watch as some on social media claim big wins by letting agents handle entire projects.
- 37:05 – Rapid Evolution of Copilot and IDEs: The discussion returns to GitHub Copilot, noting how much it has improved from its early days. Originally Copilot was simple autocomplete on a single model; now it has a Chat mode and an agent mode that behaves more like Cursor/Windsurf. In VS Code, Copilot can now browse the project, edit multiple files, and follow high-level instructions, not just make line-by-line suggestions. David mentions that in VS Code you can even choose between multiple underlying models for Copilot (e.g. OpenAI GPT or Anthropic Claude models), and Visual Studio is catching up as well. An extension called AI Toolkit in VS Code further allows power users to play with many models side by side (including hooking up local LLMs like Meta’s Code Llama via quantized versions, if you have the hardware – though David’s attempt with a smaller quantized model showed its limitations). Essentially, the gap between official tools like Copilot and third-party ones is narrowing as features converge. They joke about how fast everything moves – “one month” in AI feels huge – features like agent modes that started in Windsurf/Cursor quickly made it into mainstream tools.
- 58:00 – Agents Taking Actions: By the late stage of the episode, they marvel at how far “agentic” abilities have come. One anecdote: tools now exist (e.g. some VS Code agents or prototype Copilot features) that can autonomously browse the web or perform actions on your behalf, given permission. For example, an agent could log into your Salesforce account or call an external API to fetch data needed for coding a feature – basically doing the boring data gathering or repetitive setup tasks for you. This blurs the line between coding assistant and general AI agent. It’s powerful but a bit unnerving – you have to really trust an AI to let it, say, place orders online or manipulate production data! The hosts recognize real utility in offloading mundane tasks to AI, as long as there are safeguards and oversight.
- 1:00:00 – Conclusion: Tim and David wrap up by acknowledging how many different tools and models they’ve mentioned (indeed, a dizzying number!). This explosion of options is “not even scratching the surface” – new entrants seem to pop up every week. It’s challenging for developers to know which tools will last or prove truly useful, and which are just hype. Their advice is to keep exploring and sharing notes. They anticipate the landscape will look very different even 3–6 months from now, so a follow-up discussion will be needed. In the end, they express cautious optimism: staying on top of AI coding tools is effortful but likely worth it, as these assistants could meaningfully improve developer productivity if used wisely.
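The “context-first” approach discussed above – writing a PRD and house rules once, then handing the same spec to each assistant – amounts to structured prompt assembly. Here is a minimal, illustrative sketch of that idea; `build_context_prompt` and the file names are hypothetical, not any tool’s actual API:

```python
import tempfile
from pathlib import Path


def build_context_prompt(prd_path: str, task: str, rules: list[str]) -> str:
    """Assemble one prompt from a PRD document, house rules, and a task,
    so the model sees the full spec before the request."""
    prd = Path(prd_path).read_text(encoding="utf-8")
    rule_lines = "\n".join(f"- {r}" for r in rules)
    return (
        "You are a coding assistant. Follow the spec exactly.\n\n"
        f"## Product Requirements\n{prd}\n\n"
        f"## Rules\n{rule_lines}\n\n"
        f"## Task\n{task}\n"
    )


# Example: a tiny PRD on disk, then the combined prompt built from it.
with tempfile.TemporaryDirectory() as tmp:
    prd_file = Path(tmp) / "PRD.md"
    prd_file.write_text(
        "Stack: Python 3.12 + FastAPI.\nFeature: /health endpoint returning JSON.",
        encoding="utf-8",
    )
    prompt = build_context_prompt(
        str(prd_file),
        task="Scaffold the project.",
        rules=["Don't add extra packages unless approved."],
    )

print(prompt)
```

The same assembled text can then be pasted into a chat UI or passed to a CLI agent, which is why one good spec transfers so well between Claude Code, Codex CLI, and Gemini CLI.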
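The Level 4 idea of role-specialised agents handing work to each other can be illustrated as a toy pipeline. This is a sketch only – the lambdas are stubs standing in for real model calls, and `Agent`/`run_pipeline` are hypothetical names, not the API of AutoGen, CrewAI, or any other framework:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Agent:
    """One role in the pipeline; `act` maps the current artifact to a new one."""
    name: str
    act: Callable[[str], str]


def run_pipeline(agents: list[Agent], task: str, rounds: int = 1) -> str:
    """Pass the artifact through each agent in role order, for a few rounds."""
    artifact = task
    for _ in range(rounds):
        for agent in agents:
            artifact = agent.act(artifact)
    return artifact


# Stub roles standing in for real LLM calls:
writer = Agent("writer", lambda t: t + "\n[writer: code drafted]")
tester = Agent("tester", lambda t: t + "\n[tester: tests written and passing]")
result = run_pipeline([writer, tester], "Build a TODO API")
print(result)
```

Real frameworks add the hard parts on top of this loop – shared memory, feedback between roles, and stopping criteria – which is where the hype-versus-reality question from the episode lives.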
# Resources Mentioned
## 🤖 General LLMs and Interfaces
- ChatGPT (OpenAI) – Web-based chat interface for GPT-3.5/GPT-4.
👉 https://chat.openai.com
ℹ️ https://openai.com/chatgpt
- Claude (Anthropic) – Claude 3 models including Opus, Sonnet, Haiku. Used in Claude Code and Windsurf.
👉 https://claude.ai
ℹ️ https://www.anthropic.com/index/claude-3
- Claude API (Anthropic Developer Hub) – Documentation and access to the Claude API.
👉 https://docs.anthropic.com/claude
- Gemini (Google DeepMind) – Google's LLM family (Gemini 1.5 Pro, Flash, etc.).
👉 https://deepmind.google/technologies/gemini/
- Gemini CLI – Command-line tool to interact with Gemini models.
👉 https://github.com/google-gemini/gemini-cli
## 🧠 IDE & Coding Tools
- GitHub Copilot – AI pair programmer for Visual Studio, VS Code, Neovim.
👉 https://github.com/features/copilot
ℹ️ https://docs.github.com/en/copilot
- Copilot Chat in VS Code – AI chat integrated with file editing and IDE actions.
👉 https://code.visualstudio.com/blogs/2023/07/20/copilot-chat-preview
- Cursor Editor – AI-native code editor built on VS Code with Claude/GPT support.
👉 https://www.cursor.sh
ℹ️ https://github.com/getcursor/cursor
- Windsurf Editor – Standalone AI-native IDE with agent mode (aka "Cascade").
👉 https://windsurf.com
ℹ️ https://github.com/windsurf-ai/windsurf (may be outdated or archived)
- Claude Code – Anthropic's CLI tool for agent-style code generation/editing.
👉 https://github.com/anthropics/claude-code
- OpenAI Codex CLI – OpenAI's terminal-based coding agent.
👉 https://github.com/openai/codex
(Note: distinct from the original Codex models, which were deprecated in favor of GPT-4-era models.)
- Manus – Autonomous AI coding agent launched by the Monica team.
👉 https://manus.im
- Kiro (AWS) – Amazon's spec-first AI development IDE.
👉 https://kiro.dev
## 🧩 Multi-Agent Systems & Context Engineering
- AutoGen Studio (Microsoft Research) – Visual builder for orchestrating LLM agents.
👉 https://www.microsoft.com/en-us/research/blog/introducing-autogen-studio-a-low-code-interface-for-building-multi-agent-workflows/
👉 https://github.com/microsoft/autogen
- LangChain – LLM app framework with support for agent orchestration, memory, and tools.
👉 https://www.langchain.com
👉 https://github.com/langchain-ai/langchain
- LangGraph – LangChain's library for agent workflows using state machines.
👉 https://github.com/langchain-ai/langgraph
- CrewAI – Agent framework for defining autonomous agents with roles and tasks.
👉 https://www.crewai.com
👉 https://github.com/joaomdmoura/crewAI
- Claude Squad – Terminal multiplexer for running multiple Claude (or other) agent sessions in parallel.
👉 https://github.com/smtg-ai/claude-squad
- n8n – Automation tool with visual workflows, now supporting AI agents.
👉 https://n8n.io
👉 https://github.com/n8n-io/n8n
- AI Toolkit for VS Code – Extension to run and compare multiple AI models in the editor. (May require additional setup for local models.)
👉 https://marketplace.visualstudio.com/items?itemName=ms-windowsai.ai-studio-vscode
- Qwen / Code Llama / Ollama – options for running local LLMs.
👉 https://ollama.com (Ollama: run LLMs locally)
👉 https://github.com/facebookresearch/codellama
👉 https://huggingface.co/TheBloke (many quantized model builds)