Commercial Release AI News & Updates
Microsoft Enhances Copilot with Web Browsing, Action Capabilities, and Improved Memory
Microsoft has significantly upgraded its Copilot AI assistant with new capabilities including performing actions on websites, remembering user preferences, analyzing real-time video, and creating podcast-like content summaries. These features, similar to those offered by competitors like OpenAI's Operator and Google's Gemini, allow Copilot to complete tasks such as booking tickets and making reservations across partner websites.
Skynet Chance (+0.05%): Copilot's new ability to take autonomous actions on websites, analyze visual information, and maintain persistent memory of user data represents a significant expansion of AI agency that increases potential for unintended consequences in automated systems.
Skynet Date (-1 days): The rapid commercialization of autonomous AI capabilities that can take real-world actions with limited oversight accelerates the timeline for potential AI control issues as these systems become more integrated into daily digital activities.
AGI Progress (+0.04%): The integration of autonomous web actions, multimodal understanding, memory persistence, and environmental awareness represents meaningful progress toward more general AI capabilities that can understand and interact with diverse aspects of the digital world.
AGI Date (-1 days): Microsoft's aggressive push to match and exceed competitor capabilities suggests major tech companies are accelerating AI agent development faster than expected, potentially bringing forward the timeline for systems with AGI-like functionality in specific domains.
Cognition Introduces Affordable Pay-as-you-go Plan for Devin AI Coding Assistant
Cognition has launched a new entry-level pricing plan for its autonomous coding tool Devin, starting at $20 with a pay-as-you-go structure after initial credits are used. The company claims Devin 2.0 is significantly improved from its December release, now featuring project planning capabilities and better documentation features, though independent evaluations suggest it still struggles with complex coding tasks.
Skynet Chance (+0.01%): Devin's autonomous coding capabilities represent incremental progress in AI agency, but its documented limitations with complex tasks and high failure rate (completing only 3 out of 20 tasks in one evaluation) suggest it remains far from the level of autonomy that would significantly increase control risks.
Skynet Date (+0 days): Devin's current capabilities, while commercially notable, don't meaningfully accelerate the timeline toward uncontrollable AI systems. The high failure rate on complex tasks indicates that truly autonomous AI programming agents remain a distant goal rather than an imminent reality.
AGI Progress (+0.01%): Devin represents modest progress toward AGI by demonstrating autonomous coding capabilities in limited contexts, but its high failure rate (succeeding in only 3 of 20 tasks) and documented struggles with complex programming logic indicate substantial limitations in generalized intelligence capabilities.
AGI Date (+0 days): The commercialization and continued development of autonomous coding agents like Devin makes AI coding tools more accessible and drives further investment in the space, which nominally accelerates the path to AGI. However, Devin's significant limitations suggest the effect is too small to shift the estimate.
OpenAI Faces Capacity Issues as ChatGPT Usage Surges to 500 Million Weekly Users
OpenAI CEO Sam Altman announced that unexpected demand for ChatGPT's new image generation tool has created significant capacity challenges, resulting in delayed product releases and service issues. ChatGPT has now reached 500 million weekly users and 20 million paying subscribers, with a million new users joining in a single hour as the company struggles to scale infrastructure fast enough.
Skynet Chance (+0.03%): The explosive user growth demonstrates how quickly advanced AI capabilities can reach massive scale with minimal oversight once released, potentially increasing risks from rapid AI proliferation. However, the capacity constraints highlight infrastructure limitations that currently act as a natural brake on deployment of even more advanced systems.
Skynet Date (+1 days): Current infrastructure constraints forcing OpenAI to delay releases and disable features suggest that scaling limitations are more significant than anticipated, potentially slowing the path to more advanced AI systems. These growing pains indicate that infrastructure scaling represents a genuine bottleneck that could delay deployment of increasingly capable systems.
AGI Progress (+0.01%): While massive user adoption reflects the utility of current AI systems, the technical challenges described relate more to infrastructure scaling than fundamental AI capability breakthroughs. The capacity issues highlight the gap between current systems and AGI, which would require substantially more robust infrastructure and operational efficiency.
AGI Date (+1 days): The substantial infrastructure challenges OpenAI is facing with existing products suggest that capacity bottlenecks may slow the deployment timeline for more advanced AI systems. These scaling issues point to practical limitations that must be overcome before significantly more capable systems can be reliably deployed at scale.
Amazon Launches Nova Act: An AI Agent Capable of Browser Control
Amazon has unveiled Nova Act, a general-purpose AI agent that can independently control web browsers to perform simple tasks like making reservations or ordering food. The technology, developed by Amazon's San Francisco-based AGI lab, will power features in the upcoming Alexa+ and is being released alongside a developer SDK for building agent prototypes.
Skynet Chance (+0.06%): Amazon's development of agentic AI that can autonomously operate web interfaces represents a significant step toward AI systems having real-world effects with limited human oversight. While currently focused on simple tasks, the architecture establishes pathways for increasingly autonomous operation of digital systems.
Skynet Date (-2 days): The release of commercially viable AI agents that can navigate interfaces and execute tasks accelerates the timeline toward more sophisticated autonomous systems. Amazon's framing of this technology as a step toward AGI, combined with competitive pressure in the agent space, significantly speeds up development.
AGI Progress (+0.05%): Nova Act represents substantial progress toward AGI by combining language understanding with the ability to navigate interfaces and take concrete actions in the digital world. This agentic approach bridges a key gap between pure language models and systems that can autonomously achieve goals.
AGI Date (-1 days): The explicit positioning of agent technology as a step toward AGI by Amazon's leadership, combined with claimed performance advantages over competitors, signals accelerating capability development in a critical AGI component. The integration with Alexa+ will rapidly scale this technology to millions of users.
Browser Use Raises $17M to Help AI Agents Navigate Websites More Effectively
Browser Use, a startup making websites more accessible to AI agents, has secured $17 million in seed funding led by Felicis. The company's technology breaks down website elements into a text-like format that AI agents can better understand, enabling more reliable automation of web-based tasks without relying on vision-based systems that frequently break.
Skynet Chance (+0.04%): By creating infrastructure that makes websites more navigable for AI systems, Browser Use reduces the dependency on human assistance and enables more autonomous web-based agent behaviors, incrementally advancing AI systems' ability to act independently in human-designed digital environments.
Skynet Date (-1 days): The development of tools that help AI agents reliably navigate complex websites accelerates the timeline for capable autonomous AI systems by removing a significant bottleneck in agent development, namely the ability to interact with existing digital infrastructure.
AGI Progress (+0.03%): Browser Use addresses a key limitation in current AI systems—the inability to reliably interact with the digital world as humans do—providing a foundation for more generally capable AI systems that can operate effectively across various websites and applications.
AGI Date (-1 days): By making AI-website interactions more reliable and less costly, Browser Use eliminates a significant technical barrier to developing autonomous AI agents, potentially accelerating the development of more generally capable AI systems that can operate in diverse digital environments.
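To make the "text-like format" idea from the Browser Use item concrete, here is a minimal sketch that flattens a page's interactive elements into numbered text lines an agent could refer to by index. It is an illustration only and is not Browser Use's actual implementation: the names InteractiveElementExtractor, page_to_text, and SAMPLE_HTML, the choice of tags, and the output layout are all assumptions made for this example. It uses only Python's standard-library HTML parser and does not handle interactive elements nested inside one another.

```python
from html.parser import HTMLParser

# Assumption: these tags are what an agent would want to click or fill in.
INTERACTIVE_TAGS = {"a", "button", "input", "select", "textarea"}
VOID_TAGS = {"input"}  # elements that never have a closing tag


class InteractiveElementExtractor(HTMLParser):
    """Collects interactive elements in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.elements = []  # finished records: tag, attributes, inner text
        self._open = None   # element currently collecting inner text

    def handle_starttag(self, tag, attrs):
        if tag not in INTERACTIVE_TAGS:
            return
        record = {"tag": tag, "attrs": dict(attrs), "text": ""}
        if tag in VOID_TAGS:
            self.elements.append(record)
        else:
            self._open = record

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if self._open is not None and tag == self._open["tag"]:
            # Collapse runs of whitespace in the visible label.
            self._open["text"] = " ".join(self._open["text"].split())
            self.elements.append(self._open)
            self._open = None


def page_to_text(html):
    """Return one numbered line per interactive element."""
    parser = InteractiveElementExtractor()
    parser.feed(html)
    parser.close()
    lines = []
    for index, element in enumerate(parser.elements):
        attrs = element["attrs"]
        label = element["text"] or attrs.get("placeholder") or attrs.get("name") or ""
        lines.append(f"[{index}] <{element['tag']}> {label}".rstrip())
    return "\n".join(lines)


SAMPLE_HTML = """
<div><h1>Book a table</h1>
<input name="date" placeholder="Date">
<a href="/menu">View menu</a>
<button type="submit"> Reserve </button></div>
"""

print(page_to_text(SAMPLE_HTML))
# [0] <input> Date
# [1] <a> View menu
# [2] <button> Reserve
```

An agent working from this listing could answer with something like "click 2" instead of locating the button in a screenshot, which is the kind of reliability gain over vision-based approaches that the item describes.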
1X Announces In-Home Tests of Neo Gamma Humanoid Robots Starting in 2025
Norwegian robotics startup 1X plans to begin testing its humanoid robot, Neo Gamma, in several hundred to a few thousand homes by the end of 2025. These initial tests will rely heavily on teleoperators (humans remotely controlling the robots) to gather data that will help train AI models for future autonomous capabilities.
Skynet Chance (+0.01%): While the development of humanoid robots represents a step toward embodied AI, Neo Gamma's heavy reliance on human teleoperators indicates we're still far from autonomous robots capable of independent physical action that could pose uncontrolled risks.
Skynet Date (+0 days): The early-stage nature of these humanoid robots, with their dependence on remote human operators and limited autonomous capabilities, doesn't significantly alter the timeline for potential AI risk scenarios; this represents an expected intermediate stage in robotics development.
AGI Progress (+0.01%): The introduction of bipedal robots into home environments, even with limited autonomy, establishes a platform for collecting real-world interaction data crucial for developing embodied AI systems that can physically operate in human spaces, a key component of general intelligence.
AGI Date (+0 days): The aggressive timeline for in-home testing (by end of 2025) slightly accelerates progress toward embodied AI by creating pathways for data collection in diverse home environments, though the heavy reliance on human teleoperators limits the immediate impact.
Anthropic Introduces Web Search Capability to Claude AI Assistant
Anthropic has added web search capabilities to its Claude AI chatbot, initially available to paid US users with the Claude 3.7 Sonnet model. The feature, which includes direct source citations, brings Claude to feature parity with competitors like ChatGPT and Gemini, though concerns remain about potential hallucinations and citation errors.
Skynet Chance (+0.01%): While the feature itself is relatively standard, giving AI systems direct ability to search for and incorporate real-time information increases their autonomy and range of action, slightly increasing potential for unintended behaviors when processing web content.
Skynet Date (+0 days): This capability represents expected feature convergence rather than a fundamental advancement, as other major AI assistants already offered similar functionality, thus having negligible impact on overall timeline predictions.
AGI Progress (+0.01%): The integration of web search expands Claude's knowledge base and utility, representing an incremental advance toward more capable and general-purpose AI systems that can access and reason about current information.
AGI Date (+0 days): The competitive pressure that drove Anthropic to add this feature despite previous reluctance suggests market forces are pushing AI capabilities forward slightly faster than companies might otherwise proceed, though the effect on AGI timelines is too small to shift the estimate.
OpenAI Enhances Voice and Transcription AI Models with Advanced Control Features
OpenAI has released new AI models for transcription and voice generation that offer better accuracy and more control than previous versions. The new text-to-speech model allows developers to steer voice characteristics using natural language, while the transcription models reduce hallucinations but show significant error rates for certain languages.
Skynet Chance (+0.04%): The explicit focus on developing more human-like, emotion-capable voices for "agentic systems" increases the potential for AI systems to manipulate human responses and operate more independently, creating subtle pathways toward autonomous AI with social influence capabilities.
Skynet Date (-1 days): OpenAI's emphasis on agentic systems that can independently complete tasks for users, combined with more natural voice interactions, accelerates the development pathway toward increasingly autonomous AI that can operate in human social environments.
AGI Progress (+0.03%): These improvements represent meaningful advances in AI's ability to process and generate human communication across modalities, particularly the increased steering capabilities that allow for contextually appropriate responses, getting closer to human-like communication abilities.
AGI Date (-1 days): The explicit framing of these voice and transcription models as components for building autonomous agents indicates OpenAI is advancing its agentic capabilities faster than previously disclosed, potentially shortening the timeline to more general AI systems.
OpenAI Releases Premium o1-pro Model at Record-Breaking Price Point
OpenAI has released o1-pro, an enhanced version of its reasoning-focused o1 model, to select API developers. The model costs $150 per million input tokens and $600 per million output tokens, making it OpenAI's most expensive model to date, with prices far exceeding GPT-4.5 and the standard o1 model.
Skynet Chance (+0.01%): While the extreme pricing suggests somewhat improved reasoning capabilities, early benchmarks and user experiences indicate the model isn't a revolutionary breakthrough in autonomous reasoning that would significantly increase AI risk profiles.
Skynet Date (+0 days): The minor improvements over the base o1 model, despite significantly higher compute usage and extreme pricing, suggest diminishing returns on scaling current approaches, neither accelerating nor decelerating the timeline to potentially risky AI capabilities.
AGI Progress (+0.01%): Despite mixed early reception, o1-pro represents OpenAI's continued focus on improving reasoning capabilities through increased compute, which incrementally advances the field toward more robust problem-solving capabilities even if performance gains are modest.
AGI Date (+0 days): The minimal performance improvements despite significantly increased compute resources suggest diminishing returns on current approaches, potentially indicating that the path to AGI may be longer than some predictions suggest.
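To put the o1-pro prices quoted above in per-request terms, the short sketch below computes the cost of a single call from its token counts. The rates come from the summary ($150 per million input tokens, $600 per million output tokens); the function name request_cost_usd and the example token counts are hypothetical, and the sketch assumes simple linear per-token billing with no other charges or discounts.

```python
from decimal import Decimal

# Rates as quoted in the item above, in US dollars per million tokens.
INPUT_USD_PER_MILLION = Decimal("150")
OUTPUT_USD_PER_MILLION = Decimal("600")
TOKENS_PER_MILLION = Decimal("1000000")


def request_cost_usd(input_tokens, output_tokens):
    """Cost of one request, assuming purely linear per-token pricing."""
    total = (
        Decimal(input_tokens) * INPUT_USD_PER_MILLION
        + Decimal(output_tokens) * OUTPUT_USD_PER_MILLION
    )
    return total / TOKENS_PER_MILLION


# Hypothetical request: 10,000 input tokens and 2,000 output tokens.
print(f"${request_cost_usd(10_000, 2_000):.2f}")  # $2.70
```

Under these assumptions, the 10,000 input tokens contribute $1.50 and the 2,000 output tokens contribute $1.20, so output length drives cost quickly at these rates.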
Nvidia Launches Groot N1, An AI Foundation Model for Humanoid Robotics
Nvidia has announced Groot N1, an open-source AI foundation model designed specifically for humanoid robotics with a dual-system architecture for "thinking fast and slow." The model builds on Nvidia's Project Groot from last year but expands beyond industrial use cases to support various humanoid robot form factors. It provides capabilities for environmental perception, reasoning, planning, and object manipulation, and Nvidia is releasing simulation frameworks and training data blueprints alongside it.
Skynet Chance (+0.04%): The development of a generalist AI foundation model specifically for humanoid robots represents a notable step toward physically embodied AI systems that can interact with the world. While still far from autonomous Skynet-like systems, this integration of advanced AI with humanoid robot platforms creates a pathway for AI to gain increased physical agency in the world.
Skynet Date (-1 days): The release of an open-source foundation model for humanoid robotics accelerates the development of physically embodied AI by providing a standardized starting point for diverse robotics applications. This lowers the barrier to entry for creating capable humanoid robots, potentially speeding up the timeline for more advanced physically embodied AI systems.
AGI Progress (+0.03%): Groot N1 represents significant progress toward embodied general intelligence by creating a foundation model specifically designed for humanoid robotics with both reasoning and action capabilities. By bridging the gap between language models and physical robotics and incorporating both slow deliberative and fast reactive thinking, it addresses a key limitation in current AI approaches.
AGI Date (-1 days): The release of an open-source foundation model for humanoid robotics democratizes access to advanced robotics AI, accelerating development across the field. By providing simulation frameworks and training data blueprints alongside the model, Nvidia is eliminating significant barriers to progress in embodied AI, potentially compressing development timelines.