Artificial Intelligence
Goldman Sachs tests autonomous AI agents for process-heavy work
Goldman Sachs is pushing deeper into real use of artificial intelligence inside its operations, moving to systems that can carry out complex tasks on their own. The Wall Street bank is working with AI startup Anthropic to create autonomous AI agents powered by Anthropic’s Claude model that can handle work that used to require large teams of people. The bank’s chief information officer says the technology has surprised staff with how capable it can be.
Many companies use AI for tasks like helping employees draft text or analysing trends. But Goldman Sachs is testing AI systems that go into what bankers call back-office work – functions like accounting, compliance checks and onboarding new clients – areas long viewed as too complex for automation. Such jobs involve many rules, large volumes of data and detailed review, which is why they have resisted full automation.
Moving AI agents into process-heavy operations
The partnership with Anthropic has been underway for roughly six months, with engineers from the AI startup embedded directly with teams at Goldman Sachs to build these agents side by side with in-house staff, according to a report based on an interview with the bank’s CIO. The work has focused on areas where automation could cut the time it takes to complete repetitive and data-heavy tasks.
Marco Argenti, Goldman’s chief information officer, described the AI systems as a new kind of digital assistant. “Think of it as a digital co-worker for many of the professions in the firm that are scaled, complex and very process-intensive,” he told CNBC. In early tests, the ability to reason through multi-step work and apply logic to complex areas like accounting and compliance was something the bank had not expected from the model.
Goldman Sachs has been among the more active banks in testing AI tools over the past few years. Before this announcement, the firm deployed internal tools to help engineers write and debug code. But the change now is toward systems that can take on work traditionally done by accountants and compliance teams. That highlights how organisations are trying to find concrete business uses for AI beyond the hype.
Faster workflows, human oversight remains
The agents are based on Anthropic’s Claude Opus 4.6 model, which has been built to handle long documents and complex reasoning. Goldman’s tests have shown that such systems can reduce the time needed for tasks like client onboarding, trade reconciliation and document review. While the bank has not shared specific performance numbers, people familiar with the matter told news outlets that work which once took a great deal of human labour can now be done in much less time.
Argenti said the rollout is not about replacing human workers, at least not at this stage. The bank reportedly views the agents as a tool to help existing staff manage busy schedules and get through high volumes of work. In areas like compliance and accounting, jobs can involve repetitive, rule-based steps. AI frees analysts from that repetition so they can focus on higher-value judgement work.
Markets have already reacted to the idea that large institutions are moving toward more AI-driven automation. In recent days, a sell-off in enterprise software stocks wiped out billions in value as some investors worried that tools like autonomous agents could speed up the decline of traditional business software that has dominated corporate IT for years.
AI adoption meets governance reality
Industry watchers see Goldman’s move as part of a wider trend. For example, some firms are piloting tools to read large data sets, interpret multiple sources of information, and draft investment analysis. These steps show AI making the jump from isolated projects to operational work. Yet the technology raises questions about oversight and trust. AI systems that interpret financial rules and compliance standards must be monitored carefully to avoid errors that could have regulatory or financial consequences. That’s why many institutions treat these systems as helpers that are reviewed by human experts until they mature.
Goldman Sachs is starting with operational functions that have traditionally resisted automation because they involve a lot of data and formal steps. The bank has not said when it expects deployment of the agents in its operations, but executives have suggested that the initial tests have been promising enough to support further rollout.
The broader industry context shows other banks and financial firms also exploring similar use cases. Some have already invested heavily in AI infrastructure, and reports indicate that major firms are planning to use AI to cut costs, speed workflows and improve risk management. However, many remain cautious about putting AI into customer-facing or regulated functions.
Goldman’s push into autonomous AI agents is an example of how large companies are reshaping internal operations using the latest generation of AI models. If systems can handle complex tasks reliably, organisations could see real changes in how work gets done – particularly in back-office functions where volume and repetition keep costs high and innovation slow.
SuperCool review: Evaluating the reality of autonomous creation
In the current landscape of generative artificial intelligence, we have reached a saturation point with assistants. Most users are familiar with the routine. You prompt a tool, it provides a draft, and then you spend the next hour manually moving that output into another application for formatting, design, or distribution. AI promised to save time, yet the tool hop remains a bottleneck for founders and creative teams.
SuperCool enters this crowded market with a notably different value proposition. It does not want to be your assistant. It wants to be your execution partner. By positioning itself at the execution layer of creative projects, SuperCool aims to bridge the gap between a raw idea and a finished, downloadable asset without requiring the user to leave the platform.
Redefining the creative workflow
The core philosophy behind SuperCool is to remove coordination overhead. For most businesses, creating a high-quality asset, whether it is a pitch deck, a marketing video, or a research report, requires a patchwork approach. You might use one AI for text, another for images, and a third for layout. SuperCool replaces this fragmented stack with a unified system of autonomous agents that work in concert.

As seen in the primary dashboard interface, the platform presents a clean, minimalist entry point. The user is greeted with a simple directive: “Give SuperCool a task to work on…”. The simplicity belies the complexity occurring under the hood. Unlike traditional tools that require you to navigate menus and settings, the SuperCool experience is driven entirely by natural language prompts.
How the platform operates in practice
The workflow begins with a natural-language prompt that describes the desired outcome, the intended audience, and any specific constraints. One of the most impressive features observed during this review is the transparency of the agentic process.

When a user submits a request, for instance, “create a pitch deck for my B2B business,” the platform does not just return a file a few minutes later. Instead, it breaks the project down into logical milestones that the user can monitor in real time.
- Strategic planning: The AI first outlines the project structure, like the presentation flow.
- Asset generation: It then generates relevant visuals and data visualisations tailored to the specific industry context.
- Final assembly: The system designs the complete deck, ensuring cohesive styling and professional layouts.
Visibility is crucial for trust. It allows the user to see that the AI is performing research and organising content, not just hallucinating a generic response. The final result is a multi-slide product, often featuring 10 or more professionally designed slides, delivered as an exportable file like a PPTX.
Versatility across use cases
SuperCool’s utility is most apparent in scenarios where speed and coverage are more valuable than pixel-perfect manual control. We observed three primary areas where the platform excels:
End-to-end content creation
For consultants and solo founders, the time saved on administrative creative tasks is immense. A consultant onboarding a new client can describe the engagement and instantly receive a welcome packet, a process overview, and a timeline visual.
Multi-format asset kits
Perhaps the most powerful feature is the ability to generate different types of media from a single prompt. An HR team launching an employee handbook can request a kit that includes a PDF guide, a short video, and a presentation deck.
Production without specialists
Small teams often face a production gap where they lack the budget for full-time designers or video editors. SuperCool effectively fills this gap, allowing a two-person team to produce branded graphics and videos without expanding headcount.

Navigating the learning curve
While the platform is designed for ease of use, it is not a magic wand for those without a clear vision. The quality of the output is heavily dependent on the clarity of the initial prompt. Vague instructions will lead to generic results. SuperCool is built for professionals who know what they want but do not want to spend hours manually building it.
Because the system is autonomous, users have less mid-stream control. You cannot tweak a design element while the agents are working. Instead, refinement happens through iteration in the chat interface. If the first version is not perfect, you provide feedback, and the system regenerates the asset with those adjustments in mind.
The competitive landscape: Assistant vs. agent
In the current AI ecosystem, most tools are categorised as assistants. They perform specific, isolated tasks, leaving the user responsible for overseeing the entire process. SuperCool represents the shift toward agentic AI, in which the system takes responsibility for the entire workflow.
The distinction is vital for enterprise contexts. While assistants require constant hand-holding, an agentic system like SuperCool allows the user to focus on high-level ideation and refinement. It moves the user from builder to director.
Final assessment
SuperCool is a compelling alternative for those who find the current tool-stack approach a drain on productivity. It is not necessarily a replacement for specialised creative software when a brand needs unique, handcrafted artistry. However, for the vast majority of business needs, where speed, consistency, and execution are paramount, it offers perhaps the shortest path from an idea to a finished product.
For founders and creative teams who value the ability to rapidly test ideas and deploy content without the overhead of specialised software, SuperCool is a step forward in the evolution of autonomous work.
Top 7 best AI penetration testing companies in 2026
Penetration testing has always existed to answer one practical concern: what actually happens when a motivated attacker targets a real system. For many years, that answer was produced through scoped engagements that reflected a relatively stable environment. Infrastructure changed slowly, access models were simpler, and most exposure could be traced back to application code or known vulnerabilities.
That operating reality no longer exists. Modern environments are shaped by cloud services, identity platforms, APIs, SaaS integrations, and automation layers that evolve continuously. Exposure is introduced through configuration changes, permission drift, and workflow design as often as through code. As a result, security posture can shift materially without a single deployment.
Attackers have adapted accordingly. Reconnaissance is automated. Exploitation attempts are opportunistic and persistent. Weak signals are correlated across systems and chained together until progression becomes possible. In this context, penetration testing that remains static, time-boxed, or narrowly scoped struggles to reflect real risk.
How AI penetration testing changes the role of offensive security
Traditional penetration testing was designed to surface weaknesses during a defined engagement window. That model assumed environments remained relatively stable between tests. In cloud-native and identity-centric architectures, this assumption does not hold.
AI penetration testing operates as a persistent control rather than a scheduled activity. Platforms reassess attack surfaces as infrastructure, permissions, and integrations change. This lets security teams detect newly introduced exposure without waiting for the next assessment cycle.
As a result, offensive security shifts from a reporting function into a validation mechanism that supports day-to-day risk management.
The top 7 best AI penetration testing companies
1. Novee
Novee is an AI-native penetration testing company focused on autonomous attacker simulation in modern enterprise environments. The platform is designed to continuously validate real attack paths rather than produce static reports.
Novee models the full attack lifecycle, including reconnaissance, exploit validation, lateral movement, and privilege escalation. Its AI agents adapt their behaviour based on environmental feedback, abandoning ineffective paths and prioritising those that lead to impact. This results in fewer findings with higher confidence.
The platform is particularly effective in cloud-native and identity-heavy environments where exposure changes frequently. Continuous reassessment ensures that risk is tracked as systems evolve, not frozen at the moment of a test.
Novee is often used as a validation layer to support prioritisation and confirm that remediation efforts actually reduce exposure.
Key characteristics:
- Autonomous attacker simulation with adaptive logic
- Continuous attack surface reassessment
- Validated attack-path discovery
- Prioritisation based on real progression
- Retesting to confirm remediation effectiveness
2. Harmony Intelligence
Harmony Intelligence focuses on AI-driven security testing with an emphasis on understanding how complex systems behave under adversarial conditions. The platform is designed to surface weaknesses that emerge from interactions between components rather than from isolated vulnerabilities.
Its approach is particularly relevant for organisations running interconnected services and automated workflows. Harmony Intelligence evaluates how attackers could exploit logic gaps, misconfigurations, and trust relationships across systems.
The platform emphasises interpretability. Findings are presented in a way that explains why progression was possible, which helps teams understand and address root causes rather than symptoms.
Harmony Intelligence is often adopted by organisations seeking deeper insight into systemic risk, not surface-level exposure.
Key characteristics:
- AI-driven testing of complex system interactions
- Focus on logic and workflow exploitation
- Clear contextual explanation of findings
- Support for remediation prioritisation
- Designed for interconnected enterprise environments
3. RunSybil
RunSybil is positioned around autonomous penetration testing with a strong emphasis on behavioural realism. The platform simulates how attackers operate over time, including persistence and adaptation.
Rather than executing predefined attack chains, RunSybil evaluates which actions produce meaningful access and adjusts accordingly. This makes it effective at identifying subtle paths that emerge from configuration drift or weak segmentation.
RunSybil is frequently used in environments where traditional testing produces large volumes of low-value findings. Its validation-first approach helps teams focus on paths that represent genuine exposure.
The platform supports continuous execution and retesting, letting security teams measure improvement rather than rely on static assessments.
Key characteristics:
- Behaviour-driven autonomous testing
- Focus on progression and persistence
- Reduced noise through validation
- Continuous execution model
- Measurement of remediation impact
4. Mindgard
Mindgard specialises in adversarial testing of AI systems and AI-enabled workflows. Its platform evaluates how AI components behave under malicious or unexpected input, including manipulation, leakage, and unsafe decision paths.
This focus is increasingly important as AI becomes embedded in business-critical processes. Failures often stem from logic and interaction effects, not traditional vulnerabilities.
Mindgard’s testing approach is proactive. It is designed to surface weaknesses before deployment and to support iterative improvement as systems evolve.
Organisations adopting Mindgard typically view AI as a distinct security surface that requires dedicated validation beyond infrastructure testing.
Key characteristics:
- Adversarial testing of AI and ML systems
- Focus on logic, behaviour, and misuse
- Pre-deployment and continuous testing support
- Engineering-actionable findings
- Designed for AI-enabled workflows
5. Mend
Mend approaches AI penetration testing from a broader application security perspective. The platform integrates testing, analysis, and remediation support across the software lifecycle.
Its strength lies in correlating findings across code, dependencies, and runtime behaviour. This helps teams understand how vulnerabilities and misconfigurations interact, rather than treating them in isolation.
Mend is often used by organisations that want AI-assisted validation embedded into existing AppSec workflows. Its approach emphasises practicality and scalability over deep autonomous simulation.
The platform fits well in environments where development velocity is high and security controls must integrate seamlessly.
Key characteristics:
- AI-assisted application security testing
- Correlation across multiple risk sources
- Integration with development workflows
- Emphasis on remediation efficiency
- Scalable across large codebases
6. Synack
Synack combines human expertise with automation to deliver penetration testing at scale. Its model emphasises trusted researchers operating in controlled environments.
While not purely autonomous, Synack incorporates AI and automation to manage scope, triage findings, and support continuous testing. The hybrid approach balances creativity with operational consistency.
Synack is often chosen for high-risk systems where human judgement remains critical. Its platform supports ongoing testing rather than one-off engagements.
The combination of vetted talent and structured workflows makes Synack suitable for regulated and mission-critical environments.
Key characteristics:
- Hybrid model combining humans and automation
- Trusted researcher network
- Continuous testing capability
- Strong governance and control
- Suitable for high-assurance environments
7. HackerOne
HackerOne is best known for its bug bounty platform, but it also plays a role in modern penetration testing strategies. Its strength lies in scale and diversity of attacker perspectives.
The platform lets organisations continuously test systems through managed programmes with structured disclosure and remediation workflows. While not autonomous in the AI sense, HackerOne increasingly incorporates automation and analytics to support prioritisation.
HackerOne is often used alongside AI pentesting tools rather than as a replacement. It provides exposure to creative attack techniques that automated systems may not uncover.
Key characteristics:
- Large global researcher community
- Continuous testing through managed programmes
- Structured disclosure and remediation
- Automation to support triage and prioritisation
- Complementary to AI-driven testing
How enterprises use AI penetration testing in practice
AI penetration testing is most effective when used as part of a layered security strategy. It rarely replaces other controls outright. Instead, it fills a validation gap that scanners and preventive tools cannot address alone.
A common enterprise pattern includes:
- Vulnerability scanners for detection coverage
- Preventive controls for baseline hygiene
- AI penetration testing for continuous validation
- Manual pentests for deep, creative exploration
In this model, AI pentesting serves as the connective tissue. It determines which detected issues matter in practice, validates remediation effectiveness, and highlights where assumptions break down.
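The prioritisation step described above can be sketched in a few lines of Python. This is an illustrative example only, not the workflow of any vendor named in this article: the `Finding` fields, the identifiers and the ordering rule are all hypothetical. It assumes a scanner supplies a severity score and that continuous testing supplies a flag saying whether an exploitable path was actually confirmed.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """A hypothetical scanner finding enriched with a validation result."""
    finding_id: str
    severity: float   # scanner-reported score, assumed to run from 0.0 to 10.0
    validated: bool   # True if testing confirmed an exploitable path


def prioritise(findings):
    """Put validated findings first, then order by severity, highest first."""
    return sorted(findings, key=lambda f: (not f.validated, -f.severity))


findings = [
    Finding("F-1", 9.8, False),  # severe on paper, but no confirmed path
    Finding("F-2", 6.5, True),
    Finding("F-3", 7.2, True),
]

print([f.finding_id for f in prioritise(findings)])  # ['F-3', 'F-2', 'F-1']
```

In this sketch a high-severity finding that could not be exploited drops below lower-severity findings that could, which is the kind of reordering validation is meant to provide.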
Organisations adopting this approach often report clearer prioritisation, faster remediation cycles, and more meaningful security metrics.
The future of security teams with AI penetration testing
The impact of this new wave of offensive security has been transformative for the security workforce. Instead of being bogged down by repetitive vulnerability finding and retesting, security specialists can focus on incident response, proactive defence strategies, and risk mitigation. Developers get actionable reports and automated tickets, closing issues early and reducing burnout. Executives gain real-time assurance that risk is being managed every hour of every day.
AI-powered pentesting, when operationalised well, fundamentally improves business agility, reduces breach risk, and helps organisations meet the demands of partners, customers, and regulators who are paying closer attention to security than ever before.
Cryptocurrency markets a testbed for AI forecasting models
Cryptocurrency markets have become a high-speed playground where developers optimise the next generation of predictive software. Using real-time data flows and decentralised platforms, scientists develop prediction models that can extend the scope of traditional finance.
The digital asset landscape offers an unparalleled environment for machine learning. When you track cryptocurrency prices today, you are observing a system shaped simultaneously by on-chain transactions, global sentiment signals, and macroeconomic inputs, all of which generate dense datasets suited for advanced neural networks.
Such a steady stream of information makes it possible to evaluate and reapply an algorithm without interference from fixed trading times or restrictive market access.
The evolution of neural networks in forecasting
Current machine learning technology, particularly the Long Short-Term Memory (LSTM) neural network, has found widespread application in interpreting market behaviour. As a type of recurrent neural network, an LSTM can recognise long-term market patterns and is far more flexible than traditional analytical techniques in fluctuating markets.
Research on hybrid models that combine LSTMs with attention mechanisms has improved techniques for extracting important signals from market noise. Unlike earlier models that relied on linear techniques, these models analyse not only structured price data but also unstructured data.
With the inclusion of Natural Language Processing, it is now possible to interpret the flow of news and social media activity, enabling sentiment measurement. While prediction was previously based on historical stock pricing patterns, it now increasingly depends on behavioural changes in global participant networks.
A high-frequency environment for model validation
The transparency of blockchain data offers a level of data granularity that is not found in existing financial infrastructures. Each transaction is now an input that can be traced, enabling cause-and-effect analysis without delay.
However, the growing presence of autonomous AI agents has changed how such data is used. This is because specialised platforms are being developed to support decentralised processing across a variety of networks.
This has effectively turned blockchain ecosystems into real-time validation environments, where the feedback loop between data ingestion and model refinement occurs almost instantly.
Researchers use this setting to test specific abilities:
- Real-time anomaly detection: Systems compare live transaction flows against simulated historical conditions to identify irregular liquidity behaviour before broader disruptions emerge.
- Macro sentiment mapping: Global social behaviour data are compared to on-chain activity to assess true market psychology.
- Autonomous risk adjustment: Programmes run probabilistic simulations to rebalance exposure dynamically as volatility thresholds are crossed.
- Predictive on-chain monitoring: AI tracks wallet activity to anticipate liquidity shifts before they impact centralised trading venues.
These systems do not function as isolated instruments. Instead, they adjust dynamically, continually changing their parameters in response to emerging market conditions.
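As a rough illustration of the anomaly detection idea in the list above, the following Python sketch flags values that sit far from a trailing average. It is a minimal example under stated assumptions, not a description of any production system: the hourly volume figures are made up, and the window size and threshold are arbitrary choices.

```python
from statistics import mean, stdev


def flag_anomalies(volumes, window=5, threshold=3.0):
    """Return indices whose value lies more than `threshold` sample standard
    deviations from the mean of the preceding `window` values."""
    flagged = []
    for i in range(window, len(volumes)):
        history = volumes[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # A perfectly flat history: treat any change at all as anomalous.
            if volumes[i] != mu:
                flagged.append(i)
            continue
        if abs(volumes[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged


# Hypothetical hourly transaction volumes with one obvious spike.
hourly_volumes = [100, 102, 98, 101, 99, 103, 100, 400, 101, 99]
print(flag_anomalies(hourly_volumes))  # [7]
```

Only the spike at index 7 is flagged. The readings after it are not, because the spike itself inflates the trailing standard deviation, which is one reason real systems tend to use more robust statistics than a plain mean and standard deviation.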
The synergy of DePIN and computational power
To train complex predictive models, large amounts of computing power are required, leading to the development of Decentralised Physical Infrastructure Networks (DePIN). By drawing on decentralised GPU capacity across a global computing grid, teams can reduce their dependence on cloud infrastructure.
Consequently, smaller-scale research teams are afforded computational power that was previously beyond their budgets. This makes it easier and faster to run experiments across different model designs.
This trend is also echoed in the markets. A report dated January 2025 noted strong growth in the capitalisation of assets related to artificial intelligence agents in the latter half of 2024, as demand for such intelligence infrastructure increased.
From reactive bots to anticipatory agents
The market is moving beyond rule-based trading bots toward proactive AI agents. Instead of responding to predefined triggers, modern systems evaluate probability distributions to anticipate directional changes.
Gradient boosting and Bayesian learning methods allow the identification of areas where mean reversion may occur ahead of strong corrections.
Some models now incorporate fractal analysis to detect recurring structures across timeframes, further improving adaptability in rapidly changing conditions.
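To make the idea of evaluating a probability distribution rather than reacting to a fixed trigger more concrete, here is a minimal Bayesian sketch in Python. It assumes the simplest possible setup, a Beta prior over the probability of an upward move that is updated with observed up and down moves. The function names and the sample observations are hypothetical, and real forecasting agents would use far richer models.

```python
def update_beta(alpha, beta, moves):
    """Update a Beta(alpha, beta) belief about P(up move).

    `moves` is an iterable of booleans: True for an up move, False for down.
    """
    for up in moves:
        if up:
            alpha += 1
        else:
            beta += 1
    return alpha, beta


def posterior_mean(alpha, beta):
    """Expected probability of an up move under Beta(alpha, beta)."""
    return alpha / (alpha + beta)


# Start from a uniform prior, Beta(1, 1), then observe three ups and one down.
alpha, beta = update_beta(1, 1, [True, True, False, True])
print(alpha, beta)                  # 4 2
print(posterior_mean(alpha, beta))  # 4 / 6, roughly 0.667
```

The agent ends up with a full distribution over the chance of an upward move, so it can act on both the estimate and the remaining uncertainty instead of waiting for a predefined price level to be hit.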
Addressing model risk and infrastructure constraints
Despite such rapid progress, several problems remain. One is hallucination, in which a model reports patterns that do not reflect the underlying causes of market behaviour. Practitioners have adopted methods to mitigate this problem, including ‘explainable AI’.
The other vital requirement, unchanged as AI technology has evolved, is scalability. With the growing number of interactions among autonomous agents, the underlying transaction infrastructure must handle rising volume efficiently, without latency or data loss.
At the end of 2024, the best-performing scaling solution handled tens of millions of transactions per day, and throughput remained an area that required improvement.
Such an agile framework lays the foundation for the future, where data, intelligence and validation will come together in a strong ecosystem that facilitates more reliable projections, better governance and greater confidence in AI-driven insights.
