By HowDoIUseAI Team

How to build AI trading agents that actually work in real markets

Learn to create autonomous AI agents for Polymarket trading using proven frameworks. Build research agents that optimize parameters while you sleep.

Building AI agents that trade real money in prediction markets sounds like science fiction. But after seeing recent breakthroughs in autonomous research agents, it's becoming reality faster than anyone expected.

The key insight? You're not building a traditional trading bot. You're building an AI researcher that happens to trade. This shift in perspective changes everything about how you approach the problem.

What makes AI trading different from traditional bots?

Traditional trading bots follow rigid rules: buy when RSI hits 30, sell when moving averages cross. AI agents think differently. They form hypotheses, test them against live market data, learn from failures, and adapt their strategies in real time.

This approach mirrors autonomous research agents for machine learning, where the agent modifies training code, runs five-minute training sessions, evaluates performance, and repeats the process, all without human intervention. The same principle applies to trading strategies.

Instead of hardcoded rules, you give your agent a framework for thinking about markets and let it discover what works through experimentation.

Which platform should you start with?

Polymarket offers the best environment for AI trading experiments. The markets are prediction-based with binary outcomes, making them perfect for training AI agents that need clear win/loss signals.

The platform provides official Python and TypeScript libraries for faster development, plus comprehensive documentation. You can start with Polymarket's API documentation to understand the core trading concepts.

The ecosystem includes several key components:

  • CLOB API for real-time order management
  • Data API for historical analysis
  • WebSocket streams for live market feeds
  • Official SDKs that handle authentication and signing

What's the architecture of a research-driven trading agent?

The breakthrough approach comes from Andrej Karpathy's autoresearch framework, which condenses machine learning experimentation into a single-GPU setup with about 630 lines of code. The core premise is simple yet powerful: humans refine a high-level prompt in a Markdown file, while an AI agent autonomously edits the training script to experiment with improvements.

Applied to trading, this means:

Strategy Layer: Your strategy.py file contains the core trading logic - risk management, position sizing, entry/exit conditions. The AI agent can modify these parameters.

Research Loop: Every market cycle (5-15 minutes), the agent:

  1. Reads your research prompt from program.md
  2. Modifies trading parameters in strategy.py
  3. Runs a live trading session with small position sizes
  4. Evaluates performance against key metrics
  5. Commits successful changes to git

Human Role: You write the high-level research direction. Want the agent to focus on momentum strategies? Arbitrage opportunities? Risk management improvements? You specify this in plain English.
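
To make "the AI agent can modify these parameters" concrete, here is a minimal sketch of what the tunable part of strategy.py might look like. The parameter names, defaults, and bounds are hypothetical, and a random perturbation stands in for the LLM-driven edit so the example stays self-contained.

```python
import random
from dataclasses import dataclass, replace


# Hypothetical tunable parameters; names and defaults are illustrative only.
@dataclass(frozen=True)
class StrategyParams:
    lookback: int = 20             # price observations used for signals
    entry_threshold: float = 0.05  # minimum signal strength to trade
    position_usd: float = 10.0     # stake per trade


# Search space set by the human; the agent may not move outside it.
BOUNDS = {
    "lookback": (5, 100),
    "entry_threshold": (0.01, 0.20),
    "position_usd": (1.0, 50.0),
}


def propose_variant(params: StrategyParams, rng: random.Random) -> StrategyParams:
    """Perturb one randomly chosen parameter by up to +/-20%, clamped to BOUNDS."""
    name = rng.choice(sorted(BOUNDS))
    low, high = BOUNDS[name]
    current = getattr(params, name)
    candidate = max(low, min(high, current * rng.uniform(0.8, 1.2)))
    if isinstance(current, int):
        candidate = int(round(candidate))  # keep integer parameters integral
    return replace(params, **{name: candidate})
```

Keeping the bounds in a separate structure that the agent does not edit is one way to let it explore freely while the human still controls the search space.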

How do you implement the core research loop?

Start with the official Polymarket agents repository as your foundation. It defines a Polymarket class that interacts with the Polymarket API to retrieve and manage market and event data, and to execute orders on the Polymarket DEX. It includes methods for API key initialization, market and event data retrieval, and trade execution.

Here's the essential structure:

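# research_agent.py
import time


class TradingResearchAgent:
    def __init__(self, polymarket, llm, config_path="program.md",
                 max_test_size=10.0, poll_interval=5.0):
        # polymarket: client wrapper exposing get_active_markets() and place_order()
        # llm: object exposing generate_hypothesis(prompt)
        self.polymarket = polymarket
        self.llm = llm
        self.max_test_size = max_test_size  # hard cap on any test order
        self.poll_interval = poll_interval  # seconds between market scans
        self.research_prompt = self.load_research_prompt(config_path)

    def load_research_prompt(self, path):
        with open(path, encoding="utf-8") as f:
            return f.read()

    # modify_strategy_code, evaluate_results, commit_improvement and
    # calculate_metrics are left for you to implement.

    def research_loop(self):
        while True:
            # 1. Generate hypothesis from research prompt
            hypothesis = self.llm.generate_hypothesis(self.research_prompt)

            # 2. Modify trading strategy
            new_strategy = self.modify_strategy_code(hypothesis)

            # 3. Run live test (small positions)
            results = self.execute_live_test(new_strategy, duration=300)  # 5 minutes

            # 4. Evaluate performance and keep only improvements
            if self.evaluate_results(results):
                self.commit_improvement(new_strategy)

    def execute_live_test(self, strategy, duration):
        start_time = time.monotonic()  # monotonic clock is immune to system clock changes
        results = {"trades": [], "pnl": 0, "win_rate": 0}

        while time.monotonic() - start_time < duration:
            # Get market data
            markets = self.polymarket.get_active_markets()

            # Apply strategy
            for market in markets:
                signal = strategy.evaluate_market(market)
                if signal.should_trade:
                    trade_result = self.polymarket.place_order(
                        market.id,
                        signal.side,
                        min(signal.size, self.max_test_size),
                    )
                    results["trades"].append(trade_result)

            # Pause between scans so the loop does not hammer the API
            time.sleep(self.poll_interval)

        return self.calculate_metrics(results)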

The key insight is limiting each test cycle to a fixed time budget. Autoresearch works the same way: its agent tries to achieve the lowest possible validation bits per byte within fixed 5-minute training runs, so every experiment is directly comparable and the research cycle stays rapid and iterative.
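
A fixed time budget is easy to get subtly wrong, so it can help to isolate it in one small helper. The sketch below is one possible shape, not part of any Polymarket library. The function name and the injectable `clock` and `sleep` parameters are assumptions, added so the loop can be tested without waiting in real time.

```python
import time
from typing import Callable, List


def run_with_budget(step: Callable[[], float], budget_s: float, poll_s: float = 1.0,
                    clock: Callable[[], float] = time.monotonic,
                    sleep: Callable[[float], None] = time.sleep) -> List[float]:
    """Call step() repeatedly until budget_s seconds have elapsed; return its results."""
    deadline = clock() + budget_s
    results = []
    while clock() < deadline:
        results.append(step())
        sleep(poll_s)  # pause between iterations instead of busy-looping
    return results
```

Calling `run_with_budget(scan_once, budget_s=300)` would give a hypothetical `scan_once` function the same five-minute window on every experiment.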

What trading strategies work best with AI agents?

Focus on strategies that benefit from rapid parameter optimization rather than complex market analysis:

Momentum Detection: Let the agent experiment with different lookback windows, volatility thresholds, and position sizing rules for trend-following strategies.
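
As a rough illustration of the knobs involved, a momentum rule can be reduced to a lookback window and a threshold. The function below is a hypothetical sketch that assumes prices are outcome-share prices between 0 and 1 and uses the absolute price change as the signal.

```python
from typing import Sequence


def momentum_signal(prices: Sequence[float], lookback: int, threshold: float) -> int:
    """Return +1 (buy), -1 (sell) or 0 (hold) from the price change over `lookback` steps."""
    if lookback < 1 or len(prices) <= lookback:
        return 0  # not enough history yet
    change = prices[-1] - prices[-1 - lookback]
    if change > threshold:
        return 1
    if change < -threshold:
        return -1
    return 0
```

The agent's job is then to search over `lookback` and `threshold` rather than to rewrite the rule itself.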

Arbitrage Hunting: The agent can optimize scanning frequencies, spread thresholds, and execution timing across different markets or platforms.
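
One simple case within a single binary market: exactly one of the YES and NO shares pays out 1.00, so buying both for less than 1.00 in total locks in the difference. The sketch below checks for that condition. The function name, the flat per-share fee, and the `min_edge` threshold are assumptions for illustration, and real fills may differ from quoted asks.

```python
from typing import Optional


def binary_arbitrage_edge(yes_ask: float, no_ask: float, fee_per_share: float = 0.0,
                          min_edge: float = 0.01) -> Optional[float]:
    """Return the locked-in profit per YES+NO pair, or None if it is below min_edge."""
    cost = yes_ask + no_ask + 2 * fee_per_share  # buy one share of each outcome
    edge = 1.0 - cost                            # exactly one of the two pays 1.0
    return edge if edge >= min_edge else None
```

Here `min_edge` is the kind of spread threshold the agent can tune.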

Risk Management: This is where AI agents excel. They can discover non-obvious correlations between position size, market conditions, and optimal stop-loss levels.

Market Making: For liquid Polymarket events, agents can optimize bid-ask spreads, inventory management, and quote adjustment algorithms.
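
For market making, the tunable pieces are typically the spread and how strongly quotes lean against current inventory. The following is a minimal sketch under assumed conventions: prices lie between 0.01 and 0.99, and `skew` is a hypothetical parameter controlling the lean.

```python
from typing import Tuple


def make_quotes(mid: float, half_spread: float, inventory: float,
                max_inventory: float, skew: float = 0.5) -> Tuple[float, float]:
    """Return (bid, ask) around mid, shifted against inventory and clamped to [0.01, 0.99]."""
    # A long inventory shifts both quotes down, making a sale more likely than a buy.
    shift = -skew * half_spread * (inventory / max_inventory)
    bid = mid - half_spread + shift
    ask = mid + half_spread + shift

    def clamp(price: float) -> float:
        return max(0.01, min(0.99, price))

    return clamp(bid), clamp(ask)
```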

How do you handle risk in autonomous trading?

Your bot will require two layers of security: the wallet's private key and API credentials, which include an API Key, Secret, and Passphrase. These credentials allow your bot to sign orders without exposing the private key for every trade.
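
One common way to keep those credentials out of the code the agent edits is to read them from environment variables at startup. The sketch below assumes hypothetical variable names. Check the SDK you use for the names it actually expects.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiCredentials:
    api_key: str
    secret: str
    passphrase: str


def load_credentials(env=os.environ) -> ApiCredentials:
    """Read API credentials from environment variables; fail loudly if any are missing."""
    # Hypothetical variable names, chosen for this example only.
    names = ("POLY_API_KEY", "POLY_API_SECRET", "POLY_API_PASSPHRASE")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return ApiCredentials(*(env[name] for name in names))
```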

Beyond technical security, implement these safeguards:

Position Limits: Hard-code maximum position sizes that the agent cannot modify. Start with $10-50 per trade during the research phase.

Drawdown Stops: If cumulative losses exceed 5-10%, pause the research loop and require manual intervention.
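
Position limits and drawdown stops work best when they live in a module the agent is not allowed to edit. The class below is a hypothetical sketch of such a guard. It measures drawdown from peak equity, which is one reasonable reading of "cumulative losses", and it stays halted until a human resets it.

```python
class RiskGuard:
    """Hard limits kept outside the code the agent is allowed to modify."""

    def __init__(self, max_position_usd: float = 50.0, max_drawdown: float = 0.10):
        self.max_position_usd = max_position_usd
        self.max_drawdown = max_drawdown
        self.peak_equity = None
        self.halted = False

    def cap_size(self, requested_usd: float) -> float:
        """Clamp a requested order size; return 0 once trading is halted."""
        if self.halted:
            return 0.0
        return max(0.0, min(requested_usd, self.max_position_usd))

    def update_equity(self, equity: float) -> bool:
        """Record equity; halt (and stay halted) when drawdown from the peak exceeds the limit."""
        if self.peak_equity is None or equity > self.peak_equity:
            self.peak_equity = equity
        drawdown = (self.peak_equity - equity) / self.peak_equity
        if drawdown > self.max_drawdown:
            self.halted = True
        return self.halted
```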

Strategy Constraints: Use your program.md to constrain the search space. Don't let the agent explore high-frequency scalping if you want swing trading strategies.

Paper Trading First: Run the full research loop with simulated positions before risking real money.
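
A paper-trading mode can be as simple as swapping the real client for a simulated one with the same methods. The class below is a deliberately optimistic sketch. It assumes every order fills at the quoted price with no fees or slippage, so treat its results as an upper bound.

```python
from dataclasses import dataclass, field


@dataclass
class PaperBroker:
    """Simulated broker: fills every order at the quoted price, no fees or slippage."""
    cash: float = 1000.0
    positions: dict = field(default_factory=dict)  # market_id -> shares held

    def buy(self, market_id: str, price: float, shares: float) -> bool:
        cost = price * shares
        if cost > self.cash:
            return False  # reject orders the account cannot afford
        self.cash -= cost
        self.positions[market_id] = self.positions.get(market_id, 0.0) + shares
        return True

    def settle(self, market_id: str, outcome_won: bool) -> None:
        """Resolve a market: each held share pays 1.0 if the outcome won, else 0."""
        shares = self.positions.pop(market_id, 0.0)
        if outcome_won:
            self.cash += shares
```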

What metrics should guide your research agent?

Traditional trading bots optimize for profit. Research agents need more sophisticated metrics:

Risk-Adjusted Returns: Sharpe ratio, maximum drawdown, and win rate matter more than absolute profits during the research phase.
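
These three metrics are straightforward to compute from a list of per-period returns, an equity curve, and per-trade profits. The sketch below uses the plain textbook definitions, with the risk-free rate assumed to be zero and no annualization.

```python
import math
from typing import Sequence


def sharpe_ratio(returns: Sequence[float]) -> float:
    """Mean return divided by sample standard deviation (per period, risk-free rate 0)."""
    n = len(returns)
    if n < 2:
        return 0.0
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline, as a fraction of the peak."""
    peak, worst = equity[0], 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(trade_pnls: Sequence[float]) -> float:
    """Fraction of trades with strictly positive profit."""
    if not trade_pnls:
        return 0.0
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)
```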

Strategy Stability: How consistent are the agent's discoveries across different market conditions? As an illustration of the kind of comparison to run, one reported evaluation found that multi-indicator voting beat the alternative it was tested against: Sharpe ratio 0.856 vs 0.841, max drawdown -10.10% vs -10.58%, and win rate 51.4% vs 31.9%. Extended testing showed 11.2% better relative performance in volatile markets.

Parameter Sensitivity: Robust strategies shouldn't be overly sensitive to small parameter changes. Test this by adding noise to the agent's discoveries.
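
The noise test can be sketched as follows: jitter every parameter by a small random percentage, re-score, and record the worst relative drop. The `score` callable is a stand-in for whatever backtest or paper-trading evaluation you use, and the 5% noise level is an arbitrary assumption.

```python
import random
from typing import Callable, Dict


def sensitivity(score: Callable[[Dict[str, float]], float], params: Dict[str, float],
                noise: float = 0.05, trials: int = 50, seed: int = 0) -> float:
    """Return the worst relative drop in score when every parameter is jittered by +/-noise."""
    rng = random.Random(seed)
    base = score(params)
    worst_drop = 0.0
    for _ in range(trials):
        jittered = {k: v * (1 + rng.uniform(-noise, noise)) for k, v in params.items()}
        drop = (base - score(jittered)) / abs(base) if base else 0.0
        worst_drop = max(worst_drop, drop)
    return worst_drop
```

A result near 0 suggests a flat, robust optimum. A result near 1 means the strategy only works at one exact setting and is probably overfit.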

Execution Quality: Track slippage, fill rates, and timing consistency. The best strategy is useless if you can't execute it reliably.
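
Fill rate and slippage can be tracked from a simple log of intended versus actual fills. The record layout below is hypothetical, and slippage is computed for buy orders, where paying more than intended counts as positive slippage.

```python
from typing import Iterable, NamedTuple, Tuple


class Fill(NamedTuple):
    intended_price: float
    filled_price: float  # average price actually paid
    requested: float     # shares requested
    filled: float        # shares actually filled


def execution_stats(fills: Iterable[Fill]) -> Tuple[float, float]:
    """Return (fill_rate, size-weighted average slippage) for a batch of buy orders."""
    fills = list(fills)
    requested = sum(f.requested for f in fills)
    filled = sum(f.filled for f in fills)
    if requested == 0 or filled == 0:
        return 0.0, 0.0
    slippage = sum((f.filled_price - f.intended_price) * f.filled for f in fills) / filled
    return filled / requested, slippage
```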

How do you scale from research to production?

Once your research agent discovers profitable strategies, you face a new challenge: scaling without degrading performance.

Strategy Validation: Shopify CEO Tobi Lutke adapted the autoresearch framework for an internal project. By allowing the agent to iterate on a smaller model architecture, Lutke reported a 19% improvement in validation scores. The agent-optimized smaller model eventually outperformed a larger model that had been configured through standard manual methods. The same pattern applies to trading: validate agent-found improvements at small scale before committing more capital to them.

Infrastructure Requirements: Deploy on a reliable Virtual Private Server (VPS) with a low-latency connection to Polymarket, so the bot keeps running when your own machine is off. Once deployed, your bot will execute its strategy automatically, even during volatile market conditions when manual traders might hesitate or panic.

Monitoring Systems: Build dashboards to track the agent's research progress, strategy performance, and system health. You need alerts when the agent discovers something significant or when performance degrades.

Multi-Agent Coordination: Advanced setups run multiple research agents simultaneously, each exploring different strategy families. This requires careful coordination to avoid conflicting trades.
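
One simple form of coordination is to net all agents' intended orders per market before anything is sent, so that opposing trades cancel instead of paying the spread twice. The sketch below assumes a hypothetical order tuple of (agent name, market id, signed size), where positive means buy and negative means sell.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def net_orders(orders: Iterable[Tuple[str, str, float]]) -> Dict[str, float]:
    """Net (agent, market_id, signed_size) orders per market; drop markets that cancel out."""
    totals = defaultdict(float)
    for _agent, market_id, signed_size in orders:
        totals[market_id] += signed_size
    return {market: size for market, size in totals.items() if abs(size) > 1e-9}
```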

What common pitfalls should you avoid?

Over-optimization: Researcher alexisthual raised a poignant concern: "Aren't you concerned that launching that many experiments will eventually 'spoil' the validation set?" The fear is that with enough agents, parameters will be optimized for the specific quirks of the test data rather than general intelligence.
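
A standard defense is to hold back the most recent slice of data, never show it to the research loop, and use it only for a final check. The helper below is a minimal sketch of that split. The function name and the 20% default are assumptions.

```python
from typing import Sequence, Tuple, TypeVar

T = TypeVar("T")


def chronological_split(samples: Sequence[T],
                        holdout_fraction: float = 0.2) -> Tuple[Sequence[T], Sequence[T]]:
    """Split time-ordered samples into (research, holdout) without shuffling."""
    if not 0.0 < holdout_fraction < 1.0:
        raise ValueError("holdout_fraction must be between 0 and 1")
    # The holdout is always the most recent data, so it cannot leak into research.
    cut = len(samples) - int(len(samples) * holdout_fraction)
    return samples[:cut], samples[cut:]
```

If a strategy that looked strong on the research slice falls apart on the holdout, that is a sign the loop has been fitting noise.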

Insufficient Research Scope: Don't constrain your research prompts too narrowly. The agent might find profitable strategies you never considered.

Ignoring Market Regime Changes: What works in trending markets often fails in sideways markets. Your research agent needs to detect and adapt to regime changes.
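
A lightweight way to tell trending from sideways conditions is an efficiency ratio: net price change divided by the total distance the price travelled. The sketch below uses that measure with an arbitrary threshold of 0.5. Both the threshold and the two-label scheme are illustrative assumptions.

```python
from typing import Sequence


def efficiency_ratio(prices: Sequence[float]) -> float:
    """Net price change over total path length: near 1 is trending, near 0 is choppy."""
    path = sum(abs(b - a) for a, b in zip(prices, prices[1:]))
    return abs(prices[-1] - prices[0]) / path if path > 0 else 0.0


def classify_regime(prices: Sequence[float], trend_threshold: float = 0.5) -> str:
    """Label a window of at least two prices as 'trending' or 'sideways'."""
    return "trending" if efficiency_ratio(prices) >= trend_threshold else "sideways"
```

The research loop can then keep separate parameter sets per regime instead of one compromise setting.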

Technical Debt: As the agent commits improvements, the strategy code can become complex and hard to understand. Regular refactoring is essential.

Where is AI trading headed?

The release of autoresearch suggests a future of research across domains where, thanks to simple AI instruction mechanisms, the role of the human shifts from "experimenter" to "experimental designer." The bottleneck of AI progress is no longer the "meat computer's" ability to code—it is our ability to define the constraints of the search.

The most successful AI trading systems won't be the ones with the most compute power. They'll be the ones with the best research frameworks - the clearest program.md files that guide AI agents toward profitable discoveries.

Building these systems requires combining traditional quant finance knowledge with cutting-edge AI agent architectures. But for those who master both domains, the rewards could be substantial. While humans sleep, AI agents are running hundreds of trading experiments, accumulating insights that would take months of manual research to discover.

The question isn't whether AI agents will dominate trading - it's whether you'll build the ones that do.