Show HN: PowerShellGPT, a Voice-Controlled AI Agent That Automates Windows and the Web

powershellgpt.com

1 points by techinterests 21 hours ago

I built PowerShellGPT, a native Windows app that connects large language models (ChatGPT, Claude, and even local models served through LM Studio) directly to PowerShell and a live embedded browser, enabling full voice- or text-controlled automation of both your system and the web.

Unlike traditional chat interfaces, PowerShellGPT doesn’t just generate code: it executes it, watches the results, and feeds them back to the model for error correction or follow-up actions. It acts more like an autonomous agent than a passive chatbot.
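
A rough sketch of what such a generate, execute, and correct loop could look like in Python follows. It is not PowerShellGPT's actual code: `agent_loop`, `run_powershell`, the `Model` and `Runner` types, and the `max_rounds` cap are all hypothetical names chosen for illustration, and the model is just any function that turns the conversation so far into a script.

```python
import subprocess
from typing import Callable, List, Tuple

# Hypothetical types: a model maps the conversation so far to a script,
# and a runner executes a script and returns (exit code, combined output).
Model = Callable[[List[str]], str]
Runner = Callable[[str], Tuple[int, str]]


def run_powershell(script: str) -> Tuple[int, str]:
    """Run a script through the PowerShell CLI (Windows only)."""
    proc = subprocess.run(
        ["powershell", "-NoProfile", "-Command", script],
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout + proc.stderr


def agent_loop(task: str, model: Model, runner: Runner = run_powershell,
               max_rounds: int = 3) -> str:
    """Ask for a script, run it, and feed any failure back to the model."""
    history = [f"TASK: {task}"]
    for _ in range(max_rounds):  # hard cap so a confused model cannot loop forever
        script = model(history)
        code, output = runner(script)
        if code == 0:
            return output
        # On failure, the model sees its own script and the error next round.
        history.append(f"SCRIPT: {script}")
        history.append(f"ERROR (exit {code}): {output}")
    raise RuntimeError(f"gave up after {max_rounds} rounds")
```

Passing the runner in as a parameter keeps the loop testable without PowerShell installed, and the round cap plays a role similar to the looping protection mentioned later in the post.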

Core Features:

Natural Language to Execution Loop: The AI generates PowerShell or JavaScript, executes it, sees the result, and corrects itself if needed.

Voice Recognition: Use voice commands in over 80 languages to control your system or web browser.

@PowerShellGPT@ and @JsGPT@ Tags: Trigger live execution in the embedded PowerShell and browser environments.

Agent Bridge: Enables multiple named AI agents to communicate and delegate tasks to each other.

Keyword Commands: Create templates like “email [KEYWORD] to Karen” or “search Google for [KEYWORD]” and fill them in with your voice.

Chain & Wait Commands: Link multiple actions together and add delays between them, for example: “search YouTube for music and then wait 5 seconds and then play the first result”.

Voice-Activated Automation: Detect faces, trigger system events, play media, and fill out forms, all entirely hands-free.

Cross-language Support: Generate and run Python, Node.js, Ruby, or C# files using natural language via a system I call MACFARI ("Make A Code File And Run It").
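
The keyword and chain commands in the list above can be illustrated with a small Python sketch. This is only a guess at the behavior described, not the app's real parser: `fill_template`, `parse_chain`, and the `("do", ...)` / `("wait", ...)` step format are hypothetical, and the sketch assumes steps are joined by the literal phrase "and then".

```python
import re
from typing import List, Tuple

# Hypothetical step format: ("wait", "5") or ("do", "play the first result").
Step = Tuple[str, str]

WAIT_RE = re.compile(r"^wait (\d+(?:\.\d+)?) seconds?$", re.IGNORECASE)


def fill_template(template: str, keyword: str) -> str:
    """Substitute the spoken keyword into a '[KEYWORD]' template."""
    return template.replace("[KEYWORD]", keyword)


def parse_chain(command: str) -> List[Step]:
    """Split a chained command on 'and then' into action and wait steps."""
    steps: List[Step] = []
    parts = re.split(r"\s+and then\s+", command.strip(), flags=re.IGNORECASE)
    for part in parts:
        match = WAIT_RE.match(part)
        if match:
            steps.append(("wait", match.group(1)))  # delay in seconds, as text
        else:
            steps.append(("do", part))
    return steps


if __name__ == "__main__":
    print(fill_template("search Google for [KEYWORD]", "delphi tutorials"))
    print(parse_chain("search YouTube for music and then wait 5 seconds "
                      "and then play the first result"))
```

A real implementation would also have to handle speech recognition noise, for instance "wait five seconds" spelled out in words, which this sketch deliberately ignores.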

Use Cases:

System automation and scripting with no typing

Web task automation with dynamic JavaScript injection

Creating AI-powered assistants that collaborate via voice

Accessibility-friendly computing via full voice control

The app runs as a lightweight Delphi GUI with two embedded browsers: one for AI, one for execution. All AI-to-execution communication happens via tag parsing and a secure permission system. You can toggle access prompts, enable/disable looping protection, and modify wake word settings.
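
As an illustration of tag parsing combined with a permission check, here is a minimal Python sketch. The post does not document the exact tag syntax, so this assumes the code to run is wrapped between two identical tags (for example `@PowerShellGPT@ Get-Date @PowerShellGPT@`). `extract_tagged_code`, `approved_blocks`, and the `ask` callback are hypothetical names, with `ask` standing in for whatever access prompt the app shows.

```python
import re
from typing import Callable, List, Tuple

# ASSUMPTION: code sits between a pair of identical tags, e.g.
#   @PowerShellGPT@ Get-Date @PowerShellGPT@
# The real delimiter format is not specified in the post.
TAG_RE = re.compile(r"@(PowerShellGPT|JsGPT)@(.*?)@\1@", re.DOTALL)


def extract_tagged_code(reply: str) -> List[Tuple[str, str]]:
    """Return (target, code) pairs found in a model reply, in order."""
    return [(m.group(1), m.group(2).strip()) for m in TAG_RE.finditer(reply)]


def approved_blocks(reply: str,
                    ask: Callable[[str, str], bool]) -> List[Tuple[str, str]]:
    """Keep only the blocks for which the permission callback says yes."""
    return [(target, code)
            for target, code in extract_tagged_code(reply)
            if ask(target, code)]
```

Text outside the tags is ignored, so the model can explain itself in plain language while only the tagged spans are ever considered for execution.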

https://youtube.com/@PowerShellGPT

Would love feedback, ideas, or feature requests. I'm especially curious what you’d build with it, or how you’d break it.

Thanks for checking it out!