WdaGo

package module
v0.0.0-...-40b2849 Latest Latest
Published: Feb 9, 2026 License: MIT Imports: 14 Imported by: 0

README

WdaGo

WdaGo is a Go client library that wraps WebDriverAgent (WDA). It also ships an MCP Server so that large language models (Claude, GPT-4o, and others) can control iOS devices directly.

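Features

  • Go SDK: wraps the WDA protocol and exposes a type-safe Go API
  • MCP Server: exposes iOS automation to AI clients (Claude Code, Claude Desktop, and others) over the MCP protocol
  • Agent mode: optionally configure an LLM Provider to run an automatic "observe → think → act" loop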

Quick Start

Use as a Go library
go get github.com/Ning9527fff/WdaGo
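package main

import (
    "log"

    "github.com/Ning9527fff/WdaGo"
)

func main() {
    session := WdaGo.GetWdaSession("http://localhost:8100")
    if err := session.GetSession("com.apple.mobilesafari"); err != nil {
        log.Fatalf("create session: %v", err)
    }

    // Take a screenshot (directory, file name).
    if _, err := session.CurrentScreenShot("./", "screen.png"); err != nil {
        log.Fatalf("screenshot: %v", err)
    }

    // Tap at a coordinate.
    if err := session.TapWithLocation(WdaGo.ElementLocation{X: 200, Y: 400}); err != nil {
        log.Fatalf("tap: %v", err)
    }

    // Find an element by its text ("确定" is the button label, "OK"), then click it.
    elementId, err := session.SearchElement(WdaGo.LinkText, "确定")
    if err != nil {
        log.Fatalf("search element: %v", err)
    }
    if err := session.ClickElement(elementId); err != nil {
        log.Fatalf("click element: %v", err)
    }
}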
Use as an MCP Server
1. Build
go build -o ios-mcp-server ./cmd/mcp-server/
2. Add to Claude Code
claude mcp add ios-automation --scope user \
  -e WDA_URL=http://192.168.1.100:8100 \
  -- /path/to/ios-mcp-server
3. Add to Claude Desktop

Edit claude_desktop_config.json:

{
  "mcpServers": {
    "ios-automation": {
      "command": "/path/to/ios-mcp-server",
      "env": {
        "WDA_URL": "http://192.168.1.100:8100"
      }
    }
  }
}

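Environment variables

Variable       Description                                                   Default
WDA_URL        WDA service address                                           http://localhost:8100
WDA_BUNDLE_ID  App for which a session is created automatically at startup
MCP_TRANSPORT  Transport: stdio or http                                      stdio
MCP_HTTP_ADDR  Listen address in HTTP mode                                   :8080
LLM_PROVIDER   LLM provider: openai or anthropic (optional)
LLM_API_KEY    LLM API key                                                   -
LLM_MODEL      Model name                                                    Provider default
DEBUG          Enable debug logging                                          false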
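
The defaults in the table above can be pictured as a simple "read the variable, fall back to a default" lookup. The following is a minimal sketch of that idea only; `envOr` is a hypothetical helper and is not taken from the project's `internal/config` package, whose actual implementation is not shown on this page.

```go
package main

import (
	"fmt"
	"os"
)

// envOr returns the value of the environment variable key,
// or def when the variable is unset or empty.
// Hypothetical helper for illustration; not part of WdaGo.
func envOr(key, def string) string {
	if v := os.Getenv(key); v != "" {
		return v
	}
	return def
}

func main() {
	// Default values are the ones listed in the environment variable table.
	fmt.Println("WDA_URL       =", envOr("WDA_URL", "http://localhost:8100"))
	fmt.Println("MCP_TRANSPORT =", envOr("MCP_TRANSPORT", "stdio"))
	fmt.Println("MCP_HTTP_ADDR =", envOr("MCP_HTTP_ADDR", ":8080"))
	fmt.Println("DEBUG         =", envOr("DEBUG", "false"))
}
```

Treating an empty value the same as an unset one is a design choice of this sketch; the real server may distinguish the two.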

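MCP tools

Screen observation
Tool                         Description
get_screenshot               Capture the current screen and return a PNG image
get_ui_tree                  Get the UI accessibility element tree (XML)
get_screen_state             Screenshot + UI tree + App info + screen orientation (recommended)

Session and App management
Tool                         Description
create_session               Create a WDA session from a Bundle ID
launch_app                   Launch an App
terminate_app                Terminate an App
activate_app                 Bring an App to the foreground
get_active_app               Get information about the current foreground App
list_apps                    List running Apps

Touch and gestures
Tool                         Description
tap                          Tap at coordinates (x, y)
double_tap                   Double-tap at coordinates
long_press                   Long-press at coordinates
swipe                        Swipe from one coordinate to another

Element operations
Tool                         Description
find_element                 Find an element by label/class_name/xpath/class_chain
click_element                Click an element
type_text                    Type text
clear_text                   Clear text

Device control
Tool                         Description
go_home                      Return to the home screen
lock_device / unlock_device  Lock / unlock the device
press_button                 Press a hardware button (home/volume_up/volume_down)
get_device_info              Get detailed device information
activate_siri                Invoke Siri
open_url                     Open a URL

Agent mode
Tool                         Description
auto_execute_goal            Given a goal, the AI loops automatically until it is done (requires an LLM Provider)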
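
The "observe → think → act" loop behind Agent mode can be sketched as follows. This is an illustration of the general pattern under stated assumptions, not the project's implementation: `Device`, `Planner`, `Action`, and `runGoal` are hypothetical names, and the stubs stand in for a real device and a real LLM Provider.

```go
package main

import "fmt"

// Action is one step chosen by the planner. Done marks the goal as reached.
type Action struct {
	Name string
	Done bool
}

// Device is a hypothetical stand-in for the iOS automation layer.
type Device interface {
	Observe() (string, error) // for example: screenshot + UI tree
	Execute(a Action) error   // for example: tap, swipe, type_text
}

// Planner is a hypothetical stand-in for the LLM Provider.
type Planner interface {
	Next(goal, observation string) (Action, error)
}

// runGoal repeats observe -> think -> act until the planner reports Done
// or maxSteps is exhausted. It returns the number of executed actions.
func runGoal(d Device, p Planner, goal string, maxSteps int) (int, error) {
	for step := 0; step < maxSteps; step++ {
		obs, err := d.Observe()
		if err != nil {
			return step, fmt.Errorf("observe: %w", err)
		}
		act, err := p.Next(goal, obs)
		if err != nil {
			return step, fmt.Errorf("plan: %w", err)
		}
		if act.Done {
			return step, nil
		}
		if err := d.Execute(act); err != nil {
			return step, fmt.Errorf("execute %s: %w", act.Name, err)
		}
	}
	return maxSteps, fmt.Errorf("goal not reached after %d steps", maxSteps)
}

// fakeDevice counts taps instead of driving a real device.
type fakeDevice struct{ taps int }

func (f *fakeDevice) Observe() (string, error) { return fmt.Sprintf("taps=%d", f.taps), nil }
func (f *fakeDevice) Execute(a Action) error   { f.taps++; return nil }

// fakePlanner asks for taps until it observes two of them.
type fakePlanner struct{}

func (fakePlanner) Next(goal, obs string) (Action, error) {
	if obs == "taps=2" {
		return Action{Done: true}, nil
	}
	return Action{Name: "tap"}, nil
}

func main() {
	steps, err := runGoal(&fakeDevice{}, fakePlanner{}, "tap twice", 10)
	fmt.Println(steps, err) // prints "2 <nil>"
}
```

The step limit is there so that a planner which never reports completion cannot loop forever; whether the real tool uses a similar bound is not stated in the README.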

Project structure

WdaGo/
├── wda.go                # Core WDA API
├── wda_extended.go       # Extended methods (base64 screenshot, UI tree, screen state)
├── httpClient.go         # HTTP client
├── dataFormat.go         # Data type definitions
├── formatJson.go         # JSON utilities
├── cmd/mcp-server/       # MCP Server entry point
├── internal/
│   ├── mcpserver/        # MCP tool registration and handling
│   ├── llmprovider/      # LLM Provider abstraction layer (OpenAI/Anthropic)
│   └── config/           # Configuration management
├── go.mod
└── LICENSE

Prerequisites

  • WebDriverAgent is already running on the iOS device
  • The WDA service address is reachable over the network

License

MIT

Documentation

Index

Constants

const (
	VolumeUp               = 1
	VolumeDown             = 2
	Home                   = 3
	NotificationTypePlain  = "plain"
	NotificationTypeDarwin = "darwin"
	StringNull             = ""
)
const (
	UserAgent       = "Go-HTTP-Client/1.0"
	ContentTypeJson = "application/json"
	PicturePath     = "screenShot/"
	LinkText        = 1
	PartialLinkText = 2
	ClassName       = 3
	Path            = 4
	ClassChain      = 5
)

Variables

This section is empty.

Functions

func Delete

func Delete(url string, headers map[string]string) ([]byte, error)

Delete sends a DELETE request using the default client.

func Get

func Get(url string, headers map[string]string) ([]byte, error)

Get sends a GET request using the default client.

func GetBoolFromValueInterface

func GetBoolFromValueInterface(data map[string]interface{}, key string) bool

func GetDataFromRespBody

func GetDataFromRespBody(body []byte) (map[string]interface{}, error)

GetDataFromRespBody handles the value structure of a standard WDA response and converts it to a map.
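
To make the "value to map" step concrete, here is a minimal sketch using only encoding/json. `valueAsMap` is a hypothetical function written for this example, assuming the usual WDA envelope with `value` and `sessionId` fields; it is not the library's implementation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// valueAsMap decodes a WDA-style response body and returns its "value"
// field as a map. Hypothetical sketch; not the library's code.
func valueAsMap(body []byte) (map[string]interface{}, error) {
	var envelope struct {
		Value     json.RawMessage `json:"value"`
		SessionId string          `json:"sessionId"`
	}
	if err := json.Unmarshal(body, &envelope); err != nil {
		return nil, fmt.Errorf("decode response: %w", err)
	}
	var value map[string]interface{}
	if err := json.Unmarshal(envelope.Value, &value); err != nil || value == nil {
		return nil, fmt.Errorf("value is missing or is not a JSON object")
	}
	return value, nil
}

func main() {
	body := []byte(`{"value":{"level":1,"state":2},"sessionId":"ABC"}`)
	value, err := valueAsMap(body)
	if err != nil {
		panic(err)
	}
	// JSON numbers decode to float64 when the target is interface{}.
	fmt.Println(value["level"], value["state"])
}
```

Because numbers arrive as float64 in a `map[string]interface{}`, typed accessors such as GetNumFromValueInterface are a natural companion to this kind of decoding.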

func GetNumFromValueInterface

func GetNumFromValueInterface(data map[string]interface{}, key string) int64

func GetRotation

func GetRotation() error

GetRotation gets the current rotation of the device. It is not needed yet and is not implemented for now.

func GetStringFromValueInterface

func GetStringFromValueInterface(data map[string]interface{}, key string) string

func JudgeResponseCorrect

func JudgeResponseCorrect(body []byte, sessionId string) bool

JudgeResponseCorrect reports whether the result of a WDA request is correct: true if correct, false otherwise.

func Post

func Post(url string, data interface{}, headers map[string]string) ([]byte, error)

Post sends a POST request using the default client.

func SetDebugLog

func SetDebugLog()

Types

type AppBaseInfo

type AppBaseInfo struct {
	Pid      int64  `json:"pid"`
	BundleId string `json:"bundleId"`
}

type AppInfo

type AppInfo struct {
	Value struct {
		ProcessArguments struct {
			Env  interface{}   `json:"env"`
			Args []interface{} `json:"args"`
		} `json:"processArguments"`
		Name     string `json:"name"`
		Pid      int    `json:"pid"`
		BundleId string `json:"bundleId"`
	} `json:"value"`
	SessionId string `json:"sessionId"`
}

type AppList

type AppList struct {
	Value     []AppBaseInfo `json:"value"`
	SessionId string        `json:"sessionId"`
}

type BatteryInfo

type BatteryInfo struct {
	Level int64 `json:"level"`
	State int64 `json:"state"`
}

type BundleIdRequest

type BundleIdRequest struct {
	BundleId string `json:"bundleId"`
}

type ButtonName

type ButtonName struct {
	Name string `json:"name"`
}

type Capabilities

type Capabilities struct {
	BundleId string `json:"bundleId"`
}

type DeviceInfo

type DeviceInfo struct {
	TimeZone           string `json:"timeZone"`
	CurrentLocale      string `json:"currentLocale"`
	Model              string `json:"model"`
	Uuid               string `json:"uuid"`
	ThermalState       string `json:"thermalState"`
	UserInterfaceIdiom int64  `json:"userInterfaceIdiom"`
	UserInterfaceStyle string `json:"userInterfaceStyle"`
	Name               string `json:"name"`
	IsSimulator        bool   `json:"isSimulator"`
}

type DragOption

type DragOption struct {
	FromX float64 `json:"fromX"`
	FromY float64 `json:"fromY"`
	ToX   float64 `json:"toX"`
	ToY   float64 `json:"toY"`
}

type ElementLocation

type ElementLocation struct {
	X float64 `json:"x"`
	Y float64 `json:"y"`
}

type ElementSearchRequest

type ElementSearchRequest struct {
	Using string `json:"using"`
	Value string `json:"value"`
}
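
The search-type constants and ElementSearchRequest suggest that a search request pairs a locator strategy with a value. The sketch below shows one way that pairing could look. The mapping from constants to strategy strings is an assumption based on the locator names WebDriverAgent commonly accepts, and `strategyName` and `searchBody` are hypothetical helpers; the library's own mapping, and any transformation it applies to the value, are not documented on this page.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Constants and struct copied from the package documentation.
const (
	LinkText        = 1
	PartialLinkText = 2
	ClassName       = 3
	Path            = 4
	ClassChain      = 5
)

type ElementSearchRequest struct {
	Using string `json:"using"`
	Value string `json:"value"`
}

// strategyName maps a search-type constant to a locator strategy string.
// ASSUMPTION: these strings follow common WDA locator names.
func strategyName(searchType int) (string, error) {
	switch searchType {
	case LinkText:
		return "link text", nil
	case PartialLinkText:
		return "partial link text", nil
	case ClassName:
		return "class name", nil
	case Path:
		return "xpath", nil
	case ClassChain:
		return "class chain", nil
	}
	return "", fmt.Errorf("unknown search type %d", searchType)
}

// searchBody builds the JSON body for an element search request.
func searchBody(searchType int, value string) (string, error) {
	using, err := strategyName(searchType)
	if err != nil {
		return "", err
	}
	b, err := json.Marshal(ElementSearchRequest{Using: using, Value: value})
	return string(b), err
}

func main() {
	body, err := searchBody(ClassName, "XCUIElementTypeButton")
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // {"using":"class name","value":"XCUIElementTypeButton"}
}
```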

type HTTPClient

type HTTPClient struct {
	// contains filtered or unexported fields
}

HTTPClient is an HTTP client.

func NewHTTPClient

func NewHTTPClient(timeout time.Duration) *HTTPClient

NewHTTPClient creates a new HTTP client.

func (*HTTPClient) DeleteRequest

func (h *HTTPClient) DeleteRequest(url string, headers map[string]string) ([]byte, error)

DeleteRequest sends a DELETE request.

func (*HTTPClient) GetRequest

func (h *HTTPClient) GetRequest(url string, headers map[string]string) ([]byte, error)

GetRequest sends a GET request.

func (*HTTPClient) PostRequest

func (h *HTTPClient) PostRequest(url string, data interface{}, headers map[string]string) ([]byte, error)

PostRequest sends a POST request.
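
The request methods above all share the shape "URL plus header map in, response bytes out". As a rough illustration of that shape with the standard library, the sketch below sends a GET with custom headers to a local test server. `getWithHeaders` and `demo` are hypothetical names, the test server is only a stand-in for a WDA endpoint, and none of this is the library's internal code.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"time"
)

// getWithHeaders sends a GET request with the given headers and returns the body.
func getWithHeaders(client *http.Client, url string, headers map[string]string) ([]byte, error) {
	req, err := http.NewRequest(http.MethodGet, url, nil)
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	resp, err := client.Do(req)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode >= 400 {
		return nil, fmt.Errorf("unexpected status %s", resp.Status)
	}
	return io.ReadAll(resp.Body)
}

// demo starts a local stand-in server that echoes the User-Agent header,
// then queries it with a client that has a timeout.
func demo() (string, error) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		fmt.Fprintf(w, `{"value":{"ua":%q}}`, r.Header.Get("User-Agent"))
	}))
	defer srv.Close()

	client := &http.Client{Timeout: 5 * time.Second}
	headers := map[string]string{"User-Agent": "Go-HTTP-Client/1.0"}
	body, err := getWithHeaders(client, srv.URL+"/status", headers)
	return string(body), err
}

func main() {
	out, err := demo()
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The User-Agent value is the UserAgent constant listed under Constants; the `/status` path is used here only as an example route.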

type HoldRequest

type HoldRequest struct {
	Duration float64 `json:"duration"`
	// contains filtered or unexported fields
}

type Location

type Location struct {
	Latitude            int64 `json:"latitude"`
	AuthorizationStatus int64 `json:"authorizationStatus"`
	Longitude           int64 `json:"longitude"`
	Altitude            int64 `json:"altitude"`
}

type NotificationExpect

type NotificationExpect struct {
	Name    string `json:"name"`
	Type    string `json:"type"`
	Timeout int64  `json:"timeout"`
}

type PauseTime

type PauseTime struct {
	Duration int `json:"duration"`
}

type PhoneStatus

type PhoneStatus struct {
	Device       string
	DeviceIP     string
	AgentVersion string
	OsName       string
	OsVersion    string
	SdkVersion   string
	State        string
	IsReady      bool
}

type Scale

type Scale struct {
	ScaleX float64
	ScaleY float64
}

type ScreenSize

type ScreenSize struct {
	StatusBarSize WindowSize `json:"statusBarSize"`
	Scale         int64      `json:"scale"`
	ScreenSize    WindowSize `json:"screenSize"`
}

type ScreenSizeResponse

type ScreenSizeResponse struct {
	Value     ScreenSize `json:"value"`
	SessionId string     `json:"sessionId"`
}

type ScreenState

type ScreenState struct {
	ScreenshotBase64 string `json:"screenshot_base64"`
	SourceTreeXML    string `json:"source_tree_xml"`
	ActiveApp        string `json:"active_app_bundle_id"`
	Orientation      string `json:"orientation"`
}

ScreenState combines a screenshot, the UI tree, and device metadata for visual analysis by an LLM.

type SessionRequest

type SessionRequest struct {
	Capabilities Capabilities `json:"capabilities"`
}
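
For reference, this is the JSON that SessionRequest and Capabilities produce when marshaled with encoding/json. The struct definitions are copied from this page; `sessionBody` is a hypothetical helper, and whether the library sends exactly this body when creating a session is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Struct definitions copied from the package documentation.
type Capabilities struct {
	BundleId string `json:"bundleId"`
}

type SessionRequest struct {
	Capabilities Capabilities `json:"capabilities"`
}

// sessionBody returns the JSON encoding of a SessionRequest for bundleId.
func sessionBody(bundleId string) (string, error) {
	b, err := json.Marshal(SessionRequest{Capabilities: Capabilities{BundleId: bundleId}})
	return string(b), err
}

func main() {
	body, err := sessionBody("com.apple.mobilesafari")
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // {"capabilities":{"bundleId":"com.apple.mobilesafari"}}
}
```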

type SourceRequest

type SourceRequest struct {
	Resource string `json:"resource"`
}

type TextRequest

type TextRequest struct {
	Text string `json:"text"`
}

type TypingRequest

type TypingRequest struct {
	Value []byte `json:"value"`
}

type UrlBody

type UrlBody struct {
	Url string `json:"url"`
}

type WdaSession

type WdaSession struct {
	// contains filtered or unexported fields
}

func GetWdaSession

func GetWdaSession(url string) *WdaSession

func (*WdaSession) ActivateApp

func (session *WdaSession) ActivateApp(bundleId string) error

ActivateApp activates an app. How this differs from launching an app is not yet clear.

func (*WdaSession) ActiveSiri

func (session *WdaSession) ActiveSiri(text string) error

ActiveSiri starts Siri and submits the given text.

func (*WdaSession) AlertAccept

func (session *WdaSession) AlertAccept(client *HTTPClient) error

func (*WdaSession) AlertDismiss

func (session *WdaSession) AlertDismiss(client *HTTPClient) error

func (*WdaSession) AlertGet

func (session *WdaSession) AlertGet(client *HTTPClient) error

func (*WdaSession) BackToHomePage

func (session *WdaSession) BackToHomePage() error

BackToHomePage returns to the home screen.

func (*WdaSession) BaseURL

func (session *WdaSession) BaseURL() string

BaseURL returns the WDA base URL.

func (*WdaSession) CheckSession

func (session *WdaSession) CheckSession() (bool, error)

func (*WdaSession) ClearText

func (session *WdaSession) ClearText(elementId string) error

func (*WdaSession) ClickElement

func (session *WdaSession) ClickElement(elementId string) error

func (*WdaSession) CloseSession

func (session *WdaSession) CloseSession() error

CloseSession closes the session.

func (*WdaSession) CurrentScreenShot

func (session *WdaSession) CurrentScreenShot(picturePath, pictureName string) (string, error)

CurrentScreenShot takes a screenshot of the current screen. If no file extension is specified, it defaults to .png.
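
The extension rule can be sketched with path/filepath. `screenshotPath` is a hypothetical helper that shows one plausible way to apply a default .png extension; how CurrentScreenShot actually builds its output path is not shown on this page.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// screenshotPath joins dir and name, appending ".png" when name has no
// extension. Hypothetical helper; not the library's implementation.
func screenshotPath(dir, name string) string {
	if filepath.Ext(name) == "" {
		name += ".png"
	}
	return filepath.Join(dir, name)
}

func main() {
	fmt.Println(screenshotPath("", "screen"))     // screen.png
	fmt.Println(screenshotPath("", "screen.jpg")) // screen.jpg
	fmt.Println(screenshotPath("screenShot/", "home"))
}
```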

func (*WdaSession) DeactivateApp

func (session *WdaSession) DeactivateApp(time int) error

DeactivateApp keeps the app in the background for the specified time.

func (*WdaSession) DeleteSession

func (session *WdaSession) DeleteSession() error

func (*WdaSession) DoubleTapWithLocation

func (session *WdaSession) DoubleTapWithLocation(x, y float64) error

DoubleTapWithLocation double-taps at the given coordinates.

func (*WdaSession) DragWithLocation

func (session *WdaSession) DragWithLocation(xBefore, yBefore, xLater, yLater float64) error

DragWithLocation performs a drag. A swipe is essentially the same operation.

func (*WdaSession) EnsureSession

func (session *WdaSession) EnsureSession() error

EnsureSession ensures that a valid WDA session exists, creating one automatically if there is none.

func (*WdaSession) ExpectedNotification

func (session *WdaSession) ExpectedNotification(notificationName string, notificationType string, timeOut int64) error

ExpectedNotification checks whether an expected notification has appeared.

func (*WdaSession) GetActiveAppInfo

func (session *WdaSession) GetActiveAppInfo() (*AppInfo, error)

func (*WdaSession) GetAkaTree

func (session *WdaSession) GetAkaTree() error

GetAkaTree gets the element tree of the current page.

func (*WdaSession) GetAppList

func (session *WdaSession) GetAppList() (*[]AppBaseInfo, error)

func (*WdaSession) GetAppState

func (session *WdaSession) GetAppState(bundleIdString string) (int64, error)

func (*WdaSession) GetBatteryInfo

func (session *WdaSession) GetBatteryInfo() (*BatteryInfo, error)

GetBatteryInfo gets battery information.

func (*WdaSession) GetDeviceInfo

func (session *WdaSession) GetDeviceInfo() (*DeviceInfo, error)

GetDeviceInfo gets the current state of the device.

func (*WdaSession) GetLocation

func (session *WdaSession) GetLocation() (error, *Location)

GetLocation gets the iPhone's latitude and longitude, authorization status, and related data.

func (*WdaSession) GetOrientation

func (session *WdaSession) GetOrientation() (string, error)

GetOrientation gets the current screen orientation.

func (*WdaSession) GetScreenSize

func (session *WdaSession) GetScreenSize() (*ScreenSizeResponse, error)

GetScreenSize gets the device screen's width and height in points, returning the scale factor together with the ScreenSize.
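
A common use of the point size and the scale factor is converting to pixel dimensions. The sketch below decodes a sample response into the ScreenSizeResponse type from this page and multiplies points by scale. The sample JSON values are made up for illustration, `pixelSize` is a hypothetical helper, and reading Scale as a points-to-pixels factor is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Struct definitions copied from the package documentation.
type WindowSize struct {
	Width  int64 `json:"width"`
	Height int64 `json:"height"`
}

type ScreenSize struct {
	StatusBarSize WindowSize `json:"statusBarSize"`
	Scale         int64      `json:"scale"`
	ScreenSize    WindowSize `json:"screenSize"`
}

type ScreenSizeResponse struct {
	Value     ScreenSize `json:"value"`
	SessionId string     `json:"sessionId"`
}

// pixelSize decodes a screen-size response and converts points to pixels.
// ASSUMPTION: Scale is the points-to-pixels factor.
func pixelSize(body []byte) (int64, int64, error) {
	var resp ScreenSizeResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return 0, 0, fmt.Errorf("decode screen size: %w", err)
	}
	s := resp.Value
	return s.ScreenSize.Width * s.Scale, s.ScreenSize.Height * s.Scale, nil
}

func main() {
	// Illustrative sample data, not captured from a real device.
	body := []byte(`{"value":{"statusBarSize":{"width":390,"height":47},"scale":3,"screenSize":{"width":390,"height":844}},"sessionId":"ABC"}`)
	w, h, err := pixelSize(body)
	if err != nil {
		panic(err)
	}
	fmt.Println(w, h) // 1170 2532
}
```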

func (*WdaSession) GetScreenState

func (session *WdaSession) GetScreenState() (*ScreenState, error)

GetScreenState returns the combined screen state: screenshot + UI tree + current App + screen orientation.

func (*WdaSession) GetScreenshotBase64

func (session *WdaSession) GetScreenshotBase64() (string, error)

GetScreenshotBase64 returns the current screenshot as a base64-encoded string (nothing is written to disk).
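
A caller that receives the base64 string will usually want raw PNG bytes. The following sketch decodes such a string and checks the 8-byte PNG signature. `decodePNG` is a hypothetical helper, and the assumption that the string uses standard base64 encoding without line breaks is not confirmed by this page.

```go
package main

import (
	"bytes"
	"encoding/base64"
	"fmt"
)

// pngSignature is the fixed 8-byte header that starts every PNG file.
var pngSignature = []byte{0x89, 'P', 'N', 'G', '\r', '\n', 0x1a, '\n'}

// decodePNG decodes a standard base64 string and verifies that the result
// starts with the PNG signature. Hypothetical helper for illustration.
func decodePNG(b64 string) ([]byte, error) {
	raw, err := base64.StdEncoding.DecodeString(b64)
	if err != nil {
		return nil, fmt.Errorf("decode base64: %w", err)
	}
	if !bytes.HasPrefix(raw, pngSignature) {
		return nil, fmt.Errorf("data is not a PNG (%d bytes)", len(raw))
	}
	return raw, nil
}

func main() {
	// Stand-in for a real screenshot: only the PNG signature, base64-encoded.
	sample := base64.StdEncoding.EncodeToString(pngSignature)
	raw, err := decodePNG(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(len(raw), "bytes decoded")
}
```

The decoded bytes can then be written with os.WriteFile or handed to image/png for further processing.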

func (*WdaSession) GetSession

func (session *WdaSession) GetSession(bundleId string) error

func (*WdaSession) GetSourceTree

func (session *WdaSession) GetSourceTree() (string, error)

GetSourceTree returns the accessibility tree of the current page as an XML string (nothing is written to disk).
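
One thing a caller can do with the XML string is locate an element by label and compute a tap point. The sketch below walks the XML with encoding/xml. It assumes the attribute layout typically seen in WDA source output (`label`, `x`, `y`, `width`, `height` on each element); `Element` and `findByLabel` are hypothetical names, and the sample XML is hand-written.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strconv"
	"strings"
)

// Element holds the attributes this sketch cares about.
type Element struct {
	Type, Label         string
	X, Y, Width, Height float64
}

// Center returns the midpoint of the element's frame.
func (e Element) Center() (float64, float64) {
	return e.X + e.Width/2, e.Y + e.Height/2
}

// findByLabel returns every element whose label attribute equals label.
// ASSUMPTION: elements carry label/x/y/width/height attributes.
func findByLabel(sourceXML, label string) ([]Element, error) {
	dec := xml.NewDecoder(strings.NewReader(sourceXML))
	var found []Element
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			break
		}
		if err != nil {
			return nil, err
		}
		start, ok := tok.(xml.StartElement)
		if !ok {
			continue
		}
		attrs := map[string]string{}
		for _, a := range start.Attr {
			attrs[a.Name.Local] = a.Value
		}
		if attrs["label"] != label {
			continue
		}
		num := func(key string) float64 {
			v, _ := strconv.ParseFloat(attrs[key], 64) // missing or malformed values become 0
			return v
		}
		found = append(found, Element{
			Type: start.Name.Local, Label: label,
			X: num("x"), Y: num("y"), Width: num("width"), Height: num("height"),
		})
	}
	return found, nil
}

func main() {
	src := `<XCUIElementTypeApplication label="Demo" x="0" y="0" width="390" height="844">
  <XCUIElementTypeButton label="OK" x="100" y="400" width="200" height="40"/>
</XCUIElementTypeApplication>`
	els, err := findByLabel(src, "OK")
	if err != nil {
		panic(err)
	}
	for _, e := range els {
		cx, cy := e.Center()
		fmt.Println(e.Type, cx, cy) // XCUIElementTypeButton 200 420
	}
}
```

The center coordinates could then be passed to a coordinate-based call such as TapWithLocation.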

func (*WdaSession) GetStatus

func (session *WdaSession) GetStatus() (*PhoneStatus, error)

GetStatus gets the status of WDA on the current iPhone.

func (*WdaSession) GetWindowSize

func (session *WdaSession) GetWindowSize() (*WindowSize, error)

GetWindowSize gets the current window size.

func (*WdaSession) IsLocked

func (session *WdaSession) IsLocked() (bool, error)

IsLocked reports whether the screen is locked.

func (*WdaSession) LaunchApp

func (session *WdaSession) LaunchApp(bundleId string) error

func (*WdaSession) LaunchAppWithoutSession

func (session *WdaSession) LaunchAppWithoutSession(bundleId string) error

LaunchAppWithoutSession launches an app without requiring a session.

func (*WdaSession) LetSiriOpenUrl

func (session *WdaSession) LetSiriOpenUrl(RawUrl string) error

LetSiriOpenUrl asks Siri to open the given URL. The URL must be absolute, that is, it must include the https or http scheme.
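
Since the URL must be absolute, a caller may want to validate it before making the call. This is a minimal sketch with net/url; `validateAbsoluteURL` is a hypothetical helper, and whether the library performs a similar check internally is not stated here.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateAbsoluteURL returns an error unless raw is an absolute
// http or https URL with a host. Hypothetical helper for illustration.
func validateAbsoluteURL(raw string) error {
	u, err := url.Parse(raw)
	if err != nil {
		return fmt.Errorf("parse %q: %w", raw, err)
	}
	if u.Scheme != "http" && u.Scheme != "https" {
		return fmt.Errorf("%q must start with http:// or https://", raw)
	}
	if u.Host == "" {
		return fmt.Errorf("%q has no host", raw)
	}
	return nil
}

func main() {
	for _, raw := range []string{"https://www.apple.com", "www.apple.com"} {
		if err := validateAbsoluteURL(raw); err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println("ok:", raw)
	}
}
```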

func (*WdaSession) LockedDevice

func (session *WdaSession) LockedDevice() error

func (*WdaSession) PressButton

func (session *WdaSession) PressButton(buttonType int) error

PressButton presses a button. Buttons here are the iPhone's hardware buttons, named as follows:

home, volumeUp, volumeDown
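
PressButton takes an int, while the button names are strings, so some mapping between the two has to exist. The sketch below pairs the VolumeUp, VolumeDown, and Home constants from this page with the three names listed above and marshals the result through ButtonName. The pairing and the helpers `buttonName` and `pressButtonBody` are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Constants and struct copied from the package documentation.
const (
	VolumeUp   = 1
	VolumeDown = 2
	Home       = 3
)

type ButtonName struct {
	Name string `json:"name"`
}

// buttonName maps a button constant to its hardware button name.
// ASSUMPTION: each constant pairs with the name of the same meaning.
func buttonName(buttonType int) (string, error) {
	switch buttonType {
	case VolumeUp:
		return "volumeUp", nil
	case VolumeDown:
		return "volumeDown", nil
	case Home:
		return "home", nil
	}
	return "", fmt.Errorf("unknown button type %d", buttonType)
}

// pressButtonBody builds a JSON body naming the button to press.
func pressButtonBody(buttonType int) (string, error) {
	name, err := buttonName(buttonType)
	if err != nil {
		return "", err
	}
	b, err := json.Marshal(ButtonName{Name: name})
	return string(b), err
}

func main() {
	body, err := pressButtonBody(Home)
	if err != nil {
		panic(err)
	}
	fmt.Println(body) // {"name":"home"}
}
```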

func (*WdaSession) ResetAppAuth

func (session *WdaSession) ResetAppAuth(resource string) error

ResetAppAuth resets app authorization. How to use it is not yet clear; it is implemented for now.

func (*WdaSession) SearchElement

func (session *WdaSession) SearchElement(searchType int, Parms string) (string, error)

SearchElement searches for an element using different strategies.

func (*WdaSession) SessionID

func (session *WdaSession) SessionID() string

SessionID returns the current session ID.

func (*WdaSession) ShutDownWda

func (session *WdaSession) ShutDownWda() error

ShutDownWda shuts down WDA.

func (*WdaSession) TapWithLocation

func (session *WdaSession) TapWithLocation(location ElementLocation) error

TapWithLocation taps at the given coordinates.

func (*WdaSession) TerminateApp

func (session *WdaSession) TerminateApp(bundleId string) error

TerminateApp terminates an app.

func (*WdaSession) TouchAndHoldWithLocation

func (session *WdaSession) TouchAndHoldWithLocation(x, y, duration float64) error

TouchAndHoldWithLocation long-presses at the given coordinates.

func (*WdaSession) TypingText

func (session *WdaSession) TypingText(elementId string, Text string) error

func (*WdaSession) UnlockedDevice

func (session *WdaSession) UnlockedDevice() error

UnlockedDevice unlocks the device.

type WindowSize

type WindowSize struct {
	Width  int64 `json:"width"`
	Height int64 `json:"height"`
}

Directories

Path Synopsis
internal
