WebDriver API command support and mapping to AutoItX API

Desired Capabilities

These are capabilities that AutoItDriverServer supports.

Standard WebDriver Desired Capabilities

takesScreenshot = false (AutoItDriverServer does not support taking screenshots)
The remaining standard capabilities are not applicable for now.

Custom Desired Capabilities specific to AutoItDriverServer

For details on what each of these capabilities does, see the AutoItSetOption entry in the AutoItX documentation/help file in your AutoIt installation folder, or in the online AutoIt documentation. The details are not repeated here.

If a capability is not specified in a WebDriver session, its value defaults to the AutoIt default on startup. Once any session overrides a value, later sessions that do not specify it inherit the last value set.

NOTE: not all of these capabilities may be used by the WebDriver command mappings in the current version of AutoItDriverServer. They may be in the future.

caretCoordMode
expandEnvStrings
mouseClickDelay
mouseClickDownDelay
mouseClickDragDelay
mouseCoordinateMode
sendAttachMode
sendCapslockMode
sendKeyDelay
sendKeyDownDelay
winDetectHiddenText
winSearchChildren
winTextMatchMode
winTitleMatchMode
winWaitDelay
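
As a rough illustration of how the custom capabilities above might be sent, the sketch below builds a JSONWireProtocol new-session body. The helper name and the chosen values are assumptions for illustration; the value types the server accepts are not confirmed here, so check them against the AutoItSetOption documentation.

```python
import json

def new_session_payload(custom_capabilities: dict) -> str:
    """Build a JSONWireProtocol 'new session' request body (hypothetical helper)."""
    capabilities = {}
    # Custom AutoItDriverServer capabilities are passed alongside any standard ones.
    capabilities.update(custom_capabilities)
    return json.dumps({"desiredCapabilities": capabilities})

# Assumed example values, following the AutoItSetOption conventions:
# WinTitleMatchMode 2 matches any substring of the window title.
payload = new_session_payload({"winTitleMatchMode": 2, "sendKeyDelay": 10})
```

Capabilities left out of the payload keep the defaults described above.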

Element Location Strategy

AutoIt finds and manipulates elements/objects using several location strategies, defined in the AutoIt controls documentation (see the online documentation or the AutoItX help file). The mapping from WebDriver element lookup to AutoIt is:

WebDriver Find by Method = AutoIt location strategy
Find Element By ID = [ID:value]
Find Element By Name = [NAME:value]
Find Element By XPath = [REGEXPCLASS:value]
Find Element By CSS Selector = value passed "as is" unmodified to AutoIt
Find Element By Tag Name = [CLASS:value]
Find Element By Class Name = [CLASSNN:value]
Find Element By Link Text = [TEXT:value]
Any other unmentioned find by strategy = value passed "as is" unmodified to AutoIt

A value passed "as is" unmodified to AutoIt can be a freeform locator expression, so you can use the multiple or "targeted" location strategy matching that AutoIt offers but WebDriver does not. See the AutoIt controls documentation for examples of both. An example "as is" value that combines multiple location strategies: "[CLASS:Button; TEXT:Finish; INSTANCE:2]"
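
The mapping above can be sketched as a small lookup. This is a minimal illustration that mirrors the table, not the server's actual code; the function name is hypothetical, and the strategy strings are the JSONWireProtocol "using" values.

```python
# Hypothetical helper mirroring the mapping table; not AutoItDriverServer's own code.
LOCATOR_TEMPLATES = {
    "id": "[ID:{}]",
    "name": "[NAME:{}]",
    "xpath": "[REGEXPCLASS:{}]",
    "tag name": "[CLASS:{}]",
    "class name": "[CLASSNN:{}]",
    "link text": "[TEXT:{}]",
}

def to_autoit_locator(strategy: str, value: str) -> str:
    """Translate a WebDriver find-by strategy and value into an AutoIt locator."""
    template = LOCATOR_TEMPLATES.get(strategy)
    if template is None:
        # CSS selector and any unmentioned strategy: pass the value "as is".
        return value
    return template.format(value)
```

For example, to_autoit_locator("id", "254") gives "[ID:254]", while a "css selector" value such as "[CLASS:Button; TEXT:Finish; INSTANCE:2]" is returned unchanged.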

Finding single vs multiple elements

AutoIt does not find or handle multiple elements matched by a location strategy, so the WebDriver find elements (plural) command is not implemented in AutoItDriverServer. You can only find single elements.

Supported/available WebDriver API/commands

Only the commands shown here are supported. Invoking any other command returns an exception. These commands interface with the AutoItX COM/DLL methods.

General methods

Execute Script - executes an AutoIt au3 script file, for things you can't handle with the WebDriver API mapping alone. Beware that this may block WebDriver execution until the script completes. This feature has some suboptions; see the Execute Script Configuration options section below for details. NOTE: the path to the AutoIt script file is resolved on the host/machine/node that AutoIt(DriverServer) runs on, not on the host/node/client that executes the WebDriver client commands (against AutoItDriverServer). Assuming otherwise will cause script execution to fail, because AutoIt(DriverServer) cannot find the file.
driver.get() - instead of passing in a URL, pass in the path to an application executable, etc. The command will then launch the specified application.
driver.findElement() - finding multiple elements (findElements) is not supported

WebElement methods

WebElement.click()
WebElement.getText()
WebElement.clear()
WebElement.sendKeys() - for modifier keys, send as AutoIt string format instead of WebDriver syntax. For file uploads, see note below under section about limitations, etc.
Generic send keys not for element - for modifier keys, send as AutoIt string format instead of WebDriver syntax
WebElement.location - returns x,y coordinates of element on desktop screen
WebElement.size - returns width & height of element
WebElement.isDisplayed() - returns indicator whether element is displayed or visible on screen or not
WebElement.isEnabled() - returns indicator whether element is enabled or not
WebElement.isSelected() - returns indicator whether element is selected or checked or not. May not work against all elements; this has not been verified.
WebElement.getAttribute() - get AutoIt supported control attribute values, see AutoIt ControlCommand list of commands (i.e. control attributes). Currently only works for commands that don't require option arguments.
WebElement.getCssProperty() - get AutoIt supported control property values, see AutoIt ControlCommand list of commands (i.e. control properties). Currently only works for commands that don't require option arguments.
WebElement.findElement() - finding multiple elements (findElements) is not supported
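
The sendKeys entries above ask for modifier keys in AutoIt string format. As a small sketch of what that format looks like, the hypothetical helper below prefixes a key with AutoIt's Send modifier symbols (^ for Ctrl, ! for Alt, + for Shift, # for Win); special keys are written in braces, such as {ENTER} or {F4}.

```python
# Hypothetical helper; the symbols follow AutoIt's Send() key syntax.
MODIFIERS = {"ctrl": "^", "alt": "!", "shift": "+", "win": "#"}

def autoit_keys(key: str, *modifiers: str) -> str:
    """Return an AutoIt-format key string, e.g. Ctrl+A becomes '^a'."""
    prefix = "".join(MODIFIERS[m.lower()] for m in modifiers)
    return prefix + key

select_all = autoit_keys("a", "ctrl")      # "^a"
close_window = autoit_keys("{F4}", "alt")  # "!{F4}"
```

The resulting string is what you would pass to sendKeys in place of the WebDriver key constants.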

Window control methods

driver.getTitle() - of current active window
driver.close() - close current active window
driver.switchTo().window(windowNameOrHandleValue) - switch to a window using window name or window handle value
driver.manage().window().setSize(new Dimension(width, height)) - resize current active window or specified window by name or handle value
driver.manage().window().getSize() - get window size/dimensions for current active window or specified window by name or handle value
driver.manage().window().setPosition(new Point(x,y)) - move current active window or specified window by name or handle value
driver.manage().window().getPosition() - get window location or coordinates (relative to top left corner 0,0 of screen) for current active window or specified window by name or handle value
driver.manage().window().maximize() - maximize window for current active window or specified window by name or handle value
driver.getWindowHandle()
driver.getWindowHandles() - of the JSONWireProtocol window handle commands, AutoItDriverServer implements only get current window handle, not get all window handles. If needed to avoid bugs/issues, get all window handles could perhaps be implemented to simply return a list of one: the current window handle.

Mouse operations supported via JSONWireProtocol methods and Actions API:

drag & drop
mouse up
mouse down
mouse move to element
mouse move to element with offset
mouse move to offset (relative to mouse location)
right mouse click
mouse click
mouse double click

Execute Script Configuration options

Execute script will either execute:

an AutoIt au3 script, with optional arguments
a compiled binary of an AutoIt au3 script, with optional arguments

For the first item, the server executes the script using the AutoIt executable. Its path (and version, 32-bit or 64-bit) depends on these settings in the autoit_options.cfg file:

AutoIt64BitOSOnInstallUse32Bit=True
AutoIt64BitOS32BitExecutablePath
AutoIt64BitOS64BitExecutablePath
AutoIt32BitExecutablePath

On 32-bit Windows, the server uses the AutoIt path defined by AutoIt32BitExecutablePath. On 64-bit Windows, it first checks the value of AutoIt64BitOSOnInstallUse32Bit. If true (the default, as recommended by AutoIt during installation), it uses AutoIt64BitOS32BitExecutablePath; otherwise, it uses AutoIt64BitOS64BitExecutablePath. The paths are preset to the standard installation paths. You only need to modify them if you customized the install location.
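
The selection logic just described can be sketched as follows. This is an illustration only: the function name, the dict representation of autoit_options.cfg, and the placeholder paths are assumptions, not the server's implementation or the real preset paths.

```python
def resolve_autoit_executable(options: dict, os_is_64bit: bool) -> str:
    """Pick the AutoIt executable path per the rules above (hypothetical sketch)."""
    if not os_is_64bit:
        return options["AutoIt32BitExecutablePath"]
    # Default is True, as recommended by AutoIt during installation.
    flag = str(options.get("AutoIt64BitOSOnInstallUse32Bit", "True"))
    if flag.strip().lower() == "true":
        return options["AutoIt64BitOS32BitExecutablePath"]
    return options["AutoIt64BitOS64BitExecutablePath"]

# Placeholder paths for illustration; real values come from autoit_options.cfg.
EXAMPLE_OPTIONS = {
    "AutoIt64BitOSOnInstallUse32Bit": "True",
    "AutoIt64BitOS32BitExecutablePath": r"C:\placeholder\x86\AutoIt3.exe",
    "AutoIt64BitOS64BitExecutablePath": r"C:\placeholder\x64\AutoIt3_x64.exe",
    "AutoIt32BitExecutablePath": r"C:\placeholder\AutoIt3.exe",
}
```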

For the second item: by default the server handles the first case, but it can be configured for this one by setting AutoItScriptExecuteScriptAsCompiledBinary=True (default is False). The server then expects the script value to be the path to the compiled binary of the AutoIt script, not the au3 script itself, and executes it directly rather than via the AutoIt executable. This feature could also be misused to execute any executable (not just a compiled AutoIt script), with optional arguments, but doing so is at your own risk. Execute script in this server is intended only for AutoIt scripts and compiled binaries.
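
To make the difference between the two modes concrete, here is a minimal sketch of how the command line might be assembled in each case. The function and parameter names are hypothetical, and how the server actually launches the process is not shown in this page.

```python
def build_execute_command(script_path, args, autoit_exe, as_compiled_binary=False):
    """Return the argument list to run (hypothetical sketch of the two modes)."""
    if as_compiled_binary:
        # AutoItScriptExecuteScriptAsCompiledBinary=True: run the binary directly.
        return [script_path, *args]
    # Default: run the au3 script through the AutoIt executable.
    return [autoit_exe, script_path, *args]
```

A list like this could be handed to subprocess.run on the machine where AutoItDriverServer runs, which is also where both paths must exist.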

NOTE: the path to the AutoIt script file (or the compiled binary) is resolved on the host/machine/node that AutoIt(DriverServer) runs on, not on the host/node/client that executes the WebDriver client commands (against AutoItDriverServer). Assuming otherwise will cause execution of the script (or binary) to fail, because AutoIt(DriverServer) cannot find it.

Future NOTE: execute script could perhaps be enhanced to execute a (single) line of AutoIt script code, interpreter-style/interactively, as mentioned here. This is something to consider for the future, though I'd think few people want to execute just a single line of AutoIt code; more likely they want to execute a script file.

Limitations in, deviations from, & notes on WebDriver API implementation

no find elements
no get window handles
no timeouts mapping (yet)
no get (list of) sessions but can get specific single session by session ID
for sending special keys, modifier keys, follow the AutoIt format/syntax (all within quoted string) as opposed to WebDriver spec. It was easier to support/implement this way.
not all AutoIt control types are mapped/accessible to WebDriver protocol. I couldn't figure out the mapping, assuming it was possible. Some examples: tree view, list view, combobox, listbox. If you have ideas on how to map between AutoIt and WebDriver JSONWireProtocol please do share.
ControlCommand - not all commands are exposed through WebDriver; the ones that take option arguments are not. I couldn't figure out how to map them to JSONWireProtocol. If you have ideas for this, please do share. The ones that are mapped are available via getAttribute and get CSS property.
no formal multiple file upload support due to Selenium issue 2239. Attempting to do so may yield undesirable results. You can however, try something like this
Due to Selenium issue 2239, keys sent to an element/control that appear to be file path(s) may yield unexpected results. (1) It may trigger a file upload behind the scenes from the WebDriver client to AutoItDriverServer; on upload completion, the server sends back the file path of the uploaded file for WebDriver to send to the UI, so the file path you see may not be the one listed in the test code, unless you can override/disable the local file detector in your WebDriver client binding (yes in Java, no in Python). (2) Following on from (1), if the keys sent look like an array of multiple file paths, as when uploading multiple files in a browser, the results may be unexpected. Both (1) and (2) may apply even if you did not intend a file upload in some Windows UI, since standard WebDriver clients will treat it that way. A workaround, besides the local file detector option, is to customize a version of the WebDriver client specific to AutoItDriverServer (e.g. an AutoItDriver client binding).
no proper WebDriver exceptions for elements: mapping desktop GUI objects, which are more stateless in AutoIt, to WebDriver is tricky, so you will get generic exceptions on failure most of the time.
AutoIt-recognizable desktop GUI objects are not quite like DOM objects, and desktop application testing differs from web testing, so there is no real object reference to a WebElement once it is located. Instead, the server tracks the located element by encoding the initial location strategy value that was used to find it, and returns this to the client for future requests that manipulate the same element. The value is base64 encoded, and the result is then URL encoded. This result becomes the WebElement reference ID/value. The server decodes it when it receives a request and uses it to look up the element again.
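
The element reference scheme in the last item (base64, then URL encoding) can be sketched like this. The function names are hypothetical, and the server's exact encoding options (character set, safe characters) are assumptions here.

```python
import base64
from urllib.parse import quote, unquote

def encode_element_id(locator: str) -> str:
    """Base64-encode an AutoIt locator, then URL-encode the result."""
    b64 = base64.b64encode(locator.encode("utf-8")).decode("ascii")
    # safe="" also percent-encodes '/', which base64 output may contain.
    return quote(b64, safe="")

def decode_element_id(element_id: str) -> str:
    """Reverse the encoding to recover the original locator."""
    return base64.b64decode(unquote(element_id)).decode("utf-8")
```

Because the ID is just the encoded locator, the server can look the control up again on every request without holding any per-element state.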
