WebDriver API command support and mapping to AutoItX API

David Luu edited this page Jan 20, 2015 · 7 revisions

Desired Capabilities

These are capabilities that AutoItDriverServer supports.

Standard WebDriver Desired Capabilities

  • takesScreenshot = false (as in AutoItDriverServer does not support taking screenshots)
  • n/a for the rest for now

Custom Desired Capabilities specific to AutoItDriverServer

For details on what each of these capabilities does, see the AutoItSetOption entry in the AutoItX documentation/help file in your AutoIt installation folder, or in the online AutoIt documentation. The details are not repeated here.

If a capability is not specified in a WebDriver session, its value is the AutoIt default at startup or, if an earlier session overrode it, the value last set by that session.

NOTE: not all of these capabilities may be used by the WebDriver command mappings in the current version of AutoItDriverServer. However, they may be in the future.

  • caretCoordMode
  • expandEnvStrings
  • mouseClickDelay
  • mouseClickDownDelay
  • mouseClickDragDelay
  • mouseCoordinateMode
  • sendAttachMode
  • sendCapslockMode
  • sendKeyDelay
  • sendKeyDownDelay
  • winDetectHiddenText
  • winSearchChildren
  • winTextMatchMode
  • winTitleMatchMode
  • winWaitDelay
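
As a rough illustration, the custom capabilities above could be sent in a JSONWireProtocol new-session request body like the one below. This is only a sketch: the values shown are examples taken from the AutoItSetOption option ranges, and whether the server expects numbers or strings for them is an assumption not confirmed by this page.

```python
import json

# Example values only. See the AutoItSetOption documentation for what each
# option accepts; the value types the server expects are an assumption here.
desired_capabilities = {
    "winTitleMatchMode": 2,    # AutoIt: 2 = match any substring of the window title
    "mouseCoordinateMode": 1,  # AutoIt: 1 = absolute screen coordinates
    "sendKeyDelay": 5,         # AutoIt: delay between keystrokes, in milliseconds
}

# Body of a JSONWireProtocol "new session" request (POST /session).
payload = json.dumps({"desiredCapabilities": desired_capabilities})
print(payload)
```

Capabilities left out of the request keep whatever value AutoIt currently has for them.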

Element Location Strategy

AutoIt finds and manipulates elements/objects using several location strategies, defined in the AutoIt controls documentation (see the online documentation or the AutoItX help file). The translation/mapping to WebDriver element lookup is therefore:

  • WebDriver Find by Method = AutoIt location strategy
  • Find Element By ID = [ID:value]
  • Find Element By Name = [NAME:value]
  • Find Element By XPath = [REGEXPCLASS:value]
  • Find Element By CSS Selector = value passed "as is" unmodified to AutoIt
  • Find Element By Tag Name = [CLASS:value]
  • Find Element By Class Name = [CLASSNN:value]
  • Find Element By Link Text = [TEXT:value]
  • Any other unmentioned find by strategy = value passed "as is" unmodified to AutoIt

A value passed "as is" unmodified to AutoIt means you can define the locator as a freeform expression. That lets you use the multiple or "targeted" location strategy matches that are available to AutoIt but not to WebDriver. See the AutoIt controls documentation for examples of both. An example "as is" value that combines multiple location strategies: "[CLASS:Button; TEXT:Finish; INSTANCE:2]"
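
The mapping above can be sketched as a small lookup function. This is not the server's actual code, just an illustration of the table; the function name `to_autoit_locator` is hypothetical, and the strategy names are the JSONWireProtocol "using" values.

```python
# WebDriver "using" strategy -> AutoIt control property, per the list above.
_STRATEGY_TO_AUTOIT = {
    "id": "ID",
    "name": "NAME",
    "xpath": "REGEXPCLASS",
    "tag name": "CLASS",
    "class name": "CLASSNN",
    "link text": "TEXT",
}


def to_autoit_locator(using: str, value: str) -> str:
    """Translate a WebDriver locator into an AutoIt control description."""
    prop = _STRATEGY_TO_AUTOIT.get(using)
    if prop is None:
        # CSS selector and any unmentioned strategy: pass the value "as is".
        return value
    return f"[{prop}:{value}]"


print(to_autoit_locator("id", "1001"))
print(to_autoit_locator("css selector", "[CLASS:Button; TEXT:Finish; INSTANCE:2]"))
```

Because CSS selector values are not rewritten, that strategy is the natural place to put freeform AutoIt expressions.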

Finding single vs multiple elements

AutoIt doesn't find or handle multiple elements for a given location strategy. Therefore, finding multiple elements (find elements) is not implemented in AutoItDriverServer. You can only find single elements.

Supported/available WebDriver API/commands

Only the commands shown here are supported. Invoking any other command returns an exception. These commands interface with the AutoItX COM/DLL methods.

General methods

  • Execute Script - allows you to execute an AutoIt au3 script file, for things you can't handle with just the WebDriver API mapping. Beware that this may block WebDriver execution until the script completes. This feature has some suboptions; see the Execute Script Configuration options section below for details. NOTE: the path to the AutoIt script file is resolved on the host/machine/node that AutoItDriverServer runs on, not on the host/node/client that executes the WebDriver client commands. Assuming otherwise will cause script execution to fail because AutoItDriverServer can't find the file.

  • driver.get() - Instead of passing in a URL, pass in the path to an application executable, etc. The command will then launch the specified application.

  • driver.findElement() - finds a single element only; finding multiple elements is not supported

WebElement methods

  • WebElement.click()
  • WebElement.getText()
  • WebElement.clear()
  • WebElement.sendKeys() - for modifier keys, send them in AutoIt string format instead of WebDriver syntax. For file uploads, see the note in the limitations section below.
  • Generic send keys (not targeted at an element) - for modifier keys, send them in AutoIt string format instead of WebDriver syntax
  • WebElement.location - returns x,y coordinates of element on desktop screen
  • WebElement.size - returns width & height of element
  • WebElement.isDisplayed() - returns indicator whether element is displayed or visible on screen or not
  • WebElement.isEnabled() - returns indicator whether element is enabled or not
  • WebElement.isSelected() - returns indicator whether element is selected or checked or not. This might not work against all elements.
  • WebElement.getAttribute() - get AutoIt supported control attribute values, see AutoIt ControlCommand list of commands (i.e. control attributes). Currently only works for commands that don't require option arguments.
  • WebElement.getCssProperty() - get AutoIt supported control property values, see AutoIt ControlCommand list of commands (i.e. control properties). Currently only works for commands that don't require option arguments.
  • WebElement.findElement() - finds a single element only; finding multiple elements is not supported
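
Since sendKeys expects modifier keys in AutoIt string format, a small helper can make the strings easier to build. The sketch below uses AutoIt's Send syntax (`^` for Ctrl, `!` for Alt, `+` for Shift, `#` for Win, and braces for named keys); the helper name `autoit_keys` is hypothetical and not part of any client binding.

```python
# AutoIt Send() modifier prefixes.
_MODIFIERS = {"ctrl": "^", "alt": "!", "shift": "+", "win": "#"}


def autoit_keys(key: str, *modifiers: str) -> str:
    """Prefix a key (or a {NAMED} key) with AutoIt modifier symbols."""
    prefix = "".join(_MODIFIERS[m.lower()] for m in modifiers)
    return prefix + key


select_all = autoit_keys("a", "ctrl")         # Ctrl+A
save_as = autoit_keys("s", "ctrl", "shift")   # Ctrl+Shift+S
close_window = autoit_keys("{F4}", "alt")     # Alt+F4
print(select_all, save_as, close_window)
```

The resulting string is what you would pass to sendKeys in place of the WebDriver Keys constants.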

Window control methods

  • driver.getTitle() - of current active window
  • driver.close() - close current active window
  • driver.switchTo().window(windowNameOrHandleValue) - switch to a window using window name or window handle value
  • driver.manage().window().setSize(new Dimension(width, height)) - resize current active window or specified window by name or handle value
  • driver.manage().window().getSize() - get window size/dimensions for current active window or specified window by name or handle value
  • driver.manage().window().setPosition(new Point(x,y)) - move current active window or specified window by name or handle value
  • driver.manage().window().getPosition() - get window location or coordinates (relative to top left corner 0,0 of screen) for current active window or specified window by name or handle value
  • driver.manage().window().maximize() - maximize window for current active window or specified window by name or handle value
  • driver.getWindowHandle()
  • driver.getWindowHandles() - not implemented. From the JSONWireProtocol, AutoItDriverServer only implements get current window handle, not get all window handles. If needed to avoid bugs/issues, a future implementation could simply return a list of one: the current window handle

Mouse operations supported via JSONWireProtocol methods and Actions API:

  • drag & drop
  • mouse up
  • mouse down
  • mouse move to element
  • mouse move to element with offset
  • mouse move to offset (relative to mouse location)
  • right mouse click
  • mouse click
  • mouse double click

Execute Script Configuration options

Execute script will either execute:

  1. an AutoIt au3 script, with optional arguments
  2. a compiled binary of an AutoIt au3 script, with optional arguments

For the first item, the server executes the script using the AutoIt executable. Its path (and version, 32-bit or 64-bit) depends on these settings in the autoit_options.cfg file:

  • AutoIt64BitOSOnInstallUse32Bit=True
  • AutoIt64BitOS32BitExecutablePath
  • AutoIt64BitOS64BitExecutablePath
  • AutoIt32BitExecutablePath

On 32-bit Windows, the server just uses the path defined by AutoIt32BitExecutablePath. On 64-bit Windows, it first checks the value of AutoIt64BitOSOnInstallUse32Bit. If true (the default, as recommended by AutoIt during installation), it uses AutoIt64BitOS32BitExecutablePath; otherwise, it uses AutoIt64BitOS64BitExecutablePath. The paths are preset to the standard installation paths. You only need to modify them if you customized the location of your install.
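
The selection logic described above can be sketched as follows. This is an illustration rather than the server's code: the function name is hypothetical, the options are assumed to be already loaded into a dict of strings, and the paths are placeholders, not the preset values from autoit_options.cfg.

```python
def select_autoit_executable(options: dict, os_is_64bit: bool) -> str:
    """Pick the AutoIt executable path from the autoit_options.cfg settings."""
    if not os_is_64bit:
        return options["AutoIt32BitExecutablePath"]
    # Default is True, matching AutoIt's recommendation during installation.
    flag = str(options.get("AutoIt64BitOSOnInstallUse32Bit", "True"))
    if flag.strip().lower() == "true":
        return options["AutoIt64BitOS32BitExecutablePath"]
    return options["AutoIt64BitOS64BitExecutablePath"]


# Placeholder paths for illustration only.
options = {
    "AutoIt64BitOSOnInstallUse32Bit": "True",
    "AutoIt64BitOS32BitExecutablePath": r"C:\placeholder\AutoIt3.exe",
    "AutoIt64BitOS64BitExecutablePath": r"C:\placeholder\AutoIt3_x64.exe",
    "AutoIt32BitExecutablePath": r"C:\placeholder32\AutoIt3.exe",
}
print(select_autoit_executable(options, os_is_64bit=True))
```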

For the second item: the server handles the first case by default, but can be configured for this one by setting AutoItScriptExecuteScriptAsCompiledBinary=True (default is False). In that case, the server expects the script value to be the path to the compiled binary of the AutoIt script, not the au3 script itself, and executes it directly rather than via the AutoIt executable. Of course, one could also misuse this feature to execute any executable (not just a compiled AutoIt script), with optional arguments. Doing so is at your own risk. Execute script in this server is intended only for AutoIt scripts and their compiled binaries.

NOTE: the path to the AutoIt script file (or the compiled binary) is resolved on the host/machine/node that AutoItDriverServer runs on, not on the host/node/client that executes the WebDriver client commands. Assuming otherwise will cause script (or binary) execution to fail because AutoItDriverServer can't find the file.

Future NOTE: execute script could perhaps be enhanced to support executing a single line of AutoIt script code, interpreter-style/interactively, as mentioned here. Perhaps something to consider for the future, though I'd think few people want to execute just a single line of AutoIt code; more likely they want to execute a script file.
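
To make the two execution modes concrete, here is a minimal sketch of how the command line could be assembled for each. It is an assumption about the general shape, not the server's implementation; `build_execute_command` and the example paths are hypothetical.

```python
def build_execute_command(script_path, args, autoit_exe, as_compiled_binary=False):
    """Return the argument list to run, mirroring the two modes described above."""
    if as_compiled_binary:
        # AutoItScriptExecuteScriptAsCompiledBinary=True: run the binary directly.
        return [script_path, *args]
    # Default: run the au3 script through the AutoIt executable.
    return [autoit_exe, script_path, *args]


# Paths are resolved on the machine where AutoItDriverServer runs.
print(build_execute_command(r"C:\scripts\demo.au3", ["arg1"], r"C:\placeholder\AutoIt3.exe"))
print(build_execute_command(r"C:\scripts\demo.exe", ["arg1"], r"C:\placeholder\AutoIt3.exe",
                            as_compiled_binary=True))
```

Running either list with something like `subprocess.call` would wait for the process to exit, which is consistent with the warning that execute script may block WebDriver execution until the script completes.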

Limitations in, deviations from, & notes on WebDriver API implementation

  • no find elements

  • no get window handles

  • no timeouts mapping (yet)

  • no get (list of) sessions but can get specific single session by session ID

  • for sending special keys, modifier keys, follow the AutoIt format/syntax (all within quoted string) as opposed to WebDriver spec. It was easier to support/implement this way.

  • not all AutoIt control types are mapped/accessible to WebDriver protocol. I couldn't figure out the mapping, assuming it was possible. Some examples: tree view, list view, combobox, listbox. If you have ideas on how to map between AutoIt and WebDriver JSONWireProtocol please do share.

  • ControlCommand - not all commands are exposed through WebDriver; those that take option arguments are not. I couldn't figure out how to map them to JSONWireProtocol. If you have ideas for this, please do share. The ones that are mapped are available via getAttribute and get CSS property.

  • no formal multiple file upload support due to Selenium issue 2239. Attempting to do so may yield undesirable results. You can, however, try something like this

  • Due to Selenium issue 2239, any keys sent to an element/control that look like file path(s) may yield unexpected results. (1) The WebDriver client may trigger a file upload behind the scenes to AutoItDriverServer. On upload completion, the server sends back the path of the uploaded file for WebDriver to send to the UI, so the file path you see may not be the one listed in the test code, unless your WebDriver client binding has an option to override/disable the local file detector (yes in Java, no in Python). (2) Following on from (1), if the keys sent look like an array of multiple file paths, as when uploading multiple files in a browser, the results may also be unexpected. Both (1) and (2) may apply even if you did not intend to do a file upload in a Windows UI, since standard WebDriver clients will treat it that way. The workaround, besides the local file detector option, is to customize a version of the WebDriver client specifically for AutoItDriverServer (e.g. an AutoItDriver client binding).

  • no proper WebDriver exceptions for elements. Mapping desktop GUI objects, which are more or less stateless in AutoIt, to WebDriver is tricky, so most of the time you'll just get generic exceptions on failure.

  • as AutoIt-recognizable desktop GUI objects are not quite like DOM objects, and desktop application testing differs from web testing, locating an element yields no real object reference for a WebElement. Instead, the server tracks the located element by encoding the location strategy value that was originally used to find it and returning this to the client for use in future requests against the same element. The value is base64 encoded and the result is then URL encoded; this becomes the WebElement reference ID/value. When the server receives a request, it decodes the ID and uses it to look up the element again.
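
The element reference scheme in the last item (base64, then URL encoding) can be sketched as a round trip. This assumes standard base64 and UTF-8 text, which this page does not confirm; the function names are hypothetical.

```python
import base64
from urllib.parse import quote, unquote


def encode_element_id(locator: str) -> str:
    """Base64 encode the AutoIt locator, then URL encode the result."""
    b64 = base64.b64encode(locator.encode("utf-8")).decode("ascii")
    # safe="" so that "/", "+" and "=" from base64 are percent-encoded too.
    return quote(b64, safe="")


def decode_element_id(element_id: str) -> str:
    """Reverse the encoding to recover the original AutoIt locator."""
    b64 = unquote(element_id)
    return base64.b64decode(b64).decode("utf-8")


locator = "[CLASS:Button; TEXT:Finish; INSTANCE:2]"
element_id = encode_element_id(locator)
print(element_id)
print(decode_element_id(element_id))
```

Because the ID is just the encoded locator, the server can look the control up again on every request without keeping per-element state.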