How to Troubleshoot
When launching a deployment, as described in this KB, users have the option to "retain the deployment on error." If this box is not checked and the deployment encounters an error, you will receive an email stating that your deployment was not successful, and the run itself will be released (so that cancelled runs do not take up capacity if forgotten). For the best troubleshooting results, check "retain the deployment on error," as this keeps the errored run in its current state so that you can view the run and troubleshoot the failure.
The troubleshooting procedures below rely on the data from the errored deployment being retained. We recommend always checking the "retain the deployment on error" box when you launch a deployment for the first time, after patch-related asset changes, or after any other change that involves code or software.
The guide below will help you do basic deployment troubleshooting on your own, and tell you when you should put in a trouble ticket to support.
How to determine if your Deployment has failed
The most obvious way to determine if a deployment has failed will be a red error tag visible from the "Runs" menu, and visible at the host level by checking the status bar as shown below.
A deployment can fail or hang without an error message if a script used to install an asset has a logic loop or bad syntax and no exit code is specified for the error condition. As a rule of thumb, Linux systems take roughly 10-15 minutes to provision, depending on how many assets are being installed. Windows systems can take a little longer, with some deployments that install large numbers of assets taking up to 30 minutes to provision.
If your asset is taking 45 minutes to an hour (or longer) to provision, the deployment is likely in an errored or hung state and will not complete successfully. To avoid this situation, use exit codes as described in this KB so that failed deployments always error out with a parsable error code.
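As a minimal sketch of the exit-code advice above, assume an asset's install script is written in Python and that a nonzero exit status is what marks the install as failed (the KB on exit codes is the authority on what CONS3RT actually expects). The `run_steps` helper and the 600-second timeout are hypothetical names and values chosen for illustration. Each step runs with a timeout, so a hung command ends in an error instead of stalling the deployment silently.

```python
import subprocess
import sys

STEP_TIMEOUT_SECONDS = 600  # assumed limit for illustration; tune per asset


def run_steps(steps, timeout=STEP_TIMEOUT_SECONDS):
    """Run each command in order.

    Returns 0 if every step succeeds, otherwise the 1-based index of the
    first step that failed, timed out, or could not be started.
    """
    for index, command in enumerate(steps, start=1):
        try:
            result = subprocess.run(command, timeout=timeout)
        except subprocess.TimeoutExpired:
            # The child process is killed when the timeout expires.
            print(f"step {index} timed out after {timeout}s", file=sys.stderr)
            return index
        except OSError as exc:
            # Raised, for example, when the executable does not exist.
            print(f"step {index} could not start: {exc}", file=sys.stderr)
            return index
        if result.returncode != 0:
            print(f"step {index} exited with {result.returncode}", file=sys.stderr)
            return index
    return 0
```

An install script built this way would end with `sys.exit(run_steps(steps))`, where `steps` is a list of argument lists such as `[["yum", "-y", "install", "httpd"]]`, so that any failing step produces a distinct nonzero exit code.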
There are a number of different error codes that can be generated, and most are highly descriptive of exactly where in the code, or in which process the failure occurred. You can look in the Activity Log as shown below to see exactly where in the deployment process the error occurred.
Typically, error messages can be parsed by breaking them down into recognizable chunks. Take the error message below:
SystemBuilder dr119693v0:caught ExecutionException while iteratively installing assets (message = CONS3RT encountered an error while installing asset 109610. Please consult the asset documentation or the author of the asset for assistance.)
Let's break it down!
- "SystemBuilder dr119693v0:" - This is saying it was attempting to deploy Deployment Run (DR) 119693v0
- "ExecutionException while iteratively installing assets" - This is saying that the error occurred when installing an Asset
- "CONS3RT encountered an error while installing asset 109610." - This is telling us exactly which Asset encountered an error
So that error message is telling us that there is something wrong with Asset 109610 that should be investigated for this deployment. This one shouldn't require a trouble ticket to address; it might just be a minor configuration issue with a single asset in this deployment.
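The same breakdown can be sketched in code. The snippet below is an illustration only: the regular expression is inferred from the single example message above, not from a documented CONS3RT message format, and `parse_asset_error` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one example message; other messages may differ.
ASSET_ERROR = re.compile(
    r"SystemBuilder (?P<host>dr(?P<run_id>\d+)v\d+):.*?"
    r"error while installing asset (?P<asset_id>\d+)",
    re.DOTALL,
)


def parse_asset_error(message):
    """Return the host label, run ID, and asset ID, or None if no match."""
    match = ASSET_ERROR.search(message)
    if match is None:
        return None
    return {
        "host": match.group("host"),
        "run_id": int(match.group("run_id")),
        "asset_id": int(match.group("asset_id")),
    }


EXAMPLE = (
    "SystemBuilder dr119693v0:caught ExecutionException while iteratively "
    "installing assets (message = CONS3RT encountered an error while "
    "installing asset 109610. Please consult the asset documentation or "
    "the author of the asset for assistance.)"
)

print(parse_asset_error(EXAMPLE))
```

For the example message this yields host `dr119693v0`, run ID `119693`, and asset ID `109610`, which is the asset to investigate.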
Let's look at a more complex error message!
creation of host set in virtualization realm failed (message = buildHostSet could not add host dr119885v0 with template UUID baef6cb9-a77a-4768-a64c-d6f7a3f8f83d and order 1 to host set (message = caught VCloudCommandException adding VM to vApp dr119885-d119834-2020-05-30-17-55-27 (message = Recomposing Virtual Application dr119885-d119834-2020-05-30-17-55-27(76b87913-f9a5-432d-a700-bacc98978751) error message/major code/minor code = [ b56c08f3-2151-47fe-ab58-5b4b5ccd1913 ] Following errors occurred while updating network connections: There are insufficient IP addresses to complete operation. You need to add IP addresses to the network that is associated with the object being created or deployed./400/BAD_REQUEST)))
- "creation of host set in virtualization realm failed" - This is saying that creating the host set in the specified cloudspace failed
- "buildHostSet could not add host dr119885v0 with template UUID baef6cb9-a77a-4768-a64c-d6f7a3f8f83d" - This is saying that the CONS3RT backend could not add the host for the DR (119885v0)
- "Following errors occurred while updating network connections: There are insufficient IP addresses to complete operation. You need to add IP addresses to the network that is associated with the object being created or deployed." - This error is more serious. It indicates that this cloudspace has run out of IP addresses to assign to this host.
This error message about insufficient IP addresses definitely merits submitting a trouble ticket to support. It's highly unusual for a cloudspace to run out of IPs, and resolving it requires help from CONS3RT team members.
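Nested messages like the one above can also be split mechanically. The sketch below assumes that each level is introduced by the literal text " (message = " and closed by one parenthesis at the very end of the string, which holds for this example but is not a documented format guarantee. `split_nested_message` is a hypothetical helper name.

```python
NESTED_MARKER = " (message = "


def split_nested_message(message):
    """Split a nested error message into its levels, outermost first."""
    parts = message.split(NESTED_MARKER)
    depth = len(parts) - 1
    # Each marker opens one parenthesis that is closed at the end of the string.
    if depth and parts[-1].endswith(")" * depth):
        parts[-1] = parts[-1][:-depth]
    return [part.strip() for part in parts]


EXAMPLE = (
    "creation of host set in virtualization realm failed (message = "
    "buildHostSet could not add host dr119885v0 with template UUID "
    "baef6cb9-a77a-4768-a64c-d6f7a3f8f83d and order 1 to host set "
    "(message = caught VCloudCommandException adding VM to vApp "
    "dr119885-d119834-2020-05-30-17-55-27 (message = Recomposing Virtual "
    "Application dr119885-d119834-2020-05-30-17-55-27"
    "(76b87913-f9a5-432d-a700-bacc98978751) error message/major code/minor "
    "code = [ b56c08f3-2151-47fe-ab58-5b4b5ccd1913 ] Following errors "
    "occurred while updating network connections: There are insufficient IP "
    "addresses to complete operation. You need to add IP addresses to the "
    "network that is associated with the object being created or "
    "deployed./400/BAD_REQUEST)))"
)

# Print each level indented by its nesting depth.
for level, text in enumerate(split_nested_message(EXAMPLE)):
    print("  " * level + text)
```

In this example the innermost level carries the actual cause (insufficient IP addresses), while the outer levels describe which operation was wrapping it.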
What about an Agent Check-in Monitor error?
An AGENT CHECKIN MONITOR error code relates to the CONS3RT backend's communication with your launched deployment and needs our team's assistance to resolve. If you get this error, put in a support ticket via firstname.lastname@example.org and we will investigate.
No Access to Cloudspaces Error
When attempting to launch a deployment, the following error message may appear: "Sorry, you don't have access to any cloudspaces that can run this deployment." The reasons for receiving this message are listed below:
Wrong Project Selected
- If you have access to multiple projects, and the wrong Project is selected when you attempt to launch, you may not be able to launch your Deployment.
- Your default project is set to "Community" upon account creation, so you will have to change your default project. Also, make sure to select the Project associated with the Cloudspace you are deploying into, using the "Project" drop-down menu in the top-right corner of the page.
System Design not supported in the Cloudspace
- Your Deployment contains one or more Systems, each defined by the following specifications:
  - Operating System
  - Number of CPUs
  - Memory / RAM
  - Boot disk size
  - Additional disks
- The Cloudspace may not include an OS Template that meets the specification of one or more Systems in your Deployment. For example, this happens if you attempt to deploy Amazon Linux into Azure.
Contact one of your Team Managers. They can review the Operating System (OS) templates in each Cloudspace and check the specifications on each one.
If you are a Team Manager:
- From the Main Nav Menu, select Cloudspaces
- Select your Cloudspace
- Click on the OS Templates Tab
- Ensure the Selected Operating System exists
- If it exists, click on it to review its specifications and compare them with the system designs
If none of the above solves the problem, please submit a Support Request.
Learn more about Troubleshooting by checking out our YouTube tutorials: