Server Configuration
...
Is the service running?
If not, check the service logs and the Windows log for information.
Does the server state in the database mirror the actual state of the service?
- If the service is running but the database records it as being stopped, the service will not be served jobs.
Code Block |
---|
select ServerID, convert(nvarchar(50), Hostname) as Hostname, ss.Name as ServerState,
       MaxQueueLength, MaxActiveJobs, IsQueueProcessingEnabled
from config.t_Server s
inner join config.t_ServerState ss on s.StateID = ss.ServerStateID |
.
Location Configuration
...
Code Block |
---|
select l.Name as Location, l.IsEnabled, l.MaxScanningCount, s.ServerId, s.Version, s.Hostname
from config.t_Location l
left join config.t__Location_Server ls on l.LocationId = ls.LocationID
left join config.t_Server s on ls.ServerID = s.ServerID
left join config.t_ServerState ss on s.StateID = ss.ServerStateID |
If there is a scan window for the location, jobs will only be served during that window. If the scan window is an exclusion, there also needs to be a non-exclusion scan window for jobs to be served.
Code Block |
---|
select sw.*, swwd.WeeklyDays, l.LocationId, l.Name
from config.t_ScanWindow sw
left join config.t_ScanWindowWeeklyDays swwd on sw.ScanWindowID = swwd.ScanWindowID
left join config.t_Location l on sw.LocationID = l.LocationID |
.
Project Configuration
The project should be running.
Code Block |
---|
select p.Name as Project, ps.Name as [Status]
from config.t_Project p
inner join config.t_ProjectState ps on p.ProjectStateID = ps.ProjectStateID |
.
Current Snapshot of Incomplete Jobs
Code Block |
---|
-- @PHASE_IN_PROGRESS is not declared in this snippet; declare it and set it to the in-progress TargetPhaseID before running.
select s.ServerID, convert(nvarchar(50), s.Hostname) as ServerHostname, tp.Name as JobPhase, ts.Name as JobStatus,
       j.JobId, ParentJobID, convert(nvarchar(50), j.Hostname) as JobHostname, j.IPAddressBinary,
       IsComplete, StartDate, EndDate, DATEDIFF(hh, StartDate, getDate()) as 'Duration (Hours)',
       q.JobStatusID, ' QUEUE DATA -->', js.Name as QueueStatus,
       DisabledUntil, [Priority], Source, QueueId, IgnoreConfiguredIPs
from [jobs].[t_Job] j with (nolock)
inner join config.t_Server s on j.ServerID = s.ServerID
inner join jobs.t_TargetStatus ts on j.TargetStatusID = ts.TargetStatusID
inner join jobs.t_TargetPhase tp on ts.TargetPhaseID = tp.TargetPhaseID
inner join jobs.t_Queue q on j.JobId = q.JobID
inner join jobs.t_JobStatus js on q.JobStatusID = js.JobStatusID
where tp.TargetPhaseID = @PHASE_IN_PROGRESS or IsComplete = 0 |
.
Target Set Expansions
Have all target sets been expanded fully?
Code Block |
---|
select ipar.IPAddressRangeId, ipar.Name, ipart.Name as TargetSetType, StartIPAddress, EndIPAddress,
       jobs.convertBinaryToIP(aip.Start) as ActiveIPRangesStart,
       jobs.convertBinaryToIP(aip.Finish) as ActiveIPRangesFinish,
       aip.LocationID, ProjectID,
       case jobs.convertBinaryToIP(aip.Finish) when EndIPAddress then 'Fully Expanded' else 'In Progress' end as [State]
from [jobs].[t_ActiveIPRanges] aip with (nolock)
inner join config.t_IPAddressRange ipar with (nolock) on aip.IPAddressRangeID = ipar.IPAddressRangeID
inner join config.t_IPAddressRangeType ipart with (nolock) on ipart.IPAddressRangeTypeID = ipar.IPAddressRangeTypeID |
If yes, the scan engine believes it has expanded everything it needs to.
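If the answer is no, it can help to list only the ranges that are still being expanded. The following is a minimal sketch rather than a product script. It reuses the jobs.t_ActiveIPRanges and config.t_IPAddressRange tables and the jobs.convertBinaryToIP function that appear in the queries in this guide, and it assumes that StartIPAddress and EndIPAddress are columns of config.t_IPAddressRange.

```sql
-- Sketch: ranges whose active expansion has not yet reached the configured end address.
-- Assumption: StartIPAddress and EndIPAddress belong to config.t_IPAddressRange.
select ipar.IPAddressRangeID,
       ipar.StartIPAddress,
       ipar.EndIPAddress,
       jobs.convertBinaryToIP(aip.Finish) as ExpandedUpTo
from jobs.t_ActiveIPRanges aip with (nolock)
inner join config.t_IPAddressRange ipar with (nolock)
    on aip.IPAddressRangeID = ipar.IPAddressRangeID
where jobs.convertBinaryToIP(aip.Finish) <> ipar.EndIPAddress;
```

Ranges with a NULL EndIPAddress, and ranges that have no row in jobs.t_ActiveIPRanges yet, are not returned by this sketch, so an empty result does not by itself prove that every target set has been expanded.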
Are there Jobs Queued?
Run the Job Stats script here.
- Does the Queue By State query show queued items?
- Does the Queue By State and Project show that there are items queued for projects which are running?
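The Job Stats script itself is not reproduced on this page. As a rough stand-in, the two questions above can be approximated with a query like the one below. It is a sketch under the assumption that the tables and columns used elsewhere in this guide (jobs.t_Queue, jobs.t_JobStatus, config.t_Project, config.t_ProjectState) are available as shown; it is not the Job Stats script.

```sql
-- Sketch: queue items counted by project, project state and queue status.
-- Look for rows with a queued status whose ProjectState is the running state.
select p.Name  as Project,
       ps.Name as ProjectState,
       js.Name as QueueStatus,
       count(*) as Items
from jobs.t_Queue q with (nolock)
inner join jobs.t_JobStatus js on q.JobStatusID = js.JobStatusID
inner join config.t_Project p on q.ProjectId = p.ProjectId
inner join config.t_ProjectState ps on p.ProjectStateID = ps.ProjectStateID
group by p.Name, ps.Name, js.Name
order by p.Name, js.Name;
```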
Are the Queue Processing procs being executed?
If not all target sets have been fully expanded and there are no queued jobs, check that the queue step procs are being executed.
- Open SQL Profiler.
- Select the File > New Trace menu option and log in to the server. Specific permissions/securables are required (possibly ALTER TRACE). A sysadmin user will definitely have access.
- In the Trace Properties dialog which appears, select the Events Selection tab.
- Tick the Show all events and Show all columns checkboxes.
- In the event list, ensure only the following items are ticked:
  - Stored Procedures
    - RPC: Completed
    - RPC: Starting
    - SP: Completed
    - SP: Starting
- Click the Column Filters button.
- Add Like filters for the following properties (if there is a pre-existing filter on ApplicationName, leave it there):
  - DatabaseName: name of the iQSonar database (normally iQSonarSE)
  - HostName: hostname of the scanning server
  - TextData: %jobs%
- Start the trace.
Let it run for about 5 minutes. We expect to see regular executions of jobs.QueueStep1_ExpandRanges and jobs.QueueStep2_SetJobStatus.
If you don't see this, it's likely the threads have fallen over and a service restart may be required.
If yes, and nothing is being queued, find out why.
- What states are the queued items in, and do they indicate what's happening (e.g. project not started, exclusion scan window in effect)?
- Review the jobs.QueueStep2_SetJobStatus code to determine why jobs aren't reaching a state of Queued.
If yes, but the rate of job throughput is slow:
- Run the Queued Job Analysis script here. This will record how many jobs are being added to and removed from the queue.
- Do the scan engine servers have free capacity?
- Can the queued items only be picked up by a server which is currently maxed out?
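For the capacity question, one rough check is to compare each server's incomplete job count with its configured limit. The query below is a sketch only. It assumes that jobs are stored in a jobs.t_Job table with ServerID and IsComplete columns (inferred from the other queries in this guide, not confirmed), and it treats incomplete jobs as a proxy for active jobs.

```sql
-- Sketch: incomplete jobs per server next to the configured MaxActiveJobs limit.
-- Assumption: jobs.t_Job has one row per job, with ServerID and IsComplete columns.
select s.ServerID,
       convert(nvarchar(50), s.Hostname) as Hostname,
       s.MaxActiveJobs,
       count(j.JobId) as IncompleteJobs
from config.t_Server s
left join jobs.t_Job j with (nolock)
    on j.ServerID = s.ServerID and j.IsComplete = 0
group by s.ServerID, convert(nvarchar(50), s.Hostname), s.MaxActiveJobs
order by s.ServerID;
```

A server whose IncompleteJobs is at or near MaxActiveJobs is a candidate for being maxed out. The left join keeps servers that currently have no incomplete jobs in the result with a count of 0.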
Is the job serving proc running?
If items are queued and the queue processing procs are being executed, follow the steps above to create a database trace and check for regular executions of jobs.Targets_Serve.
Is the server in breach?
Queue processing is suspended when the average resource usage over 5 minutes on the scanning server exceeds the levels configured in Administration > Scanning Servers. It only resumes once average resource usage has stayed below the configured limits for 5 minutes. To determine whether a server is in breach of these limits, execute the following query. Its output is a set of SQL statements, one per server, each of which checks the performance limits for that server. Copy and paste the results into SQL Management Studio and execute them. If the value in any of the Is<X>InBreach columns is 1, queue processing and job serving have been suspended. Also check the logs for the following text: Queue processing is suspended.
Code Block |
---|
select '/* Server: ' + s.HostName + ' */ exec [config].[Performance_Check] ''' + convert(nvarchar(50), InstallationID) + ''' -- Server: ' + s.Hostname from config.t_Server s |
Anything in the logs?
Search the service logs for the text iQSonar Job Poll. How much information you see here depends on the level of logging (configuring this is not in scope here). Look for errors or warnings, but also look for messages similar to any of the following:
- No queue capacity.
- Queue processing is suspended
- Polled N new targets
- No new targets available
- Failed to process target
- Job acquisition stopped
- iQSonar license has expired - scanning is suspended
- iQSonar Server not activated - scanning is suspended.
.
Is Anything Blocking the Queue?
Some builds (pre Brendan R3 RC2) have an issue where targets from non-running projects are allowed to remain in the Queued state. These targets will never be picked up by job serving, so they block the addition of new items to the queue.
Code Block |
---|
declare @PROJ_STATE_RUNNING int = (select ProjectStateID from config.t_ProjectState where Name = 'Running')
declare @JOB_STATUS_QUEUED int = (select JobStatusID from jobs.t_JobStatus js where Constant = 'JOB_STATUS_QUEUED')
select *
from jobs.t_Queue q
inner join config.t_Project p on q.ProjectId = p.ProjectId
inner join config.t_ProjectState ps on p.ProjectStateID = ps.ProjectStateID
where JobStatusID = @JOB_STATUS_QUEUED
and p.ProjectStateID != @PROJ_STATE_RUNNING
|
To fix this, run:
Code Block |
---|
declare @PROJ_STATE_RUNNING int = (select ProjectStateID from config.t_ProjectState where Name = 'Running')
declare @JOB_STATUS_QUEUED int = (select JobStatusID from jobs.t_JobStatus js where Constant = 'JOB_STATUS_QUEUED')
declare @JOB_STATUS_PROJECT_NOT_RUNNING int = (select JobStatusID from jobs.t_JobStatus js where Constant = 'JOB_STATUS_PROJECT_NOT_RUNNING')
update q
set q.JobStatusID = @JOB_STATUS_PROJECT_NOT_RUNNING
from jobs.t_Queue q
inner join config.t_Project p on q.ProjectId = p.ProjectId
inner join config.t_ProjectState ps on p.ProjectStateID = ps.ProjectStateID
where q.JobStatusID = @JOB_STATUS_QUEUED
and p.ProjectStateID != @PROJ_STATE_RUNNING |
There was also an issue where targets in the queue cannot be linked back to Targets. The root cause is still under investigation. The following statement identifies queue items affected by this problem. Again, they block spaces in the queue because job serving will not pick them up.
Code Block |
---|
declare @JOB_STATUS_QUEUED int = (select JobStatusID from jobs.t_JobStatus js where Constant = 'JOB_STATUS_QUEUED')
select *
from jobs.t_Queue q
left join (
    select p.projectID, ipar.StartIPAddressBinary, ipar.EndIPAddressBinary, ipar.Hostname, ipar.IPAddressRangeID
    from config.t_Project p
    inner join config.t__Project_Location pl on p.ProjectID = pl.ProjectID
    inner join config.t_Location l on pl.LocationID = l.LocationID
    inner join config.t_IPAddressRange ipar on l.LocationID = ipar.LocationID
    where ipar.IsExclusion = 0
) projectTargets
    on q.ProjectId = projectTargets.ProjectID
    and (
        (q.IPAddressBinary between projectTargets.StartIPAddressBinary and projectTargets.EndIPAddressBinary)
        or (q.IPAddressBinary = projectTargets.StartIPAddressBinary and projectTargets.EndIPAddressBinary is null)
        or (q.Hostname = projectTargets.Hostname)
        or (q.IgnoreConfiguredIPs = 1)
    )
where q.JobStatusID = @JOB_STATUS_QUEUED
and projectTargets.IPAddressRangeID is null |
Is the server actually requesting jobs?
- Check the service logs for a message of the form Polled X new targets. If X is -1 then the server has no further capacity to serve additional jobs.
How many jobs are currently in progress?
- How many jobs are currently in progress?
- For the longest-running jobs, are the target logs being updated? If not, the job may have died or not been recovered after a restart/crash.
Code Block |
---|
select s.ServerID, convert(nvarchar(50), s.Hostname) as ServerHostname, tp.Name as JobPhase, ts.Name as JobStatus,
       j.JobId, ParentJobID, convert(nvarchar(50), j.Hostname) as JobHostname, j.IPAddressBinary,
       IsComplete, StartDate, EndDate, DATEDIFF(hh, StartDate, getDate()) as 'Duration (Hours)',
       q.JobStatusID, ' QUEUE DATA -->', js.Name as QueueStatus,
       DisabledUntil, [Priority], Source, QueueId, IgnoreConfiguredIPs
from [jobs].[t_Job] j with (nolock)
inner join config.t_Server s on j.ServerID = s.ServerID
inner join jobs.t_TargetStatus ts on j.TargetStatusID = ts.TargetStatusID
inner join jobs.t_TargetPhase tp on ts.TargetPhaseID = tp.TargetPhaseID
inner join jobs.t_Queue q on j.JobId = q.JobID
inner join jobs.t_JobStatus js on q.JobStatusID = js.JobStatusID
where IsComplete = 0 |