Discovery Best Practices

Overview

This document covers the most common settings to tune to make discovery more efficient. Environmental factors contribute to some of these settings.

Global Settings

The settings below might make discovery more efficient, regardless of an organization's size.

Enabling Port Scanning

During a port scan, if the connection attempt fails due to a network failure before the port scan timeout elapses, another attempt is made with the remaining time. This means that port scans run for a minimum of the time specified. The operation can also take up to about 20 seconds longer than the port scan timeout to complete. This applies to Verify Privilege Vault version 11.6 and later.

Introduction

Port scanning is a scan that can be conducted before the regular discovery scan to potentially reduce discovery time. If the specified ports are unavailable on a given machine, the standard discovery scan waits until it times out (the default is five minutes). Port scanning eliminates that wait, which saves time.

Figure: Edit Discovery Scanner for Windows Local Accounts

Port scanning for discovery has three configurations or controls:

Port Scan Enable: Whether to port scan at all. Defaults to unchecked.
Port Scan Timeout: How long (in seconds) the port scan will try before giving up. Defaults to 30.
Port Scan List: A comma-delimited list of ports to scan. These depend on the configuration of the systems you will scan. Defaults to 135 (RPC endpoint mapper) and 445 (SMB).
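
The idea behind these three controls can be illustrated with a minimal sketch. It is not how Verify Privilege Vault implements port scanning. The function name any_port_open is hypothetical, the rule that one open port is enough is an assumption, and the timeout here applies to each port separately rather than to the whole scan.

```python
import socket


def any_port_open(host, ports=(135, 445), timeout_seconds=30.0):
    """Return True as soon as one listed TCP port accepts a connection.

    Assumption: a machine counts as reachable if any port answers.
    The timeout applies to each port separately in this sketch.
    """
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout, or no route.
            with socket.create_connection((host, port), timeout=timeout_seconds):
                return True
        except OSError:
            continue
    return False
```

A caller would skip the full discovery scan for any host where this returns False, which avoids waiting for the five-minute scanner timeout.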

Examples of scanners that have a port-scanning timeout option for Active Directory include:

Windows local accounts
Active Directory user accounts
All dependency scanners

Accessing Port Scanning

Go to Search for Accounts (tab), and then click the pencil icon for the desired scanner. If the port scanning settings appear on that page, that scanner supports port scanning. See the previous figure.

Additional Reasons to Consider Discovery Port Scanning

Lowering the Discovery Scanner Timeout May Cause Issues

If you lower the regular discovery scanner timeout, without port scanning enabled, you may kill a running scan. In addition, non-Active-Directory discovery scanners, such as a custom PowerShell scanner, that are slow or prone to hanging may also be disrupted or even crash if the regular discovery scanner timeout is set too low. As a best practice, we recommend enabling port scanning and not lowering the regular scanner timeout, which defaults to five minutes, unless IBM Security Support asks you to. Do not lower the port scanning timeout below 15 seconds.

Secrets with Multiple Dependencies May Create Especially Long Timeouts

Without discovery port scanning enabled, discovery scanners rely on the standard timeout, which defaults to five minutes. If a secret has multiple dependencies, the system may have a chain of discovery timeouts to process, one at a time. With the default five-minute timeout on all the systems, timing out can take a long time, especially if you have a lot of machines turned off or unavailable. Discovery port scanning greatly reduces that.

To calculate the maximum timeout for discovery, use this formula (which assumes all systems use the same timeout value and each secret has the same number of dependencies):

(number of secrets) × (number of dependencies) × (timeout value) = (maximum minutes for discovery scans)

For example, using the default five-minute timeout value for 35 secrets, each with three dependencies:

35 × 3 × 5 = 525

Thus, up to 8.75 hours (525 ÷ 60) of timeouts are possible, so enabling discovery port scanning is strongly advisable, especially if you have a lot of machines down at any given time.
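
The same arithmetic can be scripted for your own numbers. This is a minimal sketch of the formula above; the function name max_timeout_minutes is hypothetical, and it keeps the formula's assumption that every dependency times out.

```python
def max_timeout_minutes(secrets, dependencies_per_secret, timeout_minutes):
    """Worst-case minutes spent waiting when every dependency times out."""
    return secrets * dependencies_per_secret * timeout_minutes


total = max_timeout_minutes(35, 3, 5)
print(total, "minutes =", total / 60, "hours")  # 525 minutes = 8.75 hours
```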

We can ignore clustered objects as part of a discovery scan, but we cannot ignore disabled computer objects, so Verify Privilege Vault tries to scan each object that exists within AD. If you keep disabled computer objects in a centralized OU, consider configuring discovery to be OU-specific and excluding that OU to make discovery more efficient.

Windows enforces a maximum time limit for a response to a TCP SYN. The first attempt waits 3 seconds, and each retry then waits for an increasingly long limit. The number of retries is determined by MaxSynRetransmissions, which can have a value from 2 to 8.

Maximum time Windows will wait for each MaxSynRetransmissions value:
- 2: 7 sec
- 3: 15 sec
- 4: 21 sec
- 5: 63 sec
- 6: 123 sec
- 7: 183 sec
- 8: 243 sec

To prevent timeouts, update to Verify Privilege Vault 11.6 or later and run "netsh interface tcp set global MaxSynRetransmissions=N" on the Windows server that the Delinea Distributed Engine runs on.

Choose the lowest value for N whose Windows timeout is greater than or equal to the Discovery Port Scan Timeout.

Example: for a Discovery Port Scan Timeout of 30 seconds, set MaxSynRetransmissions=5, which causes the Windows TCP stack to wait up to 63 seconds (the lowest listed value that is greater than or equal to 30 seconds).
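
The lookup can be expressed as a short helper. This is a sketch only: the wait times are copied from the list in this section, and the function name pick_max_syn_retransmissions is hypothetical.

```python
# Maximum SYN wait (seconds) per MaxSynRetransmissions value, as listed in this section.
WINDOWS_SYN_WAIT_SECONDS = {2: 7, 3: 15, 4: 21, 5: 63, 6: 123, 7: 183, 8: 243}


def pick_max_syn_retransmissions(port_scan_timeout_seconds):
    """Return the lowest N whose Windows wait is >= the port scan timeout."""
    for n in sorted(WINDOWS_SYN_WAIT_SECONDS):
        if WINDOWS_SYN_WAIT_SECONDS[n] >= port_scan_timeout_seconds:
            return n
    raise ValueError("port scan timeout exceeds the longest Windows SYN wait")


n = pick_max_syn_retransmissions(30)
print(f"netsh interface tcp set global MaxSynRetransmissions={n}")
```

For a 30-second port scan timeout this prints the command with MaxSynRetransmissions=5, matching the example above.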

When to Run Discovery

Currently, you cannot set when discovery runs via a control or setting. You can, however, approximately set when it runs by disabling and enabling it at the desired time. It runs daily around the same time as when it was first enabled and then again according to whatever the discovery scan offset hours interval was set to. If you are running discovery once per day, we suggest:

Choosing a start time outside your normal business hours, such as midnight.
First running several ad-hoc discoveries when your network traffic normally drops at the end of the day. Record how long each discovery process takes. Remember, this can vary greatly if a lot of machines are down, which is why we suggest conducting more than one discovery.

Running a test with discovery port scanning disabled may provide valuable insights into the differences in performance or results.
Using the average time the test runs took, choose a start time so that no anticipated portion of the discovery period falls within your high-traffic times. We suggest having an end buffer as long as possible to account for variability, so if your average discovery time is fairly long, it might be best to start discovery soon after your network traffic drops off for the evening. This is especially true if your machine pool is growing.

For example, if your tested average discovery time was four hours and your network traffic is busy between 0600 and 1800, you should start discovery between 1800 and 0200, the closer to 1800 the better.
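
The latest acceptable start time in that example is the start of busy hours minus the average scan duration. A minimal sketch follows; the function name latest_start is hypothetical, and it includes no buffer, so in practice you would start earlier.

```python
from datetime import datetime, timedelta


def latest_start(busy_start_hhmm, average_hours):
    """Latest HHMM start so an average-length scan ends before busy hours begin."""
    busy_start = datetime.strptime(busy_start_hhmm, "%H%M")
    return (busy_start - timedelta(hours=average_hours)).strftime("%H%M")


print(latest_start("0600", 4))  # 0200
```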

Discovery Settings

Figure: Discovery Settings Page

The settings are:

Discovery interval for days and hours: How often you want the regular discovery scan to occur.
Ignore Cluster Node Objects: Tells Verify Privilege Vault to not run discovery on machines identified as "msclustervirtualserver." Do not change this setting.
Discovery Scan Offset Hours: See Discovery Scan Offset Hours for a discussion of this setting.
Deactivate Dependency Not Found Threshold: If set to 0, dependencies are never disabled when they are not found during a scan. If set to a higher number, that number is the count of failed attempts to find the dependency before it is disabled. In other words, the threshold can be set to Never to prevent disabling, or to a count of times a dependency is not found.

There is another setting, "Discovery Batch Size," on the Advanced Settings page, which is usually only available to IBM Security Customer Support. This setting is legacy and should not be set.

Environment-Specific Considerations

Discovery Scan Offset Hours

This section discusses a setting that allows you to quickly discover changes without greatly increasing traffic.

Figure: Discovery Settings Page in View Mode

The "discovery scan offset hours" (DSOH) setting is for customers that need to detect new (to the network) systems quickly without excessive network traffic during business hours. For example, you might need this feature if you have lots of server testing (systems are up and down) or laptops (systems are connected or not). The trick is doing this while minimizing the networking load.

We accomplish this with discovery scan offsets. With these, you run multiple synchronization scans per day rather than just one. In each scan, Verify Privilege Vault considers every system, but it first looks up each system to see whether that system is flagged for scanning. The process goes like this:

Initially, Verify Privilege Vault scans each discovered system and resets its DSOH timer, which is set to the number of hours defined by the DSOH setting value. Verify Privilege Vault has a separate timer for each scanned system.
Once set, each timer starts counting down. Until that timer runs out, Verify Privilege Vault skips the scanned system whenever it runs a discovery scan.
When the timer is finished, the system is again flagged for scanning.
The next time Verify Privilege Vault does a discovery scan, it sees the flag is present and scans the system.

The period the "scan me" flag is down (the period the timer is running) is defined by the DSOH setting. Thus, DSOH essentially tells Verify Privilege Vault how long before scanning that discovered system again.

For example, if you have a discovery scan offset of 12 hours and a discovery interval of four hours:

Start: The first time discovery runs, it scans every object because each one's timer is zeroed out, which makes it flagged for scanning. After scanning, each object's timer starts to count down, which makes it unflagged for scanning.
At four hours: The next time discovery runs, it ignores the objects that were scanned the first time (because their timers were set to 12 hours), but it does process any newly discovered objects.
At eight hours: Four hours later, the same happens: only new objects are processed.
At 12 hours: Four hours later, the scan runs again. This time, the 12-hour scan offset has expired, and all the timers of the original objects are zeroed out. The process begins anew: discovery scans each original object because its timer is zeroed out, which makes it flagged for scanning. After scanning, each object's timer starts to count down, which makes it unflagged for scanning.
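
The timeline above can be reproduced with a small simulation. This is an illustrative model of the described behavior, not the product's code; the function name simulate_dsoh and the system names are hypothetical.

```python
def simulate_dsoh(offset_hours, interval_hours, total_hours, first_seen):
    """Return {run_hour: [systems scanned in that run]}.

    first_seen maps each system name to the hour it appears on the network.
    """
    timer_expires = {}  # system -> hour at which its DSOH timer runs out
    runs = {}
    for hour in range(0, total_hours + 1, interval_hours):
        scanned = []
        for system, seen_at in first_seen.items():
            if seen_at > hour:
                continue  # not discovered yet
            # A system with no timer, or an expired one, is flagged for scanning.
            if timer_expires.get(system, 0) <= hour:
                scanned.append(system)
                timer_expires[system] = hour + offset_hours
        runs[hour] = scanned
    return runs


runs = simulate_dsoh(12, 4, 12, {"srv-a": 0, "srv-b": 0, "laptop-c": 3})
for hour, scanned in runs.items():
    print(hour, scanned)
```

In this run, srv-a and srv-b are scanned at hours 0 and 12, laptop-c (which appeared at hour 3) is scanned at hour 4, and nothing is scanned at hour 8.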

Advanced Settings

These settings reside in the ConfigurationAdvanced.aspx file, which you should not edit unless IBM Security Support asks you to.

Run Secret Computer Matcher Once per Discovery

Figure: Secret Computer Matcher Once per Day

During the discovery process, secrets are matched with their machine. For smaller customers, this likely has little performance impact. For very large customers, the performance impact is noteworthy. We recommend that large businesses enable this option to decrease matcher resource use.

By default, the secret computer matcher runs once every five hours (this is non-configurable). This means the matcher runs four times per day, and only one of those times could coincide with discovery running at four-hour intervals. The other three will not run in tandem with discovery and thus will increase network traffic. If you enable this setting, the matcher will instead run after each discovery completes. If discovery only runs once, the matcher only runs once too. This is more efficient because discovery can take hours to run, and having the matcher run several times during that period wastes processing.

Limit the Network Traffic Caused by Nested Organizational Units

Figure: Discovery: Bypass "Scan Specific OUs"

If you configure discovery for Active Directory to scan by separate OUs and not by the entire domain, nested OUs can overwhelm your message bus. This occurs because each OU generates its own message unless you enable this setting. So if your enterprise has a complex tree of nested OUs, as many large businesses do, you could experience this issue. Smaller enterprises with a single nested OU or a small number of nested OUs can ignore it. If you change the configuration on the Advanced Configuration page, it will affect all discovery source settings (some scanners have a similar configuration that only affects them). Alternatively, for more flexibility, you can configure this individually at the scanner level by checking the Bypass Specific OU Scan check box on the Settings - Active Directory tab for the scanner:

Figure: Tuning Active Directory Settings

Engines and Engine Workers

The number of distributed engines and engine workers within your environment can affect how fast discovery completes. Increasing CPU counts on your existing engines may help them to complete a diverse set of tasks more efficiently but might not have much effect on discovery processing time. If an engine is doing discovery, only a subset of consumers run and they will run into a prefetch count limit (30 messages per engine). Thus, increasing the number of engines and engine workers might decrease total discovery time by increasing that prefetch limit.
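
As a rough illustration of why adding engines helps, the sketch below assumes the per-engine prefetch limits simply add up across engines. That additivity is an assumption for illustration, and the function name is hypothetical.

```python
PREFETCH_PER_ENGINE = 30  # prefetch count limit per engine, from the text above


def max_in_flight_messages(engine_count):
    """Upper bound on prefetched discovery messages, assuming limits add up."""
    return engine_count * PREFETCH_PER_ENGINE


for engines in (1, 2, 4):
    print(engines, "engine(s):", max_in_flight_messages(engines), "messages")
```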