The Collector is the on-premises component of InsightIDR: a machine on your network running Rapid7 software that either polls Event Sources for data or receives data from them, and makes that data available for InsightIDR analysis. An Event Source represents a single device that sends logs to the Collector.
For example, if you have three firewalls, you will have one Event Source for each firewall in the Collector.
It is usually more efficient to deploy multiple Collectors throughout an environment rather than break firewall rules or overload a single Collector.
You may need to distribute the bandwidth across your network if you have very high logging levels or if your network is geographically dispersed.
The Collector workflow has two main advantages over sending logs to InsightIDR directly: normalization and user attribution.
Normalization transforms log data from multiple diverse sources into a common JSON format and extracts standard fields such as hostnames, timestamps, and error levels. Because every log then shares the same field names, you can run more advanced queries on your endpoint logs and enhance your data visualization.
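To make the idea concrete, here is a minimal sketch of normalization in Python. It is not the Collector's actual implementation: the two raw formats (`syslog` and `keyvalue`), the `normalize` function, and the output field names are all assumptions chosen for illustration.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical raw format: "<ISO timestamp> <host> <LEVEL>: <message>".
# Real event sources use many different layouts.
SYSLOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"(?P<host>\S+) (?P<level>[A-Z]+): (?P<msg>.*)$"
)


def normalize(raw: str, source: str) -> dict:
    """Map one raw log line onto a shared set of field names."""
    if source == "syslog":
        m = SYSLOG_RE.match(raw)
        if m is None:
            raise ValueError(f"unparseable syslog line: {raw!r}")
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%SZ")
        ts = ts.replace(tzinfo=timezone.utc)
        host, level, msg = m["host"], m["level"], m["msg"]
    elif source == "keyvalue":
        # Hypothetical firewall format: space-separated key=value pairs.
        fields = dict(pair.split("=", 1) for pair in raw.split())
        ts = datetime.fromtimestamp(int(fields["time"]), tz=timezone.utc)
        host, level, msg = fields["dev"], fields["sev"], fields["msg"]
    else:
        raise ValueError(f"unknown source type: {source}")
    return {
        "timestamp": ts.isoformat(),
        "hostname": host,
        "level": level.lower(),
        "message": msg,
    }


if __name__ == "__main__":
    a = normalize("2023-11-14T22:13:20Z web01 ERROR: disk full", "syslog")
    b = normalize("time=1700000000 dev=fw01 sev=WARN msg=denied", "keyvalue")
    print(json.dumps([a, b], indent=2))
```

Both records come out with the same keys, so a single query on `hostname` or `level` covers logs that started in different formats.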
User attribution correlates endpoint activity with the individual users who were logged in on that endpoint at the time. Attribution provides a fuller picture of your security posture because user accounts are the most common targets for sophisticated attacks.
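The correlation described above can be sketched as a lookup of each event against known login sessions. This is only an illustration under simple assumptions: the `Session` record, the `attribute` function, and the use of epoch-second timestamps are hypothetical and do not describe how InsightIDR performs attribution internally.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Session:
    """A hypothetical login session: who was on which host, and when."""
    user: str
    hostname: str
    start: int  # epoch seconds, inclusive
    end: int    # epoch seconds, exclusive


def attribute(event: dict, sessions: List[Session]) -> Optional[str]:
    """Return the user whose session covers the event, or None if no match."""
    for s in sessions:
        same_host = s.hostname == event["hostname"]
        in_window = s.start <= event["time"] < s.end
        if same_host and in_window:
            return s.user
    return None


if __name__ == "__main__":
    sessions = [
        Session("alice", "laptop-7", 1000, 2000),
        Session("bob", "laptop-7", 2000, 3000),
    ]
    event = {"hostname": "laptop-7", "time": 2500, "action": "process_start"}
    print(attribute(event, sessions))
```

An event that falls outside every session window stays unattributed (`None`) rather than being assigned to a guessed user.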
If you decide to use the Collector, endpoint information can take up to 5 minutes to appear in InsightIDR. Consider the ‘Add Log’ workflow if real-time visibility of logs is a critical priority.
When setting up the Collector, you should be aware that:
- InsightIDR ingests data from existing sources in your environment. It needs administrator access, from a Domain Admin account if possible, to pull data from these sources or to push data to log aggregators.
- Treat each Collector as you would any other valuable asset, because it stores the credentials for your event sources.
- InsightIDR normalizes and attributes data on AWS but does not store credentials. The Collector strips unnecessary raw log data within your environment to prevent storage of sensitive data, such as personally identifiable information, medical records, and employee, organization, or asset names.
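As a rough illustration of the last point, stripping unnecessary data before it leaves your environment can be thought of as keeping only an allowlist of fields. The field names and the `strip_unneeded` helper below are hypothetical; the Collector's real filtering rules are not described here.

```python
# Hypothetical allowlist: only these fields are forwarded for analysis.
ALLOWED_FIELDS = {"timestamp", "hostname", "level", "event_type"}


def strip_unneeded(record: dict) -> dict:
    """Drop every field that is not on the allowlist."""
    return {k: v for k, v in record.items() if k in ALLOWED_FIELDS}


if __name__ == "__main__":
    raw = {
        "timestamp": "2023-11-14T22:13:20+00:00",
        "hostname": "web01",
        "level": "info",
        "event_type": "login",
        "patient_record": "example sensitive value",
    }
    print(strip_unneeded(raw))
```

An allowlist is used in this sketch rather than a blocklist so that a field nobody anticipated is dropped by default.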