
Don’t Let Development Tank Production

March 31, 2022
Matt Cimino

The Scenario

You have a production environment and a testing environment. Testing is used for troubleshooting conf files and testing ingestion props. However, testing is hamstrung for search and dashboard development due to a lack of real data.

You may be working around this by allowing development to occur on production, or you may have a development search head that peers with your prod indexers. Either way, you notice two impacts of this choice over time.

  1. The testing search head is gradually morphing into a second production instance.
  2. Performance on production is tanking as inefficient and duplicate queries run unrestrained.

Even if workload management has been implemented as a good stopgap, isolating testing and production environments from each other is the best practice for change control, security, and performance.

What’s needed?

Realistic data in an isolated testing environment.

How do you get the data?

There are two approaches:

Approach No. 1: Go with a Splunk-built tool like Eventgen.

The downsides to this approach?

  • The data will not be entirely realistic
  • Event generating scripts will incur a performance hit
  • Data samples must be designed separately for every data source

Approach No. 2: Forward a sample of the data from production to testing.

The big concern here is a CPU or network performance hit. Only a sample of the data is needed, so forwarding from a single indexer may be enough. Forwarding from one indexer in a 12-indexer cluster creates roughly a 1/12th sample of the data, which should have almost no performance impact: forwarded data is not re-parsed and should only consume a few MB on a gigabit-plus network.

Note: The default behavior of indexer forwarding is that if the destination is unavailable, the indexer pipeline will fill and block the queue, which is bad. Without changes, this could lead to the scenario above.

Since the destination indexer is in a testing environment, downtime there can be business as usual; outputs.conf on the production side is configured to drop events rather than block when the forwarding queue fills. An app containing an outputs.conf similar to the one below should be placed on a single prod indexer to forward data, or on multiple production indexers for a larger sample. It should be placed in an app within $SPLUNK_HOME/etc/apps and not pushed from the cluster master. The port number can be changed, but it must match on the sending and receiving indexers. Cooked data is sent to preserve host, source, and sourcetype metadata.


outputs.conf on the forwarding production indexer would look like the following (the group name testidx, the placeholder host, and port 9998 are examples; note that per the outputs.conf spec, setting dropEventsOnQueueFull to 0 or -1 causes the queue to block, so a short positive wait is used to drop events instead):

[tcpout]
defaultGroup = testidx
indexAndForward = true

[tcpout:testidx]
server = <test-indexer-host>:9998
sendCookedData = true
dropEventsOnQueueFull = 10s

inputs.conf on the receiving, non-production indexer would look like the following (the port must match the sending side; 9998 is an example):

[splunktcp://9998]
connection_host = ip
disabled = 0
queue = parsingQueue


The queue on the custom input is set to parsingQueue. Normally Splunk sends S2S (Splunk-to-Splunk) data straight to the index queue, but that would bypass any ingestion rules on the test indexer, and in this case you may want to test ingestion configurations. Explicitly setting queue to parsingQueue forces all events through the entire pipeline. A dedicated port should be used so this behavior is not applied to other S2S forwarded data.



To select which indexes are forwarded, use:

forwardedindex.<n>.whitelist = <regex>
forwardedindex.<n>.blacklist = <regex>
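For example, a minimal sketch that forwards only a hypothetical web index and drops everything else (the index name is an assumption; filters are numbered, and the highest-numbered matching rule wins):

```ini
# outputs.conf on the forwarding production indexer
[tcpout]
# Rule 0: blacklist every index by default
forwardedindex.0.blacklist = .*
# Rule 1: then whitelist the one index to sample
forwardedindex.1.whitelist = web
```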

If the test indexers are clustered, specify multiple output destinations under server = in the [tcpout:<name>] stanza.
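A sketch for a clustered test environment (the group name, hostnames, and port are assumptions); with multiple servers listed, the forwarding indexer load-balances across them:

```ini
# outputs.conf on the forwarding production indexer
[tcpout:testidx]
server = test-idx1.example.com:9998, test-idx2.example.com:9998
sendCookedData = true
```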


Cooked or uncooked data?

 When forwarding data, this is a good question to ask.


Per Splunk docs (outputs.conf spec):

sendCookedData = <boolean>
* Whether or not to send processed or unprocessed data to the receiving server.
* If set to "true", events are cooked (have been processed by Splunk software).
* If set to "false", events are raw and untouched prior to sending.
* Set to "false" if you are sending events to a third-party system.
* Default: true


Two points to note with this:
  1. The input type "splunktcp" does not handle uncooked data, so a plain tcp input is needed instead.
  2. Sending uncooked (raw) data strips all metadata, and new values are assigned at the tcp input. Losing the original host and sourcetype is not ideal.


The relevant setting for each method is shown below.

sendCookedData = false

Note: requires a [tcp://<port number>] stanza on the receiving side
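If raw forwarding is chosen anyway, the receiving side needs that plain tcp input in place of splunktcp — a minimal sketch (port 9999 is an assumption):

```ini
# inputs.conf on the receiving test indexer
[tcp://9999]
connection_host = ip
disabled = 0
```

Because the metadata was stripped in transit, host, source, and sourcetype are re-assigned at this input.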


sendCookedData = true

Decide for yourself which you prefer. Since the sourcetype is retained with cooked data, search head apps and add-ons can use the data as is.
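Once events start arriving, a quick search from the test search head confirms whether the metadata survived the hop (the search below is just an illustration; with cooked forwarding, the original production hosts and sourcetypes should appear):

```
index=* | stats count by index, host, sourcetype
```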

As always, our security experts are here to help. If you're still stuck, drop us an email!

