Hello!
I'm trying to pre-filter and forward a structured .csv file from a Universal Forwarder (UF) to a Splunk Enterprise server. The file is CP1251 encoded, not UTF-8.
I've created a new sourcetype and added it to the props.conf file on the UF:
[lg_csv]
CHARSET = CP1251
FIELD_NAMES = time,servername,pid,tid,index4,index5,index6,k_f,address,identifier,description
INDEXED_EXTRACTIONS = csv
KV_MODE = none
NO_BINARY_CHECK = true
SHOULD_LINEMERGE = false
category = Structured
disabled = false
The inputs.conf file on the UF:
[monitor://c:\cs_L]
disabled = false
index = log_test
sourcetype = lg_csv
When I put my .csv log file into c:\cs_L\, the UF indexes it and forwards it to the Splunk server, but every event is duplicated. If I then append data to this file and save it, the added events are forwarded to the Splunk server and are NOT duplicated.
I tried modifying props.conf and found the following:
1. If there is no CHARSET definition in props.conf on the UF, I get every event only once, with no duplication. But the event fields are incorrectly encoded (like "\xD2\xE5.......").
2. If props.conf on the UF contains only the CHARSET = CP1251 and NO_BINARY_CHECK = true definitions, I get every event only once, with no duplication. But the events are not parsed on the UF and cannot be pre-filtered with TRANSFORMS-null = setnull and so on.
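To illustrate the wrong encoding in point 1, here is a minimal Python sketch (not Splunk code, just a check run outside Splunk) that takes the two bytes visible in the garbled output, "\xD2\xE5", and decodes them both ways. The variable names are my own and only for illustration.

```python
# The first two bytes as they appear in the mis-decoded events ("\xD2\xE5...").
raw = b"\xd2\xe5"

# Decoding with the file's real encoding (CP1251) yields Cyrillic text.
as_cp1251 = raw.decode("cp1251")

# Strict UTF-8 decoding fails: 0xD2 starts a two-byte sequence,
# but 0xE5 is not a valid continuation byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(as_cp1251, utf8_ok)
```

So these bytes are readable as CP1251 but are not valid UTF-8, which matches the escaped "\xD2\xE5" output when no CHARSET is set.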
After running the UF in normal and debug modes and analyzing splunkd.log, it seems to me that the UF first indexes this file as if it were UTF-8 encoded and computes a CRC, then opens the file as CP1251 encoded, computes a different CRC, and indexes it once again.
Does anyone have an idea how to solve this problem?
The UF version is 6.5.2 and the Splunk version is 6.5.0. The UF is installed on Windows 2008 R2 Enterprise 64-bit.