
This is regarding parsing an incoming file in TIBCO BW. How can we parse an incoming file in BusinessWorks without knowing the delimiter? In the Data Format resource, the Column Separator field has to be specified when we use the Parse Data activity.



Thanks for the reply. I have an incoming file with user data that contains user details such as userid, username, userstatus, etc. The fields are separated by , or & or | or some other character. We do not know which separator the client is sending in the file. In this case, how can we parse the data in TIBCO BW 5.x/6.x? How can we achieve this with the Data Format resource without knowing the delimiter? Thanks again


Hi Sucharitha, I'm not 100% sure I follow, but in your situation this is what I would do:

  • Still use ParseData, as it is the best way to process "big files" of delimited records (unless you also have the option of using the Plug-in for Files)
  • In the Data Format resource, use \n as the column separator, so that ParseData gives you an array of "lines".
  • In the Mapper activity, apply the "tokenize" function to each record with the appropriate symbol, after working out which delimiter is used in that record
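Outside BW, the same idea can be sketched in a few lines of Python. This is only an illustration of the logic, not the attached app: the candidate list and the names `detect_delimiter` and `parse_records` are made up for the example, and the "contains" check assumes a delimiter character never appears inside a field value.

```python
# Hypothetical list of delimiters the client might send; extend as needed.
CANDIDATE_DELIMITERS = [",", "&", "|"]


def detect_delimiter(record, candidates=CANDIDATE_DELIMITERS):
    """Return the first candidate that occurs in the record, or None."""
    for delimiter in candidates:
        if delimiter in record:  # same idea as a "contains" check in a choose
            return delimiter
    return None


def parse_records(text):
    """Split the file into lines, then tokenize each line on its own delimiter."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        delimiter = detect_delimiter(line)
        # A line with no known delimiter is kept as a single field.
        rows.append(line.split(delimiter) if delimiter else [line])
    return rows


sample = "userid,username,userstatus\na1,a2,aa\na3&a4&a5\nb1|b2|b3\n"
rows = parse_records(sample)
for row in rows:
    print(row)
```

Because the delimiter is detected per record, a file that mixes separators from line to line is still split into the same three fields.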

To simplify the explanation, I'm attaching a dummy app that transforms the following file:

userid,username,userstatus
a1,a2,aa
a3&a4&a5
b1|b2|b3

into the following XML:

<tns1:elementList xmlns:tns1="http://www.example.org/NewXMLSchema"
    xmlns:dataformat="http://ns.tibco.com/bw/palette/dataformat/52e5ec81-2d99-4790-b7e4-a8246fcb6dcc">
  <tns1:element>
    <tns1:username>userid</tns1:username>
    <tns1:userid>username</tns1:userid>
    <tns1:userstatus>userstatus</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>a1</tns1:username>
    <tns1:userid>a2</tns1:userid>
    <tns1:userstatus>aa</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>a3</tns1:username>
    <tns1:userid>a4</tns1:userid>
    <tns1:userstatus>a5</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>b1</tns1:username>
    <tns1:userid>b2</tns1:userid>
    <tns1:userstatus>b3</tns1:userstatus>
  </tns1:element>
</tns1:elementList>

Hopefully this is similar to what you're looking to achieve! Let me know if I understood it wrong.

Regards,


Thank you. The client requirement is that we receive a couple of files in the attached format, but the separator/delimiter may vary. The customer could add an unknown separator to the file. The expectation is that we implement generic logic on our end to process/parse all the files without knowing the separator character. Thanks


I've read the file you attached and I think the solution provided is still valid; you just need to add "all possible delimiters" to the choose logic. No technology can magically detect the delimiter; it has to be extracted by some logic. One option is to detect it from the header line, if the field names are always the same. If not, the only way is to write some other logic to extract it. In my sample this is just a "contains" check, but it can be as complex as you need. Hope that helps!
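As a rough sketch of the header-line idea, again in Python rather than BW: if the field names are known in advance, whatever sits between them in the header is the delimiter. The function name `delimiter_from_header` and the field list are assumptions for the example, and it only works when the header uses exactly the expected names with the same separator between every pair.

```python
import re


def delimiter_from_header(header, field_names):
    """Return the text between the known field names, or None if no match."""
    if len(field_names) < 2:
        return None  # a single field gives no separator to observe
    first, *rest = field_names
    pattern = re.escape(first)
    for index, name in enumerate(rest):
        # Capture the separator once, then require the same one everywhere.
        separator = "(?P<sep>.+?)" if index == 0 else "(?P=sep)"
        pattern += separator + re.escape(name)
    match = re.fullmatch(pattern, header.strip())
    return match.group("sep") if match else None


# Hypothetical expected header fields, taken from the sample file.
FIELDS = ["userid", "username", "userstatus"]

print(delimiter_from_header("userid,username,userstatus", FIELDS))
print(delimiter_from_header("userid;username;userstatus", FIELDS))
```

This also covers a separator that was never put in the candidate list, provided the header line is present and well formed; otherwise a fallback such as the "contains" check is still needed.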
