Sucharitha Nagaram Posted December 17, 2023
This is regarding parsing an incoming file in TIBCO BW. How can we parse an incoming file in BusinessWorks without knowing its delimiter? The Data Format resource has a Column Separator field that must be set when we use the Parse Data activity.
Alexandre Vazquez Posted December 18, 2023
Hi Sucharitha, could you elaborate a little more on what you want to achieve? Could you share a sample of the file you want to handle? As you noted, Data Format lets you define the separator used to split the file, so it would be interesting to see why this is not a valid option in your case. Thanks!
Sucharitha Nagaram Posted December 19, 2023
Thanks for the reply. I have an incoming file with user data, containing user details such as userid, username, userstatus, etc. The user fields are separated by , or & or | or some other character. We do not know which separator the client is sending in the file. In this case, how can we parse the data in TIBCO BW 5.x/6.x? How can we achieve this with the Data Format resource without knowing the delimiter? Thanks again
Alexandre Vazquez Posted December 19, 2023
Hi Sucharitha, I'm not 100% sure I follow, but in your situation this is what I would do:
1. Still use ParseData, as it is the best way to process "big files" of delimited records (unless you also have the option of using the Plug-in for Files).
2. In the Data Format resource, use \n as the column separator, so that after ParseData you have an array of "lines".
3. In the Mapper activity, use the "tokenize" function with the proper symbol for each record, after discovering which one is used in that record.
To simplify the explanation, I'm attaching a dummy app that is able to transform the following file:
userid,username,userstatus
a1,a2,aa
a3&a4&a5
b1|b2|b3
into the following XML:
<tns1:elementList xmlns:tns1="http://www.example.org/NewXMLSchema" xmlns:dataformat="http://ns.tibco.com/bw/palette/dataformat/52e5ec81-2d99-4790-b7e4-a8246fcb6dcc">
  <tns1:element>
    <tns1:username>userid</tns1:username>
    <tns1:userid>username</tns1:userid>
    <tns1:userstatus>userstatus</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>a1</tns1:username>
    <tns1:userid>a2</tns1:userid>
    <tns1:userstatus>aa</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>a3</tns1:username>
    <tns1:userid>a4</tns1:userid>
    <tns1:userstatus>a5</tns1:userstatus>
  </tns1:element>
  <tns1:element>
    <tns1:username>b1</tns1:username>
    <tns1:userid>b2</tns1:userid>
    <tns1:userstatus>b3</tns1:userstatus>
  </tns1:element>
</tns1:elementList>
Hopefully this is similar to what you're looking to achieve! Let me know if I misunderstood.
Regards,
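For readers who want to see the logic of the steps above outside BusinessWorks, here is a minimal Python sketch. It is only an illustration, not the attached BW project: the function names `detect_delimiter` and `parse_records` are hypothetical, and the candidate list is an assumption taken from the sample file in this thread. It splits the input into lines, finds the first candidate separator contained in each record, and tokenizes the record on it.

```python
# Candidate separators are an assumption taken from the thread's sample file.
CANDIDATE_DELIMITERS = [",", "&", "|"]


def detect_delimiter(record, candidates=CANDIDATE_DELIMITERS):
    """Return the first candidate contained in the record, or None."""
    for delimiter in candidates:
        if delimiter in record:
            return delimiter
    return None


def parse_records(text, candidates=CANDIDATE_DELIMITERS):
    """Split text into lines, then tokenize each line on its own delimiter."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        delimiter = detect_delimiter(line, candidates)
        # A line with no known delimiter is kept as a single field.
        rows.append(line.split(delimiter) if delimiter else [line])
    return rows


sample = "userid,username,userstatus\na1,a2,aa\na3&a4&a5\nb1|b2|b3\n"
rows = parse_records(sample)
for row in rows:
    print(row)
```

Note that this simple "contains" check picks the first matching candidate, so a record whose field values include another candidate character would be split on the wrong symbol; the order of the candidate list matters.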
Alexandre Vazquez Posted December 19, 2023
Also attaching a dummy input file for your reference:
Sucharitha Nagaram Posted December 24, 2023
Thank you for detailing the logic and sharing the project. It really helps. But this project parses the data with known separators such as , | and &. Our requirement is to parse the incoming file without knowing the separators in advance. We do not know which separator the client is adding to the file. Thanks again
Alexandre Vazquez Posted December 24, 2023
Could you share a sample input file? Thanks in advance
Sucharitha Nagaram Posted December 26, 2023
Thank you. The client requirement is that we receive a couple of files in the attached format, but the separator/delimiter may vary. The customer could add an unknown separator to the file. The expectation is that we implement generic logic on our end to process/parse all the files without knowing the separator character. Thanks
Alexandre Vazquez Posted December 28, 2023
I have read the file you attached, and I think the solution I provided is still valid: you just need to add all possible delimiters to the choose logic. No technology can detect the delimiter magically; it has to be extracted by some logic. One option is to detect it from the header line, if the field names are always the same. If not, the only way is to have some other logic to extract it. In my sample this is just a "contains" check, but it can be as complex as you need. Hope that helps!
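The header-line idea can be sketched as follows. This is a minimal Python illustration under stated assumptions, not part of the attached project: it assumes the header always contains the same field names in the same order, and the function name `delimiter_from_header` is hypothetical. The delimiter is taken to be whatever text sits between the first two known field names, and the guess is accepted only if it reproduces the full expected header.

```python
def delimiter_from_header(header, expected_fields):
    """Infer the delimiter as the text between the first two known field names."""
    if len(expected_fields) < 2:
        return None
    first, second = expected_fields[0], expected_fields[1]
    if not header.startswith(first):
        return None
    position = header.find(second, len(first))
    if position <= len(first):
        return None  # second name missing, or no characters between the names
    delimiter = header[len(first):position]
    # Accept the guess only if it reproduces the whole expected header.
    if header.split(delimiter) == list(expected_fields):
        return delimiter
    return None


FIELDS = ["userid", "username", "userstatus"]  # names taken from the thread's sample
detected = delimiter_from_header("userid;username;userstatus", FIELDS)
print(repr(detected))
```

With this approach the separator does not need to be in a predefined list, and multi-character separators work too, but it only helps when the file has a header whose field names are known in advance.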
Sucharitha Nagaram Posted December 28, 2023
Okay. Thanks for your reply