  • BW6.X - BWCE - How to process large files in BusinessWorks and BusinessWorks Container Edition


    We are in the API age and everybody is talking about API-led integration, but in many companies files are still widely used to transfer large volumes of data.

    BusinessWorks has some rich native capabilities to access files (read, write, move…), parse / render different file formats (fixed format, delimited format, XML format…) and transform data.

    A common challenge with files is that they can be very large (hundreds of megabytes or even multiple gigabytes). In such cases it is not possible to hold the whole content of a file in memory, and the file has to be processed in blocks of a limited size.

    BusinessWorks can do this as well, and this blog article explains how.

    It is recommended to use the BusinessWorks Memory Saving Mode with this pattern.

    # Files of fixed and delimited formats

    For such files you can use the Parse palette, which allows reading and writing files in blocks of a limited size.

    The general approach for reading and parsing files by blocks is the following:

    . Use the Parse Data activity configured with Input Type = File and with a given number of Records defined in the Input tab

    . Include the Parse Data activity in a ‘Repeat’ group and configure the exit condition to be the ‘done’ flag available in the output of the Parse Data activity. This flag is set once the end of the file is reached.
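
    The two steps above can be sketched outside BusinessWorks as a plain loop. This is a minimal Python illustration, not the Parse Data implementation: `parse_block`, `read_in_blocks` and the comma delimiter are hypothetical names and choices, and the returned `done` value only mimics the role of the ‘done’ flag described above.

```python
import io
from typing import Iterator, List, TextIO, Tuple


def parse_block(stream: TextIO, num_records: int,
                delimiter: str = ",") -> Tuple[List[List[str]], bool]:
    """Read up to num_records lines from the current stream position.

    Returns the parsed rows and a 'done' flag that is True once the end
    of the file has been reached (hypothetical analogue of the flag in
    the Parse Data output).
    """
    rows: List[List[str]] = []
    for _ in range(num_records):
        line = stream.readline()
        if line == "":  # readline() returns "" only at end of file
            return rows, True
        rows.append(line.rstrip("\n").split(delimiter))
    return rows, False


def read_in_blocks(stream: TextIO, num_records: int) -> Iterator[List[List[str]]]:
    """Repeat-until-done loop around parse_block."""
    while True:
        rows, done = parse_block(stream, num_records)
        if rows:
            yield rows
        if done:
            break


if __name__ == "__main__":
    sample = io.StringIO("1,a,b\n2,c,d\n3,e,f\n")
    for block in read_in_blocks(sample, num_records=2):
        print(block)
```

    Only one block of records is held in memory at a time, which is the point of the pattern.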

    The general approach for formatting and writing files by blocks is the following:

    . Use the Render Data activity with a list of records in Input

    . Use the Write File activity configured with the ‘Append’ option to write to the output file

    . Include those activities in a ‘Repeat’ group and exit once all input records have been processed
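
    The write side follows the same idea. The sketch below is a hedged Python analogue of rendering a block and appending it to the output file; `render_block` and `write_blocks` are hypothetical helpers and the comma-delimited output format is an assumption made for the example.

```python
import os
import tempfile
from typing import Iterable, List


def render_block(records: List[List[str]], delimiter: str = ",") -> str:
    """Format one block of records as delimited text, one record per line."""
    return "".join(delimiter.join(fields) + "\n" for fields in records)


def write_blocks(blocks: Iterable[List[List[str]]], path: str) -> None:
    """Append each rendered block to the output file, one block at a time."""
    # Start from an empty file so that a rerun does not append to old output.
    open(path, "w", encoding="utf-8").close()
    for block in blocks:
        # Mode "a" plays the role of the 'Append' option of Write File.
        with open(path, "a", encoding="utf-8") as out:
            out.write(render_block(block))


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "out.csv")
    write_blocks([[["1", "A"], ["2", "B"]], [["3", "C"]]], target)
    with open(target, encoding="utf-8") as result:
        print(result.read())
```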

    The sample process below includes reading and parsing an input file fragment, doing a simple transformation and then rendering and writing the result in the output file.

    1*ExE_TY7rcR388ocYhC1FRw.png

    This process also shows an error management approach: if a line of the input file cannot be parsed, it is logged to an error file and processing continues.

    Sample process creation

    . First you need to create an XML schema defining the format of the input file

    1*1w8ttyqcomCBGgVRe-WiBA.png

    . Then you need to create a DataFormat resource for the input file

    1*BAJOO4SVKX0oo13mZtDP2g.png

    . Create an XML schema defining the format of the output file

    1*VBvK0hXSHw8k9K3CVoYVcw.png

    . Create a DataFormat resource for the output file

    1*jRYGiB4oELJBYPoUQa_CQQ.png

    . Add a Parse Data activity to the process

    No specific option is needed for block processing at this level, except the ‘Continue On Error’ option if you want to manage errors with the suggested approach:

    1*4uxul-B8gaGG1Nil5edqQQ.png

    In the Input tab make sure to set the Number of Records to be read at each iteration (a good practice is to manage this using a property):

    1*XE43adTABS21EtcSPfUY_g.png

    Important: If you don’t specify the ‘Start Record’, the activity automatically continues reading the file from where it stopped at the previous iteration. This approach is recommended for optimal performance (instead of specifying the ‘Start Record’ number in the Input of the activity).
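
    The following sketch illustrates why resuming from the previous position generally scales better than locating a start record at each iteration. It assumes that finding a start record means scanning the file from the beginning, which is an assumption for the sake of the example and not a description of how the Parse Data activity is implemented. Both function names are hypothetical.

```python
import io


def lines_scanned_continuing(text: str, num_records: int) -> int:
    """Keep one stream open and resume where the previous block stopped."""
    stream = io.StringIO(text)
    scanned = 0
    done = False
    while not done:
        for _ in range(num_records):
            if stream.readline() == "":
                done = True
                break
            scanned += 1
    return scanned


def lines_scanned_with_start_record(text: str, num_records: int) -> int:
    """Reopen the data at each iteration and skip to an explicit start record."""
    total = 0
    start_record = 1
    while True:
        stream = io.StringIO(text)  # assumed: each iteration starts from the top
        for _ in range(start_record - 1):
            if stream.readline() == "":
                break
            total += 1  # lines skipped only to reach the start record
        read = 0
        for _ in range(num_records):
            if stream.readline() == "":
                break
            read += 1
            total += 1
        if read < num_records:
            return total
        start_record += num_records


if __name__ == "__main__":
    data = "".join(f"{i},x\n" for i in range(1, 7))
    print(lines_scanned_continuing(data, 2))
    print(lines_scanned_with_start_record(data, 2))
```

    For six records read in blocks of two, the first function touches 6 lines and the second 18. Under this assumption the gap grows quadratically with the file size.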

    . Include the Parse Data activity in a Repeat group

    The Repeat group has to be configured with its exit condition set to the ‘done’ flag available in the output of the Parse Data activity.

    1*qZPBe3_R3MNDPMHksqDh-Q.png

    . Add a Render Data activity to the process

    No specific option is needed for block processing at this level:

    1*hyOWz4RuI8p9O7RlVFdf1A.png

    In the Input tab of the activity, map the list of records to be formatted and apply the needed transformations (in the example process the input is the output of the previous Parse Data activity):

    1*UNiqT9FHJzm6Cuy6O2bpvQ.png

    . Add a Write File activity to the process

    The Write File activity has to be configured with the ‘Append’ option set.

    1*TUESfk3SoBt8-ddkQhhf2w.png

    The Input tab of the Write File activity has to be configured with the output of the Render Data activity:

    1*dtV3iJa-PqrACtSr8DwRFw.png
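
    Taken together, the sample process corresponds roughly to the loop below. This is a Python sketch under stated assumptions, not an export of the BusinessWorks process: the field names follow the sample schemas, while the `transform` mapping (joining Field1 and Field2 into OutputField) is hypothetical, since the article does not detail the transformation.

```python
import io
from typing import List, TextIO


def transform(row: List[str]) -> List[str]:
    """Map (LineNumber, Field1, Field2) to (LineNumber, OutputField)."""
    line_number, field1, field2 = row
    return [line_number, field1 + "-" + field2]  # hypothetical mapping


def process_file(src: TextIO, dst: TextIO, num_records: int) -> None:
    """Parse, transform, render and append, one block per iteration."""
    done = False
    while not done:
        block: List[List[str]] = []
        for _ in range(num_records):
            line = src.readline()
            if line == "":  # end of file: plays the role of the 'done' flag
                done = True
                break
            block.append(line.rstrip("\n").split(","))
        rendered = "".join(",".join(transform(row)) + "\n" for row in block)
        dst.write(rendered)  # dst is assumed to be opened in append mode


if __name__ == "__main__":
    source = io.StringIO("1,a,b\n2,c,d\n3,e,f\n")
    target = io.StringIO()
    process_file(source, target, num_records=2)
    print(target.getvalue())
```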

    Error management

    Parsing errors can be managed with the following approach: if a line of the input file cannot be parsed, it is logged to an error file and processing continues.

    This can be done with the following:

    . In the Parse Data activity make sure the following option is checked:

    1*4uxul-B8gaGG1Nil5edqQQ.png

    . If the ‘Error Rows’ list contains at least one element, make a transition to a Write File activity that writes the problematic input lines to an error file

    Configuration of the Transition:

    1*jjqeV-5NiyJK6sGc9NP--w.png

    Configuration of the Write File Input:

    1*abyt_H0DCNgNxzKIln8Ptw.png
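
    The error handling above can be sketched in the same style. In this hypothetical Python analogue a line is considered unparseable when it does not have the three fields of the sample input schema; the real validation rules of the Parse Data activity depend on the DataFormat resource and are not reproduced here.

```python
import io
from typing import List, TextIO, Tuple

EXPECTED_FIELDS = 3  # LineNumber, Field1, Field2 in the sample input schema


def parse_block_continue_on_error(
        lines: List[str], delimiter: str = ",") -> Tuple[List[List[str]], List[str]]:
    """Parse a block, collecting bad lines instead of failing on the first one."""
    rows: List[List[str]] = []
    error_rows: List[str] = []
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        if len(fields) == EXPECTED_FIELDS:
            rows.append(fields)
        else:
            error_rows.append(line)  # keep the raw input line for the error file
    return rows, error_rows


def process_block(lines: List[str], error_sink: TextIO) -> List[List[str]]:
    """Return the valid rows and log the problematic lines to error_sink."""
    rows, error_rows = parse_block_continue_on_error(lines)
    if len(error_rows) > 0:  # analogue of the transition condition
        error_sink.writelines(error_rows)
    return rows


if __name__ == "__main__":
    errors = io.StringIO()
    good = process_block(["1,a,b\n", "broken line\n", "3,e,f\n"], errors)
    print(good)
    print(errors.getvalue())
```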

    Reference elements

    BusinessWorks Parse Data documentation:

    https://docs.tibco.com/pub/activematrix_businessworks/6.10.0/doc/html/Default.htm#binding-palette/parse-data.htm

    It is recommended to use the BusinessWorks Memory Saving Mode with this pattern.

    Details are available in the following article:

    XML schema for the sample Input file

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://exemple.com/Data"
               xmlns:tns="http://exemple.com/Data" elementFormDefault="qualified">
      <xs:element name="Data" type="tns:DataType"/>
      <xs:complexType name="DataType">
        <xs:sequence>
          <xs:element name="LineNumber" type="xs:string"/>
          <xs:element name="Field1" type="xs:string"/>
          <xs:element name="Field2" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:schema>

    XML schema for the sample Output file

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" targetNamespace="http://exemple.com/Output"
               xmlns:tns="http://exemple.com/Output" elementFormDefault="qualified">
      <xs:element name="Output" type="tns:OutputType"/>
      <xs:complexType name="OutputType">
        <xs:sequence>
          <xs:element name="LineNumber" type="xs:string"/>
          <xs:element name="OutputField" type="xs:string"/>
        </xs:sequence>
      </xs:complexType>
    </xs:schema>

    # Files of IBM CopyBook format

    The ‘Parse Copy Book’ activity of the Data Conversion Plugin can process files in a similar way to the standard Parse palette.

    The options highlighted below have to be used:

    . Multiple Records

    . Input Type = File

    1*r8jXuuELBxj7YCJgKyTxZw.png

    Specify the number of Records to be read at each iteration:

    1*CKdNwnMACb0AEGr6GlcsBA.png
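
    As a rough illustration of the same block pattern applied to copybook-style data, the Python sketch below reads fixed-length records a block at a time. Everything specific in it is an assumption made for the example: the 10-byte record layout (4-character id, 6-character name), the EBCDIC code page `cp037` and the function name. It does not reflect the options or the behaviour of the Parse Copy Book activity.

```python
import io
from typing import BinaryIO, Dict, Iterator, List

RECORD_LENGTH = 10  # hypothetical layout: 4-character id + 6-character name


def read_fixed_blocks(stream: BinaryIO, num_records: int,
                      record_length: int = RECORD_LENGTH,
                      encoding: str = "cp037") -> Iterator[List[Dict[str, str]]]:
    """Yield blocks of at most num_records fixed-length records."""
    while True:
        chunk = stream.read(record_length * num_records)
        if not chunk:  # empty read means end of file
            break
        records = []
        for offset in range(0, len(chunk), record_length):
            text = chunk[offset:offset + record_length].decode(encoding)
            records.append({"id": text[:4], "name": text[4:].rstrip()})
        yield records


if __name__ == "__main__":
    raw = "0001ALICE 0002BOB   0003CAROL ".encode("cp037")
    for block in read_fixed_blocks(io.BytesIO(raw), num_records=2):
        print(block)
```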

    # Files of XML format

    The ‘Large XML’ Plugin provides activities to parse and render XML files without loading the whole file content into BusinessWorks memory. For example, the ‘Get Fragment’ activity can be configured to read a given number of elements from the source XML file (say 100 ‘order’ elements) and can be included in a loop that repeats until the end of the file.
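
    The idea of reading a fixed number of elements per iteration can be illustrated with a streaming parser. The sketch below uses `xml.etree.ElementTree.iterparse` from the Python standard library; it is an analogue of the approach, not the ‘Get Fragment’ activity itself, and the `iter_fragments` helper and the ‘orders’/‘order’/‘id’ element names are hypothetical.

```python
import io
import xml.etree.ElementTree as ET
from typing import BinaryIO, Dict, Iterator, List, Optional


def iter_fragments(source: BinaryIO, tag: str,
                   fragment_size: int) -> Iterator[List[Dict[str, Optional[str]]]]:
    """Yield lists of at most fragment_size elements named tag."""
    fragment: List[Dict[str, Optional[str]]] = []
    # "end" events fire once an element and all its children have been read.
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == tag:
            fragment.append({child.tag: child.text for child in elem})
            elem.clear()  # release the children once they have been copied
            if len(fragment) == fragment_size:
                yield fragment
                fragment = []
    if fragment:  # last, possibly shorter, fragment
        yield fragment


if __name__ == "__main__":
    xml_data = (b"<orders><order><id>1</id></order>"
                b"<order><id>2</id></order><order><id>3</id></order></orders>")
    for orders in iter_fragments(io.BytesIO(xml_data), "order", fragment_size=2):
        print(orders)
```

    Each fragment can then be transformed and written out before the next one is read, in the same way as the blocks of records in the previous sections.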

    Additional elements

    Since files are often created from database extracts, and databases are often loaded with data coming from files, you may also want to look at the following article, which explains how to manage SQL queries handling large data volumes in BusinessWorks and BusinessWorks Container Edition.

    https://community.tibco.com/articles/tibco-activematrix-businessworks/bw6x-bwce-how-to-manage-sql-queries-handling-large-data-volumes-in-businessworks-and-businessworks-container-edition-r3414/

    You can refer to the attached project and sample data file.

     

    LargeFileProcessingDemo.zip MyFile.csv

