Analyzing upgrade scenarios to determine best practice for upgrading LogLogic® Log Management Intelligence to minimize maintenance time


    Manoj Chaurasia





    TIBCO LogLogic® provides the industry's first enterprise-class, end-to-end log management solution. Using LogLogic® log management solutions, IT organizations can analyze and archive log and machine data for the purposes of compliance and legal protection, decision support for security remediation, and increased performance and availability of the overall infrastructure.


    Introduction

    This article documents the primary methods of upgrading a LogLogic® LMI High-Availability (HA) cluster and compares them with respect to how long each takes to complete and how much data loss occurs. Other considerations affecting the upgrade methodology are not the primary focus. As you will notice, most of the scenarios are essentially the same with respect to the elapsed time during which log loss occurs. Only two stand out as best with respect to how long the procedure takes to complete (referred to as maintenance time), and all of them are equal with respect to how much data is lost. There is no scenario in which log loss can be completely prevented, simply because log synchronization is not bi-directional, but this article presents the analysis supporting how to choose the best upgrade procedure. Further analysis is provided in the conclusion section below.

    Scenario list

    Here is the list of scenarios analyzed in this article:

    1. Upgrade simultaneously with disabling HA, keeping the same role for each node (recommended)
    2. Upgrade simultaneously with disabling HA but switching roles (recommended)
    3. Upgrade master first by disabling HA, keeping the same role for each node
    4. Upgrade vice master first by disabling HA, keeping the same role for each node
    5. Upgrade master first by disabling HA but switching roles
    6. Upgrade vice master first by disabling HA but switching roles

    Of the scenarios above, the LogLogic® LMI Configuration and Upgrade guide uses scenario 4 to describe the upgrade process for an HA deployment.

    For this article, it is important to understand log loss: when it can occur and how to quantify it. Log loss can occur at different points in time when using LogLogic® LMI. It can occur during log collection, for example when the cluster's virtual IP address is not bound to any network interface because the system is unavailable. It can also occur during the initial data migration phase in a High-Availability cluster. Despite the use of HA, log loss can occur because the initial data migration is not bi-directional; LMI will delete any data on the vice master that the master node does not also possess. We call these two points of log loss "log loss at collection" and "log loss at synchronization", respectively. Keep this in mind as you read the scenarios below, especially when we calculate the amount of log loss based on how much time elapses. The "total log loss" documented below includes log loss at collection and at synchronization.

    Note: This article cannot include the calculated amount of data lost as measured in bytes or messages because that depends on the characteristics of each environment. For the purposes of this article, we can only quantify log loss as measured by time. These are estimates: any given version upgrade can introduce additional tasks that cause the upgrade to take longer, so it is best to view these durations as relative comparisons between scenarios for a given LMI version rather than absolute values.
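    As a rough illustration, the time-based figures in this article can be translated into an approximate message count if you know your environment's average ingest rate. The short sketch below is a minimal example of that arithmetic; the 50-minute outage and the 20,000 messages-per-minute rate are hypothetical values, not measurements from any particular deployment.

```python
# Illustrative estimate of log loss expressed in messages rather than time.
# Both inputs are hypothetical; substitute values observed in your environment.

def estimate_log_loss(outage_minutes: float, avg_messages_per_minute: float) -> float:
    """Convert a time-based log-loss window into an approximate message count."""
    return outage_minutes * avg_messages_per_minute

if __name__ == "__main__":
    outage_minutes = 50        # total log-loss window from the scenario summaries below
    avg_rate = 20_000          # assumed average ingest rate (messages per minute)
    loss = estimate_log_loss(outage_minutes, avg_rate)
    print(f"Estimated log loss: {loss:,.0f} messages")
```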

    Conclusion

    As stated above, the analysis in this article focuses on log loss; however, any given organization's needs and preferences may dictate focusing on other considerations as well. For example, log loss may be important, but continuity of operations may be the top priority. In that situation, the final recommendation below would not suffice, because if an upgrade failure affects both systems, no node remains healthy enough to maintain operations until the affected node is operational again.

    Because log loss occurs in every scenario, the first question is: which scenario allows the smallest amount of log loss? By that criterion, they are all equal.

    The next criterion is which scenario allows the smallest amount of downtime. On that basis, scenarios 1 and 2 stand out because the systems are upgraded in parallel rather than serially.

    As for which of scenarios 1 and 2 is best, they are effectively equal, so the final choice depends on personal or organizational preference: whether you want the nodes to hold the same roles after the upgrade as they did before it (scenario 1) or different roles for some reason (scenario 2). See the tables below for the analysis of each scenario.
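    To see why the parallel scenarios win on maintenance time, a simple back-of-the-envelope comparison helps. The sketch below assumes a hypothetical per-node upgrade time of about 48 minutes plus a couple of minutes of overhead for disabling and re-enabling HA; the exact figures will differ in your environment, but the parallel-versus-serial relationship is what matters.

```python
# Back-of-the-envelope comparison of maintenance windows.
# All durations are hypothetical placeholders; replace them with values
# measured in your own environment. Only the relative comparison matters.

PER_NODE_UPGRADE_MIN = 48   # assumed time to upgrade one LMI node
HA_OVERHEAD_MIN = 2         # assumed time to disable and re-enable HA

# Scenarios 1 and 2: both nodes are upgraded at the same time.
parallel_window = PER_NODE_UPGRADE_MIN + HA_OVERHEAD_MIN

# Scenarios 3 through 6: one node is upgraded after the other.
serial_window = 2 * PER_NODE_UPGRADE_MIN + HA_OVERHEAD_MIN

print(f"Parallel upgrade (scenarios 1-2): about {parallel_window} minutes")
print(f"Serial upgrade (scenarios 3-6): about {serial_window} minutes")
```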

    ** Collection Status, as used below, reflects the status after the step has finished executing. It is therefore the status going into the next step of the procedure unless otherwise noted.

    Additional Scenario Details

    Scenario 1: Upgrade simultaneously with disabling HA, keeping the same role for each node

    [Screenshot: Scenario 1 step-by-step upgrade timeline]

    * denotes overlapping time with previous steps

    • Total log loss: About 50 minutes
    • Total maintenance time: About 50 minutes, not including the duration of LMI's initial data migration

    Scenario 2: Upgrade simultaneously with disabling HA but switching roles

    [Screenshot: Scenario 2 step-by-step upgrade timeline]

    * denotes overlapping time with previous steps

    • Total log loss: About 50 minutes
    • Total maintenance time: About 50 minutes, not including the duration of LMI's initial data migration

    The remaining scenarios simply document other upgrade procedures users have employed in the past and are listed here for comparison purposes.

    Scenario 3: Upgrade master first by disabling HA, keeping the same role for each node

    [Screenshot: Scenario 3 step-by-step upgrade timeline]
    • Total log loss: About 50 minutes
    • Total maintenance time: About 98 minutes, not including the duration of LMI's initial data migration

    Scenario 4: Upgrade vice master first by disabling HA, keeping the same role for each node

    [Screenshot: Scenario 4 step-by-step upgrade timeline]
    • Total log loss: About 50 minutes
    • Total maintenance time: About 98 minutes, not including the duration of LMI's initial data migration

    Scenario 5: Upgrade master first by disabling HA but switching roles

    [Screenshot: Scenario 5 step-by-step upgrade timeline]
    • Total log loss: About 50 minutes
    • Total maintenance time: About 98 minutes, not including the duration of LMI's initial data migration

    Scenario 6: Upgrade vice master first by disabling HA but switching roles

    [Screenshot: Scenario 6 step-by-step upgrade timeline]
    • Total log loss: About 50 minutes
    • Total maintenance time: About 98 minutes, not including the duration of LMI's initial data migration
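
    For quick reference, the figures above can be summarized as follows (neither figure includes the duration of LMI's initial data migration):

    • Scenarios 1 and 2 (parallel upgrade): about 50 minutes of log loss, about 50 minutes of maintenance time
    • Scenarios 3 through 6 (serial upgrade): about 50 minutes of log loss, about 98 minutes of maintenance time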
