tag:status.onomondo.com,2005:/historyOnomondo Status - Incident History2024-03-29T09:56:34+01:00Onomondotag:status.onomondo.com,2005:Incident/203189292024-04-16T10:00:00+02:002024-03-21T13:01:10+01:00DRA and DEA upgrade<p><strong>THIS IS A SCHEDULED EVENT Apr <var data-var='date'>16</var>, <var data-var='time'>10:00</var> - <var data-var='time'>13:00</var> CEST</strong></p><p><small>Mar <var data-var='date'>21</var>, <var data-var='time'>13:01</var> CET</small><br><strong>Scheduled</strong> - On April 16th, our team will upgrade our DRA/DEA components. The upgrade will be performed in a rolling fashion: one device is upgraded at a time, causing that instance to fail over its current traffic to another instance.</p>tag:status.onomondo.com,2005:Incident/203893812024-03-29T09:56:34+01:002024-03-29T09:56:34+01:00Slow internal message queue<p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>09:56</var> CET</small><br><strong>Monitoring</strong> - We are done migrating all services, and all systems look good. We are still catching up on signalling logs, but otherwise we are up to date.<br /><br />Some connector messages and some webhooks may have been lost in the migration.<br />We will keep monitoring, and will share more updates as they become available. Thank you for your continued patience.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>08:56</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by the slow consumption of our message queues.<br /><br />We are in the process of moving the last core services over to the new message broker. Devices might still be kicked from the network until this has been completed.<br /><br />We will share more updates as they become available. 
Thank you for your continued patience.<br />Next update in 1 hour.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>06:21</var> CET</small><br><strong>Identified</strong> - Platform and connectors are still being affected by the slow consumption of our message queues.<br /><br />The mitigations we put in place earlier are not enough for us to keep up. We are forced to do a broker upgrade which we are commencing work on now. This will lead to devices being kicked from the network, as we are migrating core features to the new message broker.<br /><br />The mitigations are in the process of being deployed. We will share more updates as they become available. Thank you for your continued patience.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>04:49</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by the slow consumption of our message queues. The mitigations are in the process of being deployed. We are still catching up, and are looking at several hours before we are through the backlog of messages. No device data loss has occurred. We will keep monitoring and will share more updates as they become available. Thank you for your continued patience.<br />The next update will be in 2 hours.</p><p><small>Mar <var data-var='date'>29</var>, <var data-var='time'>02:27</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by slow consumption of our message queues. The mitigations we have put in place have been confirmed to be working. We are still catching up, and are looking at several hours before we are through the backlog of messages. We will keep monitoring, and will share more updates as they become available. 
Thank you for your continued patience.<br />Next update will be in 2 hours.</p><p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>23:24</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by slow consumption of our message queues. The mitigations we have put in place have been confirmed to be working. We are still catching up, and are looking at several hours before we are through the backlog of messages. We will keep monitoring, and will share more updates as they become available. Thank you for your continued patience.<br />Next update will be in 2 hours.</p><p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>21:19</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by slow consumption of our message queues. The mitigations we have put in place have been confirmed to be working. We are still catching up, and are looking at several hours before we are through the backlog of messages. We will keep monitoring, and will share more updates as they become available. Thank you for your continued patience.<br />Next update will be in 2 hours.</p><p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>19:24</var> CET</small><br><strong>Update</strong> - Platform and connectors are still being affected by slow consumption of our message queues. The mitigations we have put in place seem to be working, but we are still looking at several hours before we are caught up. We will keep monitoring, and will share more updates as they become available. Thank you for your continued patience.</p><p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>18:18</var> CET</small><br><strong>Monitoring</strong> - Platform and connectors are still being affected by slow consumption of our message queues. We have implemented some mitigation features, and are monitoring to gauge the impact on our systems and will share more updates as they become available. 
Thank you for your continued patience.</p><p><small>Mar <var data-var='date'>28</var>, <var data-var='time'>17:12</var> CET</small><br><strong>Identified</strong> - We are facing issues with our system message queue. We are working hard to address the problem.<br />Impact:<br />- Webhooks might be delayed<br />- Messages for platform connectors might be delayed<br />- Signaling logs might be delayed</p>tag:status.onomondo.com,2005:Incident/200885492024-03-27T10:45:02+01:002024-03-27T10:45:02+01:00Upcoming API Change Alert: Network Whitelist API and Webhooks Requirements Update<p><small>Mar <var data-var='date'>27</var>, <var data-var='time'>10:45</var> CET</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Mar <var data-var='date'>27</var>, <var data-var='time'>10:15</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Feb <var data-var='date'>27</var>, <var data-var='time'>15:59</var> CET</small><br><strong>Scheduled</strong> - We are making important changes to the Network Whitelist API and Webhooks requirements that will affect how you generate API calls and IP addresses required for Webhooks. <br /><br />Starting March 27th, it will be mandatory to specify both a Mobile Country Code (MCC) and a Mobile Network Code (MNC) for all API calls. This update is designed to enhance security and prevent unintentional charges by ensuring connections are made only to specified networks. To improve our service delivery for Webhooks, we're introducing a new IP address for Webhooks: <b>52.58.186.11/32</b>.<br /><br /><b>API Action Required:</b> Review and update your API call configurations to comply with the new requirements. 
Detailed guidance and updated API documentation are available <a href="https://docs.onomondo.com/#ff204a62-1adc-4f02-9f08-e969803e6d2e">in the link here.</a><br /><br /><b>Webhook Action Required:</b> Update your firewall settings to include this new IP address alongside the current ones, ensuring continuous reception of Webhooks without interruption. More documentation can be <a href="https://docs.onomondo.com">found here.</a></p>tag:status.onomondo.com,2005:Incident/202515262024-03-15T10:01:11+01:002024-03-15T10:01:11+01:00Signalling issues<p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>10:01</var> CET</small><br><strong>Resolved</strong> - This issue has now been resolved. All systems are fully operational and functioning as expected. We appreciate your understanding.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>08:59</var> CET</small><br><strong>Update</strong> - All systems are fully operational and running as expected. We will continue to closely monitor the situation.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>06:53</var> CET</small><br><strong>Monitoring</strong> - We have received confirmation that our IPX provider's degraded performance issues have stabilised. All our systems also look stable.<br /><br />We will continue to monitor the situation closely.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>06:29</var> CET</small><br><strong>Update</strong> - We have turned off all mitigation measures that were set in motion due to the upstream degradation, i.e. 
we have opened up for all SIMs.<br /><br />We are continuing to monitor the situation closely.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>03:25</var> CET</small><br><strong>Update</strong> - Our IPX provider has detected degradation of their signalling services, which is affecting our environment.<br /><br />Our mitigation was to lower the load on our systems, and we have now gradually started opening up for more SIMs again.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>01:53</var> CET</small><br><strong>Update</strong> - We are continuing to investigate this issue.</p><p><small>Mar <var data-var='date'>15</var>, <var data-var='time'>01:52</var> CET</small><br><strong>Investigating</strong> - We are currently experiencing signalling instabilities, and are making a plan to start blocking devices temporarily to mitigate the load on our signalling services.</p>tag:status.onomondo.com,2005:Incident/200811772024-03-12T11:00:56+01:002024-03-12T11:00:56+01:00Database maintenance<p><small>Mar <var data-var='date'>12</var>, <var data-var='time'>11:00</var> CET</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Mar <var data-var='date'>12</var>, <var data-var='time'>09:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Mar <var data-var='date'>11</var>, <var data-var='time'>16:54</var> CET</small><br><strong>Update</strong> - Due to unforeseen delays in storage optimisation after a routine disk increase, we will have to postpone the maintenance until tomorrow morning.</p><p><small>Feb <var data-var='date'>26</var>, <var data-var='time'>19:46</var> CET</small><br><strong>Scheduled</strong> - On March 11th, our team will be conducting a routine database version upgrade. 
During the maintenance process, certain services and features may become unavailable for a short period of time.<br /><br />Services affected:<br /><br />- API<br />- Our platform: *app.onomondo.com*<br />- Connectors<br />- Webhooks: delayed processing<br />- New device authentications<br /><br />The non-connector traffic from devices that are already online should not be impacted.</p>tag:status.onomondo.com,2005:Incident/199737572024-03-05T15:00:56+01:002024-03-05T15:00:56+01:00TCP timeout time increased<p><small>Mar <var data-var='date'> 5</var>, <var data-var='time'>15:00</var> CET</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Mar <var data-var='date'> 5</var>, <var data-var='time'>11:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Feb <var data-var='date'>13</var>, <var data-var='time'>11:05</var> CET</small><br><strong>Scheduled</strong> - On March 5th, our team will be increasing TCP timeouts for parts of the environment. During the maintenance the packet gateways will be taken out of rotation one by one. Some lingering connections might require reconnection when a packet gateway is taken out of rotation.</p>tag:status.onomondo.com,2005:Incident/200778692024-02-26T13:20:27+01:002024-02-26T13:20:27+01:00February 26, 2024 Maintenance Update: Reverting Recent Changes<p><small>Feb <var data-var='date'>26</var>, <var data-var='time'>13:20</var> CET</small><br><strong>Resolved</strong> - Today's changes have been rolled back and we are fully operational, and all network and platform components are working as intended. The maintenance scheduled for Feb 26, 2024 17:30-19:00 CET has been postponed. 
We will notify you once this maintenance update has been scheduled again.</p><p><small>Feb <var data-var='date'>26</var>, <var data-var='time'>12:56</var> CET</small><br><strong>Update</strong> - We are still working to resolve unexpected side effects from today's maintenance. Platform components may still be experiencing performance issues. All network components are fully operational. We will continue to provide updates as we roll back these changes.</p><p><small>Feb <var data-var='date'>26</var>, <var data-var='time'>12:22</var> CET</small><br><strong>Identified</strong> - We made a change in preparation for today's maintenance, which triggered unexpected side effects. This may result in some performance issues with Network and Platform components. We're currently rolling back the change and working carefully to resolve the issues.</p>tag:status.onomondo.com,2005:Incident/200519792024-02-23T09:25:29+01:002024-02-23T09:30:12+01:00Connectivity issues for certain carriers in the United States<p><small>Feb <var data-var='date'>23</var>, <var data-var='time'>09:25</var> CET</small><br><strong>Resolved</strong> - This incident has been resolved for the affected carriers in the United States. If you are still experiencing issues related to connectivity in the US, please contact us.</p><p><small>Feb <var data-var='date'>23</var>, <var data-var='time'>01:23</var> CET</small><br><strong>Update</strong> - The issue is persisting and we are continuing to monitor the situation.</p><p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>22:31</var> CET</small><br><strong>Update</strong> - We are continuing to monitor for any further issues.</p><p><small>Feb <var data-var='date'>22</var>, <var data-var='time'>22:30</var> CET</small><br><strong>Monitoring</strong> - The United States is experiencing a nationwide incident, affecting local carriers such as T-Mobile and AT&T. 
During this time, devices may experience connectivity issues.<br />We are in regular communication with our partners and keeping a close eye on the situation. If you are experiencing any unexpected problems during this time, please contact us.</p>tag:status.onomondo.com,2005:Incident/198843592024-02-19T21:16:33+01:002024-02-19T21:16:33+01:00Database maintenance<p><small>Feb <var data-var='date'>19</var>, <var data-var='time'>21:16</var> CET</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Feb <var data-var='date'>19</var>, <var data-var='time'>19:04</var> CET</small><br><strong>Update</strong> - Scheduled maintenance is currently in progress. The initial database backup is still running.</p><p><small>Feb <var data-var='date'>19</var>, <var data-var='time'>17:30</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Feb <var data-var='date'> 1</var>, <var data-var='time'>13:40</var> CET</small><br><strong>Scheduled</strong> - On February 19th, our team will be conducting routine database maintenance to enable a future upgrade. During the maintenance period, certain services and features may become unavailable for a short period of time. <br /><br />Services affected:<br /><br />- API<br />- Our platform: *app.onomondo.com*<br />- Connectors<br />- Webhooks: delayed processing<br />- New device authentications<br /><br />The non-connector traffic from devices that are already online should not be impacted.</p>tag:status.onomondo.com,2005:Incident/200000642024-02-17T23:00:39+01:002024-02-17T23:00:39+01:00GRX signalling issues.<p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>23:00</var> CET</small><br><strong>Resolved</strong> - We've resolved the recent instability issues affecting our network. 
Our team has worked tirelessly to address and resolve this incident, and we are now continuing to monitor the network to ensure stability is maintained.<br />We understand the critical role our service plays in your operations and deeply regret any inconvenience caused. A comprehensive incident review will be shared, detailing the incident's cause, resolution, and steps we're taking to prevent future occurrences.<br />Your trust and collaboration with Onomondo are invaluable, and we thank you for your understanding and patience during this time.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>20:15</var> CET</small><br><strong>Monitoring</strong> - We've opened up all of our SIMs and are now monitoring the situation.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>19:26</var> CET</small><br><strong>Update</strong> - We have opened up almost all of our SIMs, and are continuing to open up the remaining ones.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>18:03</var> CET</small><br><strong>Update</strong> - We are continuing to open up for more SIMs.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>17:06</var> CET</small><br><strong>Update</strong> - We are continuing to gradually open up for more SIMs.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>16:03</var> CET</small><br><strong>Update</strong> - We are gradually opening up for more SIMs.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>14:21</var> CET</small><br><strong>Update</strong> - We are actively monitoring the mitigation measures implemented at 12:55 CET, while proceeding with the gradual opening of SIMs.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>12:55</var> CET</small><br><strong>Update</strong> - We've implemented a mitigation and will be opening up for SIMs slowly.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>12:05</var> 
CET</small><br><strong>Update</strong> - We are continuing to work with our partners to resolve the issue. We've implemented a mitigation and we are monitoring its stability before opening more SIMs.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>09:50</var> CET</small><br><strong>Update</strong> - We are still working with our IPX partner to resolve the issue. It is taking much longer than anticipated, but we are committed to finding a solution as soon as possible. Around 20% of SIMs are still affected, while 80% remain fully open and functional.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>06:27</var> CET</small><br><strong>Update</strong> - We are still working with our IPX partner to resolve the issue. It is taking longer than anticipated, but we are making steady progress and are committed to finding a solution as soon as possible.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>04:27</var> CET</small><br><strong>Update</strong> - We're still seeing high load on some of our core components and together with our provider we are closely monitoring the situation before opening the next ranges.</p><p><small>Feb <var data-var='date'>17</var>, <var data-var='time'>00:04</var> CET</small><br><strong>Update</strong> - We're still seeing high load on some of our core components and together with our provider we are monitoring the situation before opening the next ranges.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>22:53</var> CET</small><br><strong>Update</strong> - We are monitoring things before opening up for the final ranges of SIMs.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>21:37</var> CET</small><br><strong>Update</strong> - We have opened up fully for more than 75% of all SIMs, and are continuing with the rest.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>21:07</var> CET</small><br><strong>Update</strong> - We have 
opened up fully for more than half of all SIMs, and are continuing with the rest.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>20:26</var> CET</small><br><strong>Update</strong> - We are now, together with our IPX partner, opening up fully. We will still be doing that in ranges.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>19:41</var> CET</small><br><strong>Update</strong> - We've opened up for all IMSI ranges on LTE and are collaborating with our IPX provider to enhance the stability of our 2G and 3G networks. As soon as we ensure these networks are stable, we'll unblock IMSIs on 2G and 3G technologies as well.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>17:59</var> CET</small><br><strong>Update</strong> - We're expanding access to more IMSI ranges for LTE and collaborating with our IPX provider to enhance the stability of our 2G and 3G networks. As soon as we ensure these networks are stable, we'll unblock IMSIs on 2G and 3G technologies as well.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>17:09</var> CET</small><br><strong>Update</strong> - We are in the process of unblocking IMSI ranges for LTE only.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>15:40</var> CET</small><br><strong>Update</strong> - The GRX instability has impacted our other signalling services, and we have to start blocking devices temporarily to get the load on our servers under control.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>14:55</var> CET</small><br><strong>Identified</strong> - We've taken the faulty GRX link out of operation.</p><p><small>Feb <var data-var='date'>16</var>, <var data-var='time'>14:51</var> CET</small><br><strong>Investigating</strong> - We are experiencing signalling issues from a GRX partner; some devices may experience issues connecting to our network. 
We are currently removing the affected component from the rest of the network, to isolate the issue.</p>tag:status.onomondo.com,2005:Incident/199853432024-02-15T16:25:03+01:002024-02-15T16:25:03+01:00Emergency Network Maintenance<p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>16:25</var> CET</small><br><strong>Completed</strong> - The issue has been resolved and everything is back to normal.</p><p><small>Feb <var data-var='date'>15</var>, <var data-var='time'>10:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Feb <var data-var='date'>14</var>, <var data-var='time'>18:15</var> CET</small><br><strong>Scheduled</strong> - We are experiencing a connectivity issue with one of our upstream providers. To resolve the problem, on Thursday 15 February, we will temporarily pull one of our main routers out of rotation. This will significantly reduce failover capacity while we investigate the issue. We apologize in advance for any inconvenience.</p>tag:status.onomondo.com,2005:Incident/198574792024-01-30T18:00:30+01:002024-01-30T18:00:30+01:00Emergency network change<p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>18:00</var> CET</small><br><strong>Completed</strong> - The scheduled maintenance has been completed.</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>16:44</var> CET</small><br><strong>Verifying</strong> - Verification is currently underway for the maintenance items.</p><p><small>Jan <var data-var='date'>30</var>, <var data-var='time'>14:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. 
We will provide updates as necessary.</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>14:58</var> CET</small><br><strong>Scheduled</strong> - We will be conducting emergency maintenance on our network on Tuesday, the 30th, between 13:00 and 17:00 UTC.<br /><br />A small bug has been identified in our routing that needs to be fixed; this only has a minor impact on a small subset of customers. We do not expect this to cause any service disruptions.</p>tag:status.onomondo.com,2005:Incident/198388402024-01-29T13:52:03+01:002024-01-29T13:52:03+01:00Emergency network update<p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>13:52</var> CET</small><br><strong>Completed</strong> - The internal change did not proceed as planned. This had no impact on customer services.<br />Emergency maintenance will be scheduled again soon, ensuring continuous and reliable service. This will include running production instances in LOR (Loss of Redundancy), which will cause no interruption to customer services.<br /><br />Thank you for your understanding.</p><p><small>Jan <var data-var='date'>29</var>, <var data-var='time'>11:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. We will provide updates as necessary.</p><p><small>Jan <var data-var='date'>26</var>, <var data-var='time'>17:07</var> CET</small><br><strong>Scheduled</strong> - We will be conducting emergency maintenance on our network on Monday, the 29th, between 10:00 and 13:00 UTC.<br />A small bug has been identified in our routing that needs to be fixed; this only has a minor impact on a small subset of customers. 
We do not expect this to cause any service disruptions.</p>tag:status.onomondo.com,2005:Incident/198068092024-01-23T17:00:08+01:002024-01-23T17:00:08+01:00Packet Gateway taken out of rotation<p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>17:00</var> CET</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Jan <var data-var='date'>23</var>, <var data-var='time'>15:55</var> CET</small><br><strong>Monitoring</strong> - At approximately 13:30 UTC, we detected an issue with one of our Packet Gateways (PGW), requiring its removal from our network rotation. This action was crucial in maintaining overall system integrity and preventing future problems. As a result, devices that were previously connected to this PGW had to reconnect.<br />We are closely monitoring the situation to ensure that all devices maintain stable and continuous connectivity. Thank you for your patience and understanding.</p>tag:status.onomondo.com,2005:Incident/197129732024-01-15T14:30:00+01:002024-01-15T16:11:14+01:00Webhook Delays and Loss of Certain Retried Events<p><small>Jan <var data-var='date'>15</var>, <var data-var='time'>14:30</var> CET</small><br><strong>Resolved</strong> - On January 15, 2024, our internal queuing system underwent maintenance, specifically targeting the retry mechanism for webhooks.<br /><br />This process required a temporary system reset, from ~2024-01-15T13:20:00Z to ~2024-01-15T13:27:00Z. As a result, any messages that needed to be resent during that period or were scheduled to be resent at a later point due to the receiving endpoint having responded with a faulty error code or having timed out were lost.<br /><br />Please be assured that all other messages experienced only brief delays, and no data outside of this specific context was affected. 
We apologize for any inconvenience and appreciate your understanding.</p>tag:status.onomondo.com,2005:Incident/197640712024-01-14T17:00:00+01:002024-01-19T11:46:48+01:00Packet loss towards one of our GRX providers<p><small>Jan <var data-var='date'>14</var>, <var data-var='time'>17:00</var> CET</small><br><strong>Resolved</strong> - On January 14th, 2024, there was high packet loss towards one of our GRX providers, causing some devices to have issues connecting to our network. This incident occurred between ~2024-01-14T16:45:00Z and ~2024-01-14T21:14:00Z.<br />This issue may have caused a degree of packet loss for devices, depending on reconnection timing during the timeframe mentioned. We apologize for any inconvenience and appreciate your understanding.</p>tag:status.onomondo.com,2005:Incident/196670642024-01-09T14:30:00+01:002024-01-10T13:22:07+01:00Minor signalling issue<p><small>Jan <var data-var='date'> 9</var>, <var data-var='time'>14:30</var> CET</small><br><strong>Resolved</strong> - On 9 January at 13:30 UTC, we identified and promptly resolved a minor issue with our Signaling Logs platform. This brief incident resulted in the loss of a small number of signaling logs. We assure you that this had no impact on traffic flow and caused no operational problems.<br /><br />At Onomondo, we value transparency and open communication with our users. We appreciate your understanding, and will continue to keep you informed.</p>tag:status.onomondo.com,2005:Incident/190367102023-12-12T13:33:51+01:002023-12-12T13:33:51+01:00Upgrading Amazon Broker Engine<p><small>Dec <var data-var='date'>12</var>, <var data-var='time'>13:33</var> CET</small><br><strong>Completed</strong> - The upgrade is implemented, and the verification process has been completed successfully.</p><p><small>Dec <var data-var='date'>12</var>, <var data-var='time'>10:00</var> CET</small><br><strong>In progress</strong> - Scheduled maintenance is currently in progress. 
We will provide updates as necessary.</p><p><small>Nov <var data-var='date'> 6</var>, <var data-var='time'>17:52</var> CET</small><br><strong>Scheduled</strong> - On December 12th, between 10:00 and 14:30 CET, we will be performing an upgrade of a core infrastructure service to increase stability and performance going forward.<br /><br />Delayed delivery of webhook messages might occur.<br />Device connectivity is NOT expected to be affected during this maintenance window.</p>tag:status.onomondo.com,2005:Incident/193660652023-12-06T10:45:00+01:002023-12-07T13:04:37+01:00MTU Change on NAT Gateways<p><small>Dec <var data-var='date'> 6</var>, <var data-var='time'>10:45</var> CET</small><br><strong>Resolved</strong> - On 2023-12-06 09:43 UTC, a change was made to the MTU on our NAT gateways in order to align them with the MTU size in the rest of our infrastructure.<br /><br />Following this update, we observed some complications, so we reverted the MTU changes on our NAT gateways at 2023-12-06 22:45 UTC.<br /><br />We are examining the situation in detail and will provide more information when possible.</p>tag:status.onomondo.com,2005:Incident/192445932023-11-29T14:30:00+01:002023-11-29T20:40:51+01:00Unintelligible data in traffic monitor<p><small>Nov <var data-var='date'>29</var>, <var data-var='time'>14:30</var> CET</small><br><strong>Resolved</strong> - Today, from 13:33 UTC until 13:45 UTC, a bad deployment caused our traffic monitoring service to show gibberish.<br />The deployment caused data to be captured in a format our app couldn't parse. As soon as we noticed, we rolled the deployment back, which fixed the issue.<br />We did not leak any data during this incident. The data you saw was still the data from your devices, just unintelligible.<br />This also didn't affect any data transmission. 
All device data continued to flow as intended during this time.</p>tag:status.onomondo.com,2005:Incident/192264442023-11-27T18:10:22+01:002023-11-27T18:10:22+01:00Missing packages in Traffic monitor<p><small>Nov <var data-var='date'>27</var>, <var data-var='time'>18:10</var> CET</small><br><strong>Resolved</strong> - This incident has been resolved.</p><p><small>Nov <var data-var='date'>27</var>, <var data-var='time'>16:38</var> CET</small><br><strong>Update</strong> - We have identified the issue and implemented a temporary fix.<br />We are working on a proper fix, and will roll it out once it has been fully verified.</p><p><small>Nov <var data-var='date'>27</var>, <var data-var='time'>15:08</var> CET</small><br><strong>Investigating</strong> - We are missing packages in Traffic monitor and are currently investigating the issue. Other network traffic is not affected.</p>tag:status.onomondo.com,2005:Incident/192329592023-11-26T01:30:00+01:002023-11-28T16:28:45+01:00Router instability<p><small>Nov <var data-var='date'>26</var>, <var data-var='time'>01:30</var> CET</small><br><strong>Resolved</strong> - From 2023-11-26 00:30 UTC until 2023-11-26 12:30 UTC, one of our routers exhibited erratic behaviour that may have caused packet loss for device traffic.<br />When we were alerted about the issue, we immediately pulled the router out of rotation and spun up a replacement.</p>tag:status.onomondo.com,2005:Incident/192093242023-11-25T07:01:47+01:002023-11-25T07:01:47+01:00Service interruption on Webhooks, Signalling logs and Messages endpoint<p><small>Nov <var data-var='date'>25</var>, <var data-var='time'>07:01</var> CET</small><br><strong>Resolved</strong> - All data has propagated.</p><p><small>Nov <var data-var='date'>25</var>, <var data-var='time'>05:48</var> CET</small><br><strong>Monitoring</strong> - Systems have been upgraded. 
Webhooks are going out, and the backlog of messages is being worked through.</p><p><small>Nov <var data-var='date'>25</var>, <var data-var='time'>04:44</var> CET</small><br><strong>Investigating</strong> - All signalling logs have been backfilled. However, we have service issues on webhooks and are investigating.</p><p><small>Nov <var data-var='date'>25</var>, <var data-var='time'>00:55</var> CET</small><br><strong>Update</strong> - Signalling logs are being backfilled.</p><p><small>Nov <var data-var='date'>24</var>, <var data-var='time'>19:03</var> CET</small><br><strong>Update</strong> - Signalling data is being backfilled and there is a small delay for messages on webhooks. All other services are back to normal.</p><p><small>Nov <var data-var='date'>24</var>, <var data-var='time'>15:51</var> CET</small><br><strong>Monitoring</strong> - An internal configuration combined with an external network not respecting specific signalling caused a bottleneck which propagated in our system. We’ve rectified the configuration and are currently monitoring while the system recovers.</p><p><small>Nov <var data-var='date'>24</var>, <var data-var='time'>11:38</var> CET</small><br><strong>Investigating</strong> - We are having issues with message delays, which is affecting Webhooks, IoT Connectors, and Signalling logs. Some location data got dropped; however, other data is currently being backfilled. 
Regular user data is unaffected.</p>tag:status.onomondo.com,2005:Incident/191168592023-11-17T13:00:54+01:002023-11-17T13:00:54+01:00SMS instability<p><small>Nov <var data-var='date'>17</var>, <var data-var='time'>13:00</var> CET</small><br><strong>Identified</strong> - The issue has been identified and a fix is being implemented.</p><p><small>Nov <var data-var='date'>17</var>, <var data-var='time'>09:29</var> CET</small><br><strong>Monitoring</strong> - A fix has been implemented and we are monitoring the results.</p><p><small>Nov <var data-var='date'>15</var>, <var data-var='time'>10:07</var> CET</small><br><strong>Update</strong> - We are continuing to investigate this issue.</p><p><small>Nov <var data-var='date'>14</var>, <var data-var='time'>12:52</var> CET</small><br><strong>Investigating</strong> - We’re aware of reported SMS instability since October 27, 2023. Our team is actively investigating and has redeployed key components and enhanced monitoring to address this issue.</p>tag:status.onomondo.com,2005:Incident/190755662023-11-08T13:00:00+01:002023-11-09T17:47:06+01:00Ignored Tags Issue in Filtered Webhook Processing<p><small>Nov <var data-var='date'> 8</var>, <var data-var='time'>13:00</var> CET</small><br><strong>Resolved</strong> - Due to a bug that was deployed on November 8th at around 12:15 UTC, some tags were mistakenly ignored when filtering a small number of webhooks for processing.<br /><br />A fix has been implemented, and now all tags are correctly considered in the processing of webhooks.</p>