Service instability

Updates

Update
03 December 2024 at 8:43:33 PM
Further investigation showed that the issue was caused by one of the menu's bulk editing services using excessive resources.
Once identified, the development team applied changes to prevent excessive resource usage from happening again.
We also determined that this issue could have been avoided if certain alerts had been active.
To improve system stability and performance and to prevent situations like this from recurring, we have implemented several important technical changes across our servers, application code, and databases. The following improvements address the recent instability:
- We implemented new monitoring dashboards on the server to identify and analyze the services that consume the most resources and to detect the areas with the highest volume of calls.
- We added new protection for services that apply multiple changes.
- We added more detailed runtime logs to critical system services. This allows us to better monitor which processes take the longest to execute and to prioritize improvements.
- We improved data organization by moving some information to a system dedicated to read operations, which improves responsiveness, especially during peak times.
- We made adjustments to the server, separating the heaviest tasks (such as integrations with delivery platforms) and the most frequently accessed records, so that the system runs faster and more efficiently.
These are the actions we have taken to ensure that the system remains as efficient and stable as possible. We are continually investing in improvements to prevent similar issues from occurring in the future.
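As an illustration of the runtime logging described in the list above, here is a minimal Python sketch of timing a service call and logging how long it took. It is a sketch under stated assumptions, not the code running in production: the decorator name `log_runtime`, the threshold value, and the `bulk_edit_menu` function are all hypothetical.

```python
import functools
import logging
import time

logger = logging.getLogger("runtime")


def log_runtime(threshold_seconds=0.0):
    """Log how long the wrapped function takes; warn above the threshold."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return func(*args, **kwargs)
            finally:
                # Runs whether the call succeeded or raised.
                elapsed = time.perf_counter() - start
                level = logging.WARNING if elapsed > threshold_seconds else logging.INFO
                logger.log(level, "%s took %.3f s", func.__name__, elapsed)
        return wrapper
    return decorator


@log_runtime(threshold_seconds=0.5)
def bulk_edit_menu(items):
    # Hypothetical stand-in for a bulk editing service.
    return [item.upper() for item in items]
```

Logging slow calls at a higher level than fast ones makes it easier to filter for the processes that take the longest to execute.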
Update
02 December 2024 at 11:26:00 AM
Our infrastructure team worked over the weekend to investigate the causes of the instability. We are committed to resolving the issue definitively and ensuring system stability.
In our next update, which will be published tomorrow, we will share more details on the actions we have taken and the next steps we plan to take.
We are working to resolve the issue at its root cause.
Resolved
01 December 2024 at 1:20:00 AM
Following the cleanup carried out by the infrastructure team, all services have been fully restored, as confirmed by tests carried out by the operations team and by reports from practically all customers.
Monitoring
01 December 2024 at 1:16:00 AM
Following the procedure carried out by our infrastructure team, our services were restored, as confirmed by tests performed by the operations team and by customer reports.
Identified
01 December 2024 at 1:13:00 AM
Our infrastructure team identified an issue with one of our Google-hosted services that may have been causing the slowdown and applied a quick cleanup to restore the services.
Investigating
01 December 2024 at 1:10:00 AM
Some customers reported that the system was slow.
We confirmed that the problem was occurring and engaged our infrastructure team to investigate.
